Request for Feedback
Hello, I've invited (mentioned) you all here today because you have been involved with this project in the past. I would like feedback on a code refactor that's in progress. If you want to get right to the code, here is the PR.
This project has a state machine that runs through a variety of states as a player progresses through ad and content playback. It did a great job solving the problems that the project initially set out to solve. As time went on and the project grew more fully-featured and complex, we found that the behavior of the state machine had become difficult to understand. The state machine diagram in the README became the focus of attention. Here is the initial state machine diagram from the initial commit in September 2013:
In time, the project maintainers agreed that the diagram was insufficient and that an update was needed. Here is the state machine diagram update from January 2015:
As time went on, the sentiment among contributors was that the new diagram was still insufficient and that more information was needed to contribute to the codebase. This is when the idea for this refactor was born: my impression was that the diagram wasn't insufficient, but rather, the code had become too difficult to understand.
Why had the code become too difficult to understand? The initial project was a simple state machine in one file, and as many new features and bugfixes were added, they all got added to that state machine in that one file. The file grew in size and scope and new features were added to existing code paths. Too many concerns got tangled together. Many functions had grown to serve multiple purposes, and most state was global.
The first major progress toward addressing this came in October 2016 when some features started getting moved into separate files: snapshot.js, redispatch.js, macros.js, and contentupdate.js. Separating code into separate files doesn't intrinsically solve any problems, but it does expose them: it became much clearer when code was relying too much on the implementation of unrelated features. The most dramatic example of this was the broad usage of the snapshot object to implement unrelated features. For example, one time we removed a case in the code where a snapshot was being taken when it was not needed. We later found that this broke
These successes have made an incredible different in the reliability of videojs-contrib-ads. They also taught us a lot about what still needed doing. Let's revisit that state machine. If a system is too complicated to understand, how do you simplify it without removing features? You separate the components into modules with clear responsibilities and clear contracts. Then you can understand the system at a high level without knowing all the details. When you modify one component, you only need to focus on the implementation of that component. With this goal in mind, I made a simpler state machine in January of 2017:
This diagram is easy to understand because it contains less information. In one immediate sense, this made it less useful. The important thing that I did here was decide what information constituted the architecture. Understanding of the system design improved among contributors.
This opened the doors to design discussions that were unburdened by implementation details. In those discussions, we found that the design did not always match our understanding of the domain problem. For example, we would often talk about if the player "was in ad mode". However, when we dug into it, we realized there was no consensus about what ad mode is. Further, the source code had no answer: edge cases that made decisions about if the plugin "was in ad mode" were not consistently implemented. One day, a few of us stood around in the Brightcove office and I admit I got a little stubborn: "Let's decide what ad mode is, once and for all, right now." The initial result of that discussion was a new method and new documentation in the README. As development continued, clarity around ad mode yielded dividends. The implementation became more consistent as edge cases started to reflect the shared understanding of the domain problem rather than being implemented on a case-by-case basis.
I have wanted to refactor the contrib-ads state machine for over a year. However, I knew it had to wait until the right time. I wanted to divide the state machine into sub-modules, but it had to wait until there was clarity and consensus about what those sub-modules should be. The simplified state diagram helped create that. Usage of new methods such as
The refactor that implements it is in this branch. Because this is an architecture we may live with for another 5 years, I want to make sure we get it right. I've written this post primarily to solicit feedback. Take a look, tell me if the code makes sense, and ask questions. Question both the code and the assumptions that the code depends on.
My goal for this section is that if you read everything that follows, you have enough context to understand anything you come across in the PR, even if I didn't cover it explicitly. If anything isn't clear, please ask about it!
This latest design uses our new, strict definition of ad mode. Yellow states are part of ad mode and blue states are not. Ad mode is defined as any time content playback is blocked by ads. Therefore, you transition from
Implementation details of prerolls are encapsulated within the Preroll class. The logic in the Preroll class was once contained by the
After a content video has ended, the plugin enters a new conceptual state where ads are never run again. This conceptual state was initially implemented on a case-by-case basis. In December 2016 it was refactored to use a new
All state transitions are initiated by the
Some event handlers are shared by all ad states or by all content states. Concrete states extend AdState and ContentState to inherit those common behaviors. Those in turn extend State which implements event handling and the all-important
New / Old State Mapping
During development, I created and frequently referred to a mapping of new states to old states. I assume it may be helpful for code review as well, so here it is:
The text was updated successfully, but these errors were encountered:
Another notable change is the handling of
In most cases that I've seen, this resulted in ad mode for postrolls taking an extra 1000 milliseconds to complete and for ended events to be delayed.
In the refactored code, we require that integrations no longer fire their own ended event after postrolls. This may require migration, but from what I've seen this was never strictly necessary. If an ended event is triggered due to ad playback, the integration must wait until after that event before calling endLinearAdMode. That way that particular ended event will be redispatched as adended.
Here is the new flow: