Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(driver): extract evaluateAsync logic for FR driver #11633

Merged
merged 5 commits into from
Nov 18, 2020

Conversation

patrickhulce
Copy link
Collaborator

@patrickhulce patrickhulce commented Nov 5, 2020

Summary
This was an itty bitty change originally, but by the time you factor in all the test updates this turned into a mega PR, not a great sign of the simplicity of things to come 😅

The goal of this PR is to create a minimal shared driver interface between Fraggle Rock and existing Lighthouse gathering with minimal disruption to the existing API.

The primary structural change moving forward is that the FR driver is mostly a delegator to other submodules of focused functionality. This will prevent the new driver from becoming the monolith that we see today and facilitate sharing code between both versions. In order to achieve this, we also need a minimal interface that these submodules can use that isn't all of driver. This new class is ProtocolSession which is analogous to puppeteer's CDP session combined with (eventually) a bit of our PROTOCOL__TIMEOUT logic. This also keeps driver as the place for higher level "smarts" ontop of a connection as opposed to a mixture of the two.

  • Create base Driver class for Fraggle Rock, only implements sendCommand and evaluateAsync
  • Create base ProtocolSession class for Fraggle Rock
  • Create RuntimeController that extracts the evaluateAsync logic from Driver (no behavioral changes as evidenced by leaving 100% of existing tests intact)
  • Move many of the testing utilities in driver-test to a shared location.
  • Change Accessibility gatherer to rely on the more limited transitional API.
  • Add typechecking to existing driver to ensure it meets the transitional API.

If this is too big, I can try to break up the extraction into RuntimeController from the creation of the two other FR files but honestly that only leads to a +500/+200 split which doesn't that feel much better and causes some wasted very, very temporary work :/ I broke it up.

Related Issues/PRs
ref #11313
see #11622 for the bigger picture

@patrickhulce patrickhulce requested a review from a team as a code owner November 5, 2020 18:47
@patrickhulce patrickhulce requested review from adamraine and removed request for a team November 5, 2020 18:47
@google-cla google-cla bot added the cla: yes label Nov 5, 2020
@@ -6,6 +6,7 @@
'use strict';

const Fetcher = require('./fetcher.js');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separately, do we want to rename the fetcher.js to fetch-controller?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure if we like this overall pattern :)

I think we can coalesce a lot of our existing logic into separate *-controller files in FR dir as we migrate things

types/gatherer.d.ts Show resolved Hide resolved
types/gatherer.d.ts Outdated Show resolved Hide resolved
this._runtimeController = new RuntimeController(this);

/** @type {LH.Gatherer.FRTransitionalDriver} */
const typecheck = this; // eslint-disable-line no-unused-vars
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be done on the class statement?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @implements is supported now:

/**
* @implements {LH.Config.Json}
*/
class Config {

Copy link
Collaborator Author

@patrickhulce patrickhulce Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brendankenny said there were things on the way to have new solutions but I'm not sure if we're there yet. Are we Brendan? :) he replied already, it is 🎉

@brendankenny
Copy link
Member

brendankenny commented Nov 5, 2020

I'm not sure if I understand the goal. If the new Driver needs to maintain the API of the old Driver so existing gatherers work, why not just refactor Driver in place with a new Connection that wraps a CDPSession?

Also if the goal of RuntimeController is just to encapsulate evaluateAsync, can we name it something about that? Out of context it's not clear which of several runtimes "Runtime" is referring to :)

@patrickhulce
Copy link
Collaborator Author

patrickhulce commented Nov 5, 2020

I'm not sure if I understand the goal. If the new Driver needs to maintain the API of the old Driver so existing gatherers work, why not just refactor Driver in place with a new Connection that wraps a CDPSession?

A few reasons:

  1. We don't want the connection API to be the same.
  2. We won't be matching the entire driver API along the way even if we want to preserve those fns for any plugins until a breaking version.
  3. We won't be completely maintaining the existing behavior of driver in the new world even if we're matching the API. The future examples around session and domain management we'll reach later will be an example of divergent behavior. Goal is to have parallel tracks here.
  4. The existing driver muddles several different puppeteer layers together that will make it very difficult to unlock the ability to have simultaneous connections that don't interfere with each other. Splitting out session as a dedicated concept is key to this and would alter behavior of the legacy path before we're ready.

Also if the goal of RuntimeController is just to encapsulate evaluateAsync, can we name it something about that? Out of context it's not clear which of several runtimes "Runtime" is referring to :)

Sure, ideas?

  • ExecutionContextController
  • EvaluateAsyncController

were also under consideration

@brendankenny
Copy link
Member

  1. We don't want the connection API to be the same.

Isn't it still on()/off()/sendCommand() or is there more to come? There's also connect/disconnect but one out of two implementations (raw.js) doesn't use those, which show connect/disconnect could have been restructured, we've just never bothered since adding raw and removing extension :)

  • We won't be matching the entire driver API along the way even if we want to preserve those fns for any plugins until a breaking version.

sure, but those can be deleted on Driver as well :)

  • We won't be completely maintaining the existing behavior of driver in the new world even if we're matching the API. The future examples around session and domain management we'll reach later will be an example of divergent behavior. Goal is to have parallel tracks here.
  • The existing driver muddles several different puppeteer layers together that will make it very difficult to unlock the ability to have simultaneous connections that don't interfere with each other. Splitting out session as a dedicated concept is key to this and would alter behavior of the legacy path before we're ready.

I'm interested in seeing what the divergent behavior will be and absolutely agree about the existing driver muddling layers, but these two also sound a lot like "it's nice to start with a blank slate" :P

Teasing aside, it is nice to start with a blank slate, but it takes things farther away from the "ideal" iterative process, which makes it harder to evaluate if the new organization will serve the existing gatherers well when everything eventually switches over and it makes it harder to review thinking of future FR gatherers because it's being structured to service functionality that doesn't exist yet. That's why I was suggesting trying to take the existing Driver as far as it could go before making the needed break to new code.

If you think this is the optimal route for now, though, maybe there's more context you could give? e.g. the delegation pattern in javascript typically requires a lot of bookkeeping and extra steps for reading/editing the implementation, while the calling code doesn't end up benefiting from any abstraction. It's hard to evaluate if that's worth it without an idea of where the "session and domain management"/"simultaneous connections" needs are going (e.g. will all Drivers always also be Sessions or are you thinking the two will be separable/changeable at some point? etc). #11622 doesn't seem to provide any insight on the session stuff (seems new since that PR was written :).

/** @return {Promise<void>} */
async connect() {
if (this._session) return;
const session = await this._page.target().createCDPSession();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speaking of rethinking some of the foundations: it was a super ancient architecture decision that's long gone that made it so Driver was the one initiating the protocol connection in Driver.connect() (vs attaching in GatherRunner or index.js or whoever was passing in the connection). And even though we still go through the motions, we broke it with DevTools connections (Raw connections are already attached so connect() is just a lie).

Would it simplify things in Driver to throw that away definitively by constructing with a Session instead? If a driver is only usable when it has a session, there may be no reason to make it possible to end up with one without a session, and it would let us drop all of the "Driver not connected to page" checks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was actually fortuitous :) we will need on demand session creation like this, so we it's good that we have it this way based on a page

@patrickhulce
Copy link
Collaborator Author

patrickhulce commented Nov 7, 2020

After writing all of this, it feels like the primary issue here is the creation of the two new files that I already proposed splitting out of this PR at the expense of some additional temporary work. Would you prefer to just go this route and punt for now?

I have tried to answer all of your questions below, but I suspect the core problem at hand is a difference in preferred strategy. By sequencing this way I am attempting to prioritize clarity in the status of FR work and minimal disruption to the existing flow. I believe the clarity of a parallel track makes it easier to identify what is missing, what lacks existing tests, and how far along we are. I believe minimizing disruption will make it easier to land this work given the long review cycles it will face as well as make it easier for other contributors to continue core work.

Isn't it still on()/off()/sendCommand() or is there more to come? There's also connect/disconnect but one out of two implementations (raw.js) doesn't use those, which show connect/disconnect could have been restructured

API was a bit vague on my part, it's mostly the semantics that are different with one very significant impact to the API. Our current connection impl is a mixture of transport details and individual session muddled together. We see the effects of this by our hacky preventing of enable/disable in driver and the fact that we cannot get events from a domain once it has been enabled elsewhere. This is a non-starter for sharing the same session with someone else (we'll revisit this again in a second). Our FR connection replacement is a session object whose on/off/sendCommand are forcibly scoped to a particular session. Creating connections with such radically different semantics and passing them off interchangably

sure, but those can be deleted on Driver as well :)

I'm not sure I follow this one. My whole point on that one was that we don't want to delete those on driver until we're actually ready to switch to FR. Keeping track of what's been upgraded, what relies on old and new mixed within the same class seems unnecessarily difficult compared to a minimal (which I'm also confused why that's a bad thing, but we'll get to that in a second)

I'm interested in seeing what the divergent behavior will be

Time to revisit sessions as promised :) driver does everything over essentially a single session today. As we migrate gatherers to support both modes, any domain and event listening will need to be evaluated for session intent, does this actually need a dedicated session or can it reuse the default? None of this is something that current structure is capable of.

it takes things farther away from the "ideal" iterative process

I think the very issue we're discussing here is we don't have the same definition of the ideal iterative process 😅 The ideal iterative process from my perspective is getting to a minimal E2E flow of the new use cases so all iterative development and design decisions of the new driver and session are motivated by actual examples as we need them. Trying to do that in the middle of a class that needs to maintain all its existing behavior that we're trying to change seems to run counter to that goal.

which makes it harder to evaluate if the new organization will serve the existing gatherers well when everything eventually switches over and it makes it harder to review thinking of future FR gatherers because it's being structured to service functionality that doesn't exist yet

Maybe I don't understand the actual proposal because I'm having trouble seeing why this would be any easier to evaluate at this stage by continuing here with a temporary connection class based on puppeteer.

will all Drivers always also be Sessions or are you thinking the two will be separable/changeable at some point?

They will need to be separated for reasons above, I'm thinking of some sort of .defaultSession property where those exist.

@patrickhulce
Copy link
Collaborator Author

it feels like the primary issue here is the creation of the two new files that I already proposed splitting out of this PR at the expense of some additional temporary work. Would you prefer to just go this route and punt for now?

I'll assume the answer to this was yes, and this now reflects your requests. @adamraine over to you!

Copy link
Member

@adamraine adamraine left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM

@brendankenny
Copy link
Member

I'll assume the answer to this was yes, and this now reflects your requests. @adamraine over to you!

Haha, fair enough, but I thought I had this fairly well covered in our meeting :)

A parallel track for Driver is harder to meaningfully review because the architectural context is both in progress (so specifics are still being explored and aren't necessarily important yet) and something the entire current Lighthouse will have to eventually live on top of (so the specifics are fairly important). I was trying to aim my suggestions to improving collaboration in that space, but definitely don't want to make things more difficult for no benefit.

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some bikeshedding for you, but LGTM2

lighthouse-core/fraggle-rock/gather/runtime-controller.js Outdated Show resolved Hide resolved
makePromiseInspectable,
createDecomposedPromise,
flushAllTimersAndMicrotasks,
...mockCommands,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it feels weird to have protocol-specific test utility methods in the overall test-utils.js. Seems fine to just pull them from mock-commands.js than have a large (and growing) list in one file you have to click through for the implementation anyways?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the benefit to having to import from multiple locations.

than have a large (and growing) list in one file you have to click through for the implementation anyways?

VSCode Cmd+Click already handles the re-export and goes immediately to the definition in mock-commands already so you don't even see this list if you wanted to look at the impl of createMockSendCommandFn. Maybe I'm misunderstanding what you're suggesting?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the benefit to having to import from multiple locations.

Obviously it's also perfectly usable if you'd rather have them in there, but keeping task-specific functions together in a module and then importing them from that module when needed (instead of from one giant namespace that has all possible functions) doesn't seem all that weird to me :)

lighthouse-core/fraggle-rock/gather/runtime-controller.js Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants