Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core(runner): independent gather and audit functions #13569

Merged
merged 22 commits into from
Feb 3, 2022

Conversation

adamraine
Copy link
Member

Step 2 of #13364

Splits Runner.run into two functions, Runner.gatherPhase and Runner.auditPhase. Right now, these two functions are always used together, but step 3 of #13364 will start using these functions separately in use flows.

Ref
#11313

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

excited for this, though less so for this new Runner() interface :)

I was a fan of the

/**
 * @template {LH.Config.Config | LH.Config.FRConfig} TConfig
 * @param {(runnerData: {config: TConfig, driverMock?: Driver}) => Promise<LH.Artifacts>} gatherFn
 * @param {{config: TConfig, driverMock?: Driver}} options
 * @return {Promise<LH.Artifacts>}
 */
static async gather(gatherFn, options) {}

/**
 * @template {LH.Config.Config | LH.Config.FRConfig} TConfig
 * @param {LH.Artifacts} artifacts
 * @param {{config: TConfig, computedCache: Map<string, ArbitraryEqualityMap>}} options
 * @return {Promise<LH.RunnerResult|undefined>}
 */
static async audit(artifacts, options) {}

approach

lighthouse-core/fraggle-rock/gather/navigation-runner.js Outdated Show resolved Hide resolved
lighthouse-core/fraggle-rock/gather/navigation-runner.js Outdated Show resolved Hide resolved
Comment on lines +232 to +236
const gatherEntry = timingEntries.find(e => e.name === 'lh:runner:gather');
const auditEntry = timingEntries.find(e => e.name === 'lh:runner:audit');
const gatherTiming = gatherEntry?.duration || 0;
const auditTiming = auditEntry?.duration || 0;
return {entries: timingEntries, total: gatherTiming + auditTiming};
Copy link
Member Author

@adamraine adamraine Jan 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might encounter issues in a flows as we move forward with #13364. I don't know if a flow step is guaranteed to query the correct timing entry from the logger.

@brendankenny this might be a reason to use a Runner instance instead. We could store timing information on that instance instead of querying the logger.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not causing any issues right now because log.takeTimingEntries clears everything between each flow step. I'll investigate this in the next part. There's probably a way to remember timings for each step without adding state to Runner.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW it's not clear what total should mean going forward. Right now it's the total of the LH run that just completed, so if run with -A, the -G timings are in the entries array, but not included in the total. That would be changed here, and I kind of prefer the old way, but I'm not sure if the total timing of -A runs is important enough to care (if e.g. you're trying to optimizing something, you're likely looking at some other timing or tool).

In a user flow, is the amount of time to gather and audit a particular step the most meaningful interpretation of "total"? Seems ok.

Should add some total tests if we're making a decision.

Also it's worth pointing out that total was a placeholder while we figured out entries and only kept it around because it was kind of useful after entries landed, so the only bar to clear here is "kind of useful" :)

@adamraine adamraine marked this pull request as ready for review January 19, 2022 23:01
@adamraine adamraine requested a review from a team as a code owner January 19, 2022 23:01
@adamraine adamraine requested review from connorjclark and removed request for a team January 19, 2022 23:01
lighthouse-core/runner.js Outdated Show resolved Hide resolved
lighthouse-core/runner.js Outdated Show resolved Hide resolved
lighthouse-core/runner.js Outdated Show resolved Hide resolved
lighthouse-core/test/runner-test.js Outdated Show resolved Hide resolved
@@ -59,7 +59,7 @@ describe('lightrider-entry', () => {
});

it('specifies the channel as lr', async () => {
const runStub = jest.spyOn(Runner, 'run');
const runStub = jest.spyOn(Runner, 'gather');
Copy link
Member

@brendankenny brendankenny Jan 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not that these spy tests are the best, but seems like these should be on the higher-level lighthouse() function now (from lighthouse-core/index.js)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the tests verifies that the config.settings.channel is set to lr before config reaches Runner. I think there is still value in testing Runner.gather rather than lighthouse().


const sentryContext = Sentry.getContext();
Sentry.captureBreadcrumb({
message: 'Run started',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this came up in #13519 (comment). With an independent gather and audit, there's no guarantee that this is the start of the run (or even called at all...is there any reason for snapshot to call Runner.gather()?).

Moving it to lighthouse-core/index.js makes sense for classic Lighthouse (that's what #12393 did), but that doesn't help the FR runners. This (and timings and errors) are the same orchestrator issue. Honestly all the runners make their own config, maybe they should all be doing their own copy of this kind of setup as well (or runner-helpers.js or whatever)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW the FR runners should call Runner.gather every time, but there is no guarantee that the FR runners themselves will be run. The only question is what happens in user-flow.js. Should we consider each flow step an independent "run" from the sentry perspective, or should the entire flow be a single "run".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought from #13364 that they wouldn't be calling into gather() anymore, since it's kind of redundant (they already define their own gather function)? With multiple gathers, at least, -G and -A will have to be tweaked to work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought from #13364 that they wouldn't be calling into gather() anymore

Using Runner.gather() in the FR runners is still useful because it gives us the timing entry, error handling, and the sentry call. As you said, we could add those things to each FR runner.

I did update #13364 to reflect that but didn't announce the change sorry.

With multiple gathers, at least, -G and -A will have to be tweaked to work.

This is true, but we don't need to get rid of the logic entirely if the other stuff in gather() is still useful. The issue should only be a problem for user flows which will have it's own save/load artifacts system.

}
await Sentry.captureException(err, {level: 'fatal'});
throw err;
throw Runner.createRunnerError(err, settings);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there is a test that gathering errors are localized but probably want one on audit() now as well

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't find a place where a "friendly" error makes it all the way up here. They should all be caught be the catch block in _runAudit.

Comment on lines 184 to 186
// If `gather` is run multiple times before `audit`, the timing entries for each `gather` can pollute one another.
// Timing entries are stored in artifacts.Timing, so we can clear the timing entries here.
log.takeTimeEntries();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior was intentional because takeTimeEntries also clears out any ongoing measures (like the old lh:runner:run which included gather+audit), so they wouldn't be included at the end of the run. Multiple gather steps do make for a problem if they're going to be turned into LHRs individually, though.

If there aren't going to be any measures that go over the entire run, then this works, but then the base artifact finalization should be able to call it directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there aren't going to be any measures that go over the entire run, then this works, but then the base artifact finalization should be able to call it directly.

I tried that at first, it clears out the marker used by the log.timeEnd right above this. Had to clear the timings after it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it cleared out regardless by this call since log.takeTimeEntries() clears them and we don't retain the returned value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, it clears lh:runner:gather not lh:runner:audit before its log.timeEnd is called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I meant as well, I think. That's the change in sample_v2 in the CI basics failure?

sorry you had to step into the mess that is collecting our timings inside the thing being timed for this PR (ask @connorjclark how the last time he made a change to them went...).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm currently trying to unravel the CI failure. Long story short I don't think this is the correct approach anymore :)

Comment on lines +232 to +236
const gatherEntry = timingEntries.find(e => e.name === 'lh:runner:gather');
const auditEntry = timingEntries.find(e => e.name === 'lh:runner:audit');
const gatherTiming = gatherEntry?.duration || 0;
const auditTiming = auditEntry?.duration || 0;
return {entries: timingEntries, total: gatherTiming + auditTiming};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW it's not clear what total should mean going forward. Right now it's the total of the LH run that just completed, so if run with -A, the -G timings are in the entries array, but not included in the total. That would be changed here, and I kind of prefer the old way, but I'm not sure if the total timing of -A runs is important enough to care (if e.g. you're trying to optimizing something, you're likely looking at some other timing or tool).

In a user flow, is the amount of time to gather and audit a particular step the most meaningful interpretation of "total"? Seems ok.

Should add some total tests if we're making a decision.

Also it's worth pointing out that total was a placeholder while we figured out entries and only kept it around because it was kind of useful after entries landed, so the only bar to clear here is "kind of useful" :)

Copy link
Member

@paulirish paulirish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks nice

@adamraine adamraine merged commit bbe05ad into master Feb 3, 2022
@adamraine adamraine deleted the fr-defer-audit-runner branch February 3, 2022 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants