
report: sort performance audits based on impact #15445

Merged: 39 commits merged into main from metric-savings-score on Oct 3, 2023
Conversation

@adrianaixba (Collaborator) commented Sep 8, 2023

This

  • removes the Opportunities section and moves its audits into Diagnostics
  • in the renderer, calculates an audit's impact to sort using the guidance level as a tie breaker

In a follow-up (#15447), the score will be modified to use impact, and it will then also be used as a sorting variable.

https://lighthouse-git-metric-savings-score-googlechrome.vercel.app/sample-reports/english/
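To make the sorting idea above concrete, here is a rough sketch of such a comparator. This is illustrative only, not the merged renderer code; guidanceLevel is an assumed field name for the audit's guidance level.

// Sketch only: sort diagnostics by estimated score impact, falling back to
// linear impact and then the guidance level as a tie breaker.
interface SortableAudit {
  overallImpact: number;        // estimated effect on the performance score
  overallLinearImpact: number;  // linear fallback for audits whose score impact rounds to 0
  guidanceLevel: number;        // hypothetical: manually assigned priority for the audit
}

function compareAudits(a: SortableAudit, b: SortableAudit): number {
  if (a.overallImpact !== b.overallImpact) return b.overallImpact - a.overallImpact;
  if (a.overallLinearImpact !== b.overallLinearImpact) {
    return b.overallLinearImpact - a.overallLinearImpact;
  }
  return b.guidanceLevel - a.guidanceLevel;
}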

@brendankenny (Member)

  • in the renderer, calculates an audit's impact to sort using the guidance level as a tie breaker

    Would it be possible for this (the overallImpact() calc part) to happen in core? It would probably take a little rearranging (AFAIK audit results aren't currently altered after coming out of Audit.generateAuditResult(audit, product), but they would need to be for the perf category to do this scoring based on the values of the metrics), but for JSON consumers it would be far more valuable to get the processed impact-on-score numbers than curve control points they have to compute from manually.

  • What's the kind of case the linear impact fallback helps with? Is it for cases like 20s LCP, 5s improvement won't help the score much but is still a significant win? Does it end up mattering much in practice since (I'm assuming) overallImpact differences will typically dominate the sort?

  • Some comments for the two impact numbers and a quick high-level explanation comment on the goal of the sort would be greatly appreciated in there.

@adrianaixba (Collaborator, Author)

@adamraine feel free to chime in!

Would it be possible for this (the overallImpact() calc part) to happen in core?

My understanding is that it's difficult to do this, and maybe not immediately worth it, because the audits would require a reference to other audits and the metric scores that you mentioned. We could potentially try for this in a future iteration?

What's the kind of case the linear impact fallback helps with?

The linear impact here is mostly for the bottom section of the audits (the less prioritized ones). You're right that overallImpact will dominate the sort. The linear impact mostly helps us prioritize audits when comparing those that have no overallImpact savings.
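As an illustration of the two numbers being discussed, here is a minimal sketch under assumptions (the renderer's actual formulas differ): overallImpact weights each metric's estimated score change, while overallLinearImpact weights the raw savings so that audits whose score impact rounds to 0 can still be ordered against each other.

// Minimal sketch, assuming per-metric savings, weights, and scores are available.
interface MetricInfo {
  acronym: string;            // e.g. 'LCP'
  weight: number;             // metric's weight in the performance score
  score: number;              // current metric score (0..1)
  scoreAfterSavings: number;  // hypothetical score if this audit's savings were applied
}

function estimateImpacts(metricSavings: Partial<Record<string, number>>, metrics: MetricInfo[]) {
  let overallImpact = 0;
  let overallLinearImpact = 0;
  for (const metric of metrics) {
    const savings = metricSavings[metric.acronym];
    if (savings === undefined) continue;
    // Score-based impact: how much the performance score would improve.
    overallImpact += (metric.scoreAfterSavings - metric.score) * metric.weight;
    // Linear fallback: weight the raw savings, so huge-but-flat-curve wins still rank.
    overallLinearImpact += savings * metric.weight;
  }
  return {overallImpact, overallLinearImpact};
}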

@adamraine (Member) commented Sep 27, 2023

Would it be possible for this (the overallImpact() calc part) to happen in core?

My understanding is that it's difficult to do this, and maybe not immediately worth it, because the audits would require a reference to other audits and the metric scores that you mentioned. We could potentially try for this in a future iteration?

Yeah, this creates a situation where the result of one audit is dependent on the result of another audit. What happens when someone does skipAudits: ['largest-contentful-paint']? FWIW we considered baking the overall impact into audit score for #15447 but decided to avoid it for this reason.

Is it for cases like 20s LCP, 5s improvement won't help the score much but is still a significant win?

Pretty much this. It's a way to compare the impact of audits whose log-normal impact gets rounded down to 0 because the actual metric value is so large.
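For concreteness, a small illustration of why that happens (a sketch, not the actual Lighthouse scoring code; the LCP-like control points of 2500 ms / 4000 ms are an assumption): on a log-normal curve the tail is nearly flat, so 20 s and 15 s both score approximately 0 even though 5 s of savings is a big win.

// Illustration only: a log-normal score built from two control points, roughly
// mirroring how metric scores behave (score(p10) = 0.9, score(median) = 0.5).
function erfc(x: number): number {
  // Abramowitz & Stegun approximation of erf, used to get erfc = 1 - erf.
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const erf = 1 - ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t -
    0.284496736) * t + 0.254829592) * t * Math.exp(-ax * ax);
  return 1 - sign * erf;
}

function logNormalScore(value: number, p10: number, median: number): number {
  // Shape chosen so that score(p10) = 0.9 and score(median) = 0.5.
  const sigma = Math.log(median / p10) / (Math.SQRT2 * 0.9061938024368232);
  return 0.5 * erfc(Math.log(value / median) / (Math.SQRT2 * sigma));
}

// Assumed LCP-like control points: p10 = 2500 ms, median = 4000 ms.
console.log(logNormalScore(20000, 2500, 4000)); // ~0.000006, rounds to 0
console.log(logNormalScore(15000, 2500, 4000)); // ~0.0002, also rounds to 0
// The 5 s improvement changes the score by ~0, but a linear impact still sees 5000 ms of savings.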

Some comments for the two impact numbers and a quick high-level explanation comment on the goal of the sort would be greatly appreciated in there.

+1 to comments

@brendankenny (Member)

Would it be possible for this (the overallImpact() calc part) to happen in core?

My understanding is that it's difficult to do this, and maybe not immediately worth it, because the audits would require a reference to other audits and the metric scores that you mentioned. We could potentially try for this in a future iteration?

Yeah, this creates a situation where the result of one audit is dependent on the result of another audit. What happens when someone does skipAudits: ['largest-contentful-paint']? FWIW we considered baking the overall impact into audit score for #15447 but decided to avoid it for this reason.

Isn't the available information the same if doing it in runner vs in the report renderer? The values would just have to be optional, like how they're already treated in overallImpact?

This is 30 seconds of thinking, so there's probably a better way to do it and definitely better naming possible, but MetricSavings could become:

interface MetricSavings {
  LCP?: {
    value?: number;
    score?: number;
  };
  FCP?: {
    value?: number;
    score?: number;
  };
  CLS?: {
    value?: number;
    score?: number;
  };
  TBT?: {
    value?: number;
    score?: number;
  };
  INP?: {
    value?: number;
    score?: number;
  };
}

Scores (or impacts, I don't know) are left empty during audit processing, filled in with a function call like computeRelativeMetricScores(perfCategory) in the runner, and overallImpact gets stuck somewhere convenient. Maybe overallLinearImpact too, but I don't know about that one's name :)
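A minimal sketch of what that could look like in the runner. This is hypothetical: computeRelativeMetricScores and the metricSavings shape above are suggestions from this thread, not the implemented API, and scoreForMetric stands in for the existing scoring curve.

// Hypothetical sketch: after all audits have run, fill in per-metric score
// deltas so JSON consumers get processed impact numbers rather than having to
// recompute them from curve control points.
type MetricSavingsEntry = {value?: number; score?: number};
type AuditWithSavings = {metricSavings?: Record<string, MetricSavingsEntry>};

function computeRelativeMetricScores(
  audits: AuditWithSavings[],
  metricValues: Record<string, number | undefined>,
  scoreForMetric: (acronym: string, value: number) => number,
): void {
  for (const audit of audits) {
    if (!audit.metricSavings) continue;
    for (const [acronym, entry] of Object.entries(audit.metricSavings)) {
      const currentValue = metricValues[acronym];
      // Metric may be skipped (e.g. skipAudits) or savings may be absent; leave score empty.
      if (currentValue === undefined || entry.value === undefined) continue;
      const before = scoreForMetric(acronym, currentValue);
      const after = scoreForMetric(acronym, Math.max(0, currentValue - entry.value));
      entry.score = after - before; // estimated score improvement from this audit's savings
    }
  }
}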

@adamraine (Member) commented Sep 28, 2023

Isn't the available information the same if doing it in runner vs in the report renderer? The values would just have to be optional, like how they're already treated in overallImpact?

The difference is that overallImpact (in its current iteration) is an internal variable. If largest-contentful-paint is missing the sort order will change but the JSON will remain the same.

In general, I'm unsure if we should expose overallImpact or individual metric impacts to users at all. I think it's better to keep the impact as an internal ranking heuristic and not an authoritative data point.

@brendankenny (Member)

FWIW:

  • the ordering isn't really internal, it's very visible in the HTML report :) I think it's ok to make it clear that heuristics are subject to change from version to version as this is iterated on.

  • Also, while I think it would be valuable to take a look at the relative ranking of audits before and after this change to judge its impact (using e.g. HTTP Archive data for a broad sampling), my suggestion here is less about the ranking order and more about the much more concrete scoring impact, which is a straightforward application of the existing scoring system to the metric savings we're already exposing.

  • If largest-contentful-paint is missing the sort order will change but the JSON will remain the same.

    This is a good point for consistency (though possibly an argument for dropping the scoring-level dependency between opportunities and metrics and making the dependency explicit), but for the 99% of the time or whatever when a metric isn't dropped, it would be useful to have the score impact, so maybe it's worth the tradeoff?

@adamraine (Member)

the ordering isn't really internal, it's very visible in the HTML report :) I think it's ok to make it clear that heuristics are subject to change from version to version as this is iterated on.

I wasn't suggesting that the sorting is internal, but I don't think we are obligated to expose in the JSON the exact order audits should appear in.

Also, while I think it would be valuable to take a look at the relative ranking of audits before and after this change to judge its impact (using e.g. HTTP Archive data for a broad sampling)

You might find this challenging to do because the report before this change has two different groups and sorts them using heuristics that are internal to the report renderer (overallSavingsMs in opportunities; score, with informative audits dropped to the bottom, in diagnostics). Any comparison in HTTPA will require you to replicate the sorting logic from before this patch as well.
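For reference, those pre-change heuristics are roughly the following. This is a paraphrase of the renderer behavior described above under assumed audit shapes, not the exact old renderer code.

// Rough paraphrase of the old report-renderer ordering:
// opportunities by overallSavingsMs (largest first); diagnostics by score
// (worst first), with informative audits dropped to the bottom.
type OldAuditRef = {
  result: {
    score: number | null;
    scoreDisplayMode: string;
    details?: {overallSavingsMs?: number};
  };
};

const byWastedMs = (a: OldAuditRef, b: OldAuditRef) =>
  (b.result.details?.overallSavingsMs ?? 0) - (a.result.details?.overallSavingsMs ?? 0);

const byScoreThenInformative = (a: OldAuditRef, b: OldAuditRef) => {
  const aInformative = a.result.scoreDisplayMode === 'informative';
  const bInformative = b.result.scoreDisplayMode === 'informative';
  if (aInformative !== bInformative) return aInformative ? 1 : -1; // informative last
  return (a.result.score ?? 0) - (b.result.score ?? 0);            // worst score first
};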

my suggestion here is less about the ranking order and more about the much more concrete scoring impact, which is a straightforward application of the existing scoring system to the metric savings we're already exposing.

It will still be possible to compute the scoring impact in HTTPA using the metric scoring options and the metricSavings provided on the LHR.

This is a good point for consistency (though possibly an argument for dropping the scoring-level dependency between opportunities and metrics and making the dependency explicit), but the 99% of the time or whatever when a metric isn't dropped it would be useful to have the score impact, so maybe it's worth the tradeoff?

I think our options for this are:

  1. Create a proper audit dependency system.
    • The work required here doesn't seem worth the tradeoff to me
  2. Create some bespoke logic for the performance category when we create the lhr JSON. We currently treat all categories the same when constructing the JSON.
    • It's hard to imagine a solution here that won't introduce some technical debt

@brendankenny (Member)

Create some bespoke logic for the performance category when we create the lhr JSON. We currently treat all categories the same when constructing the JSON.

  • It's hard to imagine a solution here that won't introduce some technical debt

I don't believe that's the case. Because almost all the new perf audit infrastructure is optional (metricSavings, scoringOptions, even acronym, etc.), overallImpact() would work essentially unchanged in runner.js over all categories, because it already has to handle when those properties are missing (like long-tasks and other still-informative audits). Or, equivalently, there is bespoke logic, but it's already implicitly encoded in all the special things perf audits get that the other categories' audits don't, so there's no new special handling needed.

In any case, it does feel like the JSON consumer story is being treated as an intermediate HTML-report-generation step here rather than an endpoint of its own. BUT y'all have done a lot of good work and thinking on this and I don't want to block the effort. Carry on! ✌️

const {
  overallImpact: aOverallImpact,
  overallLinearImpact: aOverallLinearImpact,
} = this.overallImpact(a, metricAudits);

right now we recompute each audit's overallImpact each time this comparator is run.

it ends up not being super costly, but... it just seems like good practice to avoid recomputing the same thing. can we move that higher?
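One way to do that (a sketch only; computeOverallImpact stands in for the renderer's overallImpact helper, and the audit shape is left generic): compute each audit's impact once up front and sort on the cached values.

// Sketch: precompute impacts once instead of inside the comparator.
interface Impact {overallImpact: number; overallLinearImpact: number}

function sortByImpact<T>(audits: T[], computeOverallImpact: (audit: T) => Impact): T[] {
  const cache = new Map<T, Impact>();
  for (const audit of audits) cache.set(audit, computeOverallImpact(audit));
  return audits.slice().sort((a, b) => {
    const ia = cache.get(a)!;
    const ib = cache.get(b)!;
    return (ib.overallImpact - ia.overallImpact) ||
           (ib.overallLinearImpact - ia.overallLinearImpact);
  });
}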

.filter(audit => this._classifyPerformanceAudit(audit) === 'load-opportunity')
.filter(audit => !ReportUtils.showAsPassed(audit.result))
.sort((auditA, auditB) => this._getWastedMs(auditB) - this._getWastedMs(auditA));

const filterableMetrics = metricAudits.filter(a => !!a.relevantAudits);

what about the metric filter? it seems like we have all the info we need to resort when we filter to just FCP, etc.


Good point. That might be best left as a follow-up; the old metric filter is adequate for now. After this PR but before the release?
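A rough sketch of how that follow-up could look (hypothetical, not part of this PR): when the view is filtered to a single metric, re-sort by that metric's savings alone.

// Hypothetical follow-up sketch: re-sort by one metric's savings when filtered.
function sortForMetricFilter<T extends {metricSavings?: Record<string, number>}>(
  audits: T[],
  acronym: string,  // e.g. 'FCP'
): T[] {
  return audits
    .slice()
    .sort((a, b) => (b.metricSavings?.[acronym] ?? 0) - (a.metricSavings?.[acronym] ?? 0));
}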


Sure. Clearly something about the score/icons will also have to land post-this-PR, pre-release.
sg


@adamraine merged commit 896399b into main on Oct 3, 2023. 30 checks passed.
@adamraine deleted the metric-savings-score branch on October 3, 2023 at 18:06.