
core(byte-efficiency): use log-normal distribution scoring #14977

Merged

Conversation

@robatron robatron commented Apr 15, 2023

Hi all!

This is my first PR to Lighthouse, and my first experience with log-normal distributions and curve-fitting, so please bear with me πŸ˜…

My intention is to address issue #11883. (Selected from the good first issue label.) Hopefully my approach isn't too far off, but in case it is, I hope this PR can still serve as a starting point for someone more familiar with these topics (or at least spark some discussion!)

Summary

This PR updates ByteEfficiencyAudits to use standard log-normal distribution scoring, replacing a custom method based on a linear piecewise function. Parameters for the log-normal scoring were selected by fitting its CDF curve to the original linear piecewise function using an online curve-fitting tool.

[Screenshot 2023-04-02 at 09 29 07]

Details

The ByteEfficiencyAudit.scoreForWastedMs() function uses a non-standard scoring method defined by linear interpolation between the control points (0, 1), (300, 0.75), (750, 0.5), and (5000, 0), forming a linear piecewise function.

[Screenshot 2023-03-30 at 18 47 05]

desmos.com/dpuygs6bdh
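
For illustration, here is a minimal sketch of linear interpolation between those control points in JavaScript. (This is only a sketch of the piecewise idea, not the actual ByteEfficiencyAudit implementation.)

// Sketch only: linearly interpolate between the original control points.
const CONTROL_POINTS = [[0, 1], [300, 0.75], [750, 0.5], [5000, 0]];

function piecewiseScoreForWastedMs(wastedMs) {
  if (wastedMs <= 0) return 1;
  for (let i = 1; i < CONTROL_POINTS.length; i++) {
    const [x0, y0] = CONTROL_POINTS[i - 1];
    const [x1, y1] = CONTROL_POINTS[i];
    if (wastedMs <= x1) {
      // Linear interpolation between the two surrounding control points.
      return y0 + ((wastedMs - x0) / (x1 - x0)) * (y1 - y0);
    }
  }
  return 0; // Anything beyond 5000 ms scores 0.
}

piecewiseScoreForWastedMs(300); // 0.75
piecewiseScoreForWastedMs(750); // 0.5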

To use log-normal distribution scoring, we need to find values for the two required parameters, p10 and median, such that the curve of the log-normal CDF fits this linear piecewise function. As suggested by @koraa in #11883, we can find these values with a function-fitting tool.

In this case, I opted for an online function fitting tool, MyCurveFit. MyCurveFit accepts a list of data points and a "fit function" containing coefficients, then computes the optimal coefficient values that best align the fit function with the given data points.

Generating data points for the linear piecewise function

MyCurveFit requires a list of data points (up to 500) as the target for the fit function's curve. To approximate our linear piecewise function, we can use a spreadsheet to linearly interpolate between the control points, calculating the piecewise function's value every 10 ms between 0 and 5,000.

[Screenshot 2023-04-02 at 10 05 18]

Note: I found that relying solely on the control points as the target led to a suboptimal fit because the curve-fitting process focused on those points only, ignoring the points in between. Generating additional data points through linear interpolation as described above resulted in a better fit. Based on this, I wonder whether 500 data points are enough for an accurate fit, or whether there are better curve-fitting tools that accept continuous functions as their target? 🤔
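
For anyone who prefers code to a spreadsheet, here is a rough sketch of generating the same interpolated data points, reusing the piecewiseScoreForWastedMs() sketch from above:

// Sketch: emit (wastedMs, score) rows every 10 ms for pasting into MyCurveFit.
const rows = [];
for (let x = 0; x <= 5000; x += 10) {
  rows.push(`${x}\t${piecewiseScoreForWastedMs(x)}`);
}
console.log(rows.join('\n')); // tab-separated, spreadsheet-friendly output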

Defining a fit function from the log-normal CDF

MyCurveFit now needs a "fit function" that will be fit to the data points of the linear piecewise function. Deriving this fit function from the log-normal CDF used in getLogNormalScore(), we get:

y = (1 - erf((ln(x / m) * 0.9061938024368232) / ln(m / p))) / 2;

... which was derived like so:

  1. Start with the original log-normal CDF code from statistics.js:

    const INVERSE_ERFC_ONE_FIFTH = 0.9061938024368232;
    const xRatio = Math.max(Number.MIN_VALUE, value / median);
    const xLogRatio = Math.log(xRatio);
    const p10Ratio = Math.max(Number.MIN_VALUE, p10 / median);
    const p10LogRatio = -Math.log(p10Ratio);
    const standardizedX = (xLogRatio * INVERSE_ERFC_ONE_FIFTH) / p10LogRatio;
    const complementaryPercentile = (1 - erf(standardizedX)) / 2;
  2. Substitute standardizedX:

    (1 - erf((xLogRatio * INVERSE_ERFC_ONE_FIFTH) / p10LogRatio)) / 2;
  3. Substitute xLogRatio, INVERSE_ERFC_ONE_FIFTH, and p10LogRatio:

    (1 - erf((Math.log(xRatio) * 0.9061938024368232) / -Math.log(p10Ratio))) / 2;
  4. Substitute xRatio and p10Ratio:

    (1 -
      erf(
        (Math.log(Math.max(Number.MIN_VALUE, value / median)) *
          0.9061938024368232) /
          -Math.log(Math.max(Number.MIN_VALUE, p10 / median))
      )) /
      2;
  5. Remove Math.max(Number.MIN_VALUE, ...) (because we assume value, median, and p10 are all > 0):

    (1 -
      erf(
        (Math.log(value / median) * 0.9061938024368232) / -Math.log(p10 / median)
      )) /
      2;
  6. [Final step] Convert from JavaScript to MyCurveFit syntax: replace Math.log() → ln(), -ln(A / B) → ln(B / A), value → x (the independent variable), median → m, and p10 → p (the coefficients):

    (1 - erf((ln(x / m) * 0.9061938024368232) / ln(m / p))) / 2;

To validate the derived fit function, we can plot it alongside the original log-normal CDF (defined at desmos.com/o98tbeyt1t, based on statistics.getLogNormalScore()). We would expect them to produce identical curves, which they do ✅

[Screenshot 2023-04-02 at 10 40 32]

Original log-normal CDF (in blue) and the derived fit function (in red) producing identical curves (desmos.com/dpuygs6bdh)
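
As an additional numeric check (beyond the Desmos plot), the derived fit function can be evaluated directly in JavaScript. The sketch below uses the standard Abramowitz-Stegun approximation of erf() (a stand-in, not Lighthouse's statistics.js implementation) and confirms the expected log-normal behavior: a score of ~0.9 at x = p10 and ~0.5 at x = median. The parameter values here are examples only:

// Abramowitz-Stegun approximation 7.1.26 of the error function (max error ~1.5e-7).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const absX = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * absX);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-absX * absX));
}

// The derived fit function: y = (1 - erf((ln(x / m) * 0.9061938...) / ln(m / p))) / 2
function fitFunction(x, m, p) {
  return (1 - erf((Math.log(x / m) * 0.9061938024368232) / Math.log(m / p))) / 2;
}

const exampleP10 = 150;
const exampleMedian = 935;
fitFunction(exampleP10, exampleMedian, exampleP10);    // ~0.9
fitFunction(exampleMedian, exampleMedian, exampleP10); // ~0.5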

Fitting the log-normal CDF to the linear piecewise function

Now that we've entered our data points and fit function into MyCurveFit, we can run the curve-fitting process. This yields the best-fitting values for our coefficients, p10 and median, so that the fit function matches the data points as closely as possible.

Results:

  • p10 = 150.2663
  • median = 934.7609

[Screenshot 2023-04-02 at 08 55 00]

Results of fitting the log-normal CDF to the linear piecewise function (MyCurveFit)
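
To get a feel for how close the fit is, the fitted curve can be evaluated at the original control points, reusing erf() and fitFunction() from the sketch above (the outputs noted in the comment are approximate):

// Sketch: compare the fitted log-normal curve against the original control points.
const FITTED_MEDIAN = 934.7609;
const FITTED_P10 = 150.2663;

for (const [wastedMs, target] of [[300, 0.75], [750, 0.5], [5000, 0]]) {
  const score = fitFunction(wastedMs, FITTED_MEDIAN, FITTED_P10);
  console.log(`${wastedMs} ms -> ${score.toFixed(2)} (piecewise target: ${target})`);
}
// Roughly: 300 ms -> 0.79, 750 ms -> 0.56, 5000 ms -> 0.12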

Updating ByteEfficiencyAudit to use log-normal distribution scoring

With the values for p10 and median obtained from the curve-fitting process described above, we can now update ByteEfficiencyAudit.scoreForWastedMs() to use the standard log-normal distribution scoring method:

// core/audits/byte-efficiency/byte-efficiency-audit.js

const WASTED_MS_P10 = 150;
const WASTED_MS_MEDIAN = 935;

// ...

class ByteEfficiencyAudit extends Audit {
  static scoreForWastedMs(wastedMs) {
    return Audit.computeLogNormalScore(
      { p10: WASTED_MS_P10, median: WASTED_MS_MEDIAN },
      wastedMs
    );
  }

  // ...
}
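
For reference (not part of this PR's diff), a hypothetical usage sketch: by definition of the log-normal parameters, a wastedMs at WASTED_MS_P10 scores ~0.9 and one at WASTED_MS_MEDIAN scores ~0.5.

// Hypothetical usage sketch of the updated scoring (illustrative values).
ByteEfficiencyAudit.scoreForWastedMs(0);    // ~1.0 (no wasted time)
ByteEfficiencyAudit.scoreForWastedMs(150);  // ~0.9 (WASTED_MS_P10)
ByteEfficiencyAudit.scoreForWastedMs(935);  // ~0.5 (WASTED_MS_MEDIAN)
ByteEfficiencyAudit.scoreForWastedMs(5000); // ~0.1, deep in the tail of the curve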

Conclusion

In this PR, I aimed to address issue #11883 by replacing the custom linear piecewise scoring method in ByteEfficiencyAudits with standard log-normal distribution scoring. I obtained the required parameters (p10 and median) for the log-normal distribution by fitting its CDF curve to the original linear piecewise function using an online curve-fitting tool, MyCurveFit.

(Again, if my approach is off track, I hope this PR can still serve as a starting point for someone more familiar with these topics or at least spark some discussion ✨)

Thank you for your time reviewing this PR. I look forward to your feedback!

@robatron robatron requested a review from a team as a code owner April 15, 2023 05:51
@robatron robatron requested review from adamraine and removed request for a team April 15, 2023 05:51
@robatron robatron changed the title Update ByteEfficiencyAudits to use log-normal distribution scoring (#11883) Update ByteEfficiencyAudits to use log-normal distribution scoring Apr 15, 2023
@robatron robatron changed the title Update ByteEfficiencyAudits to use log-normal distribution scoring core(byte-efficiency): Update ByteEfficiencyAudits to use log-normal distribution scoring Apr 15, 2023
@robatron robatron changed the title core(byte-efficiency): Update ByteEfficiencyAudits to use log-normal distribution scoring core(byte-efficiency): Use log-normal dist. scoring in ByteEfficiencyAudits Apr 15, 2023
@robatron robatron changed the title core(byte-efficiency): Use log-normal dist. scoring in ByteEfficiencyAudits core(byte-efficiency): use log-normal dist. scoring in ByteEfficiencyAudits Apr 15, 2023

@adamraine adamraine left a comment


Thanks for putting in a lot of work here @robatron! This looks good to me.

@brendankenny do you have any thoughts?

@adamraine

@robatron looks like you need to rebase and update a few unit tests for CI to pass

@brendankenny

> @brendankenny do you have any thoughts?

Nope! Seems reasonable and the resulting curve is a nice fit @robatron!

I think the only thing is how it will affect scores in practice, which we should see when the tests are updated. The curve is well behaved and similar enough to the existing one that I can't imagine anything fatal will pop up.

@robatron you'll need to run yarn update:sample-json to update a bunch of stuff at once, and then there will likely be a few unit test stragglers asserting results that might need an update as well.

// log-normal cumulative distribution function curve to the former method of linear interpolation
// scoring between the control points {average = 300 ms, poor = 750 ms, zero = 5000 ms} using the
// curve-fit tool at https://mycurvefit.com/ rounded to the nearest integer. See
// https://www.desmos.com/calculator/gcexiyesdi for an interactive visualization of the curve fit.

great demo!

@robatron robatron commented May 20, 2023

> @robatron you'll need to run yarn update:sample-json to update a bunch of stuff at once, and then there will likely be a few unit test stragglers asserting results that might need an update as well.

@brendankenny, done! Tests are all passing for me locally now βœ…

package.json Outdated

@@ -10,7 +10,7 @@
     "smokehouse": "./cli/test/smokehouse/frontends/smokehouse-bin.js"
   },
   "engines": {
-    "node": ">=16.16"
+    "node": ">=16.16 <19"

@robatron robatron May 20, 2023


I noticed tests fail partway through with confusing errors on Node v19, but work fine on v18 (the latest LTS). I thought it might be good to cap the supported Node version so the error message is friendlier and helps avoid confusion. What do you all think?

Before πŸ‘Ž

[I] ➜ node --version
v19.8.0

[I] ➜ yarn test
yarn run v1.22.18

< ... about 450 lines later ... >

$ bash flow-report/test/run-flow-report-tests.sh
+++ dirname flow-report/test/run-flow-report-tests.sh
++ cd flow-report/test
++ pwd
+ SCRIPT_DIR=/Users/robmc/code/lighthouse/flow-report/test
+ LH_ROOT=/Users/robmc/code/lighthouse/flow-report/test/../..
+ ARGS=(--testMatch='{flow-report/**/*-test.ts,flow-report/**/*-test.tsx}' --require="$LH_ROOT/flow-report/test/setup/env-setup.ts")
+ cd /Users/robmc/code/lighthouse/flow-report/test/../..
+ node --loader=@esbuild-kit/esm-loader core/test/scripts/run-mocha-tests.js '--testMatch={flow-report/**/*-test.ts,flow-report/**/*-test.tsx}' --require=/Users/robmc/code/lighthouse/flow-report/test/../../flow-report/test/setup/env-setup.ts
(node:66690) ExperimentalWarning: Custom ESM Loaders is an experimental feature and might change at any time
(Use `node --trace-warnings ...` to show where the warning was created)
running 12 test files
/usr/local/Cellar/node/19.8.0/bin/node[66690]: ../src/module_wrap.cc:599:MaybeLocal<v8::Promise> node::loader::ImportModuleDynamically(Local<v8::Context>, Local<v8::Data>, Local<v8::Value>, Local<v8::String>, Local<v8::FixedArray>): Assertion `(it) != (env->id_to_function_map.end())' failed.
 1: 0x104bd587c node::Abort() [/usr/local/Cellar/node/19.8.0/bin/node]
 2: 0x104bd5867 node::Abort() [/usr/local/Cellar/node/19.8.0/bin/node]
 3: 0x104b97745 node::loader::ImportModuleDynamically(v8::Local<v8::Context>, v8::Local<v8::Data>, v8::Local<v8::Value>, v8::Local<v8::String>, v8::Local<v8::FixedArray>) [/usr/local/Cellar/node/19.8.0/bin/node]
 4: 0x104e6006f v8::internal::Isolate::RunHostImportModuleDynamicallyCallback(v8::internal::MaybeHandle<v8::internal::Script>, v8::internal::Handle<v8::internal::Object>, v8::internal::MaybeHandle<v8::internal::Object>) [/usr/local/Cellar/node/19.8.0/bin/node]
 5: 0x1051d3ac3 v8::internal::Runtime_DynamicImportCall(int, unsigned long*, v8::internal::Isolate*) [/usr/local/Cellar/node/19.8.0/bin/node]
 6: 0x104a162b4 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvInRegister_NoBuiltinExit [/usr/local/Cellar/node/19.8.0/bin/node]
flow-report/test/run-flow-report-tests.sh: line 20: 66690 Abort trap: 6           node --loader=@esbuild-kit/esm-loader core/test/scripts/run-mocha-tests.js ${ARGS[*]} "$@"
error Command failed with exit code 134.

After πŸ‘

[I] ➜ node --version
v19.8.0

[I] ➜ yarn test
yarn run v1.22.18

error lighthouse@10.2.0: The engine "node" is incompatible with this module. Expected version ">=16.16 <19". Got "19.8.0"
error Commands cannot run with an incompatible environment.

Member


This needs to be done in a breaking change. I'll add a note to address it in #14909, but we should remove it from this PR along with the .nvmrc.

Member


I can repro in 19.8.0, but only in that version. @robatron looks like you were unlucky ending up on that version and not 19.8.1 which was released the next day with what looks like a fix: https://github.com/nodejs/node/blob/main/doc/changelogs/CHANGELOG_V19.md#2023-03-15-version-1981-current-targos

Contributor Author


Lol, yep, > 19.8.0 seems to work fine. @adamraine , I reverted this and removed the .nvmrc.

.nvmrc (Outdated, resolved)
@adamraine adamraine changed the title core(byte-efficiency): use log-normal dist. scoring in ByteEfficiencyAudits core(byte-efficiency): use log-normal distribution scoring May 22, 2023
@adamraine adamraine merged commit 0ab01b8 into GoogleChrome:main May 22, 2023
33 checks passed