
perf(utils): improve the performance of scoring and reporting #212

Merged 10 commits into main on Nov 14, 2023

Conversation

MishaSeredenkoPushBased (Contributor)

Improved the performance of scoring and reporting.
Removed nested loops where possible and introduced dictionaries, so lookups run in constant time O(1) instead of linear time O(n).
Also did some minor refactoring and added a few checks.
@BioPhoton As the task owner, let me know whether I understood the task correctly and implemented everything properly. Also, maybe you could propose additional changes for further performance improvements.
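The nested-loop-to-dictionary change described above can be sketched roughly like this. Note this is an illustrative sketch, not the actual code-pushup implementation: the `Audit`/`GroupRef` shapes and the `scoreGroup*` function names are assumptions for the example.

```typescript
// Hypothetical sketch of the optimization: instead of resolving each group
// ref with a nested Array.find (a linear scan per lookup), build a Map once
// and look audits up in constant time.

interface Audit {
  slug: string;
  score: number;
}

interface GroupRef {
  slug: string;
  weight: number;
}

// O(n * m): for every ref, scan the whole audits array.
function scoreGroupNested(refs: GroupRef[], audits: Audit[]): number {
  let total = 0;
  let weights = 0;
  for (const ref of refs) {
    const audit = audits.find(a => a.slug === ref.slug); // linear scan
    if (!audit) continue;
    total += audit.score * ref.weight;
    weights += ref.weight;
  }
  return weights === 0 ? 0 : total / weights;
}

// O(n + m): build the dictionary once, then every lookup is O(1).
function scoreGroupMap(refs: GroupRef[], audits: Audit[]): number {
  const auditsBySlug = new Map(audits.map(a => [a.slug, a] as const));
  let total = 0;
  let weights = 0;
  for (const ref of refs) {
    const audit = auditsBySlug.get(ref.slug); // constant-time lookup
    if (!audit) continue;
    total += audit.score * ref.weight;
    weights += ref.weight;
  }
  return weights === 0 ? 0 : total / weights;
}
```

Both variants compute the same weighted average; only the lookup cost differs, which matters once the number of audits grows.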

Closes #132

@BioPhoton (Collaborator)

Could you add the new logic to the benchmark tests, please?

@BioPhoton (Collaborator) left a comment


Looks good!
One thing that would be good to add is the new benchmark, like here: #137

Review comments (resolved, outdated) on:
packages/core/src/lib/implementation/persist.ts (2)
packages/utils/src/lib/report.spec.ts (1)
packages/utils/src/lib/report.ts (2)
packages/utils/src/lib/scoring.ts (3)
@MishaSeredenkoPushBased (Contributor, Author)

@BioPhoton @matejchalk Well, it looks like the changes don't affect performance in a positive way. I added the updated functions to the benchmark (file utils/perf/optimized3.mjs), and they ran about 3 times slower than before, and even slower than base:
[benchmark results screenshot]
We definitely need to try the imperative + dictionaries approach. But it seems that filling the dictionaries also costs some performance.
Any other suggestions? 🤔

@BioPhoton (Collaborator)

Could we get the traces for that?

@MishaSeredenkoPushBased (Contributor, Author) commented Nov 13, 2023

I checked my changes by benchmarking, and it turned out they reached only 535,000 ops/s. Really bad, I would say, even worse than base. Then, by reverting my changes one function at a time, I figured out that the initial scoring performance was 750,000 ops/s: better, but still not great. There are various other optimized solutions in the benchmark, which reach up to 1,600,000 ops/s (on my machine).
It looks like the performance is hurt by the declarative approach: multiple spread operators, .map, .reduce and so on. So I tried going fully imperative to see what difference it would make.
I took the optimized2 solution from the benchmark, fixed it (it didn't work properly), and optimized it further. I put it into the optimized3 check, and it reached almost 1,800,000 ops/s!
The only caveat is that the code is highly imperative: a lot of mutations, nested functions and so on. It definitely doesn't look perfect, and I expect someone could leave comments on where it could be improved. But overall, I consider this huge 3.5x performance improvement more important than prettiness or higher readability.

[benchmark results screenshot]

@BioPhoton @matejchalk FYI
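The style difference discussed above, declarative spread/reduce versus imperative in-place mutation, can be sketched as follows. This is a generic illustration of the pattern, not the actual report.ts or scoring.ts code; the `totals*` function names are made up for the example.

```typescript
// Declarative vs. imperative accumulation of per-slug counts.

type Scores = Record<string, number>;

// Declarative: reduce + spread copies the accumulator object on every
// iteration, so n items do O(n^2) total work (plus n allocations).
function totalsDeclarative(slugs: string[]): Scores {
  return slugs.reduce<Scores>(
    (acc, slug) => ({ ...acc, [slug]: (acc[slug] ?? 0) + 1 }),
    {},
  );
}

// Imperative: one object, mutated in place, O(n) total work.
function totalsImperative(slugs: string[]): Scores {
  const acc: Scores = {};
  for (const slug of slugs) {
    acc[slug] = (acc[slug] ?? 0) + 1;
  }
  return acc;
}
```

The two produce identical results; the imperative version avoids the per-iteration object copies that the spread operator introduces, which is the kind of overhead the benchmark numbers above reflect.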

@matejchalk (Collaborator) left a comment


I have a question about the benchmarks: what numbers of audits and groups are you using?

The fact that imperative code is faster than declarative code is not surprising (that's always the case in benchmarks which use extreme values), but the trick is figuring out whether it's relevant to us or not.

If the performance optimizations are not noticeable for realistic amounts of data, then optimizing for code maintenance is still the way to go. E.g. if it doesn't make a difference until we reach thousands of groups, then it's not relevant to us, because that's not a realistic scenario, and we would do better to prioritize our ability to understand and maintain the code and prevent bugs (side effects like mutations are the antithesis of this). There's a famous quote that "premature optimization is the root of all evil", after all.

I would estimate that hundreds and, in extreme cases, thousands of audits are realistic (e.g. for the CLI, our very strict ESLint gives us 347 audits). So if the performance optimization has an impact at this level, then it's worth it; but if we don't see the performance benefits until we reach a million audits, for example, then it isn't.

For groups, the number is much lower. I can't imagine we'd have over 100, even.
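To check whether an optimization matters at a given data size, a quick ops/s measurement can be run directly. This is a minimal sketch, not the project's actual benchmark suite; the `opsPerSecond` helper and the fabricated 1000-audit data set are assumptions for illustration.

```typescript
// Minimal ops/s micro-benchmark: run a function repeatedly for a fixed
// wall-clock window and report iterations per second.

interface Audit {
  slug: string;
  score: number;
}

function opsPerSecond(fn: () => void, durationMs = 100): number {
  let ops = 0;
  const start = Date.now();
  while (Date.now() - start < durationMs) {
    fn();
    ops++;
  }
  return (ops / (Date.now() - start)) * 1000;
}

// Fabricate 1000 audits (a realistic upper range per the discussion above)
// and compare a linear scan against a Map lookup for the worst-case slug.
const audits: Audit[] = Array.from({ length: 1000 }, (_, i) => ({
  slug: `audit-${i}`,
  score: Math.random(),
}));
const bySlug = new Map(audits.map(a => [a.slug, a] as const));

const linear = opsPerSecond(() => audits.find(a => a.slug === 'audit-999'));
const mapped = opsPerSecond(() => bySlug.get('audit-999'));
console.log(`find: ${Math.round(linear)} ops/s, Map.get: ${Math.round(mapped)} ops/s`);
```

Running this at different audit/group counts is one way to answer the question above: if the gap only opens up at sizes no real plugin produces, the simpler code wins.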

Review comments on:
e2e/cli-e2e/tests/__snapshots__/collect.spec.ts.snap (resolved, outdated)
packages/utils/src/lib/scoring.ts (2, resolved)
@MishaSeredenkoPushBased (Contributor, Author)

Overall, as a comment on the changes: the final performance improvement is 2.3x compared to the base solution, with 1,000 audits and 100 groups.
[benchmark results screenshot]
The more audits and groups there are, the bigger the difference. Here is a crazy run with 10,000 audits and 100 groups:
[benchmark results screenshot]
Almost a 3x difference: it looks like a really performant solution.

MishaSeredenkoPushBased merged commit 41d7c0b into main on Nov 14, 2023
11 checks passed
MishaSeredenkoPushBased deleted the improve-scoring-performance branch on November 14, 2023 at 11:42
Labels: 🔥 performance (performance optimization), 🧩 utils
Projects: None yet
Linked issue closed by this pull request: Make scoring and reporting logic more performant
4 participants