Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve processBeaconEvents hotpath #3200

Merged
merged 2 commits into from Mar 9, 2023
Merged

Improve processBeaconEvents hotpath #3200

merged 2 commits into from Mar 9, 2023

Conversation

t3chguy
Copy link
Member

@t3chguy t3chguy commented Mar 8, 2023

Improved test by @ara4n from 4230ms to 6.2ms spent in processBeaconEvents()

  1. The way the dict was being built was horribly inefficient, turning an O(n) task into O(n log2 n) - made worse by a GC taking 10% extra time
  2. The way the events were looped and filtered had memory leak potential around decryption event handlers and didn't bail early
  3. This was done synchronously with /sync code, causing the UI thread to lock up

If we had a map storing the beaconInfoId -> Beacon we could make this even more efficient, given the data access pattern of every subsequent sync needing access to such a lookup it seems strange we make each /sync build one from scratch


Here's what your changelog entry will look like:

✨ Features

  • Improve processBeaconEvents hotpath (#3200).

@t3chguy t3chguy added the T-Enhancement New features, changes in functionality, performance boosts, user-facing improvements label Mar 8, 2023
@t3chguy t3chguy requested a review from a team as a code owner March 8, 2023 17:28
Copy link
Contributor

@germain-gg germain-gg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks sane, but there's quite a few test cases that are not passing anymore

@t3chguy
Copy link
Member Author

t3chguy commented Mar 8, 2023

Yup the tests were spying on once and asserting that is how decryption was watched for, did highlight one issue but the rest were spurious, most simply expected the code-path to still be synchronous

@@ -493,22 +493,24 @@ export class RoomState extends TypedEventEmitter<EmittedEvents, EventHandlerMap>
}
};

events.forEach((event: MatrixEvent) => {
for (const event of events) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL that forEach is up to 3 times slower than the classic loops. 😳

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which makes me wonder if other noticeable performance improvements could be had from banning it elsewhere, too?

Copy link
Contributor

@weeman1337 weeman1337 Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They ran the benchmark with Node 14.
I've tested forEach vs. for of based on their code in the browser.

Interesting results:

Firefox:
forEach of took: 47
for of took: 77

Chromium:
forEach of took: 3807
for of took: 4095

Here it looks like that forEach is the winner.

Code
<!DOCTYPE html>
<html lang="de">
    <head>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>Titel</title>
    </head>
    <body>
        <div id="results">

        </div>

        <script>
            console.log("Starting");

            function generateTestArray() {
                const result = [];
                for (let i = 0; i < 10000000; ++i) {
                    result.push({
                        a: i,
                        b: i / 2,
                        r: 0,
                    });
                }
                return result;
            }

            const testArray1 = generateTestArray();
            const testArray2 = generateTestArray();

            const results = document.getElementById("results");

            const before1 = Date.now();
            testArray1.forEach((x) => {
                x.r = x.a + x.b;
            });
            const after1 = Date.now();

            const before2 = Date.now();
            for (const obj of testArray2) {
                obj.r = obj.a + obj.b;
            }
            const after2 = Date.now();

            const result1 = document.createElement("DIV");
            result1.textContent = "forEach of took: " + (after1 - before1);
            results.appendChild(result1);

            const result2 = document.createElement("DIV");
            result2.textContent = "for of took: " + (after2 - before2);
            results.appendChild(result2);
        </script>
    </body>
</html>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting. Somehow it makes no sense to me that these aren't optimized into the same machine code. 🤔

Copy link
Contributor

@weeman1337 weeman1337 Mar 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet another benchmark: https://www.measurethat.net/Benchmarks/Show/18189/0/for-vs-foreach-vs-forin-vs-forof This is more aimed at the overhead of the different ways of iterating an array.

I would conclude: The performance of a loop highly depends on what you do in it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It also depends on how the array is interpreted and what it contains.
I found https://v8.dev/blog/elements-kinds interesting to conceptualise some perf optimisation that happen at the JS engine level

I'd optimise for readibility in the source code, and if we found that one thing was really slower than other, looking at transpiling that at build time

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR the reason for the swap here was simply for access to async/await

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet another benchmark: measurethat.net/Benchmarks/Show/18189/0/for-vs-foreach-vs-forin-vs-forof This is more aimed at the overhead of the different ways of iterating an array.

That particular one showed a factor 100 difference between for of and for in in my browser. I find it mildly irritating that there's so much variation just from the choice of iteration mechanism.

I'd optimise for readibility in the source code, and if we found that one thing was really slower than other, looking at transpiling that at build time

💯

@t3chguy t3chguy added this pull request to the merge queue Mar 9, 2023
Merged via the queue into develop with commit 87641a6 Mar 9, 2023
@t3chguy t3chguy deleted the t3chguy/beacons branch March 9, 2023 09:48
@t3chguy t3chguy self-assigned this Mar 9, 2023
su-ex added a commit to SchildiChat/matrix-js-sdk that referenced this pull request Apr 21, 2023
* Allow via_servers property in findPredecessor (update to MSC3946) ([\matrix-org#3240](matrix-org#3240)). Contributed by @andybalaam.
* Fire `closed` event when IndexedDB closes unexpectedly ([\matrix-org#3218](matrix-org#3218)).
* Implement MSC3952: intentional mentions ([\matrix-org#3092](matrix-org#3092)). Fixes element-hq/element-web#24376.
* Send one time key count and unused fallback keys for rust-crypto ([\matrix-org#3215](matrix-org#3215)). Fixes element-hq/element-web#24795. Contributed by @florianduros.
* Improve `processBeaconEvents` hotpath ([\matrix-org#3200](matrix-org#3200)).
* Implement MSC3966: a push rule condition to check if an array contains a value ([\matrix-org#3180](matrix-org#3180)).
* indexddb-local-backend - return the current sync to database promise … ([\matrix-org#3222](matrix-org#3222)). Contributed by @texuf.
* Revert "Add the call object to Call events" ([\matrix-org#3236](matrix-org#3236)).
* Handle group call redaction ([\matrix-org#3231](matrix-org#3231)). Fixes vector-im/voip-internal#128.
* Stop doing O(n^2) work to find event's home (`eventShouldLiveIn`) ([\matrix-org#3227](matrix-org#3227)). Contributed by @jryans.
* Fix bug where video would not unmute if it started muted ([\matrix-org#3213](matrix-org#3213)). Fixes element-hq/element-call#925.
* Fixes to event encryption in the Rust Crypto implementation ([\matrix-org#3202](matrix-org#3202)).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-Enhancement New features, changes in functionality, performance boosts, user-facing improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants