
Begin to add performance marks #697

Draft · wants to merge 1 commit into base: master
Conversation

djahandarie (Collaborator)

[Draft]


✔️ No visual differences introduced by this PR.

View Playwright Report (note: open the "playwright-report" artifact)

djahandarie (Collaborator, Author)

Trace-20240217T122523.json
Note: This trace includes page load, lookup 1 (cold lookup), lookup 2 (warm lookup), both of the same word on the page (追憶 on https://www.aozora.gr.jp/cards/000879/files/1141_15265.html). I have quite a few dictionaries loaded in, some of which have lots of structured content, media, etc. which show up in this lookup.


yomitan-performance-warm
Note: The above screenshot of devtools is zoomed in on lookup 2 (warm lookup) only.


@toasted-nutbread I tried adding some basic performance markers (which you can see at the very top of this screenshot), but I'm shocked to learn there is so much going on outside of onStateChanged: it only takes around 50ms out of the full 200ms of CPU activity. The stuff after onStateChanged seems to be various media/images slowly loading asynchronously and causing relayouts (though I don't know why this takes so long, since all the media is local). I don't know what all the stuff before it is, or why it's taking so long, either.
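For reference, marks like these can be added with the User Timing API so they show up alongside the flame chart in the DevTools trace. A minimal sketch (not Yomitan's actual code; the label names and helper are hypothetical):

```javascript
// Hypothetical helper: bracket a phase of a lookup with performance marks
// so it appears as a named span in the DevTools Performance panel.
function instrumentLookup(label, fn) {
    performance.mark(`${label}:start`);
    const result = fn();
    performance.mark(`${label}:end`);
    performance.measure(label, `${label}:start`, `${label}:end`);
    const [measure] = performance.getEntriesByName(label, 'measure').slice(-1);
    return {result, durationMs: measure.duration};
}

// Usage: wrap the phase you want to see in the trace.
const {result, durationMs} = instrumentLookup('lookup:render', () => 2 + 2);
```

Async phases would need a variant that awaits `fn()` before placing the end mark, but the idea is the same.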

If you could either brain-dump some hints about what I should look into instrumenting, or just go ahead and do it yourself on top of this, either would be awesome.

djahandarie (Collaborator, Author)

@toasted-nutbread I started to do some performance work on the media rendering, but I ran into a blocker, so I could use your advice.

First, I tried to rework the media code such that it:
1. fetches all media for a given entry at once, to make more efficient use of IndexedDB and hot loops;
2. creates all the media objects in Offscreen directly, instead of base64ing them back and forth between workers.

That helped a little bit, but not that much. With more profiling I learned that just img.decode() of one of the object URLs alone (containing a 640x480 webp) takes up 40ms within the content script, which blows through basically the entirety of our time budget if we're trying to hit high fps.
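To illustrate the batching idea (with an invented entry shape, not Yomitan's real schema): walk the entry's structured content once, collect and dedupe every media path, and then issue a single batched request instead of one round trip per image.

```javascript
// Hypothetical sketch: gather all media paths referenced by an entry's
// structured-content tree so the backend can satisfy them with one
// batched IndexedDB request. Node shape ({path, children}) is invented
// for illustration.
function collectMediaPaths(nodes) {
    const paths = new Set();
    const stack = [...nodes];
    while (stack.length > 0) {
        const node = stack.pop();
        if (node.path) { paths.add(node.path); }
        if (node.children) { stack.push(...node.children); }
    }
    return [...paths];
}

// The frontend could then make one call per entry, e.g.:
// const media = await api.getMediaBatch(dictionary, collectMediaPaths(entryNodes));
const paths = collectMediaPaths([
    {path: 'inu.webp'},
    {children: [{path: 'inu2.webp'}, {path: 'inu.webp'}]},
]);
```

The dedupe matters because structured content frequently reuses the same image within one entry.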

So I tried another approach: creating canvas elements in the content script, then sending OffscreenCanvas all the way to the right spot in offscreen.html/background.html, such that the backend could draw directly to those canvases and it'd be entirely out of the way of the frontend rendering loop.

However, I ran into an issue: it's actually impossible to transfer an OffscreenCanvas from the content script to our service worker (which would then further transfer it to offscreen, in the case of Chrome). This is because we are using chrome.runtime.sendMessage for our message passing, which doesn't allow transferring, due to restrictions on message passing between content scripts and other contexts. (Some interesting discussion here: w3c/webextensions#293)
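The underlying distinction: structured-clone channels like `postMessage` on a `MessageChannel` port accept a transfer list, which *moves* the underlying resource instead of copying it, while `chrome.runtime.sendMessage(message)` JSON-serializes its payload and has no transfer-list argument at all. OffscreenCanvas is browser-only, so this sketch demonstrates the mechanism with an ArrayBuffer:

```javascript
// Transferring via a structured-clone channel detaches the source object:
const {port1, port2} = new MessageChannel();
const buffer = new ArrayBuffer(16);
port1.postMessage(buffer, [buffer]); // second argument is the transfer list
// `buffer` is now detached; ownership moved to the receiving side.
console.log(buffer.byteLength); // 0

// chrome.runtime.sendMessage(message) has no equivalent transfer list,
// so an OffscreenCanvas placed in `message` cannot be moved from a
// content script to the service worker this way.
port1.close();
port2.close();
```

That's why the workaround discussion revolves around smuggling a real structured-clone channel between the contexts rather than using the extension messaging API directly.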

That said, it looks like there MAY be a way to really hack around it, via this: https://groups.google.com/a/chromium.org/g/chromium-extensions/c/IPJSfjNSgh8/m/Dh35-tZPAgAJ

The above seems like a very complicated way to go, and I'm not even sure it will be worth it (will drawing to the canvas from offscreen actually result in smoother frontend performance? It's kind of hard to know until we try...). So I'm curious what you think. If there is no better idea, I can try the above hack, but we'll need to figure out how to nicely package it along with the rest of our API...

toasted-nutbread (Collaborator)

What is a good dictionary to test that includes images, or should I use dummy data? I can do some testing as well, but my dev setup usually only includes a few basic dictionaries.

djahandarie (Collaborator, Author)

@toasted-nutbread The latest 三省堂国語辞典 should be a good test of small images, while the latest 大辞泉 is a good test of big images (if you look up something likely to have images like 犬). Feel free to get in touch directly if you don't already have access.

That said, I think it'd eventually be good to have some artificial tests too, for better reproducibility for developers (and even potentially for inclusion in our CI benchmarks, though it'd take work to make the results consistent).

djahandarie (Collaborator, Author)

BTW, I just pushed two branches in case you'd like to take a look: performance-batching (which batches together all the media lookups for a single entry, avoids base64ing, and simplifies the DOM a bit) and performance-canvas (which attempts to use OffscreenCanvas but fails).

cashewnuttynuts added the labels kind/meta (the issue or PR is meta) and area/performance (the issue or PR is related to performance) on Feb 27, 2024