
Begin to add performance marks #697

Draft · wants to merge 1 commit into base: master
Conversation

djahandarie (Collaborator)

[Draft]


✔️ No visual differences introduced by this PR.

View Playwright Report (note: open the "playwright-report" artifact)

djahandarie (Collaborator, Author)

Trace-20240217T122523.json
Note: This trace includes page load, lookup 1 (cold lookup), lookup 2 (warm lookup), both of the same word on the page (追憶 on https://www.aozora.gr.jp/cards/000879/files/1141_15265.html). I have quite a few dictionaries loaded in, some of which have lots of structured content, media, etc. which show up in this lookup.


yomitan-performance-warm
Note: The above screenshot of devtools is zoomed in on lookup 2 (warm lookup) only.


@toasted-nutbread I tried adding some basic performance markers (which you can see at the very top of this screenshot), but I'm shocked to learn there is so much going on outside of onStateChanged: it only takes around 50ms out of the full 200ms of CPU activity. The stuff after onStateChanged seems to be various media/images slowly loading asynchronously and causing relayouts (though I don't know why this takes so long, since all the media is local). I don't know what all the stuff before it is, or why it's taking so long, either.
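For reference, marks like these can be added with the User Timing API so they show up alongside the flame chart in the DevTools trace. A minimal sketch (not Yomitan's actual code; the label names and helper are hypothetical):

```javascript
// Hypothetical helper: bracket a phase of a lookup with performance marks
// so it appears as a named span in the DevTools Performance panel.
function instrumentLookup(label, fn) {
    performance.mark(`${label}:start`);
    const result = fn();
    performance.mark(`${label}:end`);
    performance.measure(label, `${label}:start`, `${label}:end`);
    const [measure] = performance.getEntriesByName(label, 'measure').slice(-1);
    return {result, durationMs: measure.duration};
}

// Usage: wrap the phase you want to see in the trace.
const {result, durationMs} = instrumentLookup('lookup:render', () => 2 + 2);
```

Async phases would need a variant that awaits `fn()` before placing the end mark, but the idea is the same.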

If you could either brain-dump some hints about what I should look into instrumenting, or just go ahead and do it yourself on top of this, either would be awesome.

djahandarie (Collaborator, Author)

@toasted-nutbread I started to do some performance work on the media rendering, but I ran into a blocker, so I could use your advice.

First, I tried to rework the media code such that it:
1. fetches all media for a given entry at once, to make more efficient use of IndexedDB and hot loops;
2. creates all the media objects in Offscreen directly, instead of base64ing them back and forth between workers.

That helped a little bit, but not that much. With more profiling I learned that just img.decode() of one of the object URLs alone (containing a 640x480 webp) takes up 40ms within the content script, which blows through basically the entirety of our time budget if we're trying to hit high fps.
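To illustrate the batching idea (with an invented entry shape, not Yomitan's real schema): walk the entry's structured content once, collect and dedupe every media path, and then issue a single batched request instead of one round trip per image.

```javascript
// Hypothetical sketch: gather all media paths referenced by an entry's
// structured-content tree so the backend can satisfy them with one
// batched IndexedDB request. Node shape ({path, children}) is invented
// for illustration.
function collectMediaPaths(nodes) {
    const paths = new Set();
    const stack = [...nodes];
    while (stack.length > 0) {
        const node = stack.pop();
        if (node.path) { paths.add(node.path); }
        if (node.children) { stack.push(...node.children); }
    }
    return [...paths];
}

// The frontend could then make one call per entry, e.g.:
// const media = await api.getMediaBatch(dictionary, collectMediaPaths(entryNodes));
const paths = collectMediaPaths([
    {path: 'inu.webp'},
    {children: [{path: 'inu2.webp'}, {path: 'inu.webp'}]},
]);
```

The dedupe matters because structured content frequently reuses the same image within one entry.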

So I tried another approach: creating canvas elements in the content script, then sending OffscreenCanvas all the way to the right spot in offscreen.html/background.html, such that the backend could draw directly to those canvases and it'd be entirely out of the way of the frontend rendering loop.

However, I ran into an issue: it's actually impossible to transfer an OffscreenCanvas from the content script to our service worker (which would then further transfer it to offscreen, in the case of Chrome). This is because we are using chrome.runtime.sendMessage for our message passing, which doesn't allow transferring, due to restrictions on message passing between content scripts and other contexts. (Some interesting discussion here: w3c/webextensions#293)
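The underlying distinction: structured-clone channels like `postMessage` on a `MessageChannel` port accept a transfer list, which *moves* the underlying resource instead of copying it, while `chrome.runtime.sendMessage(message)` JSON-serializes its payload and has no transfer-list argument at all. OffscreenCanvas is browser-only, so this sketch demonstrates the mechanism with an ArrayBuffer:

```javascript
// Transferring via a structured-clone channel detaches the source object:
const {port1, port2} = new MessageChannel();
const buffer = new ArrayBuffer(16);
port1.postMessage(buffer, [buffer]); // second argument is the transfer list
// `buffer` is now detached; ownership moved to the receiving side.
console.log(buffer.byteLength); // 0

// chrome.runtime.sendMessage(message) has no equivalent transfer list,
// so an OffscreenCanvas placed in `message` cannot be moved from a
// content script to the service worker this way.
port1.close();
port2.close();
```

That's why the workaround discussion revolves around smuggling a real structured-clone channel between the contexts rather than using the extension messaging API directly.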

That said, it looks like there MAY be a way to really hack around it, via this: https://groups.google.com/a/chromium.org/g/chromium-extensions/c/IPJSfjNSgh8/m/Dh35-tZPAgAJ

The above seems like a very complicated way to go, and I'm not even sure it will be worth it (will drawing to the canvas from offscreen actually result in smoother frontend performance? It's kind of hard to know until we try...). So I'm curious what you think. If there is no better idea, I can try the above hack, but we'll need to figure out how to nicely package it along with the rest of our API...

toasted-nutbread (Collaborator)

What is a good dictionary to test that includes images, or should I use dummy data? I can do some testing as well, but my dev setup usually only includes a few basic dictionaries.

djahandarie (Collaborator, Author)

@toasted-nutbread The latest 三省堂国語辞典 should be a good test of small images, while the latest 大辞泉 is a good test of big images (if you look up something likely to have images like 犬). Feel free to get in touch directly if you don't already have access.

That said, I think it'd eventually be good to have some artificial tests too, for better reproducibility for developers (and even potentially for inclusion in our CI benchmarks, though it'd take work to make the results consistent).

djahandarie (Collaborator, Author)

BTW, I just pushed two branches in case you'd like to take a look: performance-batching (which batches together all the media lookups for a single entry, avoids base64ing, and simplifies the DOM a bit) and performance-canvas (which attempts to use OffscreenCanvas but fails).

cashewnuttynuts added the labels kind/meta (the issue or PR is meta) and area/performance (the issue or PR is related to performance) on Feb 27, 2024