Improve startup performance of workers #3153

Closed
jfirebaugh opened this issue Sep 6, 2016 · 11 comments

@jfirebaugh (Contributor) commented Sep 6, 2016

One of the main determinants of time-to-first-render (TTFR) is how fast the workers are able to boot and begin processing tile data. On my laptop, the earliest I see tile requests getting made by workers is at about the 2000ms mark. We should try to improve this.

Ideas:

- Reduce the amount of code included in the worker blob
- Create only one blob, not one per worker
- Reduce the overhead of transferring layers to the worker

@jfirebaugh added the performance ⚡ label on Sep 6, 2016
@mourner self-assigned this on Sep 7, 2016
@mourner (Member) commented Sep 7, 2016

> Reduce the amount of code included in the worker blob

I found a way to reduce it to only the things it needs — see PR browserify/webworkify#30. It reduces the size from 1.12MB to 467KB, although I'm not sure whether it actually affects time-to-first-render that much — can you check @jfirebaugh?

> Create only one blob, not one per worker

This seems too cheap to bother with — the whole webworkify process up to blob URL creation takes just a few milliseconds in my tests.
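
For reference, the idea would look roughly like this (a sketch only; makeWorkerSource() and workerCount are placeholders, not actual gl-js/webworkify APIs):

```js
// Illustrative sketch: build the worker source blob once and reuse its URL for every
// worker instead of creating one blob per worker.
var blobUrl = URL.createObjectURL(new Blob([makeWorkerSource()], {type: 'text/javascript'}));

var workers = [];
for (var i = 0; i < workerCount; i++) {
    workers.push(new Worker(blobUrl)); // same URL, no per-worker blob creation
}
```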

@jfirebaugh (Contributor, Author)

Before: [profiler screenshot]

After: [profiler screenshot]

It definitely helps blob creation and worker boot time, although the overall effect on TTFR is only a few hundred milliseconds. I think much of the benefit is being lost due to poor parallelization, and we can recoup it with improved main thread scheduling (boot workers as early as possible, start style XHR as early as possible, reduce validation overhead).

Also, it seems that the first message sent to the worker incurs a significant penalty (seen as the orange "Function Call" bar in the DedicatedWorker timeline of the "After" results). I wonder if Chrome does lazy evaluation of worker source. Maybe we should try posting a no-op message right after creating the worker.
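
A minimal sketch of that warm-up idea (workerUrl and the '<noop>' message type are placeholders, not the actual gl-js message protocol):

```js
// Sketch: post a throwaway message immediately after constructing the worker, hoping the
// browser evaluates the worker source right away rather than on the first real message.
var worker = new Worker(workerUrl);
worker.postMessage({type: '<noop>'});
```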

@anandthakker (Contributor)

> I wonder if Chrome does lazy evaluation of worker source

@jfirebaugh I've been looking into TTFR as well this morning -- I'm seeing a "compile script" block before the first function call:

[screenshot from 2016-09-07: worker profile showing a "Compile Script" block before the first function call]

@jfirebaugh (Contributor, Author)

Yeah, I see that too... just wondering why processing the first message takes so much time, but it's not attributed to any specific function in gl-js.

@jfirebaugh (Contributor, Author)

Looking at this further, my hunch is the unattributed time is actually deserialization of the message data, so this goes back to "Reduce the overhead of transferring layers to the worker".

@mourner (Member) commented Nov 11, 2016

Here's what contributes to TTFR if you set up explicit console.log checkpoints across the code (timings are in milliseconds since the previous checkpoint):

| thread | event                    | time since prev |
|--------|--------------------------|-----------------|
| main   | loaded GL JS             | 269ms           |
| main   | created map              | 64ms            |
| main   | style loaded             | 191ms           |
| main   | style created            | 47ms            |
| worker | worker initialized       | 497ms           |
| worker | got style layers         | 14ms            |
| worker | started parsing tile     | 247ms           |
| worker | parsed non-symbol layers | 85ms            |
| worker | got symbol deps          | 55ms            |
| worker | symbols placed           | 90ms            |
| main   | got tile buffers         | 20ms            |

You can see here that sending style layers isn't the bottleneck. Here's where most bottlenecks are instead:

- Getting the worker to run (by far the biggest contributor, for some reason)
- Loading assets ("time to first byte" when requesting things like the style, TileJSON, sprites, tiles & glyphs)

We need to focus on investigating and fixing the first of these if possible.
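
The checkpoint logging referenced above could look something like this (an illustrative sketch; the actual checkpoint names and their placement in gl-js are not shown here):

```js
// Log the elapsed time since the previous checkpoint on the current thread.
var lastCheckpoint = performance.now();
function checkpoint(name) {
    var now = performance.now();
    console.log(name + ': ' + Math.round(now - lastCheckpoint) + 'ms since previous checkpoint');
    lastCheckpoint = now;
}

checkpoint('loaded GL JS');
// ... later, e.g. once the Map constructor has returned:
checkpoint('created map');
```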

@mourner (Member) commented Nov 11, 2016

If you use a minified GL JS build, worker initialization happens 290ms after the style is created, down from ~500ms. So startup time looks roughly linear in the size of the worker bundle.

#3034 should help with this a bit because parts of the worker bundle will be loaded lazily, e.g. geojson-vt & supercluster won't be loaded until you add a GeoJSON source.

Another thing that might help is rewiring some dependencies so that unnecessary code is not bundled on the worker side. One example is validation code, which takes 7% of the bundle — it's required by some StyleLayer methods but none of those get called on the worker side.
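
As an illustration only (this is not how #3034 is implemented), lazily pulling the GeoJSON dependencies into the worker could look like this; the script URL and global names are assumptions:

```js
// Sketch: keep geojson-vt and supercluster out of the initial worker bundle and load them
// only when a GeoJSON source is first added to the map.
var geojsonDepsLoaded = false;

function ensureGeoJSONDeps() {
    if (!geojsonDepsLoaded) {
        importScripts('/geojson-worker-deps.js'); // assumed to define self.geojsonvt and self.supercluster
        geojsonDepsLoaded = true;
    }
}
```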

@mourner (Member) commented Nov 11, 2016

> Also, it seems that the first message sent to the worker incurs a significant penalty (seen as the orange "Function Call" bar in the DedicatedWorker timeline of the "After" results). I wonder if Chrome does lazy evaluation of worker source. Maybe we should try posting a no-op message right after creating the worker.

It doesn't look like it's a first-message penalty (I tried it, and it doesn't make any difference). According to my checkpoint research, it simply takes a while for a worker to start its thread, load the blob and evaluate the JS bundle.

Around 120ms is spent evaluating the bundle (measured by inserting console.log checkpoints at the beginning and end of the generated bundle in webworkify), which is yet another reason to reduce the worker bundle size and/or break it down into parts.
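
The measurement itself is just a pair of checkpoints wrapped around the generated output, roughly like this (the exact insertion points inside the webworkify output are an implementation detail):

```js
// First statement of the generated worker bundle:
console.log('bundle eval start: ' + performance.now());

/* ... the entire generated worker bundle ... */

// Last statement of the generated worker bundle:
console.log('bundle eval end: ' + performance.now());
```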

@mourner (Member) commented Nov 11, 2016

Here's a minimal snippet of code that proves that it takes a long time for a Worker to parse its code:

```js
var src = 'console.log("worker: " + performance.now());' + Array(100000).join('(function(){})();');
new Worker(URL.createObjectURL(new Blob([src], {type: 'text/javascript'})));
console.log('main: ' + performance.now());
```

It takes about the same time if you create a barebones worker and then call importScripts with an expensive script from it.
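
That variant looks roughly like this (the script URL is a placeholder for any sufficiently large file):

```js
// Barebones worker that synchronously loads a heavy script via importScripts.
var bootSrc =
    'importScripts("https://example.com/heavy-bundle.js");' +
    'console.log("worker ready: " + performance.now());';
new Worker(URL.createObjectURL(new Blob([bootSrc], {type: 'text/javascript'})));
console.log('main: ' + performance.now());
```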

@jfirebaugh (Contributor, Author) commented Nov 11, 2016

Worker startup is paying the cost of both parsing/executing the bundle for the first time and then, when actually doing work, running very slowly at first before the optimizer kicks in. And all of that is on a per-worker basis -- AFAICT there's no sharing of compiler/optimizer data between workers. This is why creating only a single worker at startup time is better for TTFR, even though multiple workers are better once we reach a steady state.
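
A sketch of that staggered-startup idea (createWorker() and the event used here are placeholders, not the actual gl-js scheduling code):

```js
// Pay the per-worker parse/compile cost for a single worker up front so TTFR isn't blocked,
// then bring up the remaining workers after the first render.
var workers = [createWorker()];

map.once('render', function () {
    while (workers.length < (navigator.hardwareConcurrency || 4)) {
        workers.push(createWorker());
    }
});
```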

@mourner (Member) commented Apr 9, 2019

Just stumbled upon this tweet and it sounds promising for TTFR — we should definitely test it out.

> In Chrome, any JavaScript files in a service worker cache are bytecode-cached automatically.
> This means there is 0 parse + compile cost for them on repeat visits. 🤯
> https://v8.dev/blog/code-caching-for-devs#use-service-worker-caches
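
A minimal sketch of that approach (the cache name and file paths are placeholders): a service worker that precaches the GL JS bundles so Chrome can bytecode-cache them for repeat visits.

```js
// service-worker.js: precache the JS bundles during install and serve them from the cache.
self.addEventListener('install', function (event) {
    event.waitUntil(
        caches.open('gl-js-v1').then(function (cache) {
            return cache.addAll(['/mapbox-gl.js', '/mapbox-gl-worker.js']);
        })
    );
});

self.addEventListener('fetch', function (event) {
    event.respondWith(
        caches.match(event.request).then(function (cached) {
            return cached || fetch(event.request);
        })
    );
});
```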

@karimnaaji mentioned this issue Dec 8, 2020