Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvement and reduced shader workload with globe projection #12039

Merged
merged 11 commits into from
Jul 18, 2022

Conversation

karimnaaji
Copy link
Contributor

@karimnaaji karimnaaji commented Jun 24, 2022

This PR aims to reduce the workload and improve performance during the globe - mercator transition and initial map load. It currently does two kind of improvements:

  1. Removes unnecessary variant compilation by checking whether a requested define is actually in use by the shader. This reduces the total number of shader compiled by roughly half (59 shaders to 33 shaders compiled) on streets-v11 with globe.
Reduced shader compilation scenario can be tested with the following diff by comparing against `main`
+++ b/debug/index.html
@@ -20,12 +20,17 @@
 
 var map = window.map = new mapboxgl.Map({
     container: 'map',
-    zoom: 12.5,
+    zoom: 5.5,
     center: [-122.4194, 37.7749],
     style: 'mapbox://styles/mapbox/streets-v11',
-    hash: true
+    hash: false,
+    projection: 'globe'
 });
 
+map.on('click', () => {
+    map.zoomTo(map.getZoom() + 1);
+})
+
 </script>
 </body>
 </html>
diff --git a/src/render/painter.js b/src/render/painter.js
index 42fa1db0f..710eefbd7 100644
--- a/src/render/painter.js
+++ b/src/render/painter.js
@@ -879,6 +879,7 @@ class Painter {
         const key = Program.cacheKey(name, allDefines, programConfiguration);
 
         if (!this.cache[key]) {
+            console.log('compiling ', key)
             this.cache[key] = new Program(this.context, name, shaders[name], programConfiguration, programUniforms[name], allDefines);
         }
         return this.cache[key];
  1. Simplifies uniform bindings by reducing boilerplate code for their instantiation and defers uniform location fetching via getUniformLocation at first use. This prevents us from trying to upload uniforms on unused uniform locations and overall simplifies extra code necessary to instantiate them, which also reduces library size.

The results from benchmap can be seen at: https://sites.mapbox.com/benchmap-js-results/stack/benchmap-karim-dev/runs/25. In summary, it improves the metrics load time by ~38ms, speed index by ~5% and render and totalBlockingTime by 57ms and 40ms respectively on the rendering-globe fixture. It does not regress any other fixtures unrelated to globe.

Screen Shot 2022-07-13 at 2 32 11 PM

Other metrics improvement

metrics

Here is a before/after video taken from benchmap fixture to see the change and reduced stutter visually:

before-after.mov

There might still be room for improvement, but I kept the scope small so that it can be reviewed and eventual follow-ups investigated and addressed separately. Specifically, one follow-up investigation to consider could be to defer the compilation of shader variants that we know could be of use as a low priority task whenever the map will become idle.

As part of this PR, I also investigated whether we could reduce the number of redundant gl calls that we might have when drawing globes with regards to texture bindings. We currently allow to cache the texture globally, but we should instead cache per texture slot, since as long as a texture is bound to a slot it can be considered strongly associated with that slot. So once it is bound, any further request to bind the combination texture slot/texture id is redundant.

Redundant trace chunk
...
activeTexture: TEXTURE2
bindTexture: TEXTURE_2D, WebGLTexture - ID: 17
activeTexture: TEXTURE3
bindTexture: TEXTURE_2D, WebGLTexture - ID: 18
useProgram: WebGLProgram - ID: 48
uniform3f: WebGLUniformLocation - ID: 339, -4.8806683524653517e-17, -0.9171523356672743, -0.3985368153383868
uniform3f: WebGLUniformLocation - ID: 340, -0.3985368153383868, -0.9171523356672743, 2.4403341762326758e-17
uniform3f: WebGLUniformLocation - ID: 341, -1, 0, 6.123233995736766e-17
uniform3f: WebGLUniformLocation - ID: 342, -1.2246467991473532e-16, 0, -1
activeTexture: TEXTURE0
bindTexture: TEXTURE_2D, WebGLTexture - ID: 130
texParameteri: TEXTURE_2D, TEXTURE_MAG_FILTER, NEAREST
texParameteri: TEXTURE_2D, TEXTURE_MIN_FILTER, NEAREST
texParameteri: TEXTURE_2D, TEXTURE_WRAP_S, CLAMP_TO_EDGE
texParameteri: TEXTURE_2D, TEXTURE_WRAP_T, CLAMP_TO_EDGE
uniform1f: WebGLUniformLocation - ID: 356, 616.9919375574335
uniform1f: WebGLUniformLocation - ID: 359, 9.207999999920528
uniformMatrix4fv: WebGLUniformLocation - ID: 360, false, [..(16)..]
uniformMatrix4fv: WebGLUniformLocation - ID: 361, false, [..(16)..]
uniform3f: WebGLUniformLocation - ID: 367, 0, 1, 4
uniformMatrix4fv: WebGLUniformLocation - ID: 368, false, [..(16)..]
uniform2f: WebGLUniformLocation - ID: 369, 0.16479996625863183, 0.3849127470218343
uniformMatrix4fv: WebGLUniformLocation - ID: 373, false, [..(16)..]
uniform3f: WebGLUniformLocation - ID: 366, 0.5322607871029793, -0.7855211485201578, 0.3156817697303555
uniform3f: WebGLUniformLocation - ID: 372, 3353.1070108583663, 503.82343502165713, -13365.5372258812
createVertexArrayOES -> [object WebGLVertexArrayObjectOES]
bindVertexArrayOES: [object WebGLVertexArrayObjectOES]
enableVertexAttribArray: 0
enableVertexAttribArray: 1
enableVertexAttribArray: 2
bindBuffer: ARRAY_BUFFER, WebGLBuffer - ID: 1117
vertexAttribPointer: 0, 4, SHORT, false, 24, 0
vertexAttribPointer: 1, 4, UNSIGNED_SHORT, false, 24, 8
vertexAttribPointer: 2, 4, SHORT, false, 24, 16
enableVertexAttribArray: 3
bindBuffer: ARRAY_BUFFER, WebGLBuffer - ID: 1119
vertexAttribPointer: 3, 4, FLOAT, false, 16, 0
enableVertexAttribArray: 4
bindBuffer: ARRAY_BUFFER, WebGLBuffer - ID: 1120
vertexAttribPointer: 4, 1, UNSIGNED_BYTE, false, 1, 0
enableVertexAttribArray: 5
enableVertexAttribArray: 6
bindBuffer: ARRAY_BUFFER, WebGLBuffer - ID: 1121
vertexAttribPointer: 5, 3, SHORT, false, 20, 0
vertexAttribPointer: 6, 3, FLOAT, false, 20, 8
bindBuffer: ELEMENT_ARRAY_BUFFER, WebGLBuffer - ID: 1118
drawElements: TRIANGLES, 54, UNSIGNED_SHORT, 0VertexFragment
activeTexture: TEXTURE2
bindTexture: TEXTURE_2D, WebGLTexture - ID: 17
activeTexture: TEXTURE3
bindTexture: TEXTURE_2D, WebGLTexture - ID: 18
uniform3f: WebGLUniformLocation - ID: 339, -0.3985368153383868, -0.9171523356672743, 2.4403341762326758e-17
uniform3f: WebGLUniformLocation - ID: 340, 0, -0.9171523356672743, 0.3985368153383868
uniform3f: WebGLUniformLocation - ID: 341, 0, 0, 1
uniform3f: WebGLUniformLocation - ID: 342, -1, 0, 6.123233995736766e-17
activeTexture: TEXTURE0
bindTexture: TEXTURE_2D, WebGLTexture - ID: 145
uniformMatrix4fv: WebGLUniformLocation - ID: 360, false, [..(16)..]
uniformMatrix4fv: WebGLUniformLocation - ID: 361, false, [..(16)..]
uniform3f: WebGLUniformLocation - ID: 367, 1, 1, 4
uniformMatrix4fv: WebGLUniformLocation - ID: 368, false, [..(16)..]
uniform3f: WebGLUniformLocation - ID: 372, -3316.923173722793, 2951.914489552259, -10258.040268974077
bindVertexArrayOES: [object WebGLVertexArrayObjectOES]
bindBuffer: ARRAY_BUFFER, WebGLBuffer - ID: 1254
bindBuffer: ARRAY_BUFFER, WebGLBuffer - ID: 1255
bindBuffer: ARRAY_BUFFER, WebGLBuffer - ID: 1256
drawElements: TRIANGLES, 120, UNSIGNED_SHORT, 0VertexFragment
activeTexture: TEXTURE2
bindTexture: TEXTURE_2D, WebGLTexture - ID: 17
activeTexture: TEXTURE3
bindTexture: TEXTURE_2D, WebGLTexture - ID: 18
useProgram: WebGLProgram - ID: 49
uniform3f: WebGLUniformLocation - ID: 376, -4.8806683524653517e-17, -0.9171523356672743, -0.3985368153383868
uniform3f: WebGLUniformLocation - ID: 377, -0.3985368153383868, -0.9171523356672743, 2.4403341762326758e-17
uniform3f: WebGLUniformLocation - ID: 378, -1, 0, 6.123233995736766e-17
uniform3f: WebGLUniformLocation - ID: 379, -1.2246467991473532e-16, 0, -1
activeTexture: TEXTURE0
bindTexture: TEXTURE_2D, WebGLTexture - ID: 131
uniform1f: WebGLUniformLocation - ID: 391, 0
uniform1f: WebGLUniformLocation - ID: 400, 616.9919375574335
uniform1f: WebGLUniformLocation - ID: 401, 9.207999999920528
uniformMatrix4fv: WebGLUniformLocation - ID: 393, false, [..(16)..]
uniformMatrix4fv: WebGLUniformLocation - ID: 394, false, [..(16)..]
uniform2f: WebGLUniformLocation - ID: 402, 225, 240
uniform3f: WebGLUniformLocation - ID: 404, 0, 1, 4
uniformMatrix4fv: WebGLUniformLocation - ID: 405, false, [..(16)..]
uniform2f: WebGLUniformLocation - ID: 406, 0.16479996625863183, 0.3849127470218343
uniformMatrix4fv: WebGLUniformLocation - ID: 410, false, [..(16)..]
uniform3f: WebGLUniformLocation - ID: 403, 0.5322607871029793, -0.7855211485201578, 0.3156817697303555
uniform3f: WebGLUniformLocation - ID: 409, 3353.1070108583663, 503.82343502165713, -13365.5372258812
uniform1i: WebGLUniformLocation - ID: 411, 1
uniform4f: WebGLUniformLocation - ID: 416, 1, 1, 1, 1
uniform1f: WebGLUniformLocation - ID: 417, 1
uniform1f: WebGLUniformLocation - ID: 418, 1
createVertexArrayOES -> [object WebGLVertexArrayObjectOES]
...

Note the frequent and unnecessary redundant bindings:

activeTexture: TEXTURE2
bindTexture: TEXTURE_2D, WebGLTexture - ID: 17
activeTexture: TEXTURE3
bindTexture: TEXTURE_2D, WebGLTexture - ID: 18

Eliminating these redundant calls from our traces did not yet result in a visible performance improvement (e09e055, https://sites.mapbox.com/benchmap-js-results/stack/benchmap-karim-dev/runs/17) so I didn't pursue this idea any further at the moment, and it might need more fine-grained profiling or more data samples.

Fixes https://github.com/mapbox/gl-js-team/issues/455

Launch Checklist

  • briefly describe the changes in this PR
  • include before/after visuals or gifs if this PR includes visual changes
  • write tests for all new functionality
  • post benchmark scores
  • manually test the debug page
  • apply changelog label ('bug', 'feature', 'docs', etc) or use the label 'skip changelog'
  • add an entry inside this element for inclusion in the mapbox-gl-js changelog: <changelog>Improve globe-mercator transition and map load performance with globe projection</changelog>

@karimnaaji karimnaaji added the skip changelog Used for PRs that do not need a changelog entry label Jun 24, 2022
@karimnaaji karimnaaji force-pushed the karim/simplify-uniform-bindings branch from 6bbccbc to 501c74e Compare June 24, 2022 21:28
@karimnaaji karimnaaji closed this Jul 7, 2022
@mourner
Copy link
Member

mourner commented Jul 7, 2022

@karimnaaji Any comments? Was the refactoring unfeasible?

@karimnaaji
Copy link
Contributor Author

@mourner I might have accidentally closed this. It's still a good WIP and totally feasible, I'll pick that up again once benchmap has globe in the set of fixtures as it's related to improvement to the transition.

@karimnaaji karimnaaji reopened this Jul 7, 2022
@karimnaaji karimnaaji force-pushed the karim/simplify-uniform-bindings branch 2 times, most recently from ab160f0 to d6b5668 Compare July 8, 2022 22:01
@karimnaaji karimnaaji force-pushed the karim/simplify-uniform-bindings branch from d6b5668 to 7fe3bb1 Compare July 12, 2022 21:51
@karimnaaji karimnaaji changed the title WIP: Simplify uniform bindings Performance improvement and reduced shader workload with globe projection Jul 14, 2022
@karimnaaji karimnaaji added performance ⚡ Speed, stability, CPU usage, memory usage, or power usage and removed skip changelog Used for PRs that do not need a changelog entry labels Jul 14, 2022
@karimnaaji karimnaaji marked this pull request as ready for review July 14, 2022 17:04
@karimnaaji karimnaaji requested a review from mourner July 14, 2022 17:05
Copy link
Member

@mourner mourner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice win here, looks great!

src/render/uniform_binding.js Outdated Show resolved Hide resolved
@karimnaaji karimnaaji merged commit 04a4447 into main Jul 18, 2022
@karimnaaji karimnaaji deleted the karim/simplify-uniform-bindings branch July 18, 2022 19:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance ⚡ Speed, stability, CPU usage, memory usage, or power usage
Projects
None yet
2 participants