perf(mesh-layer): memoize per-Device GLSL shader assembly #541

Draft
kylebarron wants to merge 1 commit into main from shader-assembler-memo

@kylebarron
Member

Summary

Wrap ShaderAssembler.getDefaultShaderAssembler() in a Proxy that memoizes assembleGLSLShaderPair results by (modules-by-name, defines, vs, fs) and inject it into MeshTextureLayer's Model via getShaders(). All other methods (hook registration, default modules, WGSL assembly) delegate to the inner assembler, so deck.gl's globally-registered hook functions remain visible.

The cache lives in a WeakMap<Device, ShaderAssembler> — two Deck instances on the same page get independent caches, and the entries die with the device.
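
A minimal sketch of the pattern described above, not the code in this PR: `FakeAssembler`, `cacheKey`, `memoizeAssembler`, and `getMemoAssembler` are hypothetical names, and the stand-in assembler only mimics the shape of the real `ShaderAssembler`. It shows a `Proxy` that intercepts one method, delegates everything else to the inner object, and is stored per device in a `WeakMap`.

```js
// Hypothetical stand-in for a shader assembler; NOT the luma.gl class.
class FakeAssembler {
  constructor() {
    this.assembleCalls = 0;
    this.hooks = [];
  }
  addShaderHook(hook) {
    this.hooks.push(hook);
  }
  assembleGLSLShaderPair(props) {
    this.assembleCalls += 1; // stands in for the regex-heavy assembly work
    return { vs: `/* assembled */\n${props.vs}`, fs: `/* assembled */\n${props.fs}` };
  }
}

// Assumed key shape: module names, sorted defines, and the raw sources.
function cacheKey(props) {
  const modules = (props.modules ?? []).map((m) => m.name).join("|");
  const defines = JSON.stringify(Object.entries(props.defines ?? {}).sort());
  return [modules, defines, props.vs, props.fs].join("::");
}

// Wrap `inner` so that only assembleGLSLShaderPair is memoized.
function memoizeAssembler(inner) {
  const cache = new Map();
  return new Proxy(inner, {
    get(target, prop) {
      if (prop === "assembleGLSLShaderPair") {
        return (props) => {
          const key = cacheKey(props);
          if (!cache.has(key)) {
            cache.set(key, target.assembleGLSLShaderPair(props));
          }
          return cache.get(key);
        };
      }
      // Everything else (hook registration, etc.) goes to the inner object.
      const value = Reflect.get(target, prop, target);
      return typeof value === "function" ? value.bind(target) : value;
    },
  });
}

// One memoizing wrapper per device; entries are collectable with the device.
const perDevice = new WeakMap();
function getMemoAssembler(device, inner) {
  let memo = perDevice.get(device);
  if (!memo) {
    memo = memoizeAssembler(inner);
    perDevice.set(device, memo);
  }
  return memo;
}

const device = {}; // stand-in for a luma.gl Device
const assembler = getMemoAssembler(device, new FakeAssembler());
```

Because the wrapper is keyed on the device object itself, a second device gets a fresh cache even when both share the same inner assembler.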

Why

Inside @luma.gl/shadertools's assembleGLSLShaderPair, every shader-module's source is regex-parsed by validateShaderModuleUniformLayout and the fully-assembled source is regex-parsed by warnIfGLSLUniformBlocksAreNotStd140. There is no result cache anywhere in the luma stack — every Model constructor re-runs the full assembly + validation work.

Within a RasterTileLayer, every tile sublayer passes identical inputs to the assembler (same modules, same vs/fs source). With many tile sublayers active (e.g. a usgs-topo mosaic), this work runs hundreds of times per assembly burst even though the inputs are byte-identical.

Diagnostics

The package now exports two accessors so apps can verify cache effectiveness from devtools:

```js
import {
getMemoShaderAssemblerStats,
getMemoShaderAssemblerMissLog,
} from "@developmentseed/deck.gl-raster";

getMemoShaderAssemblerStats(deckInstance.deck.device);
// → { hits: 410, misses: 1, entries: 1 }

getMemoShaderAssemblerMissLog(deckInstance.deck.device);
// → ["createTexture|cutlineBbox::...::vs::fs"] (first 20 miss keys)
```
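
For illustration only, the counters behind accessors like these could be kept as in the sketch below. `createMemoStats`, `lookup`, and `MAX_MISS_LOG` are hypothetical names, and the key string is made up; the only detail taken from the PR is that the miss log holds the first 20 miss keys.

```js
const MAX_MISS_LOG = 20; // the miss log keeps only the first 20 miss keys

// Hypothetical memo cache that records hits, misses, and early miss keys.
function createMemoStats() {
  const cache = new Map();
  const missLog = [];
  let hits = 0;
  let misses = 0;
  return {
    lookup(key, compute) {
      if (cache.has(key)) {
        hits += 1;
        return cache.get(key);
      }
      misses += 1;
      if (missLog.length < MAX_MISS_LOG) missLog.push(key);
      const value = compute();
      cache.set(key, value);
      return value;
    },
    getStats: () => ({ hits, misses, entries: cache.size }),
    getMissLog: () => [...missLog], // copy, so callers cannot mutate the log
  };
}

// 411 lookups with identical inputs: one miss, then 410 hits.
const memo = createMemoStats();
for (let i = 0; i < 411; i++) {
  memo.lookup("createTexture|cutlineBbox::vs::fs", () => ({}));
}
console.log(memo.getStats()); // { hits: 410, misses: 1, entries: 1 }
```

A miss count that keeps growing, together with the logged keys, would point at whichever input varies between sublayers.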

Status

Draft — opened for review later. In initial testing in usgs-topo this change appeared to cause a shader compile error (`'DECKGL_FILTER_COLOR' : no matching overloaded function found`). The error was bisected against #540 but the result was inconclusive — it persists with or without this change wired in. Needs further investigation before this PR moves forward.

Suspected mechanism for the assembler-memo-related risk: deck.gl's `getShaderAssembler` does `_hookFunctions.length = 0` and re-adds hooks on every device init. If our cache populates during a window where hooks aren't yet registered (or were just reset), the cached assembled source would lack the hook function declarations. The cache key doesn't include `hookFunctions` content, so it can't recover from this. A real fix likely needs to either invalidate the cache on hook changes or include a hooks version in the cache key.
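
One way the hooks-version idea might look, as a sketch under stated assumptions: `VersionedHooks`, `assemble`, and `createVersionedMemo` are hypothetical, the hook signature is simplified, and nothing here is deck.gl or luma.gl API. The point is only that a version counter bumped on every reset or registration makes a stale entry unreachable.

```js
// Hypothetical hook registry that bumps a version on every change.
class VersionedHooks {
  constructor() {
    this.hooks = [];
    this.version = 0;
  }
  reset() {
    this.hooks.length = 0; // mirrors the `_hookFunctions.length = 0` reset
    this.version += 1;
  }
  add(hook) {
    this.hooks.push(hook);
    this.version += 1;
  }
}

// Stand-in assembly: prepend one (simplified) declaration per hook.
function assemble(hooks, fs) {
  const decls = hooks.hooks.map((h) => `void ${h}(inout vec4 color) {}`).join("\n");
  return `${decls}\n${fs}`;
}

// Memoize on (hooks version, source) instead of on source alone.
function createVersionedMemo(hooks) {
  const cache = new Map();
  return (fs) => {
    const key = `${hooks.version}::${fs}`;
    if (!cache.has(key)) cache.set(key, assemble(hooks, fs));
    return cache.get(key);
  };
}

const FS = "void main() { DECKGL_FILTER_COLOR(fragColor); }";
const hooks = new VersionedHooks();
const memoAssemble = createVersionedMemo(hooks);

hooks.reset();
const early = memoAssemble(FS); // assembled while no hooks are registered
hooks.add("DECKGL_FILTER_COLOR");
const late = memoAssemble(FS); // version changed, so this re-assembles

console.log(early.includes("void DECKGL_FILTER_COLOR")); // false
console.log(late.includes("void DECKGL_FILTER_COLOR")); // true
```

Entries from older versions stay in the map in this sketch; clearing the cache whenever the version changes would bound memory at the cost of one extra assembly per change.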

Test plan

  • Unit tests for cache hit/miss/delegation are in this PR (6 tests, all passing)
  • Reproduce + diagnose the `DECKGL_FILTER_COLOR` shader compile error in a minimal example
  • Confirm `getMemoShaderAssemblerStats` shows healthy hit ratio in usgs-topo
  • Investigate whether to also upstream the memoization into `@luma.gl/shadertools` itself (benefits all deck.gl users)

🤖 Generated with Claude Code

Wrap `ShaderAssembler.getDefaultShaderAssembler()` in a `Proxy` that memoizes
`assembleGLSLShaderPair` results by `(modules-by-name, defines, vs, fs)` and
inject it into `MeshTextureLayer`'s Model via `getShaders()`. All other
methods (hook registration, default modules, WGSL assembly) delegate to the
inner assembler, so deck.gl's globally-registered hook functions remain
visible.

Cache lives in a `WeakMap<Device, ShaderAssembler>`: two Deck instances on
the same page get independent caches, and the entries die with the device.
Within a `RasterTileLayer`, every tile sublayer passes identical inputs to
the assembler, so the per-Device cache collapses N regex-heavy assembly
passes into one.

Also exposes diagnostic accessors `getMemoShaderAssemblerStats(device)` and
`getMemoShaderAssemblerMissLog(device)` from the package entry point so apps
can confirm cache effectiveness from devtools (and identify which inputs are
varying when the cache underperforms).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kylebarron
Member Author

kylebarron commented May 14, 2026

Status note (drafted by Claude Code):

With #540's mesh-ref stabilization + module-aware Model rebuild in place, the steady-state per-frame assembleGLSLShaderPair cost goes away. In the naip-mosaic example, the NDVI slider went from ~120 ms / frame to instant without this PR's memoization in play.

#541 still has residual benefit at one-shot moments:

  • Initial paint: first load assembles per tile; the proxy cache would amortize to one assembly + N−1 hits.
  • Panning that loads new tiles: each newly-loaded tile assembles.
  • Render-mode switch: MeshTextureLayer.updateState (per #540) flips changeFlags.extensionsChanged when the renderPipeline module list changes, which fires one Model rebuild per visible tile. With ~10 tiles in naip-mosaic, that's ~10× assembly at the moment of the switch.
  • Heavy mosaic apps: usgs-topo, with the ~411 active sublayers cited in the PR body, is where the savings would be largest. naip-mosaic's tile count is small enough that the absolute savings are sub-perceptible.

Blocker: the `'DECKGL_FILTER_COLOR' : no matching overloaded function found` compile error diagnosed in the PR body is a real correctness issue (cache entries outliving the `_hookFunctions.length = 0` reset in deck.gl's `getShaderAssembler`), not just an edge case. The cache key would need to include some form of hook-functions version.

Suggested path:

  1. Ship #540 (perf(raster-layer): avoid re-compiling shader Model as much as possible) standalone; it already does what naip-mosaic needs.
  2. Park this PR until the hook-functions cache-key invalidation is solved.
  3. Consider upstreaming the memoization into @luma.gl/shadertools — broader ecosystem benefit and the natural place to handle hook-functions invalidation rigorously, rather than in a downstream proxy.
