perf improvements #306

Merged
wass08 merged 1 commit into main from perf/viewer-render-improvements on May 13, 2026

@wass08 (Collaborator) commented May 13, 2026

Performance investigation pass on the WebGPU viewer. Ships measurable GPU-time reductions on the rendering path and adds a runtime diagnostic toolkit for ongoing thermal/perf debugging — ?perf overlay with real GPU-time-per-frame measurement and ?disable=... URL flags to ablate individual passes without code changes.

What's in this PR

Rendering perf improvements (always-on)

  • shadow-radius: 3 → 2 on the main directional light (lights.tsx). ~30% cheaper PCF sampling on every shadow-receiving fragment, visually near-identical.
  • DPR ceiling 1.5 → 1.25 on coarse-pointer devices (viewer/index.tsx). ~30% fewer fragments on every pass for phones/tablets; desktops keep 1.5.
  • Site groundShape → meshBasicMaterial (site-renderer.tsx). The ground fill is just the canvas background color used as a depth-buffer occluder, so PBR + 16 lights + receiveShadow was wasted work over almost the whole viewport. The previous meshStandardMaterial block is left as a commented reference.
  • glassMaterial: MeshStandardNodeMaterial → MeshLambertNodeMaterial (lib/materials.ts). Glass doesn't need PBR specular/roughness/metalness; Lambert is a fraction of the cost and still responds to the existing theme-aware light tinting (so the glass shifts naturally between light/dark mode).
  • Drop castShadow + receiveShadow on the window root mesh (window-renderer.tsx). The root mesh has its material overridden to an invisible hitbox at runtime — it was casting shadows of an invisible box every frame.
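
For readers who want to sanity-check the "~30% fewer fragments" figure, here is a minimal sketch of the DPR clamp. The helper names (`clampDpr`, `fragmentRatio`) and constants are hypothetical and not taken from viewer/index.tsx; only the 1.5 / 1.25 ceilings and the `(pointer: coarse)` media query come from this PR.

```typescript
// Hypothetical helper names; only the two ceilings come from the PR text.
const DESKTOP_DPR_CEILING = 1.5;
const COARSE_POINTER_DPR_CEILING = 1.25;

// Clamp the device pixel ratio to the ceiling for the current pointer type.
function clampDpr(deviceDpr: number, coarsePointer: boolean): number {
  const ceiling = coarsePointer ? COARSE_POINTER_DPR_CEILING : DESKTOP_DPR_CEILING;
  return Math.min(deviceDpr, ceiling);
}

// Fragment count scales with the square of the DPR, because both the
// width and the height of the render target scale with it.
function fragmentRatio(newDpr: number, oldDpr: number): number {
  return (newDpr / oldDpr) ** 2;
}

// In a browser the inputs would come from (assumed wiring):
//   const coarse = window.matchMedia('(pointer: coarse)').matches;
//   const dpr = clampDpr(window.devicePixelRatio, coarse);
```

With these definitions, `fragmentRatio(1.25, 1.5)` is (1.25 / 1.5)², roughly 0.69, which matches the ~30% reduction quoted above for devices that were previously pinned at the 1.5 ceiling.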

Diagnostic toolkit (?perf and ?disable=...)

  • ?perf overlay — mounts an upgraded <PerfMonitor /> showing FPS, real GPU ms per frame (avg + max), per-frame draw calls / triangles, dirty-node count, and a scene-graph breakdown by drawable type (MESH, LINE, SPRITE, LIGHT). (perf-monitor.tsx)

  • GPU-time measurement via device.queue.onSubmittedWorkDone() (lib/gpu-perf.ts). The custom RenderPipeline.render() path bypasses three.js's trackTimestamp infrastructure, so timestamp queries can't see our work. Instead we time from CPU submit to GPU-done via the native WebGPU promise — accurate per-frame GPU duration with no instrumentation overhead. Samples are pushed from post-processing.tsx (PostProcessing path) and DebugRenderer (raw path).

  • info.autoReset = false + explicit info.reset() per window in PerfMonitor. The custom RenderPipeline.render() path doesn't trigger three.js's automatic per-frame info reset, so info.render.calls accumulated across frames and the previous display showed lifetime totals. The overlay now shows true per-frame averages.

  • ?disable=... URL flags (post-processing.tsx, lights.tsx) — comma-separated subset of:

    • ao — skip SSGI entirely (and denoise, since denoise has nothing to denoise)
    • denoise — keep SSGI but feed raw noisy AO straight through (isolates denoise cost)
    • outline — skip the merged-outline node and its 14 internal RTs
    • postFx — bypass the whole RenderPipeline and use renderer.render(scene, camera) directly (isolates raw scene-render cost from any post-FX overhead)
    • shadows — skip the shadow-map render pass

    Each flag prevents allocation + per-frame work for that stage, so device-temperature deltas across combos isolate which pass is the actual culprit. Picked up once at pipeline build — reload after changing the URL.
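
As an illustration of how the `?perf` and `?disable=...` flags could be read once at pipeline build, here is a minimal sketch. The function names and the choice to silently ignore unknown tokens are assumptions for illustration; the actual parsing in post-processing.tsx and lights.tsx may differ.

```typescript
// The five flag names come from the PR; everything else here is illustrative.
type DisableFlag = 'ao' | 'denoise' | 'outline' | 'postFx' | 'shadows';

const KNOWN_FLAGS: readonly DisableFlag[] = ['ao', 'denoise', 'outline', 'postFx', 'shadows'];

// Parse the comma-separated ?disable=... value. Unknown tokens are
// dropped (an assumption; the real code might warn instead).
function parseDisableFlags(search: string): Set<DisableFlag> {
  const raw = new URLSearchParams(search).get('disable') ?? '';
  const flags = new Set<DisableFlag>();
  for (const token of raw.split(',')) {
    const name = token.trim();
    const match = KNOWN_FLAGS.find((flag) => flag === name);
    if (match) flags.add(match);
  }
  return flags;
}

// ?perf carries no value, so only its presence matters.
function isPerfOverlayEnabled(search: string): boolean {
  return new URLSearchParams(search).has('perf');
}

// Disabling AO also disables denoise, since there is nothing left to denoise.
function shouldRunDenoise(flags: Set<DisableFlag>): boolean {
  return !flags.has('ao') && !flags.has('denoise');
}
```

In a browser these would typically be called once with `window.location.search`, which fits the "reload after changing the URL" behaviour described above.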

Other

  • A few unrelated import reorderings touched by the formatter on save in item-renderer.tsx and node-renderer.tsx — harmless, pure ordering changes.
  • getMaterialForOriginal return type relaxed from MeshStandardNodeMaterial to Material so the glass case (now Lambert) typechecks.

How to use the diagnostic toolkit

?perf                       overlay on, full pipeline
?perf&disable=outline       overlay on, no outline
?perf&disable=ao,denoise    overlay on, no AO/denoise
?perf&disable=postFx        overlay on, raw renderer.render path only
?perf&disable=ao,denoise,outline,shadows,postFx
                            overlay on, bare scene render — true baseline

The deltas between these tell you exactly how much each pass costs. Reading the GPU number: at 50fps the budget is 20ms, so gpuMs / 20 ≈ GPU utilization. Sustained 10ms+ is the "warm device" zone; <4ms is cool.
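
The measurement and the utilization rule of thumb can be sketched as follows. This is not the code in lib/gpu-perf.ts: `GpuTimeSampler`, `WorkDoneQueue`, and `gpuUtilization` are hypothetical names, and the queue is typed structurally so the sketch compiles without WebGPU type packages. A real `GPUQueue` satisfies the interface through its `onSubmittedWorkDone()` method. Note that the interval is wall-clock time from submit to completion, so it would include any queueing delay before the GPU starts the work.

```typescript
// Structural stand-in for GPUQueue (assumption: only this method is needed).
interface WorkDoneQueue {
  onSubmittedWorkDone(): Promise<unknown>;
}

class GpuTimeSampler {
  private samples: number[] = [];
  private readonly now: () => number;

  constructor(now: () => number = () => performance.now()) {
    this.now = now;
  }

  // Call right after the frame's work is submitted; resolves with the
  // elapsed milliseconds once the GPU reports that work as done.
  async measure(queue: WorkDoneQueue): Promise<number> {
    const start = this.now();
    await queue.onSubmittedWorkDone();
    const elapsed = this.now() - start;
    this.record(elapsed);
    return elapsed;
  }

  // Store one sample in the current reporting window.
  record(ms: number): void {
    this.samples.push(ms);
  }

  // Return avg/max for the window and start a new one.
  flush(): { avg: number; max: number; count: number } {
    const count = this.samples.length;
    if (count === 0) return { avg: 0, max: 0, count: 0 };
    const sum = this.samples.reduce((total, ms) => total + ms, 0);
    const max = Math.max(...this.samples);
    this.samples = [];
    return { avg: sum / count, max, count };
  }
}

// Fraction of the frame budget spent on the GPU: gpuMs / (1000 / fps).
function gpuUtilization(gpuMs: number, targetFps: number): number {
  return gpuMs / (1000 / targetFps);
}
```

For example, `gpuUtilization(10, 50)` gives 0.5: a sustained 10ms frame uses half of the 20ms budget at 50fps.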

Test plan

  • bun typecheck from private-editor root — green ✓
  • Visual check on a typical scene, light + dark themes — confirm: ground still occludes, windows still look like glass, shadows still render, selection outline still appears
  • Compare GPU ms with and without the change on the same scene+camera using ?perf — expect a meaningful reduction on idle thermal load
  • Confirm ?perf&disable=ao,denoise,outline,postFx still renders (the fallback path)
  • Test on a mobile device to confirm the DPR clamp engages ((pointer: coarse) media query)

@wass08 wass08 merged commit 90c948d into main May 13, 2026
1 check passed