From dbf9eea479c87fb807aa1a21184c41b22e9695c2 Mon Sep 17 00:00:00 2001
From: Universe <universe@grida.co>
Date: Sun, 22 Mar 2026 17:57:21 +0900
Subject: [PATCH] docs: document viewport culling investigation and findings

Linear O(n) viewport culling was benchmarked against real-world SVG fixtures
(up to 300K nodes). It regresses 8-13% on dense scenes where most nodes are
visible. Updated optimization.md item 12 to note spatial index requirement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
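Note: for readers skimming the patch, the linear culling step that the investigation document describes (skip a layer when its render bounds do not intersect the camera's world rect) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code that was benchmarked: `Rect` and `visible_layers` are hypothetical names, and the real bounds come from `GeometryCache` and the camera, whose APIs are not shown in this patch.

```rust
/// Axis-aligned rectangle in world (canvas) space.
/// Hypothetical stand-in for the cached render bounds and the camera world rect.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x: f32,
    y: f32,
    w: f32,
    h: f32,
}

impl Rect {
    /// True when the two rectangles overlap with non-zero area.
    fn intersects(&self, other: &Rect) -> bool {
        self.x < other.x + other.w
            && other.x < self.x + self.w
            && self.y < other.y + other.h
            && other.y < self.y + self.h
    }
}

/// Linear cull: one bounds test per layer, so the cost is O(n) per frame
/// no matter how many layers turn out to be visible.
/// Returns the indices of the layers that should still be drawn.
fn visible_layers(layer_bounds: &[Rect], world_rect: &Rect) -> Vec<usize> {
    layer_bounds
        .iter()
        .enumerate()
        .filter(|(_, bounds)| bounds.intersects(world_rect))
        .map(|(index, _)| index)
        .collect()
}

fn main() {
    let viewport = Rect { x: 0.0, y: 0.0, w: 1920.0, h: 1080.0 };
    let layers = [
        Rect { x: 100.0, y: 100.0, w: 50.0, h: 50.0 },  // fully inside
        Rect { x: 5000.0, y: 0.0, w: 50.0, h: 50.0 },   // far to the right, culled
        Rect { x: -25.0, y: 500.0, w: 50.0, h: 50.0 },  // straddles the left edge
    ];
    println!("{:?}", visible_layers(&layers, &viewport)); // prints [0, 2]
}
```

On a dense scene almost every layer passes `intersects`, so the scan costs n tests and skips almost no draw calls, which is consistent with the regressions reported in the investigation.
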
 .../feat-2d/investigation-viewport-culling.md | 62 +++++++++++++++++++
 docs/wg/feat-2d/optimization.md               | 19 +-----
 2 files changed, 63 insertions(+), 18 deletions(-)
 create mode 100644 docs/wg/feat-2d/investigation-viewport-culling.md

diff --git a/docs/wg/feat-2d/investigation-viewport-culling.md b/docs/wg/feat-2d/investigation-viewport-culling.md
new file mode 100644
index 0000000000..6b125d95fa
--- /dev/null
+++ b/docs/wg/feat-2d/investigation-viewport-culling.md
@@ -0,0 +1,62 @@
+---
+title: "Investigation: Viewport Culling & Camera Caching"
+status: rejected
+date: 2026-03-22
+---
+
+# Investigation: Viewport Culling & Camera Caching
+
+## Hypothesis
+
+During view-only camera transforms (pan/zoom), skip drawing layers whose bounds fall outside the visible viewport. Cache Camera2D derived values (view matrix, inverse, zoom, world rect) to avoid redundant per-frame math.
+
+## What Was Tried
+
+1. **Camera2D caching** — `warm_cache()` precomputes view matrix, inverse, zoom, and world rect once per mutation. Read accessors (`view_matrix()`, `rect()`, `get_zoom()`, `screen_to_canvas_point()`) return cached fields in O(1).
+
+2. **Viewport culling** — Before each `draw_layer` call, check if the layer's render bounds (from `GeometryCache`) intersect the camera's world rect. Skip layers that are entirely off-screen.
+
+## Results
+
+### Synthetic scenes (Criterion, CPU raster, 1920x1080)
+
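+Sparse grids where most nodes are off-screen. Values are the change in measured time relative to the baseline before these changes; negative is faster: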
+
+| Metric | 100 nodes | 1K nodes | 10K nodes |
+| ------ | --------- | -------- | --------- |
+| Pan    | ~same     | **−46%** | **−85%**  |
+| Zoom   | ~same     | **−32%** | **−81%**  |
+
+### Real-world SVGs (headless, CPU raster, 1920x1080)
+
+Dense content where most nodes overlap the viewport. Δ is the change in measured time relative to the baseline; positive is a regression:
+
+| Scene                            | Nodes | Pan Δ      | Zoom Δ     |
+| -------------------------------- | ----- | ---------- | ---------- |
+| Koppen-Geiger climate map (96MB) | 235K  | **+8.7%**  | **+13.3%** |
+| San Francisco Bay map (40MB)     | 85K   | **+11.0%** | −7.3%      |
+| Lorenz 3D attractor (20MB)       | 300K  | +3.5%      | ~same      |
+| Lyon fortification map (30MB)    | 34    | −2.0%      | −3.0%      |
+| Propane flame contours (30MB)    | 1.8K  | −6.5%      | −3.3%      |
+
+## Why It Failed on Real Content
+
+Linear viewport culling is **O(n) per frame** — every node's bounds are checked against the viewport. For dense scenes (maps, scientific visualizations), nearly all nodes pass the intersection test, so the check is pure overhead.
+
+The synthetic benchmarks were misleading: a sparse grid at 10K nodes has ~90% off-screen at any given viewport, so culling skips most work. Real documents are the opposite — content is concentrated in the viewport.
+
+## Conclusion
+
+- **Camera caching**: safe but negligible (~30ns/frame savings vs 200ms+ frame times)
+- **Linear viewport culling**: net negative on real content. Do not adopt without a spatial index.
+- **Actual bottleneck**: Skia path rasterization dominates frame time on large scenes (235K paths take 800ms per frame). CPU-side culling cannot reduce this when most paths are visible.
+
+## What Would Actually Help
+
+Per items 5, 6, 12, and 36 in `optimization.md`:
+
+- **Spatial index** (R-tree/quadtree, item 36) would turn the culling in item 12 from an O(n) scan into a query that costs roughly O(log n + k) for k visible nodes
+- **Tile-based raster cache** (item 6) would avoid re-rasterizing static content on camera change
+- **SkPicture caching** (item 5) with dirty-region invalidation would let Skia replay recorded ops instead of re-drawing paths
+
+The draw stage (Skia path rasterization) is where 95%+ of frame time goes on large scenes. Optimizations must target that.
diff --git a/docs/wg/feat-2d/optimization.md b/docs/wg/feat-2d/optimization.md
index 388668e21a..ab09a50f34 100644
--- a/docs/wg/feat-2d/optimization.md
+++ b/docs/wg/feat-2d/optimization.md
@@ -11,17 +11,14 @@ A summary of all discussed optimization techniques for achieving high-performanc
 ## Transform & Geometry
 
 1. **Transform Cache**
-
    - Store `local_transform` and derived `world_transform`.
    - Use dirty flags and top-down updates.
 
 2. **Geometry Cache**
-
    - Cache `local_bounds`, `world_bounds`.
    - Used for culling, layout, and hit-testing.
 
 3. **Flat Scene Graph + Parent Pointers**
-
    - Flat arena with parent/children relationships.
    - Enables O(1) access and traversal.
 
@@ -30,17 +27,14 @@ A summary of all discussed optimization techniques for achieving high-performanc
 ## Rendering Pipeline
 
 4. **GPU Acceleration (Skia Backend::GL/Vulkan)**
-
    - Use hardware compositing, filters, transforms.
 
 5. **Scene-Level Picture Caching**
-
    - Use `SkPicture` to record full-scene vector draw ops.
    - Serves as the always-up-to-date canonical snapshot.
    - Resolution-independent; ideal for rerendering or tile regeneration.
 
 6. **Tile-Based Raster Cache (Hybrid Rendering)**
-
    - Render the full viewport, take snapshot. debounced (after no more changes. e.g. 150ms)
    - Divide the snapshot into fixed-size tiles (e.g., 512×512).
    - When new area discovered, render the cached, non-overlapping parts with tile cache. only render newly discovered area.
@@ -48,24 +42,19 @@ A summary of all discussed optimization techniques for achieving high-performanc
    - Optional padding per tile to account for effects (blur, shadows).
 
 7. **Dynamic Mode Switching (Picture vs Tile)**
-
    - Render from `SkPicture` directly during normal zoom or active edits.
    - Fallback to raster tiles for zoomed-out or complex views.
    - Tile invalidation/redraw is driven by zoom level, camera transform, or frame budget.
 
 8. **Dirty & Re-Cache Strategy**
-
    - Nodes marked dirty will trigger re-recording of affected picture regions or tiles.
    - Use change tracking to only re-record minimum needed areas.
    - Recording large subtrees is expensive—optimize granularity based on tree structure.
 
 9. **Scene Cache Config / Strategy**
-
    - Defines how scene caching is organized.
    - Properties include:
-
      - `depth`:
-
        - `0` → Entire scene is one cache.
        - `1` → Cache per top-level container.
        - `n` → Cache at depth `n`, chunking deeper layers.
@@ -83,10 +72,8 @@ A summary of all discussed optimization techniques for achieving high-performanc
    - Cache accessors like `get_picture_cache_by_id()` support scoped re-rendering.
 
 10. **Will-Change Optimization**
-
     - Nodes marked with "will-change" are expected to become dirty soon.
     - Examples:
-
       - Image node waiting on async src resolution
       - Text node waiting on font availability
 
@@ -94,9 +81,7 @@ A summary of all discussed optimization techniques for achieving high-performanc
     - Prevents re-recording full subtrees—minimizes recording cost.
 
 11. **Flattened Render Command List**
-
     - Scene is compiled into a flat list of `RenderCommand` structs with resolved:
-
       - Transform
       - Clip bounds
       - Opacity
@@ -128,9 +113,8 @@ A summary of all discussed optimization techniques for achieving high-performanc
     - This model is essential for dynamic caching, parallel planning, and GPU-aware scheduling.
 
 12. **Dirty-Region Culling**
-
     - Use camera’s `visible_rect` to cull `world_bounds`.
-    - Optional: accelerate with quadtree or BVH.
+    - **Requires spatial index** (quadtree or BVH, see item 36). Linear O(n) culling was benchmarked and regressed 8-13% on dense real-world scenes (85K-235K nodes), because the per-node bounds check is pure overhead when most nodes are visible. See `investigation-viewport-culling.md` for full data.
 
 13. **Minimize Canvas State Changes**
 
@@ -309,7 +293,6 @@ Even if content is temporarily low-res, the tool still feels precise.
 ## Text & Glyph Optimization
 
 29. **Glyph Cache (Atlas or Paragraph Caching)**
-
     - Cache rasterized or vector glyphs used across the document.
     - Prevents redundant layout or rendering of text.
     - Essential for high-DPI or frequently zoomed views.