explorer: B1 viewport-aware facet counts (#234 step 3) by rdhyee · Pull Request #237 · isamplesorg/isamplesorg.github.io

rdhyee · 2026-05-25T05:51:13Z

Summary

The facet legend now scopes per-source / per-material / per-context / per-object_type counts to the current map viewport bbox instead of the whole dataset. Pan or zoom → counts recompute. This resolves the silent disagreement between the legend ("global pivot tool") and the rest of the page ("Samples in View", table count, map dots) that the 2026-05-22 design session called out and that #234 picks as direction B1.

This is step 3 of the #234 roadmap, after #235 (facetNote URL-load, merged) and #236 (search real-count, merged). Subsequent steps (A1 search-as-global-filter, C3 auto-promote to point mode) are independent and build on this.

Decisions locked with @rdhyee before implementation

Q	Decision
Q1 — where do coordinates live for the bbox JOIN?	No existing parquet carries both the four facet columns and lat/lon. JOIN `facets_url` to `lite_url` on `pid` for this PR; follow-up to bake a pre-joined parquet if real-world latency proves bad.
Q2 — progressive refinement now?	Single-pass full query, debounced 250 ms via the existing `facetCountsReqId` infrastructure. Defer coarse-then-fine scaffold until latency shows it's needed.
Q3 — error-path semantics?	Fall back to global baseline via existing `applyFacetCounts(key, null)` — matches current no-filter behavior; user can re-pan to retry.

Architectural changes

New isGlobalView() helper (altitude shortcut + rect-saturation check) decides when the cube fast-path stays valid. At alt > 10,000 km the camera is treated as global regardless of computeViewRectangle per-angle quirks, so legend totals don't jiggle on tiny camera rotations near the default zoom-out.
updateCrossFilteredCounts snapshots bboxSQL once at function entry. Cube fast-path gated additionally on bboxSQL === null. The no-filter + non-global case now falls through to the slow path (B1's whole point: counts reflect what's in view even with zero facet filters).
Slow path (when bboxSQL set): JOIN facets_url f ↔ lite_url l ON l.pid = f.pid and apply the bbox predicate against (l.latitude, l.longitude). buildCrossFilterWhere takes an optional column-prefix arg so the JOIN-shape WHERE qualifies columns with f. to avoid ambiguity with lite_url.source.
viewer.camera.moveStart → synchronous markFacetCountsRecomputing(). Italic-stale legend appears the instant the user starts panning, cleared when applyFacetCounts writes the post-moveEnd result.
viewer.camera.moveEnd → refreshFacetCounts(). Reuses the existing 250 ms debounce + facetCountsReqId stale guard — bursts coalesce to one query, superseded in-flight queries discard their result on await resume.

Test plan

quarto render explorer.qmd — passes
New tests/playwright/facet-viewport.spec.js — 3 specs pass locally
- zoom-in → counts shrink relative to global, per-source counts ≤ global; zoom-out → restore to within 1 %
- moveStart marks .recomputing synchronously (verified via in-page listener that runs after the explorer's)
- negative control: cold global-view boot leaves no .recomputing stuck
Regression suite — facetnote-url-load.spec.js (2), search-real-count.spec.js (3), url-roundtrip.spec.js (5) — all pass (10/10)
git diff --check upstream/main...HEAD — clean
Pre-deploy smoke gate on CI

Known follow-ups

Latency not formally benchmarked. The JOIN turns 4 parallel GROUP BY queries into 4 parallel JOIN+GROUP BY queries against the ≈300 MB lite_url parquet. If UX feels slow in real use, follow-up to bake a pre-joined facets-with-coords parquet and route the bbox path through it (Q1 fallback).
Cancellation-race spec deferred. The facetCountsReqId stale guard is in place but not exercised by a Playwright test (two pans back-to-back inside the 250 ms window). Add if Codex flags it as a coverage gap.
Progressive refinement (coarse-then-fine) explicitly out of scope — Q2 locked single-pass. File as a separate issue if latency benchmarks demand it.

Cross-refs

Closes the B1 row in explorer: architectural direction — make filter semantics coherent across all surfaces #234
Supersedes the bbox-scope portion of explorer: facet counts — remaining work after #157/#226 #230
Unblocks step 4 (A1 search-as-global-filter)

Roadmap context: #234. Direction is A1 + B1 + (C3-when-feasible, C2-with-warning) with progressive refinement underlying the dynamic surfaces.

The facet legend now scopes per-source / per-material / per-context / per-object_type counts to the current viewport bbox instead of the whole dataset. Pan or zoom → counts recompute. Resolves the silent disagreement between the legend and the "Samples in View" / table-count surfaces. What changed - New `isGlobalView()` helper: altitude shortcut (h > 10,000 km) plus a rect-saturation check on `paddedViewportBounds`. Returns true on the initial zoomed-out view so the legend doesn't jiggle on tiny rotations. - `updateCrossFilteredCounts` snapshots `bboxSQL` once at function entry. Cube fast-path is gated additionally on `bboxSQL === null` — the pre-aggregated cube has no spatial granularity, so it's invalid for any non-global view. The no-filter + non-global case now falls through to the slow path (B1's whole point: "what's in view" even with zero facet filters). - Slow path: when `bboxSQL` is set, JOIN `facets_url f` to `lite_url l` on `l.pid = f.pid` and apply the bbox predicate against `(l.latitude, l.longitude)`. `buildCrossFilterWhere` accepts an optional column prefix so the JOIN-shape WHERE qualifies columns with `f.` to avoid ambiguity with `lite_url.source`. - `viewer.camera.moveStart` → synchronous `markFacetCountsRecomputing()` so the legend shows italic-stale state the instant the user starts panning, instead of waiting for the 250 ms debounce. Cleared when `applyFacetCounts` writes the post-moveEnd query result. - `viewer.camera.moveEnd` → `refreshFacetCounts()`. Reuses the existing `facetCountsReqId` + 250 ms debounce, so bursts of moveEnd coalesce to one query and any superseded in-flight query discards its result on `await` resume via the stale guard. - On query error: existing `applyFacetCounts(key, null)` baseline fallback covers the Q3 decision (show globals, not error indicator). Q1 / Q2 / Q3 plan decisions (locked with rdhyee before implementation) - Q1: JOIN to `lite_url` for this PR; no existing parquet carries both the four facet columns and lat/lon. If real-world latency is bad, follow-up to bake a pre-joined facet+coords parquet. - Q2: Single-pass full query (no coarse-then-fine progressive scaffold yet — defer until latency is shown to need it). - Q3: On bbox-query throw, fall back to global baseline counts. Tests - New `tests/playwright/facet-viewport.spec.js`: 3 specs covering zoom-in-then-out (counts shrink, then restore to within 1 %), moveStart-sync `.recomputing` affordance, and the negative control that a cold global-view boot leaves no `.recomputing` stuck. - Regression: facetnote-url-load (2), search-real-count (3), url-roundtrip (5) all pass — 13 tests green locally. Known follow-ups - Latency was not formally benchmarked; the JOIN turns 4 parallel GROUP-BY queries into 4 parallel JOIN+GROUP-BY queries against the lite parquet (≈300 MB). If real-world UX feels slow, bake a pre-joined facets-with-coords parquet and route the bbox path through it (Q1 fallback). - A cancellation-race Playwright spec (two pans back-to-back inside the 250 ms window) is conceptually covered by the existing `facetCountsReqId` stale guard but not yet exercised by a test — add if Codex review flags it. Closes the B1 row in isamplesorg#234.

…#237 Codex caught three real issues + one documentation polish. 1. (BLOCKING) moveStart did not supersede in-flight or debounced facet count work. An older `updateCrossFilteredCounts` could land mid-pan, pass the stale guard at its `await` resume, call `applyFacetCounts`, and CLEAR `.recomputing` while writing pre-move counts — leaving the legend looking authoritative but stale until moveEnd's 250 ms debounce fired the new query. Fix: the moveStart listener now `clearTimeout(facetCountsDebounce); ++facetCountsReqId;` in addition to marking the italic-stale class. moveEnd's `refreshFacetCounts()` bumps the id AGAIN, harmlessly — supersession of already-superseded work, debounce coalesced. 2. (TEST GAP) The PR exercised the no-filter bbox path but NOT the filter-active JOIN path — exactly the new SQL shape this change introduces. Added a 4th Playwright spec `bbox-aware count under an active source filter (JOIN + filter)`: - activate a single source filter (cube fast-path at global view) - capture filtered material counts at global view (cube-derived) - fly to Cyprus → forces slow path with `f.source IN (…)` + JOIN - assert material counts shrink (a silent fallback to baseline on a column-ambiguity SQL error would write global counts, which would GROW back to the un-filtered globals — caught by the `cyprusTotal < filteredTotal` assertion) 3. (TOLERANCE TOO LOOSE) `zoom in → zoom out → restore` accepted a 1 % delta on the restored totals, but at `alt=15e6` the altitude shortcut in `isGlobalView()` forces the no-bbox baseline path → the counts should be value-by-value EXACTLY equal to the first global capture. Switched to `expect(restoredCounts).toEqual(globalCounts)`. A loose tolerance could hide a stale-read race writing a bbox-scoped count that just happens to look "close enough." 4. (POLISH) Sharpened the `moveStart` listener comment to explain why bumping the id is necessary (was missing the rationale). Test results: 4/4 new + 10/10 regression (facetnote-url-load, search-real-count, url-roundtrip) — 14 green locally. Codex's point about the altitude-shortcut UX (high-altitude pivot returns "global" even if camera angle doesn't show the full data extent) is an intentional product decision per the design rationale in `isGlobalView()`'s comment; nothing to change there.

rdhyee · 2026-05-25T16:52:44Z

Codex review iteration log (2 rounds → LGTM)

Round	Codex finding	Resolution	Commit
1	(BLOCKING) `moveStart` only marked `.recomputing`; did not bump `facetCountsReqId` or clear the pending debounce. An older `updateCrossFilteredCounts` could resume mid-pan, pass the stale guard, write pre-move counts, and clear `.recomputing` — leaving the legend looking authoritative but stale until moveEnd's debounce fired the new query.	moveStart now `clearTimeout(facetCountsDebounce); ++facetCountsReqId;` in addition to marking the italic state. moveEnd's `refreshFacetCounts()` bumps the id AGAIN — supersession of already-superseded work, debounce coalesced.	`63b0457`
1	Test coverage gap: the spec exercised the no-filter bbox path but NOT the filter-active JOIN path — exactly the new SQL shape this PR introduces.	Added `bbox-aware count under an active source filter (JOIN + filter)`: activate a single source filter (cube fast-path at global view), capture filtered material counts, fly to Cyprus → forces slow path with `f.source IN (…)` JOIN. Assertion `cyprusTotal < filteredTotal` catches the silent fallback-to-baseline case (column-ambiguity SQL errors fall back to unfiltered globals, which would GROW back instead of shrinking — caught).	`63b0457`
1	1 % restore tolerance was too loose. At `alt=15e6` the altitude shortcut forces the no-bbox baseline path, so restored counts should be value-by-value EXACTLY equal to the first global capture.	`expect(restoredCounts).toEqual(globalCounts)` — strict equality.	`63b0457`
1	Polish: moveStart comment missed the rationale for the id bump.	Comment expanded with the race-condition explanation.	`63b0457`
1	Documentation polish — high-altitude shortcut can return "global" for a non-fully-data-covering camera angle.	Intentional per the design rationale; no change. (Existing comment in `isGlobalView()` already explains the UX call.)	n/a

Round 2: LGTM (Codex)

Findings: none. I don't see a new correctness issue in 63b0457. […] moveStart double-bump: safe. […] JOIN + filter test exercises the intended SQL shape because the checkbox state is changed synchronously before the Cyprus fly. […] Handler mechanics: clearTimeout(null) is safe, facetCountsReqId is initialized, and the mark/clear/bump ordering is fine under JS's single-threaded event execution.

Final test count: 14 passed locally (4 new B1 specs + 10 regression: 2 facetnote-url-load + 3 search-real-count + 5 url-roundtrip).

Known follow-ups deferred per Q1/Q2 plan decisions, not raised by Codex:

Latency was not formally benchmarked; baking a pre-joined facets+coords parquet (Q1 fallback) waits on real-world UX feedback.
Cancellation-race Playwright spec (two pans back-to-back inside the 250 ms window) — not added; the facetCountsReqId stale guard is now exercised structurally by moveStart's invalidation, which Codex round 2 walked through and accepted.

Thanks codex for the round-1 catch on moveStart — the stale-and-not-italic interval was a real bug, not a theoretical one.

…s PR isamplesorg#237)

rdhyee added 2 commits May 24, 2026 22:50

rdhyee added a commit to rdhyee/isamplesorg.github.io that referenced this pull request May 27, 2026

Merge feat/b1-viewport-aware-facets for rdhyee staging deploy (mirror…

8bcbb4a

…s PR isamplesorg#237)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

explorer: B1 viewport-aware facet counts (#234 step 3)#237

explorer: B1 viewport-aware facet counts (#234 step 3)#237
rdhyee wants to merge 2 commits into
isamplesorg:mainfrom
rdhyee:feat/b1-viewport-aware-facets

rdhyee commented May 25, 2026

Uh oh!

rdhyee commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rdhyee commented May 25, 2026

Summary

Decisions locked with @rdhyee before implementation

Architectural changes

Test plan

Known follow-ups

Cross-refs

Uh oh!

rdhyee commented May 25, 2026

Codex review iteration log (2 rounds → LGTM)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant