Follow-up to #156 / PR #162 (Phase 5 unified explorer). Captured during post-merge review on rdhyee staging at https://rdhyee.github.io/isamplesorg.github.io/explorer.html. Filing for design + scoping rather than as immediate-blocker tasks.
1. Number-of-samples discrepancy in hero copy
Observation (Raymond): page subtitle says "Search and explore 6.7 million material samples" but the stats panel reports e.g. `5,980,282 samples in view`. The 6.7M is the dataset total; the lower number is samples with coordinates (likely — needs verification).
Reaction (CC): yes, the gap is georeferenced rows. The per-source facet counts compound the confusion — a user sees SESAR=4.6M in the legend, then a smaller in-view count, with no explanation. Two non-exclusive fixes:
- Hero copy: `6.7M samples · 6.0M with coordinates` (or similar — confirm exact number against the lite parquet)
- Separate "dataset total" from "on the map" in the side panel so the relationship is legible
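A minimal sketch of the first fix, assuming both counts are read from the data at load time rather than hard-coded. `formatMillions` and `heroCopy` are hypothetical helper names, and the 6,700,000 total used below is a rounded stand-in until the exact figure is confirmed against the lite parquet.

```javascript
// Hypothetical helper: format a sample count as millions with one decimal.
function formatMillions(n) {
  return (n / 1e6).toFixed(1) + "M";
}

// Build the hero subtitle from the two counts.
// Both arguments are placeholders for values read from the data.
function heroCopy(totalSamples, georeferencedSamples) {
  return (
    formatMillions(totalSamples) + " samples · " +
    formatMillions(georeferencedSamples) + " with coordinates"
  );
}

// Stand-in total (rounded) and the in-view count quoted above.
console.log(heroCopy(6700000, 5980282));
```

Deriving both numbers from the same source keeps the subtitle and the stats panel from drifting apart again.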
2. Tags on the page
Current: `PARQUET`, `SPATIAL`, `H3`, `PERFORMANCE`, `ISAMPLES` rendered as a tag row near the title.
Observation (Raymond): rethink — these read like internal repo categorization, not user-facing.
Reaction (CC): agree. Options: (a) drop the tag bar entirely — this is the marquee page, doesn't need self-categorization; (b) replace with user-facing tags like `interactive`, `globe`, `search`, `samples`. Lean toward (a).
3. H3 res4 / res6 / res8 labels
Current: the "How This Demo Works" table uses literal H3 resolution numbers and the stats panel shows `H3 Res4` etc.
Observation (Raymond): technically accurate but maybe too techie.
Reaction (CC): agree. Labels could be `Continents (instant)` → `Regions` → `Neighborhoods` → `Individual samples`. Keep the "instant / 5K / 1.4M / 90M" perf story in the deep-dive table but don't front-load H3 jargon.
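One way to keep the H3 numbers internal is a single lookup from tier key to display label. This is only a sketch: the keys `res4`, `res6`, `res8`, and `points` are assumed names, and the pairing of each resolution with a label is inferred from the order of the proposed labels, not confirmed.

```javascript
// Assumed mapping from internal tier key to user-facing label.
const TIER_LABELS = new Map([
  ["res4", "Continents (instant)"],
  ["res6", "Regions"],
  ["res8", "Neighborhoods"],
  ["points", "Individual samples"],
]);

// Fall back to the raw key so an unmapped tier is still visible.
function tierLabel(tierKey) {
  return TIER_LABELS.get(tierKey) ?? tierKey;
}
```

The stats panel would call `tierLabel(...)` wherever it currently prints `H3 Res4`, while the deep-dive table keeps the literal resolutions.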
4. Search results vs map sync
Observation (Raymond): at https://rdhyee.github.io/isamplesorg.github.io/explorer.html?search=pottery+Cyprus the search panel says "no results" but all the dots stay on the map. Expected: empty map.
Reaction (CC): current behavior — search only filters `#searchResults` (the side list), doesn't gate the map layer. Easiest model:
- When search returns N results: render only those N as points on the globe (drop to point mode if needed; H3 summaries don't carry FTS-aware counts).
- When search returns 0: clear the layer, surface a banner.
- When search input is empty: revert to current behavior.
Trade-off: cluster mode + active search is awkward — the H3 summary parquets aren't text-indexed. Either auto-drop to point mode on active search (clean), or keep cluster colors but hide non-matching dominant-source clusters (messy). Lean clean.
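The three-case model above, with the "lean clean" choice, can be sketched as a pure function that decides what the map layer should show. `planMapLayer` and the shape of its return value are hypothetical; the real layer code would consume the plan and do the rendering.

```javascript
// Decide what the map layer should render for a given search state.
// `query` is the raw search box text; `results` is the array of matches
// (ignored when the query is empty).
function planMapLayer(query, results) {
  const text = query.trim();
  if (text === "") {
    // No active search: keep the current cluster/point behavior.
    return { mode: "default", points: null, banner: null };
  }
  if (results.length === 0) {
    // Active search with no hits: clear the layer and say why.
    return { mode: "empty", points: [], banner: 'No samples match "' + text + '"' };
  }
  // Active search with hits: drop to point mode and show only the matches.
  return { mode: "points", points: results, banner: null };
}
```

Keeping the decision separate from the rendering makes the empty-map case for `?search=pottery+Cyprus` testable without a globe.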
5. Click table-row → fly camera
Observation (Raymond): in the table view, clicking a result should make the globe zoom + rotate to that sample.
Reaction (CC): straightforward. The hash deep-link path (`#v=1&lat=…&pid=…`) already does the camera flight + sample-card update; just need a click handler on table rows that pushes that hash. Implement once table is the active view; switching back to globe should restore camera to that sample.
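A rough sketch of the handler's core, assuming the row carries the fields the existing hash format needs. Only `v`, `lat`, and `pid` are named above; the remaining fields are elided there, so `buildDeepLinkHash` takes whatever the caller passes. The function name, the `fieldsForRow` helper, and the `tr[data-pid]` selector are all hypothetical.

```javascript
// Build a deep-link hash from a table row's fields.
// Pass the fields in the order the existing hash format expects.
function buildDeepLinkHash(fields) {
  const params = new URLSearchParams();
  params.set("v", "1");
  for (const [key, value] of Object.entries(fields)) {
    params.set(key, String(value));
  }
  return "#" + params.toString();
}

// Browser-only wiring, shown as comments because it needs a DOM:
// tableEl.addEventListener("click", (event) => {
//   const row = event.target.closest("tr[data-pid]");
//   if (row) location.hash = buildDeepLinkHash(fieldsForRow(row));
// });
```

Note that `URLSearchParams` percent-encodes values and writes spaces as `+`; whether the existing hash parser decodes that form would need checking before adopting this.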
6. Table paging in URL
Observation (Raymond): table view should have paging parameters in the URL.
Reaction (CC): `page=N` (1-indexed) is the common case. `TABLE_PAGE_SIZE = 100` is currently a constant — keep it so unless we have a clear reason to expose. Wire `page=` into `writeQueryState` and the existing URL-hydration path; `prevBtn`/`nextBtn` already update internal `page` state, just need to call `writeQueryState` after.
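A sketch of the read and write halves for `page=`, assuming it lives in the query string alongside `search`. `readPage`, `writePage`, and `pageOffset` are hypothetical helpers that `writeQueryState` and the hydration path could call; omitting `page` when it is 1 is a design choice here, not existing behavior.

```javascript
const TABLE_PAGE_SIZE = 100;

// Read a 1-indexed page from a query string; anything invalid becomes 1.
function readPage(search) {
  const raw = new URLSearchParams(search).get("page");
  if (raw === null || !/^[1-9]\d*$/.test(raw)) return 1;
  return Number(raw);
}

// Return a new query string with `page` set, leaving other params intact.
// Page 1 is dropped so default URLs stay short.
function writePage(search, page) {
  const params = new URLSearchParams(search);
  if (page <= 1) params.delete("page");
  else params.set("page", String(page));
  return params.toString();
}

// Row offset for a 1-indexed page.
function pageOffset(page, pageSize = TABLE_PAGE_SIZE) {
  return (page - 1) * pageSize;
}
```

With this shape, `prevBtn`/`nextBtn` would update `page` as they do now and then write the result of `writePage(location.search, page)` back through `writeQueryState`.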
7. The big design question: what should facet counts show?
Observation (Raymond): at any given state of the page (search box content + facet selections + view + viewport), what should the facet counts reflect?
Reaction (CC): this is the question worth settling before any of the smaller items. Three honest options:
- (A) Cross-filter (what Codex's recent fix landed on): count for each value = "samples matching all OTHER active facet filters." Drill-out story — "if I added this, how many more." Counts respond to checkboxes only, not search.
- (B) Conjunctive: count = "samples matching ALL currently-applied filters, including this facet's other selections." Counts shrink monotonically as you narrow. More predictable; less drill-out useful.
- (C) Total: count = "samples in this category, full dataset." Counts never change. Simplest, least informative.
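To make the difference between the three options concrete, here is a toy sketch that computes counts for one facet under each semantics. The rows, the `material` facet, and the function names are invented for illustration; the real counts come from queries against the parquet files, not in-memory arrays.

```javascript
// True if `row` passes every active facet selection, optionally skipping one facet.
// `selections` maps facet name -> Set of selected values (empty Set = no filter).
function matches(row, selections, skipFacet = null) {
  return Object.entries(selections).every(
    ([facet, selected]) =>
      facet === skipFacet || selected.size === 0 || selected.has(row[facet])
  );
}

// Count rows per value of `facet` under mode "cross" (A), "conjunctive" (B), or "total" (C).
function facetCounts(rows, facet, selections, mode) {
  const counts = new Map();
  for (const row of rows) {
    let keep = true; // "total": every row counts
    if (mode === "cross") keep = matches(row, selections, facet);
    if (mode === "conjunctive") keep = matches(row, selections);
    // Register every value so zero counts stay visible.
    counts.set(row[facet], (counts.get(row[facet]) ?? 0) + (keep ? 1 : 0));
  }
  return counts;
}

// Toy data: source=SESAR and material=rock are both checked.
const toyRows = [
  { source: "SESAR", material: "rock" },
  { source: "SESAR", material: "soil" },
  { source: "OPENCONTEXT", material: "rock" },
  { source: "GEOME", material: "tissue" },
];
const toySelections = { source: new Set(["SESAR"]), material: new Set(["rock"]) };

// Counts for the `source` facet:
// cross:       SESAR 1, OPENCONTEXT 1, GEOME 0
// conjunctive: SESAR 1, OPENCONTEXT 0, GEOME 0
// total:       SESAR 2, OPENCONTEXT 1, GEOME 1
```

Under (A) the unchecked `OPENCONTEXT` row still shows 1, which is the drill-out signal; under (B) it drops to 0 because the facet's own selection is applied to it.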
Sub-questions that fall out:
- Should counts respect viewport? Probably no — keep counts dataset-wide for predictability across pan/zoom. The map already shows "in view" via dot density.
- Should counts respect search? Codex's recent fix says no — search is for finding specific samples; facets are for narrowing the dataset. That decoupling stops counts dimming based on text-match co-occurrence.
- Should counts respect view (globe vs table)? No reason for them to differ; same dataset.
Recommendation: keep (A) as landed, write a short design note in the page or docs explaining the semantics, and don't conflate further. If we eventually want viewport-aware counts, that's a separate "Counts in viewport" toggle, not the default.
Cross-references:
0. Meta: architecture review before piling on features
Observation (Raymond): items 1–7 are a feature/UX backlog, but part of this issue is also to look at the overall architecture of the UI and ask whether it's time to do a solid refactor. We might have piled up all sorts of changes incoherently across phases 1–5. Worth being rigorous about:
- URL parameters: which params are owned by which subsystem, defaults, encoding, what survives `replaceState`, what's hash vs query, what's reserved by Quarto (`q` collision)
- Compartmentalization of UI chunks: globe / facets / search / table / sample-card. Currently they mostly share a single OJS cell graph with implicit dependencies (`zoomWatcher` waits on `phase1` and `facetFilters`, `tableView` waits on `facetFilters`, click handlers re-register, etc.)
- Internal state of widgets: `viewer._globeState`, `viewer._initialHash`, `viewer._baselineCounts`, body class as view marker, `_urlParamsHydrated` shenanigans (now removed). This is scattered, untyped, and easy to break
- Reactivity model: OJS cells + manual DOM event listeners + `addEventListener`-as-imperative-glue. Re-evaluation of OJS cells re-runs handler registration code, which has bitten us before
- Function definitions per cell: OJS allows multiple function declarations in one cell but only one top-level statement; this constraint has caused at least one regression (the `let _urlParamsHydrated` debacle that took down `textSearchScore`)
Reaction (CC): agree this should come before the feature work in items 1–7. Three recommendations:
- Inventory before refactor: write a one-pager that enumerates current URL params (name, owner, default, hydration site, write-back site, validation), the OJS cell graph (cells, dependencies, side effects), and the global mutable state surface (every `viewer.`, every `window.`, every body class used as state). The act of listing surfaces what's incoherent. Diagram-as-doc, not as prose.
- Decide on reactivity model: the current hybrid (OJS reactive cells + imperative DOM event listeners) is the source of most of the timing bugs we've hit (`writeQueryState`-before-hydration, double handler registration, etc.). Two coherent endpoints:
  - Pure OJS-reactive: every UI control is an OJS Input or wrapped widget; state flows through the cell graph; no imperative `addEventListener` outside of cell bodies. Pros: principled. Cons: heavy retrofit, OJS `Inputs.checkbox` doesn't respond to programmatic `.click()` (which is why our test suite skipped cross-filter tests until Phase 5).
  - Pure imperative + DOM-as-state: drop the OJS reactivity for control plumbing; OJS cells used only as scoped scripts that load data and bind handlers once; URL is the canonical state store, DOM is its mirror. Pros: matches how the code already mostly works; tests can use programmatic events. Cons: need to be disciplined about when handlers register and what they read.
  - The current middle ground is the worst of both: OJS dependency declarations imply reactive re-evaluation, but the inside of cells uses imperative handlers that don't re-bind correctly on re-eval.

  Recommendation: commit to imperative + URL-as-canonical-state. Refactor cells to be idiomatically "run-once", guarding with `if (window._wired) return; window._wired = true;` patterns where needed. This keeps OJS for what it's actually good for here (loading data, providing globals) and stops fighting its reactivity model.
- Compartmentalize: extract per-subsystem modules (one per: `urlState`, `globeView`, `facetCounts`, `tableView`, `sampleCard`, `search`), each exposing a small public API (`init(deps)`, `update(state)`) and owning its own DOM. The host cell wires them together. This is what Phase 1's "portable predicate" started, generalized.
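The run-once guard and the `init(deps)` / `update(state)` module shape could look roughly like the sketch below. Everything here is illustrative: `wireOnce`, `createTableView`, `createHost`, and the `render` dependency are hypothetical names, and the guard takes the host object as a parameter (it would be `window` in the page) so it can be exercised outside a browser.

```javascript
// Run `wire` at most once per host object, keyed by a flag name.
// Equivalent in spirit to: if (window._wired) return; window._wired = true;
function wireOnce(host, flag, wire) {
  if (host[flag]) return false; // already wired by an earlier cell evaluation
  host[flag] = true;
  wire();
  return true;
}

// A subsystem module: owns its own rendering, exposes init/update only.
function createTableView() {
  let render = null;
  return {
    init(deps) {
      render = deps.render; // injected so the module never reaches for globals
    },
    update(state) {
      render("table page " + state.page);
    },
  };
}

// The host wires modules together and fans state out to them.
function createHost(modules) {
  return {
    init(deps) {
      modules.forEach((m) => m.init(deps));
    },
    update(state) {
      modules.forEach((m) => m.update(state));
    },
  };
}
```

In the host cell, handler registration would sit inside something like `wireOnce(window, "_tableWired", () => { ... })`, so an OJS re-evaluation of that cell cannot register the same listener twice.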
This work probably wants to be its own issue with a design doc PR before any code change. Filing this as part of #163 to keep the rethink scope visible; if it grows, split out as #164.
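As a possible starting point for the URL-param inventory in that design doc, the one-pager could be backed by a small machine-checkable registry. This is a sketch under assumptions: only `search`, `page`, and the Quarto-reserved `q` come from this issue; the `owner` values reuse the proposed module names, and the field set and `checkInventory` helper are hypothetical.

```javascript
// One entry per URL parameter: who owns it, where it lives, how it validates.
const URL_PARAMS = [
  {
    name: "search",
    location: "query",
    owner: "search",
    defaultValue: "",
    validate: (v) => typeof v === "string",
  },
  {
    name: "page",
    location: "query",
    owner: "tableView",
    defaultValue: "1",
    validate: (v) => /^[1-9]\d*$/.test(v),
  },
];

// Names the page must not claim (Quarto uses `q`).
const RESERVED_PARAMS = new Set(["q"]);

// Return a list of human-readable problems; an empty list means the inventory is coherent.
function checkInventory(params, reserved) {
  const problems = [];
  const seen = new Set();
  for (const p of params) {
    if (reserved.has(p.name)) problems.push(p.name + ": collides with a reserved name");
    if (seen.has(p.name)) problems.push(p.name + ": declared more than once");
    seen.add(p.name);
    if (!p.validate(p.defaultValue)) problems.push(p.name + ": default fails its own validation");
  }
  return problems;
}
```

Hydration-site and write-back-site columns could be added as plain string fields once the inventory pass has located them.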