Interactive Explorer rethink: architecture review + UX/feature backlog #163

@rdhyee

Follow-up to #156 / PR #162 (Phase 5 unified explorer). Captured during post-merge review on rdhyee staging at https://rdhyee.github.io/isamplesorg.github.io/explorer.html. Filing for design + scoping rather than as immediate-blocker tasks.

1. Number-of-samples discrepancy in hero copy

Observation (Raymond): page subtitle says "Search and explore 6.7 million material samples" but the stats panel reports e.g. `5,980,282 samples in view`. The 6.7M is the dataset total; the lower number is samples with coordinates (likely — needs verification).

Reaction (CC): yes, the gap is georeferenced rows. The per-source facet counts compound the confusion — a user sees SESAR=4.6M in the legend, then a smaller in-view count, with no explanation. Two non-exclusive fixes:

  • Hero copy: `6.7M samples · 6.0M with coordinates` (or similar — confirm exact number against the lite parquet)
  • Separate "dataset total" from "on the map" in the side panel so the relationship is legible
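A minimal sketch of both fixes, assuming the two counts are available as plain numbers. The values below are placeholders taken from the text above (the rounded 6.7M and the example in-view figure), not verified totals; `formatMillions`, `heroCopy`, and `sidePanelLines` are hypothetical names.

```javascript
// Placeholder counts: confirm both against the lite parquet before shipping.
const DATASET_TOTAL = 6_700_000;    // rounded "6.7 million" from the hero copy
const WITH_COORDINATES = 5_980_282; // example "in view" figure quoted above

// Format a count as millions with one decimal place.
function formatMillions(n) {
  return `${(n / 1e6).toFixed(1)}M`;
}

// Hero subtitle that names both numbers.
function heroCopy(total, withCoords) {
  return `${formatMillions(total)} samples · ${formatMillions(withCoords)} with coordinates`;
}

// Side-panel lines that keep "dataset total" separate from "on the map".
function sidePanelLines(total, withCoords) {
  return [
    `Dataset total: ${total.toLocaleString("en-US")}`,
    `On the map (with coordinates): ${withCoords.toLocaleString("en-US")}`,
    `Not mappable: ${(total - withCoords).toLocaleString("en-US")}`,
  ];
}

console.log(heroCopy(DATASET_TOTAL, WITH_COORDINATES));
```

Showing the "not mappable" remainder explicitly is one way to make the gap legible; whether that line belongs in the panel is a design call.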

2. Tags on the page

Current: `PARQUET`, `SPATIAL`, `H3`, `PERFORMANCE`, `ISAMPLES` rendered as a tag row near the title.

Observation (Raymond): rethink — these read like internal repo categorization, not user-facing.

Reaction (CC): agree. Options: (a) drop the tag bar entirely — this is the marquee page, doesn't need self-categorization; (b) replace with user-facing tags like `interactive`, `globe`, `search`, `samples`. Lean toward (a).

3. H3 res4 / res6 / res8 labels

Current: the "How This Demo Works" table uses literal H3 resolution numbers and the stats panel shows `H3 Res4` etc.

Observation (Raymond): technically accurate but maybe too techie.

Reaction (CC): agree. Labels could be `Continents (instant)` → `Regions` → `Neighborhoods` → `Individual samples`. Keep the "instant / 5K / 1.4M / 90M" perf story in the deep-dive table but don't front-load H3 jargon.
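One way to sketch the relabeling is a small lookup from the internal resolution to a display label. The pairing of res4/res6/res8 with the three labels is an assumption read off the order in which they are listed above, and `levelLabel` is a hypothetical helper.

```javascript
// Assumed mapping from H3 resolution to a user-facing label.
const LEVEL_LABELS = new Map([
  [4, "Continents"],
  [6, "Regions"],
  [8, "Neighborhoods"],
]);

// Return the display label for the current zoom level.
// `h3Res` is null or undefined when the globe is in point mode.
function levelLabel(h3Res) {
  if (h3Res == null) return "Individual samples";
  // Fall back to the technical name for any resolution not in the table.
  return LEVEL_LABELS.get(h3Res) ?? `H3 res ${h3Res}`;
}

console.log(levelLabel(4), "/", levelLabel(null));
```

Keeping the fallback means an unexpected resolution still renders something meaningful instead of an empty label.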

4. Search results vs map sync

Observation (Raymond): at https://rdhyee.github.io/isamplesorg.github.io/explorer.html?search=pottery+Cyprus the search panel says "no results" but all the dots stay on the map. Expected: empty map.

Reaction (CC): current behavior — search only filters `#searchResults` (the side list), doesn't gate the map layer. Easiest model:

  • When search returns N results: render only those N as points on the globe (drop to point mode if needed; H3 summaries don't carry FTS-aware counts).
  • When search returns 0: clear the layer, surface a banner.
  • When search input is empty: revert to current behavior.

Trade-off: cluster mode + active search is awkward — the H3 summary parquets aren't text-indexed. Either auto-drop to point mode on active search (clean), or keep cluster colors but hide non-matching dominant-source clusters (messy). Lean clean.
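The "clean" option can be sketched as a pure function that decides what the map layer should show for a given search state. `mapStateForSearch` and the shape of its return value are hypothetical; the real code would feed the result into whatever renders the globe layer.

```javascript
// Decide what the globe should render for the current search.
// Returns { mode, points, banner }:
//   mode "default" -> existing cluster/point behavior, search not involved
//   mode "points"  -> render exactly `points` (possibly none)
function mapStateForSearch(query, results) {
  const q = (query ?? "").trim();

  // Empty search box: revert to current behavior.
  if (q === "") {
    return { mode: "default", points: null, banner: null };
  }

  // Active search with no hits: clear the layer and explain why.
  if (results.length === 0) {
    return { mode: "points", points: [], banner: `No samples match "${q}"` };
  }

  // Active search with hits: drop to point mode and show only the matches.
  return { mode: "points", points: results, banner: null };
}

console.log(mapStateForSearch("pottery Cyprus", []).banner);
```

Because the function never returns a cluster mode while a search is active, the H3 summaries are bypassed entirely rather than partially filtered.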

5. Click table-row → fly camera

Observation (Raymond): in the table view, clicking a result should make the globe zoom + rotate to that sample.

Reaction (CC): straightforward. The hash deep-link path (`#v=1&lat=…&pid=…`) already does the camera flight + sample-card update; just need a click handler on table rows that pushes that hash. Implement once table is the active view; switching back to globe should restore camera to that sample.
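A rough sketch of the row handler, assuming each table row exposes `lat` and `pid` (hypothetical field names). Only `v`, `lat`, and `pid` are named in the hash format above; the elided fields are not guessed here, so the builder accepts whatever the caller supplies.

```javascript
// Build a deep-link hash from a set of fields. `v=1` is always written first.
function sampleHash(fields) {
  const params = new URLSearchParams({ v: "1" });
  for (const [key, value] of Object.entries(fields)) {
    params.set(key, String(value));
  }
  return `#${params.toString()}`;
}

// Click-handler factory. `loc` is anything with a writable `hash`:
// window.location in the browser, a plain object in tests.
function makeRowClickHandler(loc) {
  return (row) => {
    // Assigning the hash hands off to the existing deep-link path,
    // which is described above as doing the camera flight and card update.
    loc.hash = sampleHash({ lat: row.lat, pid: row.pid });
  };
}

const fakeLocation = { hash: "" };
makeRowClickHandler(fakeLocation)({ lat: 35.1, pid: "example-pid" });
console.log(fakeLocation.hash);
```

Passing the location object in keeps the handler testable without a browser.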

6. Table paging in URL

Observation (Raymond): table view should have paging parameters in the URL.

Reaction (CC): `page=N` (1-indexed) is the common case. `TABLE_PAGE_SIZE = 100` is currently a constant — keep it so unless we have a clear reason to expose. Wire `page=` into `writeQueryState` and the existing URL-hydration path; `prevBtn`/`nextBtn` already update internal `page` state, just need to call `writeQueryState` after.
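The read and write halves might look roughly like this. `readPage`, `writePage`, and `pageOffset` are hypothetical helpers, not existing functions; the real change would live inside `writeQueryState` and the hydration path. Dropping `page=1` from the URL is an assumption made here to keep default links short.

```javascript
const TABLE_PAGE_SIZE = 100; // kept as a constant, per the note above

// Parse a 1-indexed page number from a query string; anything invalid becomes 1.
function readPage(search) {
  const raw = new URLSearchParams(search).get("page") ?? "";
  if (!/^\d+$/.test(raw)) return 1;
  const n = Number(raw);
  return n >= 1 ? n : 1;
}

// Return a query string with `page` set, preserving all other params.
function writePage(search, page) {
  const params = new URLSearchParams(search);
  if (page <= 1) {
    params.delete("page"); // assumption: omit the default page from the URL
  } else {
    params.set("page", String(page));
  }
  const s = params.toString();
  return s ? `?${s}` : "";
}

// Row offset for a given 1-indexed page.
function pageOffset(page) {
  return (page - 1) * TABLE_PAGE_SIZE;
}

console.log(writePage("?search=pottery", 3), pageOffset(3));
```

In the browser, the result of `writePage` would typically be passed to `history.replaceState` from the `prevBtn`/`nextBtn` handlers.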

7. The big design question: what should facet counts show?

Observation (Raymond): at any given state of the page (search box content + facet selections + view + viewport), what should the facet counts reflect?

Reaction (CC): this is the question worth settling before any of the smaller items. Three honest options:

  • (A) Cross-filter (what Codex's recent fix landed on): count for each value = "samples matching all OTHER active facet filters." Drill-out story — "if I added this, how many more." Counts respond to checkboxes only, not search.
  • (B) Conjunctive: count = "samples matching ALL currently-applied filters, including this facet's other selections." Counts shrink monotonically as you narrow. More predictable; less drill-out useful.
  • (C) Total: count = "samples in this category, full dataset." Counts never change. Simplest, least informative.
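To make the three semantics concrete, here is a small in-memory sketch. It is not the explorer's actual query code (which presumably runs against parquet); `facetCounts`, the mode names, and the sample rows are all illustrative assumptions.

```javascript
// filters: { facetName: Set of selected values }. An empty set means "no filter".
// skipFacet lets cross-filter mode ignore the facet being counted.
function matches(sample, filters, skipFacet = null) {
  return Object.entries(filters).every(
    ([facet, selected]) =>
      facet === skipFacet || selected.size === 0 || selected.has(sample[facet])
  );
}

// mode: "cross" (A), "conjunctive" (B), or "total" (C).
function facetCounts(samples, filters, facet, mode) {
  const counts = new Map();
  for (const s of samples) {
    const keep =
      mode === "total" ? true
      : mode === "cross" ? matches(s, filters, facet) // all OTHER facets
      : matches(s, filters);                          // ALL facets
    if (keep) counts.set(s[facet], (counts.get(s[facet]) ?? 0) + 1);
  }
  return counts;
}

// Illustrative rows only.
const samples = [
  { source: "SESAR", material: "rock" },
  { source: "SESAR", material: "soil" },
  { source: "OTHER", material: "rock" },
  { source: "OTHER", material: "tissue" },
];
const filters = { source: new Set(["SESAR"]), material: new Set(["rock"]) };

for (const mode of ["cross", "conjunctive", "total"]) {
  console.log(mode, Object.fromEntries(facetCounts(samples, filters, "source", mode)));
}
```

With `source=SESAR` and `material=rock` selected, the source facet reads SESAR 1 / OTHER 1 under (A), SESAR 1 and no entry for OTHER under (B), and SESAR 2 / OTHER 2 under (C). The (A) row is the drill-out answer: unchecked values still show what selecting them would add.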

Sub-questions that fall out:

  • Should counts respect viewport? Probably no — keep counts dataset-wide for predictability across pan/zoom. The map already shows "in view" via dot density.
  • Should counts respect search? Codex's recent fix says no — search is for finding specific samples; facets are for narrowing the dataset. That decoupling stops counts dimming based on text-match co-occurrence.
  • Should counts respect view (globe vs table)? No reason for them to differ; same dataset.

Recommendation: keep (A) as landed, write a short design note in the page or docs explaining the semantics, and don't conflate further. If we eventually want viewport-aware counts, that's a separate "Counts in viewport" toggle, not the default.


Cross-references:


0. Meta: architecture review before piling on features

Observation (Raymond): items 1–7 above are a feature/UX backlog, but part of this issue is also to look at the overall architecture of the UI and ask whether it's time for a solid refactor. We may have piled up changes incoherently across phases 1–5. Worth being rigorous about:

  • URL parameters: which params are owned by which subsystem, defaults, encoding, what survives replaceState, what's hash vs query, what's reserved by Quarto (q collision)
  • Compartmentalization of UI chunks: globe / facets / search / table / sample-card — currently they mostly share a single OJS cell graph with implicit dependencies (zoomWatcher waits on phase1 and facetFilters, tableView waits on facetFilters, click handlers re-register, etc.)
  • Internal state of widgets: viewer._globeState, viewer._initialHash, viewer._baselineCounts, body class as view marker, _urlParamsHydrated shenanigans (now removed) — this is scattered, untyped, and easy to break
  • Reactivity model: OJS cells + manual DOM event listeners + addEventListener-as-imperative-glue. Re-evaluation of OJS cells re-runs handler registration code, which has bitten us before
  • Function definitions per cell: OJS allows multiple function declarations in one cell but only one top-level statement; this constraint has caused at least one regression (the let _urlParamsHydrated debacle that took down textSearchScore)

Reaction (CC): agree this should come before the feature work above. Three recommendations:

  1. Inventory before refactor: write a one-pager that enumerates current URL params (name, owner, default, hydration site, write-back site, validation), the OJS cell graph (cells, dependencies, side effects), and the global mutable state surface (every `viewer.`, every `window.`, every body class used as state). The act of listing surfaces what's incoherent. Diagram-as-doc, not as prose.

  2. Decide on reactivity model: the current hybrid (OJS reactive cells + imperative DOM event listeners) is the source of most of the timing bugs we've hit (writeQueryState-before-hydration, double handler registration, etc.). Two coherent endpoints, with the status quo listed last for comparison:

    • Pure OJS-reactive: every UI control is an OJS Input or wrapped widget; state flows through the cell graph; no imperative `addEventListener` outside of cell bodies. Pros: principled. Cons: heavy retrofit, OJS `Inputs.checkbox` doesn't respond to programmatic `.click()` (which is why our test suite skipped cross-filter tests until Phase 5).
    • Pure imperative + DOM-as-state: drop the OJS reactivity for control plumbing; OJS cells used only as scoped scripts that load data and bind handlers once; URL is the canonical state store, DOM is its mirror. Pros: matches how the code already mostly works; tests can use programmatic events. Cons: need to be disciplined about when handlers register and what they read.
    • The current middle ground is the worst of both — OJS dependency declarations imply reactive re-evaluation, but the inside of cells uses imperative handlers that don't re-bind correctly on re-eval.

    Recommendation: commit to imperative + URL-as-canonical-state. Refactor cells to be "run-once" idiomatic — guard with `if (window._wired) return; window._wired = true;` patterns where needed. Keeps OJS for what it's actually good for here (loading data, providing globals) and stops fighting its reactivity model.

  3. Compartmentalize: extract per-subsystem modules (one per: `urlState`, `globeView`, `facetCounts`, `tableView`, `sampleCard`, `search`) — each exposing a small public API (`init(deps)`, `update(state)`) and owning its own DOM. The host cell wires them together. This is what Phase 1's "portable predicate" started, generalized.
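Recommendations 2 and 3 combine naturally: a URL-backed state store, modules exposing `init(deps)` and `update(state)`, and a run-once guard so a re-evaluated cell cannot double-register handlers. The sketch below is only one possible shape. `createUrlState` and `createTableView` are hypothetical, the guard is a closure flag rather than the `window._wired` global, and a string log stands in for real DOM rendering.

```javascript
// URL-as-canonical-state: a tiny store over a query string.
function createUrlState(search = "") {
  const params = new URLSearchParams(search);
  const listeners = new Set();
  const snapshot = () => Object.fromEntries(params);
  return {
    get: snapshot,
    // Apply a patch, then notify every subscriber with the new state.
    set(patch) {
      for (const [key, value] of Object.entries(patch)) {
        if (value == null || value === "") params.delete(key);
        else params.set(key, String(value));
      }
      const state = snapshot();
      listeners.forEach((fn) => fn(state));
    },
    subscribe(fn) {
      listeners.add(fn);
      return () => listeners.delete(fn);
    },
    toString: () => params.toString(),
  };
}

// One subsystem module with a small public API.
function createTableView() {
  let wired = false;
  const view = {
    log: [], // stand-in for DOM output
    init({ urlState }) {
      if (wired) return false; // run-once guard: second init is a no-op
      wired = true;
      urlState.subscribe((state) => view.update(state));
      return true;
    },
    update(state) {
      view.log.push(`table page ${state.page ?? "1"}`);
    },
  };
  return view;
}

// Host wiring. Calling init twice simulates an OJS cell re-evaluating.
const urlState = createUrlState("?search=pottery");
const tableView = createTableView();
tableView.init({ urlState });
tableView.init({ urlState });
urlState.set({ page: 2 });
console.log(tableView.log, urlState.toString());
```

Because the guard lives inside the module, the host cell does not need to know whether it has run before, and the handler fires once per state change even after repeated `init` calls.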

This work probably wants to be its own issue with a design doc PR before any code change. Filing this as part of #163 to keep the rethink scope visible; if it grows, split out as #164.
