Goal
Show look-ahead counts next to each facet value in the legend (Source / Material /
Sampled Feature / Specimen Type). When a user toggles a facet, every other facet
value displays "if you added me to the current selection, you'd have N samples" —
the classic guided-navigation UX.
This issue is the design-review gate before the Layer 2+ work below. Layer 1 is
trivial wiring and will likely ship first; this issue exists to settle the harder
questions before they block.
Foundation already in place
R2 parquets (live)
| URL | Rows | Purpose |
|---|---|---|
| https://data.isamples.org/isamples_202601_sample_facets_v2.parquet | 5.98M | Per-sample normalized facet values for live queries |
| https://data.isamples.org/isamples_202601_facet_summaries.parquet | 56 | Baseline counts (the legend numbers we show today) |
| https://data.isamples.org/isamples_202601_facet_cross_filter.parquet | 526 | Pre-computed cross-filter cube |
The cube — what it covers
Schema: (filter_source, filter_material, filter_context, filter_object_type, facet_type, facet_value, count).
NULL filter columns mean "axis open".
| Filter pattern | Cells | Covers |
|---|---|---|
| ···· no filter | 56 | baseline |
| S··· source only | 68 | given a source, counts for material/context/object_type |
| ·M·· material only | 144 | given a material, counts for the other three axes |
| ··C· context only | 116 | given a context, counts for the other three axes |
| ···O object_type only | 142 | given an object_type, counts for the other three axes |
Not in the cube: 2+ active filters, multi-select within an axis, bbox, text search.
The cube correctly encodes the faceted-search rule: when filter is S=SESAR, only
material/context/object_type counts are stored — the Source axis is "open" relative
to itself, so its counts come from elsewhere.
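A single-filter lookup against the cube could be written roughly as follows. This is a sketch under assumptions, not the explorer's code: the helper names cubeLookupSQL and sqlLiteral are hypothetical, and it assumes the filter_* columns hold the same strings the legend shows. The column list and the "NULL means axis open" convention are taken from the schema above.

```typescript
// Axis names mirror the cube's filter_* columns.
type Axis = "source" | "material" | "context" | "object_type";
const AXES: readonly Axis[] = ["source", "material", "context", "object_type"];

const CUBE_URL =
  "https://data.isamples.org/isamples_202601_facet_cross_filter.parquet";

// Escape a value for use inside a single-quoted SQL string literal.
function sqlLiteral(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// Build the lookup for exactly one active filter value (hypothetical helper).
// The active axis is matched by value; every other filter column must be
// NULL, which is how the cube marks an axis as open.
function cubeLookupSQL(axis: Axis, value: string): string {
  const predicates = AXES.map((a) =>
    a === axis ? `filter_${a} = ${sqlLiteral(value)}` : `filter_${a} IS NULL`
  );
  return (
    `SELECT facet_type, facet_value, "count" ` +
    `FROM read_parquet('${CUBE_URL}') ` +
    `WHERE ${predicates.join(" AND ")}`
  );
}
```

Calling cubeLookupSQL("source", "SESAR") yields a query whose rows cover material, context and object_type only; the Source rows have to come from another lookup, as noted above.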
Frontend hooks (already half-built in explorer.qmd)
- ~L672-675: parquet URL declarations
- ~L877: facetFilterSQL(), the predicate builder; supports OR-within/AND-across
- ~L976-996: applyFacetCounts(facetKey, countsMap), currently only called once on baseline
- ~L593-596, L514-538: .facet-count spans in the DOM, ready for population
- CSS classes .facet-row.zero (dim) and .facet-count.recomputing (loading) already defined
Faceted-search semantics (the rule)
OR within an axis, AND across axes. The axis being calculated is "open" for its
own count.
If user has material = {bone} and we want to show the count for "Material: pottery":
- ❌ Wrong: ... AND material = 'bone' AND material = 'pottery' → returns 0
- ✅ Right: ... AND material = 'pottery' (drop the current Material constraint)
The cube bakes this in. Live queries must do it explicitly.
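The rule can be sketched as a predicate builder that skips the axis being counted. Everything named here is illustrative: openAxisPredicate is a hypothetical helper (not the existing facetFilterSQL()), and the column names source, material, context and object_type are assumed to match the axis names in sample_facets_v2.parquet.

```typescript
type Axis = "source" | "material" | "context" | "object_type";
const AXES: readonly Axis[] = ["source", "material", "context", "object_type"];

// Selected values per axis; a missing or empty list means "no filter".
type Selection = Partial<Record<Axis, string[]>>;

// Escape a value for use inside a single-quoted SQL string literal.
function sqlLiteral(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// WHERE-clause body used when counting the values of `countedAxis`:
// OR within an axis (IN list), AND across axes, and the counted axis
// itself is left out so its current selection does not zero the counts.
function openAxisPredicate(selection: Selection, countedAxis: Axis): string {
  const clauses: string[] = [];
  for (const axis of AXES) {
    const values = selection[axis] ?? [];
    if (axis === countedAxis || values.length === 0) continue;
    clauses.push(`${axis} IN (${values.map((v) => sqlLiteral(v)).join(", ")})`);
  }
  // With nothing else selected the predicate is trivially true.
  return clauses.length > 0 ? clauses.join(" AND ") : "TRUE";
}
```

With material = {bone} selected, counting the Material axis produces a predicate with no material clause at all, so "Material: pottery" is counted against the other axes only.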
Implementation layers
Layer 1 — surface what already exists (~half day)
Hook a cube lookup into the filter-toggle handler. Single-active-filter case only.
This is pure wiring and will land first as a separate small PR; tracking here for
context but not for design review.
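For context, the toggle handler's routing decision might look like the sketch below. The function name pickStrategy and the three strategy labels are hypothetical; the sketch only models the four facet axes and ignores bbox and text search, which the cube does not cover either.

```typescript
type Axis = "source" | "material" | "context" | "object_type";
const AXES: readonly Axis[] = ["source", "material", "context", "object_type"];

// Selected values per axis; a missing or empty list means "no filter".
type Selection = Partial<Record<Axis, string[]>>;

// Where look-ahead counts can come from for a given selection.
type Strategy = "baseline" | "cube" | "live";

function pickStrategy(selection: Selection): Strategy {
  let activeAxes = 0;
  let maxValuesOnAnAxis = 0;
  for (const axis of AXES) {
    const n = (selection[axis] ?? []).length;
    if (n > 0) {
      activeAxes += 1;
      maxValuesOnAnAxis = Math.max(maxValuesOnAnAxis, n);
    }
  }
  if (activeAxes === 0) return "baseline"; // nothing selected
  // The cube only stores single-axis, single-value filter patterns.
  if (activeAxes === 1 && maxValuesOnAnAxis === 1) return "cube";
  return "live"; // 2+ axes or multi-select: Layer 2 and Layer 3 territory
}
```

Layer 1 would only need the "baseline" and "cube" branches; the "live" branch is where the Layer 2 decision applies.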
Layer 2 — multi-axis active filters (~1–2 days) ← needs decision
When both source=SESAR AND context=earthmaterial are active, the cube can't
help. Two paths:
- (a) Live DuckDB-WASM GROUP BY against sample_facets_v2.parquet (5.98M rows).
  Budget: 50–200 ms per filter change. Needs a freshSelectionToken-style
  cancellation primitive (one already exists for sample-card detail loads).
- (b) Expand the cube to pre-compute pairwise patterns (SM, SC, SO, MC, MO, CO:
  6 patterns × ~50 × ~50 ≈ 15k extra cells, still tiny). Triples don't fit.
Recommendation: try (a) first, fall back to (b) if perf is bad.
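A rough sketch of option (a) follows: one GROUP BY per counted axis, plus a latest-wins guard so a slow result cannot overwrite a newer one. The names liveCountSQL, LatestOnly and guarded are hypothetical; LatestOnly stands in for the existing freshSelectionToken mechanism rather than reproducing it. The sketch assumes the parquet has one row per sample with columns named after the axes, which still needs checking.

```typescript
type Axis = "source" | "material" | "context" | "object_type";

const FACETS_URL =
  "https://data.isamples.org/isamples_202601_sample_facets_v2.parquet";

// Counts for every value of `axis`. `where` is the predicate for the other
// axes, with the counted axis left open. COUNT(*) assumes one row per sample.
function liveCountSQL(axis: Axis, where: string): string {
  return (
    `SELECT ${axis} AS facet_value, COUNT(*) AS n ` +
    `FROM read_parquet('${FACETS_URL}') ` +
    `WHERE ${where} GROUP BY ${axis}`
  );
}

// Latest-wins guard: each filter change takes a new token, and a result
// carrying an older token is silently dropped.
class LatestOnly {
  private current = 0;
  begin(): number {
    this.current += 1;
    return this.current;
  }
  isCurrent(token: number): boolean {
    return token === this.current;
  }
}

// Wrap an apply callback so it only fires if no newer request has started.
function guarded<T>(
  guard: LatestOnly,
  apply: (result: T) => void
): (result: T) => void {
  const token = guard.begin();
  return (result: T) => {
    if (guard.isCurrent(token)) apply(result);
  };
}
```

In the toggle handler this could be wired as query(sql).then(guarded(guard, apply)), where query stands for whatever DuckDB-WASM call the page already uses and apply forwards to applyFacetCounts. Note that the guard discards stale results but does not abort the query itself, so it bounds UI flicker rather than CPU cost.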
Layer 3 — spatial + text + multi-select within axis (~2–3 days)
- Multi-select within axis (e.g. source IN (SESAR, OpenContext)): the live query
  handles this naturally.
- Bbox constraint: needs JOIN wide_url for lat/lng, or pre-bake H3 cell into
  facets_url. Adds 5–10× query cost.
- Text search: ILIKE on label/description/place_name. The existing search
  path already does this; needs to feed into the count query.
Layer 4 — UX polish (ongoing)
Threshold for "..." spinner on expensive queries, hooking up .facet-row.zero
dimming, mobile collapse, "Reset" behavior on huge categories.
Open design questions
- Stale-while-revalidate? On filter toggle, show baseline counts immediately
  and update to filtered counts when the live query returns? Or block on the
  query and show "…" in the meantime? (Lean: stale-while-revalidate, with
  .facet-count.recomputing italic styling already in place.)
- Bbox constraint scope. Should facet counts reflect "in current viewport" or
  "globally"? In-map H3 cells already show in-viewport counts. (Lean: globally
  for the legend, viewport stays an H3-only thing.)
- Hierarchical concept rollup. context=earthinterior is a sub-concept of
  anysampledfeature. Should parent counts include children? Does the cube
  already do this, or are counts strictly leaf-level? (Needs verification against
  the cube.)
- Non-canonical axes. Cube is limited to source/material/context/object_type.
  If the explorer ever surfaces project / site / curation-location facets, the
  cube schema needs migration. Is this on the roadmap?
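To make the stale-while-revalidate lean from the first question concrete, here is a minimal state sketch. The FacetCountView shape and both function names are hypothetical; the recomputing and zero flags are meant to map onto the existing .facet-count.recomputing and .facet-row.zero classes, and treating a value missing from the result as 0 is an assumption.

```typescript
// View state for one legend row (hypothetical shape).
interface FacetCountView {
  value: string;
  count: number;
  recomputing: boolean; // maps to .facet-count.recomputing
  zero: boolean; // maps to .facet-row.zero
}

// On filter toggle: keep showing the stale numbers, but flag them.
function markRecomputing(views: FacetCountView[]): FacetCountView[] {
  return views.map((v) => ({
    value: v.value,
    count: v.count,
    recomputing: true,
    zero: v.zero,
  }));
}

// When the query returns: swap in fresh counts and clear the flag.
// A value absent from the result is treated as a count of 0.
function applyFresh(
  views: FacetCountView[],
  fresh: Record<string, number>
): FacetCountView[] {
  return views.map((v) => {
    const found = fresh[v.value];
    const count = found === undefined ? 0 : found;
    return { value: v.value, count, recomputing: false, zero: count === 0 };
  });
}
```

The blocking alternative would replace markRecomputing with a step that blanks the counts to "…", which is the trade-off the question asks reviewers to settle.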
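Out of scope
- […] explorer.qmd into modules.
- […] in Layer 2 is chosen).
- Interactive Explorer rethink: architecture review + UX/feature backlog #163 (umbrella).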
Acceptance
- Decisions recorded on the four questions above.
- Layer 2 path (a vs. b) chosen.
- Layers 1, 2, 3 broken into trackable sub-issues once the design lands.
Cross-refs: #163, #164, #226.