explorer: dynamic cross-filter facet counts by rdhyee · Pull Request #155 · isamplesorg/isamplesorg.github.io

rdhyee · 2026-05-01T04:28:48Z

Closes #154.

Implements the path-A plan from #154 (Codex-resolved): when the user toggles a filter, the count next to every other facet value recalculates against the new filter combination. Matches the OpenContext UX Eric Kansa pointed at. Source + material + sampled feature + specimen type all participate. Search and camera viewport stay out of scope per the issue.

Live preview: https://rdhyee.github.io/isamplesorg.github.io/tutorials/isamples_explorer.html?sources=OPENCONTEXT

What changed

Data

New constant cross_filter_url = .../isamples_202601_facet_cross_filter.parquet (the existing ~6 KB single-filter pre-aggregation cache).

DOM scaffolding

Source legend and material/context/object_type rows now share a uniform shape:
- <label class="facet-row" data-facet="K" data-value="V">
- <span class="facet-count" data-facet="K" data-value="V">
This lets applyFacetCounts(facetKey, countsMap) update text in place without rebuilding checkbox HTML — important because mid-interaction rebuilds would lose user selections.
New CSS: .facet-row.zero { opacity: 0.4 } for dim-on-zero, and .facet-count.recomputing { opacity: 0.55; font-style: italic } while a refresh is in flight.

Helpers (helpers cell)

applyFacetCounts(facetKey, countsMap | null) — writes count text and toggles .zero on each row. Null = restore baseline (from viewer._baselineCounts).
markFacetCountsRecomputing() — italicizes counts during a query.
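
A minimal sketch of the count-painting decision described above, kept free of DOM calls so it can be read in isolation. The name `computeFacetRowUpdates` and the `Map` inputs are assumptions for illustration; the PR's `applyFacetCounts` writes directly into the `.facet-count` spans and may format numbers differently.

```javascript
// Hypothetical pure helper: decides what each facet row should display.
//   values    : facet values rendered in this group, e.g. ["Pottery", "Glass"]
//   countsMap : Map(value -> count) from a refresh, or null to restore baseline
//   baseline  : Map(value -> count) captured at initial load
function computeFacetRowUpdates(values, countsMap, baseline) {
  const source = countsMap === null ? baseline : countsMap;
  return values.map((value) => {
    // A value absent from the result has no matching samples.
    const count = source.get(value) ?? 0;
    return {
      value,
      text: String(count),
      zero: count === 0, // would drive the .zero class (dimmed row)
    };
  });
}

// Made-up numbers, for illustration only.
const baseline = new Map([["Pottery", 1200], ["Glass", 300]]);
const filtered = new Map([["Pottery", 45]]);

const updates = computeFacetRowUpdates(["Pottery", "Glass"], filtered, baseline);
const restored = computeFacetRowUpdates(["Pottery", "Glass"], null, baseline);
console.log(updates, restored);
```

In the browser, each entry would then be written into the matching `span.facet-count[data-facet][data-value]`, with `classList.toggle("zero", entry.zero)` applied to its `.facet-row` parent, so checkboxes are never rebuilt.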

`refreshFacetCounts()` (inside `zoomWatcher`)

Debounced 250 ms with a generation guard (facetCountsReqId). Three cases:

No filters active → restore baseline. No query.
Exactly one facet value selected → cache hit. SQL:
```
SELECT facet_type, facet_value, count
FROM read_parquet(cross_filter_url)
WHERE filter_<key> = '<value>'
  AND <other filter cols> IS NULL
```
Verified schema: (filter_source, filter_material, filter_context, filter_object_type, facet_type, facet_value, count).
Anything else → four concurrent on-the-fly GROUP BY queries on sample_facets_v2, each excluding the column being recomputed:
```
SELECT <col> AS value, COUNT(*) FROM facets_v2
WHERE <other dim filters>
GROUP BY <col>
```

Selected dim's own values keep baseline counts (so the user can switch their selection within that group without the counts collapsing to "now I only have one option").
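
The three-way dispatch above can be sketched roughly as follows. The column names come from the verified cache schema; `chooseCountStrategy`, `buildCacheLookupSQL`, the `selections` shape, and the example URL are hypothetical and are not taken from the PR diff. The sketch also assumes `selections` lists only selections that narrow the result.

```javascript
// Filter columns as listed in the verified cache schema.
const FILTER_COLS = {
  source: "filter_source",
  material: "filter_material",
  context: "filter_context",
  object_type: "filter_object_type",
};

// Quote a string literal for SQL by doubling embedded single quotes.
const sqlQuote = (s) => `'${String(s).replace(/'/g, "''")}'`;

// `selections` maps each facet key to an array of selected values (assumed shape).
function chooseCountStrategy(selections) {
  const total = Object.values(selections).reduce((n, vals) => n + vals.length, 0);
  if (total === 0) return "baseline";   // no query at all
  if (total === 1) return "cache";      // single-filter pre-aggregation
  return "on-the-fly";                  // concurrent GROUP BY queries
}

// Builds the single-filter cache lookup: one column equals the value,
// every other filter column must be NULL.
function buildCacheLookupSQL(selections, crossFilterUrl) {
  const clauses = Object.entries(FILTER_COLS).map(([key, col]) => {
    const vals = selections[key] ?? [];
    return vals.length === 1 ? `${col} = ${sqlQuote(vals[0])}` : `${col} IS NULL`;
  });
  return [
    "SELECT facet_type, facet_value, count",
    `FROM read_parquet(${sqlQuote(crossFilterUrl)})`,
    `WHERE ${clauses.join("\n  AND ")}`,
  ].join("\n");
}

const oneFilter = { source: ["OPENCONTEXT"], material: [], context: [], object_type: [] };
const cacheSQL = buildCacheLookupSQL(oneFilter, "https://example.org/cross_filter.parquet");
console.log(chooseCountStrategy(oneFilter));
console.log(cacheSQL);
```

Doubling single quotes is the minimum needed to keep a facet value such as `O'Brien` from breaking the literal; a prepared statement would be the more robust choice where the query layer supports it.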

Wiring

Existing source / material / context / object_type change handlers each gain a refreshFacetCounts() call alongside the existing refreshResultsTable(). One initial paint at startup so any ?sources= URL param's narrowing is reflected immediately.

What's deferred (per #154)

Search composition. Search is currently an imperative panel lookup; folding it into count math would require promoting it to a real persistent filter with URL sync. Not in v1.
Camera-aware counts. Mixes "what filters are possible in the dataset" with "what is visible in viewport." Defer.
Path C (denser pre-aggregated cache covering all 2-filter combinations). Touches the data-build pipeline. Revisit only if path B's perf turns out unacceptable in practice.

Test plan

quarto render tutorials/isamples_explorer.qmd succeeds
Playwright smoke test: 0 JS exceptions, 0 console errors, 0 network failures
Cross-filter parquet schema verified via DuckDB (4 filter cols + facet_type/facet_value/count)
Deployed to rdhyee fork preview
Reviewer eyeballs preview URL above:
- check OpenContext in the source legend → material / sampled feature / specimen type counts should all drop to OpenContext-only counts; values with 0 OpenContext samples should dim. Source counts stay at baseline.
- within the dimmed material list, check Pottery → should now have ≥2 filters active, hitting the on-the-fly fallback path. Other facets (source, sampled feature, specimen type) should recompute against OpenContext+Pottery; material list itself stays at baseline.
- Clear Filters → all counts return to baseline, no rows dimmed.
Codex pass

References

Explorer: dynamic facet count recalculation (cross-filter counts) #154 — issue + Codex-resolved plan
PR explorer: rewrite on progressive_globe foundation for speed + add results table #153 — the rewrite this builds on
isamples_202601_facet_cross_filter.parquet — pre-aggregated single-filter cache
isamples_202601_sample_facets_v2.parquet — full per-pid facet table for the on-the-fly fallback

🤖 Generated with Claude Code

Implements path A from issue isamplesorg#154 (Codex-resolved plan): facet counts recalculate when filters are toggled, matching the OpenContext UX Eric Kansa flagged. Counts reflect the full source + material + context + object_type filter combination; search and camera viewport remain out of v1 scope per the issue.

How it works:

- All four filter groups (source legend + material + sampled feature + specimen type) carry uniform `data-facet` / `data-value` count spans + `.facet-row` parents so refreshFacetCounts() updates text in place without rebuilding checkbox HTML (which would lose mid-interaction selections).
- One debounced refreshFacetCounts() (~250 ms) with a generation/stale-result guard, mirroring the existing loadViewportSamples and refreshResultsTable patterns.
- Strategy:
  * No filters active → restore baseline counts from facet_summaries (stashed on viewer._baselineCounts after initial load).
  * Exactly one facet value selected → cache lookup against isamples_202601_facet_cross_filter.parquet (~6 KB, single-filter pre-aggregation; one matching row per target facet value).
  * Anything else → on-the-fly: four concurrent GROUP BY queries on sample_facets_v2, each excluding the column being recomputed.
- The selected facet's own values keep baseline counts (so the user can see what other values WOULD yield if they switched their selection within that group).
- Zero-count rows dim (opacity 0.4) rather than hide; less layout churn and clearer signal that the selection eliminated those values.
- During recompute the counts get a subtle italic .recomputing class so users see something is updating.

Source counts are sample-level (matching facet_summaries / facets_url) and not H3 cluster dominant_source counts; the cluster-zoom caveat in "How It Works" already covers the dominant_source semantics for the globe view.

Smoke-test: 0 JS exceptions, 0 console errors, 0 network failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee · 2026-05-01T04:41:45Z

Semantics discussion (Raymond ↔ Claude, 2026-04-30)

Anchoring the within-facet semantics + "count next to each value" before continuing the patch.

Raymond

let's talk about the semantics of the UI. I think that if for any given facet (including collection) if nothing is selected, it means all...and then if you pick a filter then it's just that one item and if multiple items of a given filter, you do an "or" among all the choices. Ideally, we'd calculate the number of items for if you add that filter item.

Does that make sense? I think this is my intuition for how it should behave. Let's think about it

Claude

Yes — and your intuition is the standard faceted-search behavior, which is good news. But there's a subtle ambiguity in the "count if you add this" part that's worth pinning down before I touch the code.

Within-facet semantics you described (empty = all; 1 = just that; multi = OR) — this is what the SQL filter logic already does in PR #155. WHERE material IN ('Pottery', 'Glass') is OR; material clause absent is "all." That part is settled.

"Count if you add this" — two readings:

(A) Per-value count under cross-dim filters. Next to "Glass" we display the count of samples whose material = Glass AND <other dim filters>, regardless of what else is selected within the same Material facet.

Pros: simple, one query per dim, this is what OpenContext shows.
Reads as: "how many Glass samples exist under my other filters."

(B) Hypothetical-result-after-add count. Next to "Glass" we display the new total result set size if Glass were added to whatever's already selected in Material.

Pros: literally "what happens if I click this."
Cons: every value's number is current_result + per_value_count_of_that_value (since same-dim values are disjoint per sample) — they all shift by the same amount, so they convey less relative information than (A). And clicking selected values (which would remove them) needs a separate computation. Not what OpenContext does.

I'd go with (A). It's what every major faceted UI does (OpenContext included), it answers "how big is each option?" which is what users actually want for picking next, and (B)'s numbers move in lockstep so they don't help comparison.
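
To make the lockstep argument concrete, here is a small worked example on made-up data. The sample rows, the `OTHER_SOURCE` value, and the helper names `countA` / `countB` are all invented for illustration; the only assumption carried over from the discussion is that each sample has exactly one value per dimension.

```javascript
// Tiny invented dataset: one material per sample.
const samples = [
  { source: "OPENCONTEXT", material: "Pottery" },
  { source: "OPENCONTEXT", material: "Pottery" },
  { source: "OPENCONTEXT", material: "Glass" },
  { source: "OPENCONTEXT", material: "Glass" },
  { source: "OPENCONTEXT", material: "Bone" },
  { source: "OTHER_SOURCE", material: "Glass" },
];

const crossDim = (s) => s.source === "OPENCONTEXT"; // filters from the other dims
const selectedMaterials = ["Pottery"];              // current Material selection

// (A): per-value count under cross-dim filters, ignoring the Material selection.
function countA(value) {
  return samples.filter((s) => crossDim(s) && s.material === value).length;
}

// (B): size of the result set if `value` were added to the Material selection.
function countB(value) {
  const next = new Set([...selectedMaterials, value]);
  return samples.filter((s) => crossDim(s) && next.has(s.material)).length;
}

// Adding an already-selected value changes nothing, so this is the current result size.
const currentResult = countB("Pottery");

for (const v of ["Glass", "Bone"]) {
  console.log(v, { A: countA(v), B: countB(v), shift: countB(v) - countA(v) });
}
```

For every unselected value, the (B) number is the (A) number plus the same constant (the current result size), which is why (B) adds no information for comparing options.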

Bug in my current PR #155 to fix regardless: when ANY value in a dim is selected, I currently fall back to baseline counts for that dim's own values (the unfiltered facet_summaries). I should be applying cross-dim filters to that dim too, just excluding the dim's own current selection. So if you've selected Source=OpenContext, the Material list values should each show "count of that material under OpenContext" — not the global counts. That's plain (A).

Decision

Going with A and patching PR #155 to fix the dim-with-selections bug.

Per semantics (A) confirmed in the PR isamplesorg#155 thread: every dim's value counts should reflect cross-dim filters, not just dims without active selections.

Previously refreshFacetCountsNow's on-the-fly path skipped any dim that had selections (paint baseline), which broke multi-filter compositions: with Source=OpenContext + Material=Pottery selected, Material values still showed unfiltered baseline counts instead of "count of each material under Source=OpenContext."

Fix: remove the skip in the on-the-fly fallback. The existing buildExcludeWhere() already excludes the dim being recomputed from the WHERE, so every dim's GROUP BY yields the correct (A)-semantic count.

Single-active cache path is unchanged: under one active dim with no other filters, recomputing that dim's values = baseline by definition, and the cache is empirically confirmed to not carry rows for the filter-dim's own facet_type. Updated the comment to explain why.

Smoke test still PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
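
A rough sketch of how the (A)-semantic queries could be assembled once the skip is removed. The body of `buildExcludeWhere` below is a hypothetical stand-in, since the commit message names the function but does not show it; `buildCountQueries`, the `selections` shape, and the use of facet keys as column names are likewise assumptions.

```javascript
// Quote a string literal for SQL by doubling embedded single quotes.
const sqlQuote = (s) => `'${String(s).replace(/'/g, "''")}'`;

// Hypothetical stand-in for buildExcludeWhere(): AND together an IN (...) clause
// for every dimension with selections, except the dimension being recomputed.
function buildExcludeWhere(selections, excludeDim) {
  const clauses = Object.entries(selections)
    .filter(([dim, vals]) => dim !== excludeDim && vals.length > 0)
    .map(([dim, vals]) => `${dim} IN (${vals.map(sqlQuote).join(", ")})`);
  return clauses.length ? `WHERE ${clauses.join(" AND ")}` : "";
}

// One GROUP BY per dimension, with no special case for dims that have selections.
// Table and column names are placeholders.
function buildCountQueries(selections, table = "facets_v2") {
  return Object.keys(selections).map((dim) => ({
    dim,
    sql: [
      `SELECT ${dim} AS value, COUNT(*) AS count FROM ${table}`,
      buildExcludeWhere(selections, dim),
      `GROUP BY ${dim}`,
    ].filter(Boolean).join("\n"),
  }));
}

const selections = {
  source: ["OPENCONTEXT"],
  material: ["Pottery"],
  context: [],
  object_type: [],
};
const queries = buildCountQueries(selections);
for (const q of queries) console.log(`-- ${q.dim}\n${q.sql}\n`);
```

With Source=OpenContext and Material=Pottery selected, the material query filters on source only, so each material shows its count under OpenContext, while the context and object_type queries filter on both.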

rdhyee · 2026-05-01T04:56:47Z

Review findings from the semantics pass anchored at #155 (comment):

tutorials/isamples_explorer.qmd:1265 treats “zero source checkboxes selected” as no active source filter, so refreshFacetCountsNow() restores baseline counts. The existing result SQL treats zero selected sources as AND 1=0 at tutorials/isamples_explorer.qmd:308, so the UI can show nonzero facet counts while the actual result set is empty. The source dimension needs a distinct representation for “all selected” versus “none selected.”
tutorials/isamples_explorer.qmd:1377 increments facetCountsReqId only after the 250 ms debounce fires. If an older count query is in flight and the user changes filters, the old query can still pass the generation check and repaint stale counts during the debounce window. Increment the generation when scheduling/invalidating a refresh, then pass that captured id into the delayed call.

I did not run the Quarto render or browser smoke test; this was a code-review pass focused on the linked semantics comment and PR diff.

…+ debounce race

Two fixes from the Codex review of isamplesorg#155:

1. Source filter "0 checked" now means "all" (no filter), matching Raymond's stated intuition for facet UX (PR isamplesorg#155 thread) and the semantics already used by material/context/object_type. Previously sourceFilterSQL returned `AND 1=0` when zero source checkboxes were checked, while refreshFacetCountsNow treated zero-checked as "no active source filter" and restored baseline counts, meaning the UI could show non-zero facet counts while the actual result set was empty. The describeActiveFilters helper already mapped both 0-checked and 4-checked to "source not filtering," so this just unifies sourceFilterSQL with that view.

2. Debounce race fix: refreshFacetCounts() now increments facetCountsReqId synchronously when scheduling, not after the 250 ms debounce fires. Previously, an in-flight count query could resolve during the debounce window of a newer schedule and still pass its generation check (because the gen was unchanged), repainting stale counts. Bumping the gen on schedule invalidates all in-flight queries immediately. Also added an early-stale guard at the top of refreshFacetCountsNow.

Smoke test still PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
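
The debounce fix can be illustrated with a small sketch in which the generation id is bumped at schedule time rather than when the timer fires. `createFacetCountRefresher`, its parameters, and the injectable timer functions are hypothetical; they exist only so the ordering can be exercised without real timers, and `reqId` plays the role of `facetCountsReqId`.

```javascript
// Debounced refresher whose generation id is incremented synchronously on schedule.
function createFacetCountRefresher(
  runQuery,
  delayMs = 250,
  setTimer = setTimeout,
  clearTimer = clearTimeout
) {
  let reqId = 0;      // generation counter
  let pending = null; // handle of the scheduled debounce timer, if any

  const isCurrent = (id) => id === reqId;

  function schedule() {
    const myId = ++reqId; // invalidates any in-flight query immediately
    if (pending !== null) clearTimer(pending);
    pending = setTimer(() => {
      pending = null;
      if (!isCurrent(myId)) return; // early-stale guard
      runQuery(myId, isCurrent);    // runQuery should re-check isCurrent(myId) before painting
    }, delayMs);
    return myId;
  }

  return { schedule, isCurrent };
}

// Fake timers: collect callbacks instead of waiting 250 ms.
const timers = [];
const fakeSet = (fn) => { timers.push(fn); return timers.length - 1; };
const fakeClear = (handle) => { timers[handle] = null; };

const ran = [];
const refresher = createFacetCountRefresher((id) => ran.push(id), 250, fakeSet, fakeClear);

const first = refresher.schedule();
const second = refresher.schedule();       // user changed filters again
timers.forEach((fn) => { if (fn) fn(); }); // fire whatever timer survived

console.log({ first, second, ran, firstStillCurrent: refresher.isCurrent(first) });
```

Because `first` stops being current the moment `second` is scheduled, a query started under `first` that resolves during the new debounce window fails its `isCurrent` check and cannot repaint stale counts.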

rdhyee · 2026-05-01T05:10:22Z

Codex re-review finding after commit 1bcf1ad:

P3: tutorials/isamples_explorer.qmd:1403 still applies .disabled to every unchecked source checkbox. After the new fix, zero checked sources means “all sources,” but the UI will show every source label dimmed as if every source is excluded, while results/counts include all sources. Either don’t apply the disabled styling when active.length === 0, or normalize the zero-source state back to all checked.

The two prior review findings are addressed in current head 1bcf1ad: result filtering and facet-count semantics now agree for zero checked sources, and facetCountsReqId is invalidated before the debounce delay.

I did not run Quarto/browser tests for this re-review.

After the prior fix unifying source semantics so 0 checked == 4 checked == "all sources / no filter," the .disabled styling on the legend rows was still based on the per-checkbox checked state. Result: unchecking all 4 source boxes painted every label dimmed (looked like every source was excluded) while results and counts correctly included all sources. Cosmetically inconsistent.

Fix: dim a row only when the source filter is actually narrowing the result set (1–3 checked). 0 or 4 checked → no dimming, matching the "no source filter" semantics. Also reuses the resulting filterIsActive flag for the URL persistence branch a few lines down (was duplicating the `> 0 && < 4` test).

Smoke test still PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
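
A short sketch of the unified source semantics, where a single "is the filter narrowing?" test drives both the SQL clause and the legend dimming. The function bodies, the `source` column name, and the placeholder source names `SOURCE_B` to `SOURCE_D` are assumptions; only the `> 0 && < 4` rule and the `sourceFilterSQL` name come from the thread.

```javascript
// Quote a string literal for SQL by doubling embedded single quotes.
const sqlQuote = (s) => `'${String(s).replace(/'/g, "''")}'`;

// The source filter narrows results only when some, but not all, boxes are checked.
function sourceFilterIsActive(checked, allSources) {
  return checked.length > 0 && checked.length < allSources.length;
}

// Hypothetical counterpart to sourceFilterSQL: empty string means "no source filter".
function sourceFilterSQL(checked, allSources, column = "source") {
  if (!sourceFilterIsActive(checked, allSources)) return "";
  return ` AND ${column} IN (${checked.map(sqlQuote).join(", ")})`;
}

// Dim a legend row only when the filter is narrowing and this source is unchecked.
function shouldDimSourceRow(source, checked, allSources) {
  return sourceFilterIsActive(checked, allSources) && !checked.includes(source);
}

// Placeholder source names apart from OPENCONTEXT.
const ALL = ["OPENCONTEXT", "SOURCE_B", "SOURCE_C", "SOURCE_D"];

for (const checked of [[], ["OPENCONTEXT"], ALL]) {
  console.log(checked.length, "checked:", {
    sql: sourceFilterSQL(checked, ALL),
    dimmed: ALL.filter((s) => shouldDimSourceRow(s, checked, ALL)),
  });
}
```

Deriving the clause, the dimming, and the URL-persistence decision from the same predicate keeps the three from drifting apart again.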

rdhyee merged commit e9a8288 into isamplesorg:main May 1, 2026
1 check passed

rdhyee merged 4 commits into isamplesorg:main from rdhyee:explorer/cross-filter-counts
