globe: Phase 1 unification — Specimen Type filter, SKOS labels, portable predicate, v2 facets #157

Merged
rdhyee merged 1 commit into isamplesorg:main from rdhyee:feature/explorer-phase-1-facets-skos-predicate
May 1, 2026

rdhyee commented May 1, 2026

Implements Phase 1 of #156 — unify Search Explorer and Interactive Explorer into a single page. This PR brings tutorials/progressive_globe.qmd to facet parity with the explorer, so subsequent phases can layer cross-filter counts, URL params, and the table view on a single source of truth.

What changed

Data contract — switch to v2 facets. facets_url now points at isamples_202601_sample_facets_v2.parquet, which carries URI-string columns for material, context, and object_type (the explorer was already on v2; the globe was on the older short-label file). vocab_labels.parquet is preloaded for SKOS prefLabel lookups.

Specimen Type filter. New collapsible #objectTypeFilter section in the side panel, populated alongside Material / Sampled Feature from facet_summaries.parquet.

SKOS prefLabels. prettyLabel(uri) ports cleanly from the explorer: looks up the URI in a 535-row vocab map (English pref_labels) and falls back to URI tail when no match. Display labels for all three facet groups now read as human terms ("Natural Solid Material", "Site of past human activities", "Other solid object") rather than URI fragments.
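
The lookup-with-fallback behaviour described above might look roughly like the following. This is a minimal sketch, not the code in progressive_globe.qmd: the `vocabLabels` map, its example URI, and the optional `labels` parameter are assumptions made for illustration, and the real map is loaded from vocab_labels.parquet.

```javascript
// Hypothetical stand-in for the vocab map (URI -> English prefLabel).
// The real page fills this from vocab_labels.parquet; this entry is made up.
const vocabLabels = new Map([
  ["https://example.org/vocab/material/rock", "Rock"],
]);

// Return the prefLabel for a URI, or the URI tail when the map has no entry.
function prettyLabel(uri, labels = vocabLabels) {
  if (uri == null || uri === "") return "";
  const hit = labels.get(uri);
  if (hit) return hit;
  // Fallback: strip trailing separators, then keep the text after the
  // last '/' or '#'.
  const trimmed = String(uri).replace(/[\/#]+$/, "");
  const idx = Math.max(trimmed.lastIndexOf("/"), trimmed.lastIndexOf("#"));
  return idx >= 0 ? trimmed.slice(idx + 1) : trimmed;
}
```

A value that is not a URI at all is returned unchanged, so short labels from an older facet file would still display.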

Portable facetFilterSQL() predicate. Replaces the alias-dependent AND f.material IN (...) fragments with a portable AND pid IN (SELECT DISTINCT pid FROM read_parquet(facets_url) WHERE ...) subquery. Two consequences:

  • The cluster-click and viewport-sample queries no longer need the JOIN/no-JOIN branching — both branches collapse to a single query that works whether or not facet filters are active.
  • Multi-valued facets (a sample with two materials) no longer duplicate rows via JOIN, which matters for Phase 4's table mode. Addresses Codex's contract #5.
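
One way to build such a predicate is sketched below, under stated assumptions. The PR only names `facetFilterSQL()`; the `selections` and `facetsUrl` parameters and the `sqlQuote` helper are hypothetical. Column names are interpolated as-is, so the sketch assumes they come from a fixed set (material, context, object_type) rather than from user input.

```javascript
// Escape a value for use inside a single-quoted SQL string literal.
const sqlQuote = (v) => `'${String(v).replace(/'/g, "''")}'`;

// selections: { material: [...], context: [...], object_type: [...] }
// Returns "" when nothing is selected, otherwise an alias-free fragment
// that can be appended to any query exposing a `pid` column.
function facetFilterSQL(selections, facetsUrl) {
  const clauses = Object.entries(selections)
    .filter(([, values]) => Array.isArray(values) && values.length > 0)
    .map(([column, values]) =>
      `${column} IN (${values.map(sqlQuote).join(", ")})`);
  if (clauses.length === 0) return ""; // empty selection means no filter
  return ` AND pid IN (SELECT DISTINCT pid FROM read_parquet(${sqlQuote(facetsUrl)})` +
         ` WHERE ${clauses.join(" AND ")})`;
}
```

Because the fragment is either empty or a self-contained `AND pid IN (...)` clause, the calling query needs no JOIN and no separate code path for the unfiltered case.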

UNCHECKED-by-default semantics for the URI-valued facet groups. Empty = no filter, checked = include only those values. This matches the explorer's UX: defaulting to "all 200 materials checked" would be unusable, and "empty = no filter" is the natural reading. The Source filter's semantics are unchanged (still all-checked-by-default with 4 named sources).

Cluster-mode honesty. The H3 summary parquets only carry dominant_source, so material/feature/specimen filters cannot affect cluster counts. When any of the three facet groups is active in cluster mode, #facetNote surfaces an explanatory message: "Material / feature / specimen filters apply at sample zoom level — zoom in or click a cluster." Ships with the new filters in the same PR so users never see silently-wrong filter results.
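
The decision of when to show the note can be expressed as a small pure function. This is a sketch only: the mode strings `"cluster"` and `"sample"`, the `selections` shape, and the function name are assumptions, and the real page writes the text into #facetNote rather than returning it.

```javascript
// Message text quoted from the PR description.
const FACET_NOTE =
  "Material / feature / specimen filters apply at sample zoom level — zoom in or click a cluster.";

// Return the note when a URI-valued facet group is active in cluster mode,
// otherwise an empty string (note hidden).
function facetNoteText(mode, selections) {
  const anyActive = ["material", "context", "object_type"]
    .some((key) => (selections[key] ?? []).length > 0);
  return mode === "cluster" && anyActive ? FACET_NOTE : "";
}
```

The Source filter is deliberately left out of the check, since the H3 summaries do carry dominant_source.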

Verification

  • quarto render tutorials/progressive_globe.qmd — PASS
  • Playwright smoke test (/tmp/globe_smoke_test.py) — PASS, 0 JS exceptions, 0 console errors, 0 network failures
  • Filter populations verified: 19 materials, 16 features, 17 specimen types, all with SKOS labels
  • Default state confirmed UNCHECKED across all three URI-valued groups
  • Vocab map size: 535 entries

Diff summary

 tutorials/progressive_globe.qmd | 230 ++++++++++++++++++++----------------
 1 file changed, 136 insertions(+), 94 deletions(-)

Single-file change. Globe-only — explorer file untouched, both pages remain live and independently functional.

Out of scope (lands in later phases)

  • Cross-filtered live counts → Phase 2
  • URL query params + multi-term search → Phase 3
  • Table view + maxSamples slider → Phase 4
  • /explorer.html rename + redirects → Phase 5

🤖 Generated with Claude Code

…ble predicate, v2 facets

Implements Phase 1 of issue isamplesorg#156 (unify Search Explorer + Interactive
Explorer). Brings the globe page to facet parity with the explorer:

- Switch facets_url to sample_facets_v2.parquet (URI-string columns)
- Add Specimen Type filter section + load object_type from facet_summaries
- Add vocab_labels.parquet load + prettyLabel() for SKOS prefLabels
- Refactor facetFilterSQL() to portable `pid IN (SELECT ...)` predicate;
  no more f.alias dependency, no more JOIN/no-JOIN branching, no
  duplicate rows from multi-valued facets (Codex contract #5)
- Default UNCHECKED checkbox semantic (empty = no filter) for the
  three URI-valued facet groups; matches explorer UX
- Cluster-mode honesty: facetNote text expanded to include specimen,
  shown when any of the three facet groups is active in cluster mode

Verification: Quarto render PASS; smoke test PASS (0 JS errors, 0
console errors, 0 network failures); all three filter sections populate
with SKOS labels (535 vocab labels loaded from R2).

See issue isamplesorg#156 for the full 5-phase plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee commented May 1, 2026

Codex review (relayed): no findings.

Reviewed PR #157's single-file change in tutorials/progressive_globe.qmd, including the generated docs/tutorials/progressive_globe.html after rendering. The new facet SQL matches the actual sample_facets_v2.parquet shape: one row per pid with material, context, and object_type columns, so the combined AND predicate is valid and does not introduce duplicate rows.

Verified:

  • gh pr view / gh pr diff for PR metadata and patch
  • quarto render tutorials/progressive_globe.qmd passed
  • DuckDB checks against downloaded parquet assets:
    • sample_facets_v2: 5,980,282 rows, 5,980,282 distinct pids
    • facet summaries include object_type
    • vocab labels load with 535 English URI labels

Small clarification on the PR description: the "duplicate rows from multi-valued facets" framing is prophylactic given the v2 file's actual one-row-per-pid shape — no rows are being de-duplicated against the current data. The portable predicate is still the right shape (forward-compatible if the schema ever goes multi-valued, and removes the alias dependency Phase 4 needs), but the dedup framing is overstated for today's file.

Posted from Codex.

rdhyee merged commit cd2374d into isamplesorg:main May 1, 2026
1 check passed
rdhyee added a commit to rdhyee/isamplesorg.github.io that referenced this pull request May 1, 2026
Closes the unified-explorer migration (issue isamplesorg#156). Phases 1-4 (isamplesorg#157-isamplesorg#160)
built the unified UI on tutorials/progressive_globe.qmd; this PR promotes
it to the canonical site-root URL and retires the old Search Explorer
page.

Rename + asset-path fix
- Move tutorials/progressive_globe.qmd → explorer.qmd at site root.
- Adjust source-palette import from `../assets/js/source-palette.js` to
  `assets/js/source-palette.js` so it resolves on both isamples.org and
  on rdhyee.github.io/isamplesorg.github.io PR previews.

URL param: search, not q
- The explorer's URL state now uses `?search=` instead of `?q=`. Quarto's
  site-wide search reserves `?q=` for its highlight feature and strips
  the param via history.replaceState before any of our cells run (see
  docs/site_libs/quarto-search/quarto-search.js). `?search=` is unused
  by Quarto and survives intact.
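
Reading and writing the search term under the new parameter name could look like this. It is a sketch that assumes only the standard URL and URLSearchParams APIs; the helper names are hypothetical and not taken from explorer.qmd.

```javascript
// Read the explorer's search term from a URL; "" when absent.
// A legacy ?q= value is intentionally ignored.
function readSearchTerm(href) {
  return new URL(href).searchParams.get("search") ?? "";
}

// Return a copy of the URL with ?search= set, or removed for an empty term.
function withSearchTerm(href, term) {
  const url = new URL(href);
  if (term) url.searchParams.set("search", term);
  else url.searchParams.delete("search");
  return url.toString();
}
```

In the page itself the result of `withSearchTerm` would typically be passed to history.replaceState so the address bar tracks the current search.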

Redirect stubs at the old URLs
- tutorials/progressive_globe.html and tutorials/isamples_explorer.html
  become preview-safe redirect stubs:
    new URL('../explorer.html' + search + hash, href)
- They forward whatever query string the browser presents. Note: legacy
  `?q=basalt` URLs lose the search term because Quarto strips `?q=`
  before our stub script runs (the stub is itself a Quarto-rendered
  page, so its <head> loads quarto-search.js). Non-q params (sources,
  material, etc.) and the hash fragment all survive — the only
  affected URLs are Phase 3 dev test links that were never published.
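
The stub's target computation can be sketched as a pure function. The function name and the example host are assumptions; in the stub itself the inputs would come from window.location and the result would be handed to location.replace.

```javascript
// Resolve ../explorer.html relative to the current page, carrying over
// whatever query string and hash fragment the browser presents.
function redirectTarget(href) {
  const here = new URL(href);
  return new URL("../explorer.html" + here.search + here.hash, here.href).toString();
}

// Example: a PR-preview URL keeps its path prefix.
const target = redirectTarget(
  "https://example.org/site/tutorials/progressive_globe.html?search=basalt&sources=SESAR#v=1"
);
```

Resolving a relative path against the current URL, rather than hard-coding /explorer.html, is what keeps the stub working when the site is served from a subdirectory such as a PR preview.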

_quarto.yml + internal links
- Navbar Interactive Explorer href → explorer.qmd. Search Explorer
  removed from both the How-to-Use menu and the sidebar.
- Update internal links to /explorer.html in index.qmd, how-to-use.qmd,
  tutorials/index.qmd, about.qmd, data.qmd, design/index.qmd,
  index_alt.qmd, query-spec.qmd, tutorials/narrow_vs_wide_performance.qmd,
  and the existing parquet_cesium_isamples_wide redirect stub.

Tests
- tests/test_explorer.py → tests/test_globe.py targeting /explorer.html.
- Selectors updated for the unified DOM-based UI: #sourceFilter,
  #materialFilter, #contextFilter, #objectTypeFilter, #globeViewBtn /
  #tableViewBtn (no List view), #maxSamples number input.
- Unskip the cross-filter facet tests deferred in isamplesorg#155 — native HTML
  checkboxes respond to programmatic .click() unlike the old Explorer's
  OJS Inputs.checkbox.
- Add redirect-preserves-params tests for both old URLs (using the
  current ?search= param, which survives Quarto's q-stripping).
- test_navigation.py + test_tutorials_landing.py drop Search Explorer
  assertions and retarget the globe-loads test to /explorer.html.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
rdhyee added a commit that referenced this pull request May 1, 2026
…#162)

* Phase 5: rename Interactive Explorer to /explorer.html with redirects

* Fix Codex review findings

1. test_baseline_sesar_count_matches_summaries was racy: facet-count
   spans are present in static HTML before being populated, so
   wait_for("attached") returned immediately and the test parsed an
   empty string. Wait until the SESAR span text matches \(\d (i.e., a
   parenthesised number) before reading.

2. Drop ?maxSamples= from URL state. Phase 3 introduced it to control
   the globe POINT_BUDGET; Phase 4 added a separate #maxSamples input
   for the table cap with different defaults (5000 vs 25000) and ranges
   (1-1000000 vs 1000-100000). The two were never the same concept and
   conflating them under one URL param meant `?maxSamples=10000`
   silently affected the globe but not the visible table input, and
   table-input changes never made it back to the URL. Remove the URL
   param entirely: globe POINT_BUDGET reverts to the constant
   DEFAULT_POINT_BUDGET (5000), table input remains a UI-only control.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix explorer facet dimming

* Drop search-triggered refreshFacetCounts calls

Follow-on to 8559b56 (Fix explorer facet dimming) which decoupled facet
counts from the search predicate. After that change, calling
refreshFacetCounts on every search keystroke / button click / Enter
triggered DB requeries that produced no visible difference (counts no
longer depend on search text). Drop those calls.

The single refreshFacetCounts() at the end of the cell (initial paint)
and the calls from facet checkbox handlers remain — those are still
load-bearing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>