docs: query-spec honesty-fix for live search field set (#168 Direction B) #176

Merged
rdhyee merged 2 commits into isamplesorg:main from rdhyee:explorer-search-doc-honesty
May 8, 2026

@rdhyee rdhyee commented May 8, 2026

Summary

Direction B of #168, decided mechanically by the locked latency thresholds against measured #167 baseline data. Doc-only PR — no code changes.

The current Interactive Explorer searches `label` + `place_name` against `samples_map_lite.parquet`. `query-spec.qmd:225` previously claimed `label` + `description` + `place_name` — a known mismatch. This PR corrects the spec to match the live behavior and forward-points to #169 / SEARCH_INDEX_V1.md as the substrate work that lifts the gap.
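To make the mismatch concrete, here is a minimal Python sketch of how the searched field set changes recall. It is not the Explorer's `doSearch` code: the `matches` helper, the sample record, and the matching rule (every term must appear in at least one searched field) are assumptions for illustration only.

```python
from typing import Iterable, Mapping

# Field sets taken from the PR text; everything else below is hypothetical.
LIVE_FIELDS = ("label", "place_name")                  # what the live Explorer searches
SPEC_FIELDS = ("label", "description", "place_name")   # what query-spec.qmd:225 claimed


def matches(record: Mapping[str, str], query: str, fields: Iterable[str]) -> bool:
    """Return True if every query term occurs, case-insensitively, in some searched field."""
    haystack = " ".join(record.get(f) or "" for f in fields).lower()
    return all(term in haystack for term in query.lower().split())


# Hypothetical record: "pottery" appears only in the description.
sample = {
    "label": "Sherd 42",
    "description": "Painted pottery fragment",
    "place_name": "Cyprus",
}

print(matches(sample, "pottery Cyprus", LIVE_FIELDS))  # False
print(matches(sample, "pottery Cyprus", SPEC_FIELDS))  # True
```

Under these assumptions, a multi-term query such as `pottery Cyprus` finds nothing when one of its terms lives only in `description`, which is the kind of recall gap the spec text previously hid.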

What the data said

I implemented Direction A (swap to `sample_facets_v2.parquet`) on a side branch, ran the perf-smoke against three forms (baseline, naive LEFT JOIN, CTE-then-keyed-join), and posted the comparison on #168 (comment). Headlines:

| metric | baseline | A (CTE) | threshold (#168) | passes? |
|---|---|---|---|---|
| cold `pottery` | 8.7 s | 12.0 s | ≤ 5 s | NO |
| cold multi-term | 5.1 s | 14.8 s | ≤ 6 s | NO |
| cold composed-source | 5.0 s | 5.7 s | ≤ 8 s | YES |

Recall improvement is real (`pottery Cyprus` flips 0 → 50 results, `Çatalhöyük` flips 0 → 50, `100%` flips 0 → 50), but latency cost exceeds the locked threshold for the bare-text and multi-term cases. Native DuckDB benchmark showed CTE-then-join is 8× faster than naive LEFT JOIN; in-browser, cold-cache HTTP range fetches dominate cost, so the optimization evaporates. Direction A is structurally too slow on the current parquets.
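The two join shapes compared above can be sketched roughly as follows. This is an illustration only: it uses Python's built-in `sqlite3` rather than DuckDB, and the table names (`lite`, `facets`), the `pid` key, the columns, and the rows are all hypothetical stand-ins, since the side-branch queries are not shown in this PR.

```python
import sqlite3

# In-memory stand-ins for the two parquet files (hypothetical schema and rows).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE lite   (pid TEXT PRIMARY KEY, label TEXT, place_name TEXT);
    CREATE TABLE facets (pid TEXT PRIMARY KEY, description TEXT);
    INSERT INTO lite VALUES ('s1', 'Sherd 42', 'Cyprus'),
                            ('s2', 'Basalt core', 'California');
    INSERT INTO facets VALUES ('s1', 'Painted pottery fragment'),
                              ('s2', 'Vesicular basalt');
""")

# Naive form: join every row first, then filter across both tables.
NAIVE = """
    SELECT l.pid FROM lite AS l
    LEFT JOIN facets AS f ON f.pid = l.pid
    WHERE l.label LIKE :pat OR l.place_name LIKE :pat OR f.description LIKE :pat
    ORDER BY l.pid
"""

# CTE-then-keyed-join form: collect matching keys first, then join only those keys.
CTE = """
    WITH hits AS (
        SELECT pid FROM lite WHERE label LIKE :pat OR place_name LIKE :pat
        UNION
        SELECT pid FROM facets WHERE description LIKE :pat
    )
    SELECT l.pid FROM hits AS h JOIN lite AS l ON l.pid = h.pid
    ORDER BY l.pid
"""


def search(sql: str, term: str) -> list:
    """Run one of the two query shapes with a substring pattern for `term`."""
    return [row[0] for row in con.execute(sql, {"pat": f"%{term}%"})]


print(search(NAIVE, "pottery"), search(CTE, "pottery"))
```

Both shapes return the same rows on this toy data; they differ only in how much has to be read and joined before filtering. That difference helps natively but, per the measurements above, is swamped in the browser by cold-cache HTTP range fetches.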

Per the #168 decision rule (latency thresholds drive direction): land Direction B.
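As a rough sketch of how the numbers above drive that decision: the measurements and thresholds below are copied from the table, while the rule itself (choose Direction A only if every Direction A measurement is within its threshold) is an assumed reading of #168, and `choose_direction` is a hypothetical helper.

```python
# metric name -> (baseline seconds, Direction A (CTE) seconds, threshold seconds)
MEASUREMENTS = {
    "cold pottery": (8.7, 12.0, 5.0),
    "cold multi-term": (5.1, 14.8, 6.0),
    "cold composed-source": (5.0, 5.7, 8.0),
}


def choose_direction(measurements):
    """Return ("A" or "B", list of metrics where Direction A exceeds its threshold)."""
    failures = [
        name
        for name, (_baseline, direction_a, limit) in measurements.items()
        if direction_a > limit
    ]
    return ("A" if not failures else "B"), failures


direction, failures = choose_direction(MEASUREMENTS)
print(direction, failures)  # B ['cold pottery', 'cold multi-term']
```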

What this PR changes

  • `query-spec.qmd:225`: corrects the live-Explorer field-set claim from `label + description + place_name` to `label + place_name` only against `samples_map_lite.parquet`.
  • Adds a forward pointer to #169 (Explorer FTS Track 2: search_index_v1 contract doc) / SEARCH_INDEX_V1.md as the path that lifts both the recall gap (description + concept labels in v1 minimum) and the latency gap (substrate-backed FTS).

What this PR explicitly does not do

Test plan

  • Read `query-spec.qmd:225` after merge; verify the wording is honest about current behavior and points forward correctly.

Closes #168. Refs #165, #167, #169, PR #173.

🤖 Generated with Claude Code

…org#168 Direction B)

isamplesorg#168 baseline (isamplesorg#167) showed that swapping doSearch to sample_facets_v2
(Direction A) recovers real recall — pottery Cyprus flips from 0 to 50
results — but exceeds the locked latency thresholds (cold pottery 12s
vs ≤5s, multi-term 15s vs ≤6s). Native DuckDB benchmark showed CTE
optimization is 8x faster, but in-browser DuckDB-WASM cold-cache HTTP
range fetches dominate cost, evaporating the 8x win.

Per the isamplesorg#168 decision rule (latency thresholds drive direction): land
Direction B — keep doSearch on samples_map_lite, narrow query-spec.qmd
to honestly describe what the live Explorer searches today, and point
forward to isamplesorg#169 / SEARCH_INDEX_V1.md as the path that lifts both the
recall gap and the latency gap.

Refs isamplesorg#165, isamplesorg#167, isamplesorg#168, isamplesorg#169.
Closes isamplesorg#168.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@rdhyee rdhyee left a comment

Reviewed the query-spec honesty fix. The wording change itself looks correct and matches the live label + place_name behavior.

One scope note: query-spec.qmd still says substrates that cannot index the full field set MUST surface the limitation in UI, while this PR explicitly remains doc-only and does not add the inline UI hint. I am treating that as a follow-up rather than a blocker for this doc honesty fix.


rdhyee commented May 8, 2026

Review result: the query-spec honesty fix itself looks correct and matches the live label + place_name behavior.

Scope note: query-spec.qmd still says substrates that cannot index the full field set MUST surface the limitation in UI, while this PR explicitly remains doc-only and does not add the inline UI hint. I am treating that as a follow-up rather than a blocker for this doc honesty fix.

…samplesorg#176)

query-spec.qmd §3.2 says: "Substrates that can't index all 15 fields
MUST document which subset they cover and surface the limitation in
UI." The original PR isamplesorg#176 only updated the doc text and left the UI
side undone. Codex review correctly flagged that as half a fix.

Adds:
- A .search-help line under the search bar saying "Searches sample
  labels and place names only — descriptions are not yet indexed."
- Forward link to isamplesorg#169 (substrate FTS) so users see the limitation is
  tracked, not abandoned.
- Replaces the placeholder example "pottery Cyprus" (which returns 0
  results in the current substrate per isamplesorg#167 baseline) with
  "basalt California" which actually matches.

Inline styles on the .search-help div to avoid touching styles.css.

Refs isamplesorg#167, isamplesorg#168, isamplesorg#169, isamplesorg#176.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee commented May 8, 2026

Added UI hint (commit `006d335`)

Codex review correctly flagged that the MUST clause in `query-spec.qmd` §3.2 ("surface the limitation in UI") wasn't satisfied by the doc-only fix. Adding the inline hint here so the spec's normative requirement lands in the same PR.

`explorer.qmd` changes (+5 lines, inline style only — no styles.css touch):

Diff: +5/-1.

@rdhyee rdhyee merged commit 6b3413c into isamplesorg:main May 8, 2026
1 check passed
Development

Successfully merging this pull request may close these issues.

Explorer FTS Track 1b: Honesty fix for query-spec / live mismatch