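Investigated 2026-05-11 starting from this URL:
https://isamples.org/explorer.html#v=1&lat=34.9957&lng=33.6798&alt=15212&heading=360.0&mode=point
User-visible symptoms:
The "Samples in View" stat box reads exactly 5000, a suspiciously round number.
Pan and zoom from that starting state, copy the URL, and paste it into a different browser: the view that comes back is not the view that was captured.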
Both symptoms have concrete root causes. Treating as a single issue because they were investigated together and overlap; happy to split if a maintainer prefers.
Part 1 — "Samples in View" is the fetch budget, not the real in-view count
Root cause
explorer.qmd:418
DEFAULT_POINT_BUDGET = 5000
explorer.qmd:1530-1538 — the point-mode viewport query:
SELECT pid, label, source, latitude, longitude, place_name, result_time
FROM read_parquet('${lite_url}')
WHERE latitude BETWEEN ${padded.south} AND ${padded.north}
AND longitude BETWEEN ${padded.west} AND ${padded.east}
${sourceFilterSQL('source')}
${facetFilterSQL()}
LIMIT 5000
explorer.qmd:1557 — the UI counter:
updateStats('Samples', cachedData.length, cachedData.length, ..., 'Samples in View', 'Samples in View');
cachedData is the result of the LIMIT 5000 query. The counter therefore tops out at 5000 by construction. In dense regions it does not represent "samples in view" — it represents "samples we chose to load from a slightly-padded box around the view."
Ground-truth numbers for the Cyprus URL
Direct DuckDB query against https://data.isamples.org/isamples_202601_samples_map_lite.parquet, centered on the URL's lat=34.9957, lng=33.6798:
| Viewport (degrees half-extent) | Actual samples |
| --- | --- |
| ±0.10° | 23,421 |
| ±0.20° | 23,803 |
| ±0.50° | 24,305 |
| ±1.00° | 37,869 |
So when the UI says "5,000 Samples in View" at Cyprus, the true count is 23,000+ even within a viewport tighter than the explorer's 30%-padded fetch box. The counter understates the real count by roughly 4.7x (23,421 / 5,000) in this region.
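The size of the gap can be worked out directly from the ground-truth counts above. This is a small arithmetic sketch, not explorer code: `budget` mirrors DEFAULT_POINT_BUDGET, and `actualByHalfExtent` and `rows` are illustrative names introduced here.

```javascript
// Ground-truth counts from the DuckDB query above, keyed by viewport half-extent in degrees.
const budget = 5000; // mirrors DEFAULT_POINT_BUDGET
const actualByHalfExtent = [
  [0.10, 23421],
  [0.20, 23803],
  [0.50, 24305],
  [1.00, 37869],
];

// The counter can never exceed the budget, so it shows min(actual, budget).
const rows = actualByHalfExtent.map(([halfExtent, actual]) => {
  const shown = Math.min(actual, budget);
  return { halfExtent, actual, shown, factor: actual / shown };
});

for (const r of rows) {
  console.log(
    `±${r.halfExtent.toFixed(2)}°: shown ${r.shown}, actual ${r.actual}, ` +
    `understated by ${r.factor.toFixed(2)}x`
  );
}
```

Every row shows 5000 because every actual count exceeds the budget, which is why the counter looks stable while the real density varies.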
Secondary smells in the same query path
No ORDER BY before LIMIT 5000 → which 5000 rows are returned is undefined. DuckDB-on-parquet is probably stable file-order in practice, but it's not a contract.
Label says "in View" but fetch uses a padded box (30% larger; explorer.qmd:1514-1522). Even if we set aside the cap, the count is loosely defined.
The displayed count never shrinks to "true visible" as the user pans inside the cached padded box — renderSamplePoints plots all of cachedData, including rows outside the actual viewport.
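The difference between "rows in the padded fetch box" and "rows in the actual viewport" can be sketched as follows. All names here (`inBounds`, `padBounds`, `countVisible`) and the bounds object shape are assumptions for illustration, not code from explorer.qmd, and the padding fraction is a parameter because the exact padding arithmetic is not shown in this issue.

```javascript
// Hypothetical helpers; bounds are { south, north, west, east } in degrees.
function inBounds(row, b) {
  return row.latitude >= b.south && row.latitude <= b.north &&
         row.longitude >= b.west && row.longitude <= b.east;
}

// Grow a viewport by `fraction` of its extent on each side.
function padBounds(view, fraction) {
  const dLat = (view.north - view.south) * fraction;
  const dLng = (view.east - view.west) * fraction;
  return {
    south: view.south - dLat,
    north: view.north + dLat,
    west: view.west - dLng,
    east: view.east + dLng,
  };
}

// Count only the cached rows that fall inside the given bounds.
function countVisible(rows, bounds) {
  return rows.filter((r) => inBounds(r, bounds)).length;
}

const view = { south: 34.9, north: 35.1, west: 33.6, east: 33.8 };
const padded = padBounds(view, 0.15);
const cached = [
  { pid: "a", latitude: 35.0, longitude: 33.7 },   // inside the viewport
  { pid: "b", latitude: 35.12, longitude: 33.7 },  // inside the padded box only
  { pid: "c", latitude: 34.88, longitude: 33.59 }, // inside the padded box only
];

console.log(countVisible(cached, padded)); // what cachedData.length reports
console.log(countVisible(cached, view));   // what "in View" would mean literally
```

Recomputing the second number on each camera change would let the counter shrink as the user pans inside the cached box. The sketch ignores viewports that cross the antimeridian.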
What's at Cyprus (for context)
The 23K samples in the box are one dense cluster around lat 34.98, lng 33.71, all OPENCONTEXT. That is probably the Polis excavations project (Excavations at Polis had 52,762 OC records per the Open Context facet API). So the cap is hiding a single very dense site, not a diffuse distribution.
Suggested fix directions (for discussion, not prescriptive)
Cheapest: change the label from "Samples in View" to "Samples Loaded (max N)" so the counter no longer lies. Wire the budget value into the label.
Show the real count separately: a fast SELECT count(*) against the same WHERE clause (no LIMIT) is cheap on the lite parquet via DuckDB-WASM range reads. Display two numbers: real count and rendered count.
Adaptive budget / aggregation: if real count > budget, fall back to a server-side aggregation or show "23,421 samples — too dense to render individually" with a UI affordance to drill in.
Add ORDER BY pid (or similar) so the LIMIT 5000 subset is at least deterministic across browsers and sessions.
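The "real count plus rendered count" and "ORDER BY pid" directions above could be combined along these lines. This is a sketch under assumptions: `buildViewportQueries` and `statLabel` are hypothetical helpers, the column list is copied from the viewport query quoted earlier, and the filter fragment stands in for whatever sourceFilterSQL / facetFilterSQL return.

```javascript
// Hypothetical helper: builds a count query and a deterministic, capped point query
// that share one WHERE clause, so the two numbers always describe the same box.
function buildViewportQueries(liteUrl, box, budget, extraFilters = "") {
  const where = `
    WHERE latitude BETWEEN ${box.south} AND ${box.north}
      AND longitude BETWEEN ${box.west} AND ${box.east}
      ${extraFilters}`;
  return {
    countSQL: `SELECT count(*) AS n FROM read_parquet('${liteUrl}') ${where}`,
    pointsSQL: `SELECT pid, label, source, latitude, longitude, place_name, result_time
      FROM read_parquet('${liteUrl}') ${where}
      ORDER BY pid
      LIMIT ${budget}`,
  };
}

// Hypothetical label: only claim "in view" when nothing was cut off by the budget.
function statLabel(realCount, renderedCount) {
  const fmt = (n) => n.toLocaleString("en-US");
  return realCount > renderedCount
    ? `${fmt(renderedCount)} of ${fmt(realCount)} samples shown`
    : `${fmt(realCount)} samples in view`;
}

const q = buildViewportQueries(
  "https://data.isamples.org/isamples_202601_samples_map_lite.parquet",
  { south: 34.8957, north: 35.0957, west: 33.5798, east: 33.7798 },
  5000
);
console.log(q.countSQL);
console.log(q.pointsSQL);
console.log(statLabel(23421, 5000));
```

ORDER BY pid makes the 5000-row subset reproducible, but it requires ordering the matching rows before the cut, so it may cost more than a bare LIMIT; that trade-off would need measuring in DuckDB-WASM.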
Part 2 — URL state round-trip doesn't reproduce the view
What we know about the write path
Camera-change handler: explorer.qmd:1965-2030.
buildHash (explorer.qmd:651-671) encodes only: v=1, lat, lng, alt, optional heading (only if abs(heading % 360) > 1), optional pitch (only if not nadir), optional mode=point, optional pid or h3. The query-string state (?search=, ?sources=, ?material=, etc.) is written by a separate writeQueryState() function (explorer.qmd:494-526) on filter changes, not by the camera handler.
Hypotheses (need empirical confirmation per the EXPLORER_STATE.md state-contract framing, #164)
(1) Copy mid-debounce. The 600ms debounce means a user who pans/zooms and immediately copies the URL gets stale state. Easy to confirm: pan, wait 2s, then check the address bar.
(2) Heading normalization drops 360.0. buildHash only writes heading if abs(heading % 360) > 1 (line 661). The deep-link URL Raymond used has heading=360.0, but after one camera-handler tick that value is normalized to 0 and the param is dropped. If the user's view depends on heading != 0, the next URL write silently discards it.
(3) Cold-cache latency. mode=point triggers the res8 + samples_map_lite fetch path that takes 60–90s on a cold cache (the same path that #190, "explorer: 60–90 s 'no dots' window on cold-cache deep-link to point mode (DuckDB-WASM 1.24.0 falls back to full HTTP read)", and PR #191, "explorer: surface 'Fetching sample index…' during cold-cache boot→point-mode wait (#190 fix 2)", worked around). In a fresh browser, the view "looks different" might mean "samples haven't loaded yet."
(4) 5000-cap non-determinism (Part 1). Even when both browsers finish loading, the displayed 5000-sample subset is undefined without ORDER BY. The two browsers might render different 5000 dots.
(5) _suppressHashWrite could stay stuck. The hashchange handler sets it true (line 2127) and clears it after a 2000ms timeout (lines 2140-2145). If a user chains hashchanges (back/forward repeatedly) faster than 2000ms, the flag may stay set across writes. Edge case; less likely in Raymond's flow but worth ruling out.
Cleanest discriminating test
After pan/zoom and a 2-second pause:
If the address bar contains the current camera state → problem is on the load side. Suspects: (3) cold-cache latency, (4) 5000-cap subset roulette, (5) stuck suppress flag.
If the address bar still has stale state → problem is on the write side. Suspects: (1) longer-than-600ms thing keeping _suppressHashWrite true, or (2) heading-normalization dropping a param the user needed.
Repro (Cyprus, 2026-05-11)
URL: https://isamples.org/explorer.html#v=1&lat=34.9957&lng=33.6798&alt=15212&heading=360.0&mode=point
Observed: "Samples in View: 5,000"
Real count in tight ±0.1° box: 23,421 (one dense OPENCONTEXT cluster, probably Polis)
DuckDB reproduction (no auth needed):
import duckdb

con = duckdb.connect()
# httpfs lets DuckDB read the parquet file over HTTPS with range requests
con.execute("INSTALL httpfs; LOAD httpfs;")
url = 'https://data.isamples.org/isamples_202601_samples_map_lite.parquet'
# Count every row in the ±0.1° box around lat=34.9957, lng=33.6798 (no LIMIT)
con.execute(f"""
    SELECT count(*)
    FROM read_parquet('{url}')
    WHERE latitude BETWEEN 34.8957 AND 35.0957
      AND longitude BETWEEN 33.5798 AND 33.7798
""").fetchone()
# (23421,)
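The heading-normalization hypothesis can be checked in isolation. A minimal sketch, assuming the predicate at line 661 is exactly the one quoted in this issue (abs(heading % 360) > 1); `writesHeading` and `distanceFromNorth` are illustrative names introduced here, not functions in explorer.qmd.

```javascript
// Predicate as quoted in the issue: heading is written only if abs(heading % 360) > 1.
function writesHeading(headingDeg) {
  return Math.abs(headingDeg % 360) > 1;
}

// Hypothetical alternative (an assumption, not existing code): angular distance
// from north, so headings just east and just west of north are treated alike.
function distanceFromNorth(headingDeg) {
  const h = ((headingDeg % 360) + 360) % 360; // normalize to [0, 360)
  return Math.min(h, 360 - h);
}

for (const heading of [360.0, 0.5, 359.5, 45]) {
  console.log(
    heading,
    "written:", writesHeading(heading),
    "distance from north:", distanceFromNorth(heading)
  );
}
```

If the predicate really is as quoted, heading=360.0 is dropped because 360 % 360 is 0, but that is the same orientation as north, so the drop alone should not change the view. The sketch also shows an asymmetry: 0.5 is dropped while 359.5 is written, although both are half a degree from north.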