feat(ui): live stats panel in occurrence list sidebar #1308
Open
mihow wants to merge 4 commits into
mihow pushed a commit that referenced this pull request May 21, 2026

…ry params

- Rename `agreed_under_order_*` → `agreed_any_rank_*` to match the endpoint's dropped ORDER threshold (0565f06).
- Add optional `agreement_coarsest_rank` + `agreed_coarser_rank_*` fields to the response type (not consumed yet; UI follows in #1308).
- Widen `filters` to accept arrays and append repeated query params so multi-value filters (e.g. `algorithm`, `not_algorithm`; the backend reads them via `request.query_params.getlist(...)`) survive.

Per CodeRabbit review.

Co-Authored-By: Claude <noreply@anthropic.com>
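
To illustrate the repeated-query-param behavior this commit describes, here is a minimal sketch. The `Filter` shape and the `buildStatsQuery` name are assumptions for illustration, not the actual types or helpers in the UI code. The skip rule mirrors the `value?.length && !error` check cited in the PR description.

```typescript
// Hypothetical filter shape; the real type in the UI may differ.
interface Filter {
  field: string
  value?: string | string[]
  error?: string
}

// Build a query string, appending one param per value so multi-value
// filters survive as repeated keys (e.g. ?algorithm=1&algorithm=2).
function buildStatsQuery(filters: Filter[]): string {
  const params = new URLSearchParams()
  for (const { field, value, error } of filters) {
    // Skip filters that are empty or flagged with an error.
    if (!value?.length || error) continue
    const values = Array.isArray(value) ? value : [value]
    for (const v of values) params.append(field, v)
  }
  return params.toString()
}
```

Using `append` rather than `set` is the point of the sketch: `set` would keep only the last value for a key, whereas a backend calling `getlist` expects each value as its own repeated param.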
Adds an `OccurrenceStats` panel above the filter sections on the occurrence list page. Consumes the `/occurrences/stats/model-agreement/` endpoint, threading the same active filter array the list view sends so the numbers always reflect the current result set. Shows two metrics: verified occurrences % and human-model agreement rate % (rank-level / under-order agreement).

Co-Authored-By: Claude <noreply@anthropic.com>

`StatBar` takes an optional `count` rendered as "0% (121)". Wired into the Verified occurrences bar so a small-but-nonzero verified set that rounds to 0% still surfaces the underlying count.

Co-Authored-By: Claude <noreply@anthropic.com>
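
The label format described in the commit above could be produced by a helper along these lines. This is only a sketch: `formatStatLabel` is a hypothetical name, the real `StatBar` is a React component whose internals are not shown here, and the percentage is assumed to arrive on a 0 to 100 scale.

```typescript
// Hypothetical helper, not the actual StatBar implementation.
// Assumes `pct` is on a 0-100 scale.
function formatStatLabel(pct: number, count?: number): string {
  const rounded = `${Math.round(pct)}%`
  // Append the raw count only when the caller supplies one.
  return count === undefined ? rounded : `${rounded} (${count})`
}
```

With a count supplied, a percentage that rounds down to zero still reads as a nonzero set, e.g. `formatStatLabel(0.4, 121)` yields `0% (121)`.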
Summary
Frontend consumer for the `/occurrences/stats/model-agreement/` endpoint added in #1307. Adds a Stats panel at the top of the occurrence list sidebar, above the filter sections.

- `OccurrenceStats` component (`ui/src/pages/occurrences/occurrence-stats.tsx`)
- `occurrences.tsx`, threading the same active filter array the list view sends to `useOccurrences`, so the stats always match the current result set (taxon, deployment, date, verification status, default filters, etc.)
- `verified_pct`, with the raw `verified_count` shown alongside (e.g. `0% (121)`) so a small-but-nonzero set that rounds to 0% still surfaces the count.
- `agreed_any_rank_pct` (exact matches plus any disagreement whose LCA is at a real taxonomic rank; the upstream filter scope bounds what counts as meaningful)

Stacked on the backend branch: base is `feat/human-model-agreement-endpoint` (#1307), not `main`. Rebase/retarget to `main` once #1307 merges.

Filter parity
The panel reuses the list view's `filters` array verbatim and converts it to query params with the same active/error rules as `getFetchUrl` (`value?.length && !error`). The endpoint accepts the full occurrence-list filter set (#1307), so the numbers stay consistent with the visible results.

Test plan
- `tsc --noEmit`: no errors in touched files
- `eslint` + `prettier` clean on new/modified files
- `0% (121)`, HUMAN-MODEL AGREEMENT RATE `94%`.
- `?apply_defaults=false` and the Stats panel re-queried with the same param. Same filter array drives both list and stats.

Toolchain note for reviewers
The worktree `ui/` has no `node_modules`. Installing under the host's Node 22 breaks the dev server (nova-ui-kit dereferences a React-18 internal removed in React 19 at tailwind-config eval). Use the repo-pinned Node 18 (`.nvmrc` → 18.12.0): `nvm use 18.12.0 && yarn install && yarn start`. Under Node 18 it boots cleanly.

Design discussion (open, feedback wanted)
The "agreement rate" is the share of human-verified occurrences where the human pick matched the model's pick. The catch: only a handful of occurrences are usually verified, so the rate can swing wildly. If 1 person verified 4 occurrences and agreed on 3, the panel says "75%" — which feels solid but is really just 3 out of 4. Decisions to make before this is more than a rough indicator:
1. Show the raw counts, not just the percentage (done for verified).
A percentage hides how much data is behind it. "94%" could be 94-out-of-100 or 47-out-of-50. Verified occurrences now shows `0% (121)`: the count makes "0%" readable (it's 121 out of ~24k, not literally zero). Open question: do the same for the agreement rate so it reads `94% (94 of 100)`, letting the reader instantly see how many verifications the number is built on.

2. Should we hide the agreement rate when too few occurrences are verified?
A rate built on 3 verifications isn't trustworthy. Options:
3. Show a margin of error instead of a hard cutoff.
Rather than a yes/no "enough data" line, we can show how shaky the number is. A confidence interval (specifically a Wilson score interval, which behaves well for small samples) turns "94%" into something like "94%, somewhere between 87% and 97%". When few occurrences are verified the range is wide; as more get verified it tightens. This is more honest than a binary cutoff and needs only the count + total we already return.
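
For reference, the Wilson score interval mentioned above can be computed from just the agreed count and the verified total. The sketch below is illustrative only: `wilsonInterval` is a hypothetical name, and the default `z = 1.96` (a 95% interval) is an assumption, not something the PR specifies.

```typescript
// Hypothetical helper: Wilson score interval for a binomial proportion.
// `z` defaults to 1.96 (roughly a 95% interval), which is an assumption here.
function wilsonInterval(
  successes: number,
  total: number,
  z = 1.96
): { lower: number; upper: number } | null {
  // No verified occurrences means there is no rate to bound.
  if (total <= 0) return null
  const p = successes / total
  const z2 = z * z
  const denom = 1 + z2 / total
  // The interval is centered on a value pulled toward 0.5 for small samples.
  const center = (p + z2 / (2 * total)) / denom
  const half =
    (z * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total))) / denom
  return {
    lower: Math.max(0, center - half),
    upper: Math.min(1, center + half),
  }
}
```

For 94 agreements out of 100 this gives roughly 0.875 to 0.972, while 3 out of 4 gives a much wider range (roughly 0.30 to 0.95), which is the "wide when few are verified" behavior described above.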
4. A fairer agreement score that accounts for luck (follow-up).
Plain agreement % has a blind spot: if 95% of moths in a project are one common species, the model and human will "agree" most of the time just by both guessing the common one — that's luck, not skill. Cohen's kappa (κ) is the standard fix: it measures how much they agree beyond what you'd expect by chance. κ of 1.0 = perfect, 0 = no better than guessing. It's a more defensible "how good is the model, really" number than raw %. We can compute it from the exact same human/model pairs the endpoint already collects — no extra database work. Same caveat as #2: it still only describes the occurrences people chose to verify, not the whole project. Worth doing as a follow-up if the team wants a real quality metric rather than a rough indicator.
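
As a rough sketch of the kappa calculation described above, assuming the human/model pairs are available as exact label strings. That is a simplification: the panel's agreement metric is rank-aware, and `cohensKappa` is a hypothetical name rather than anything in the codebase.

```typescript
// Hypothetical helper: Cohen's kappa over [humanLabel, modelLabel] pairs.
// Assumes exact-label comparison, unlike the rank-aware agreement in the panel.
function cohensKappa(pairs: Array<[string, string]>): number | null {
  const n = pairs.length
  if (n === 0) return null
  const humanCounts = new Map<string, number>()
  const modelCounts = new Map<string, number>()
  let agreed = 0
  for (const [human, model] of pairs) {
    if (human === model) agreed += 1
    humanCounts.set(human, (humanCounts.get(human) ?? 0) + 1)
    modelCounts.set(model, (modelCounts.get(model) ?? 0) + 1)
  }
  const observed = agreed / n
  // Chance agreement: probability both pick the same label independently.
  let expected = 0
  humanCounts.forEach((h, label) => {
    expected += (h / n) * ((modelCounts.get(label) ?? 0) / n)
  })
  // Kappa is undefined when both raters only ever used one shared label.
  if (expected === 1) return null
  return (observed - expected) / (1 - expected)
}
```

This shows the blind spot in raw agreement: if the model always predicts the common species and the human agrees on 19 of 20 occurrences, raw agreement is 95% but kappa is 0, because that level of agreement is what chance alone would produce.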
None of these block the panel landing as a quick live indicator — they're about how much statistical weight to let users put on the number.
🤖 Generated with Claude Code