feat(core/reviews): parse kind:38000, materialize aggregates, rank by bayesian damping#31
Merged
Conversation
… bayesian damping
Reviews pipeline split across three modules under `src/reviews/`:
- `parse.ts` — kind:38000 parser emitting `ReviewRow` directly; rating
precedence per data-model-v1.md §4 (structured `["rating","N","5"]` >
legacy `["rating","N"]` > content `N/5` or `N/10` > leading star emoji
run > null). Parse stays pure (no d-tag gate) so it's usable from log
views and pagination dedup.
- `aggregate.ts` — `bayesianScore(avg, n) = avg * log10(n+1)` (null-avg →
0 for sort-index friendliness) and `recomputeAggregateInTx(db, d)`
scoped to the reviews+mintAggregate transaction.
- `upsert.ts` — `upsertReviewWithAggregate` wraps cache CAS + aggregate
recompute in one `rw` transaction so concurrent reads never see a
review without its aggregate reflection. Recompute skipped on
rejected-stale and rejected-invalid.
- `rank.ts` — `rankMints(db, limit=50)` using the `bayesianScore` index.
Schema v3 bump: `bayesianRank` → `bayesianScore`, `averageRating` →
`avgRating`, adds `ratedCount`; indexes `bayesianScore` + `avgRating` +
`updatedAt`. Additive migration via Dexie `.version(3).stores({...})`.
Scheduler's kind:38000 branch now delegates to the reviews module and
counts `rejected-invalid` against `rejectedByLayerA`. The Layer A
bot-spam gate moves into `upsertReview` with a Fedimint (k=38173)
bypass, so 38172 Cashu-mint reviews with bogus 16-char d-tags get
firewalled at the edge, while federation IDs (k=38173) pass through
unconstrained per research.
Tests: +60 added across 6 files — parse branches (30), aggregate formula
+ recompute, CAS+aggregate-in-sync, rank sort + limit, and a 5-event
real-corpus smoke from audit/relay-data/recs-38000.json. Integration
test updated: 2 of 5 fixture reviews now firewall on d-tag shape, which
bumps `rejectedByLayerA` from 5→7 and `accepted` from 11→9.
Full workspace: 219 tests passing.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
The prior free-pass on kind=38173 let any short/junk d-tag land in Dexie as long as the `k` tag claimed Fedimint. Every real federation ID in the audit corpus (fedimint-observer.md + reviews/corpus.test.ts) is lowercase 64-char hex — bot spam is not. Add isValidFedimintDTag as a sibling shape gate so Layer A catches the Fedimint branch at the same choke point as Cashu. Apply in both upsertAnnouncement and upsertReview. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…r errors The prior code relied on an outdated claim (now-removed doc comment) that each kind's handler did its own try/catch — it didn't. A thrown Dexie transaction (QuotaExceeded, schema collision, unexpected disk state) in the 38000/38172/38173/0/10002 branch would escape as an unhandled rejection at the `void onEvent(event)` subscription boundary and stats would silently freeze at last-good while ingest continued to look healthy. Wrap each case body so branch-local exceptions count into a new `stats.handlerErrors` counter and log with a stable `[scheduler]` prefix. Also add `stats.rejectedByParse` for the kind:38000 branch so `parseReview`-null drops are observable rather than silent. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Dexie auto-migrates the v3 schema indexes (`bayesianScore`, `avgRating`
replacing `bayesianRank`), but does NOT transform row PAYLOADS. A dev
with a v2 IndexedDB would have rows shaped `{d, averageRating,
bayesianRank, updatedAt}` — field names renamed in v3 — which fail every
v3 query shape. No prod data yet and the aggregate re-derives from live
review traffic, so the correct migration is a clean wipe.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…Content The N/5 and N/10 content blocks previously didn't return when they matched but were out of range — the function silently fell through to the emoji regex. Behaviour was correct-by-accident: `0/10 total trash` would yield no-rating because there were no leading stars after, but `7/5 ⭐⭐⭐⭐⭐` would latch onto the emoji run and report 5★ despite the numeric prefix saying otherwise. Make the precedence explicit — when an N/5 or N/10 prefix MATCHES the content, we commit to that format and return (even when the parsed number is out of range, in which case we return undefined so the null fallback in parseReview kicks in). Add a header comment naming the rule so future readers don't re-introduce the silent fall-through. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…unds - reviews/upsert.test.ts: two concurrent upserts for the same d with distinct pubkeys converge — both reviews land AND the aggregate sees both (5 shuffled trials). Mirrors the announcement-side regression in cache/upsert.test.ts and guards the aggregate-in-same-transaction invariant against a pre-write-snapshot recompute. - reviews/parse.test.ts: table-driven coverage for malformed rating tag shapes (non-numeric, empty, missing, null-as-string). All fall through to `rating: null`. - reviews/rank.test.ts: `limit=0` returns [], `limit=Infinity` returns every mint in score-descending order — covers the boundary conditions the Dexie limit() clamp handles. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e format Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…k typecheck TS2783: `...over` spread already supplies both required fields.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR #5 of 8 for the v2 rebuild. Implements the reviews pipeline end-to-end:
src/reviews/parse.ts) — kind:38000 parser emittingReviewRow. Rating precedence (data-model-v1.md §4): structured["rating","N","5"]> legacy["rating","N"]> contentN/5orN/10> leading star-emoji run >null. Pure (no d-tag gate) so callable from log views / pagination dedup.src/reviews/aggregate.ts) — Bayesian formulaavgRating * log10(ratedCount + 1)(null-avg → 0 for sort-index friendliness);recomputeAggregateInTxruns inside the same Dexie transaction as the review write.src/reviews/upsert.ts) —upsertReviewWithAggregatewraps cache CAS + aggregate recompute atomically, so concurrent readers never see a review without its aggregate reflection. Recompute skipped onrejected-stale/rejected-invalid.src/reviews/rank.ts) —rankMints(db, limit = 50)via thebayesianScoreindex.Schema
v3 additive migration:
bayesianRank→bayesianScore(new index)averageRating→avgRating(new index)ratedCountupdatedAtindexmintAggregate(payload rename — see Fixup pass below)Scheduler wiring
Kind:38000 branch now delegates to the new module. Layer A bot-spam gate lives in
upsertReviewwith a Fedimint (k=38173) sibling shape check — 38172 Cashu-mint reviews with 16-char legacy d-tags and 38173 Fedimint reviews with non-hex junk d-tags both firewall at the edge.Test plan
bun run test— 224 passed (219 baseline + 5 fixup regressions)bun --filter='*' run test— green workspace-widebunx biome check packages/— 0 errors (16 pre-existing warnings for non-null assertions in test files)reviews/corpus.test.ts)rejectedByLayerA: 5→7,accepted: 11→9)Coverage summary
parse.test.tsaggregate.test.tsupsert.test.tsrank.test.tscorpus.test.tsschema.test.tsFixup pass
Post-review hardening from the three-reviewer triangulation (code-reviewer + silent-failure-hunter + pr-test-analyzer). No correctness regressions found — these are observability and edge-case firewalls.
19b46cafix(core/cache): narrow k=38173 d-tag shape check — the prior free-pass on kind:38173 let any short/junk d-tag land as long as thektag claimed Fedimint. Every real federation ID in the audit corpus is 64-char lowercase hex. AddedisValidFedimintDTagas a sibling shape gate innip87/dtag.ts; applied in bothupsertAnnouncementandupsertReview.f8b385bfix(core/scheduler): wrap onEvent cases in try/catch and count handler errors — the scheduler header comment claimed "each kind's handler does its own try/catch" but the 38000 / 38172 / 38173 / 0 / 10002 branches didn't. A thrown Dexie transaction would escape as an unhandled rejection and ingest would look healthy while stats froze. Each case body now wraps a try/catch that bumps a newstats.handlerErrorscounter, logs[scheduler] handler errorwith kind+eventId. Also addsstats.rejectedByParseforparseReview-null drops on the 38000 branch so those silent reject paths become observable.6c57602fix(core/cache): clear mintAggregate on v2 → v3 upgrade — Dexie auto-migrates the v3 schema indexes but NOT row PAYLOADS. A dev with v2 IndexedDB would have rows shaped{d, averageRating, bayesianRank, updatedAt}that fail every v3 query shape. Added.upgrade(tx => tx.table("mintAggregate").clear())— no prod data, the aggregate re-derives from reviews on next upsert, so a clean wipe is the correct migration.1a9aa49fix(core/reviews): explicit precedence fallthrough in parseRatingFromContent — the N/5 and N/10 content blocks didn't return when they matched with an out-of-range number; behaviour was correct-by-accident.7/5 ⭐⭐⭐⭐⭐would previously fall through to the emoji and report 5★. Now each numeric block commits to its format on match and returns (includingundefinedfor out-of-range).a7d9a45test(core/reviews): concurrent CAS race, malformed tags, rankMints bounds — three new tests: (1) two concurrentupsertReviewWithAggregatecalls for the same d with distinct pubkeys converge across 5 shuffled trials, aggregate sees both; (2) table-driven malformed rating tag coverage (foo,"", missing,null-as-string) all land onrating: null; (3)rankMintswithlimit=0returns[],limit=Infinityreturns the full ranked list.a1918eastyle(core/reviews): inline single-arg parseReview call to match biome format — mechanical formatter fix.Deviations
None — all three must-fix items, all three should-fix items, and all three new tests landed. The
audit/data-model-v1.md §13spec doc update lives outside the repo (keeper's local audit corpus) and is recorded in the keeper memory, not the PR diff.🤖 Generated with Claude Code