Use canonical wordmark assets at repo root #7
Merged
Swaps the assets/wordmark-{light,dark}.svg I created during the
wordmark PR for the canonical versions you provided at repo root:
- augur-wordmark-light.svg
- augur-wordmark-dark.svg
The dark variant is meaningfully better than what I generated — the
wizard pixel art itself is brightened for dark backgrounds (purples
shift from #5B2C8A → #7A4AB5 / #6B3AA0 → #9359C9 / #7A4AB5 → #A878E0,
brown shifts to lighter #A0522D / #C97A4F), not just the AUGUR text.
Under GitHub dark mode the staff and robe now read clearly instead of
fading into the background.
README header swapped to your exact snippet:
<picture>
<source media="(prefers-color-scheme: dark)" srcset="augur-wordmark-dark.svg">
<img src="augur-wordmark-light.svg" alt="Augur">
</picture>
The assets/ directory I added is now empty and removed from the tree.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
willgitdata
added a commit
that referenced
this pull request
May 8, 2026
…ranker
Implements 10 fixes from a critical code review on the publish-ready SDK. Each item is independently shippable; this lands them as one coordinated bump because several feed each other (the eval smoke test exercises the new fusion module; the explicit-reranker breaking change needs MIGRATING.md to land at the same time).
#1 Smoke eval harness: 16-doc / 12-query synthetic fixture with a regression floor (NDCG@10 > 0.65 on the stub stack). Runs in <50 ms as part of `pnpm test`. The full BEIR + 504-query eval stays where it already lives (git history) for "did this tweak win?" measurement.
#2 Extract pure helpers to `fusion.ts` (composeFilter, pickVectorWeight, weightedRrfFuse, adaptWeightByConfidence, topGapNormalized, clamp) + 31 unit tests. Each empirical threshold is now annotated with what it was tuned against.
#3 autoLanguageFilter integration tests: OFF default, ON for non-English, user-filter override wins, soft-fallback when the filtered pool empties.
#4 Basic-search example wires the recommended stack: LocalEmbedder + LocalReranker + MetadataChunker(SentenceChunker) + InMemoryAdapter({ useStemming: true }). Matches the README's headline configuration so users copying from "hello world" land on the auto path that produces NDCG@10 = 0.920.
#5 PineconeAdapter mocked-fetch tests (13 tests). Pins the wire format (URL, method, auth header, body shape, response decode) so refactors can't silently regress one of the three production adapters. Previously had zero coverage.
#6 Ad-hoc scratch adapter cache: bounded LRU keyed by a deterministic fingerprint of (id, content). Repeat searches over the same `req.documents` skip re-chunking + re-embedding. Tunable via `adHocCacheSize` (default 8; set 0 to disable).
#7 **BREAKING** Drop `HeuristicReranker` as silent default. The previous default did almost nothing while emitting a "yes I rerank" line in the trace. Default is now `null`; pass `new LocalReranker()` (or any provider's reranker) explicitly to keep cross-encoder voting on. Documented in MIGRATING.md.
#8 MIGRATING.md covers every BREAKING change in 0.2 with smallest-diff examples; non-breaking adoptions documented separately.
#9 SemanticChunker tests (8 tests): boundary detection, maxSize cap, async-only API, metadata propagation. Was the only chunker without coverage.
#10 Magic-number provenance documented in `router.ts` and `fusion.ts`. Every threshold (≤2 / ≤6 word counts, 0.6 ambiguity floor, 800 ms latency floor, k=60 RRF, ±0.20 shift, 0.10–0.90 weight clamp, 0.3/0.4/0.5/0.7 priors) now records what it's tuned against and what's load-bearing vs negotiable.
Test count: 100 → 163. Build green; full eval still ships against the published packages, not against the smoke fixture.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
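For readers unfamiliar with the fusion step named in #2 and #10, here is a minimal sketch of weighted reciprocal rank fusion using the k=60 constant and the 0.10–0.90 weight clamp mentioned above. The function names echo the helper names listed in #2, but the signatures, the `Ranked` type, and the two-list shape are assumptions for illustration; the actual `fusion.ts` is not shown in this thread.

```typescript
// Hypothetical shape: a ranked list of hits, best first.
type Ranked = { id: string }[];

// Keep a value inside [lo, hi].
function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// Weighted reciprocal rank fusion. Each list contributes weight / (k + rank),
// with rank starting at 1. The vector weight is clamped to [0.10, 0.90] so
// neither retriever is ever silenced completely (assumed intent of the clamp).
function weightedRrfFuse(
  vector: Ranked,
  keyword: Ranked,
  vectorWeight: number,
  k = 60,
): { id: string; score: number }[] {
  const w = clamp(vectorWeight, 0.1, 0.9);
  const scores = new Map<string, number>();
  const add = (list: Ranked, weight: number): void => {
    list.forEach((hit, i) => {
      scores.set(hit.id, (scores.get(hit.id) ?? 0) + weight / (k + i + 1));
    });
  };
  add(vector, w);
  add(keyword, 1 - w);
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// "b" appears in both lists, so it accumulates two contributions and wins.
const fused = weightedRrfFuse(
  [{ id: "a" }, { id: "b" }],
  [{ id: "b" }, { id: "c" }],
  0.5,
);
```

With equal weights, a document that both retrievers return outranks one that only a single retriever ranks first, which is the property RRF is usually chosen for.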
willgitdata
added a commit
that referenced
this pull request
May 8, 2026
Round 1 left two critical lying-trace bugs and a handful of architectural papercuts. This commit clears the table.
#1 (critical) Router no longer lies about reranking when no reranker is configured. `Router.decide` gained an optional `hasReranker` parameter; `HeuristicRouter` plumbs it through to `shouldRerank` and forces `reranked: false` when absent. The trace now records "reranking skipped (no reranker configured)" instead of pretending the cross-encoder fired. Augur passes `this.reranker !== null`. Default is `true` so existing third-party Router implementations keep working.
#2 (critical) `SearchTrace` declares the four fields that augur.ts attaches at runtime: `adHoc`, `adHocCacheHit`, `autoLanguageFilter`, `autoLanguageFilterDropped`. `Tracer.finish` opts widened to `Omit<SearchTrace, "id"|"query"|"startedAt"|"totalMs"|"spans">` so adding a SearchTrace field propagates automatically. Tests drop their `as unknown as` casts.
#3 (high) PgVectorAdapter mocked-fetch tests (14 tests). Pin SQL shape, INSERT batching at 200/round-trip, parameter renumbering across filter clauses, identifier-validation guard against `; DROP TABLE`, **and a filter-key SQL-injection regression test**; the JSON-path quote-doubling defense gets explicit coverage.
#4 (high) Adapter trace-string format change reverted. `trace.adapter` is always the bare adapter name; ad-hoc / cache-hit signals surface as the new structured boolean fields from #2. No more "in-memory (ad-hoc, cached)" string parsing.
#5 (high) `fingerprintDocs` extracted to `fingerprint.ts` with 10 direct unit tests covering reorder, byte-change, prefix-equal corpora, id|content boundary, doc-record boundary, empty list, unicode, and the output format contract.
#6 (medium) Async chunkers no longer pretend to be `Chunker`s. Introduced an explicit `AsyncChunker` interface; `SemanticChunker`, `Doc2QueryChunker`, `ContextualChunker` implement it (no longer Chunker). The runtime traps in their throwing `chunk()` methods are gone; the type system catches misuse at compile time. APIs that accept either flavor (`AugurOptions.chunker`, all chunker `base` fields, `chunkDocument`) now use `Chunker | AsyncChunker`. `MetadataChunker` keeps its dual sync+async path with a runtime guard for the user-opted-in case where its base is async.
#7 (medium) `StubEmbedder` consolidated into `packages/core/src/test-fixtures.ts`. Excluded from the published package via tsconfig. Three duplicated copies dropped.
#8 (low) `eval-smoke.test.ts` header explicitly distinguishes the synthetic-fixture smoke test (structural) from the BEIR / 504-query eval that produced the README's NDCG@10 = 0.920 numbers (preserved at git `feffc73^`, runs in ~30 min).
#9 (low) `BaseAdapter` JSDoc rewritten as the canonical "starting point for custom adapters" comment, including the RRF / capability / `searchHybrid` override guidance. `AsyncChunker` added to public exports.
#10 (low) `examples/basic-search/index.ts` header documents the `npm i @huggingface/transformers` requirement for users copying the file out of the repo.
Test count: 163 → 191 (+28). Build green; published-package contents verified clean (`find dist -name "test-fixtures*"` → empty, `find dist -name "*.test.*"` → empty).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
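The backward-compatible `hasReranker` change in #1 can be pictured roughly as follows. This is a simplified sketch, not the library's code: `RouteDecision`, `SketchRouter`, and the word-count heuristic are hypothetical stand-ins, and only the optional parameter, its `true` default, and the trace wording come from the commit message.

```typescript
// Hypothetical, simplified decision shape; the real trace has more fields.
interface RouteDecision {
  reranked: boolean;
  reason: string;
}

interface Router {
  // Optional, so routers written before this change still satisfy the interface.
  decide(query: string, hasReranker?: boolean): RouteDecision;
}

class SketchRouter implements Router {
  // Defaulting to true preserves the old behaviour for callers that omit it.
  decide(query: string, hasReranker = true): RouteDecision {
    if (!hasReranker) {
      // Never claim a rerank happened when nothing could have performed it.
      return {
        reranked: false,
        reason: "reranking skipped (no reranker configured)",
      };
    }
    // Placeholder heuristic (an assumption): rerank queries longer than two words.
    const wantsRerank = query.trim().split(/\s+/).length > 2;
    return {
      reranked: wantsRerank,
      reason: wantsRerank ? "reranked" : "short query, rerank skipped",
    };
  }
}

const router = new SketchRouter();
const withoutReranker = router.decide("how do I rotate api keys", false);
const legacyCall = router.decide("how do I rotate api keys");
```

The caller decides what "has a reranker" means; per the commit message, Augur passes `this.reranker !== null`.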
willgitdata
added a commit
that referenced
this pull request
May 11, 2026
Round 1 left two critical lying-trace bugs and a handful of architectural papercuts. This commit clears the table.
#1 (critical) Router no longer lies about reranking when no reranker is configured. `Router.decide` gained an optional `hasReranker` parameter; `HeuristicRouter` plumbs it through to `shouldRerank` and forces `reranked: false` when absent. The trace now records "reranking skipped (no reranker configured)" instead of pretending the cross-encoder fired. Augur passes `this.reranker !== null`. Default is `true` so existing third-party Router implementations keep working.
#2 (critical) `SearchTrace` declares the four fields that augur.ts attaches at runtime: `adHoc`, `adHocCacheHit`, `autoLanguageFilter`, `autoLanguageFilterDropped`. `Tracer.finish` opts widened to `Omit<SearchTrace, "id"|"query"|"startedAt"|"totalMs"|"spans">` so adding a SearchTrace field propagates automatically. Tests drop their `as unknown as` casts.
#3 (high) PgVectorAdapter mocked-fetch tests (14 tests). Pin SQL shape, INSERT batching at 200/round-trip, parameter renumbering across filter clauses, identifier-validation guard against `; DROP TABLE`, **and a filter-key SQL-injection regression test**; the JSON-path quote-doubling defense gets explicit coverage.
#4 (high) Adapter trace-string format change reverted. `trace.adapter` is always the bare adapter name; ad-hoc / cache-hit signals surface as the new structured boolean fields from #2. No more "in-memory (ad-hoc, cached)" string parsing.
#5 (high) `fingerprintDocs` extracted to `fingerprint.ts` with 10 direct unit tests covering reorder, byte-change, prefix-equal corpora, id|content boundary, doc-record boundary, empty list, unicode, and the output format contract.
#6 (medium) Async chunkers no longer pretend to be `Chunker`s. Introduced an explicit `AsyncChunker` interface; `SemanticChunker`, `Doc2QueryChunker`, `ContextualChunker` implement it (no longer Chunker). The runtime traps in their throwing `chunk()` methods are gone; the type system catches misuse at compile time. APIs that accept either flavor (`AugurOptions.chunker`, all chunker `base` fields, `chunkDocument`) now use `Chunker | AsyncChunker`. `MetadataChunker` keeps its dual sync+async path with a runtime guard for the user-opted-in case where its base is async.
#7 (medium) `StubEmbedder` consolidated into `packages/core/src/test-fixtures.ts`. Excluded from the published package via tsconfig. Three duplicated copies dropped.
#8 (low) `eval-smoke.test.ts` header explicitly distinguishes the synthetic-fixture smoke test (structural) from the BEIR / 504-query eval that produced the README's NDCG@10 = 0.920 numbers (preserved at git `4d52844^`, runs in ~30 min).
#9 (low) `BaseAdapter` JSDoc rewritten as the canonical "starting point for custom adapters" comment, including the RRF / capability / `searchHybrid` override guidance. `AsyncChunker` added to public exports.
#10 (low) `examples/basic-search/index.ts` header documents the `npm i @huggingface/transformers` requirement for users copying the file out of the repo.
Test count: 163 → 191 (+28). Build green; published-package contents verified clean (`find dist -name "test-fixtures*"` → empty, `find dist -name "*.test.*"` → empty).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
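To make the fingerprint-plus-cache idea from #5 (and the bounded LRU it keys) more concrete, here is a rough sketch. Everything in it is an assumption for illustration: the `Doc` shape, the length-prefixed serialization, the 32-bit FNV-1a-style hash, the `count-hash` output format, and the `LruCache` class. The real `fingerprint.ts` may use a different encoding and a stronger hash.

```typescript
// Hypothetical document shape.
interface Doc {
  id: string;
  content: string;
}

// Length-prefix every field so ("ab", "c") and ("a", "bc") serialize differently,
// and so the boundary between two documents cannot be confused either.
function serializeDocs(docs: Doc[]): string {
  return docs
    .map((d) => `${d.id.length}:${d.id}${d.content.length}:${d.content}`)
    .join("");
}

// 32-bit FNV-1a-style hash over UTF-16 code units, as 8 hex characters.
// Sketch only: 32 bits can collide, so a production fingerprint would want more.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Order-sensitive fingerprint: document count, a dash, then the hash.
function fingerprintDocs(docs: Doc[]): string {
  return `${docs.length}-${fnv1a(serializeDocs(docs))}`;
}

// Bounded LRU built on Map insertion order; capacity 0 disables caching.
class LruCache<V> {
  private readonly capacity: number;
  private readonly entries = new Map<string, V>();

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    if (this.capacity <= 0) return;
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

A search over ad-hoc documents would compute `fingerprintDocs(req.documents)`, look it up in the cache, and only chunk and embed on a miss.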
Summary
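Swaps the `assets/wordmark-{light,dark}.svg` I created in PR #5 for the canonical versions you provided at the repo root:
- `augur-wordmark-light.svg`
- `augur-wordmark-dark.svg`
The dark variant is meaningfully better than what I generated: under GitHub dark mode the wizard's robe and staff now read clearly instead of the deep purples fading into the background. Specifically:
- The wizard pixel art itself is brightened for dark backgrounds, not just the AUGUR text.
- Purples shift from #5B2C8A → #7A4AB5, #6B3AA0 → #9359C9, and #7A4AB5 → #A878E0.
- Brown shifts to lighter #A0522D / #C97A4F.
README header swapped to your exact snippet:
```html
<picture>
  <source media="(prefers-color-scheme: dark)" srcset="augur-wordmark-dark.svg">
  <img src="augur-wordmark-light.svg" alt="Augur">
</picture>
```
The `assets/` directory is now empty and removed from the tree.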
Test plan
🤖 Generated with Claude Code