Skip to content

Slim @augur/core: drop hosted-provider classes#3

Merged
willgitdata merged 2 commits into
mainfrom
chore/lean-up-for-oss
May 8, 2026
Merged

Slim @augur/core: drop hosted-provider classes#3
willgitdata merged 2 commits into
mainfrom
chore/lean-up-for-oss

Conversation

@willgitdata
Copy link
Copy Markdown
Owner

Summary

Smaller, lighter, easier-to-grok `@augur/core` for OSS launch. Hosted providers are a 30-line interface implementation away — shipping them in core was decision paralysis ("which embedder?", "is OpenAIEmbedder maintained?") more than help.

What's removed

Class Why removed
`GeminiEmbedder` Fetch wrapper around `gemini-embedding-001` (~230 lines). Users with API keys are better served by `@google/generative-ai` directly.
`OpenAIEmbedder` 60-line fetch wrapper around `POST /v1/embeddings`. Idiomatic users use the `openai` SDK; we drift with each API change.
`CohereReranker` 50-line fetch wrapper around Cohere rerank. Use `cohere-ai` SDK.
`JinaReranker` 50-line fetch wrapper around Jina rerank. Use `jinaai` SDK or fetch direct.
Helpers (`sleep`, `parseRetryAfter`) Only used by GeminiEmbedder.

Plus 6 fetch-mock test cases that go with them.

Total: ~390 lines + 11 tests removed. Net diff: −713 / +83.

What stays in core

Offline-only (zero network at runtime):

  • Embedders: `HashEmbedder` (placeholder), `TfIdfEmbedder` (real IR baseline, no deps), `LocalEmbedder` (semantic embeddings via on-device ONNX, opt-in transformers.js dep)
  • Rerankers: `HeuristicReranker`, `LocalReranker` (cross-encoder ONNX), `MMRReranker` (diversity), `CascadedReranker` (chain pipelines)
  • Adapters: `InMemoryAdapter`, `PineconeAdapter`, `TurbopufferAdapter`, `PgVectorAdapter` (these `fetch` wrappers stay because each is genuinely thin and the providers' SDKs are heavier than warranted)
  • Chunkers: Fixed/Sentence/Semantic + Metadata/Doc2Query wrappers
  • Routing + traces unchanged

Hosted-provider migration path

EXAMPLES.md §5 now shows users writing their own `Embedder` / `Reranker` against the official SDK:

```ts
import OpenAI from "openai";
import { Augur, type Embedder } from "@augur/core";

class OpenAIEmbedder implements Embedder {
readonly name = "openai:text-embedding-3-small";
readonly dimension = 1536;
private client = new OpenAI();
async embed(texts: string[]): Promise<number[][]> {
const r = await this.client.embeddings.create({
model: "text-embedding-3-small",
input: texts,
});
return r.data.map((d) => d.embedding);
}
}
```

Same number of lines as our deleted class; uses the maintained, idiomatic SDK instead of our reimplementation that drifted with each API change.

Server CLI changes

`AUGUR_EMBEDDER` values:

  • `"openai"` → no longer recognized (use `hash`/`tfidf`/`local`, or pass a custom embedder via your own server build)
  • `"tfidf"` → `TfIdfEmbedder` (newly wired, was missing)
  • `"local"` → `LocalEmbedder` (newly wired)
  • `"hash"` → unchanged
  • `AUGUR_LOCAL_MODEL` env var lets you override the default ONNX model

Verification

  • `pnpm -r build` clean (core + server)
  • `pnpm typecheck` clean (core, server, dashboard, evaluations)
  • `pnpm --filter @augur/core test` — 91/91 pass (was 102; -11 hosted-provider tests removed)
  • `pnpm --filter @augur/evaluations test` — 12/12 pass

🤖 Generated with Claude Code

willgitdata and others added 2 commits May 7, 2026 18:02
Goal: smaller, lighter, easier-to-grok @augur/core for OSS launch.
Hosted providers are a 30-line interface implementation away — shipping
them in core was decision paralysis ("which embedder?", "is OpenAIEmbedder
maintained?") more than help.

# Removed from public API

  - GeminiEmbedder        (gemini-embedding-001 fetch wrapper, ~230 lines)
  - OpenAIEmbedder        (openai embeddings fetch wrapper, ~60 lines)
  - CohereReranker        (cohere rerank fetch wrapper, ~50 lines)
  - JinaReranker          (jina rerank fetch wrapper, ~50 lines)
  - parseRetryAfter, sleep helpers (only used by GeminiEmbedder)

That's ~390 lines + 6 fetch-mock tests removed. The remaining offline
classes (HashEmbedder, TfIdfEmbedder, LocalEmbedder, HeuristicReranker,
LocalReranker, MMRReranker, CascadedReranker) cover every actual usage
pattern; hosted = "implement the interface against your provider's SDK".

# Replaced with documentation

  - EXAMPLES.md §5 now shows OpenAI + Cohere classes the user writes
    against the official SDKs (openai, cohere-ai). Same number of lines
    as the deleted classes; uses the maintained, idiomatic SDK instead
    of our reimplementation that drifted with each API change.
  - ARCHITECTURE.md and DEVELOPMENT_GUIDE.md updated to point at
    LocalEmbedder / LocalReranker as the canonical real-baselines.

# Server CLI

  AUGUR_EMBEDDER values changed:
    "openai" → no longer recognized
    "tfidf"  → uses TfIdfEmbedder (was new, now wired)
    "local"  → uses LocalEmbedder (was new, now wired)
    "hash"   → unchanged
  AUGUR_LOCAL_MODEL env var lets you override the default ONNX model.

# Verification

  - pnpm -r build         clean (core + server)
  - pnpm typecheck        clean (core, server, dashboard, evaluations)
  - pnpm --filter @augur/core test     91/91 pass (was 102; -6 Gemini
                          tests, -3 Cohere/Jina tests, -2 obsolete
                          deduped)
  - pnpm --filter @augur/evaluations test  12/12 pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The /traces page polls the server every 4s for the latest trace list. The
previous `load()` closed over the initial `selected = null` from first
render — every poll re-evaluated `if (!selected) setSelected(top)` and
forced the user back to the most recent trace within seconds of clicking
a different one.

Fix: track first-pick vs user-pick via a ref. The auto-select still fires
once on initial load when the page is empty, but later polls leave the
user's selection alone. Also moved `load()` inside the effect for clearer
dependency surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@willgitdata willgitdata merged commit dbcb227 into main May 8, 2026
1 check passed
@willgitdata willgitdata deleted the chore/lean-up-for-oss branch May 8, 2026 01:16
willgitdata added a commit that referenced this pull request May 8, 2026
…ranker

Implements 10 fixes from a critical code review on the publish-ready
SDK. Each item is independently shippable; this lands them as one
coordinated bump because several feed each other (the eval smoke test
exercises the new fusion module; the explicit-reranker breaking change
needs MIGRATING.md to land at the same time).

  #1  Smoke eval harness — 16-doc / 12-query synthetic fixture with
      a regression floor (NDCG@10 > 0.65 on the stub stack). Runs in
      <50 ms as part of `pnpm test`. The full BEIR + 504-query eval
      stays where it already lives — git history — for "did this
      tweak win?" measurement.
  #2  Extract pure helpers to `fusion.ts` (composeFilter,
      pickVectorWeight, weightedRrfFuse, adaptWeightByConfidence,
      topGapNormalized, clamp) + 31 unit tests. Each empirical
      threshold is now annotated with what it was tuned against.
  #3  autoLanguageFilter integration tests — OFF default, ON for
      non-English, user-filter override wins, soft-fallback when the
      filtered pool empties.
  #4  Basic-search example wires the recommended stack: LocalEmbedder
      + LocalReranker + MetadataChunker(SentenceChunker) +
      InMemoryAdapter({ useStemming: true }). Matches the README's
      headline configuration so users copying from "hello world"
      land on the auto path that produces NDCG@10 = 0.920.
  #5  PineconeAdapter mocked-fetch tests (13 tests). Pins the wire
      format (URL, method, auth header, body shape, response decode)
      so refactors can't silently regress one of the three production
      adapters. Previously had zero coverage.
  #6  Ad-hoc scratch adapter cache — bounded LRU keyed by a
      deterministic fingerprint of (id, content). Repeat searches
      over the same `req.documents` skip re-chunking + re-embedding.
      Tunable via `adHocCacheSize` (default 8; set 0 to disable).
  #7  **BREAKING** Drop `HeuristicReranker` as silent default. The
      previous default did almost nothing while emitting a
      "yes I rerank" line in the trace. Default is now `null`;
      pass `new LocalReranker()` (or any provider's reranker)
      explicitly to keep cross-encoder voting on. Documented in
      MIGRATING.md.
  #8  MIGRATING.md — covers every BREAKING change in 0.2 with
      smallest-diff examples; non-breaking adoptions documented
      separately.
  #9  SemanticChunker tests (8 tests) — boundary detection, maxSize
      cap, async-only API, metadata propagation. Was the only
      chunker without coverage.
  #10 Magic-number provenance documented in `router.ts` and
      `fusion.ts`. Every threshold (≤2 / ≤6 word counts, 0.6
      ambiguity floor, 800 ms latency floor, k=60 RRF, ±0.20
      shift, 0.10–0.90 weight clamp, 0.3/0.4/0.5/0.7 priors) now
      records what it's tuned against and what's load-bearing
      vs negotiable.

Test count: 100 → 163. Build green; full eval still ships against
the published packages, not against the smoke fixture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
willgitdata added a commit that referenced this pull request May 8, 2026
Round 1 left two critical lying-trace bugs and a handful of
architectural papercuts. This commit clears the table.

  #1 (critical) Router no longer lies about reranking when no
     reranker is configured. `Router.decide` gained an optional
     `hasReranker` parameter; `HeuristicRouter` plumbs it through to
     `shouldRerank` and forces `reranked: false` when absent. The
     trace now records "reranking skipped (no reranker configured)"
     instead of pretending the cross-encoder fired. Augur passes
     `this.reranker !== null`. Default is `true` so existing
     third-party Router implementations keep working.

  #2 (critical) `SearchTrace` declares the four fields that augur.ts
     attaches at runtime: `adHoc`, `adHocCacheHit`, `autoLanguageFilter`,
     `autoLanguageFilterDropped`. `Tracer.finish` opts widened to
     `Omit<SearchTrace, "id"|"query"|"startedAt"|"totalMs"|"spans">`
     so adding a SearchTrace field propagates automatically. Tests
     drop their `as unknown as` casts.

  #3 (high) PgVectorAdapter mocked-fetch tests (14 tests). Pin SQL
     shape, INSERT batching at 200/round-trip, parameter renumbering
     across filter clauses, identifier-validation guard against
     `; DROP TABLE`, **and a filter-key SQL-injection regression
     test** — the JSON-path quote-doubling defense gets explicit
     coverage.

  #4 (high) Adapter trace-string format change reverted. `trace.adapter`
     is always the bare adapter name; ad-hoc / cache-hit signals
     surface as the new structured boolean fields from #2. No more
     "in-memory (ad-hoc, cached)" string parsing.

  #5 (high) `fingerprintDocs` extracted to `fingerprint.ts` with 10
     direct unit tests covering reorder, byte-change, prefix-equal
     corpora, id|content boundary, doc-record boundary, empty list,
     unicode, and the output format contract.

  #6 (medium) Async chunkers no longer pretend to be `Chunker`s.
     Introduced an explicit `AsyncChunker` interface;
     `SemanticChunker`, `Doc2QueryChunker`, `ContextualChunker`
     implement it (no longer Chunker). The runtime traps in their
     throwing `chunk()` methods are gone — the type system catches
     misuse at compile time. APIs that accept either flavor
     (`AugurOptions.chunker`, all chunker `base` fields,
     `chunkDocument`) now use `Chunker | AsyncChunker`.
     `MetadataChunker` keeps its dual sync+async path with a runtime
     guard for the user-opted-in case where its base is async.

  #7 (medium) `StubEmbedder` consolidated into
     `packages/core/src/test-fixtures.ts`. Excluded from the
     published package via tsconfig. Three duplicated copies dropped.

  #8 (low) `eval-smoke.test.ts` header explicitly distinguishes the
     synthetic-fixture smoke test (structural) from the BEIR /
     504-query eval that produced the README's NDCG@10 = 0.920
     numbers (preserved at git `feffc73^`, runs in ~30 min).

  #9 (low) `BaseAdapter` JSDoc rewritten as the canonical
     "starting point for custom adapters" comment, including the
     RRF / capability / `searchHybrid` override guidance.
     `AsyncChunker` added to public exports.

  #10 (low) `examples/basic-search/index.ts` header documents the
      `npm i @huggingface/transformers` requirement for users
      copying the file out of the repo.

Test count: 163 → 191 (+28). Build green; published-package
contents verified clean (`find dist -name "test-fixtures*"` →
empty, `find dist -name "*.test.*"` → empty).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
willgitdata added a commit that referenced this pull request May 11, 2026
…ranker

Implements 10 fixes from a critical code review on the publish-ready
SDK. Each item is independently shippable; this lands them as one
coordinated bump because several feed each other (the eval smoke test
exercises the new fusion module; the explicit-reranker breaking change
needs MIGRATING.md to land at the same time).

  #1  Smoke eval harness — 16-doc / 12-query synthetic fixture with
      a regression floor (NDCG@10 > 0.65 on the stub stack). Runs in
      <50 ms as part of `pnpm test`. The full BEIR + 504-query eval
      stays where it already lives — git history — for "did this
      tweak win?" measurement.
  #2  Extract pure helpers to `fusion.ts` (composeFilter,
      pickVectorWeight, weightedRrfFuse, adaptWeightByConfidence,
      topGapNormalized, clamp) + 31 unit tests. Each empirical
      threshold is now annotated with what it was tuned against.
  #3  autoLanguageFilter integration tests — OFF default, ON for
      non-English, user-filter override wins, soft-fallback when the
      filtered pool empties.
  #4  Basic-search example wires the recommended stack: LocalEmbedder
      + LocalReranker + MetadataChunker(SentenceChunker) +
      InMemoryAdapter({ useStemming: true }). Matches the README's
      headline configuration so users copying from "hello world"
      land on the auto path that produces NDCG@10 = 0.920.
  #5  PineconeAdapter mocked-fetch tests (13 tests). Pins the wire
      format (URL, method, auth header, body shape, response decode)
      so refactors can't silently regress one of the three production
      adapters. Previously had zero coverage.
  #6  Ad-hoc scratch adapter cache — bounded LRU keyed by a
      deterministic fingerprint of (id, content). Repeat searches
      over the same `req.documents` skip re-chunking + re-embedding.
      Tunable via `adHocCacheSize` (default 8; set 0 to disable).
  #7  **BREAKING** Drop `HeuristicReranker` as silent default. The
      previous default did almost nothing while emitting a
      "yes I rerank" line in the trace. Default is now `null`;
      pass `new LocalReranker()` (or any provider's reranker)
      explicitly to keep cross-encoder voting on. Documented in
      MIGRATING.md.
  #8  MIGRATING.md — covers every BREAKING change in 0.2 with
      smallest-diff examples; non-breaking adoptions documented
      separately.
  #9  SemanticChunker tests (8 tests) — boundary detection, maxSize
      cap, async-only API, metadata propagation. Was the only
      chunker without coverage.
  #10 Magic-number provenance documented in `router.ts` and
      `fusion.ts`. Every threshold (≤2 / ≤6 word counts, 0.6
      ambiguity floor, 800 ms latency floor, k=60 RRF, ±0.20
      shift, 0.10–0.90 weight clamp, 0.3/0.4/0.5/0.7 priors) now
      records what it's tuned against and what's load-bearing
      vs negotiable.

Test count: 100 → 163. Build green; full eval still ships against
the published packages, not against the smoke fixture.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
willgitdata added a commit that referenced this pull request May 11, 2026
Round 1 left two critical lying-trace bugs and a handful of
architectural papercuts. This commit clears the table.

  #1 (critical) Router no longer lies about reranking when no
     reranker is configured. `Router.decide` gained an optional
     `hasReranker` parameter; `HeuristicRouter` plumbs it through to
     `shouldRerank` and forces `reranked: false` when absent. The
     trace now records "reranking skipped (no reranker configured)"
     instead of pretending the cross-encoder fired. Augur passes
     `this.reranker !== null`. Default is `true` so existing
     third-party Router implementations keep working.

  #2 (critical) `SearchTrace` declares the four fields that augur.ts
     attaches at runtime: `adHoc`, `adHocCacheHit`, `autoLanguageFilter`,
     `autoLanguageFilterDropped`. `Tracer.finish` opts widened to
     `Omit<SearchTrace, "id"|"query"|"startedAt"|"totalMs"|"spans">`
     so adding a SearchTrace field propagates automatically. Tests
     drop their `as unknown as` casts.

  #3 (high) PgVectorAdapter mocked-fetch tests (14 tests). Pin SQL
     shape, INSERT batching at 200/round-trip, parameter renumbering
     across filter clauses, identifier-validation guard against
     `; DROP TABLE`, **and a filter-key SQL-injection regression
     test** — the JSON-path quote-doubling defense gets explicit
     coverage.

  #4 (high) Adapter trace-string format change reverted. `trace.adapter`
     is always the bare adapter name; ad-hoc / cache-hit signals
     surface as the new structured boolean fields from #2. No more
     "in-memory (ad-hoc, cached)" string parsing.

  #5 (high) `fingerprintDocs` extracted to `fingerprint.ts` with 10
     direct unit tests covering reorder, byte-change, prefix-equal
     corpora, id|content boundary, doc-record boundary, empty list,
     unicode, and the output format contract.

  #6 (medium) Async chunkers no longer pretend to be `Chunker`s.
     Introduced an explicit `AsyncChunker` interface;
     `SemanticChunker`, `Doc2QueryChunker`, `ContextualChunker`
     implement it (no longer Chunker). The runtime traps in their
     throwing `chunk()` methods are gone — the type system catches
     misuse at compile time. APIs that accept either flavor
     (`AugurOptions.chunker`, all chunker `base` fields,
     `chunkDocument`) now use `Chunker | AsyncChunker`.
     `MetadataChunker` keeps its dual sync+async path with a runtime
     guard for the user-opted-in case where its base is async.

  #7 (medium) `StubEmbedder` consolidated into
     `packages/core/src/test-fixtures.ts`. Excluded from the
     published package via tsconfig. Three duplicated copies dropped.

  #8 (low) `eval-smoke.test.ts` header explicitly distinguishes the
     synthetic-fixture smoke test (structural) from the BEIR /
     504-query eval that produced the README's NDCG@10 = 0.920
     numbers (preserved at git `4d52844^`, runs in ~30 min).

  #9 (low) `BaseAdapter` JSDoc rewritten as the canonical
     "starting point for custom adapters" comment, including the
     RRF / capability / `searchHybrid` override guidance.
     `AsyncChunker` added to public exports.

  #10 (low) `examples/basic-search/index.ts` header documents the
      `npm i @huggingface/transformers` requirement for users
      copying the file out of the repo.

Test count: 163 → 191 (+28). Build green; published-package
contents verified clean (`find dist -name "test-fixtures*"` →
empty, `find dist -name "*.test.*"` → empty).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant