…ends
Introduces the control-plane abstraction that workspaces, catalogs,
vector-store descriptors, and documents will flow through. Three backends
are planned: memory (default), file (single-node self-hosted), and
astra (Data API Tables, landing in a follow-up).
This PR ships the interface plus two backends:
- MemoryControlPlaneStore — in-process Map-of-Maps, default for CI
and `docker run` without external services.
- FileControlPlaneStore — JSON-on-disk with per-file mutex and
atomic-rename writes, for single-node self-hosted deployments.
A shared behavioral contract suite runs the same 14 assertions against
every backend, guaranteeing identical semantics as new backends land.
Key design properties:
- Records are immutable. Every update spreads into a new object.
- Secrets are never stored by value — all credentials, embedding, and
reranking config use a `SecretRef` string ("provider:path") resolved
lazily by a pluggable SecretProvider.
- Catalog → vector-store is N:1 (multiple catalogs may share one
underlying collection).
- Cascade delete on deleteWorkspace (catalogs, vector stores,
documents) and on deleteCatalog (documents).
- Memory layout mirrors the CQL partition structure so the future
astra backend translates 1:1.
- `kind` enum: astra | hcd | openrag | mock. Mock stays first-class
so CI and offline dev never need external services.
No routes are wired yet — that's the next PR in this series, which
layers /api/v1/workspaces CRUD on top of this interface.
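To illustrate the secrets-by-reference property above, here is a minimal sketch of parsing and resolving a `SecretRef` at the point of use. The helper names (`parseSecretRef`, `resolveSecret`, `SecretLookup`) and the synchronous lookup are assumptions made for illustration; the actual SecretProvider interface is not shown in this PR.

```typescript
// Hypothetical sketch: parseSecretRef, resolveSecret and SecretLookup are
// illustrative names, not the PR's API. Lookups are synchronous for brevity.
type SecretRef = string; // "provider:path", e.g. "env:ASTRA_TOKEN"

interface ParsedSecretRef {
  provider: string;
  path: string;
}

// A provider turns the path part of a ref into the secret value.
type SecretLookup = (path: string) => string | undefined;

function parseSecretRef(ref: SecretRef): ParsedSecretRef {
  const idx = ref.indexOf(":");
  if (idx <= 0 || idx === ref.length - 1) {
    throw new Error(`Malformed SecretRef "${ref}" (expected "provider:path")`);
  }
  // Split on the first colon only, so the path itself may contain colons.
  return { provider: ref.slice(0, idx), path: ref.slice(idx + 1) };
}

function resolveSecret(ref: SecretRef, providers: Map<string, SecretLookup>): string {
  const { provider, path } = parseSecretRef(ref);
  const lookup = providers.get(provider);
  if (lookup === undefined) {
    throw new Error(`No provider registered for "${provider}"`);
  }
  const value = lookup(path);
  if (value === undefined) {
    throw new Error(`Secret not found for "${ref}"`);
  }
  return value;
}

// Records carry only the ref; the value is looked up when it is needed.
const fakeEnv = new Map<string, string>([["ASTRA_TOKEN", "s3cret"]]);
const providers = new Map<string, SecretLookup>([
  ["env", (name) => fakeEnv.get(name)],
]);
const workspace = { name: "demo", tokenRef: "env:ASTRA_TOKEN" };
const token = resolveSecret(workspace.tokenRef, providers);
```

Because the record only ever holds the `tokenRef` string, serialising it to the file backend or to a log cannot leak the credential itself.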
erichare added a commit that referenced this pull request on Apr 22, 2026
Third backend for ControlPlaneStore, alongside memory (PR #4) and file
(PR #4). Uses @datastax/astra-db-ts to operate on the four wb_* tables
that mirror the CQL schema Cédrick published.

Module layout:

src/astra-client/
- table-definitions.ts: DDL for each wb_* table (Data API Table
  definitions, snake_case columns, composite PKs where appropriate).
- row-types.ts: Literal row shapes (the JSON wire format).
- converters.ts: Pure record↔row conversion. camelCase records in
  control-plane/types.ts flatten to the snake_case wb_* rows (vector
  store's nested embedding/lexical/reranking configs flatten into
  embedding_*, lexical_*, reranking_* columns).
- tables.ts: Narrow structural interface (TablesBundle) over the
  astra-db-ts subset we use. Lets tests inject an in-memory fake.
- client.ts: openAstraClient() creates the four tables with
  ifNotExists:true at init, returns a TablesBundle over real
  astra-db-ts tables.

src/control-plane/astra/
- store.ts: AstraControlPlaneStore implementing ControlPlaneStore.
  Holds no state; delegates every op to the TablesBundle. Handles
  cascade semantics identically to the file backend (no
  cross-partition transaction).

Tests (+24):
- tests/control-plane/astra-fake.ts
- tests/control-plane/astra.test.ts: Runs the existing 14-assertion
  shared contract suite against AstraControlPlaneStore wired to an
  in-memory fake. All 14 pass.
- tests/astra-client/converters.test.ts: 10 tests covering round-trip
  equivalence for all four record types, snake_case/flat row shape
  checks, null/undefined handling.

A real-Astra integration test gated on ASTRA_DB_* env vars will ship
with the route layer in PR-1a.3. The fake exercises the store's
internal logic (conversions, cascades, existence checks), and the
fixture conformance diff (PR-1a.3) verifies wire-level behavior.

Verification:
- tsc --noEmit clean (5 new .ts files, all typed through)
- biome check clean
- vitest run 95/95 passing (10 converter + 14 astra contract + 71 prior)
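As a rough illustration of the record↔row conversion described for converters.ts, the sketch below flattens a nested camelCase record into a flat snake_case row and back. The field names (`provider`, `model`, `dimension`) and the function names are assumptions; only the camelCase-to-snake_case mapping and the `embedding_*` column prefix come from the commit message.

```typescript
// Hypothetical record and row shapes: field names are illustrative only.
interface VectorStoreRecord {
  readonly workspaceUid: string;
  readonly uid: string;
  readonly name: string;
  readonly embedding: {
    readonly provider: string;
    readonly model: string;
    readonly dimension: number;
  };
}

interface VectorStoreRow {
  workspace_uid: string;
  uid: string;
  name: string;
  embedding_provider: string;
  embedding_model: string;
  embedding_dimension: number;
}

// Pure function: nested config flattens into prefixed columns.
function vectorStoreToRow(record: VectorStoreRecord): VectorStoreRow {
  return {
    workspace_uid: record.workspaceUid,
    uid: record.uid,
    name: record.name,
    embedding_provider: record.embedding.provider,
    embedding_model: record.embedding.model,
    embedding_dimension: record.embedding.dimension,
  };
}

// Inverse: prefixed columns are regrouped into the nested config.
function rowToVectorStore(row: VectorStoreRow): VectorStoreRecord {
  return {
    workspaceUid: row.workspace_uid,
    uid: row.uid,
    name: row.name,
    embedding: {
      provider: row.embedding_provider,
      model: row.embedding_model,
      dimension: row.embedding_dimension,
    },
  };
}
```

Keeping both directions pure makes the round-trip property (`rowToVectorStore(vectorStoreToRow(r))` equals `r`) easy to assert without a database.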
erichare added a commit that referenced this pull request on Apr 22, 2026
…xtures (#8)

* feat(api): /api/v1/* CRUD wired to ControlPlaneStore + conformance fixtures

Ends the route-less era. The TS runtime now exposes the full workspace /
catalog / vector-store CRUD surface over `/api/v1/*`, backed by the
pluggable ControlPlaneStore that shipped in #4 and #7. Also lands the
fixture regeneration script so every language green box can diff against
a single canonical contract.

Config (breaking, pre-prod):
- Replaces the old `workspaces: [...]` block (per-workspace driver, auth,
  and nested vectorStores/catalogs) with:
  - `controlPlane: { driver: "memory" | "file" | "astra", ... }`
  - `seedWorkspaces: [...]` (optional, only used by memory driver)
- The default is `controlPlane.driver: memory`, so `docker run` with no
  config still boots cleanly.
- YAML interpolation (${VAR}) still works; new `tokenRef` fields use our
  SecretRef format (`env:FOO` / `file:/path`) instead of inline expansion.

Runtime wiring (src/):
- New `src/secrets/` module:
  - provider.ts: SecretResolver + SecretProvider interface
  - env.ts: resolves `env:VAR_NAME`
  - file.ts: resolves `file:/path`, trimmed
- New `src/control-plane/factory.ts`: builds a ControlPlaneStore from
  config (memory | file | astra); astra resolves tokenRef via the
  SecretResolver.
- New `src/routes/api-v1/` module:
  - workspaces.ts / catalogs.ts / vector-stores.ts: full CRUD, scoped via
    path params, validated via OpenAPIHono.
  - helpers.ts: maps typed ControlPlane* errors to the canonical envelope
    (404 / 409 / 503). Invoked from the top-level `onError` so route
    handlers can throw normally without fighting OpenAPIHono's
    typed-response inference.
- Updated src/app.ts to mount the new routes at /api/v1 (Cédrick's path
  prefix) and serve OpenAPI at /api/v1/openapi.json.
- Updated src/root.ts to wire SecretResolver → factory → app.
- Removed: src/workspaces/registry.ts, src/routes/workspaces.ts,
  src/lib/auth.ts, src/lib/redact.ts. None are referenced anymore.
- Rewrote src/openapi/schemas.ts around the new record types.

Conformance harness (clients/conformance/):
- scenarios.json: machine-readable version of scenarios.md. Template
  language: `$N.field` refers to step N's response body.
- runner.mjs: generic scenario runner; takes any fetcher that returns
  { status, body } and returns normalized captures. Used by both the
  drift test and the regeneration script.
- normalize.mjs rewritten: shape-agnostic (walks any JSON tree),
  collapses all timestamps to a single {{TS}} placeholder for CI
  determinism (ms-granularity clock collisions made ordered {{TS_N}}
  placeholders flaky).
- Fixtures committed:
  - workspace-crud-basic.json
  - catalog-under-workspace.json
  - vector-store-definition.json

Tooling:
- `npm run conformance:regenerate` (new): runs each scenario in
  scenarios.json against a fresh in-process memory-backed app,
  normalizes responses, writes fixtures.
- `tests/conformance/drift.test.ts` (new): runs the same loop and
  asserts captures match the committed fixtures. Stable across 5
  consecutive runs.

Tests (+24, -14):
- Removed the old per-workspace-auth and redaction tests (those modules
  are gone).
- Rewrote tests/app.test.ts around /api/v1/* against a memory store
  (workspaces / catalogs / vector-stores: 24 tests).
- Rewrote tests/config.test.ts around the new schema (controlPlane,
  seedWorkspaces, secretRef format: 10 tests).
- Added tests/conformance/drift.test.ts (4 tests: 3 scenarios + 1
  presence check).

Examples + docs:
- examples/workbench.yaml: minimal default (memory driver).
- docs/examples/workbench.yaml: annotated sample showing all three
  driver shapes and seedWorkspaces usage.

Verification:
- tsc --noEmit clean
- biome check clean
- vitest run 98/98 passing
- conformance:regenerate idempotent across runs
- drift test stable across 5 consecutive `npm test` invocations

* fix(conformance): emit fixtures with tab indentation to satisfy biome
* ci: update docker smoke paths to /api/v1
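The shape-agnostic normalization described for normalize.mjs can be sketched as a recursive walk over any JSON value. This is only an illustration under assumptions: the real module is JavaScript, and how it recognises timestamps is not stated in the commit message, so the ISO-8601 regex below is a guess.

```typescript
// Sketch only: the ISO-8601 pattern is an assumption about how timestamps
// are detected; the commit message does not say.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

const ISO_TIMESTAMP = /^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})$/;

function normalize(value: Json): Json {
  if (typeof value === "string") {
    // Every timestamp collapses to the same placeholder, so two writes that
    // land in the same millisecond cannot reorder the fixture.
    return ISO_TIMESTAMP.test(value) ? "{{TS}}" : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => normalize(item));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(value)) {
      out[key] = normalize(child);
    }
    return out;
  }
  // null, booleans and numbers pass through unchanged.
  return value;
}

const captured: Json = {
  status: 201,
  body: {
    name: "demo",
    createdAt: "2026-04-22T10:00:00.123Z",
    items: [{ updatedAt: "2026-04-22T10:00:00.124Z" }],
  },
};
const fixture = normalize(captured);
```

A single placeholder trades away ordering information between timestamps in exchange for determinism, which matches the motivation given above for dropping the ordered `{{TS_N}}` scheme.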
erichare added a commit that referenced this pull request on Apr 22, 2026
…es (#9)

* feat(api): /api/v1/* CRUD wired to ControlPlaneStore + conformance fixtures
* fix(conformance): emit fixtures with tab indentation to satisfy biome
* ci: update docker smoke paths to /api/v1
* docs: refresh post-1a.3 (reflect control plane, /api/v1/*, green boxes)

Every doc in the repo was written during the Phase 0 / Phase 0.5 era and
assumed:
- `workspaces:` nested in workbench.yaml as static config
- Per-workspace bearer auth + nested vectorStores/catalogs
- Routes under /v1/*
- Strict 1:1 catalog↔vector-store binding
- A single TypeScript runtime with no polyglot story

None of that has been true since PR #4 (control-plane foundation), and
by PR #8 the gap was embarrassing. This PR updates:

README.md:
- Architecture diagram showing N green boxes behind BACKEND_URL
- Current HTTP surface table (operational + /api/v1/*)
- Project layout reflecting src/control-plane/, src/astra-client/,
  src/secrets/, src/routes/api-v1/, clients/python-runtime/
- Points to new docs (green-boxes.md, conformance.md)
- License: TBD (no LICENSE file yet)

docs/README.md:
- Drops "Phase 0" framing
- Adds new docs to the landing page

docs/architecture.md:
- Design principles rewritten around the green-box model
- Component map: runtime / ControlPlaneStore / astra-client / secrets /
  routes
- Data model table showing the four wb_* tables
- Request-flow diagram for workspace creation
- Catalog↔vector-store binding: explicit N:1
- kind enum: astra | hcd | openrag | mock

docs/api-spec.md:
- Full rewrite. Documents every implemented route plus planned Phase
  1b/2/3 routes with honest "planned" labels
- Error-code table
- Workspace / Catalog / VectorStore response shape examples
- Phase 1b + Phase 2 preview (documents, ingest, search, queries)

docs/configuration.md:
- Completely new schema: controlPlane (discriminated union on driver) +
  optional seedWorkspaces (memory only)
- Three driver sections with required-field tables
- Secrets section distinguishing YAML interpolation from SecretRefs
- Validation rules table

docs/workspaces.md:
- Workspaces are runtime records, not config
- CRUD lifecycle + cascade semantics
- Kind enum explained
- Credentials-by-reference pattern
- Example curl session

docs/roadmap.md:
- Status snapshot table (Phase 0 + 1a shipped)
- Phase 1a deliverables as-shipped
- Phase 1b + 2 + 3 detailed
- Open questions refreshed

docs/green-boxes.md (new):
- Multi-runtime architecture
- Current runtimes table
- The shared HTTP contract vs. per-runtime internals
- BACKEND_URL deployment pattern
- How to add a new language

docs/conformance.md (new):
- Scenarios, fixtures, normalization rules
- Running the suite per language runtime
- When to update fixtures (and when never to)
- Adding a scenario

All relative links verified. TS typecheck, lint, and test suite (98/98)
unchanged.
erichare added a commit that referenced this pull request on Apr 27, 2026
Adds the new control-plane CRUD surface for the knowledge-base
schema, layered on top of the table set introduced in phase 1a.
Coexists with the legacy /catalogs and /vector-stores routes —
phase 1c removes those.
Endpoints (all under /api/v1/workspaces/{workspaceUid}/):
/knowledge-bases GET POST GET-by-id PUT DELETE
/chunking-services GET POST GET-by-id PUT DELETE
/embedding-services GET POST GET-by-id PUT DELETE
/reranking-services GET POST GET-by-id PUT DELETE
Store interface extension:
Twenty new methods on ControlPlaneStore (5 each for KB,
chunking, embedding, reranking). Implemented across all three
backends — memory, file, astra. Existing legacy methods
unchanged; phase 1c removes them.
Decisions baked into the routes:
- KB.embeddingServiceId / chunkingServiceId immutable post-create.
Update schema is `.strict()` so PUT bodies that include those
keys hit a 400 instead of silently overwriting. The workspace
owner gets the gap-#4 invariant for free.
- Service deletion is refused (409) when any KB still references
the service, mirroring the existing vector-store rule.
- Vector collection is auto-provisioned on KB create using
`wb_vectors_<kb_id>` (hyphen-stripped — Astra collection names
must match `[a-zA-Z][a-zA-Z0-9_]*`).
- supportedLanguages / supportedContent / tags exposed as
`readonly string[]` (sorted, deduplicated) instead of
`ReadonlySet<string>`. JSON-friendly and matches the wire shape
one-for-one. The Astra row layer keeps Sets — astra-db-ts maps
CQL `SET<TEXT>` to native Sets — and the converter normalises
at the boundary.
Tests:
- 5 new contract tests run against memory + file + astra fakes
(15 total) — referential integrity, cascade delete, immutability,
array round-trip, and the auto-collection naming convention.
- 8 new route-level tests covering happy-path CRUD, validation
errors, 409 service-still-referenced, and pagination.
573 → 581 tests passing; typecheck clean.
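Two of the decisions above are small enough to sketch directly: deriving the collection name from a KB id, and normalising a Set into a sorted, deduplicated array at the converter boundary. The function names are hypothetical, and the default code-unit sort order is an assumption; the commit only says "sorted".

```typescript
// Hypothetical helper names; only the `wb_vectors_<kb_id>` convention and the
// collection-name pattern are taken from the commit message.
const ASTRA_COLLECTION_NAME = /^[a-zA-Z][a-zA-Z0-9_]*$/;

function vectorCollectionName(kbId: string): string {
  // Hyphens are not allowed in collection names, so they are removed.
  const name = `wb_vectors_${kbId.replace(/-/g, "")}`;
  if (!ASTRA_COLLECTION_NAME.test(name)) {
    throw new Error(`Derived collection name "${name}" is not valid`);
  }
  return name;
}

// Row layer holds Sets; the record layer exposes a sorted, deduplicated array.
function toSortedArray(values: Iterable<string>): readonly string[] {
  // Assumption: default sort (UTF-16 code-unit order) is acceptable.
  return Array.from(new Set(values)).sort();
}

const collection = vectorCollectionName("3f2a1b4c-5d6e-4f70-8a9b-0c1d2e3f4a5b");
const languages = toSortedArray(new Set(["fr", "en", "de"]));
```

Sorting at the boundary keeps JSON responses stable across backends, since Set iteration order otherwise depends on insertion order.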
erichare added a commit that referenced this pull request on Apr 27, 2026
* feat: knowledge-base schema (issue #98, phase 1a)

Add the new wb_config_*, wb_rag_*, and wb_agentic_* tables alongside the
legacy wb_workspaces / wb_catalog_* / wb_vector_store_* / docs /
saved_queries set. Phase 1a is purely additive: nothing reads or writes
the new tables yet, so all 558 existing tests still pass.

New tables (created idempotently at startup):

config:
- wb_config_workspaces (replaces wb_workspaces)
- wb_config_knowledge_bases_by_workspace (replaces wb_catalog_by_workspace)
- wb_config_chunking_service_by_workspace
- wb_config_embedding_service_by_workspace
- wb_config_reranking_service_by_workspace
- wb_config_llm_service_by_workspace (Stage 2)
- wb_config_mcp_tools_by_workspace (Stage 2)

rag:
- wb_rag_documents_by_knowledge_base
- wb_rag_documents_by_knowledge_base_and_status
- wb_rag_documents_by_content_hash

agentic (Stage 2):
- wb_agentic_agents_by_workspace
- wb_agentic_conversations_by_agent (clustered created_at DESC)
- wb_agentic_messages_by_conversation

Decisions baked in (see issue thread):
- KB row carries vector_collection (auto-provisioned) plus lexical_*
  fields. Lexical isn't a callable service; folding it onto the KB
  avoids inventing five empty endpoint columns just to fit the shape.
- KB references services by id (embedding/chunking/reranking).
  Embedding service id is intended to be immutable post-create;
  enforcement lands in 1b with the route rewrite.
- Agent.reranking_service_id overrides KB.reranking_service_id at
  query time; KB value is the default for non-agentic search.
- Conversations are partition-clustered (created_at DESC) so list
  endpoints get newest-first without server-side sort.
- saved_queries does not appear in the new schema; it is dropped in
  phase 1c.
- wb_api_key_* unchanged; orthogonal to the data model.

Phase 1b switches the routes to read/write these tables; phase 1c drops
the legacy ones. Python and Java runtimes get SCHEMA_NOTES.md pointing
at the TypeScript source of truth, with no implementation changes there.

* feat: knowledge-base CRUD routes + store (issue #98, phase 1b)

* refactor: drop saved-queries surface (issue #98, phase 1c.1)

The new schema does not include saved queries; the proposed agent layer
(Stage 2) replaces the use case. This commit removes the surface
end-to-end:
- DELETE route file + handlers
- DELETE store interface methods (list/get/create/update/delete)
- DELETE memory/file/astra implementations
- DELETE wb_saved_queries_by_catalog DDL + bundle entry + bootstrap
- DELETE SavedQueryRecord, SavedQueryRow, OpenAPI schemas
- DELETE saved-query-related tests (5 contract tests, 14 route tests)
- DELETE catalog-saved-queries conformance scenario + fixture
- DELETE structure-test reference to the saved-queries route file

Phase 1c continues with /catalogs, /vector-stores teardown +
documents/data-plane move to KB scope.

581 → 567 tests passing; typecheck clean.

* feat: KB data-plane + auto-provisioning (issue #98, phase 1c.2)

Knowledge Bases now own their underlying vector collection end-to-end.
Three changes hang off this:

1. `resolveKb` (kb-descriptor.ts) materialises a
   `VectorStoreRecord`-shaped descriptor on the fly from a KB + its
   bound embedding / reranking services. The driver and dispatch layers
   don't need to know KBs exist; they keep consuming the legacy
   descriptor shape.
2. KB CRUD now provisions / drops the collection:
   - POST /knowledge-bases creates the row, then
     `driver.createCollection` on the data plane. On provisioning
     failure the KB row is rolled back so the two planes can't drift.
   - DELETE drops the collection first, then the row.
3. New data-plane endpoints under `/knowledge-bases/{kb}`:
   - POST .../records (upsert)
   - DELETE .../records/{recordId} (delete)
   - POST .../search (vector / hybrid / rerank)

Coexists with /vector-stores during 1c. UI migrates to KB endpoints in
1d; legacy routes get retired in a follow-up cleanup.

569 tests passing; typecheck clean.

* feat: KB-scoped documents + ingest (issue #98, phase 1d)

Adds the KB-scoped document and ingest surface, mirroring the legacy
catalog-scoped routes.
Both stay live during 1c/1d so the UI migrates without flag-flipping.

New endpoints under /api/v1/workspaces/{w}/knowledge-bases/{kb}/:
- GET /documents: list (paginated)
- POST /documents: register a doc
- GET /documents/{d}: fetch
- PUT /documents/{d}: patch metadata
- DELETE /documents/{d}: drop doc + cascade chunks
- GET /documents/{d}/chunks: list by index
- POST /ingest: sync chunk + embed + upsert
- POST /ingest?async=true: 202 + job pointer

Control plane:
Five new ControlPlaneStore methods for RAG documents (list, get,
create, update, delete) backed by `wb_rag_documents_by_knowledge_base`
in astra (already provisioned in 1a) and parallel maps in memory/file.
Astra writes also maintain the by-status secondary index;
by-content-hash gets written on create + content_hash changes.
deleteKnowledgeBase now cascades RAG document rows.

Job layer:
JobRecord gains `knowledgeBaseUid` alongside `catalogUid`. KB-scoped
ingest jobs leave catalogUid null and vice versa. wb_jobs_by_workspace
gets a `knowledge_base_uid` column (idempotent CREATE TABLE picks it up
on next boot). New `runKbIngestJob` async worker resolves the KB
descriptor on every run so renames / service swaps can't drift
mid-flight.

Pipeline:
`runKbIngest` is the KB-scoped sibling of `runIngest`. New
`KB_SCOPE_KEY = "knowledgeBaseUid"` payload key gets stamped on every
chunk so search filters can scope to a KB.

Tests:
- 5 new route tests (CRUD, sync ingest, async ingest 202, delete)
- 3 new contract tests × 3 backends = 9 (RAG CRUD, 404 on unknown KB,
  deleteKnowledgeBase cascade)
- 1 conformance fixture updated for the new JobRecord shape

569 → 583 tests passing; typecheck clean. Phase 1c (drop legacy
/catalogs, /vector-stores) and the UI rewire follow.

* refactor: drop legacy /catalogs and /vector-stores (issue #98, phase 1c.3)

Retires the catalog/vector-store/document surface in favour of the
KB-scoped equivalents shipped in 1d.

Routes deleted:
- /api/v1/workspaces/{w}/catalogs
- /api/v1/workspaces/{w}/catalogs/{c}/documents
- /api/v1/workspaces/{w}/catalogs/{c}/documents/search
- /api/v1/workspaces/{w}/catalogs/{c}/documents/{d}/chunks
- /api/v1/workspaces/{w}/catalogs/{c}/ingest
- /api/v1/workspaces/{w}/vector-stores
- /api/v1/workspaces/{w}/vector-stores/{vs}/...
- /api/v1/workspaces/{w}/vector-stores/discoverable
- /api/v1/workspaces/{w}/vector-stores/adopt

Control plane:
- 15 ControlPlaneStore methods removed (catalog × 5, vector-store × 5,
  document × 5).
- Memory / file / astra implementations stripped along with their
  assertCatalog / assertVectorStore / assertVectorStoreNotReferenced
  helpers.
- Workspace delete now resolves each KB into a driver descriptor via
  `resolveKb` and drops the underlying collection; it no longer walks
  `wb_vector_store_by_workspace`.
- `assertVectorStorePatchIsEmpty` removed from defaults.

Schema:
- DDL dropped: `wb_catalog_by_workspace`,
  `wb_vector_store_by_workspace`, `wb_documents_by_catalog`. Astra row
  types and converters for them go too.
- `wb_jobs_by_workspace.catalog_uid` column dropped (the existing table
  picks this up implicitly: idempotent CREATE TABLE doesn't drop
  columns, so deployed schemas still have the dead column, which is
  harmless and ignored on read/write).
- JobRecord loses `catalogUid`; `knowledgeBaseUid` is now the only
  parent pointer.

OpenAPI:
- `Catalog`, `VectorStore`, `Document`, `AdoptableCollection`,
  `AdoptCollectionInput`, `Ingest*` (catalog-shaped) schemas retired.
  KB-shaped equivalents (`KnowledgeBase`, `RagDocument`,
  `KbIngestRequest`, `KbAsyncIngestResponse`) are the only document
  surface.

Pipeline + worker:
- `runIngest` (catalog-scoped) and `runIngestJob` removed.
- `runKbIngest` + `runKbIngestJob` are the only ingest path. The
  cross-replica orphan sweeper now resumes via `runKbIngestJob`.
- `CATALOG_SCOPE_KEY` removed from payload keys; `KB_SCOPE_KEY` is the
  only chunk-payload scope key now.

Tests:
- app.test.ts trimmed from 3129 → 1023 lines: ~2100 lines of legacy
  catalog / vector-store / document / ingest / chunk / adopt tests
  deleted. Replacement coverage lives in `knowledge-bases.test.ts` (15
  KB-scoped route tests) and `control-plane/contract.ts` (RAG document
  × 3 backends).
- Contract tests trimmed: `deleteWorkspace cascades to KBs and api keys`
  replaces the catalog/vector-store cascade test.
- Converters test rewritten around `RagDocumentRecord`.
- Astra-fake bundle stripped of catalogs / vectorStores / documents
  tables.

Conformance:
- 10 legacy fixtures deleted (catalog-*, vector-store-*,
  document-crud-basic). 5 workspace-only scenarios remain.
- scenarios.md re-numbered; KB scenarios deferred to a follow-up.

466 tests passing; typecheck clean. The legacy `wb_workspaces` and
`wb_jobs_by_workspace` tables stay in place for now; both are used
unchanged by the KB-scoped surface. Migrating to `wb_config_workspaces`
is a separate refactor.

* feat: rewire web UI to KB endpoints (issue #98, phase 1c.4)

Closes the loop on phase 1c: the React UI now speaks the new KB-scoped
surface end-to-end.

Schemas + API:
apps/web/src/lib/schemas.ts and api.ts trimmed of every
catalog/vector-store/document/saved-query type and helper. Replaced
with KB-shaped schemas (KnowledgeBaseRecord, ChunkingService /
EmbeddingService / RerankingService records, RagDocumentRecord) and
matching create/update/delete API calls for each. JobRecord loses
`catalogUid`, gains `knowledgeBaseUid` to mirror the runtime.

Hooks:
- Deleted: useCatalogs, useVectorStores, useSavedQueries.
- Added: useKnowledgeBases (CRUD), useServices (chunking + embedding +
  reranking, factory-style to keep three near-identical surfaces in one
  file).
- Updated: useDocuments / useIngest / usePlaygroundSearch, all
  KB-scoped now.

Pages:
- WorkspaceDetailPage swaps VectorStoresPanel + CatalogsPanel for
  ServicesPanel + KnowledgeBasesPanel.
- CatalogExplorerPage → KnowledgeBaseExplorerPage.
  Route changes from /workspaces/:wid/catalogs/:cid →
  /workspaces/:wid/knowledge-bases/:kbid in App.tsx.
- PlaygroundPage picks a workspace + knowledge base instead of a
  workspace + vector store. The query form now takes a
  `QueryFormTarget` (vectorDimension + provider description +
  lexical/rerank flags) so it doesn't need to know about descriptors.
- OnboardingPage copy updated.

Components:
- Deleted: CatalogsPanel, VectorStoresPanel, AdoptCollectionDialog,
  CreateCatalogDialog, CreateVectorStoreDialog, SavedQueriesSection.
- Added: KnowledgeBasesPanel, ServicesPanel, CreateKnowledgeBaseDialog.
- Updated: DocumentTable / DocumentDetailDialog / IngestQueueDialog
  switched to RagDocumentRecord and KB-scoped props (`knowledgeBase` /
  `knowledgeBaseUid`, `documentId` instead of `documentUid`,
  `contentHash` instead of `md5Hash`).

Tests:
- QueryForm test rewritten around the new `target` prop shape.
- DocumentTable test rewritten around RagDocumentRecord.
- IngestQueueDialog test rewritten around `knowledgeBase` +
  `kbIngestAsync` mock.
- golden-path.spec.ts (Playwright) walks the new flow: onboard →
  service creation (via API) → KB → upsert → playground.

UI typecheck clean; all 76 unit tests green; runtime tests still
466/466 green.

* fix: CI failures (lint formatting + e2e cache invalidation)

CI on PR #99 was red across two checks; this commit gets them both back
to green. (Java Runtime is a pre-existing failure on main: package
org.springframework.boot.test.autoconfigure.web.servlet is missing
under Spring Boot 4.0.6. It is out of scope for this PR.)

Lint, Typecheck, Test, Build:
Biome flagged a mix of formatting drift, organize-imports, unused
imports, and noNonNullAssertion in the files touched by phases 1c-1d.
`biome check --write` auto-fixed almost everything; the one manual fix
was `let resolved` in `runKbIngestJob`, because Biome refuses implicit
`any` at the declaration. Now annotated as
`Awaited<ReturnType<typeof resolveKb>>`.

Web E2E (Playwright):
The golden-path spec navigated to /playground via a SPA-style nav link
click after creating services + KB through the `request` fixture. React
Query had already populated `useKnowledgeBases(workspaceUid)` with an
empty list while `WorkspaceDetailPage` was mounted, and that cached
value rode along into the playground. The KB select stayed disabled
with "No knowledge bases yet" because the freshly-created KB was
invisible to the page's QueryClient. Fix: hard-load `/playground` via
`page.goto`, which remounts the React app and clears the cache. Comment
in the spec explains why.

466/466 runtime tests, 76/76 web tests, 1/1 e2e green locally.

* fix: biome formatter (golden-path.spec.ts)

Reformat the chunking-service post-create response check:
`expect(chunkRes.ok(), `chunking-service create: ${await chunkRes.text()}`).toBe(true)`
overflowed Biome's line length and got auto-wrapped.
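The create-then-provision ordering described in the phase 1c.2 entry above (create the KB row, call `driver.createCollection`, roll the row back on failure) can be sketched roughly as follows. The interfaces and the `createKnowledgeBase` method name are assumptions made for illustration; only `driver.createCollection`, `deleteKnowledgeBase`, and the rollback behaviour are named in the commit messages.

```typescript
// Hypothetical shapes: signatures are assumed, not taken from the repo.
interface KnowledgeBase {
  readonly uid: string;
  readonly vectorCollection: string;
}

interface KbStore {
  createKnowledgeBase(kb: KnowledgeBase): Promise<KnowledgeBase>;
  deleteKnowledgeBase(uid: string): Promise<void>;
}

interface CollectionDriver {
  createCollection(name: string): Promise<void>;
}

async function createKbWithCollection(
  store: KbStore,
  driver: CollectionDriver,
  kb: KnowledgeBase,
): Promise<KnowledgeBase> {
  // Control plane first: the row exists before the collection does.
  const created = await store.createKnowledgeBase(kb);
  try {
    await driver.createCollection(created.vectorCollection);
  } catch (err) {
    // Provisioning failed: remove the row so the two planes cannot drift,
    // then surface the original error to the caller.
    await store.deleteKnowledgeBase(created.uid);
    throw err;
  }
  return created;
}
```

This is a compensating action rather than a transaction: if the rollback delete itself fails, the row is left behind, so a real implementation would probably want to log or retry that case.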
Summary
Introduces the control-plane abstraction that workspaces, catalogs,
vector-store descriptors, and documents flow through. Three backends are
planned (memory, file, astra); this PR ships the interface plus the
first two. Astra, backed by Data API Tables via
`@datastax/astra-db-ts`, lands in the follow-up PR.
No routes are wired yet. Route layer + OpenAPI for
`/api/v1/workspaces` come in PR-1a.3. This PR is the foundation everything else sits on.
What's in the box
- `ControlPlaneStore` interface (src/control-plane/store.ts): backend-agnostic CRUD for workspaces, catalogs, vector-store descriptors, documents.
- Record types (src/control-plane/types.ts): 1:1 mirror of the CQL schema discussed in the spec doc, but expressed as TypeScript dataclasses. Time fields are ISO-8601 strings; UUIDs are strings; secrets are `SecretRef` pointers ("env:FOO" / "file:/path"), never raw values.
- `MemoryControlPlaneStore`: in-process `Map`-of-Maps, default for CI and `docker run` without external services. State is lost on restart; that's fine.
- `FileControlPlaneStore`: JSON-on-disk, single-node. Per-table mutex + atomic rename so concurrent writes within one Node process are safe. Not a replacement for a distributed lock; multi-writer setups use the upcoming `astra` backend.
- Typed errors (`ControlPlaneNotFoundError` / `ControlPlaneConflictError` / `ControlPlaneUnavailableError`) that the future route layer maps to HTTP status codes.
- Shared contract suite (tests/control-plane/contract.ts): 14 assertions every backend must satisfy. Memory and file both run it today; astra will run it tomorrow.
Design properties worth calling out
- Immutable records: `update*` methods in both backends (every update spreads into a new object)
- Secrets by reference: `SecretRef` typed pointers in `types.ts`; no raw credential type exists
- Catalog → vector store: multiple catalogs may bind the same vector store (N:1)
- Cascade delete: `deleteWorkspace`, `deleteCatalog`
- Memory layout: `Map<workspaceUid, Map<childUid, Record>>`, which makes 1:1 translation to the astra backend trivial
- Offline-friendly: `mock` stays a first-class `kind`
Why this shape
The `wb_*` table structure from the architecture spec drives everything:
each backend's method signatures take partition keys as leading
arguments (`listCatalogs(workspace)`, `getDocument(workspace, catalog, uid)`),
so the CQL translation in the astra backend is direct.
Test plan
- `npm run typecheck`: clean
- `npm run lint`: clean (Biome)
- `npm test`: 71/71 passing; 28 new (14 memory + 14 file running the same contract)
- `update*` return new records, originals unchanged
- `deleteWorkspace` cascades to catalogs, vector stores and documents
What's NOT in this PR
- Route layer (`/api/v1/workspaces` CRUD): PR-1a.3
- Astra backend (`@datastax/astra-db-ts`): PR-1a.2
- `WorkspaceRegistry`: PR-1a.3 (registry becomes a cache-through layer over `ControlPlaneStore`)
- `SecretProvider` implementations: PR-1a.3