Skip to content
121 changes: 121 additions & 0 deletions docs/design/phase-1a-composed-boot-parity-test.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,121 @@
# Phase 1A — Minimum Missing Integration Test: Composed Boot Parity

**Date:** 2026-04-16
**Status:** Gap identified (codex2 review of `0da468e`); test not yet written.
Test-plan artifact only — this memo does not introduce code.
**Scope:** `Atomicmemory-core`, branch `feat/phase-1a-runtime-container`.

---

## What is missing

One live-HTTP integration test that boots

```
createApp(createCoreRuntime({ pool }))
```

against an ephemeral port and proves the composed app's config-facing
HTTP behavior matches the runtime backed by the module-level config
singleton end-to-end. Today no such test exists on this branch.

## Why `runtime-container.test.ts` alone is insufficient

`src/app/__tests__/runtime-container.test.ts` (126 lines) is a
**composition-shape test**, not a parity test. Specifically:

- **Stubbed pool, no DB round-trip.** `stubPool()` at line 17 returns a
minimal `{ query: vi.fn(...) }` object. No SQL actually runs; no
repo→DB seam is exercised.
- **`createApp` test is type-shape only.** Line 119–126 asserts
`typeof app.use === 'function'` and `typeof app.listen === 'function'`
on the returned app. No request is issued, no route is hit, no
response is observed. A bug that binds `/memories` to the wrong
service — or silently drops the route mount — would pass this test.
- **No observed config-driven behavior.** Lines 42–80 verify that
`runtime.config === config` and that `repos.entities` /
`repos.lessons` track `entityGraphEnabled` / `lessonsEnabled` at the
container level. None of that proves the flags surface correctly
into the HTTP layer once routes are mounted.
- **No parity assertion.** There is no comparison of composed-boot
behavior against a reference. If someone later threads config
through routes/services and introduces a subtle divergence between
the composed path and the singleton-backed path, the current suite
would not detect it.

The existing `src/__tests__/route-validation.test.ts` is a
**route-behavior test**, not a composition-parity test: it constructs
its own `express()` app (line 41), its own `new MemoryService(repo,
claimRepo)` (line 51), and mounts `createMemoryRouter` directly. It
deliberately bypasses `createCoreRuntime` and `createApp`. Useful for
route logic; silent on whether the Phase 1A composition seam actually
produces an equivalent app.

## The minimum missing test — shape (not implementation)

A single integration test file under `src/app/__tests__/` that:

1. Boots a live Express listener via
`createApp(createCoreRuntime({ pool })).listen(0)` where `pool` is
the shared test pool (same pattern as
`route-validation.test.ts:54-61`).
2. Applies `schema.sql` to the test DB once in `beforeAll` (match the
existing integration-test pattern; do not reinvent setup).
3. Issues a small number of HTTP requests against the ephemeral port
that collectively exercise at least one config-sensitive path.
Minimum request set: one write (ingest or verbatim store) and one
read that surfaces the write. This proves routes → services →
repos → pool all connected correctly through the composition seam.
4. Asserts responses match what a reference request against a
hand-wired runtime (the `route-validation.test.ts` 2-arg
`MemoryService` shape, or a second composed runtime built the same
way) produces. Parity is the point; the specific endpoint matters
less than the observation that the composed path and the
known-working path agree.
5. Closes the listener and releases any ephemeral resources in
`afterAll`.

The test does not need to be exhaustive. It needs to prove, with at
least one live round-trip, that the composition seam does not drop
fidelity between the runtime container and the HTTP surface.

## Narrow acceptance criteria

The test is considered sufficient iff:

1. It boots `createApp(createCoreRuntime({ pool }))` — not a hand-wired
Express app, not a manually-constructed `MemoryService`.
2. It binds `app.listen(0)` (ephemeral port) inside the test lifecycle,
not a hardcoded port.
3. It issues at least one HTTP request whose response depends on code
paths reached through the runtime container (routes, services,
repos). A `GET /health` alone does not satisfy this — `/health`
does not traverse the runtime graph.
4. It asserts equivalence against a reference behavior, either by
(a) constructing a second runtime the same way and comparing
responses, or (b) asserting concrete expected values that a
singleton-backed server would also produce under identical config.
5. It runs green under the existing `npm test` harness with no new
dependencies and no new env requirements beyond `.env.test`.
6. It lives under `src/app/__tests__/` — adjacent to the composition
code it covers, not under `src/__tests__/`, and not under
`src/services/__tests__/`.

## What this test deliberately does not cover

- **Route-level input validation** — already covered by
`route-validation.test.ts` (UUID guards, filter params).
- **Per-endpoint business logic** — covered by service-level and
repo-level integration tests.
- **Config-threading correctness across 53 `import { config }` sites**
— that is Phase 1B scope and requires its own test strategy.
- **Startup-check behavior** — already covered in
`runtime-container.test.ts:83-116`.
- **Process lifecycle** (SIGTERM, SIGINT, uncaught handlers) — covered
in Phase 1C when `bootstrap()` is extracted.

## One-line framing

Prove that `createApp(createCoreRuntime(...))` produces an HTTP server
whose observable behavior matches the singleton-backed reference on
at least one config-sensitive round-trip. No more; no less.
192 changes: 192 additions & 0 deletions docs/design/phase-1a-singleton-hazards.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,192 @@
# Phase 1A — Singleton & Global Hazard Audit + Follow-on Sequence

**Date:** 2026-04-16
**Status:** Phase 1A landed (`ff57540`, `0da468e`); this memo captures the
source-grounded audit that informed the refactor and the minimum safe
sequence for Phase 1B and beyond.
**Scope:** `Atomicmemory-core` only.

---

## Why Phase 1A was needed

Two independent singleton/global patterns were entangling startup,
tests, and research harnesses, making it impossible to boot two
runtimes in one process (e.g. for `/compare` style evaluations) or to
unit-test the search pipeline without module-load side effects.

### Hazard 1 — `src/services/search-pipeline.ts` module-level entity repo

A mutable module global and a dedicated setter:

- `let entityRepo: EntityRepository | null = null;` at module scope
- `export function setEntityRepository(repo)` as the only way to mutate
- ~11 read sites inside the pipeline dereferencing it directly
(`runSearchPipelineWithTrace`, `generateLinks`, query-expansion,
query-augmentation, link-expansion branches)

Production flow: `server.ts` called `setEntityRepository(entityRepo)`
at module load, gated by `config.entityGraphEnabled`. Tests that
imported anything pulling in `server.ts` inherited that global state.
Two tests mocked the whole module via `createSearchPipelineMockFactory`,
locking in three exported names (`runSearchPipelineWithTrace`,
`generateLinks`, `setEntityRepository`) — any rename silently broke
the mock.

### Hazard 2 — `src/server.ts` as a singleton composition root

- `repo`, `claimRepo`, `trustRepo`, `linkRepo`, `entityRepo`,
`lessonRepo`, `service`, and `app` all declared at module scope
- `checkEmbeddingDimensions().then(() => app.listen(config.port, …))`
fired at module evaluation — **importing `./server.ts` bound a port**
- Process-global handlers (`uncaughtException`, `unhandledRejection`,
`SIGTERM`, `SIGINT`) registered at import time
- Exports (`app, service, repo, claimRepo, trustRepo, linkRepo`) had
zero in-repo readers; the shape advertised a reusable API that no
one could actually use without triggering the side effects

### Hazard 3 — parallel DI paths for the entity repo

`MemoryService` accepted `entities` as an optional constructor arg
(`memory-service.ts:33-48`) and threaded it through `deps.entities`
for storage/dedup/audn paths. The search pipeline, meanwhile, read
its entity repo from the module global. Two independent sources of
truth for "is the entity graph wired?" — one constructor-injected,
one module-global.

---

## What Phase 1A resolved

Commits on `feat/phase-1a-runtime-container`:

| Commit | Scope |
|---|---|
| `ff57540` refactor(core): introduce runtime container composition root | Extracted `runtime-container.ts`, `create-app.ts`, `startup-checks.ts`; server.ts now just bootstraps |
| `0da468e` refactor(app): remove misleading config override from createCoreRuntime | Removed the nascent `config` override parameter that would have been silently ignored by the 53 downstream `import { config }` sites |

### Concrete changes

- **`src/app/runtime-container.ts`** (new, 107 lines) — `createCoreRuntime({ pool })` returns an explicit `{ config, pool, repos, services }` graph. No hidden singletons.
- **`src/app/create-app.ts`** (new, 39 lines) — `createApp(runtime)` is the only way to build the Express app. CORS, body parsing, routes, health are bound to the passed-in runtime.
- **`src/app/startup-checks.ts`** (new, 70 lines) — embedding-dimension guard extracted behind a plain function, no process exits from inside the check.
- **`src/app/__tests__/runtime-container.test.ts`** (new, 126 lines) — coverage of the composition root.
- **`src/server.ts`** shrunk from 101 → 70 lines — now just process lifecycle (boot, listen, shutdown) over the composed runtime.
- **`src/services/search-pipeline.ts`** — module-level `entityRepo` and `setEntityRepository` removed; `runSearchPipelineWithTrace` and `generateLinks` now accept `entityRepo: EntityRepository | null` as an explicit function parameter.
- **`src/services/__tests__/test-fixtures.ts`** — `createSearchPipelineMockFactory` no longer needs to stub `setEntityRepository`.
- **`src/services/__tests__/current-state-retrieval-regression.test.ts`** — the `setEntityRepository(null)` reset in `beforeEach` is gone (not needed without the global).

### What Phase 1A deliberately did **not** do

- **Config override in the runtime.** An early draft of `createCoreRuntime` accepted a `config` override. Removed in `0da468e` because 53 files still `import { config }` directly from `../config.js`; an override would have influenced only repo construction (`entityGraphEnabled`, `lessonsEnabled`) and been silently ignored everywhere else. Reintroducing this override is a Phase 1B deliverable, not a Phase 1A one.
- **Process-global handlers.** `uncaughtException`/`unhandledRejection`/`SIGTERM`/`SIGINT` registrations still live at import time in `server.ts`. Safe while `server.ts` is purely an entrypoint — becomes unsafe if anyone imports `server.ts` from library code.
- **`server.ts` re-exports.** `{ app, service, repo, claimRepo, trustRepo, linkRepo, runtime }` are still exported. Kept for back-compat with any external consumer; zero in-repo readers.

---

## Hazards that remain (post Phase 1A)

### Remaining 1 — config as a 53-reader module singleton

- 53 files `import { config }` from `../config.js` directly.
- `createCoreRuntime` reflects this honestly: `runtime.config` is the
same reference as the module singleton. You cannot construct two
runtimes with different configs in one process.
- Any attempt to override per-runtime config without first threading
config through routes/services/pipeline is a silent no-op for
every flag not named `entityGraphEnabled` or `lessonsEnabled`.

### Remaining 2 — `server.ts` side-effects-on-import

- `bootstrap()` fires at module load.
- Four `process.on(...)` handlers register at module load.
- Importing `./server.ts` still has a nontrivial runtime contract.

### Remaining 3 — zero-reader back-compat export surface on `server.ts`

- `export { app, service, repo, claimRepo, trustRepo, linkRepo, runtime };`
- No in-repo importers; external consumers (if any) would inherit the
import-time side effects above.

---

## Files and tests to watch during Phase 1B+

### Production code (expect churn)

- `src/config.ts` — the singleton that will grow a per-runtime construction path
- `src/services/search-pipeline.ts` — currently reads `config` inline (e.g. `config.entityGraphEnabled`, `config.queryExpansionEnabled`, `config.mmrEnabled`); candidate for threading config through rather than module-level reads
- `src/services/memory-search.ts` — same pattern
- `src/services/memory-ingest.ts`, `memory-audn.ts`, `memory-storage.ts` — 53-site config audit lands here
- `src/routes/memories.ts`, `src/routes/agents.ts` — routes also read config directly in places
- `src/server.ts` — when `bootstrap()` becomes a function callable by tests/harnesses, the import-time side effects go with it

### Tests (fragile surfaces if Phase 1A exports shift)

- `src/app/__tests__/runtime-container.test.ts` — locks the shape of `createCoreRuntime` and the `CoreRuntime` interface
- `src/__tests__/route-validation.test.ts` — builds its own `express()` app, calls `new MemoryService(repo, claimRepo)` (2-arg ctor); if `MemoryService` ctor changes, update here
- `src/__tests__/smoke.test.ts` — consumes `createServiceTestContext` (2-arg `MemoryService`)
- `src/db/__tests__/test-fixtures.ts` — `createServiceTestContext` locks the 2-arg `MemoryService` ctor shape
- `src/services/__tests__/test-fixtures.ts` — `createSearchPipelineMockFactory` locks two exported names now (`runSearchPipelineWithTrace`, `generateLinks`); any rename is a silent mock break
- `src/services/__tests__/stale-composite-retrieval.test.ts`, `current-state-composite-packaging.test.ts` — both use `createSearchPipelineMockFactory`
- `src/services/__tests__/current-state-retrieval-regression.test.ts` — no longer imports `setEntityRepository`; should now pass an explicit entity repo (or leave as null) when it invokes the pipeline

---

## Minimum safe follow-on sequence

Each step is independently testable and reversible. Don't skip ahead.

### Phase 1B — thread config through the service layer

**Goal:** `runtime.config` becomes the one config source; reintroduce a genuine `config` override on `createCoreRuntime`.

1. **Audit the 53 `import { config }` sites.** Classify each as (a) flag read at request time, (b) flag read at construction time, or (c) constant-use-as-pseudo-env. The counts inform Phase 1B's scope.
2. **Thread config through `MemoryService` construction.** Add a config argument to `MemoryService` and `MemoryServiceDeps`; remove inline `import { config }` from the service-level modules called from the facade.
3. **Thread config through the search pipeline.** `runSearchPipelineWithTrace` and `generateLinks` already take `entityRepo` explicitly; extend the same treatment to the config fields they read (`queryExpansionEnabled`, `mmrEnabled`, `rerankSkipTopSimilarity`, `rerankSkipMinGap`, etc.).
4. **Thread config through routes.** Route factories (`createMemoryRouter`, `createAgentRouter`) already accept services; add a config parameter where routes read `config` inline today (spot-check first — some routes may not need it).
5. **Reintroduce `config` override on `createCoreRuntime`.** Once steps 2–4 are landed, a `CoreRuntimeDeps.config` override is honest again: every flag passed in will actually be consulted by downstream code.
6. **Test:** extend `runtime-container.test.ts` to assert two concurrently-constructed runtimes with different configs behave independently on at least one flag.

**Exit criterion:** `grep -r "import { config } from" src | wc -l` drops meaningfully; the remaining sites are deliberate (e.g. `config.ts` itself, bootstrap files, truly environment-scoped reads).

### Phase 1C — extract `bootstrap()` to `src/app/bootstrap.ts`

**Goal:** `server.ts` becomes a thin `bin/`-style entrypoint; `bootstrap(runtime)` is callable from tests and harnesses.

1. Move `bootstrap()` body, the `app.listen` call, and the four `process.on` handlers into `src/app/bootstrap.ts`. Export `startRuntime(runtime): Promise<() => Promise<void>>` returning a shutdown function.
2. `server.ts` becomes: `const runtime = createCoreRuntime({ pool }); const app = createApp(runtime); startRuntime({ runtime, app });` — no exports.
3. Delete the back-compat re-exports from `server.ts`. Verified above: zero in-repo readers.
4. **Test:** unit test `startRuntime` can bind to an ephemeral port (`listen(0)`) and be cleanly shut down by its returned function.

**Exit criterion:** `server.ts` has no exports and no side effects beyond the bootstrap call; importing it from a test does not bind a port or register process handlers.

### Phase 1D — `createCoreRuntime` accepts override hooks for testing

**Goal:** tests can substitute individual repositories or services without reconstructing the whole graph.

1. Add optional `repos?: Partial<CoreRuntimeRepos>` and `services?: Partial<CoreRuntimeServices>` to `CoreRuntimeDeps`.
2. Merge overrides after default construction; add a runtime-container test that verifies an injected mock `MemoryService` is what `createApp` binds to `/memories`.
3. Migrate `route-validation.test.ts` and `smoke.test.ts` to use `createCoreRuntime` instead of constructing `MemoryService` directly.

**Exit criterion:** no test file reaches past `createCoreRuntime` to import repositories or the `MemoryService` class directly.

### Phase 1E — lifecycle for `pool`

**Goal:** the process-global `pool` singleton (imported only by `server.ts` now) becomes an explicit parameter that tests can replace with an isolated pool.

1. Audit importers of `src/db/pool.ts`. If `server.ts` is the only one, Phase 1E is a 2-line change to thread `pool` via a constructor arg to any remaining importer.
2. If tests still import `pool` directly for setup (`route-validation.test.ts`, `test-fixtures.ts`), add a `createTestPool()` helper so test and production paths don't share the same pg connection pool by accident.

**Exit criterion:** `pool` is never imported outside `server.ts` (production) and the test-pool helper (tests).

---

## One-line summary

Phase 1A eliminated the search-pipeline module global and the
server.ts composition root; what remains is config-as-singleton
(53 sites), bootstrap-on-import in `server.ts`, and the zero-reader
re-export surface. Phase 1B threads config, Phase 1C extracts
bootstrap, Phase 1D enables test overrides, Phase 1E isolates pool.
Each step is independently landable.
15 changes: 14 additions & 1 deletion src/__tests__/deployment-config.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -42,6 +42,19 @@ function extractComposeEnvVar(composeContent: string, varName: string): string |
return match ? match[1].trim().replace(/^["']|["']$/g, '') : null;
}

/**
* Build a regex that matches a docker-compose `ports:` list entry binding
* an external host port to the given internal container port. Accepts both
* a literal external port (`"3050:3050"`) and a shell-variable substitution
* (`"${APP_PORT:-3050}:3050"`). The substitution form is the side-by-side-CI
* shape introduced in PR #6.
*/
function composePortBindingRegex(internalPort: number): RegExp {
return new RegExp(
`ports:\\s*\\n\\s*-\\s*["']?(?:\\d+|\\$\\{[A-Z_]+:-\\d+\\}):${internalPort}`,
);
}

describe('deployment configuration', () => {
describe('docker-compose.yml', () => {
it('app service uses host.docker.internal for OLLAMA_BASE_URL, not localhost', () => {
Expand Down Expand Up @@ -102,7 +115,7 @@ describe('deployment configuration', () => {

it('app port is exposed', () => {
const compose = readComposeRaw();
expect(compose).toMatch(/ports:\s*\n\s*-\s*["']?(\$\{APP_PORT:-\d+\}|\d+):3050/);
expect(compose).toMatch(composePortBindingRegex(3050));
});
});

Expand Down
Loading