build(deps-dev): update testcontainers[postgres] requirement from <5,>=4.10.0 to >=4.14.2,<5 in /apps/api #3

Merged
xmap merged 1 commit into main from dependabot/uv/apps/api/testcontainers-postgres--gte-4.14.2-and-lt-5 on May 12, 2026

Conversation

dependabot (bot) commented on behalf of github on May 12, 2026

Updates the requirements on testcontainers[postgres] to permit the latest version.

Release notes

Sourced from testcontainers[postgres]'s releases.

testcontainers: v4.14.2

4.14.2 (2026-03-18)

Features

  • kafka: allow configurable listener name and security protocol (#966) (44dd40b)

Changelog

Sourced from testcontainers[postgres]'s changelog.

4.14.2 (2026-03-18)

Features

  • kafka: allow configurable listener name and security protocol (#966) (44dd40b)

4.14.1 (2026-01-31)

Bug Fixes

  • Allow passing in a custom wait strategy string in MySQL, Cassandra, Kafka and Trino (#953) (be4d09e)
  • compose: expose useful compose options (#951) (183e1aa)
  • core: bring back dind tests (7337266)
  • core: Use WaitStrategy internally for wait_for function (#942) (e323317)
  • nats: add support for jetstream (#938) (49c9af8)
  • Support Elasticsearch 9.x (#881) (f690e88), closes #860

4.14.0 (2026-01-07)

Features

  • Add ExecWaitStrategy and migrate Postgres from deprecated decorator (#935) (2d9eee3)

Bug Fixes

  • add ruff to deps (#919) (5853d32)
  • cassandra,mysql,kafka: Use wait strategy instead of deprecated wait_for_logs (#945) (b7791b9)
  • core: recreate poetry lockfile with latest versions of libraries (#946) (9a97385)
  • elasticsearch: Use wait strategy instead of deprecated decorator (#915) (c785ecd)
  • minio: minio client requires kwargs now (#933) (37f5902)
  • minio: Use wait strategy instead of deprecated decorator (#899) (febccb7)

4.13.3 (2025-11-14)

Bug Fixes

  • do not require consumer of library to state nonsupport for py4 (#912) (f608df9)
  • docs: Update dependencies for docs (#900) (3f66784)
  • support python 3.14!!! - (#917) (f76e982)

4.13.2 (2025-10-07)

Bug Fixes

... (truncated)

Commits
  • 5c67efb chore(main): release testcontainers 4.14.2 (#969)
  • 44dd40b feat(kafka): allow configurable listener name and security protocol (#966)
  • a78475a chore(main): Migrate to uv (#960)
  • 17eb0b0 chore(main): release testcontainers 4.14.1 (#954)
  • f690e88 fix: Support Elasticsearch 9.x (#881)
  • 15e99ee add modifications
  • 7337266 fix(core): bring back dind tests
  • 49c9af8 fix(nats): add support for jetstream (#938)
  • e323317 fix(core): Use WaitStrategy internally for wait_for function (#942)
  • 183e1aa fix(compose): expose useful compose options (#951)
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Updates the requirements on [testcontainers[postgres]](https://github.com/testcontainers/testcontainers-python) to permit the latest version.
- [Release notes](https://github.com/testcontainers/testcontainers-python/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-python/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-python/compare/testcontainers-v4.10.0...testcontainers-v4.14.2)

---
updated-dependencies:
- dependency-name: testcontainers[postgres]
  dependency-version: 4.14.2
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot added the dependencies and python:uv labels on May 12, 2026
@xmap merged commit 155d9e4 into main on May 12, 2026
3 of 4 checks passed
@xmap deleted the dependabot/uv/apps/api/testcontainers-postgres--gte-4.14.2-and-lt-5 branch on May 12, 2026 21:55
xmap added a commit that referenced this pull request May 16, 2026
Lands the 4 in-scope items flagged by the Phase 11a gate review
(3-agent parallel: architecture / test coverage / cross-BC consistency).
Nit #3 (`test_hazard_classification.py`) was a false positive — the
file already exists with full coverage of bool-trap + 4 Invalid*
errors + boundary values.

## Doc drift (#4)

`state.py:13,38` claimed "12-value ClearanceKind" — the enum has 10
values (post-11a-a refactor split form-type from facility identity
via the orthogonal `facility_asset_id` field). Comment now reflects
the locked 10 + cites the refactor history.

## 5 missing projection apply() arms (#2)

`test_clearance_summary_projection.py` had direct `apply()`
assertions for ClearanceRegistered / ClearanceApproved /
ClearanceRejected + the ClearanceReviewStepAppended no-op (4 of 9
event types). The 5 missing arms (Submitted / ReviewStarted /
Activated / Expired / Superseded) were only covered transitively
through `test_list_clearances_handler_postgres.py`. New direct tests
pin each event's status update + last_status_changed_at +
event-specific columns (Expired carries `reason`; Superseded
deliberately drops `by_clearance_id` per deferred column).

Notable: the Superseded test asserts `by_clearance_id` does NOT
land in the SQL args — pins the deferred-projection-column intent
so an accidental SQL change surfacing it would fail loudly.

## End-to-end Run.start gate integration test (#1)

`test_postgres_clearance_lookup.py` pins the adapter in isolation;
`test_start_run_clearance_gate_decider.py` pins the decider in
isolation; until this commit, NOTHING exercised the COMPOSITION of
real PG event store + real `PostgresClearanceLookup` + Run.start
handler chain. The 11a gate review's #1 coverage gap.

New `test_start_run_clearance_gate_postgres.py` seeds the full
upstream chain (Capability + Asset + Method + Practice + Plan +
Subject + mount), overrides `build_postgres_deps`'s default
`AlwaysCoveredClearanceLookup` with the real
`PostgresClearanceLookup(db_pool)`, and pins three scenarios:
  1. Active Clearance bound to Subject -> Run.start succeeds
  2. NO Clearance references the scope -> RunRequiresActiveClearanceError
  3. Defined-only Clearance -> RunClearanceCoverageMismatchError

Uses `SubjectBinding` (not `RunBinding`) to decouple from
FixedIdGenerator ordering -- subject_id is operator-supplied.

## Amend idempotency contract test (#5)

`register_clearance` had `test_register_clearance_idempotency.py`;
`amend_clearance` (also create-style 201-returning + idempotency-
wrapped at wire.py) had no equivalent. New
`test_amend_clearance_idempotency.py` pins three flows mirroring
the register tests:
  1. No key + same body -> two calls; second hits the parent's
     post-Superseded gate and returns 409 (documents why the key
     matters for amend specifically)
  2. Same key + same body -> 201 with the SAME child clearance_id
     (cached response; handler not re-executed; parent not
     transitioned twice)
  3. Same key + different body -> 422 idempotency conflict
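
The keyed flows above (2 and 3) can be sketched with a minimal in-memory wrapper. This is an illustration under stated assumptions, not the wrapper in wire.py: `IdempotencyStore`, `IdempotencyConflict`, and the handler signature are hypothetical names. Flow 1's 409 comes from the parent's post-Superseded gate inside the handler, so it is not modeled here.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any, Callable


class IdempotencyConflict(Exception):
    """Same key reused with a different body (hypothetical; maps to 422)."""


class IdempotencyStore:
    """Hypothetical in-memory cache keyed on the idempotency key."""

    def __init__(self) -> None:
        self._seen: dict[str, tuple[str, Any]] = {}

    def run(self, key: str | None, body: dict, handler: Callable[[dict], Any]) -> Any:
        if key is None:
            # No key: every call executes the handler again.
            return handler(body)
        fingerprint = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode("utf-8")
        ).hexdigest()
        if key in self._seen:
            cached_fingerprint, cached_response = self._seen[key]
            if cached_fingerprint != fingerprint:
                raise IdempotencyConflict(key)
            # Cached response is replayed; the handler is not re-executed.
            return cached_response
        response = handler(body)
        self._seen[key] = (fingerprint, response)
        return response
```

With a key, a retried amend replays the first response and the parent is transitioned once; without a key, the second call reaches the handler and whatever gate it enforces.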

## Verification

  - pyright 0/0/0
  - ruff clean (lint + format)
  - 5423 unit+contract+architecture tests pass (8 new: 5 projection
    + 3 amend idempotency)
  - 275 PG integration tests pass (3 new gate end-to-end)
  - All pre-commit hooks pass (ruff, pyright, tach, architecture
    fitness, secrets scan)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xmap added a commit that referenced this pull request May 18, 2026
…ase 11b-b)

Closes the read-side surface for the Caution BC. Three-layered
discovery (Asset view eager, REST/MCP list, future Run.start banner)
is now SQL-backed via the `proj_caution_active` projection.

Atlas migration (20260517100000_init_proj_caution_active.sql):
  - `proj_caution_active` table with 17 columns (caution_id PK, target
    kind+id, category, severity, text, workaround, author, tags TEXT[],
    expires_at, propagate_to_children, status, parent/superseded_by
    pointers, retired_reason, registered_at, last_status_changed_at,
    updated_at)
  - 4 indexes: keyset (registered_at, caution_id) for pagination;
    partial (target_kind, target_id) WHERE status='Active' for the
    11b-c Run.start hot-path lookup; GIN (tags) for tag filter;
    author_actor_id B-tree for author scope queries
  - CHECK constraints lock all 4 closed enums day one (status,
    severity, category, retired_reason; target_kind 2-value)
  - GRANT to cora_app role; projection_bookmarks insert

CautionActiveProjection (subscribed to all 3 CautionEvent types):
  - CautionRegistered -> INSERT (wrapped in SAVEPOINT for cross-stream
    UniqueViolation defense per supply precedent; today's caution_id
    PK alone cannot collide, but the SAVEPOINT is forward-compat for
    a future (target_kind, target_id, text) UNIQUE if it ever lands)
  - CautionSuperseded -> UPDATE status='Superseded' +
    superseded_by_caution_id + last_status_changed_at
  - CautionRetired -> UPDATE status='Retired' + retired_reason +
    last_status_changed_at
  - Supersession child genesis carries parent_caution_id on the
    CautionRegistered payload; the same INSERT arm writes both
    top-level and supersession-child rows

list_cautions slice (5 files; mirrors list_supplies + list_clearances):
  - 8 optional filters: target_kind, target_id, category, severity,
    min_severity (Z535 ordinal threshold: Notice=0/Caution=1/Warning=2),
    status (default 'Active'; pass 'all' for no status filter), tag
    (via $N = ANY(tags) GIN match), author_actor_id
  - Keyset cursor on (registered_at, caution_id); limit 1-100, default 50
  - GET /cautions REST + MCP `list_cautions` tool; both surface the
    full read shape (every projection column as a typed Pydantic field
    + `target` wire-shape dict built from target_kind+target_id)
  - status='Active' default honors anti-pattern #5 (pull-on-read only):
    retired/superseded cautions excluded unless explicitly requested
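
The filter and cursor semantics above can be sketched in memory as follows. This is a rough model, not the SQL-backed handler: `CautionRow` and this `list_cautions` signature are hypothetical, only three of the eight filters are shown, and ascending order on (registered_at, caution_id) is an assumption the commit message does not state.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime

# Z535 ordinal threshold as listed above.
SEVERITY_ORDINAL = {"Notice": 0, "Caution": 1, "Warning": 2}


@dataclass(frozen=True)
class CautionRow:
    caution_id: str
    registered_at: datetime
    severity: str
    status: str = "Active"


def list_cautions(rows, *, status="Active", min_severity=None, cursor=None, limit=50):
    """Return (page, next_cursor); the cursor is a (registered_at, caution_id) pair."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    selected = sorted(rows, key=lambda r: (r.registered_at, r.caution_id))
    if status != "all":
        # Default 'Active' excludes retired and superseded rows.
        selected = [r for r in selected if r.status == status]
    if min_severity is not None:
        floor = SEVERITY_ORDINAL[min_severity]
        selected = [r for r in selected if SEVERITY_ORDINAL[r.severity] >= floor]
    if cursor is not None:
        # Keyset step: strictly after the last row of the previous page.
        selected = [r for r in selected if (r.registered_at, r.caution_id) > cursor]
    page = selected[:limit]
    has_more = len(selected) > limit
    next_cursor = (page[-1].registered_at, page[-1].caution_id) if has_more else None
    return page, next_cursor
```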

Anti-hooks pinned by tests:
  - No outbox / no notification on CautionRegistered (anti-pattern #5)
  - No Asset.parent_id hierarchy walk at query time (Watch #8; today
    propagate_to_children is hint-only)
  - No category exhaustiveness enforced on read (anti-pattern #9)
  - Default status filter is 'Active' (retired/superseded excluded)
  - Subscribed event types frozen at 3 (Registered/Superseded/Retired)

Wiring: register_caution_projections plumbed through cora.api.main
lifespan; CautionHandlers gains list_cautions field; routes + tools
registered.

Tests: 91 new unit (38 projection + 53 handler) + 30 new contract
(20 endpoint + 10 MCP) + 10 new integration (PG drain across 3-event
lifecycle + pagination + 8 filter combinations + tags array round-trip
+ cross-stream id-unique smoke). Pyright 0/0/0, ruff clean,
architecture suite passes (no new module / no cross-slice violations).

Cross-BC siblings (list_clearances, list_supplies, list_runs)
re-tested to confirm the new register_caution_projections lifespan
call did not regress sibling BCs: 11 integration tests pass clean.

NOT shipped in 11b-b (per design memo):
  - Run.start non-blocking integration via CautionLookup port (11b-c)
  - Auto-retire on expires_at (Watch #3)
  - Hierarchy-propagation projection denorm (Watch #8)

Atlas hash re-computed via `atlas migrate hash --dir file://migrations`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xmap added a commit that referenced this pull request May 18, 2026
Renames the closed `CampaignIntent` enum from 5 domain-specific values
(IN_SITU / OPERANDO / PARAMETER_SWEEP / MULTI_MODAL / PROPOSAL_BLOCK)
to 4 abstract shape values (SERIES / SWEEP / COORDINATED / BLOCK). The
new enum answers "what shape of coordination" rather than conflating
shape with scientific technique or purpose.

Conceptual mapping (pre-pilot; no event-payload migration needed):
  - IN_SITU + OPERANDO -> SERIES (both describe repeated measurements
    over time on shared resources; the in-situ/operando distinction is
    a technique tag, not a shape)
  - PARAMETER_SWEEP -> SWEEP
  - MULTI_MODAL -> COORDINATED (covers both multi-Method and
    multi-Subject coordinated acquisition)
  - PROPOSAL_BLOCK -> BLOCK (scheduling envelope: proposal / beamtime /
    cycle)
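
As a rough sketch, the conceptual mapping above could be written as an enum plus a lookup table. The member values follow the CHECK constraint strings quoted in this commit message; `LEGACY_TO_SHAPE` is a hypothetical illustration only, since no event-payload migration exists or is needed.

```python
from enum import Enum


class CampaignIntent(Enum):
    """Closed set of coordination shapes (4 values)."""

    SERIES = "Series"
    SWEEP = "Sweep"
    COORDINATED = "Coordinated"
    BLOCK = "Block"


# Hypothetical table, for illustration only: old value name -> new shape.
LEGACY_TO_SHAPE = {
    "IN_SITU": CampaignIntent.SERIES,
    "OPERANDO": CampaignIntent.SERIES,
    "PARAMETER_SWEEP": CampaignIntent.SWEEP,
    "MULTI_MODAL": CampaignIntent.COORDINATED,
    "PROPOSAL_BLOCK": CampaignIntent.BLOCK,
}
```

Because IN_SITU and OPERANDO both land on SERIES, the distinction between them has to travel in tags rather than in the enum.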

Rationale: closed enum should track ONE axis cleanly. Mixing "what
shape" with "what scientific technique" or "what purpose" causes the
enum to collapse under tag-pressure (Watch #1 trigger thresholds). The
shape-vs-purpose split:
  - intent (closed, 4 values) = coordination shape (Series / Sweep /
    Coordinated / Block)
  - tags (free frozenset) = technique + purpose (in-situ, operando,
    tomography, EDD, calibration, maintenance, validation, replication,
    pilot, longitudinal, ...)

Mirrors Caution BC's category + tags dual-shape pattern.

Watch #1 trigger candidates rewritten: future shape additions only
(Comparison, Discovery, if they cluster distinctly). Purpose tags
(calibration / maintenance / validation) EXPLICITLY stay in tags, not
promoted to shape values.

Pre-pilot file replacement (no events on disk; mirrors gate-cleanup N8
projection rename precedent). Atlas migration body rewritten:
  CHECK (intent IN ('Series', 'Sweep', 'Coordinated', 'Block'))
Atlas hash re-computed.

~57 files touched: aggregate state.py + 6 slice src files + Atlas
migration + atlas.sum + 19 unit tests + 22 contract tests + 4
integration tests + 3 memory docs (design memo Decision + Why §5
rewrite + Locks block + Watch #1 + Watch #3 + LOOSE Subject paragraph;
MEMORY.md line 47; phase plan 6i row).

Semantic test-intent preservation: where the IN_SITU+OPERANDO -> SERIES
collapse would have broken assertions (parametrized intent uniqueness,
filter-narrows-correctly tests), the test data was hand-fixed to use
distinct shapes (SERIES vs COORDINATED, SWEEP vs BLOCK) preserving the
test's behavioral pin.

Pyright 0/0/0, ruff clean, 209 campaign unit + 126 campaign contract +
21 PG integration + 1 XPASS (the documented race-skeleton).
Zero grep hits for old values across src, tests, infra, memory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xmap added a commit that referenced this pull request May 18, 2026
…rrorPort + gen_ai telemetry (Phase 8f-b iter 2a)

Ships the infrastructure plumbing 8f-b iter 2b's RunDebrief
subscriber will consume: a provider-neutral LLMPort with value-
type bundle (CacheBreakpoint, LLMContentBlock, LLMSystemPrompt,
ModelRef, LLMUsage, LLMResponse, LLMChatRequest) and a six-class
LLMError taxonomy; a production AnthropicLLMAdapter implementing
the port; an abstract LogbookMirrorPort with no implementor; a
gen_ai OTel telemetry helper with per-call token + cost histograms;
Kernel wiring for both ports; an Anthropic SDK dependency.

NO subscriber, NO Agent seed, NO actual LLM behavior at iter 2a --
those land at iter 2b with the security-specialist gate review.

## Port surface (cora.infrastructure.ports.llm)

LLMPort.chat(request: LLMChatRequest) -> LLMResponse with:
- System prompt as layered LLMContentBlock tuple; each block can
  carry an optional CacheBreakpoint (ttl: "5m" | "1h") to mark a
  cached prefix boundary.
- User message as a single LLMContentBlock (also breakpoint-able).
- Structured output as a JSON Schema dict (the adapter forces
  tool-use-as-structured-output convention).
- ModelRef VO (provider + model + snapshot_pin) carrying the
  Agent.model_ref shape; deliberately duplicated (not hoisted) from
  the agent aggregate's validated VO since one carries domain
  invariants and the other is a wire shape. Hoist deferred to
  rule-of-three trigger (second LLM-consuming agent at 8f-c+).
- LLMUsage (input/output/cache_creation/cache_read tokens; None
  coerced to 0 at the adapter boundary).
- LLMResponse (parsed + raw_text + usage + stop_reason + model_id).

Six error subclasses (LLMRateLimitError, LLMServerError,
LLMTimeoutError, LLMAuthenticationError, LLMInvalidRequestError,
LLMSchemaValidationError) all inherit from LLMError so iter 2b's
retry layer can isinstance-classify on the base.
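
A minimal sketch of that taxonomy, assuming flat inheritance from the base class. The six class names come from this commit message; `RETRYABLE` and `is_retryable` are hypothetical, and which errors count as retryable is an assumption, since the retry layer lands in iter 2b.

```python
class LLMError(Exception):
    """Base class for every provider-neutral LLM failure."""


class LLMRateLimitError(LLMError): ...
class LLMServerError(LLMError): ...
class LLMTimeoutError(LLMError): ...
class LLMAuthenticationError(LLMError): ...
class LLMInvalidRequestError(LLMError): ...
class LLMSchemaValidationError(LLMError): ...


# Assumption: transient failures are worth retrying, caller errors are not.
RETRYABLE = (LLMRateLimitError, LLMServerError, LLMTimeoutError)


def is_retryable(exc: BaseException) -> bool:
    """Classify on the LLMError hierarchy with isinstance."""
    return isinstance(exc, RETRYABLE)
```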

FakeLLMAdapter test stub with a response queue + LLMError pass-
through + received-request capture; mirrors the
AlwaysCoveredClearanceLookup / AlwaysQuietCautionLookup test-default
convention.

## Production adapter (cora.agent.adapters.anthropic_llm_adapter)

AnthropicLLMAdapter wraps anthropic.AsyncAnthropic with:
- max_retries=2, request_timeout=600s (design memo lock).
- 4-cache-breakpoint client-side validation (fail-fast before API
  call instead of opaque 400; raises LLMInvalidRequestError).
- 1h-TTL beta header (extended-cache-ttl-2025-04-11) set
  conditionally when any block requests "1h".
- Tool-use-as-structured-output via stable synthetic tool name
  cora_structured_output (cache-correctness invariant pinned by
  test_synthetic_tool_name_stable_across_calls).
- Full Anthropic SDK error -> LLMError translation including the
  defensive APIStatusError default arm (pinned by
  test_unknown_apistatuserror_subclass_translates_to_server_error).
- ModelRef snapshot_pin appended as "<model>-<pin>" suffix when
  set.
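
Three of the checks above (breakpoint cap, conditional 1h-TTL header, snapshot-pin suffix) can be sketched without the SDK. The field names on `LLMContentBlock` and the three helper functions are assumptions for illustration; only the limit of 4, the "5m" | "1h" TTL values, and the "<model>-<pin>" format come from this commit message.

```python
from __future__ import annotations

from dataclasses import dataclass

MAX_CACHE_BREAKPOINTS = 4


class LLMInvalidRequestError(Exception):
    """Stand-in for the port's error class of the same name."""


@dataclass(frozen=True)
class CacheBreakpoint:
    ttl: str  # "5m" or "1h"


@dataclass(frozen=True)
class LLMContentBlock:
    text: str
    cache_breakpoint: CacheBreakpoint | None = None  # assumed field name


def validate_breakpoints(blocks) -> None:
    """Fail fast before any API call if too many blocks carry a breakpoint."""
    count = sum(1 for b in blocks if b.cache_breakpoint is not None)
    if count > MAX_CACHE_BREAKPOINTS:
        raise LLMInvalidRequestError(
            f"{count} cache breakpoints exceeds the limit of {MAX_CACHE_BREAKPOINTS}"
        )


def needs_extended_ttl_header(blocks) -> bool:
    """True when any block requests the 1h TTL."""
    return any(
        b.cache_breakpoint is not None and b.cache_breakpoint.ttl == "1h"
        for b in blocks
    )


def resolve_model_id(model: str, snapshot_pin: str | None) -> str:
    """Append the pin as a "<model>-<pin>" suffix when it is set."""
    return f"{model}-{snapshot_pin}" if snapshot_pin else model
```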

Owns the adapter per cross-BC convention (Safety BC owns
PostgresClearanceLookup; Caution BC owns PostgresCautionLookup;
Agent BC owns AnthropicLLMAdapter). Tach validated.

## LogbookMirrorPort (cora.infrastructure.ports.logbook_mirror)

Abstract Protocol with no production implementor at 8f-b. Reserves
the Kernel slot (Kernel.logbook_mirror: LogbookMirrorPort | None)
so iter 2b's subscriber can short-circuit cleanly on `is None`
and a future PhoebusOlogAdapter / SciLogAdapter / SciCatAdapter
slots in without subscriber churn. mirror_decision is
fire-and-forget (returns None) so logbook outages never propagate
to the Decision-emission path.
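
A small sketch of how a consumer might use the optional slot. The `mirror_decision` argument type, the synchronous signature, and the `emit_decision` helper are assumptions; the commit message fixes only that the port is a Protocol, that the Kernel slot may be None, and that mirroring returns None.

```python
from __future__ import annotations

from typing import Protocol


class LogbookMirrorPort(Protocol):
    def mirror_decision(self, decision: dict) -> None: ...


def emit_decision(
    decision: dict, logbook_mirror: LogbookMirrorPort | None, emitted: list
) -> None:
    """Hypothetical subscriber step: emit first, then mirror best-effort."""
    emitted.append(decision)
    if logbook_mirror is None:
        # No implementor wired: short-circuit cleanly.
        return
    try:
        logbook_mirror.mirror_decision(decision)
    except Exception:
        # Assumption: a logbook outage is swallowed here so it never
        # reaches the Decision-emission path.
        pass
```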

## gen_ai telemetry helper (cora.infrastructure.observability.gen_ai)

record_llm_call(span, ...) sets OTel GenAI semantic-convention
span attributes (gen_ai.system, gen_ai.request.model,
gen_ai.response.model, gen_ai.usage.{input,output,
cache_creation,cache_read}_tokens, gen_ai.response.finish_reasons)
and records two histograms (gen_ai.client.token.usage per OTel
spec + cora.agent.llm.cost.usd custom).

PRICING dict keyed on (provider, model) carrying per-MTok rates
for opus/sonnet/haiku 4-x (1h-TTL cache write tier). Unknown
models cost $0.00 with a one-time-per-process warning.
compute_cost_usd returns a dollar value that the adapter
intentionally discards (the histogram is the persistent record; the
return value is for tests).

Helpers NOT re-exported from observability/__init__.py: the only
consumer (AnthropicLLMAdapter) imports from the submodule directly.
Re-export trigger is "second LLM adapter ships" (cross-BC review
P1).

## Settings + secret handling

Settings.anthropic_api_key: SecretStr | None = None (read from
ANTHROPIC_API_KEY env var). SecretStr redacts to `**********` in
repr(), str(), and model_dump_json() so no debug-log / json-dump
path can leak the credential. Verified by
test_anthropic_api_key_is_secret_str_and_redacted_in_repr.

The factory (cora.agent.llm_factory.build_llm) is the one and
only call site of .get_secret_value(); composition root binds
build_llm into build_kernel's llm_factory parameter.

## Composition root

LLMPortFactory Protocol in cora.infrastructure.deps:
__call__(settings) -> LLMPort | None. Returns None when settings
indicate no LLM should be wired (e.g. anthropic_api_key unset);
iter 2b's subscriber-registration step will fail-fast on
kernel.llm is None.

cora.api.main binds build_llm into build_kernel(llm_factory=...).

## Gate review (Stage 3, 3 baseline panel)

Architecture / test-coverage / cross-BC consistency all
APPROVE WITH NITS, 0 P0s. All P1s addressed in same commit:

- arch P1 #1 (SecretStr leak): anthropic_api_key now SecretStr
  with redaction test pin.
- arch P1 #2 (ModelRef duplication): documented as intentional
  separation with rule-of-three trigger to hoist.
- test-coverage P1 #1 (APIStatusError defensive default
  untested): added
  test_unknown_apistatuserror_subclass_translates_to_server_error.
- test-coverage P1 #2 (tool-name stability not pinned): added
  test_synthetic_tool_name_stable_across_calls (cache-correctness
  invariant).
- test-coverage P1 #3 (record_llm_call return discard intent):
  documented at the call site.
- test-coverage P1 #4 (None cache-token coercion untested): added
  test_none_cache_token_fields_coerce_to_zero.
- cross-BC P1 #1 (gen_ai re-export bloat): dropped from
  observability/__init__.py.

Deferred to iter 2b (with documented triggers):
- test-coverage watch (AsyncAnthropic.aclose leak): subscriber
  lifecycle is iter 2b scope.
- cross-BC nit (LLMPort/LogbookMirrorPort suffix vs prior no-
  suffix ports): cross-cutting naming review; not iter 2a scope.
- cross-BC nit (AlwaysSilentLLM stub for Kernel symmetry):
  Optional shape gives fail-fast at subscriber registration,
  intentional; revisit if RecipeScreener at 8f-c surfaces friction.

## Tests + verification

49 new tests, all green: 22 unit (AnthropicLLMAdapter) + 12 unit
(LLMPort + FakeLLMAdapter) + 7 unit (gen_ai telemetry) + 3 unit
(LogbookMirrorPort Protocol) + 5 unit (deps Kernel composition +
build_llm + SecretStr redaction).

Full suite: 6608 unit+contract+architecture + 329 PG integration
tests pass, 107 skipped, 0 failures. pyright 0/0/0, ruff clean
(8f-b iter 2a scope), tach validated. anthropic>=0.79.0,<1 added
to dependencies (0.102.0 installed); uv.lock committed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xmap added a commit that referenced this pull request May 20, 2026
Adds the second half of the Calibration BC ↔ Data BC AsShot
integration. Phase 12b shipped Run.pinned_calibrations on the
acquired-from Run; 12c-1 ships Dataset.used_calibrations on the
derivative Dataset. The two sets are independent — the
reconstruction may legitimately cite refined revisions not in the
producing Run's pin set (Current vs AsShot per DNG; per-step
refinement per RELION; per-record overrides per CMS toGet).

What ships:

- Dataset.used_calibrations: frozenset[UUID] aggregate field
  (revision-cited atomic IDs per the Calibration BC's
  revision-cited model). IMMUTABLE after register_dataset; both
  transition arms (DatasetDiscarded + DatasetPromoted) preserve
  prior.used_calibrations verbatim.
- DatasetRegistered.used_calibrations: tuple[UUID, ...] event
  payload (sorted at to_payload for deterministic jsonb bytes;
  empty-tuple default + payload.get fold for pre-12c streams,
  same additive-state pattern as derived_from /
  producing_run_end_state / intent).
- RegisterDataset.used_calibrations: frozenset[UUID] command field
  + decider validates cardinality (cap 64, matches derived_from)
  + sorts before emit. NO cross-BC existence check — operator/agent
  supplies the citation set; symmetry with Run.pinned_calibrations
  Phase 12b + the cross-BC eventual-consistency stance per
  project_calibration_design anti-hook #3 (revision-cited atomic
  IDs make "partial override" a category error in this model) +
  the DDD eventual-consistency canon (Vernon/Evans).
- REST POST /datasets gains used_calibrations: list[UUID] body
  field with cap; MCP register_dataset tool gains a matching
  optional list param.
- New InvalidUsedCalibrationsError (HTTP 400) + new
  validate_used_calibrations validator + new
  DATASET_USED_CALIBRATIONS_MAX_ENTRIES constant + Data BC
  routes.py registers the validation handler.
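
A minimal sketch of the cardinality check, the sorted payload, and the legacy fold described above. The constant, error, and validator names come from this commit message; the `.count` attribute and the `to_payload` / `from_payload` helpers as free functions are assumptions for illustration.

```python
from __future__ import annotations

from uuid import UUID

DATASET_USED_CALIBRATIONS_MAX_ENTRIES = 64


class InvalidUsedCalibrationsError(ValueError):
    def __init__(self, count: int) -> None:
        super().__init__(
            f"used_calibrations has {count} entries; "
            f"cap is {DATASET_USED_CALIBRATIONS_MAX_ENTRIES}"
        )
        self.count = count  # assumed attribute, kept for observability


def validate_used_calibrations(ids: frozenset[UUID]) -> None:
    """Cardinality only: no cross-BC existence check on the IDs."""
    if len(ids) > DATASET_USED_CALIBRATIONS_MAX_ENTRIES:
        raise InvalidUsedCalibrationsError(len(ids))


def to_payload(ids: frozenset[UUID]) -> dict:
    # Sort so the serialized bytes do not depend on set iteration order.
    return {"used_calibrations": [str(u) for u in sorted(ids)]}


def from_payload(payload: dict) -> tuple[UUID, ...]:
    # Pre-12c payloads lack the key and fold to the empty tuple.
    return tuple(UUID(v) for v in payload.get("used_calibrations", []))
```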

Test surface (13 new unit tests across 3 files):
- Evolver: genesis populates, legacy pre-12c folds to empty,
  discard preserves AsShot invariant, promote preserves AsShot
  invariant.
- Events: sorted-bytes invariant, empty-tuple default serializes
  to [], pre-12c payload folds with payload.get, full round-trip
  through StoredEvent envelope, order-independence of payload
  bytes.
- Decider: default empty tuple on payload, threading through
  command->event, sort-before-emit pins exact tuple order
  (NOT just set equality), cardinality cap rejects > cap,
  boundary at exactly cap accepted, no cross-BC validation on
  synthetic UUIDs, no comparison against producing Run's
  pinned_calibrations even when the Run is pre-loaded in context.

Projection update + Atlas migration + the GIN-indexed
proj_data_dataset_summary.used_calibrations column land in 12c-2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xmap added a commit that referenced this pull request May 20, 2026
…ection (Phase 12c-2)

Postgres-tier half of Dataset.used_calibrations per Phase 12c-1.
Surfaces the AsShot citation set on the proj_data_dataset_summary
read model so future operator dashboards + agent subscribers
(RotationCenterRefiner, RunDebrief follow-ups, 12d Calibration-
revision-impact queries) can read "which reconstructions cited
CalibrationRevision X?" without folding the Dataset stream.

What ships:

- Atlas migration 20260519100000_add_proj_data_dataset_summary_used_calibrations.sql
  - ALTER TABLE proj_data_dataset_summary ADD COLUMN
    used_calibrations UUID[] NOT NULL DEFAULT '{}'  (pre-12c rows
    backfill to empty array; matches the in-memory frozenset default
    + the from_stored forward-compat fold of pre-12c
    DatasetRegistered payloads)
  - CREATE INDEX proj_data_dataset_summary_used_calibrations_gin_idx
    ON ... USING GIN (used_calibrations)  (powers the @> array-
    contains operator for membership lookup; symmetric to
    proj_run_summary_pinned_calibrations_gin_idx from 12b-2)

- DatasetSummaryProjection.apply() extension
  - _INSERT_DATASET_SQL gains used_calibrations as $7 (uuid[] cast)
  - Reads payload.get("used_calibrations", []) for pre-12c forward-
    compat (legacy fold to empty)
  - Discard arm unchanged (status-only UPDATE); citation set stays
    intact on discarded rows for audit fidelity

- Integration test test_register_dataset_used_calibrations_postgres.py
  (3 tests, mirrors 12b-2's symmetric Run column test):
  - Happy path: register_dataset with two citations lands sorted on
    both event payload + projection uuid[] column
  - Empty-default: omitted used_calibrations lands empty array
  - GIN-indexed @> membership lookup: pin @> over = ANY (the latter
    is rewritten internally and does NOT probe a GIN index on
    uuid[]; 12b-3 gate-review finding mirrored)

NO cross-table integrity constraint between
proj_data_dataset_summary.used_calibrations and
proj_run_summary.pinned_calibrations - drift between the two
columns is observational per the revision-cited atomic-ID model +
canonical DDD eventual-consistency stance
(project_calibration_design anti-hook #3 + #13 + rejection #11).
Watch item #7 captures the deferred projection-level integrity
check shape if consumer pain emerges.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xmap added a commit that referenced this pull request May 20, 2026
Multi-axis Phase 12 gate review (architecture / test pyramid /
docs+ops / domain) surfaced 1 P0 + 2 P1 asymmetric gaps between
Run BC (12b) and Data BC (12c). The 12c-3 gate caught and fixed
these gaps for Data; the 12b-3 gate missed them for Run. This
retrofit closes the parity gap.

What ships:

- RUN_PINNED_CALIBRATIONS_MAX_ENTRIES = 64 + InvalidPinnedCalibrationsError
  (HTTP 400) + validate_pinned_calibrations validator on
  apps/api/src/cora/run/aggregates/run/state.py. Mirrors Data BC's
  DATASET_USED_CALIBRATIONS_MAX_ENTRIES + InvalidUsedCalibrationsError
  + validate_used_calibrations exactly (same default cap, same
  precedent: per-entry existence is NOT checked at the write path
  per anti-hook #3 / Vernon/Evans cross-aggregate eventual-
  consistency canon).
- start_run decider wires validate_pinned_calibrations on the
  command's pin set before sort-before-emit. Symmetric to
  register_dataset decider's validate_used_calibrations call.
- Run aggregate __init__ exports the new constant + error class +
  validator. Run BC routes.py registers InvalidPinnedCalibrationsError
  as an HTTP 400 handler via the existing validation-class loop.

Test surface (3 P1 + 1 P0 gate-review findings closed; 11 new
tests total):

- state-tier validator tests in tests/unit/run/test_run.py (5
  tests): accepts empty, within cap, exactly at cap (off-by-one
  guard), rejects > cap, error class carries .count for
  observability. Mirror of Data BC's test_validate_used_calibrations_*
  precedent that 12c-3 added.
- decider cardinality cap tests in tests/unit/run/test_start_run_decider.py
  (2 tests): rejects > cap, accepts exactly at cap. Mirror of Data
  BC's test_decide_raises_invalid_used_calibrations_for_too_many_entries.
- P0 projection unit test pin in tests/unit/run/test_run_summary_projection.py:
  test_run_started_inserts_with_running_status_and_genesis_refs now
  asserts args.args[9] == [] (the projection's UUID[] parameter
  binding). Without the pin, a regression that drops or misorders
  args[9] would silently slip through to integration; with the pin,
  the bug surfaces at unit tier. Plus 2 new tests:
    - test_run_started_pre_12b_payload_falls_back_to_empty_pinned_calibrations
      (locks the .get(..., []) backward-compat fallback)
    - test_run_started_with_pinned_calibrations_inserts_uuid_array
      (verifies non-empty payload entries parse to UUIDs)
  Direct mirror of 12c-3's P0 fix.
- Backward-compat fold pin in tests/integration/test_start_run_handler_postgres.py:
  legacy callers that omit pinned_calibrations get
  state.pinned_calibrations == frozenset() after the full event
  payload -> PG round-trip -> load_run fold chain. Mirror of Data
  BC's 12c-3 pin on test_register_dataset_handler_postgres.

All gates clean: pyright 0/0/0; tach 0 violations; 4317 architecture
tests pass; 111 Run BC tests pass (this commit's deltas: 11 new
tests across 3 files, all green).

Pattern caught: 12b shipped without the same quality gates that
12c-3 retrofitted for Data. This is now closed — 12b and 12c are
symmetric across (validator + error class + decider cardinality
test + projection args-binding pin + integration backward-compat
pin). Future similar aggregate-field additions should follow the
12c-3 / 12b-5 checklist from the start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xmap added a commit that referenced this pull request May 20, 2026
…red-test pins (Phase C Iter A cleanup)

Post-Iter-A-gate-review cleanup pass before Iter B. Addresses the
deferred SHOULD-FIX items the fix(auth) 6f1dfca commit left for
later, plus the impl-quality NIT items that touch the same files.

Shared helpers (test-coverage gap #13):
  New tests/unit/auth/_helpers.py (sibling to tests/unit/_helpers.py,
  the build_deps factory every BC handler test uses). Eliminates the
  triplication of _sign / _make_keypair / _make_mapper / _make_verifier
  across 3 test files. Also exports the test constants (TEST_ISSUER /
  TEST_AUD_HTTP / TEST_SURFACE_HTTP / FIXED_PRINCIPAL_ID / TEST_CLIENT_*)
  so tests don't redeclare them.

Test file reorganization:
  tests/unit/test_{jwt,introspection}_verifier.py and test_idp_registry.py
  -> tests/unit/auth/test_*.py. BC-named subdir matches existing
  convention (access/, agent/, etc.). Each file refactored to import
  from _helpers instead of redeclaring. ~350 lines of duplication
  removed across the 3 files.

Dead code dropped (impl-quality #3 + #5):
  Test-helper RSAAlgorithm.to_jwk(from_jwk(to_jwk(...))) round-trip
  replaced with the single-call form. Unused _ = public_pem and
  _ = jwk_dict assignments removed.

Token-length cap at the registry boundary (gate-review F11):
  IdentityProviderRegistry.verify now rejects tokens > 8192 bytes with
  InvalidTokenError('malformed') BEFORE any base64-decode / parse work.
  DoS guard against multi-megabyte attacker payloads.
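
A minimal sketch of such a boundary check, assuming the cap is measured on the UTF-8 encoded token. `check_token_length` is a hypothetical helper, and the `InvalidTokenError` below is a stand-in, since the project's own class definition is not shown here.

```python
MAX_TOKEN_BYTES = 8192  # cap stated in the commit message


class InvalidTokenError(Exception):
    """Stand-in for the project's error class (real definition not shown)."""

    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


def check_token_length(token: str) -> str:
    """Reject oversized tokens before any base64-decode or parse work."""
    if len(token.encode("utf-8")) > MAX_TOKEN_BYTES:
        raise InvalidTokenError("malformed")
    return token
```

Doing the length check first keeps the cost of rejecting a multi-megabyte payload at a single length comparison.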

Four new test pins:
  - test_concurrent_same_token_both_miss_cache_deferred_behavior: locks
    the current 'both miss' fan-out so a future single-flight coalescing
    fix doesn't silently change the contract.
  - test_verify_rejects_hand_crafted_hs256_signed_with_rsa_public_key:
    the real algorithm-confusion attack (CVE-2015-9235 class) via
    hand-crafted base64 + HMAC (PyJWT 2.x blocks the equivalent at
    jwt.encode() time, so simulating the attacker requires bypassing
    PyJWT on the construction side). Verifier's algorithms=['RS256']
    pin at decode time rejects.
  - test_jwt_only_deployment_works + test_introspection_only_deployment_works:
    registry shape symmetry (test-coverage gap #11).
  - test_excessive_length_token_rejected_at_boundary: F11 pin.
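
The algorithm-confusion pin can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the project's test: `forge_hs256_token` and `header_alg_allowed` are hypothetical names, the PEM bytes are a placeholder, and the allowlist check only stands in for the effect of an `algorithms=['RS256']` pin at decode time.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def forge_hs256_token(claims: dict, public_key_pem: bytes) -> str:
    """Attacker side: HMAC-SHA256 signature keyed with the RSA public key bytes."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{body}".encode("ascii")
    signature = hmac.new(public_key_pem, signing_input, hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(signature)}"


def header_alg_allowed(token: str, allowed: tuple[str, ...] = ("RS256",)) -> bool:
    """Verifier side (simplified): compare the header alg to an allowlist
    before doing any signature work."""
    header_segment = token.split(".")[0]
    padded = header_segment + "=" * (-len(header_segment) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    return header.get("alg") in allowed


# Placeholder bytes; a real attack would use the issuer's published public key.
FAKE_PUBLIC_PEM = b"-----BEGIN PUBLIC KEY-----\n(placeholder)\n-----END PUBLIC KEY-----\n"
token = forge_hs256_token({"sub": "attacker"}, FAKE_PUBLIC_PEM)
```

With the default allowlist, `header_alg_allowed(token)` is false, so the forged token is refused before its HMAC is ever compared against the public key.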

Verification: 11989 passed (+6 net) / 319 skipped in 390s; pyright +
ruff + tach clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
xmap added a commit that referenced this pull request May 21, 2026
…dation (Phase 1A)

First concrete instantiation of the Q4 compensation-primitive pattern
(per [[project-dataset-demote-design]] / [[project-compensation-primitive-research]]).
Closes documented gap at `state.py:184` where `Intent.RETRACTED` was
anticipated but not implemented. Mirrors the Crossref retraction model
(additive notice, original DatasetPromoted preserved + marked).

Data BC:
- Intent.RETRACTED added to open enum (terminal — no re-promote)
- DatasetDemoted event (mirrors DatasetPromoted shape)
- Evolver fold Production→Retracted (status + AsShot used_calibrations preserved)
- demote_dataset slice (command/decider/handler/route/tool) mirroring discard_dataset
  (no context.py — no cross-BC cascade by [[project-calibration-design]] anti-hook #3)
- DemotionReason VO + DatasetCannotDemoteError + DatasetAlreadyRetractedError
  + InvalidDemotionReasonError
- Wire/routes/tools registration; OpenAPI snapshot regenerated

Decision BC:
- DecisionOverrideKind Literal extended 4→5: added "invalidation"
- SUPERSEDES Q4 memo's `Decision.compensates_decision_id` field proposal.
  Gate-review code-inspection found existing parent_id + override_kind
  discriminator already carries the chain semantic more cleanly. No new
  field, no migration. Q4 watch item #1 resolved by extending the closed
  enum instead. "invalidation" maps to PROV-O wasInvalidatedBy on the
  activity side; parent_id stays informedBy across all 5 kinds.

REST + MCP:
- POST /datasets/{id}/demote with 204/404/409×3/400/422×2 coverage
- demote_dataset MCP tool

Projection update SKIPPED (gate-review discovery): DatasetSummaryProjection
deliberately tracks status only, not intent (it does not subscribe to
DatasetPromoted either). Watch item: subscribe to both Promoted+Demoted
together when the first consumer asks. Documented in design memo.

Strict-not-idempotent at decider (re-demote → DatasetAlreadyRetractedError);
source-state must be Production (Trial→Retracted rejected — use discard
for Trial cleanup; Discarded→Retracted rejected — Discarded is stronger
terminal). Decision linkage at slice is OPTIONAL (mirror adjust_run).
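
The decider rules above can be sketched as a small pure function. This is an illustration only: `DatasetState` is a hypothetical enum that collapses the aggregate's source state into one value, `decide_demote` is a hypothetical name, and the error classes are stand-ins that reuse the names from this commit.

```python
from enum import Enum


class DatasetState(Enum):
    """Hypothetical, simplified stand-in for the aggregate's source state."""

    TRIAL = "trial"
    PRODUCTION = "production"
    DISCARDED = "discarded"
    RETRACTED = "retracted"


class DatasetCannotDemoteError(Exception):
    """Stand-in: the source state is not Production."""


class DatasetAlreadyRetractedError(Exception):
    """Stand-in: the dataset has already been demoted."""


def decide_demote(state: DatasetState) -> str:
    """Return the name of the event to emit, or raise if demotion is not allowed."""
    # Strict, not idempotent: a second demote is an error rather than a no-op.
    if state is DatasetState.RETRACTED:
        raise DatasetAlreadyRetractedError("dataset is already retracted")
    # Only Production may be demoted; Trial and Discarded are rejected.
    if state is not DatasetState.PRODUCTION:
        raise DatasetCannotDemoteError(f"cannot demote from {state.value}")
    return "DatasetDemoted"
```

Checking the already-retracted case first gives re-demotion its own error instead of folding it into the generic "cannot demote" rejection.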

Tests: 40+ new. 11980 pass / 319 skipped (+11 vs baseline).
- decider 9, handler 8, events 3, evolver 4, VO 6 (Data)
- register_decision_decider 2 (Decision)
- REST endpoint 8, MCP tool 2 (contract)
- existing parametrize + Intent enum tests extended

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Labels

dependencies, python:uv
