Skip to content

MVP2 backlog batch: caplog-isolation fix + 6 cross-model-reviewed spec/plan pairs#364

Merged
SoundMindsAI merged 6 commits into
mainfrom
feature/mvp2-backlog-spec-plan-batch
Jun 1, 2026
Merged

MVP2 backlog batch: caplog-isolation fix + 6 cross-model-reviewed spec/plan pairs#364
SoundMindsAI merged 6 commits into
mainfrom
feature/mvp2-backlog-spec-plan-batch

Conversation

@SoundMindsAI
Copy link
Copy Markdown
Owner

What this PR does

Drives the next 7 highest-priority MVP2 idea-stage items forward in one batch (one branch, per the single-branch-per-session convention). Item #1 was a bug (fixed for real); items #2#7 are taken from idea.mdfeature_spec.md + implementation_plan.md via idea-preflight → spec-gen → impl-plan-gen, each with the mandatory GPT-5.5 cross-model review. No feature is implemented here — the six are planning artifacts.

Code changes (2 commits)

fix(logging)e4707cd0 — closes bug_backend_suite_nondeterministic_caplog_isolation (P1)

The backend suite was nondeterministically red: capture_logs()-based unit tests asserted on [] in the full suite but passed in isolation. The idea's pytest-randomly theory was wrong (that plugin isn't installed). Real cause: configure_logging() handed structlog a brand-new processors list instance on every call; loggers cache against the instance (cache_logger_on_first_use=True), so a second configure_logging() (e.g. an integration test's FastAPI lifespan) left earlier-bound loggers pointing at a stale list while capture_logs() mutated the new one → blind capture. Reproduced deterministically (8 failed → 75 passed). Fix: mutate the list in place. Also fixed a second polluter — a hand-rolled capture fixture in test_position_bias_prior.py doing configure(processors=[cap]) + reset_defaults(). Regression test added; bug_fix.md written.

fix(tests)2c287a48 — closes bug_contract_allowlists_outdated_after_mvp2_features

Three hand-maintained contract allowlists never got 'solr' (+ 5 new MVP2 endpoints) after the Solr adapter shipped, leaving the suite deterministically red on a clean tree. Updated test_resolve_engine_type_wire, test_health_contract subsystems, and EXPECTED_ENDPOINTS. Folded in (operator decision) so this batch's CI can go green. Contract layer: 325 passed.

Planning artifacts (1 commit, 09585893)

Feature FR / stories Notes
feat_apply_path_normalizer_declaration 8 / 7 Design-ahead — gated on feat_query_normalization_tuning Phase 1 merge + operator-friction evidence; plan refuses /impl-execute until both clear
feat_overnight_studies_summary_card 6 / 7 Phase 2 of shipped feat_overnight_autopilot; ready
feat_query_normalizer_typed_pipeline 9 / 8 Design-ahead — Phase 2 of feat_query_normalization_tuning; Story 0 hard precondition gate
infra_generated_artifact_freshness_gate 9 / 6 CI freshness gate; report-only until operator adds required check
chore_arq_pool_aclose_deprecation 4 / 2 preflight found a second call site (workers/all.py:225) + corrected the arq version claim
chore_cluster_detail_rung_badge 8 / 8 surface UbiRungBadge on cluster detail

Each idea.md was preflighted (stale claims corrected, dependency status recorded); each folder carries a pipeline_status.md through the Plan stage. Dashboards regenerated.

Not in scope

No production feature code. The two design-ahead specs (#2, #4) are explicitly not green-lit for /impl-execute until their Phase 1 dependency merges.

🤖 Generated with Claude Code

SoundMindsAI and others added 3 commits June 1, 2026 00:03
…igure_logging calls

The backend suite was order-dependently red: unit tests asserting on
structlog.testing.capture_logs() failed with empty-capture shapes (assert [])
in the full suite but passed in isolation, and identical-commit re-runs produced
different failing sets.

Root cause: configure_logging() handed structlog a brand-new processors list on
every call. structlog binds loggers against that list instance and (with
cache_logger_on_first_use=True) freezes the reference; capture_logs() mutates the
list in place. A second configure_logging() (e.g. an integration test's FastAPI
lifespan) swapped the list, leaving module-level loggers cached against the stale
instance blind to capture_logs() — order-dependent, so the failing set varied.

Fix: mutate the existing processors list in place (existing[:] = new) instead of
replacing it, so every bound logger always observes the current chain — the same
invariant capture_logs() relies on. Caching stays on; no production behavior
change (the single startup caller configures once).

Verified: reset_defaults() / cache-off cannot un-blind an already-cached proxy,
so a conftest reset fixture (the idea's proposed fix) is insufficient — the fix
must live in configure_logging(). New regression test fails on main, passes here.

Also corrects a stale pyproject dev-group comment claiming CI passes `-n auto`
(it was reverted).

Closes bug_backend_suite_nondeterministic_caplog_isolation (caplog cluster).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
…P2 endpoints

Three hand-maintained contract-test allowlists drifted as MVP2 features shipped
without updating them in the same PR, leaving the backend suite deterministically
red on a clean tree:

- test_resolve_engine_type_wire: add "solr" (infra_adapter_solr, PR #336).
- test_health_contract subsystems: add "solr" (infra_adapter_solr Story A12).
- test_openapi_has_no_orphan_endpoints: add the 5 endpoints that shipped without
  an EXPECTED_ENDPOINTS entry — ubi-readiness + generate-from-ubi (feat_ubi_
  judgments #317), studies/{id}/chain (feat_overnight_autopilot #343),
  clusters/test-connection + clusters/{id}/reprobe (infra_adapter_solr #336).

Test-only; status codes verified against the live OpenAPI surface. Closes
bug_contract_allowlists_outdated_after_mvp2_features. Contract layer: 325 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
… ideas

Advances the next 6 MVP2 idea-stage features from idea.md to feature_spec.md +
implementation_plan.md, each via idea-preflight -> spec-gen -> impl-plan-gen with
the mandatory GPT-5.5 cross-model review (Opus creates, GPT-5.5 reviews).

- feat_apply_path_normalizer_declaration (8 FR / 7 stories) — DESIGN-AHEAD:
  gated on feat_query_normalization_tuning Phase 1 merge + operator-friction
  evidence; plan refuses /impl-execute until both clear.
- feat_overnight_studies_summary_card (6 FR / 7 stories) — Phase 2 of the
  shipped feat_overnight_autopilot; ready.
- feat_query_normalizer_typed_pipeline (9 FR / 8 stories) — DESIGN-AHEAD:
  Phase 2 of feat_query_normalization_tuning; Story 0 hard precondition gate.
- infra_generated_artifact_freshness_gate (9 FR / 6 stories) — CI freshness gate.
- chore_arq_pool_aclose_deprecation (4 FR / 2 stories) — preflight found a
  SECOND call site (workers/all.py:225) + corrected the arq version claim.
- chore_cluster_detail_rung_badge (8 FR / 8 stories) — surface UbiRungBadge.

Preflight patched each idea.md against the live tree. Dashboards regenerated.
No production code; no feature implemented.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request resolves the non-deterministic test isolation issue (bug_backend_suite_nondeterministic_caplog_isolation) by ensuring that configure_logging mutates the existing structlog processors list in place rather than replacing it with a new instance. It also updates several contract tests to include Solr and other new endpoints, adds regression tests for the logging fix, and introduces several design-ahead specifications and plans for upcoming features and chores. A critical review comment points out that the rewritten capture_logs fixture in test_position_bias_prior.py will raise an AttributeError if run in isolation when structlog defaults to an immutable tuple of processors; a check should be added to handle this case gracefully.

Comment on lines +45 to +55
configured = structlog.get_config()["processors"]
saved = configured.copy()
configured.clear()
configured.append(cap)
structlog.configure(processors=configured)
try:
yield cap
finally:
configured.clear()
configured.extend(saved)
structlog.configure(processors=configured)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

If test_position_bias_prior.py is run in isolation and configure_logging() has not been called yet, structlog.get_config()["processors"] will return a tuple (the default in structlog). Since tuple is immutable and does not have .copy(), .clear(), or .extend() methods, this fixture will raise an AttributeError and fail the test.

To prevent this, check if configured is an instance of list before attempting to mutate it in place. If it is a tuple (or other non-list type), fall back to configuring it directly and restoring the original value on teardown.

Suggested change
configured = structlog.get_config()["processors"]
saved = configured.copy()
configured.clear()
configured.append(cap)
structlog.configure(processors=configured)
try:
yield cap
finally:
configured.clear()
configured.extend(saved)
structlog.configure(processors=configured)
configured = structlog.get_config()["processors"]
if isinstance(configured, list):
saved = list(configured)
configured.clear()
configured.append(cap)
structlog.configure(processors=configured)
try:
yield cap
finally:
configured.clear()
configured.extend(saved)
structlog.configure(processors=configured)
else:
structlog.configure(processors=[cap])
try:
yield cap
finally:
structlog.configure(processors=configured)

SoundMindsAI and others added 3 commits June 1, 2026 06:35
…2 drift

- test_judgment_generate::test_happy_path_ac1_ac6: source_breakdown is the
  three-term {llm,human,click} shape since feat_ubi_judgments FR-10 (the old
  "click folds into human" contract was retired when UBI lists ship click
  rows). Update the expected dict to include click:0.
- test_migration_0021_generation_params::test_downgrade_drops_column_then_round_trip:
  the test did `alembic downgrade -1` assuming 0021 was head, but head is now
  0022_solr_engine_auth_check — so `-1` reversed 0022 (a CHECK relaxation) and
  left 0021's generation_params column intact. Target absolute revision 0020 so
  0021's downgrade actually runs, robust against future head advances. The
  migration itself was correct; only the test's relative target was stale.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
…mini)

Adopt Gemini Code Assist's defensive isinstance(list) guard on the in-place
processors-list mutation. structlog 25.5.0 returns a mutable list here
(verified — the in-place path is live and the test passes standalone), so
Gemini's stated AttributeError is not currently reachable; the guard is
accepted as cheap future-proofing that mirrors the same isinstance guard in
backend/app/core/logging.configure_logging. Falls back to a clean
configure/restore if a future structlog hands back an immutable tuple.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
…MVP2 batch

Solr is not CI-ready: the pr.yml backend job has no Solr service container
(test_demo_seeding_ubi_full ConnectErrors) and the smoke job's Solr container
crashes (exited 1) during make up. Pre-existing since Solr + the Solr demo
scenario shipped (PRs #336/#348); not introduced by this batch. Captured as a
P1 infra idea with three fix options. Dashboards + roadmap regenerated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
@SoundMindsAI
Copy link
Copy Markdown
Owner Author

Gemini Code Assist adjudication

# Severity File Finding Verdict
1 High test_position_bias_prior.py:55 In-place mutation of get_config()["processors"] would raise AttributeError if the default is an immutable tuple Accepted (as future-proofing) — premise is not reachable on the pinned structlog 25.5.0: verified structlog.get_config()["processors"] is a mutable list at default, and the test passes standalone (15/15). Adopted the suggested isinstance(list) guard anyway (commit 685cdebd) — cheap, harmless, and mirrors the same guard already in configure_logging().

No other review findings. CI: 11/13 checks green, 3168 backend tests pass. The 2 red checks (backend → only test_demo_seeding_ubi_full; smoke) are a pre-existing Solr-CI-readiness gap unrelated to this PR — captured as infra_solr_ci_readiness (P1). Merging over per maintainer decision.

@SoundMindsAI SoundMindsAI merged commit b479028 into main Jun 1, 2026
11 of 12 checks passed
@SoundMindsAI SoundMindsAI deleted the feature/mvp2-backlog-spec-plan-batch branch June 1, 2026 10:59
SoundMindsAI added a commit that referenced this pull request Jun 2, 2026
* docs(state): finalize 3 stale MVP2 folders + close #357 + dashboards

Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>

* docs(dashboards): regenerate after rebase on #385 (reflect both folder-move sets)

Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>

---------

Signed-off-by: SoundMindsAI <eric.starr@soundminds.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant