ci: fix all CI failures from M3 refactor by jam-sudo · Pull Request #14 · jam-sudo/Omega

jam-sudo · 2026-02-28T04:39:37Z

Summary

- Run ruff format on 32 files left unformatted after the M3 refactor
- Fix 30 ruff lint errors (unsorted imports, unused import, B904, B905)
- Bump pytest-timeout from ~=0.5 to >=2.2 (compatible with pytest 8)
- Install [api] extras in CI so test_api.py (fastapi) can be collected
- Install [viz] extras in CI so matplotlib-dependent tests pass

Test plan

- All 5 CI jobs green on this branch (Quality, Test 3.10/3.11/3.12, Smoke)
- 818 tests pass, 8 skipped/deselected

🤖 Generated with Claude Code

- Fix 27 auto-fixable issues (unsorted imports, unused import)
- B904: add `from None` to bare raises in ImportError handlers in cli.py
- B905: add strict=False to zip() in test_surrogate_extensions.py

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…s to Δ+0.553

Phase 3a.1 gut-wall fix improved the raw ODE baseline significantly, reducing the hybrid selector's relative contribution (BARE 2.465→2.039; NO_HYBRID 2.346→1.941).

New ablation numbers (2026-03-18):
FULL: 1.747 [1.48, 2.13] 83%
NO_HYBRID: 1.941 [1.65, 2.30] 58% Δ+0.194 (was +0.553 pre-Phase-3a.1)
BARE: 2.039 [1.72, 2.44] 54% Δ+0.292 (was +0.672)
NO_ENSEMBLE: 1.857 [1.60, 2.20] 71% Δ+0.110
NO_RIDGE: 1.747 +0.000 (dead code confirmed)

Updated: fig_ablation.py + .pdf/.png, supplementary_table_S2.tex, omega_paper.tex (abstract + ablation section + table + discussion), CLAUDE.md Key Decision #14, MEMORY.md ablation table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
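The Δ values in this message are each variant's score minus the FULL score. A minimal sketch of that arithmetic, using only the point estimates quoted above (the name `deltas_vs_full` is hypothetical, and the message does not name the metric, so the values are treated as plain scores):

```python
# Point estimates quoted in the commit message (2026-03-18 ablation).
scores = {
    "FULL": 1.747,
    "NO_HYBRID": 1.941,
    "BARE": 2.039,
    "NO_ENSEMBLE": 1.857,
    "NO_RIDGE": 1.747,
}


def deltas_vs_full(table: dict[str, float]) -> dict[str, float]:
    """Return each variant's score minus the FULL score, rounded to 3 places."""
    baseline = table["FULL"]
    return {
        name: round(value - baseline, 3)
        for name, value in table.items()
        if name != "FULL"
    }


deltas = deltas_vs_full(scores)
```

A delta of zero for NO_RIDGE is what the message reads as confirmation that the ridge step has no effect.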

… correction

## Summary

Comprehensive pipeline improvement across 10 tasks:

### Data Quality (Tasks 1, 8, 9)
- Fix 5 platinum reference Cmax values (diclofenac, posaconazole, lenacapavir, valganciclovir, pindolol): unit errors, route mismatch, extraction errors
- Add DDI-boosted flags for lopinavir and darunavir
- Add nucleoside 5'-ester SMARTS for molnupiravir prodrug detection
- Fix val-ester SMARTS false positive on penicillamine
- First AUC validation with MMPK (32 drugs): AAFE 3.205
- Lombardo/Obach VDss cross-validation (17 drugs): AAFE 3.71

### Pipeline Constants (Tasks 4, 5)
- Revert 4 Optuna constants to pre-Optuna defaults (gut_threshold 2.6, peff_min 0.5, pgp 0.5, gse 0.5); MMPK tuning doesn't generalize
- Verify ODE >> analytical for Cmax on clinical data (AAFE 2.41 vs 11.64)

### UQ System (Task 6)
- Recalibrate AdaptiveConformal: 68 clean drugs, k=30
  Coverage: 93.7% in-domain (was 97% over-wide), width 20.6x (was 4880x)
- Replace broken LHS AUC/t½ CI with Cmax q-value heuristic scaling

### VDss Fix (Task 10)
- Weighted geometric mean (XGB^0.7 × Berez^0.3) always applied for t½
  Core-24 AUC AAFE: 2.344 → 2.142 (-8.6%), Cmax unchanged

### Metrics & Documentation (Tasks 2, 3, 7)
- First Spearman ρ measurement: 0.9379 in-domain (excellent ranking)
- CLAUDE.md: revoke KD#3/#7/#14, add KD#32-41, update performance
- Holdout benchmark: in-domain stratification, DDI-boosted exclusion
- CYP3A4 classifier trained but deferred (AUROC 0.634, no holdout impact)

## Results
- Core-24 Cmax AAFE: 1.977 → 1.879 (-5.0%)
- Core-24 AUC AAFE: 2.344 → 2.142 (-8.6%)
- Holdout ALL: 2.780 → 2.440 (-12.2%)
- Holdout IN-DOMAIN: first measured = 1.966 (under 2.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Omega Dev and others added 6 commits February 28, 2026 03:36

ci: fix ruff formatting and add pytest-timeout dependency

ec4f0b9

- Run ruff format on 32 files left unformatted after M3 refactor
- Add pytest-timeout~=0.5 to dev extras so benchmark workflow --timeout=300 flag is recognized

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ci: re-run ruff format after lint auto-fixes

d496e2f

ruff check --fix modified imports, requiring a follow-up ruff format pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix: bump pytest-timeout to >=2.2 (compatible with pytest 8)

796c9e2

0.5 uses the removed __multicall__ hook API; 2.x is required for pytest 3+.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ci: install [api] extras in test/smoke steps

847d4f7

test_api.py imports fastapi at module level; without it the entire test collection fails and coverage drops to 19%. Installing [dev,api] restores normal collection and coverage.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
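A collection failure of this kind can be diagnosed before running the suite by checking which optional modules are importable in the environment. A minimal sketch using only the standard library (the helper name `missing_modules` is hypothetical, and it expects top-level module names):

```python
import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules named in this PR as coming from the optional extras.
optional = ["fastapi", "matplotlib"]
not_installed = missing_modules(optional)
```

An alternative to installing the extras is calling `pytest.importorskip("fastapi")` at the top of a test module, which skips that module instead of failing collection. This PR installs the extras so the tests actually run.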

ci: also install [viz] extras (matplotlib) in test/smoke steps

2791b9a

10 tests fail with 'No module named matplotlib'. Adding viz to the extras group fixes them alongside the api extras added previously.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

jam-sudo merged commit b8e54ba into main Feb 28, 2026
6 checks passed

jam-sudo merged 6 commits into main from feature/ml-first-refactor-m0-m3
