Skip to content

Reduce TROP test time in pure Python CI#204

Merged
igerber merged 1 commit intomainfrom
reduce-python-ci-tests
Mar 16, 2026
Merged

Reduce TROP test time in pure Python CI#204
igerber merged 1 commit intomainfrom
reduce-python-ci-tests

Conversation

@igerber
Copy link
Owner

@igerber igerber commented Mar 15, 2026

Summary

  • Replace file-level pytestmark = pytest.mark.slow in test_trop.py with per-class @pytest.mark.slow on the 5 most expensive test classes (38 tests)
  • Remove -m '' override from pure Python CI job so the default not slow filter takes effect
  • 59 fast tests continue running in pure Python CI; all 97 tests still run in Rust CI

Methodology references (required if estimator / math changes)

  • N/A — no methodology changes

Validation

  • Tests added/updated: No new tests; verified collection counts match expectations:
    • pytest tests/test_trop.py --collect-only -q → 59 tests (fast subset)
    • pytest tests/test_trop.py --collect-only -q -m '' → 97 tests (all)
    • pytest tests/test_trop.py --collect-only -q -m slow → 38 tests (slow only)
  • DIFF_DIFF_BACKEND=python pytest tests/test_trop.py -q → 59 passed

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

Replace the file-level pytestmark = pytest.mark.slow with @pytest.mark.slow
on the 5 most expensive test classes (38 tests), keeping 59 fast tests
running in pure Python CI. Remove -m '' override from the CI workflow so
the default 'not slow' filter takes effect. The Rust CI job continues to
run all 97 tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link

Overall Assessment

✅ Looks good

Executive Summary

  • No estimator, weighting, variance/SE, identification, or default-behavior implementation changed. I cross-checked the affected methods against docs/methodology/REGISTRY.md, and this PR is confined to CI/docs/tests.
  • The test_trop.py marker split is mechanically consistent: the file-level slow mark is replaced by five class-level slow marks, leaving the rest of the TROP test surface runnable under the default not slow filter.
  • The main downside is coverage, not correctness: removing -m '' from the pure-Python workflow makes that job skip all slow tests, not just the re-marked TROP classes.
  • That broader skip removes pure-Python CI coverage for non-TROP inference checks in Callaway-Sant’Anna and Sun-Abraham, plus all end-to-end TROP method="global"/"joint" fit tests.
  • CLAUDE.md is slightly out of sync with the actual default collection behavior because it omits the slow Callaway SE verification test.
  • I could not run pytest in this environment because it is not installed, so the scope/count conclusions are from static inspection of markers and workflow config.

Methodology

Code Quality

  • No findings.

Performance

  • No findings. The workflow change should reduce pure-Python CI time as intended.

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • P2 Impact: Removing -m '' from .github/workflows/rust-test.yml:179 means the suite-wide default marker filter in pyproject.toml:77 now applies to the pure-Python fallback job. That skips every slow test there, not just the TROP classes changed in this PR. In practice, it drops pure-Python execution of the Callaway SE convergence test at tests/test_methodology_callaway.py:804 and the Sun-Abraham bootstrap suite at tests/test_sun_abraham.py:343. Because pure-Python mode also changes test scaling via tests/conftest.py:97, those fallback-specific inference paths are no longer covered in CI. Concrete fix: add a dedicated pure-Python step that explicitly runs the non-TROP slow inference tests with -m '', or scope the skip to TROP-only targets instead of relying on the global slow filter.
  • P2 Impact: The class-level @pytest.mark.slow on tests/test_trop.py:2721 suppresses all end-to-end pure-Python CI coverage for the public TROP.fit(..., method="global"/"joint") path. That also hides cheap contract tests such as tests/test_trop.py:2847, tests/test_trop.py:2852, and tests/test_trop.py:2866 along with the expensive optimization/bootstrap cases. The remaining fast joint-related coverage at tests/test_trop.py:2602 and tests/test_trop.py:3588 is low-level helper coverage, not fit() smoke coverage. Concrete fix: split TestTROPJointMethod into a fast smoke/API class and a slow optimization/bootstrap class, and keep at least one public method="global" fit test in the non-slow set.
  • P3 Impact: CLAUDE.md:125 now documents only TROP/Sun-Abraham/TROP-parity slow tests, but the default exclusion also covers the Callaway SE formula test at tests/test_methodology_callaway.py:804. Contributor guidance is therefore incomplete. Concrete fix: reword that paragraph to say “all @pytest.mark.slow tests are excluded by default” or explicitly add the Callaway test to the list.

@igerber igerber merged commit cfc4057 into main Mar 16, 2026
11 checks passed
@igerber igerber deleted the reduce-python-ci-tests branch March 16, 2026 00:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant