Mark slow tests and exclude by default for faster local iteration #201

Merged
igerber merged 1 commit into main from refactor/slow-test-markers
Mar 15, 2026

@igerber igerber commented Mar 15, 2026

Summary

  • Add @pytest.mark.slow to Sun-Abraham bootstrap tests (~696s), TROP rust-backend parity tests (~98s), and all TROP tests (module-level pytestmark)
  • Set addopts in pyproject.toml to exclude slow tests by default, reducing the local pytest runtime from ~17min to ~4min
  • Update CI workflows to pass -m '' so all tests still run in CI
  • Vectorize SA bootstrap resampling loop: pre-compute unit→row index mapping and replace Python for loop with np.repeat
  • Update CLAUDE.md to reflect new slow-test convention
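
For readers unfamiliar with the marker convention summarized above, here is a minimal sketch of the two ways a test can be tagged as slow. It is an illustration only, not the PR's actual test code: the test name is hypothetical, the addopts value mentioned in the comments is an assumption (the PR text does not quote it), and the two options are shown in one file only for brevity.

```python
import pytest


# Option 1: tag a single expensive test with the "slow" marker.
@pytest.mark.slow
def test_bootstrap_example():  # hypothetical test name, for illustration only
    assert sum(range(10)) == 45


# Option 2: tag every test collected from this module by assigning a
# module-level ``pytestmark`` (the approach the summary describes for the
# TROP tests). In practice this would live in its own test file.
pytestmark = pytest.mark.slow

# Assuming pyproject.toml sets addopts to something like "-m 'not slow'":
#   pytest          -> tests marked slow are deselected
#   pytest -m ''    -> the later, empty marker expression wins, so all tests run
```

Because a later -m on the command line replaces the one supplied through addopts, CI can opt back into the full suite without a separate configuration file.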

Methodology references (required if estimator / math changes)

  • Method name(s): N/A — no methodology changes
  • Paper / source link(s): N/A
  • Any intentional deviations from the source (and why): None. The SA bootstrap vectorization is a pure performance refactor producing identical results (same RNG path, same index selection, same unit ID assignment).
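
The equivalence claim above can be sketched as follows. This is a simplified reconstruction under stated assumptions, not the code in diff_diff/sun_abraham.py: all function and variable names are hypothetical, and the real bootstrap operates on a full panel rather than a bare array of unit IDs. It contrasts a per-draw boolean scan with a precomputed unit-to-row index plus np.repeat, and checks that both produce the same rows and the same new unit IDs when given the same seed.

```python
import numpy as np


def resample_loop(unit_ids, units, rng):
    """Baseline: one boolean scan over all rows for every sampled unit."""
    draw = rng.choice(len(units), size=len(units), replace=True)
    rows, new_ids = [], []
    for new_id, pos in enumerate(draw):
        idx = np.flatnonzero(unit_ids == units[pos])  # repeated scan
        rows.append(idx)
        new_ids.append(np.full(idx.size, new_id))
    return np.concatenate(rows), np.concatenate(new_ids)


def build_unit_index(unit_ids, units):
    """Computed once, outside the bootstrap loop: rows and row counts per unit."""
    row_index = [np.flatnonzero(unit_ids == u) for u in units]
    counts = np.array([idx.size for idx in row_index])
    return row_index, counts


def resample_vectorized(row_index, counts, rng):
    """Same draw as the baseline, but rows come from the precomputed index
    and the new unit IDs are assigned in a single np.repeat call."""
    n_units = len(row_index)
    draw = rng.choice(n_units, size=n_units, replace=True)
    rows = np.concatenate([row_index[pos] for pos in draw])
    new_ids = np.repeat(np.arange(n_units), counts[draw])
    return rows, new_ids


# Toy unbalanced panel: unit 10 has 2 rows, unit 20 has 2, unit 30 has 3.
unit_ids = np.array([10, 10, 20, 30, 30, 30, 20])
units = np.unique(unit_ids)
row_index, counts = build_unit_index(unit_ids, units)

# Identical seeds give identical draws, so the outputs should match exactly.
rows_a, ids_a = resample_loop(unit_ids, units, np.random.default_rng(0))
rows_b, ids_b = resample_vectorized(row_index, counts, np.random.default_rng(0))
assert np.array_equal(rows_a, rows_b)
assert np.array_equal(ids_a, ids_b)
```

Both versions make the same single rng.choice call per replicate, which is what keeps the RNG path unchanged in this sketch; only the row lookup and ID assignment differ.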

Validation

  • Tests added/updated: tests/test_sun_abraham.py, tests/test_rust_backend.py, tests/test_trop.py (marker additions only)
  • Full suite passes with pytest -m '' (1773 collected, 1710 passed, 63 skipped)
  • Default pytest run: 1656 collected, 117 slow deselected, all pass in ~4min

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

Add @pytest.mark.slow to Sun-Abraham bootstrap tests (~696s),
TROP parity tests (~98s), and all TROP tests. Set addopts to
exclude slow tests by default, reducing local test time from
~17min to ~4min. CI workflows updated to pass -m '' to run all
tests. Also vectorize SA bootstrap resampling loop (pre-compute
unit-to-row index mapping, replace Python loop with np.repeat).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@github-actions

Overall Assessment

✅ Looks good

Executive Summary

Methodology

Code Quality

Performance

  • Severity: None. Impact: Precomputing per-unit row indices/counts outside the bootstrap loop removes repeated boolean scans while preserving sampled row order in diff_diff/sun_abraham.py:L1003-L1022. Concrete fix: None.

Maintainability

Tech Debt

  • Severity: None. Impact: I did not identify a new deferrable issue that should have been added to TODO.md. Concrete fix: None.

Security

  • Severity: None. Impact: The diff changes test config, internal docs, and an in-memory bootstrap loop only; I did not see a new secret-handling or unsafe-command surface. Concrete fix: None.

Documentation/Tests

@igerber igerber merged commit 1b1fa84 into main Mar 15, 2026
11 checks passed
@igerber igerber deleted the refactor/slow-test-markers branch March 15, 2026 21:47