Process hygiene #1 — Add Hypothesis property-based tests for data-shape invariants

## Context

Part of the process-hygiene epic (parent issue will be linked once filed). This issue is the entry point because it has the most concrete acceptance criteria and directly prevents a recurrence of the Phase 4h.2 Part 2 root cause.

## Motivation

The 56-signal silent-drop in `compute/features/osap_replicate.py:91-140` (fixed in PR #124) had this structure:

```python
# Pre-Part-2 code assumed deciles universally
df = df[df["port"].isin([LONG_PORT_LABEL, SHORT_PORT_LABEL])]  # 01/10 only
# Quintile + tercile signals silently dropped
```

The tests passed because every example test used decile fixtures. **No test enumerated port cardinalities**. A property test like "for any port-count ∈ {2, 3, 5, 10}, the adapter produces an LS row" would have failed immediately.
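
The failure mode can be reproduced in a few lines. The sketch below is a toy stand-in, not the code in `osap_replicate.py`: the helper names (`port_labels`, `extreme_ports_hardcoded`, `extreme_ports_by_rank`, `has_long_short_pair`) and the zero-padded label scheme are assumptions made for illustration.

```python
# Toy reproduction of the silent drop. All names are illustrative,
# not the real osap_replicate.py API.

LONG_PORT_LABEL = "01"   # assumed zero-padded labels, as in the snippet above
SHORT_PORT_LABEL = "10"


def port_labels(port_count: int) -> list[str]:
    """Labels '01'..'NN' for a signal sorted into port_count portfolios."""
    return [f"{i:02d}" for i in range(1, port_count + 1)]


def extreme_ports_hardcoded(labels: list[str]) -> list[str]:
    """Pre-fix behaviour: keep only the decile extremes."""
    return [p for p in labels if p in (LONG_PORT_LABEL, SHORT_PORT_LABEL)]


def extreme_ports_by_rank(labels: list[str]) -> list[str]:
    """Cardinality-agnostic version: keep the lowest and highest label."""
    return [min(labels), max(labels)]


def has_long_short_pair(extremes: list[str]) -> bool:
    """A long-short row needs two distinct legs."""
    return len(set(extremes)) == 2


# Enumerating cardinalities exposes the bug that decile-only fixtures hid.
# Each value is (hardcoded filter yields a pair, rank-based filter yields a pair).
results = {
    n: (
        has_long_short_pair(extreme_ports_hardcoded(port_labels(n))),
        has_long_short_pair(extreme_ports_by_rank(port_labels(n))),
    )
    for n in (2, 3, 5, 10)
}
```

Under these assumptions only the decile case yields a long-short pair with the hardcoded filter, while the rank-based filter yields one for every cardinality. A decile-only fixture exercises exactly the one case that works.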

## In-scope work

### A. Add `hypothesis` to `[dev]` extra

`pyproject.toml`:
```toml
dev = [
    "pytest>=8.0",
    "ruff>=0.4",
    "hypothesis>=6.92",  # NEW — property-based test generator
]
```

### B. Property tests for `osap_replicate.py`

`tests/test_features/test_osap_replicate_properties.py` (NEW):

```python
from hypothesis import given, strategies as st

@given(
    port_count=st.integers(min_value=2, max_value=10),
    n_dates=st.integers(min_value=1, max_value=12),
)
def test_compute_long_short_returns_handles_any_port_cardinality(port_count, n_dates):
    """For any port cardinality ∈ [2, 10] and any positive date count,
    compute_long_short_returns produces exactly n_dates rows with
    ls_return = ret[port=min] - ret[port=max]."""
    ...
```

Cover at minimum:
1. `compute_long_short_returns` shape invariance across port cardinalities
2. `signals_dropped_no_long_short` returns sorted unique list
3. `_normalize_port_label` round-trips through int/str/categorical
4. Part 2 accounting invariant holds for any combination of manifest + dataset
5. `coverage_by_signal` returns values in `[0, 1]`
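
As a rough illustration of items 1 and 3, the sketch below runs two properties against hypothetical stand-ins. The bodies of `compute_long_short_returns` and `normalize_port_label` are assumptions, as is the list-of-dicts input (the real adapter presumably takes a DataFrame); only the Hypothesis calls are real API.

```python
from hypothesis import given, strategies as st


def normalize_port_label(raw) -> str:
    """Hypothetical stand-in: map 1, '1' and ' 01 ' to '01'."""
    return f"{int(str(raw).strip()):02d}"


def compute_long_short_returns(rows):
    """Hypothetical stand-in: one LS row per date, lowest port minus highest."""
    by_date = {}
    for row in rows:
        label = normalize_port_label(row["port"])
        by_date.setdefault(row["date"], {})[label] = row["ret"]
    out = []
    for date in sorted(by_date):
        ports = by_date[date]
        lo, hi = min(ports), max(ports)  # zero-padded labels sort numerically
        out.append({"date": date, "ls_return": ports[lo] - ports[hi]})
    return out


@st.composite
def port_panels(draw):
    """Draw (port_count, n_dates, rows) with one return per port per date."""
    port_count = draw(st.integers(min_value=2, max_value=10))
    n_dates = draw(st.integers(min_value=1, max_value=12))
    size = port_count * n_dates
    ret = st.floats(min_value=-1.0, max_value=1.0, allow_nan=False)
    rets = draw(st.lists(ret, min_size=size, max_size=size))
    rows = [
        {"date": d, "port": p, "ret": rets[d * port_count + (p - 1)]}
        for d in range(n_dates)
        for p in range(1, port_count + 1)
    ]
    return port_count, n_dates, rows


@given(panel=port_panels())
def test_one_ls_row_per_date(panel):
    port_count, n_dates, rows = panel
    result = compute_long_short_returns(rows)
    # Shape invariant: exactly one long-short row per date.
    assert len(result) == n_dates
    # Value invariant on the first date: lowest port minus highest port.
    first = {r["port"]: r["ret"] for r in rows if r["date"] == 0}
    assert result[0]["ls_return"] == first[1] - first[port_count]


@given(label=st.integers(min_value=1, max_value=10))
def test_normalize_round_trips_int_and_str(label):
    assert normalize_port_label(label) == normalize_port_label(str(label))
    assert int(normalize_port_label(label)) == label
```

Drawing the cardinality inside a composite strategy keeps the generated panel consistent with `port_count`, so a failure shrinks towards the smallest cardinality and date count that still break the invariant.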

### C. Property tests for other shape-sensitive transforms

Identify candidates by `grep -rn "hardcode\|assume\|fixed-shape" compute/`:

- `compute/scoring/composite.py::compute_composite` — PHASE3_WEIGHTS sum = 1.0 invariant under any pillar-score input
- `compute/valuation/ensemble.py::compute_fair_price` — median/min/max relationships
- `compute/scoring/risk_overlay.py::apply_risk_overlay` — annotate vs veto disjointness

Pick 2-3 high-value targets, not all of them. The goal is to ship the pattern + Hypothesis dependency, not to retrofit the whole codebase.
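
Two of these invariants might be phrased as below. This is only a sketch: `WEIGHTS` is a placeholder for `PHASE3_WEIGHTS` (whose real keys and values are not shown in this issue), `composite` stands in for `compute_composite`, and the median check stands in for whatever aggregation `compute_fair_price` actually performs.

```python
import math
import statistics

from hypothesis import given, strategies as st

# Placeholder weights; the real PHASE3_WEIGHTS keys and values are assumptions.
WEIGHTS = {"quality": 0.4, "value": 0.35, "momentum": 0.25}


def composite(scores: dict) -> float:
    """Hypothetical stand-in for compute_composite: a weighted sum."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)


score = st.floats(min_value=0.0, max_value=100.0, allow_nan=False)


@given(scores=st.fixed_dictionaries({k: score for k in WEIGHTS}))
def test_composite_is_convex_combination(scores):
    # Weights summing to 1.0 make the composite a convex combination,
    # so it can never leave the range spanned by the pillar scores.
    assert math.isclose(sum(WEIGHTS.values()), 1.0)
    result = composite(scores)
    tol = 1e-9  # slack for floating-point rounding
    assert min(scores.values()) - tol <= result <= max(scores.values()) + tol


price = st.floats(min_value=0.01, max_value=1e6, allow_nan=False)


@given(estimates=st.lists(price, min_size=1, max_size=8))
def test_median_between_min_and_max(estimates):
    # The median of any non-empty set of estimates lies between the extremes.
    mid = statistics.median(estimates)
    assert min(estimates) <= mid <= max(estimates)
```

Stating the weight invariant through its consequence (the composite stays inside the score range) checks the function's output, not just the constant.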

### D. CI integration

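Hypothesis draws examples from a random seed by default, so deterministic CI runs require `derandomize=True` in a settings profile. The per-example deadline (200 ms by default) is a separate setting, and these tests should not need a `@settings(deadline=None)` workaround. Ensure: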

- Property tests run as part of `pytest -m "not network"`
- Hypothesis database (`.hypothesis/`) added to `.gitignore`
- `hypothesis.errors.Flaky` failures must fail CI, not retry
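
One way to make CI runs reproducible is a settings profile in `tests/conftest.py`. The sketch below assumes that file location, the profile names `deterministic` and `dev`, and the `HYPOTHESIS_PROFILE` environment variable; none of these is confirmed for this repo.

```python
import os

from hypothesis import settings

# CI profile: seed each test from a hash of the test function so every run
# replays the same examples. The deadline stays at its default.
settings.register_profile("deterministic", derandomize=True, print_blob=True)

# Local profile: random exploration with a smaller example budget.
settings.register_profile("dev", max_examples=50)

# CI would export HYPOTHESIS_PROFILE=deterministic; local runs fall back to dev.
settings.load_profile(os.getenv("HYPOTHESIS_PROFILE", "dev"))
```

With the Hypothesis pytest plugin, `pytest --hypothesis-profile=deterministic` selects the same profile from the command line. For `Flaky` to fail the build, the CI job should not wrap the test step in a rerun or retry mechanism.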

## Out-of-scope

- Mutation testing (mutmut / cosmic-ray) — defer to follow-up if Hypothesis insufficient
- Property tests for `compute/ingest/*` — those exercise external APIs; @network marker handles them
- Stateful Hypothesis testing (`RuleBasedStateMachine`) — defer to integration-PR scope

## Acceptance criteria

- [ ] `hypothesis>=6.92` in `[dev]`
- [ ] ≥ 5 property tests for `osap_replicate.py` covering port-cardinality invariance
- [ ] ≥ 3 property tests across `composite.py` / `ensemble.py` / `risk_overlay.py`
- [ ] All property tests run in `pytest -m "not network"` and pass deterministically
- [ ] CI green
- [ ] CLAUDE.md `## Gotchas` updated noting Hypothesis as the new line of defense

## Effort estimate

~1 PR, ~200-300 LOC (mostly test code), 1-2 days.

## Why this is the highest-value first improvement

The Phase 4h.2 Part 2 fix took ~3 hours of audit + implementation + CI cycle. The silent-drop survived in production for ~1 cron cycle (≈1 week) before being detected. The cost of NOT having property tests this round was tangible. Once landed, the pattern guards against the entire class of "untested data-shape assumption" bugs, including instances that likely exist elsewhere in the codebase but have not surfaced yet.

— filed 2026-05-19 by Phase 4h.2 Part 2 auditor session
