
feat: optional evaluate_batch hook on ga()/nsga2() for whole-population evaluation #20

Merged
CooperBigFoot merged 1 commit into main from step/ctrl-freak-batch-hook on Jun 27, 2026


@CooperBigFoot


Adds an optional `evaluate_batch: Callable[[ndarray], ndarray] | None = None` parameter to `ga()` and `nsga2()`. When supplied, it receives the whole `(n, n_params)` population matrix and returns objectives directly, bypassing the per-individual `lift`/`lift_parallel` loop. This enables a single vmapped/jit'd evaluation (e.g. a JAX model). When `None`, behavior is byte-identical to today (back-compat).
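As a rough illustration of the contract described above, the sketch below contrasts a per-individual objective with a batch objective of the shape `(n, n_params) -> (n,)`. It uses plain NumPy rather than JAX and does not call `ga()` or `nsga2()`; the `sphere` and `sphere_batch` functions are hypothetical examples, not part of ctrl_freak.

```python
import numpy as np


def sphere(x: np.ndarray) -> float:
    """Hypothetical per-individual objective: one (n_params,) vector in, one scalar out."""
    return float(np.sum(x**2))


def sphere_batch(pop: np.ndarray) -> np.ndarray:
    """Hypothetical batch objective: the whole (n, n_params) matrix in, (n,) objectives out."""
    return np.sum(pop**2, axis=1)


rng = np.random.default_rng(0)
pop = rng.uniform(-1.0, 1.0, size=(8, 3))  # 8 individuals, 3 parameters each

# Per-individual path: one Python-level call per row.
per_individual = np.array([sphere(row) for row in pop])

# Batch path: a single call on the full population matrix.
batched = sphere_batch(pop)
```

A callable shaped like `sphere_batch` is the kind of function the new parameter is meant to accept; a vmapped or jit-compiled model would play the same role.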

Used downstream by hydrologeez for batched GR6J calibration.

Gate: ruff format --check ✓ · ruff check ✓ · ty check ✓ · pytest 531 passed @ 98.91% cov ✓ · doctests ✓ · new equivalence tests (batch == per-individual) ✓. Version 0.2.0→0.2.1.

@CooperBigFoot


Adversarial review — APPROVE (safe to merge)

Verified against the committed code on `step/ctrl-freak-batch-hook`, not the description.

1. Hook correctness — PASS. When `evaluate_batch is not None`, `lifted_evaluate` is replaced by a closure calling `batch_fn(x)` directly (ga.py:172-176, nsga2.py:154-158). It does not wrap `lift`/`lift_parallel` — the per-individual loop and (for GA) `evaluate_array` are genuinely never constructed. Both call sites are covered in each algorithm: initial population (ga.py:183, nsga2.py:178) and offspring (ga.py:229, nsga2.py:217) all route through `lifted_evaluate`. GA's `ndim==1` reshape handles `(n,)`/`(n,1)`; NSGA-II passes the `(n,n_obj)` matrix straight through. Correct.

2. Back-compat — PASS (byte-identical). `evaluate_batch` is appended last with default `None`; the `else` branch reproduces the original assignment verbatim (`lift_parallel(..., n_workers) if n_workers != 1 else lift(...)`). Pre-existing test files are untouched (`git diff origin/main...HEAD -- tests/test_ga.py tests/test_nsga2.py tests/test_results.py tests/conftest.py operators/base.py` is empty). Full suite: 531 passed, 98.91% coverage.

3. Equivalence test non-vacuous — PASS. `_matches_per_individual` compares two independent runs on the same seed (per-individual reference vs batch) via `assert_array_equal` on x/objectives/fitness (or rank + crowding). The wiring is proven by the separate `_receives_full_matrix_and_bypasses_loop` tests: `forbidden_evaluate` raises `AssertionError` if the per-individual path is ever entered, and recorded shapes assert every call sees `(pop_size, n_params)`. Together these are non-tautological. All 6 tests execute (not skipped).

4. Conventions — PASS. `ruff format --check` clean (59 files), `ruff check` clean, `uv run ty check src/` (the CI-enforced gate, ci.yml:45) = "All checks passed!", doctests `--doctest-modules src/ctrl_freak` = 38 passed (incl. the two new evaluate_batch examples). Version 0.2.0→0.2.1 via single source (pyproject + uv.lock only); no tag at HEAD; no publish.

5. Scope — PASS. Exactly 5 files: ga.py, nsga2.py, pyproject.toml, uv.lock, tests/test_evaluate_batch.py. `.gitignore` not staged. CI green on py3.11/3.12/3.13 + numpy-floor.

No blockers found.
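For readers who want to see the shape of checks 1 and 3, here is a minimal, self-contained sketch of the pattern. It is not the code in `ga.py`, `nsga2.py`, or `tests/test_evaluate_batch.py`: `toy_optimize`, `sphere`, and `recording_batch` are hypothetical stand-ins, and the plain row loop stands in for `lift`. Only the dispatch idea (use the batch callable directly when it is supplied, otherwise loop per individual) and the test idea (a `forbidden_evaluate` that raises, plus recorded input shapes) are taken from the review.

```python
import numpy as np


def toy_optimize(evaluate, pop_size, n_params, n_generations, seed, evaluate_batch=None):
    """Hypothetical optimizer with an optional batch hook (not the ctrl_freak API)."""
    rng = np.random.default_rng(seed)

    if evaluate_batch is not None:
        # Batch path: call the user's function on the whole matrix.
        # The per-individual loop is never constructed.
        def lifted_evaluate(pop):
            return evaluate_batch(pop)
    else:
        # Per-individual path: stand-in for a `lift`-style row loop.
        def lifted_evaluate(pop):
            return np.array([evaluate(row) for row in pop])

    # Call site 1: initial population.
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, n_params))
    fitness = lifted_evaluate(pop)

    for _ in range(n_generations):
        # Call site 2: offspring.
        offspring = pop + rng.normal(scale=0.1, size=pop.shape)
        offspring_fitness = lifted_evaluate(offspring)
        better = offspring_fitness < fitness
        pop = np.where(better[:, None], offspring, pop)
        fitness = np.where(better, offspring_fitness, fitness)

    return pop, fitness


def sphere(x):
    return float(np.sum(x**2))


def forbidden_evaluate(x):
    # Fails loudly if the per-individual path is ever entered.
    raise AssertionError("per-individual path must not be entered")


seen_shapes = []


def recording_batch(pop):
    # Record what the hook receives; reuse the per-row arithmetic so the
    # results are bit-identical to the reference run.
    seen_shapes.append(pop.shape)
    return np.array([sphere(row) for row in pop])


# Reference run (per-individual) and batch run on the same seed.
ref_x, ref_f = toy_optimize(sphere, 6, 3, 4, seed=1)
bat_x, bat_f = toy_optimize(forbidden_evaluate, 6, 3, 4, seed=1,
                            evaluate_batch=recording_batch)

np.testing.assert_array_equal(ref_x, bat_x)
np.testing.assert_array_equal(ref_f, bat_f)
assert seen_shapes == [(6, 3)] * 5  # 1 initial evaluation + 4 generations
```

The two assertions are deliberately separate: equality of results alone would also hold if the hook were silently ignored, while the raising evaluator and the recorded shapes show the batch path was actually taken.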

@CooperBigFoot merged commit 5a69300 into main on Jun 27, 2026
4 checks passed
@CooperBigFoot deleted the step/ctrl-freak-batch-hook branch June 27, 2026 21:26