feat: optional evaluate_batch hook on ga()/nsga2() for whole-population evaluation by CooperBigFoot · Pull Request #20 · hydrosolutions/ctrl-freak

CooperBigFoot · 2026-06-27T21:22:24Z

Adds an optional `evaluate_batch: Callable[[ndarray], ndarray] | None = None` parameter to `ga()` and `nsga2()`. When supplied, it receives the whole `(n, n_params)` population matrix and returns the objectives directly, bypassing the per-individual `lift`/`lift_parallel` loop. This enables a single vmapped/jitted evaluation (e.g. a JAX model). When `None`, behavior is byte-identical to today (back-compat).
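To illustrate the shape contract described above, here is a minimal NumPy sketch of a per-individual objective next to a whole-population equivalent. The function names (`sphere_single`, `sphere_batch`) and the sphere objective are hypothetical illustrations, not part of ctrl-freak; only the `(n, n_params)` input shape and the `evaluate_batch` parameter name come from this PR.

```python
import numpy as np


def sphere_single(x: np.ndarray) -> float:
    """Per-individual objective: sum of squares of one parameter vector."""
    return float(np.sum(x**2))


def sphere_batch(population: np.ndarray) -> np.ndarray:
    """Whole-population objective: one value per row of an (n, n_params) matrix."""
    return np.sum(population**2, axis=1)


# A small random population: 8 individuals, 3 parameters each.
rng = np.random.default_rng(0)
population = rng.uniform(-1.0, 1.0, size=(8, 3))

# The per-individual loop and the single batched call should agree.
per_individual = np.array([sphere_single(row) for row in population])
batched = sphere_batch(population)
```

A function shaped like `sphere_batch` is what would be passed as `evaluate_batch=...`; the remaining arguments to `ga()`/`nsga2()` are not shown here because the description does not list them.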

Used downstream by hydrologeez for batched GR6J calibration.

Gate: ruff format --check ✓ · ruff check ✓ · ty check ✓ · pytest 531 passed @ 98.91% cov ✓ · doctests ✓ · new equivalence tests (batch == per-individual) ✓. Version 0.2.0→0.2.1.


CooperBigFoot · 2026-06-27T21:25:24Z

Adversarial review — APPROVE (safe to merge)

Verified against the committed code on `step/ctrl-freak-batch-hook`, not the description.

1. Hook correctness — PASS. When `evaluate_batch is not None`, `lifted_evaluate` is replaced by a closure calling `batch_fn(x)` directly (ga.py:172-176, nsga2.py:154-158). It does not wrap `lift`/`lift_parallel`; the per-individual loop and (for GA) `evaluate_array` are genuinely never constructed. Both call sites in each algorithm are covered: initial population (ga.py:183, nsga2.py:178) and offspring (ga.py:229, nsga2.py:217) route through `lifted_evaluate`. GA's `ndim==1` reshape handles `(n,)`/`(n,1)`; NSGA-II passes the `(n,n_obj)` matrix straight through. Correct.

2. Back-compat — PASS (byte-identical). `evaluate_batch` is appended last with default `None`; the `else` branch reproduces the original assignment verbatim (`lift_parallel(..., n_workers) if n_workers != 1 else lift(...)`). Pre-existing test files are untouched (`git diff origin/main...HEAD -- tests/test_ga.py tests/test_nsga2.py tests/test_results.py tests/conftest.py operators/base.py` is empty). Full suite: 531 passed, 98.91% coverage.

3. Equivalence test non-vacuous — PASS. `_matches_per_individual` compares two independent runs on the same seed (per-individual reference vs batch) via `assert_array_equal` on x/objectives/fitness (or rank + crowding). The wiring is proven by the separate `_receives_full_matrix_and_bypasses_loop` tests: `forbidden_evaluate` raises `AssertionError` if the per-individual path is ever entered, and the recorded shapes assert that every call sees `(pop_size, n_params)`. Together these are non-tautological. All 6 tests execute (not skipped).

4. Conventions — PASS. `ruff format --check` clean (59 files), `ruff check` clean, `uv run ty check src/` (the CI-enforced gate, ci.yml:45) = "All checks passed!", doctests `--doctest-modules src/ctrl_freak` = 38 passed (incl. the two new evaluate_batch examples). Version 0.2.0→0.2.1 via single source (pyproject + uv.lock only); no tag at HEAD; no publish.

5. Scope — PASS. Exactly 5 files: ga.py, nsga2.py, pyproject.toml, uv.lock, tests/test_evaluate_batch.py. `.gitignore` not staged. CI green on py3.11/3.12/3.13 + numpy-floor.
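The dispatch from item 1 and the bypass check from item 3 can be sketched together as follows. This is a simplified stand-in, not the committed code: `lift` here is a local toy, `make_lifted_evaluate` and `recording_batch` are hypothetical names, and the `n_workers`/`lift_parallel` branch is left out.

```python
import numpy as np


def lift(evaluate):
    """Toy stand-in for a per-individual lift: apply `evaluate` row by row."""
    def lifted(x):
        return np.array([evaluate(row) for row in x])
    return lifted


def make_lifted_evaluate(evaluate, evaluate_batch=None):
    """Pick the evaluation path (hypothetical helper, for illustration only)."""
    if evaluate_batch is not None:
        batch_fn = evaluate_batch

        def lifted_evaluate(x):
            # Batch path: the whole matrix goes straight to the user hook.
            return batch_fn(x)

        return lifted_evaluate
    # Fallback path: per-individual loop (parallel variant omitted here).
    return lift(evaluate)


def forbidden_evaluate(row):
    """Fails loudly if the per-individual path is ever entered."""
    raise AssertionError("per-individual path entered")


seen_shapes = []


def recording_batch(x):
    """Batch hook that records the shape of every matrix it receives."""
    seen_shapes.append(x.shape)
    return np.sum(x**2, axis=1)


pop_size, n_params = 6, 2
population = np.random.default_rng(42).normal(size=(pop_size, n_params))

# Batch run: must never touch forbidden_evaluate.
batch_eval = make_lifted_evaluate(forbidden_evaluate, recording_batch)
batch_objectives = batch_eval(population)

# Reference run: same objective, evaluated one individual at a time.
reference_eval = make_lifted_evaluate(lambda row: float(np.sum(row**2)))
reference_objectives = reference_eval(population)
```

The two checks are complementary: the recorded shapes and `forbidden_evaluate` show that the hook is actually wired in, while comparing `batch_objectives` against `reference_objectives` shows that both paths produce the same values.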

No blockers found.

feat: optional evaluate_batch hook on ga()/nsga2() for whole-populati…

a3bc1c3

…on evaluation

CooperBigFoot merged commit 5a69300 into main Jun 27, 2026
4 checks passed

CooperBigFoot deleted the step/ctrl-freak-batch-hook branch June 27, 2026 21:26
