Ngmix v2.0 (CI mirror of #740)#741
Conversation
…to ngmix_update
- bin/ scripts were untracked, causing Docker build to fail - Fix license field to use SPDX string format (MIT) to resolve SetuptoolsDeprecationWarning Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…bugs when exp list is empty
Fix check on zero-valued pixels (most did not pass), now all vignets with at least one != 0 pixel pass.
Syncs the branch with develop to pull in the modernized CI (so the full test suite + example pipeline run on this PR) and resolve drift. Only conflict was uv.lock, regenerated with `uv lock` against the auto-merged pyproject.toml (ngmix 2.4.0 source preserved). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
prepare_ngmix_weights drew the noise image and masked-pixel fill from the unseeded global numpy.random, defeating the per-tile self._rng = RandomState(seed) the module sets for reproducibility. Same tile + same seed gave different shears (observed max|dg1| ~ 3e-5). Thread rng through make_ngmix_observation -> prepare_ngmix_weights and draw via rng.standard_normal. Adds test_metacal_is_reproducible_with_fixed_seed (same seed -> identical noshear g); fails before this fix, passes after. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
test_imports.py walks the shapepipe package, but standalone scripts/ files are outside it and never get imported. This extends import-smoke coverage to the validation scripts added with ngmix v2.0. Currently fails on centroid_bias.py, which imports get_guess — a helper removed in the module rewrite; that script's status (restore get_guess / fix import / relocate as legacy) is a developer decision. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
centroid_bias.py is the old-API (ngmix 1.x) bias-validation script: it imports get_guess, a helper the v2.0 rewrite removed (v2.0 seeds fits via ngmix's own guessers). It is only ever run against the test_centroid_bug worktree — the centroid_bug/centroid_fix cases invoke it as SHAPEPIPE_PATH/TEST_SCRIPT with SHAPEPIPE_PATH pointing at that worktree (see run_bias_test.sh), which has its own copy. The duplicate on this branch was never executed and only broke import-smoke. centroid_bias_v2.py supersedes it for ngmix_v2.0. Drops it from the scripts import-smoke list accordingly. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
cailmdaley
left a comment
There was a problem hiding this comment.
Review of the v2.0 work — part 1 of 2, scoped to the few cut-and-dry items we could fix and test outright. I've committed these here on the CI mirror so the suite proves them (green: 260 passed); inline notes point at each.
A second pass with the more substantive, science-level questions (the *_psfo columns now reporting the metacal-reconvolved PSF rather than the original PSFEx/MCCD PSF; the weight-map normalization change; the r50/T column units; a few others) will follow once we've talked them through — those are judgement calls I don't want to assert unilaterally.
Context for this mirror PR is in the description above. Drafted by Claude, directed by me through the review.
— Claude on behalf of Cail
| gal_guess_flag = True | ||
| wsum = 0 | ||
| for n_e in range(n_epoch): | ||
| noise_img = rng.standard_normal(gal.shape) * sig_noise |
There was a problem hiding this comment.
Reproducibility fix (done in this branch). The module sets self._rng = RandomState(seed) per tile precisely so a rerun reproduces — but prepare_ngmix_weights was drawing the noise image and the masked-pixel fill from the unseeded global np.random.randn. Net effect: the same tile with the same seed produced different shears (observed max|Δg1| ≈ 3e-5 between two identical-seed runs). Fixed by threading rng through make_ngmix_observation → prepare_ngmix_weights and drawing via rng.standard_normal. Guarded by the new test_metacal_is_reproducible_with_fixed_seed — fails before this change, passes after.
— Claude on behalf of Cail
There was a problem hiding this comment.
Good. This probably was previous code before we changed to a random seed per tile.
| return np.asarray(res["noshear"]["g"]) | ||
|
|
||
|
|
||
| def test_metacal_is_reproducible_with_fixed_seed(): |
There was a problem hiding this comment.
New property test for the reproducibility fix: same seed → identical noshear g. Red on the pre-fix code (global-RNG leak), green after threading rng. Kept cheap (51 px, 2 epochs) so it runs comfortably in the suite.
|
|
||
|
|
||
| @pytest.mark.parametrize("relpath", VALIDATION_SCRIPTS) | ||
| def test_validation_script_imports_cleanly(relpath): |
There was a problem hiding this comment.
New smoke test + a removal it triggered. tests/unit/test_imports.py walks the shapepipe package, but standalone scripts/ files live outside it and were never import-checked. This extends import-smoke into scripts/, and it immediately caught that the v1 scripts/validation/centroid/centroid_bias.py imported get_guess — a helper the v2.0 rewrite removed (v2.0 seeds fits via ngmix's own guessers; the centroid piece survives inline in make_ngmix_observation). That v1 script is the old-API bias repro and is only ever run against the test_centroid_bug worktree (run_bias_test.sh invokes ${SHAPEPIPE_PATH}/${TEST_SCRIPT}), so the copy on this branch was dead and only broke import. We removed the stranded copy here; centroid_bias_v2.py supersedes it for v2.0. If you'd rather keep a v1 copy on this branch, it'd need get_guess vendored in.
— Claude on behalf of Cail
cailmdaley
left a comment
There was a problem hiding this comment.
Review — part 2 of 2 (methodology + remaining items)
Provenance: this review was generated by Claude, working on candide against the actual module code; I haven't hand-edited it line by line, so please read it with that lens. The empirical checks below were genuinely run; the inline questions are real intent questions, not assertions that something is broken.
Part 1 covered the cut-and-dry items we fixed and tested directly (the RNG reproducibility fix + its property test, the import smoke-test and the get_guess/centroid_bias.py removal it surfaced). This part is the broader read: what we verified empirically, plus methodology and housekeeping. The code-specific points are left as inline comments on the relevant lines.
What we verified empirically ✅
We installed ngmix 2.4.0 (from this PR's own esheldon/ngmix@v2.4.0 source) over the container stack and ran the actual module code:
- All 13 ngmix-2.4 symbols the module uses resolve; the metacal/bootstrap path runs end-to-end.
- On 200 sub-pixel-shifted galsim galaxies,
do_ngmix_metacalrecovers shear with m = +2×10⁻⁴ ± 3×10⁻⁴, c₂ = −1×10⁻⁵ (both consistent with zero), R₁₁≈R₂₂≈0.924, 200/200 OK. Yourcentroid_bias_v2.py(noiseless) gives m = +3.7×10⁻⁴. - The centroid fix (HSM re-centering of the galaxy Jacobian in
make_ngmix_observation) runs clean and is benign in the high-S/N regime we tested. We didn't try to reproduce the bias it targets — Cail notes that lives in the old ngmix-1.x path, so current code can't exhibit it. Just confirming the new fix doesn't introduce anything.
The 2.x API is used correctly, the fitter is well-behaved, and the CI suite is green on this branch.
Methodology questions (for you / Lucy)
Five places where v2.0 behaviour differs from the validated v1 in ways that may be intentional but aren't documented in the diff — flagged inline so the choice is explicit:
- weight-map normalization (inline at
prepare_ngmix_weights) *_psfocolumns now carrying the metacal-reconvolved PSF, and per-type PSF columns collapsing (inline ataverage_multiepoch_psf)- the galaxy zero-pixel guard loosening
any→all(inline atprepare_postage_stamps) - the PSF fit reusing the galaxy prior +
FLUX_AUTOflux guess (inline atdo_ngmix_metacal)
Checklist / housekeeping
closes #links — none on the PR. Axel's centroid-shift PR #725 overlaps this; worth an explicit cross-ref on whether this supersedes it.- Docs — no
docs/changes, and the ngmix-pin note inCLAUDE.md("don't modernize this dependency line") is now stale, since this PR is the modernization. Worth updating on/with merge. pyproject.toml— ngmix is pinned to a git tag (esheldon/ngmix@v2.4.0). If 2.4.0 is on PyPI,ngmix==2.4.0would be more reproducible and drop the git dependency — worth a check.- Description — for a +3894/−3410, 67-file PR bumping a core dependency and rewriting the shape module, a fuller description (especially the centroid-bias result) would help future readers.
A couple of pre-existing cleanups (not changed here, low priority)
In save_results: the error string ~ngmix.py:547 is non-f, so it prints literal {ext_name}; there's leftover print(...) debug just below it; and the np.append-per-batch FITS write is O(n²) over a tile. Worth a sweep if you're in there.
Happy to push fixes for any of the cut-and-dry ones (cleanups, the CHECK_EXISTING_DIR plumbing, the runner decorators) — just say which.
— Claude on behalf of Cail
| self._zero_point = zero_point | ||
| self._pixel_scale = pixel_scale | ||
|
|
||
| # MKDEBUG check whether still used |
There was a problem hiding this comment.
CHECK_EXISTING_DIR resume looks silently dropped. This # MKDEBUG check whether still used is on the money: the new process() never reads self._check_existing_dir and never calls get_last_id (both now dead). So a configured resume reprocesses from scratch and overwrites, rather than skipping already-done objects — a documented feature that now no-ops. Either rewire the resume path or remove the option + get_last_id so it can't mislead. (And the # MKDEBUG marker can go.)
There was a problem hiding this comment.
This was a hack to resume interrupted ngmix runs, bookkeeping done in scripts outside modules. Can be removed now, and if needed again in the future implemented better within the module.
| output_dict[name]["Tpsf"].append(results[idx]["T_PSFo"]) | ||
| output_dict[name]["g1_psf"].append(results[idx]["g_PSFo"][0]) | ||
| output_dict[name]["g2_psf"].append(results[idx]["g_PSFo"][1]) | ||
| output_dict[name]['r50'].append(results[idx][name]['pars'][4]) |
There was a problem hiding this comment.
r50 column naming. For a gauss model pars[4] is T (≈2σ², arcsec²) — the same value already stored under ['T'] a few lines down. And r50_PSFo = sqrt(T_psf/2) (line ~676) is the Gaussian σ, not the half-light radius (true r50 = 1.1774·σ). sp_validation consumes these size columns downstream, so it matters what they actually contain. Is r50 intentional shorthand for "a size," or should these be genuine half-light radii?
There was a problem hiding this comment.
The intend is to resolve the existing confusion about 'T', which in the DES papers is the area measure 2 * sigma^2, but in our ngmix implementation was the half-light radius, so the name was confusing. This was set by guess_size_type="sigma" in the earlier version in the call to get_guess(), which does not exist anymore.
We need to check that pars[4] is indeed 'T', and that r50 should be 1.1774·σ and not just σ. Would be good to not hard-code these transformations. There exist some functions in sp_validation:galaxy.py, we would have to move them to e.g. cs_util.
| ) | ||
|
|
||
| error_msg = hsm_shape.error_message | ||
| if np.all(gal_vign == 0): |
There was a problem hiding this comment.
Zero-pixel guard loosened any→all. v1 skipped an epoch if any pixel was exactly 0; this now skips only if np.all(gal_vign == 0). Partial CCD-edge stamps (a band of exact-zero pixels) will now enter the fit and rely on flags/weights to catch them. Intended loosening, or should the partial-zero case still be excluded?
There was a problem hiding this comment.
This change was added in commit #d01503d0: the any == 0 test had pass close to no postage stamps. I though this was too restrictive.
| flag_vign[np.where(tile_vign == -1e30)] = 2**10 | ||
| v_flag_tmp = flag_vign.ravel() | ||
| # remove objects that are more than 1/3 masked | ||
| if len(np.where(v_flag_tmp != 0)[0]) / (51 * 51) > 1 / 3.0: |
There was a problem hiding this comment.
Hardcoded 51 * 51 denominator. The masked-fraction cut divides by a literal 51 * 51. If the stamp size ever becomes configurable (which is on the near-term roadmap) this silently goes wrong. v_flag_tmp.size would track the actual stamp.
There was a problem hiding this comment.
yes would be good to not hard-code this.
There was a problem hiding this comment.
We have in other module's config files the entries
STAMP_SIZE = 51
or VIGNET_SIZE = 51
|
|
||
| if n_epoch == 0: | ||
| raise ValueError("0 epoch to process") | ||
| sig_noise = sigma_mad(gal) |
There was a problem hiding this comment.
Weight-map normalization changed vs v1. The noise estimate sigma_mad(gal) here is taken over the whole object-containing stamp (flux-contaminated), where v1 used a windowed, object-free estimate (get_noise). And the returned weight keeps the per-pixel (FSCALE-rescaled) map and gets multiplied by 1/sig_noise² — which reads as a double inverse-variance weighting. Between the two, this shifts the fit's χ² weighting, the reported errors/s2n, and the PSF-averaging weights relative to the validated v1. Is this the intended new weighting?
There was a problem hiding this comment.
There is this discussion about noise and weight maps: #604
| } | ||
|
|
||
| Tguess = np.mean(T_guess_psf) | ||
| def do_ngmix_metacal(stamp, prior, flux_guess, rng): |
There was a problem hiding this comment.
PSF fit reuses the galaxy joint prior and the galaxy FLUX_AUTO as the PSF flux guess. This mirrors the upstream ngmix example's prior reuse, but a galaxy-flux guess for the PSF fit is a little unusual and could touch PSF-fit convergence/quality. Lower-confidence flag — worth a sanity check that PSF fits converge as expected.
| if ccd_nb < 18 or ccd_nb in [36, 37]: | ||
| # swap x axis so origin is on top-right | ||
| return np.rot90(vign, k=2) | ||
| print('rotating megapipe image') |
There was a problem hiding this comment.
Cleanup: this print('rotating megapipe image') is after the return, so it's unreachable — can be dropped.
| runner=runner, | ||
| psf_runner=psf_runner, | ||
| ignore_failed_psf=True, | ||
| rng=rng, |
There was a problem hiding this comment.
Cleanup: sextractor_e1e2 looks unused here, and its docstring appears copy-pasted from another function (doesn't describe this one). Remove, or fix the docstring if it's kept.
| # When post-processing is enabled the sqlite WCS file is the last input; | ||
| # remove it before passing the image files to SExtractorCaller. | ||
| if config.getboolean(module_config_sec, "MAKE_POST_PROCESS"): | ||
| f_wcs_path = input_file_list[-1] |
There was a problem hiding this comment.
Runner input-contract now positional, but the decorator wasn't updated. This pulls the WCS path from input_file_list[-1] — the WCS moved from a named config option into the positional input list. The same positional move happened in read_ext_sexcat_runner, psfex_interp_runner, and vignetmaker_runner, but (unlike ngmix_runner) their @module_runner decorators weren't updated to match. The shipped v2.0 example configs are correct, so this isn't a live bug — but the real input contract now lives entirely in each config's FILE_PATTERN/FILE_EXT ordering, so a hand-written or v1-style config would silently mis-map inputs or IndexError with nothing to catch it. ngmix_runner is the model — worth extending these four decorators to match so the contract is self-documenting.
| @@ -0,0 +1,265 @@ | |||
| """ | |||
There was a problem hiding this comment.
Is this meant to ship? This reads as a near-verbatim copy of esheldon's fitting_bd_empsf.py ngmix example (bulge+disk fracdev, espy/images refs in the docstring) and doesn't look connected to the pipeline. It also redefines its own make_data/get_prior, duplicating the new shapepipe.testing.simulate helpers added in this same PR (whose point is to be the single stable-across-branches source). Suggest dropping it, or trimming it to import the canonical helpers. scripts/jupyter/test_centroid_shift.py has the same duplication and could likewise import simulate.make_data.
Next steps — triage of the v2.0 reviewProvenance: like the two-part review above, this summary was generated by Claude working on candide against the current branch head; I haven't hand-edited it line by line, so please read it with that lens. The re-review was genuinely run — every finding below was re-checked against the code as it stands on this branch. A follow-up now that #740 is closed and everything lives here on #741. I re-read the ngmix module and the four runners against the current head: no commits have landed since the part-2 review, so all eleven line-level findings still stand exactly where they were anchored — nothing was silently addressed or drifted. #741 is green and mergeable; the question is just which findings to land before merge and which can wait. The useful cut isn't open-vs-resolved (they're all open) — it's what unblocks each one: A · Cut-and-dry — mechanical, no science (one cleanup commit)
Happy to push these as a single commit here if that's useful — they keep CI green and narrow the review surface to the science. B · Decisions — need your / Lucy's intentThree are light (a one-line "intended" or a small fix): zero-pixel guard Two genuinely change the meaning of output columns and are worth a real answer:
C · Resume feature
Suggested sequence to merge
Honest read: CI is green and nothing here breaks the suite, so #741 can merge today — but the weight-norm and — Claude on behalf of Cail |
Mechanical cleanups from the #741 review that Martin greenlit / are unambiguous — no behaviour change on current configs: - ngmix.py: remove unreachable print() after return in MegaCamFlip (293) - ngmix.py: remove unused sextractor_e1e2() (copy-pasted docstring, no callers anywhere in src/) (1140) - ngmix.py + ngmix_runner.py: remove the dead CHECK_EXISTING_DIR resume path (_check_existing_dir set but never read; get_last_id defined but never called) and the stale '# MKDEBUG check whether still used' marker on _f_wcs_path (which IS used). Martin: 'a hack to resume interrupted runs ... can be removed now.' (254/257/464) - ngmix.py: masked-fraction cut divides by v_flag_tmp.size instead of the literal 51*51 — identical for 51x51 stamps, auto-tracks any future stamp size (766) - scripts/python/fitting.py: remove — verbatim copy of esheldon's ngmix example, no callers, duplicates testing/simulate.py Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What this is
A same-repo mirror of #740 (@martinkilbinger's "Ngmix v2.0"), pushed to a branch on
CosmoStat/shapepipeso that CI actually runs. All 57 commits are authored by Martin (and carry Lucy Baumont's and Axel Guinot's work) — pushing the branch preserves that authorship unchanged; the only thing this PR adds is a same-repo head so GitHub Actions fires thepull_requestworkflow without the fork-PR approval gate. #740 received no CI runs at all for this reason.Substance is identical to #740 — see that PR for the full description. In short: upgrade ngmix to 2.4.0 and adopt Lucy's new ngmix classes/interface; overhaul the shape-measurement module; centroid-bias fix + validation; v2.0 production-run plumbing.
Going forward, opening PRs directly on
CosmoStat/shapepipe(rather than from a fork) avoids this — fork PRs don't trigger our Docker-image CI without a maintainer approval that wasn't happening.Closes/supersedes #740 once CI is green (leaving that call to Martin).
Review
A detailed review is on its way (read against Martin's checklist plus a science-quality pass). Headline from exercising the new fitter against real ngmix 2.4.0 on candide: the metacal path runs end-to-end and shear recovery is unbiased at the few×10⁻⁴ level in m. Full notes to follow.
— Claude on behalf of Cail