feat: honor PYAUTO_SMALL_DATASETS in Imaging.from_fits#301
Merged
Mask2D.circular and Grid2D.uniform already cap to (15, 15) at 0.6"/px under PYAUTO_SMALL_DATASETS=1, but Imaging.from_fits did not: it just loaded whatever was on disk. Any caller that paired from_fits(150x150 fixture) with Mask2D.circular(shape_native=dataset.shape_native) under the env var crashed with a (150,150) vs (15,15) broadcast error on apply_mask.

Add a center-crop hook in Imaging.from_fits that mirrors the existing caps: data and noise_map exceeding (15, 15) are center-cropped and pixel_scales is overridden to 0.6. The PSF is left alone (PSFs are usually already small and capping them changes shape semantics).

A new utility cap_array_2d_for_small_datasets in autoarray/util/dataset_util.py implements the cap and is reusable by other from_fits loaders in follow-up PRs.

No-op when env unset OR when on-disk shape is already at-or-below the cap, so the simulator -> from_fits round-trip is unchanged.

Closes Cluster E from the 2026-05-07 release-prep triage. The workspace-side env_vars.yaml override shipped earlier (PR #80 in autolens_workspace_test) becomes redundant after this lands but is left in place as belt-and-suspenders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
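For reviewers who want the crop arithmetic spelled out, here is a minimal standard-library sketch of a center-crop window. It is not the code in dataset_util.py: the names `CAP_SHAPE_NATIVE`, `CAP_PIXEL_SCALES`, `small_datasets_enabled` and `center_crop_slices` are hypothetical, and both the `== "1"` check and the per-axis handling of non-square inputs are assumptions.

```python
import os

# Cap values quoted in the commit message; these names are illustrative only.
CAP_SHAPE_NATIVE = (15, 15)
CAP_PIXEL_SCALES = 0.6


def small_datasets_enabled(environ=None):
    """True when PYAUTO_SMALL_DATASETS is "1" (assumed form of the check)."""
    environ = os.environ if environ is None else environ
    return environ.get("PYAUTO_SMALL_DATASETS") == "1"


def center_crop_slices(shape_native, cap=CAP_SHAPE_NATIVE):
    """Return one slice per axis selecting the central window of at most `cap`."""
    slices = []
    for size, limit in zip(shape_native, cap):
        keep = min(size, limit)      # axes already within the cap are kept whole
        start = (size - keep) // 2   # an odd leftover puts the extra pixel after the window
        slices.append(slice(start, start + keep))
    return tuple(slices)


rows, cols = center_crop_slices((150, 150))
print(rows, cols)  # slice(67, 82, None) slice(67, 82, None)
```

The returned tuple can index any 2D array-like that accepts slice tuples, so the same window could be applied to data and noise_map alike.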
Summary
PYAUTO_SMALL_DATASETS=1 is the smoke-test contract for "cap everything image-shaped to (15, 15) at 0.6"/px so smoke runs are fast and shape-consistent." It was honored by Mask2D.circular, Grid2D.uniform, OverSample, and dataset_util.should_simulate, but not by Imaging.from_fits, which loaded whatever shape was on disk. Any caller pairing from_fits(150×150 fixture) with Mask2D.circular(shape_native=dataset.shape_native) under the env var crashed with a (150,150) vs (15,15) broadcast error on apply_mask (most recently surfaced as Cluster E in the 2026-05-07 release-prep triage).

This PR closes that gap:

- Imaging.from_fits now center-crops loaded data and noise_map to (15, 15) at 0.6"/px when the env var is set and the input exceeds the cap. PSF is intentionally left alone. No-op when env unset OR when the on-disk shape is already at-or-below the cap, so the simulator → from_fits round-trip is unchanged.

API Changes
- autoarray.util.dataset_util.cap_array_2d_for_small_datasets(array_2d, pixel_scales) is a new helper that center-crops a 2D autoarray to (15, 15)/0.6 when PYAUTO_SMALL_DATASETS=1 and the input exceeds the cap. Returns inputs unchanged otherwise.
- Imaging.from_fits now calls this helper for data and noise_map. PSF unchanged. Signature, docstring, and default behaviour with the env var unset are unchanged; this is a behaviour change only under PYAUTO_SMALL_DATASETS=1.

See full details below.
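The helper contract described above (crop when enabled and above the cap, return inputs unchanged otherwise) can be illustrated with plain NumPy arrays. This is a hedged stand-in, not the autoarray implementation: `cap_sketch` and its `enabled` flag are hypothetical, the 0.1 starting pixel scale is an arbitrary example value, and `np.where` stands in for apply_mask purely to reproduce the shape mismatch.

```python
import numpy as np


def cap_sketch(array, pixel_scales, enabled, cap=(15, 15), cap_scale=0.6):
    """Center-crop `array` to at most `cap`; return inputs unchanged when not needed."""
    if not enabled or all(n <= c for n, c in zip(array.shape, cap)):
        return array, pixel_scales
    window = tuple(
        slice((n - min(n, c)) // 2, (n - min(n, c)) // 2 + min(n, c))
        for n, c in zip(array.shape, cap)
    )
    return array[window], (cap_scale, cap_scale)


data = np.arange(150 * 150, dtype=float).reshape(150, 150)  # uncapped fixture
mask = np.zeros((15, 15), dtype=bool)                       # mask already capped

# Combining a (15, 15) mask with (150, 150) data cannot broadcast.
try:
    np.where(mask, 0.0, data)
    mismatch = False
except ValueError:
    mismatch = True

# After capping, the shapes agree and the masking step goes through.
capped, scales = cap_sketch(data, (0.1, 0.1), enabled=True)
masked = np.where(mask, 0.0, capped)
```

Returning the original objects in the no-op branch, rather than copies, is one way to keep the env-unset path identical to the previous behaviour.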
Test Plan
- Tests in test_autoarray/util/test_dataset_util.py covering env-unset, env-set + already-at-cap, env-set + below-cap, and env-set + above-cap (square + non-square).
- Tests in test_autoarray/dataset/imaging/test_dataset.py covering Imaging.from_fits end-to-end with env set vs unset on a 30×30 FITS fixture.
- The test_autoarray/ suite passes (747/747).
- With PYAUTO_SMALL_DATASETS=1, autolens_workspace_test/scripts/multi/visualization_imaging.py loads dataset/multi/lens_sersic/g_data.fits (150×150). The resulting dataset has dataset.data.shape_native == (15, 15), dataset.psf.kernel.shape_native == (21, 21) (PSF preserved), and dataset.pixel_scales == (0.6, 0.6), and apply_mask succeeds, so the Cluster E reproducer no longer raises the broadcast error.

Full API Changes (for automation & release notes)
Added
- autoarray.util.dataset_util.cap_array_2d_for_small_datasets(array_2d, pixel_scales) -> (Array2D, pixel_scales): center-crops a 2D autoarray to (15, 15) and overrides pixel_scales to 0.6 when PYAUTO_SMALL_DATASETS=1 is set and the input exceeds the cap. No-op otherwise.
- autoarray.util.dataset_util.SMALL_DATASETS_SHAPE_NATIVE = (15, 15): module constant, the cap shape.
- autoarray.util.dataset_util.SMALL_DATASETS_PIXEL_SCALES = 0.6: module constant, the cap pixel_scales.

Changed Behaviour
- Imaging.from_fits now applies cap_array_2d_for_small_datasets to data and noise_map after loading. Under PYAUTO_SMALL_DATASETS=1 with a >15×15 on-disk fixture, the returned dataset will have data.shape_native == (15, 15) and pixel_scales == (0.6, 0.6). PSF is unchanged. With the env var unset, behaviour is identical to before.

Migration
None required. Existing callers see no change unless they were already running under PYAUTO_SMALL_DATASETS=1, in which case they previously crashed on apply_mask and now succeed silently with a capped dataset.

🤖 Generated with Claude Code