
feat: honor PYAUTO_SMALL_DATASETS in Imaging.from_fits #301

Merged
Jammy2211 merged 1 commit into main from feature/imaging-from-fits-small-datasets-cap
May 8, 2026

@Jammy2211
Collaborator

Summary

PYAUTO_SMALL_DATASETS=1 is the smoke-test contract: cap everything image-shaped to (15, 15) at 0.6"/px so that smoke runs are fast and shape-consistent. It was honored by Mask2D.circular, Grid2D.uniform, OverSample, and dataset_util.should_simulate, but not by Imaging.from_fits, which loaded whatever shape was on disk. Any caller pairing from_fits on a 150×150 fixture with Mask2D.circular(shape_native=dataset.shape_native) under the env var crashed with a (150,150) vs (15,15) broadcast error on apply_mask (most recently surfaced as Cluster E in the 2026-05-07 release-prep triage).
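The shape mismatch can be illustrated with plain NumPy. This is only a sketch of the failure mode, not autoarray's apply_mask: the arrays below are hypothetical stand-ins for the uncapped data loaded from disk and the capped mask.

```python
import numpy as np

# Stand-in for data loaded by from_fits before this PR (shape taken from disk).
data = np.ones((150, 150))

# Stand-in for a mask that was already capped to (15, 15) under the env var.
mask = np.zeros((15, 15), dtype=bool)

try:
    # Combining the two element-wise requires broadcast-compatible shapes.
    masked = data * mask
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Neither 150 nor 15 is 1, so NumPy cannot broadcast one axis onto the other and raises a ValueError.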

This PR closes that gap: Imaging.from_fits now center-crops loaded data and noise_map to (15, 15) at 0.6"/px when the env var is set and the input exceeds the cap. PSF is intentionally left alone. No-op when env unset OR when the on-disk shape is already at-or-below the cap, so the simulator → from_fits round-trip is unchanged.

API Changes

  • New public utility autoarray.util.dataset_util.cap_array_2d_for_small_datasets(array_2d, pixel_scales) that center-crops a 2D autoarray to (15, 15)/0.6 when PYAUTO_SMALL_DATASETS=1 and the input exceeds the cap. Returns inputs unchanged otherwise.
  • Imaging.from_fits now calls this helper for data and noise_map. PSF unchanged. Signature, docstring, and default behaviour with the env var unset are unchanged — this is a behaviour change only under PYAUTO_SMALL_DATASETS=1.
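As a rough illustration of the center-crop described above, here is a minimal NumPy sketch. It is not the autoarray implementation: `cap_array_sketch` is a hypothetical name, it operates on a bare `ndarray` rather than an Array2D, and it assumes that each axis is cropped independently and that the overridden pixel_scales are returned as a `(0.6, 0.6)` tuple.

```python
import os

import numpy as np

# Values quoted in this PR; the names mirror the new module constants.
SMALL_DATASETS_SHAPE_NATIVE = (15, 15)
SMALL_DATASETS_PIXEL_SCALES = 0.6


def cap_array_sketch(array_2d, pixel_scales):
    """Hypothetical NumPy stand-in for cap_array_2d_for_small_datasets."""
    # No-op when the env var is not set to "1".
    if os.environ.get("PYAUTO_SMALL_DATASETS") != "1":
        return array_2d, pixel_scales

    cap_y, cap_x = SMALL_DATASETS_SHAPE_NATIVE
    size_y, size_x = array_2d.shape

    # No-op when the input is already at or below the cap.
    if size_y <= cap_y and size_x <= cap_x:
        return array_2d, pixel_scales

    # Assumption: crop each axis independently and never pad.
    new_y, new_x = min(size_y, cap_y), min(size_x, cap_x)
    start_y = (size_y - new_y) // 2
    start_x = (size_x - new_x) // 2
    cropped = array_2d[start_y:start_y + new_y, start_x:start_x + new_x]

    # Assumption: the override is returned as a (y, x) tuple.
    scales = (SMALL_DATASETS_PIXEL_SCALES, SMALL_DATASETS_PIXEL_SCALES)
    return cropped, scales


os.environ["PYAUTO_SMALL_DATASETS"] = "1"
data = np.arange(150 * 150, dtype=float).reshape(150, 150)
capped, scales = cap_array_sketch(data, (0.1, 0.1))
print(capped.shape, scales)
```

Returning the inputs untouched in both no-op branches is what keeps the simulator to from_fits round-trip unchanged in this sketch.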

See full details below.

Test Plan

  • 5 new unit tests in test_autoarray/util/test_dataset_util.py covering env-unset, env-set + already-at-cap, env-set + below-cap, env-set + above-cap (square + non-square).
  • 2 new integration tests in test_autoarray/dataset/imaging/test_dataset.py covering Imaging.from_fits end-to-end with env set vs unset on a 30×30 FITS fixture.
  • Full test_autoarray/ suite passes (747/747).
  • End-to-end verification: with this branch's autoarray active and PYAUTO_SMALL_DATASETS=1, autolens_workspace_test/scripts/multi/visualization_imaging.py loads dataset/multi/lens_sersic/g_data.fits (150×150) and the resulting dataset.data.shape_native == (15, 15), dataset.psf.kernel.shape_native == (21, 21) (PSF preserved), dataset.pixel_scales == (0.6, 0.6), and apply_mask succeeds — Cluster E reproducer no longer broadcasts.
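The five unit-test cases listed above can be summarised as a small table of expected output shapes. The sketch below is not taken from the test suite: `expected_shape` is a hypothetical helper, the input shapes are illustrative, and the non-square case assumes both axes exceed the cap.

```python
CAP = (15, 15)  # cap shape quoted in this PR


def expected_shape(shape, env_value):
    """Hypothetical helper: the shape expected back for a given env value."""
    if env_value != "1":
        return shape  # env unset: no-op
    if shape[0] <= CAP[0] and shape[1] <= CAP[1]:
        return shape  # at or below the cap: no-op
    return (min(shape[0], CAP[0]), min(shape[1], CAP[1]))


# (label, env value, input shape, expected output shape)
cases = [
    ("env unset", None, (30, 30), (30, 30)),
    ("env set, already at cap", "1", (15, 15), (15, 15)),
    ("env set, below cap", "1", (10, 10), (10, 10)),
    ("env set, above cap (square)", "1", (30, 30), (15, 15)),
    ("env set, above cap (non-square)", "1", (40, 30), (15, 15)),
]

results = {label: expected_shape(shape, env) for label, env, shape, _ in cases}
for label, _, _, want in cases:
    print(f"{label}: {results[label]} (expected {want})")
```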
Full API Changes (for automation & release notes)

Added

  • autoarray.util.dataset_util.cap_array_2d_for_small_datasets(array_2d, pixel_scales) -> (Array2D, pixel_scales) — center-crops a 2D autoarray to (15, 15) and overrides pixel_scales to 0.6 when PYAUTO_SMALL_DATASETS=1 is set and the input exceeds the cap. No-op otherwise.
  • autoarray.util.dataset_util.SMALL_DATASETS_SHAPE_NATIVE = (15, 15) — module constant, the cap shape.
  • autoarray.util.dataset_util.SMALL_DATASETS_PIXEL_SCALES = 0.6 — module constant, the cap pixel_scales.

Changed Behaviour

  • Imaging.from_fits now applies cap_array_2d_for_small_datasets to data and noise_map after loading. Under PYAUTO_SMALL_DATASETS=1 with a >15×15 on-disk fixture, the returned dataset will have data.shape_native == (15, 15) and pixel_scales == (0.6, 0.6). PSF is unchanged. With the env var unset, behaviour is identical to before.

Migration

None required. Existing callers see no change unless they were already running under PYAUTO_SMALL_DATASETS=1, in which case they previously crashed on apply_mask and now succeed silently with a capped dataset.

🤖 Generated with Claude Code

Mask2D.circular and Grid2D.uniform already cap to (15, 15) at 0.6"/px
under PYAUTO_SMALL_DATASETS=1, but Imaging.from_fits did not — it just
loaded whatever was on disk. Any caller that paired from_fits(150x150
fixture) with Mask2D.circular(shape_native=dataset.shape_native) under
the env var crashed with a (150,150) vs (15,15) broadcast error on
apply_mask.

Add a center-crop hook in Imaging.from_fits that mirrors the existing
caps: data and noise_map exceeding (15, 15) are center-cropped and
pixel_scales is overridden to 0.6. The PSF is left alone (PSFs are
usually already small and capping them changes shape semantics).
A new utility cap_array_2d_for_small_datasets in autoarray/util/
dataset_util.py implements the cap and is reusable by other from_fits
loaders in follow-up PRs.

No-op when env unset OR when on-disk shape is already at-or-below the
cap, so the simulator -> from_fits round-trip is unchanged.

Closes Cluster E from the 2026-05-07 release-prep triage. The
workspace-side env_vars.yaml override shipped earlier (PR #80 in
autolens_workspace_test) becomes redundant after this lands but is
left in place as belt-and-suspenders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Jammy2211 added the pending-release label (PR queued for the next release build) on May 8, 2026
@Jammy2211 merged commit 1cf3498 into main on May 8, 2026
4 checks passed
@Jammy2211 deleted the feature/imaging-from-fits-small-datasets-cap branch on May 8, 2026 13:03