DM-54780: add pure-JSON archive implementations #32
Conversation
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main      #32      +/-  ##
==========================================
- Coverage   75.11%   74.20%   -0.92%
==========================================
  Files          60       64       +4
  Lines        6241     6478     +237
==========================================
+ Hits         4688     4807     +119
- Misses       1553     1671     +118
==========================================
```

View full report in Codecov by Sentry.
Force-pushed from 79a4417 to f6ecdca
Local test coverage from …
Most of what's missing is: …
timj left a comment:
Looks good. I like how a second file format forces things to reorganize (and it makes me want to try an HDF5 variant using the Starlink NDF data model).
- I think we need to call `.inspect` somewhere to show that it actually works for JSON tests (I don't think it does).
- I have concerns that `astropy.io.fits` is turning up in the JSON API.
```python
        return self._exit_stack.enter_context(from_json(self.filename))

    def _get_extension(self) -> str:
        return ".fits"
```
Shouldn't this be .json? Implies that we aren't testing this.
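For reference, a fix along the lines this comment suggests could look like the sketch below, with `RoundtripBase` reduced to a stub; only the returned string is the point, the rest is illustrative scaffolding:

```python
class RoundtripBase:
    """Stub standing in for the real test-helper base class."""


class RoundtripJson(RoundtripBase):
    def _get_extension(self) -> str:
        # ".json", not ".fits": this string decides which file the
        # JSON round-trip test actually writes and reads back.
        return ".json"
```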
```python
class RoundtripJson[T](RoundtripBase):
    def inspect(self) -> astropy.io.fits.HDUList:
```
Copy and paste error? Seems wrong for a JSON round trip.
```python
class RoundtripJson[T](RoundtripBase):
    def inspect(self) -> astropy.io.fits.HDUList:
        """Read the JSON file as a dictionary."""
        return self._exit_stack.enter_context(from_json(self.filename))
```
This looks wrong to me: `from_json` takes bytes, not a filename. It also returns a dict, so I don't think a context manager is needed at all.
I think this means that there are no test calls to this method.
```python
return TableReferenceModel(source=str(key), columns=columns)
for n, c in enumerate(columns, start=1):
    assert isinstance(c.data, ArrayReferenceModel)
    c.data.source = f"{key}[{n}]"
```
Is there any way to push this lower down (into TableColumnModel?) so that we don't have to do the identical source fixup in two places? I see that TableReferenceModel did accept a source parameter.
I held off on that because I think it bakes a FITS-specific assumption into a more generic model, even though that's just a hypothetical concern now. For ASDF column-major tables in particular, there would be a different source for every column, because they'd go in different blocks.
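To make the fixup under discussion concrete, here is a self-contained sketch of the write-side logic, with the pydantic models reduced to a plain dataclass (the model name comes from the diff; the helper function name is invented for illustration):

```python
from dataclasses import dataclass


@dataclass
class ArrayReferenceModel:
    """Simplified stand-in for the real column-data model."""
    source: str = ""


def fix_column_sources(key: str, columns: list[ArrayReferenceModel]) -> None:
    """Point every column at the same HDU, tagging each with its
    column number (1-indexed, because FITS counts columns from 1)."""
    for n, c in enumerate(columns, start=1):
        c.source = f"{key}[{n}]"
```

With a shared HDU key like `"IMAGE,1"`, the per-column sources come out as `"IMAGE,1[1]"`, `"IMAGE,1[2]"`, and so on.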
```python
key, reader = self._get_source_reader(ref)
if not isinstance(model.columns[0].data, ArrayReferenceModel):
    raise ArchiveReadError("Inline array found where a reference array was expected.")
key, reader = self._get_source_reader(model.columns[0].data.source, is_table=True)
```
Maybe add a comment that in a table the first column can always be trusted to return the source.
```python
def test_json_roundtrip(self) -> None:
    """Test saving a tiny image to pure JSON."""
    image = Image(
```
Maybe add units to the test image?
I actually had them originally and then removed them as a (lazy) way to get a little more test coverage. Turns out most of our tests have images with units and relatively few don't. That's orthogonal enough to the archive type (except that it's important for FITS to make sure BUNIT gets set) that I don't think it's worth a near-duplicate of this test to try it.
```python
def read[T: Any](cls: type[T], target: ResourcePathExpression | ArchiveTree) -> ReadResult[T]:
    """Read an object from a FITS file.
```
```python
def write(
    obj: Any,
    filename: str | None = None,
```
In theory, for JSON this could be a URI to allow direct writes to S3 (through ResourcePath). I understand that since FITS can't do that (and neither can HDF5), it might be easier to stick to files in the interface; otherwise you end up re-implementing the butler datastore's "write to local file and then transfer to cloud" approach.
Good idea. I'm trying to make the write and read functions compatible where they can be without forcing them into a least-common-denominator interface, so accepting ResourcePathExpression here sounds fine.
Force-pushed from a06361f to 9152e8a
This also includes:

- moving TableCellReferenceModel to the 'fits' subpackage, where it has been renamed to PointerModel to reflect the fact that it's only used there, and only as a pointer;
- adding support (at archive implementation discretion) for tables with inline arrays for columns.

The ASDF table data model gives each column a 'source' field, which is flexibility we don't need for the FITS archives, since we really only need a pointer to the full HDU. But since we've got the flexibility to cook up whatever source strings we want, we can just invent a way to append a column number (1-indexed, because FITS; note that the column name is already nearby), and then strip that off entirely when we read it to get the HDU EXTNAME[,EXTVER].
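The read-side strip described above can be sketched as a small parser; the function name is invented for illustration, but the source-string shape (`EXTNAME[,EXTVER]` plus a 1-indexed column suffix) follows the commit message:

```python
def split_table_source(source: str) -> tuple[str, int]:
    """Split a source like 'EXTNAME,EXTVER[n]' into the HDU key and the
    1-indexed column number that was appended on write.

    rpartition splits on the last '[', so commas or brackets earlier in
    the EXTNAME are left untouched.
    """
    key, _, column = source.rpartition("[")
    return key, int(column.rstrip("]"))
```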
Force-pushed from 48b6867 to 7a1834a
And rename the FITS 'write' method argument from 'filename' to 'path' for consistency, even though that can't do URIs.
Force-pushed from 7a1834a to 945836d
Checklist

- doc/changes