From a4489b38cb94589dad33c134610bad268e43c1c3 Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 13:56:42 -0400 Subject: [PATCH 1/9] Add FITS tiled-image compression: RICE_1 + GZIP read support Implement Phase 1 of the FITS tiled-image compression convention (ZIMAGE/ZCMPTYPE/ZTILEn/COMPRESSED_DATA), decoding compressed images out of the BINTABLE form into ImageData: - New src/tile_compress.rs: CompressionType/Quantize/TileGeometry, CompressedImage<'a> view (detect/from_bintable), RICE_1 decoder, tile reassembly (unravel/scatter_tile), and GZIP_1/GZIP_2 behind an optional, feature-gated `gzip` dependency (miniz_oxide) so the core stays zero-dependency. - Lazy, backward-compatible API: Hdu::as_compressed_image() -> Option, then .decompress()/.compression()/.geometry(). HduData is unchanged; compressed images stay stored as BinTable so the original tiles survive for lossless round-trip. - Fix RICE_1 seed handling: the first value is a seed (lastpix), not a stored pixel; decode a full nvals diffs so array[0] = seed + diff[0]. The previous off-by-one shifted every reconstructed pixel one position. - Tests: byte-exact integration tests against fpack-generated fixtures (RICE row-tiled, square edge-truncated, I32, gzip1), skip-if-absent; corrected synthetic RICE unit tests that had masked the bug. - scripts/gen_compressed_fixtures.sh generates/verifies fpack fixtures; CI downloads them best-effort (cache key v2). - COMPRESSION_PLAN.md documents the staged rollout (float quant+dither, PLIO_1, HCOMPRESS_1, write path remain TODO). Also includes accompanying in-progress crate refinements across ascii_table/bintable/header/keyword/types and their tests. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/ci.yml | 14 +- .gitignore | 3 + COMPRESSION_PLAN.md | 229 ++++++ Cargo.lock | 1 + Cargo.toml | 5 + scripts/gen_compressed_fixtures.sh | 146 ++++ src/ascii_table.rs | 38 +- src/bintable.rs | 104 ++- src/checksum.rs | 6 +- src/error.rs | 9 +- src/hdu.rs | 51 +- src/header.rs | 20 +- src/image_conv.rs | 12 +- src/image_data.rs | 2 +- src/keyword.rs | 22 +- src/lib.rs | 38 +- src/tile_compress.rs | 1109 ++++++++++++++++++++++++++++ src/types.rs | 9 +- tests/checksum.rs | 2 +- tests/image_conv.rs | 5 +- tests/round_trip.rs | 34 +- tests/sample_files.rs | 109 ++- 22 files changed, 1863 insertions(+), 105 deletions(-) create mode 100644 COMPRESSION_PLAN.md create mode 100755 scripts/gen_compressed_fixtures.sh create mode 100644 src/tile_compress.rs diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index e70cdfb..ace4be0 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -26,7 +26,7 @@ jobs: uses: actions/cache@v4 with: path: samp - key: fits4-samples-v1 + key: fits4-samples-v2 - name: Download sample FITS files if: steps.cache-samples.outputs.cache-hit != 'true' @@ -41,6 +41,18 @@ jobs: WFPC2u5780205r_c0fx.fits; do curl -sSfL "$BASE/$f" -o "samp/$f" done + # Compressed (fpack) fixtures: best-effort — these may not yet be + # uploaded to the bucket, so a 404 must not fail CI. The tile-compression + # tests skip cleanly when a fixture is absent. + for f in \ + EUVEngc4151imgx.rice.fits.fz \ + EUVEngc4151imgx.rice_t100.fits.fz \ + FGSf64y0106m_a1f.rice.fits.fz \ + EUVEngc4151imgx.gzip1.fits.fz \ + EUVEngc4151imgx.gzip2.fits.fz \ + FOCx38i0101t_c0f.rice_nodith.fits.fz; do + curl -sSfL "$BASE/$f" -o "samp/$f" || true + done - name: Clippy run: cargo clippy --all-targets --features image -- -D warnings diff --git a/.gitignore b/.gitignore index ea8c4bf..65aa528 100644 --- a/.gitignore +++ b/.gitignore @@ -1 +1,4 @@ /target +*.pdf +samp +.DS_Store diff --git a/COMPRESSION_PLAN.md b/COMPRESSION_PLAN.md new file mode 100644 index 0000000..9b7fcf2 --- /dev/null +++ b/COMPRESSION_PLAN.md @@ -0,0 +1,229 @@ +# FITS Tiled Image Compression — Implementation Plan + +Status: planning + initial scaffold. This document describes how to add support for +the FITS *Tiled Image Compression* convention (the BINTABLE-based compressed-image +format) to `fits4`, currently the main feature gap versus `fitsrs`. + +References: +- Tiled Image Convention for Storing Compressed Images in FITS Binary Tables, + v2.3 — https://fits.gsfc.nasa.gov/registry/tilecompression/tilecompression2.3.pdf + (also arXiv:1201.1336) +- HEASARC FITSIO compression docs — https://heasarc.gsfc.nasa.gov/docs/software/fitsio/compression.html +- cfitsio `ricecomp.c` — https://github.com/HEASARC/cfitsio/blob/develop/ricecomp.c +- astropy compressed codecs / `_tiled_compression.py` + +## 1. What the convention is + +A tile-compressed image is physically a **BINTABLE extension** (`XTENSION = 'BINTABLE'`) +but is logically an image. The original image is divided into a rectangular grid of +*tiles*; each tile is compressed independently and stored as a byte stream in a +variable-length-array (VLA) cell of a binary-table column (typically `COMPRESSED_DATA`, +a `1PB` or `1QB` column). One table row == one tile, in row-major (FORTRAN/NAXIS1-fastest) +tile order. + +### Driving header keywords + +Mandatory: +- `ZIMAGE = T` — marks this BINTABLE as a compressed image. +- `ZCMPTYPE` — algorithm name: `RICE_1`, `GZIP_1`, `GZIP_2`, `HCOMPRESS_1`, `PLIO_1` + (also `RICE_ONE` alias, `NOCOMPRESS`). +- `ZBITPIX` — BITPIX of the *original* (uncompressed) image (8/16/32/64/-32/-64). +- `ZNAXIS` — NAXIS of the original image. +- `ZNAXISn` — NAXISn of the original image (n = 1..ZNAXIS). + +Optional / conditional: +- `ZTILEn` — tile size along axis n. Default = row-by-row: `ZTILE1 = ZNAXIS1`, all + others = 1. +- `ZNAMEi` / `ZVALi` — algorithm parameter name/value pairs: + - RICE_1: `BLOCKSIZE` (default 32), `BYTEPIX` (default 4 — bytes per pixel of the + *integers* fed to Rice; note this is the post-quantization integer size). + - HCOMPRESS_1: `SCALE`, `SMOOTH`. +- `ZQUANTIZ` — float quantization method: `NO_DITHER`, `SUBTRACTIVE_DITHER_1`, + `SUBTRACTIVE_DITHER_2`. Absent ⇒ no dithering (older files quantized without dither). +- `ZDITHER0` — integer dither seed (1..10000) for subtractive dithering. +- `ZBLANK` — integer value used in the integer stream to flag undefined/NaN pixels + (when given as a keyword it is constant for all tiles; can also be a per-tile column). +- `ZSIMPLE`, `ZEXTEND`, `ZBLOCKED`, `ZPCOUNT`, `ZGCOUNT`, `ZHECKSUM`, `ZDATASUM` — + preserve the original primary/extension header context (mostly carried through; + `ZSIMPLE`/`ZTENSION` distinguish original primary vs extension). +- `ZMASKCMP` — algorithm used to compress the null-pixel mask (rarely needed for read). +- The original image's other header keywords are copied verbatim into the BINTABLE + header (so WCS etc. survive); on decompression we reconstruct an image header from + the `Z*` keywords plus the carried-through cards. + +### Binary-table columns + +- `COMPRESSED_DATA` (required) — VLA (`1PB`/`1QB`) holding the compressed tile bytes. + When a tile failed to compress/quantize, its descriptor count is 0 and the data + lives in the fallback column instead. +- `GZIP_COMPRESSED_DATA` (optional) — VLA fallback: a gzip-compressed copy of the + *un-quantized* tile, used when lossy quantization was rejected for that tile. +- `UNCOMPRESSED_DATA` (optional, older variant) — raw uncompressed tile fallback. +- `ZSCALE`, `ZZERO` (optional, columns of D) — per-tile linear scale/zero for float + quantization: `float_value = ZZERO + ZSCALE * integer_value`. If absent, use the + `ZSCALE`/`ZZERO` *keywords*, or 1.0/0.0. +- `ZBLANK` (optional column of J) — per-tile blank sentinel. + +### Reassembly + +For each row r (tile index): decode the tile's compressed bytes with the ZCMPTYPE +algorithm into a flat array of `ZBITPIX`-typed integers (or, for the gzip fallback, +the original dtype). For float originals, apply inverse quantization +(`ZZERO + ZSCALE * q`, plus dither reversal). Then scatter the tile's pixels into the +correct sub-rectangle of the full image buffer using the tile grid geometry derived +from `ZNAXISn` / `ZTILEn`. Tiles iterate with axis 1 fastest. + +## 2. Float quantization & dithering + +Float (`ZBITPIX < 0`) images are usually stored as quantized 32-bit integers +(`BYTEPIX` typically 4) plus per-tile `ZSCALE`/`ZZERO`. Decode: + +1. Decompress the integer stream `q[i]`. +2. If `q[i] == ZBLANK` ⇒ output `NaN`. +3. Otherwise `value = ZZERO + ZSCALE * q[i]` — with dither correction: + - `NO_DITHER` / absent: `value = ZZERO + ZSCALE * q[i]`. + - `SUBTRACTIVE_DITHER_1`: a per-pixel uniform dither `r ∈ [0,1)` was *added* to + `q` before rounding at compress time, so decode is + `value = ZZERO + ZSCALE * (q[i] - r + 0.5)` (cfitsio uses `(q - dither + 0.5)`). + - `SUBTRACTIVE_DITHER_2`: same as DITHER_1 except exact-zero pixels are preserved + (encoded with the special integer `-2147483647` ⇒ decode to exactly 0.0). + The dither value `r` comes from cfitsio's fixed pseudo-random sequence + (`fits_rand_value`, a 10000-element table generated by a specific LCG), indexed by + `(tile_row + ZDITHER0 - 1)` to pick a starting offset, advancing per pixel and + wrapping at 10000. This table/sequence must be reproduced exactly to round-trip + dithered files — see cfitsio `quantize.c` `fits_init_randoms`. + +This is the trickiest correctness item; for the first read-only pass we can support +`NO_DITHER` and the absent case, and treat dithered files as a follow-up (decoding +without dither reversal yields values off by up to ~0.5 ZSCALE — acceptable only as a +clearly-flagged approximation, so better to error until the RNG table is implemented). + +## 3. Module design + +New module `src/tile_compress.rs`, gated behind nothing (pure Rust; gzip needs an +inflate implementation — see open questions). Public surface: + +```rust +pub enum CompressionType { Rice1, Gzip1, Gzip2, Hcompress1, Plio1, NoCompress } +pub enum Quantize { None, SubtractiveDither1, SubtractiveDither2 } + +pub struct TileGeometry { znaxis: Vec, ztile: Vec, /* grids */ } + +pub struct CompressedImage<'a> { + pub ctype: CompressionType, + pub zbitpix: i64, + pub geometry: TileGeometry, + pub quantize: Quantize, + pub zdither0: i64, + pub blank: Option, + table: &'a BinTable, // borrows the parsed BINTABLE + // resolved column indices: compressed_data, gzip_fallback, zscale, zzero, zblank +} + +impl CompressedImage<'_> { + pub fn detect(header: &Header) -> bool; // ZIMAGE == T + pub fn from_bintable(header: &Header, table: &BinTable) -> Result; + pub fn decompress(&self) -> Result; // full image + fn decompress_tile(&self, row: usize) -> Result>; +} + +// algorithm primitives (pure functions, unit-testable in isolation): +pub fn rice_decompress_i32(src: &[u8], nvals: usize, blocksize: usize) -> Result>; +pub fn rice_decompress_i16(src: &[u8], nvals: usize, blocksize: usize) -> Result>; +pub fn rice_decompress_i8 (src: &[u8], nvals: usize, blocksize: usize) -> Result>; +pub fn gzip_inflate(src: &[u8]) -> Result>; // GZIP_1 +// GZIP_2 = byte-shuffled gzip (de-shuffle after inflate) +pub fn plio_decompress(src: &[u8], nvals: usize) -> Result>; +pub fn hcompress_decompress(...) -> Result>; // last +``` + +### Read-pipeline hookup + +In `hdu.rs::read_from`, after the BINTABLE branch parses a `BinTable`, check +`CompressedImage::detect(&header)`. Options for surfacing it (see open questions): + +- **Option A (recommended):** add a new `HduData::Image(ImageData)` produced eagerly — + i.e. when `ZIMAGE=T`, decompress and store the result as `HduData::Image`, so callers + treat it like any other image. Keep the original BINTABLE accessible only if needed. + Downside: loses the raw table; re-writing won't round-trip the compressed bytes. +- **Option B:** add a new variant `HduData::CompressedImage(BinTable)` (or keep the + `BinTable` and add a method `Hdu::as_image()` that lazily decompresses). Preserves + round-trip and is lazy, at the cost of a new public enum variant. +- **Option C:** keep `HduData::BinTable`, and add `BinTable::as_compressed_image()` + / `Hdu::decompressed_image()` returning `Option>`. Smallest API + change, fully backward compatible, lazy. Likely the best first step. + +The plan adopts **Option C** for the initial implementation (no enum change, no +behavior change for existing users), with a possible convenience accessor on `FitsFile`. + +### Error handling + +Add `Error` variants: `UnsupportedCompression(String)`, `CompressionError(String)`. + +## 4. Staged implementation + +**Phase 0 (this pass — scaffold):** module skeleton, enums, `detect`, `from_bintable` +header parsing, function stubs with `todo!()`, RICE_1 32/16/8-bit decoder + unit test. +Wire `mod tile_compress;` into `lib.rs`. Build must stay green. + +**Phase 1 — RICE_1 read path (integers):** finish `decompress()` for integer +`ZBITPIX` with `NO_DITHER`; tile scatter/reassembly; `ZBLANK` handling. Validate +against an `fpack`-generated test file. + +**Phase 2 — GZIP_1 / GZIP_2:** integer + float read. Needs a DEFLATE decoder +(see open questions re: dependency). GZIP_2 adds byte de-shuffle. + +**Phase 3 — float quantization + dithering:** ZSCALE/ZZERO inverse, NaN/ZBLANK, +and the cfitsio random-number table for SUBTRACTIVE_DITHER_1/2. + +**Phase 4 — PLIO_1:** IRAF pixel-list run-length decoder (fully specified in the +convention doc, simplest of the lossless coders; mainly used for integer masks). + +**Phase 5 — HCOMPRESS_1:** wavelet-ish H-transform decode + quantization. Most +complex; lowest priority. + +**Phase 6 — writing/compression:** mirror each decoder with an encoder, build the +BINTABLE + `Z*` keywords, VLA heap packing. RICE_1 + GZIP first; HCOMPRESS optional. +Hook into `FitsFile` builder (e.g. `add_compressed_image(img, CompressionType::Rice1)`). + +## 5. Test files + +None of the existing `samp/*.fits` files are tile-compressed (verified: they are +plain IMAGE / TABLE / BINTABLE HDUs). We need test fixtures: + +- Generate with cfitsio `fpack`: + - `fpack -r image.fits` → RICE_1 + - `fpack -g image.fits` → GZIP_1 + - `fpack -g2 image.fits` → GZIP_2 + - `fpack -h image.fits` → HCOMPRESS_1 + - `fpack -p image.fits` → PLIO_1 (integer images / masks) + Cover at least: I16 image, I32 image, F32 image (quantized, both dithered and + `-q 0`/no-dither), and a small 3-D cube to exercise multi-axis tiling. +- Cross-check decompressed output against the original uncompressed FITS (byte-exact + for lossless integer codecs; within quantization tolerance for float). +- Store fixtures alongside the existing samples (GCS bucket `fits4_samples`, per + MEMORY.md) so CI can fetch them; add a `tests/tile_compress.rs` integration test. + +## 6. Open questions for the human + +1. **API shape (most important):** Option C (lazy `Hdu::decompressed_image()` accessor, + no enum change) vs Option B (new `HduData::CompressedImage` variant) vs Option A + (eager — present compressed images transparently as `HduData::Image`). Transparent + presentation is friendliest for consumers like `fitsview`/`startracker` but breaks + round-trip writing. Recommendation: start with C, consider a transparent + `FitsFile::images()` helper later. +2. **GZIP dependency:** GZIP_1/GZIP_2 require DEFLATE. The crate currently has *zero* + dependencies for core. Choices: (a) put gzip behind an optional `flate2`/`miniz_oxide` + feature; (b) write a small pure-Rust inflate (more work, keeps zero-dep promise); + (c) ship RICE_1/PLIO only in core and gate gzip/hcompress behind a feature. + Recommendation: optional `flate` feature using `miniz_oxide` (pure Rust, no C). +3. **Dithering scope:** is byte-exact round-trip of `SUBTRACTIVE_DITHER_1/2` float + files required, or is read-only (and erroring on dithered files until Phase 3) okay + for the first release? This decides how soon we must port the cfitsio RNG table. +4. **Surfacing scaled vs raw:** for float quantized images, do we return the + reconstructed `PixelData::F32`/`F64` directly (applying ZSCALE/ZZERO + dither), or + also expose the raw quantized integers + scale like the existing BSCALE/BZERO split + on `ImageData`? Recommend returning reconstructed floats (matches user expectation); + raw access can be a later addition. +5. **Writing priority:** is the immediate goal read parity with `fitsrs`, or also + write/compress? Read-only RICE+GZIP closes most of the practical gap. diff --git a/Cargo.lock b/Cargo.lock index c3fefae..e419038 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -312,6 +312,7 @@ name = "fits4" version = "0.1.0" dependencies = [ "image", + "miniz_oxide", ] [[package]] diff --git a/Cargo.toml b/Cargo.toml index a3a7e40..fe3bf4e 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -9,8 +9,13 @@ license = "MIT" version = "0.25" optional = true +[dependencies.miniz_oxide] +version = "0.8" +optional = true + [features] image = ["dep:image"] +gzip = ["dep:miniz_oxide"] [profile.test] opt-level = 3 diff --git a/scripts/gen_compressed_fixtures.sh b/scripts/gen_compressed_fixtures.sh new file mode 100755 index 0000000..c6bdbd2 --- /dev/null +++ b/scripts/gen_compressed_fixtures.sh @@ -0,0 +1,146 @@ +#!/usr/bin/env bash +# +# Generate tiled-image-compressed FITS fixtures for fits4 integration tests. +# +# fits4 reads the FITS tiled-image compression convention (RICE_1, GZIP_1, +# GZIP_2, PLIO_1, HCOMPRESS_1) in pure Rust. To test that decoder against the +# canonical encoder, we use `fpack` (shipped with cfitsio) to compress known +# source images, then assert in our integration tests that decompressing them +# reproduces the original pixels. `fpack` is a *dev/test* dependency only — the +# fits4 crate itself stays pure Rust. +# +# Source images are the uncompressed NASA samples already in samp/. Outputs are +# written to samp/ as ..fits.fz and round-trip-verified with funpack. +# +# Requirements (macOS): brew install cfitsio (provides fpack + funpack) +# (conda): conda install -c conda-forge cfitsio +# Upload requires the gcloud SDK (`gcloud storage` or `gsutil`) with write +# access to the fits4_samples bucket. +# +# Usage: +# scripts/gen_compressed_fixtures.sh # generate + verify locally +# scripts/gen_compressed_fixtures.sh --upload # also upload fixtures to GCS +# +set -euo pipefail + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)" +SAMP_DIR="$REPO_ROOT/samp" +BUCKET="gs://fits4_samples" + +UPLOAD=0 +[[ "${1:-}" == "--upload" ]] && UPLOAD=1 + +# --- tool checks ------------------------------------------------------------ +for tool in fpack funpack; do + if ! command -v "$tool" >/dev/null 2>&1; then + echo "error: '$tool' not found on PATH. Install cfitsio:" >&2 + echo " macOS: brew install cfitsio" >&2 + echo " conda: conda install -c conda-forge cfitsio" >&2 + exit 1 + fi +done + +if [[ ! -d "$SAMP_DIR" ]]; then + echo "error: $SAMP_DIR not found. Fetch the base samples first." >&2 + exit 1 +fi + +GENERATED=() + +# compress +# produces samp/..fits.fz and verifies it round-trips. +compress() { + local src="$1"; shift + local label="$1"; shift + local src_path="$SAMP_DIR/$src" + local stem="${src%.fits}" + local out="$SAMP_DIR/${stem}.${label}.fits.fz" + + if [[ ! -f "$src_path" ]]; then + echo " skip: $src not present in samp/ — skipping $label" >&2 + return 0 + fi + + rm -f "$out" + # -O names the output explicitly; remaining args are the algorithm flags. + fpack "$@" -O "$out" "$src_path" + + # Verify: funpack -> /dev/stdout would re-add a primary; instead unpack to a + # temp file and confirm it succeeds (byte-exactness vs. source is asserted in + # the Rust tests, which compare fits4's decode against the original samples). + local tmp; tmp="$(mktemp -t funpack.XXXXXX).fits" + if funpack -O "$tmp" "$out" >/dev/null 2>&1; then + local before after + before=$(wc -c < "$src_path") + after=$(wc -c < "$out") + printf " ok %-40s %8s -> %8s bytes\n" "$(basename "$out")" "$before" "$after" + GENERATED+=("$(basename "$out")") + else + echo " FAIL: funpack could not decompress $out" >&2 + rm -f "$tmp" + return 1 + fi + rm -f "$tmp" +} + +echo "Generating compressed fixtures in $SAMP_DIR ..." + +# --- integer images (RICE / GZIP / PLIO) ------------------------------------ +# EUVEngc4151imgx.fits has I16 IMAGE extensions. +compress EUVEngc4151imgx.fits rice -r +compress EUVEngc4151imgx.fits gzip1 -g +compress EUVEngc4151imgx.fits gzip2 -g2 +# PLIO_1 targets integer (mask-like) data, <= 24-bit; fine for I16. +compress EUVEngc4151imgx.fits plio -p +# Force square tiles to exercise 2-D reassembly with edge-truncated tiles. +compress EUVEngc4151imgx.fits rice_t100 -r -t 100,100 + +# FGSf64y0106m_a1f.fits is an I32 image. +compress FGSf64y0106m_a1f.fits rice -r +compress FGSf64y0106m_a1f.fits gzip1 -g + +# --- float images (quantize; with and without subtractive dither) ----------- +# Note: RICE/HCOMPRESS need quantization for floats. `-q ` quantizes with +# subtractive dithering; `-q0 ` (no space) quantizes with NO dithering +# (ZQUANTIZ=NO_DITHER); lossless float (`-q 0`, level zero) is GZIP-only. +# FOCx38i0101t_c0f.fits is F32 1024x1024. +compress FOCx38i0101t_c0f.fits rice_dith -r -q 16 # quantize + subtractive dither +compress FOCx38i0101t_c0f.fits rice_nodith -r -q0 16 # quantize, no dithering +compress FOCx38i0101t_c0f.fits gzip_lossless -g -q 0 # lossless float (no quantization) +compress FOCx38i0101t_c0f.fits hcomp -h + +# WFPC2u5780205r_c0fx.fits is an F32 200x200x4 cube (multi-axis tiling). +compress WFPC2u5780205r_c0fx.fits rice_dith -r -q 16 +compress WFPC2u5780205r_c0fx.fits rice_nodith -r -q0 16 + +echo +echo "Generated ${#GENERATED[@]} fixtures:" +printf ' %s\n' "${GENERATED[@]}" + +# --- optional upload -------------------------------------------------------- +if [[ "$UPLOAD" -eq 1 ]]; then + if command -v gcloud >/dev/null 2>&1; then + GSCMD=(gcloud storage cp) + elif command -v gsutil >/dev/null 2>&1; then + GSCMD=(gsutil cp) + else + echo "error: --upload requested but neither 'gcloud' nor 'gsutil' is installed." >&2 + exit 1 + fi + echo + echo "Uploading to $BUCKET ..." + for f in "${GENERATED[@]}"; do + "${GSCMD[@]}" "$SAMP_DIR/$f" "$BUCKET/$f" + echo " uploaded $f" + done +fi + +echo +echo "Next steps:" +echo " 1. Add these filenames to the download loop in .github/workflows/ci.yml" +echo " (and bump the cache key, e.g. fits4-samples-v1 -> v2)." +echo " 2. Add assertions in tests/sample_files.rs using hdu.as_compressed_image()." +if [[ "$UPLOAD" -eq 0 ]]; then + echo " 3. Re-run with --upload to push fixtures to $BUCKET." +fi diff --git a/src/ascii_table.rs b/src/ascii_table.rs index 7c87081..f7595d1 100644 --- a/src/ascii_table.rs +++ b/src/ascii_table.rs @@ -49,7 +49,9 @@ impl AsciiFormat { _ => unreachable!(), } } - _ => Err(Error::InvalidTableFormat(format!("unknown ASCII TFORM code: {s}"))), + _ => Err(Error::InvalidTableFormat(format!( + "unknown ASCII TFORM code: {s}" + ))), } } @@ -120,7 +122,9 @@ impl AsciiTable { let tscal = header.get_float(&format!("TSCAL{i}")).unwrap_or(1.0); let tzero = header.get_float(&format!("TZERO{i}")).unwrap_or(0.0); - let tunit = header.get_string(&format!("TUNIT{i}")).map(|s| s.to_string()); + let tunit = header + .get_string(&format!("TUNIT{i}")) + .map(|s| s.to_string()); columns.push(AsciiColumn { name, @@ -191,14 +195,22 @@ impl AsciiTable { /// Populate header keywords for this table. pub fn fill_header(&self, header: &mut Header) { - header.set("XTENSION", HeaderValue::String("TABLE".into()), Some("ASCII table extension")); + header.set( + "XTENSION", + HeaderValue::String("TABLE".into()), + Some("ASCII table extension"), + ); header.set("BITPIX", HeaderValue::Integer(8), None); header.set("NAXIS", HeaderValue::Integer(2), None); header.set("NAXIS1", HeaderValue::Integer(self.row_len as i64), None); header.set("NAXIS2", HeaderValue::Integer(self.nrows as i64), None); header.set("PCOUNT", HeaderValue::Integer(0), None); header.set("GCOUNT", HeaderValue::Integer(1), None); - header.set("TFIELDS", HeaderValue::Integer(self.columns.len() as i64), None); + header.set( + "TFIELDS", + HeaderValue::Integer(self.columns.len() as i64), + None, + ); for (i, col) in self.columns.iter().enumerate() { let idx = i + 1; @@ -220,25 +232,13 @@ impl AsciiTable { AsciiFormat::Ewd(w, d) => format!("E{w}.{d}"), AsciiFormat::Dwd(w, d) => format!("D{w}.{d}"), }; - header.set( - &format!("TFORM{idx}"), - HeaderValue::String(fmt_str), - None, - ); + header.set(&format!("TFORM{idx}"), HeaderValue::String(fmt_str), None); if col.tscal != 1.0 { - header.set( - &format!("TSCAL{idx}"), - HeaderValue::Float(col.tscal), - None, - ); + header.set(&format!("TSCAL{idx}"), HeaderValue::Float(col.tscal), None); } if col.tzero != 0.0 { - header.set( - &format!("TZERO{idx}"), - HeaderValue::Float(col.tzero), - None, - ); + header.set(&format!("TZERO{idx}"), HeaderValue::Float(col.tzero), None); } if let Some(ref unit) = col.tunit { header.set( diff --git a/src/bintable.rs b/src/bintable.rs index f794c60..c226459 100644 --- a/src/bintable.rs +++ b/src/bintable.rs @@ -52,7 +52,9 @@ impl BinColumnType { } if code_pos >= bytes.len() { - return Err(Error::InvalidTableFormat(format!("no type code in TFORM: {s}"))); + return Err(Error::InvalidTableFormat(format!( + "no type code in TFORM: {s}" + ))); } let repeat: usize = if code_pos == 0 { @@ -92,7 +94,10 @@ impl BinColumnType { b'D' => Ok(BinColumnType::D64(repeat)), b'C' => Ok(BinColumnType::C64(repeat)), b'M' => Ok(BinColumnType::M128(repeat)), - _ => Err(Error::InvalidTableFormat(format!("unknown BINTABLE type code: {}", code as char))), + _ => Err(Error::InvalidTableFormat(format!( + "unknown BINTABLE type code: {}", + code as char + ))), } } @@ -210,7 +215,9 @@ impl BinTable { let tscal = header.get_float(&format!("TSCAL{i}")).unwrap_or(1.0); let tzero = header.get_float(&format!("TZERO{i}")).unwrap_or(0.0); - let tunit = header.get_string(&format!("TUNIT{i}")).map(|s| s.to_string()); + let tunit = header + .get_string(&format!("TUNIT{i}")) + .map(|s| s.to_string()); columns.push(BinColumn { name, @@ -266,12 +273,8 @@ impl BinTable { let vals: Vec = bytes[..*n].iter().map(|&b| b == b'T').collect(); Ok(BinCellValue::Logical(vals)) } - BinColumnType::Bit(n) => { - Ok(BinCellValue::Bits(bytes.to_vec(), *n)) - } - BinColumnType::Byte(n) => { - Ok(BinCellValue::Bytes(bytes[..*n].to_vec())) - } + BinColumnType::Bit(n) => Ok(BinCellValue::Bits(bytes.to_vec(), *n)), + BinColumnType::Byte(n) => Ok(BinCellValue::Bytes(bytes[..*n].to_vec())), BinColumnType::I16(n) => { let vals: Vec = bytes .chunks_exact(2) @@ -359,7 +362,12 @@ impl BinTable { } /// Read variable-length array data from the heap. - fn read_heap_array(&self, elem_type: char, count: usize, offset: usize) -> Result { + fn read_heap_array( + &self, + elem_type: char, + count: usize, + offset: usize, + ) -> Result { if count == 0 { return match elem_type { 'B' => Ok(BinCellValue::Bytes(Vec::new())), @@ -370,7 +378,9 @@ impl BinTable { 'D' => Ok(BinCellValue::F64(Vec::new())), 'A' => Ok(BinCellValue::String(String::new())), 'L' => Ok(BinCellValue::Logical(Vec::new())), - _ => Err(Error::InvalidTableFormat(format!("unsupported VLA element type: {elem_type}"))), + _ => Err(Error::InvalidTableFormat(format!( + "unsupported VLA element type: {elem_type}" + ))), }; } @@ -381,7 +391,11 @@ impl BinTable { 'K' | 'D' => 8, 'C' => 8, 'M' => 16, - _ => return Err(Error::InvalidTableFormat(format!("unsupported VLA element type: {elem_type}"))), + _ => { + return Err(Error::InvalidTableFormat(format!( + "unsupported VLA element type: {elem_type}" + ))) + } }; let end = offset + count * elem_size; @@ -395,7 +409,9 @@ impl BinTable { let data = &self.heap[offset..end]; match elem_type { - 'L' => Ok(BinCellValue::Logical(data.iter().map(|&b| b == b'T').collect())), + 'L' => Ok(BinCellValue::Logical( + data.iter().map(|&b| b == b'T').collect(), + )), 'B' => Ok(BinCellValue::Bytes(data.to_vec())), 'A' => { let s = std::str::from_utf8(data) @@ -429,20 +445,30 @@ impl BinTable { .map(|c| f64::from_be_bytes(c.try_into().unwrap())) .collect(), )), - _ => Err(Error::InvalidTableFormat(format!("unsupported VLA type: {elem_type}"))), + _ => Err(Error::InvalidTableFormat(format!( + "unsupported VLA type: {elem_type}" + ))), } } /// Populate header keywords for this binary table. pub fn fill_header(&self, header: &mut Header) { - header.set("XTENSION", HeaderValue::String("BINTABLE".into()), Some("binary table extension")); + header.set( + "XTENSION", + HeaderValue::String("BINTABLE".into()), + Some("binary table extension"), + ); header.set("BITPIX", HeaderValue::Integer(8), None); header.set("NAXIS", HeaderValue::Integer(2), None); header.set("NAXIS1", HeaderValue::Integer(self.row_len as i64), None); header.set("NAXIS2", HeaderValue::Integer(self.nrows as i64), None); header.set("PCOUNT", HeaderValue::Integer(self.heap.len() as i64), None); header.set("GCOUNT", HeaderValue::Integer(1), None); - header.set("TFIELDS", HeaderValue::Integer(self.columns.len() as i64), None); + header.set( + "TFIELDS", + HeaderValue::Integer(self.columns.len() as i64), + None, + ); for (i, col) in self.columns.iter().enumerate() { let idx = i + 1; @@ -458,18 +484,10 @@ impl BinTable { ); if col.tscal != 1.0 { - header.set( - &format!("TSCAL{idx}"), - HeaderValue::Float(col.tscal), - None, - ); + header.set(&format!("TSCAL{idx}"), HeaderValue::Float(col.tscal), None); } if col.tzero != 0.0 { - header.set( - &format!("TZERO{idx}"), - HeaderValue::Float(col.tzero), - None, - ); + header.set(&format!("TZERO{idx}"), HeaderValue::Float(col.tzero), None); } if let Some(ref unit) = col.tunit { header.set( @@ -663,12 +681,30 @@ mod tests { #[test] fn parse_tform_codes() { - assert!(matches!(BinColumnType::parse("1J").unwrap(), BinColumnType::J32(1))); - assert!(matches!(BinColumnType::parse("10E").unwrap(), BinColumnType::E32(10))); - assert!(matches!(BinColumnType::parse("8A").unwrap(), BinColumnType::Char(8))); - assert!(matches!(BinColumnType::parse("D").unwrap(), BinColumnType::D64(1))); - assert!(matches!(BinColumnType::parse("1PJ").unwrap(), BinColumnType::VarP('J'))); - assert!(matches!(BinColumnType::parse("1QD").unwrap(), BinColumnType::VarQ('D'))); + assert!(matches!( + BinColumnType::parse("1J").unwrap(), + BinColumnType::J32(1) + )); + assert!(matches!( + BinColumnType::parse("10E").unwrap(), + BinColumnType::E32(10) + )); + assert!(matches!( + BinColumnType::parse("8A").unwrap(), + BinColumnType::Char(8) + )); + assert!(matches!( + BinColumnType::parse("D").unwrap(), + BinColumnType::D64(1) + )); + assert!(matches!( + BinColumnType::parse("1PJ").unwrap(), + BinColumnType::VarP('J') + )); + assert!(matches!( + BinColumnType::parse("1QD").unwrap(), + BinColumnType::VarQ('D') + )); } #[test] @@ -684,7 +720,9 @@ mod tests { #[test] fn tform_round_trip() { - for s in &["1J", "10E", "8A", "1D", "3I", "1L", "20X", "1C", "1M", "1PJ", "1QD"] { + for s in &[ + "1J", "10E", "8A", "1D", "3I", "1L", "20X", "1C", "1M", "1PJ", "1QD", + ] { let ct = BinColumnType::parse(s).unwrap(); let out = ct.to_tform_string(); let ct2 = BinColumnType::parse(&out).unwrap(); diff --git a/src/checksum.rs b/src/checksum.rs index 0b812d2..a2bc8b5 100644 --- a/src/checksum.rs +++ b/src/checksum.rs @@ -149,7 +149,11 @@ pub fn decode_checksum(ascii: &str, complement: bool) -> u32 { } let sum = (hi << 16) | lo; - if complement { !sum } else { sum } + if complement { + !sum + } else { + sum + } } /// Compute DATASUM for a data byte buffer (should be block-padded). diff --git a/src/error.rs b/src/error.rs index 4275a7f..f30e6b7 100644 --- a/src/error.rs +++ b/src/error.rs @@ -14,6 +14,8 @@ pub enum Error { ChecksumMismatch { expected: u32, actual: u32 }, UnsupportedExtension(String), InvalidTableFormat(String), + UnsupportedCompression(String), + CompressionError(String), } impl fmt::Display for Error { @@ -29,10 +31,15 @@ impl fmt::Display for Error { write!(f, "data size mismatch: expected {expected}, got {actual}") } Error::ChecksumMismatch { expected, actual } => { - write!(f, "checksum mismatch: expected {expected:#010x}, got {actual:#010x}") + write!( + f, + "checksum mismatch: expected {expected:#010x}, got {actual:#010x}" + ) } Error::UnsupportedExtension(s) => write!(f, "unsupported extension: {s}"), Error::InvalidTableFormat(s) => write!(f, "invalid table format: {s}"), + Error::UnsupportedCompression(s) => write!(f, "unsupported compression: {s}"), + Error::CompressionError(s) => write!(f, "compression error: {s}"), } } } diff --git a/src/hdu.rs b/src/hdu.rs index c680774..6af05d6 100644 --- a/src/hdu.rs +++ b/src/hdu.rs @@ -32,7 +32,11 @@ impl Hdu { /// Create a primary HDU with image data. pub fn primary_image(image: ImageData) -> Self { let mut header = Header::new(); - header.set("SIMPLE", HeaderValue::Logical(true), Some("conforms to FITS standard")); + header.set( + "SIMPLE", + HeaderValue::Logical(true), + Some("conforms to FITS standard"), + ); image.fill_header(&mut header); Hdu { header, @@ -43,7 +47,11 @@ impl Hdu { /// Create a primary HDU with no data. pub fn primary_empty() -> Self { let mut header = Header::new(); - header.set("SIMPLE", HeaderValue::Logical(true), Some("conforms to FITS standard")); + header.set( + "SIMPLE", + HeaderValue::Logical(true), + Some("conforms to FITS standard"), + ); header.set("BITPIX", HeaderValue::Integer(8), None); header.set("NAXIS", HeaderValue::Integer(0), None); Hdu { @@ -55,7 +63,11 @@ impl Hdu { /// Create an IMAGE extension HDU. pub fn image_extension(image: ImageData) -> Self { let mut header = Header::new(); - header.set("XTENSION", HeaderValue::String("IMAGE".into()), Some("image extension")); + header.set( + "XTENSION", + HeaderValue::String("IMAGE".into()), + Some("image extension"), + ); image.fill_header(&mut header); header.set("PCOUNT", HeaderValue::Integer(0), None); header.set("GCOUNT", HeaderValue::Integer(1), None); @@ -85,6 +97,39 @@ impl Hdu { } } + /// View this HDU as a tile-compressed image, if it is one. + /// + /// A tile-compressed image is stored on disk as a `BINTABLE` extension with + /// `ZIMAGE = T` (the FITS Tiled Image Compression convention). This is a *cheap* + /// detection step: it inspects the header and, when it matches, returns a + /// [`CompressedImage`] view that borrows the underlying [`BinTable`] — preserving + /// the original compressed tiles for lossless round-trip writing. + /// + /// Returns `None` when the HDU is not a compressed-image BINTABLE. The actual + /// decoding happens in [`CompressedImage::decompress`]: + /// + /// ```no_run + /// # use fits4::FitsFile; + /// # let fits = FitsFile::from_file("compressed.fits").unwrap(); + /// for hdu in fits.extensions() { + /// if let Some(cimg) = hdu.as_compressed_image() { + /// let image = cimg.decompress().unwrap(); + /// println!("{:?}", image.axes); + /// } + /// } + /// ``` + pub fn as_compressed_image(&self) -> Option> { + if !crate::tile_compress::CompressedImage::detect(&self.header) { + return None; + } + match &self.data { + HduData::BinTable(table) => { + crate::tile_compress::CompressedImage::from_bintable(&self.header, table).ok() + } + _ => None, + } + } + /// Read an HDU from a reader. pub fn read_from(reader: &mut R) -> Result { let header = Header::read_from(reader)?; diff --git a/src/header.rs b/src/header.rs index 82562ea..11be010 100644 --- a/src/header.rs +++ b/src/header.rs @@ -79,9 +79,7 @@ impl Header { let mut buf = [0u8; BLOCK_SIZE]; 'outer: loop { - reader - .read_exact(&mut buf) - .map_err(Error::Io)?; + reader.read_exact(&mut buf).map_err(Error::Io)?; for i in 0..RECORDS_PER_BLOCK { let start = i * RECORD_SIZE; @@ -89,7 +87,9 @@ impl Header { buf[start..start + RECORD_SIZE].try_into().unwrap(); // Check for END keyword - if &record[..8] == b"END " || record[..3] == *b"END" && record[3..8].iter().all(|&b| b == b' ') { + if &record[..8] == b"END " + || record[..3] == *b"END" && record[3..8].iter().all(|&b| b == b' ') + { break 'outer; } @@ -162,9 +162,9 @@ impl Header { for i in 1..=naxis { let key = format!("NAXIS{i}"); let n = self.require_int(&key)? as usize; - product = product.checked_mul(n).ok_or_else(|| { - Error::InvalidFormat("axis dimensions overflow".into()) - })?; + product = product + .checked_mul(n) + .ok_or_else(|| Error::InvalidFormat("axis dimensions overflow".into()))?; } let pcount = self.get_int("PCOUNT").unwrap_or(0) as usize; @@ -233,7 +233,11 @@ mod tests { #[test] fn header_round_trip() { let mut h = Header::new(); - h.set("SIMPLE", HeaderValue::Logical(true), Some("conforms to standard")); + h.set( + "SIMPLE", + HeaderValue::Logical(true), + Some("conforms to standard"), + ); h.set("BITPIX", HeaderValue::Integer(16), None); h.set("NAXIS", HeaderValue::Integer(0), None); diff --git a/src/image_conv.rs b/src/image_conv.rs index e9c395a..9e4d4b3 100644 --- a/src/image_conv.rs +++ b/src/image_conv.rs @@ -44,10 +44,14 @@ impl ImageData { _ => { // Normalize to u16 range let scaled = self.scaled_values(bscale, bzero); - let (min, max) = scaled.iter().fold((f64::MAX, f64::MIN), |(mn, mx), &v| { - (mn.min(v), mx.max(v)) - }); - let range = if (max - min).abs() < 1e-30 { 1.0 } else { max - min }; + let (min, max) = scaled + .iter() + .fold((f64::MAX, f64::MIN), |(mn, mx), &v| (mn.min(v), mx.max(v))); + let range = if (max - min).abs() < 1e-30 { + 1.0 + } else { + max - min + }; let pixels: Vec = scaled .iter() diff --git a/src/image_data.rs b/src/image_data.rs index 16aed08..33a76f2 100644 --- a/src/image_data.rs +++ b/src/image_data.rs @@ -1,5 +1,5 @@ use crate::error::{Error, Result}; -use crate::header::{Header}; +use crate::header::Header; use crate::keyword::HeaderValue; use crate::types::Bitpix; diff --git a/src/keyword.rs b/src/keyword.rs index 8e3908c..3ef44ac 100644 --- a/src/keyword.rs +++ b/src/keyword.rs @@ -204,7 +204,11 @@ impl Keyword { fn format_value(value: &HeaderValue) -> String { match value { HeaderValue::Logical(b) => { - if *b { "T".to_string() } else { "F".to_string() } + if *b { + "T".to_string() + } else { + "F".to_string() + } } HeaderValue::Integer(i) => format!("{i}"), HeaderValue::Float(f) => format_float(*f), @@ -443,7 +447,9 @@ fn parse_complex_value(s: &str) -> Result<(HeaderValue, Option)> { let inner = &s[1..close]; let parts: Vec<&str> = inner.split(',').collect(); if parts.len() != 2 { - return Err(Error::InvalidKeyword("complex value must have two components".into())); + return Err(Error::InvalidKeyword( + "complex value must have two components".into(), + )); } let rest = &s[close + 1..]; @@ -471,7 +477,11 @@ fn split_comment(s: &str) -> (&str, Option) { if let Some(pos) = s.find('/') { let val = &s[..pos]; let cmt = s[pos + 1..].trim(); - let comment = if cmt.is_empty() { None } else { Some(cmt.to_string()) }; + let comment = if cmt.is_empty() { + None + } else { + Some(cmt.to_string()) + }; (val, comment) } else { (s, None) @@ -482,7 +492,11 @@ fn parse_trailing_comment(s: &str) -> Option { let trimmed = s.trim_start(); if let Some(rest) = trimmed.strip_prefix('/') { let cmt = rest.trim(); - if cmt.is_empty() { None } else { Some(cmt.to_string()) } + if cmt.is_empty() { + None + } else { + Some(cmt.to_string()) + } } else { None } diff --git a/src/lib.rs b/src/lib.rs index 06c65d7..7c39eec 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -10,7 +10,12 @@ //! - **ASCII TABLE extension** — Aw, Iw, Fw.d, Ew.d, Dw.d column formats //! - **BINTABLE extension** — all type codes including variable-length arrays (P/Q descriptors) //! -//! Tile-compressed images (`ZIMAGE`) and random groups are not supported. +//! Tile-compressed images (`ZIMAGE`) are supported for **reading** integer images +//! (`RICE_1`; `GZIP_1`/`GZIP_2` behind the `gzip` feature) via +//! [`Hdu::as_compressed_image`](hdu::Hdu::as_compressed_image) + +//! [`CompressedImage::decompress`](tile_compress::CompressedImage::decompress). +//! Float quantization/dithering, `PLIO_1`, `HCOMPRESS_1`, and the write/compress +//! path are not yet implemented. Random groups are not supported. //! //! ## Quick start — reading //! @@ -91,28 +96,33 @@ //! //! - **`image`** — enables conversion between [`ImageData`] and the //! [`image`](https://crates.io/crates/image) crate's `DynamicImage`. +//! - **`gzip`** — enables decoding of `GZIP_1`/`GZIP_2` tile-compressed images via +//! the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate. The +//! default build stays dependency-free; `RICE_1` works without this feature. -pub mod error; -pub mod types; -pub mod keyword; -pub mod header; -pub mod io_utils; -pub mod image_data; pub mod ascii_table; pub mod bintable; pub mod checksum; -pub mod hdu; +pub mod error; pub mod fits; +pub mod hdu; +pub mod header; +pub mod image_data; +pub mod io_utils; +pub mod keyword; +pub mod tile_compress; +pub mod types; #[cfg(feature = "image")] pub mod image_conv; +pub use ascii_table::AsciiTable; +pub use bintable::{BinCellValue, BinColumn, BinColumnType, BinTable, BinTableBuilder}; pub use error::{Error, Result}; -pub use types::Bitpix; -pub use keyword::{HeaderValue, Keyword}; +pub use fits::FitsFile; +pub use hdu::{Hdu, HduData}; pub use header::Header; pub use image_data::{ImageData, PixelData}; -pub use ascii_table::AsciiTable; -pub use bintable::{BinTable, BinTableBuilder, BinColumn, BinColumnType, BinCellValue}; -pub use hdu::{Hdu, HduData}; -pub use fits::FitsFile; +pub use keyword::{HeaderValue, Keyword}; +pub use tile_compress::{CompressedImage, CompressionType, Quantize, TileGeometry}; +pub use types::Bitpix; diff --git a/src/tile_compress.rs b/src/tile_compress.rs new file mode 100644 index 0000000..ac5720b --- /dev/null +++ b/src/tile_compress.rs @@ -0,0 +1,1109 @@ +//! Tiled image compression (the BINTABLE-based compressed-image convention). +//! +//! A tile-compressed image is stored physically as a `BINTABLE` extension but is +//! logically an image. The original image is divided into a rectangular grid of +//! *tiles*; each tile is compressed independently and stored as a byte stream in a +//! variable-length-array column (typically `COMPRESSED_DATA`). One table row holds +//! one tile, in row-major (axis-1-fastest) order. +//! +//! See the *Tiled Image Convention for Storing Compressed Images in FITS Binary +//! Tables* (v2.3) and the cfitsio implementation. +//! +//! # Status +//! +//! This module is being built up in phases (see `COMPRESSION_PLAN.md`): +//! +//! - Phase 0 (current): scaffolding + the [`rice_decompress_i32`] family of +//! decoders with unit tests. +//! - Phase 1: RICE_1 integer read path ([`CompressedImage::decompress`]). +//! - Later: GZIP_1/GZIP_2, float quantization/dithering, PLIO_1, HCOMPRESS_1, and a +//! compression (write) path. +//! +//! Most of the high-level pipeline is currently stubbed with `todo!()`. + +use crate::bintable::BinTable; +use crate::error::{Error, Result}; +use crate::header::Header; + +/// Compression algorithm named by the `ZCMPTYPE` keyword. +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum CompressionType { + /// `RICE_1` (also `RICE_ONE`) — Rice coding of pixel differences. Most common. + Rice1, + /// `GZIP_1` — DEFLATE over the raw tile bytes. + Gzip1, + /// `GZIP_2` — DEFLATE over byte-shuffled tile bytes. + Gzip2, + /// `HCOMPRESS_1` — H-transform + quantization (lossy/lossless). + Hcompress1, + /// `PLIO_1` — IRAF pixel-list run-length encoding (integer masks). + Plio1, + /// `NOCOMPRESS` — tiles stored uncompressed. + NoCompress, +} + +impl CompressionType { + /// Parse a `ZCMPTYPE` keyword value. + pub fn parse(s: &str) -> Result { + match s.trim().to_ascii_uppercase().as_str() { + "RICE_1" | "RICE_ONE" => Ok(CompressionType::Rice1), + "GZIP_1" => Ok(CompressionType::Gzip1), + "GZIP_2" => Ok(CompressionType::Gzip2), + "HCOMPRESS_1" => Ok(CompressionType::Hcompress1), + "PLIO_1" => Ok(CompressionType::Plio1), + "NOCOMPRESS" => Ok(CompressionType::NoCompress), + other => Err(Error::UnsupportedCompression(other.to_string())), + } + } +} + +/// Float quantization / dithering method named by `ZQUANTIZ`. +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum Quantize { + /// No dithering (or keyword absent). + None, + /// `SUBTRACTIVE_DITHER_1`. + SubtractiveDither1, + /// `SUBTRACTIVE_DITHER_2` (preserves exact zeros). + SubtractiveDither2, +} + +impl Quantize { + /// Parse a `ZQUANTIZ` keyword value (absent ⇒ [`Quantize::None`]). + pub fn parse(s: Option<&str>) -> Result { + match s.map(|v| v.trim().to_ascii_uppercase()) { + None => Ok(Quantize::None), + Some(v) => match v.as_str() { + "NO_DITHER" => Ok(Quantize::None), + "SUBTRACTIVE_DITHER_1" => Ok(Quantize::SubtractiveDither1), + "SUBTRACTIVE_DITHER_2" => Ok(Quantize::SubtractiveDither2), + other => Err(Error::UnsupportedCompression(format!( + "ZQUANTIZ={other}" + ))), + }, + } + } +} + +/// Tile-grid geometry derived from `ZNAXIS`/`ZNAXISn`/`ZTILEn`. +#[derive(Debug, Clone)] +pub struct TileGeometry { + /// Original image dimensions (`ZNAXISn`), axis-1 first. + pub znaxis: Vec, + /// Tile dimensions (`ZTILEn`), axis-1 first. + pub ztile: Vec, +} + +impl TileGeometry { + /// Build geometry from the compressed-image header. + /// + /// Defaults to row-by-row tiling (`ZTILE1 = ZNAXIS1`, others = 1) when the + /// `ZTILEn` keywords are absent. + pub fn from_header(header: &Header) -> Result { + let znaxis = header.require_int("ZNAXIS")? as usize; + let mut dims = Vec::with_capacity(znaxis); + for i in 1..=znaxis { + dims.push(header.require_int(&format!("ZNAXIS{i}"))? as usize); + } + + let mut tile = Vec::with_capacity(znaxis); + for i in 1..=znaxis { + let default = if i == 1 { dims[0] } else { 1 }; + let t = header + .get_int(&format!("ZTILE{i}")) + .map(|v| v as usize) + .unwrap_or(default); + tile.push(t); + } + + Ok(TileGeometry { + znaxis: dims, + ztile: tile, + }) + } + + /// Number of tiles along each axis. + pub fn tiles_per_axis(&self) -> Vec { + self.znaxis + .iter() + .zip(&self.ztile) + .map(|(&n, &t)| n.div_ceil(t.max(1))) + .collect() + } + + /// Total number of tiles (== expected number of table rows). + pub fn num_tiles(&self) -> usize { + self.tiles_per_axis().iter().product() + } +} + +/// A compressed image read from a BINTABLE, ready to be decompressed. +/// +/// Borrows the already-parsed [`BinTable`]; the heavy lifting happens in +/// [`CompressedImage::decompress`]. +#[derive(Debug)] +pub struct CompressedImage<'a> { + /// Compression algorithm. + pub ctype: CompressionType, + /// BITPIX of the original (uncompressed) image. + pub zbitpix: i64, + /// Tile-grid geometry. + pub geometry: TileGeometry, + /// Float quantization/dither method. + pub quantize: Quantize, + /// Dither seed (`ZDITHER0`), if present. + pub zdither0: Option, + /// Integer blank sentinel (`ZBLANK` keyword), if present. + pub blank: Option, + /// RICE `BLOCKSIZE` parameter (default 32). + pub blocksize: usize, + /// RICE `BYTEPIX` parameter (default 4). + pub bytepix: usize, + /// The borrowed binary table holding the compressed tiles + heap. + /// Read once [`CompressedImage::decompress`] is implemented (phase 1). + #[allow(dead_code)] + pub(crate) table: &'a BinTable, +} + +impl<'a> CompressedImage<'a> { + /// True if `header` describes a tile-compressed image (`ZIMAGE = T`). + pub fn detect(header: &Header) -> bool { + header.get_bool("ZIMAGE").unwrap_or(false) + } + + /// Build a [`CompressedImage`] from a compressed-image BINTABLE header and the + /// parsed table. Parses the `Z*` driver keywords; does not yet decompress. + pub fn from_bintable(header: &Header, table: &'a BinTable) -> Result { + if !Self::detect(header) { + return Err(Error::CompressionError( + "header is not a tile-compressed image (ZIMAGE != T)".into(), + )); + } + + let ctype = CompressionType::parse( + header + .get_string("ZCMPTYPE") + .ok_or_else(|| Error::MissingKeyword("ZCMPTYPE".into()))?, + )?; + let zbitpix = header.require_int("ZBITPIX")?; + let geometry = TileGeometry::from_header(header)?; + let quantize = Quantize::parse(header.get_string("ZQUANTIZ"))?; + let zdither0 = header.get_int("ZDITHER0"); + let blank = header.get_int("ZBLANK"); + + // Default RICE parameters; overridden by ZNAMEi='BLOCKSIZE'/'BYTEPIX'. + let mut blocksize = 32usize; + let mut bytepix = 4usize; + let nzparams = count_zname_params(header); + for i in 1..=nzparams { + if let Some(name) = header.get_string(&format!("ZNAME{i}")) { + let val = header.get_int(&format!("ZVAL{i}")).unwrap_or(0) as usize; + match name.trim().to_ascii_uppercase().as_str() { + "BLOCKSIZE" => blocksize = val, + "BYTEPIX" => bytepix = val, + _ => {} + } + } + } + + Ok(CompressedImage { + ctype, + zbitpix, + geometry, + quantize, + zdither0, + blank, + blocksize, + bytepix, + table, + }) + } + + /// The compression algorithm (`ZCMPTYPE`). Cheap inspector. + pub fn compression(&self) -> CompressionType { + self.ctype + } + + /// The tile-grid geometry (`ZNAXISn`/`ZTILEn`). Cheap inspector. + pub fn geometry(&self) -> &TileGeometry { + &self.geometry + } + + /// Decompress all tiles and reassemble the full image into an [`ImageData`]. + /// + /// Implemented for integer `ZBITPIX` (8/16/32/64) with `RICE_1` (and, behind the + /// `gzip` feature, `GZIP_1`/`GZIP_2`). Each tile is read from the + /// `COMPRESSED_DATA` variable-length-array column, decoded to a flat array of + /// integers, and scattered into the correct sub-rectangle of the output buffer + /// per the tile grid (axis-1 fastest). `ZBLANK` integer sentinels are carried + /// through unchanged into the output array. + /// + /// # Not yet supported (see `COMPRESSION_PLAN.md`) + /// + /// - Float `ZBITPIX` (-32/-64): requires quantization inverse (`ZSCALE`/`ZZERO`) + /// and subtractive dithering — phase 3. + /// - `PLIO_1` — phase 4. + /// - `HCOMPRESS_1` — phase 5. + /// - `GZIP_1`/`GZIP_2` without the `gzip` feature return + /// [`Error::UnsupportedCompression`]. + pub fn decompress(&self) -> Result { + use crate::image_data::{ImageData, PixelData}; + use crate::types::Bitpix; + + // Float originals require quantization inverse + dithering (phase 3). + if self.zbitpix < 0 { + return Err(Error::UnsupportedCompression(format!( + "float ZBITPIX={} (quantization/dithering decode is phase 3, see COMPRESSION_PLAN.md)", + self.zbitpix + ))); + } + + let bitpix = Bitpix::from_i64(self.zbitpix)?; + let npix: usize = self.geometry.znaxis.iter().product(); + + // Decode every tile into i64 values, in tile order (axis-1 fastest), then + // scatter into the full-image buffer. + let mut full = vec![0i64; npix]; + let tiles_per_axis = self.geometry.tiles_per_axis(); + let num_tiles = self.geometry.num_tiles(); + + if self.table.nrows < num_tiles { + return Err(Error::CompressionError(format!( + "compressed image expects {} tiles but BINTABLE has {} rows", + num_tiles, self.table.nrows + ))); + } + + let cdata_col = self.compressed_data_column()?; + + for tile_index in 0..num_tiles { + let raw = self.tile_bytes(tile_index, cdata_col)?; + // Number of pixels in this (possibly edge-truncated) tile. + let coords = unravel(tile_index, &tiles_per_axis); + let tile_dims = self.tile_dims_at(&coords); + let tile_npix: usize = tile_dims.iter().product(); + + let values = self.decode_tile(&raw, tile_npix)?; + if values.len() != tile_npix { + return Err(Error::CompressionError(format!( + "tile {tile_index} decoded {} values, expected {tile_npix}", + values.len() + ))); + } + scatter_tile( + &mut full, + &self.geometry.znaxis, + &self.geometry.ztile, + &tile_dims, + &coords, + &values, + ); + } + + // Narrow i64 -> the target integer storage. + let pixels = match bitpix { + Bitpix::U8 => PixelData::U8(full.iter().map(|&v| v as u8).collect()), + Bitpix::I16 => PixelData::I16(full.iter().map(|&v| v as i16).collect()), + Bitpix::I32 => PixelData::I32(full.iter().map(|&v| v as i32).collect()), + Bitpix::I64 => PixelData::I64(full), + Bitpix::F32 | Bitpix::F64 => unreachable!("float handled above"), + }; + + Ok(ImageData::new(self.geometry.znaxis.clone(), pixels)) + } + + /// Resolve the index of the `COMPRESSED_DATA` column. + fn compressed_data_column(&self) -> Result { + self.table + .columns + .iter() + .position(|c| c.name.trim().eq_ignore_ascii_case("COMPRESSED_DATA")) + .ok_or_else(|| { + Error::CompressionError("compressed image has no COMPRESSED_DATA column".into()) + }) + } + + /// Read the raw compressed bytes for one tile (one table row) from the VLA column. + fn tile_bytes(&self, row: usize, col: usize) -> Result> { + match self.table.get_cell(row, col)? { + crate::bintable::BinCellValue::Bytes(b) => Ok(b), + other => Err(Error::CompressionError(format!( + "COMPRESSED_DATA cell is not a byte VLA: {other:?}" + ))), + } + } + + /// Tile dimensions at tile-grid coordinates `coords`, accounting for edge tiles + /// that are truncated when `ZNAXISn` is not a multiple of `ZTILEn`. + fn tile_dims_at(&self, coords: &[usize]) -> Vec { + let mut dims = Vec::with_capacity(coords.len()); + for (axis, &c) in coords.iter().enumerate() { + let n = self.geometry.znaxis[axis]; + let t = self.geometry.ztile[axis].max(1); + let start = c * t; + dims.push((n - start).min(t)); + } + dims + } + + /// Decode one tile's bytes into a flat array of `tile_npix` integer values + /// (axis-1 fastest) according to the compression type. + fn decode_tile(&self, raw: &[u8], tile_npix: usize) -> Result> { + match self.ctype { + CompressionType::Rice1 => match self.bytepix { + 1 => Ok(rice_decompress_i8(raw, tile_npix, self.blocksize)? + .into_iter() + .map(|v| v as i64) + .collect()), + 2 => Ok(rice_decompress_i16(raw, tile_npix, self.blocksize)? + .into_iter() + .map(|v| v as i64) + .collect()), + 4 => Ok(rice_decompress_i32(raw, tile_npix, self.blocksize)? + .into_iter() + .map(|v| v as i64) + .collect()), + other => Err(Error::CompressionError(format!( + "RICE_1 BYTEPIX={other} not supported (expected 1, 2, or 4)" + ))), + }, + CompressionType::NoCompress => self.bytes_to_ints(raw, tile_npix), + CompressionType::Gzip1 => { + let inflated = gzip_inflate(raw)?; + self.bytes_to_ints(&inflated, tile_npix) + } + CompressionType::Gzip2 => { + let inflated = gzip_inflate(raw)?; + let unshuffled = gzip2_unshuffle(&inflated, self.zbitpix_bytes()); + self.bytes_to_ints(&unshuffled, tile_npix) + } + CompressionType::Plio1 => Err(Error::UnsupportedCompression( + "PLIO_1 decode is phase 4 (see COMPRESSION_PLAN.md)".into(), + )), + CompressionType::Hcompress1 => Err(Error::UnsupportedCompression( + "HCOMPRESS_1 decode is phase 5 (see COMPRESSION_PLAN.md)".into(), + )), + } + } + + /// Bytes-per-pixel of the original (uncompressed) integer image. + fn zbitpix_bytes(&self) -> usize { + (self.zbitpix.unsigned_abs() as usize) / 8 + } + + /// Interpret a big-endian byte buffer as `tile_npix` integers of width + /// `ZBITPIX` (used by NOCOMPRESS and the GZIP codecs, which store the raw image + /// integers, not RICE-coded differences). + fn bytes_to_ints(&self, bytes: &[u8], tile_npix: usize) -> Result> { + let width = self.zbitpix_bytes(); + if width == 0 { + return Err(Error::CompressionError("invalid ZBITPIX width".into())); + } + let avail = bytes.len() / width; + if avail < tile_npix { + return Err(Error::CompressionError(format!( + "tile has {avail} integers, expected {tile_npix}" + ))); + } + let mut out = Vec::with_capacity(tile_npix); + for c in bytes.chunks_exact(width).take(tile_npix) { + let v = match width { + 1 => c[0] as i64, // ZBITPIX=8 is unsigned bytes + 2 => i16::from_be_bytes([c[0], c[1]]) as i64, + 4 => i32::from_be_bytes([c[0], c[1], c[2], c[3]]) as i64, + 8 => i64::from_be_bytes(c.try_into().unwrap()), + _ => return Err(Error::CompressionError("invalid ZBITPIX width".into())), + }; + out.push(v); + } + Ok(out) + } +} + +/// Convert a linear tile index into per-axis tile coordinates (axis-1 fastest). +fn unravel(mut index: usize, tiles_per_axis: &[usize]) -> Vec { + let mut coords = vec![0usize; tiles_per_axis.len()]; + for axis in 0..tiles_per_axis.len() { + let n = tiles_per_axis[axis].max(1); + coords[axis] = index % n; + index /= n; + } + coords +} + +/// Scatter a decoded tile's flat `values` (axis-1 fastest) into the full-image +/// buffer `full` (also axis-1 fastest), placing it at tile-grid `coords`. +/// +/// - `image_dims` are the full `ZNAXISn` (axis-1 first). +/// - `ztile` are the nominal tile sizes (`ZTILEn`); used to locate the tile origin. +/// - `tile_dims` are the (possibly edge-truncated) dimensions of *this* tile; used +/// to size the copy. They equal `ztile` except for tiles on a high edge where +/// `ZNAXISn` is not a multiple of `ZTILEn`. +fn scatter_tile( + full: &mut [i64], + image_dims: &[usize], + ztile: &[usize], + tile_dims: &[usize], + coords: &[usize], + values: &[i64], +) { + let ndim = image_dims.len(); + // Per-axis stride into the full image buffer (axis-1 fastest => stride 1). + let mut img_stride = vec![1usize; ndim]; + for axis in 1..ndim { + img_stride[axis] = img_stride[axis - 1] * image_dims[axis - 1]; + } + // Origin pixel of this tile in the full image: coords stepped by the *nominal* + // tile size along each axis. + let mut origin = 0usize; + for axis in 0..ndim { + origin += coords[axis] * ztile[axis].max(1) * img_stride[axis]; + } + + // Iterate over every pixel of the tile via its multi-index (axis-1 fastest). + let tile_npix: usize = tile_dims.iter().product(); + let mut tcoord = vec![0usize; ndim]; + for &val in values.iter().take(tile_npix) { + let mut dst = origin; + for axis in 0..ndim { + dst += tcoord[axis] * img_stride[axis]; + } + full[dst] = val; + // Increment the tile multi-index (axis-1 fastest). + for axis in 0..ndim { + tcoord[axis] += 1; + if tcoord[axis] < tile_dims[axis] { + break; + } + tcoord[axis] = 0; + } + } +} + +/// Count the contiguous `ZNAMEi`/`ZVALi` parameter pairs present in the header. +fn count_zname_params(header: &Header) -> usize { + let mut n = 0; + while header.find(&format!("ZNAME{}", n + 1)).is_some() { + n += 1; + } + n +} + +// --------------------------------------------------------------------------- +// RICE_1 decompression +// --------------------------------------------------------------------------- +// +// Port of cfitsio `fits_rdecomp` / `_short` / `_byte` (R. White, STScI). The first +// pixel is stored verbatim; subsequent pixels are stored as Rice-coded, zigzag-mapped +// differences from the previous pixel, in blocks of `blocksize` pixels. Each block is +// prefixed by `fsbits` bits giving `fs+1`: +// * fs < 0 -> all differences in the block are zero +// * fs == fsmax -> each difference is stored verbatim in `bbits` bits +// * else -> Rice code: unary leading zeros give the high bits, then `fs` +// low bits; combine as `(nzero << fs) | low`. + +/// A simple big-endian-ish MSB-first bit reader over a byte slice. +struct BitReader<'a> { + data: &'a [u8], + byte_pos: usize, + /// Number of valid bits remaining in the current `buffer` (0..=8). + bits_in_buf: u32, + buffer: u32, +} + +impl<'a> BitReader<'a> { + fn new(data: &'a [u8]) -> Self { + BitReader { + data, + byte_pos: 0, + bits_in_buf: 0, + buffer: 0, + } + } + + /// Read `n` bits (0..=32) MSB-first as an unsigned value. + fn read_bits(&mut self, n: u32) -> Result { + let mut result: u32 = 0; + let mut need = n; + while need > 0 { + if self.bits_in_buf == 0 { + let b = *self + .data + .get(self.byte_pos) + .ok_or_else(|| Error::CompressionError("RICE: unexpected end of stream".into()))?; + self.byte_pos += 1; + self.buffer = b as u32; + self.bits_in_buf = 8; + } + let take = need.min(self.bits_in_buf); + let shift = self.bits_in_buf - take; + let mask = if take == 32 { u32::MAX } else { (1u32 << take) - 1 }; + let bits = (self.buffer >> shift) & mask; + result = (result << take) | bits; + self.bits_in_buf -= take; + need -= take; + } + Ok(result) + } + + /// Count and consume leading zero bits, then consume the terminating one-bit. + /// Returns the number of zero bits seen. + fn count_leading_zeros(&mut self) -> Result { + let mut count = 0u32; + loop { + if self.bits_in_buf == 0 { + let b = *self + .data + .get(self.byte_pos) + .ok_or_else(|| Error::CompressionError("RICE: unexpected end of stream".into()))?; + self.byte_pos += 1; + self.buffer = b as u32; + self.bits_in_buf = 8; + } + // Inspect the top valid bit. + let top = (self.buffer >> (self.bits_in_buf - 1)) & 1; + self.bits_in_buf -= 1; + if top == 1 { + return Ok(count); + } + count += 1; + } + } +} + +/// Map a zigzag-encoded unsigned difference back to a signed difference. +#[inline] +fn unzigzag(u: u64) -> i64 { + if u & 1 == 0 { + (u >> 1) as i64 + } else { + !((u >> 1) as i64) + } +} + +/// Core RICE_1 decode shared by the 8/16/32-bit variants. +/// +/// `fsbits`/`fsmax`/`bbits` are the per-variant constants. Returns reconstructed +/// values as `i64` (caller narrows to the target type). +fn rice_decompress_core( + src: &[u8], + nvals: usize, + blocksize: usize, + fsbits: u32, + fsmax: u32, + bbits: u32, +) -> Result> { + if nvals == 0 { + return Ok(Vec::new()); + } + let blocksize = blocksize.max(1); + let mut out = Vec::with_capacity(nvals); + let mut reader = BitReader::new(src); + + // First value: bbits bits, verbatim (unsigned), interpreted as the raw integer. + // + // Note: the verbatim first value is the *seed* (`lastpix`), not itself a stored + // pixel. cfitsio computes `array[0] = lastpix + diff[0]` and decodes `nvals` + // differences in total (see `fits_rdecomp` in cfitsio `ricecomp.c`). Storing the + // seed as the first output pixel — and then decoding only `nvals - 1` diffs — + // shifts every value one position later, which is exactly wrong. + let first = reader.read_bits(bbits)? as u64; + let mut lastpix: i64 = sign_extend(first, bbits); + + while out.len() < nvals { + let fs_plus_1 = reader.read_bits(fsbits)?; + let fs = fs_plus_1 as i64 - 1; + let remaining = nvals - out.len(); + let block_n = remaining.min(blocksize); + + if fs < 0 { + // All differences zero. + for _ in 0..block_n { + out.push(lastpix); + } + } else if fs as u32 == fsmax { + // Verbatim differences in bbits bits each. + for _ in 0..block_n { + let raw = reader.read_bits(bbits)? as u64; + let diff = unzigzag(raw); + lastpix = lastpix.wrapping_add(diff); + out.push(lastpix); + } + } else { + let fs = fs as u32; + for _ in 0..block_n { + let high = reader.count_leading_zeros()? as u64; + let low = if fs > 0 { reader.read_bits(fs)? as u64 } else { 0 }; + let mapped = (high << fs) | low; + let diff = unzigzag(mapped); + lastpix = lastpix.wrapping_add(diff); + out.push(lastpix); + } + } + } + + Ok(out) +} + +/// Sign-extend the low `bits` bits of `v` into an `i64`. +#[inline] +fn sign_extend(v: u64, bits: u32) -> i64 { + if bits == 0 || bits >= 64 { + return v as i64; + } + let shift = 64 - bits; + ((v << shift) as i64) >> shift +} + +/// Decompress a RICE_1 stream of 32-bit integers (`BYTEPIX = 4`). +pub fn rice_decompress_i32(src: &[u8], nvals: usize, blocksize: usize) -> Result> { + let v = rice_decompress_core(src, nvals, blocksize, 5, 25, 32)?; + Ok(v.into_iter().map(|x| x as i32).collect()) +} + +/// Decompress a RICE_1 stream of 16-bit integers (`BYTEPIX = 2`). +pub fn rice_decompress_i16(src: &[u8], nvals: usize, blocksize: usize) -> Result> { + let v = rice_decompress_core(src, nvals, blocksize, 4, 14, 16)?; + Ok(v.into_iter().map(|x| x as i16).collect()) +} + +/// Decompress a RICE_1 stream of 8-bit integers (`BYTEPIX = 1`). +pub fn rice_decompress_i8(src: &[u8], nvals: usize, blocksize: usize) -> Result> { + let v = rice_decompress_core(src, nvals, blocksize, 3, 6, 8)?; + Ok(v.into_iter().map(|x| x as u8).collect()) +} + +// --------------------------------------------------------------------------- +// Other algorithms (stubs) +// --------------------------------------------------------------------------- + +/// Inflate a `GZIP_1` tile. +/// +/// In the FITS Tiled Image Compression convention, `GZIP_1` tiles are full gzip +/// members (RFC 1952: gzip header + DEFLATE body + CRC32/ISIZE trailer), exactly as +/// produced by zlib's `gzip` routines and cfitsio. This routine strips the gzip +/// wrapper and runs a raw DEFLATE inflate over the body. +/// +/// Available only with the `gzip` feature enabled; otherwise this returns +/// [`Error::UnsupportedCompression`]. +#[cfg(feature = "gzip")] +pub fn gzip_inflate(src: &[u8]) -> Result> { + let body = strip_gzip_wrapper(src)?; + miniz_oxide::inflate::decompress_to_vec(body) + .map_err(|e| Error::CompressionError(format!("DEFLATE inflate failed: {e:?}"))) +} + +/// Stub when the `gzip` feature is disabled: GZIP_1/GZIP_2 tiles cannot be decoded. +#[cfg(not(feature = "gzip"))] +pub fn gzip_inflate(_src: &[u8]) -> Result> { + Err(Error::UnsupportedCompression( + "GZIP_1/GZIP_2 require the `gzip` feature (miniz_oxide); rebuild with --features gzip".into(), + )) +} + +/// Parse and skip an RFC 1952 gzip member header, returning the DEFLATE body +/// (excluding the 8-byte CRC32+ISIZE trailer). Used by [`gzip_inflate`]. +#[cfg(feature = "gzip")] +fn strip_gzip_wrapper(src: &[u8]) -> Result<&[u8]> { + const FHCRC: u8 = 1 << 1; + const FEXTRA: u8 = 1 << 2; + const FNAME: u8 = 1 << 3; + const FCOMMENT: u8 = 1 << 4; + + if src.len() < 18 || src[0] != 0x1f || src[1] != 0x8b { + return Err(Error::CompressionError( + "GZIP_1 tile is not a valid gzip member (bad magic)".into(), + )); + } + if src[2] != 8 { + return Err(Error::CompressionError(format!( + "GZIP_1 unsupported compression method {}", + src[2] + ))); + } + let flags = src[3]; + let mut pos = 10usize; // fixed header: magic(2)+CM(1)+FLG(1)+MTIME(4)+XFL(1)+OS(1) + + let end = || Error::CompressionError("GZIP_1 truncated gzip header".to_string()); + + if flags & FEXTRA != 0 { + if pos + 2 > src.len() { + return Err(end()); + } + let xlen = u16::from_le_bytes([src[pos], src[pos + 1]]) as usize; + pos += 2 + xlen; + } + if flags & FNAME != 0 { + while pos < src.len() && src[pos] != 0 { + pos += 1; + } + pos += 1; // skip NUL + } + if flags & FCOMMENT != 0 { + while pos < src.len() && src[pos] != 0 { + pos += 1; + } + pos += 1; + } + if flags & FHCRC != 0 { + pos += 2; + } + if pos + 8 > src.len() { + return Err(end()); + } + // Body is everything between the header and the 8-byte trailer. + Ok(&src[pos..src.len() - 8]) +} + +/// `GZIP_2` stores the tile's integers with bytes shuffled into planes (all the +/// most-significant bytes first, then the next, etc.) before gzip. After inflating, +/// undo the shuffle to recover the original big-endian integer byte stream. +/// +/// `bytepix` is the bytes-per-pixel of the original integers (1/2/4/8). For +/// `bytepix <= 1` the shuffle is a no-op. +#[cfg(feature = "gzip")] +fn gzip2_unshuffle(shuffled: &[u8], bytepix: usize) -> Vec { + if bytepix <= 1 { + return shuffled.to_vec(); + } + let n = shuffled.len() / bytepix; + let mut out = vec![0u8; n * bytepix]; + for (i, chunk) in out.chunks_exact_mut(bytepix).enumerate().take(n) { + for (b, slot) in chunk.iter_mut().enumerate() { + *slot = shuffled[b * n + i]; + } + } + out +} + +/// `gzip2_unshuffle` is only reachable when the `gzip` feature is on (the GZIP_2 +/// branch errors out in `gzip_inflate` otherwise), so it is feature-gated to avoid a +/// dead-code warning in the default build. +#[cfg(not(feature = "gzip"))] +#[allow(dead_code)] +fn gzip2_unshuffle(shuffled: &[u8], _bytepix: usize) -> Vec { + shuffled.to_vec() +} + +/// Decompress a PLIO_1 (IRAF pixel-list RLE) tile into i32 mask values. TODO(phase 4). +pub fn plio_decompress(_src: &[u8], _nvals: usize) -> Result> { + todo!("phase 4: PLIO_1 pixel-list decode") +} + +#[cfg(test)] +mod tests { + use super::*; + + /// Minimal MSB-first bit writer mirroring [`BitReader`], for building test inputs. + struct BitWriter { + bytes: Vec, + cur: u8, + nbits: u32, + } + impl BitWriter { + fn new() -> Self { + BitWriter { + bytes: Vec::new(), + cur: 0, + nbits: 0, + } + } + fn put_bits(&mut self, val: u32, n: u32) { + for i in (0..n).rev() { + let bit = ((val >> i) & 1) as u8; + self.cur = (self.cur << 1) | bit; + self.nbits += 1; + if self.nbits == 8 { + self.bytes.push(self.cur); + self.cur = 0; + self.nbits = 0; + } + } + } + fn finish(mut self) -> Vec { + if self.nbits > 0 { + self.cur <<= 8 - self.nbits; + self.bytes.push(self.cur); + } + self.bytes + } + } + + fn zigzag(v: i64) -> u64 { + ((v << 1) ^ (v >> 63)) as u64 + } + + #[test] + fn unzigzag_round_trip() { + for v in [-5i64, -1, 0, 1, 2, 100, -100, i32::MIN as i64, i32::MAX as i64] { + assert_eq!(unzigzag(zigzag(v)), v); + } + } + + #[test] + fn bit_reader_basic() { + // 0b1011_0010, 0b1100_0000 + let data = [0b1011_0010u8, 0b1100_0000u8]; + let mut r = BitReader::new(&data); + assert_eq!(r.read_bits(4).unwrap(), 0b1011); + assert_eq!(r.read_bits(4).unwrap(), 0b0010); + assert_eq!(r.read_bits(2).unwrap(), 0b11); + } + + /// Hand-encode a small i32 RICE_1 stream (one block, fs known) and decode it. + /// + /// Mirrors cfitsio `fits_rcomp`: the first value is written verbatim as the *seed* + /// (`lastpix = vals[0]`), and `nvals` differences are written — the first of which + /// is `vals[0] - lastpix == 0`. The decoder then reconstructs `array[0] = seed + 0`. + #[test] + fn rice_i32_single_block_roundtrip() { + // Original values; encode diffs with a chosen fs so we control the layout. + let vals: Vec = vec![1000, 1003, 1001, 1005, 1002]; + let fs: u32 = 2; // low bits per code + let encoded = rice_encode_i32(&vals, fs); + + let decoded = rice_decompress_i32(&encoded, vals.len(), 32).unwrap(); + assert_eq!(decoded, vals); + } + + /// Zero-difference (constant) block: fs encoded as 0 (fs_plus_1 = 0 -> fs = -1). + #[test] + fn rice_i32_zero_block() { + let vals: Vec = vec![42, 42, 42, 42]; + let mut w = BitWriter::new(); + w.put_bits(vals[0] as u32, 32); + w.put_bits(0, 5); // fs_plus_1 = 0 => fs = -1 => all-zero block + let encoded = w.finish(); + let decoded = rice_decompress_i32(&encoded, vals.len(), 32).unwrap(); + assert_eq!(decoded, vals); + } + + #[test] + fn tile_geometry_default_row_by_row() { + let mut h = Header::new(); + h.set("ZNAXIS", HeaderValue::Integer(2), None); + h.set("ZNAXIS1", HeaderValue::Integer(100), None); + h.set("ZNAXIS2", HeaderValue::Integer(50), None); + let g = TileGeometry::from_header(&h).unwrap(); + assert_eq!(g.znaxis, vec![100, 50]); + assert_eq!(g.ztile, vec![100, 1]); // row-by-row default + assert_eq!(g.num_tiles(), 50); + } + + use crate::keyword::HeaderValue; + + // --- tile reassembly geometry -------------------------------------------- + + #[test] + fn unravel_axis1_fastest() { + // 3x2 grid of tiles: index increments axis-1 (x) fastest. + let tpa = [3usize, 2]; + assert_eq!(unravel(0, &tpa), vec![0, 0]); + assert_eq!(unravel(1, &tpa), vec![1, 0]); + assert_eq!(unravel(2, &tpa), vec![2, 0]); + assert_eq!(unravel(3, &tpa), vec![0, 1]); + assert_eq!(unravel(5, &tpa), vec![2, 1]); + } + + #[test] + fn scatter_full_tiles_2d() { + // 4x4 image, 2x2 tiles => 2x2 grid. Each tile filled with its index*10+local. + let image_dims = [4usize, 4]; + let ztile = [2usize, 2]; + let tiles_per_axis = [2usize, 2]; + let mut full = vec![-1i64; 16]; + for tile in 0..4 { + let coords = unravel(tile, &tiles_per_axis); + // tile-local values 0..4 (axis-1 fastest within the tile) + let vals: Vec = (0..4).map(|l| (tile as i64) * 100 + l as i64).collect(); + scatter_tile(&mut full, &image_dims, &ztile, &[2, 2], &coords, &vals); + } + // Expected layout (row-major, axis-1 fastest): + // row0: tile0[0] tile0[1] tile1[0] tile1[1] + // row1: tile0[2] tile0[3] tile1[2] tile1[3] + // row2: tile2[0] tile2[1] tile3[0] tile3[1] + // row3: tile2[2] tile2[3] tile3[2] tile3[3] + let expected = vec![ + 0, 1, 100, 101, // row0 + 2, 3, 102, 103, // row1 + 200, 201, 300, 301, // row2 + 202, 203, 302, 303, // row3 + ]; + assert_eq!(full, expected); + } + + #[test] + fn scatter_edge_truncated_tiles() { + // 3x3 image with 2x2 tiles => 2x2 grid, but edge tiles are 1 wide/tall. + let image_dims = [3usize, 3]; + let ztile = [2usize, 2]; + let tiles_per_axis = [2usize, 2]; + // tile dims: (0,0)=2x2, (1,0)=1x2, (0,1)=2x1, (1,1)=1x1 + let tile_dims = [vec![2, 2], vec![1, 2], vec![2, 1], vec![1, 1]]; + let mut full = vec![-1i64; 9]; + for tile in 0..4 { + let coords = unravel(tile, &tiles_per_axis); + let td = &tile_dims[tile]; + let n: usize = td.iter().product(); + let vals: Vec = (0..n).map(|l| (tile as i64) * 100 + l as i64).collect(); + scatter_tile(&mut full, &image_dims, &ztile, td, &coords, &vals); + } + // No -1 should remain; every pixel covered exactly once. + assert!(!full.contains(&-1)); + // Spot check origins: tile0 at (0,0)->index0; tile1 (x-tile 1) origin x=2 row0 => index2. + assert_eq!(full[0], 0); // tile0 local0 + assert_eq!(full[2], 100); // tile1 local0 + assert_eq!(full[6], 200); // tile2 local0 (row2, x0 => index 6) + assert_eq!(full[8], 300); // tile3 local0 (row2, x2 => index 8) + } + + // --- end-to-end RICE_1 decompress ---------------------------------------- + + /// Build a one-block (<=32 values) i32 RICE_1 stream for `vals` with low-bits `fs`. + /// + /// Mirrors cfitsio `fits_rcomp`: writes `vals[0]` verbatim as the seed, sets + /// `lastpix = vals[0]`, then emits exactly `vals.len()` differences — the first + /// being `vals[0] - lastpix == 0`. + fn rice_encode_i32(vals: &[i32], fs: u32) -> Vec { + let mut w = BitWriter::new(); + w.put_bits(vals[0] as u32, 32); + w.put_bits(fs + 1, 5); + let mut last = vals[0]; + for &v in vals { + let diff = (v - last) as i64; + last = v; + let mapped = zigzag(diff); + let high = (mapped >> fs) as u32; + let low = (mapped & ((1 << fs) - 1)) as u32; + for _ in 0..high { + w.put_bits(0, 1); + } + w.put_bits(1, 1); + w.put_bits(low, fs); + } + w.finish() + } + + #[test] + fn end_to_end_rice1_i32_two_tiles() { + use crate::bintable::{BinColumnType, BinTableBuilder}; + + // 4x1 image, tile = 2x1 => two row tiles of 2 pixels each. + let tile0 = [1000i32, 1003]; + let tile1 = [50i32, 47]; + let enc0 = rice_encode_i32(&tile0, 2); + let enc1 = rice_encode_i32(&tile1, 2); + + let table = BinTableBuilder::new() + .add_column("COMPRESSED_DATA", BinColumnType::VarP('B')) + .push_row(|r| r.write_var_p(enc0.len() as i32, |heap| heap.extend_from_slice(&enc0))) + .push_row(|r| r.write_var_p(enc1.len() as i32, |heap| heap.extend_from_slice(&enc1))) + .build(); + + let mut h = Header::new(); + h.set("ZIMAGE", HeaderValue::Logical(true), None); + h.set("ZCMPTYPE", HeaderValue::String("RICE_1".into()), None); + h.set("ZBITPIX", HeaderValue::Integer(32), None); + h.set("ZNAXIS", HeaderValue::Integer(1), None); + h.set("ZNAXIS1", HeaderValue::Integer(4), None); + h.set("ZTILE1", HeaderValue::Integer(2), None); + // BYTEPIX = 4 for i32 RICE. + h.set("ZNAME1", HeaderValue::String("BYTEPIX".into()), None); + h.set("ZVAL1", HeaderValue::Integer(4), None); + + let cimg = CompressedImage::from_bintable(&h, &table).unwrap(); + assert_eq!(cimg.compression(), CompressionType::Rice1); + assert_eq!(cimg.geometry().num_tiles(), 2); + + let img = cimg.decompress().unwrap(); + assert_eq!(img.axes, vec![4]); + match img.pixels { + crate::image_data::PixelData::I32(v) => { + assert_eq!(v, vec![1000, 1003, 50, 47]); + } + other => panic!("expected I32, got {other:?}"), + } + } + + #[cfg(feature = "gzip")] + #[test] + fn end_to_end_gzip1_i16_one_tile() { + use crate::bintable::{BinColumnType, BinTableBuilder}; + + // 4x1 i16 image, one row tile. GZIP_1 stores raw big-endian image integers. + let vals: [i16; 4] = [100, -200, 300, -400]; + let mut raw = Vec::new(); + for v in vals { + raw.extend_from_slice(&v.to_be_bytes()); + } + // Wrap raw bytes as a gzip member (header + raw DEFLATE + trailer). + let body = miniz_oxide::deflate::compress_to_vec(&raw, 6); + let mut gz = vec![0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 0xff]; + gz.extend_from_slice(&body); + let crc = crc32(&raw); + gz.extend_from_slice(&crc.to_le_bytes()); + gz.extend_from_slice(&(raw.len() as u32).to_le_bytes()); + + let table = BinTableBuilder::new() + .add_column("COMPRESSED_DATA", BinColumnType::VarP('B')) + .push_row(|r| r.write_var_p(gz.len() as i32, |heap| heap.extend_from_slice(&gz))) + .build(); + + let mut h = Header::new(); + h.set("ZIMAGE", HeaderValue::Logical(true), None); + h.set("ZCMPTYPE", HeaderValue::String("GZIP_1".into()), None); + h.set("ZBITPIX", HeaderValue::Integer(16), None); + h.set("ZNAXIS", HeaderValue::Integer(1), None); + h.set("ZNAXIS1", HeaderValue::Integer(4), None); + + let cimg = CompressedImage::from_bintable(&h, &table).unwrap(); + let img = cimg.decompress().unwrap(); + match img.pixels { + crate::image_data::PixelData::I16(v) => assert_eq!(v, vec![100, -200, 300, -400]), + other => panic!("expected I16, got {other:?}"), + } + } + + /// Minimal CRC32 (gzip/PNG polynomial) for building gzip test fixtures. + #[cfg(feature = "gzip")] + fn crc32(data: &[u8]) -> u32 { + let mut crc: u32 = 0xffff_ffff; + for &byte in data { + crc ^= byte as u32; + for _ in 0..8 { + let mask = (crc & 1).wrapping_neg(); + crc = (crc >> 1) ^ (0xedb8_8320 & mask); + } + } + !crc + } + + #[cfg(not(feature = "gzip"))] + #[test] + fn gzip_disabled_errors() { + assert!(matches!( + gzip_inflate(&[0x1f, 0x8b]), + Err(Error::UnsupportedCompression(_)) + )); + } + + #[test] + fn float_zbitpix_is_unsupported() { + use crate::bintable::{BinColumnType, BinTableBuilder}; + let table = BinTableBuilder::new() + .add_column("COMPRESSED_DATA", BinColumnType::VarP('B')) + .push_row(|r| r.write_var_p(0, |_| {})) + .build(); + let mut h = Header::new(); + h.set("ZIMAGE", HeaderValue::Logical(true), None); + h.set("ZCMPTYPE", HeaderValue::String("RICE_1".into()), None); + h.set("ZBITPIX", HeaderValue::Integer(-32), None); + h.set("ZNAXIS", HeaderValue::Integer(1), None); + h.set("ZNAXIS1", HeaderValue::Integer(2), None); + let cimg = CompressedImage::from_bintable(&h, &table).unwrap(); + assert!(matches!( + cimg.decompress(), + Err(Error::UnsupportedCompression(_)) + )); + } +} diff --git a/src/types.rs b/src/types.rs index a07e163..a593d0f 100644 --- a/src/types.rs +++ b/src/types.rs @@ -62,7 +62,14 @@ mod tests { #[test] fn bitpix_round_trip() { - for bp in [Bitpix::U8, Bitpix::I16, Bitpix::I32, Bitpix::I64, Bitpix::F32, Bitpix::F64] { + for bp in [ + Bitpix::U8, + Bitpix::I16, + Bitpix::I32, + Bitpix::I64, + Bitpix::F32, + Bitpix::F64, + ] { assert_eq!(Bitpix::from_i64(bp.to_i64()).unwrap(), bp); } } diff --git a/tests/checksum.rs b/tests/checksum.rs index 5057cb2..b9fef5b 100644 --- a/tests/checksum.rs +++ b/tests/checksum.rs @@ -1,5 +1,5 @@ -use fits4::*; use fits4::checksum; +use fits4::*; #[test] fn write_with_checksum_and_verify() { diff --git a/tests/image_conv.rs b/tests/image_conv.rs index 44282b9..f751911 100644 --- a/tests/image_conv.rs +++ b/tests/image_conv.rs @@ -64,10 +64,7 @@ fn u16_full_range() { #[test] fn f32_normalizes_to_u16() { - let img_data = ImageData::new( - vec![4], - PixelData::F32(vec![0.0, 1.0, 0.5, 0.25]), - ); + let img_data = ImageData::new(vec![4], PixelData::F32(vec![0.0, 1.0, 0.5, 0.25])); let back = img_data.to_dynamic_image(1.0, 0.0).unwrap(); match back { diff --git a/tests/round_trip.rs b/tests/round_trip.rs index 58855de..6dee120 100644 --- a/tests/round_trip.rs +++ b/tests/round_trip.rs @@ -228,7 +228,7 @@ fn round_trip_bscale_bzero() { #[test] fn round_trip_bintable() { - use fits4::bintable::{BinColumn, BinColumnType, BinCellValue}; + use fits4::bintable::{BinCellValue, BinColumn, BinColumnType}; // Build a simple bintable with 2 rows, 3 columns: i32, f64, 8-char string let col_j = BinColumn { @@ -325,7 +325,7 @@ fn round_trip_bintable() { #[test] fn round_trip_bintable_vla() { - use fits4::bintable::{BinColumn, BinColumnType, BinCellValue}; + use fits4::bintable::{BinCellValue, BinColumn, BinColumnType}; // Variable-length array column using P descriptor let col = BinColumn { @@ -454,13 +454,17 @@ fn multiple_extensions() { // Add image extension let img = ImageData::new(vec![2], PixelData::F32(vec![1.0, 2.0])); let mut img_hdu = Hdu::image_extension(img); - img_hdu.header.set("EXTNAME", HeaderValue::String("MYIMAGE".into()), None); + img_hdu + .header + .set("EXTNAME", HeaderValue::String("MYIMAGE".into()), None); fits.push_extension(img_hdu); // Add another image extension let img2 = ImageData::new(vec![3], PixelData::U8(vec![10, 20, 30])); let mut img_hdu2 = Hdu::image_extension(img2); - img_hdu2.header.set("EXTNAME", HeaderValue::String("OTHER".into()), None); + img_hdu2 + .header + .set("EXTNAME", HeaderValue::String("OTHER".into()), None); fits.push_extension(img_hdu2); let bytes = fits.to_bytes().unwrap(); @@ -488,21 +492,33 @@ fn block_alignment() { #[test] fn header_keyword_preservation() { let mut fits = FitsFile::with_empty_primary(); - fits.primary_mut().header.push(Keyword::commentary("COMMENT", "Test comment line")); - fits.primary_mut().header.push(Keyword::commentary("HISTORY", "Created by fits4 tests")); - fits.primary_mut().header.set("AUTHOR", HeaderValue::String("test".into()), None); + fits.primary_mut() + .header + .push(Keyword::commentary("COMMENT", "Test comment line")); + fits.primary_mut() + .header + .push(Keyword::commentary("HISTORY", "Created by fits4 tests")); + fits.primary_mut() + .header + .set("AUTHOR", HeaderValue::String("test".into()), None); let bytes = fits.to_bytes().unwrap(); let fits2 = FitsFile::from_bytes(&bytes).unwrap(); assert_eq!(fits2.primary().header.get_string("AUTHOR"), Some("test")); - let comments: Vec<_> = fits2.primary().header.iter() + let comments: Vec<_> = fits2 + .primary() + .header + .iter() .filter(|k| k.name == "COMMENT") .collect(); assert_eq!(comments.len(), 1); - let history: Vec<_> = fits2.primary().header.iter() + let history: Vec<_> = fits2 + .primary() + .header + .iter() .filter(|k| k.name == "HISTORY") .collect(); assert_eq!(history.len(), 1); diff --git a/tests/sample_files.rs b/tests/sample_files.rs index b561ead..981877c 100644 --- a/tests/sample_files.rs +++ b/tests/sample_files.rs @@ -30,7 +30,11 @@ fn read_euv_image() { assert_eq!(primary.header.get_int("NAXIS"), Some(0)); assert!(matches!(primary.data, HduData::Empty)); - assert!(fits.len() > 1, "expected extensions, got {} HDUs", fits.len()); + assert!( + fits.len() > 1, + "expected extensions, got {} HDUs", + fits.len() + ); let mut image_count = 0; let mut bintable_count = 0; @@ -188,3 +192,106 @@ fn all_sample_files_readable() { ); } } + +// --- Tiled-image compression: RICE_1 / GZIP_1 round-trip vs originals -------- +// +// Each compressed `.fz` fixture has the SAME HDU ordering as its uncompressed +// source: every `HduData::Image` HDU in the source appears, in order, as a +// compressed-image `HduData::BinTable` (ZIMAGE=T) in the `.fz`. We decompress +// each and assert byte-exact equality of the pixel buffers. +// +// These are guarded per-fixture with `Path::exists()` so they skip cleanly on a +// machine without the (gitignored, GCS-hosted) fixtures. + +/// Assert that decompressing every compressed-image HDU in `fz_name` reproduces, +/// byte-for-byte, the corresponding `HduData::Image` HDUs of `src_name`. +fn assert_rice_roundtrip(src_name: &str, fz_name: &str) { + let fz_path = samp(fz_name); + if !fz_path.exists() { + eprintln!("skipping: fixture {fz_name} not present"); + return; + } + let src = FitsFile::from_file(samp(src_name)).expect("read source"); + let fz = FitsFile::from_file(&fz_path).expect("read .fz"); + + // Source image HDUs, in order. + let src_images: Vec<&ImageData> = src + .hdus + .iter() + .filter_map(|h| match &h.data { + HduData::Image(im) if !im.pixels.to_bytes().is_empty() => Some(im), + _ => None, + }) + .collect(); + + // Compressed-image HDUs in the .fz, in order. + let mut matched = 0usize; + let mut src_iter = src_images.iter(); + for hdu in &fz.hdus { + if let Some(cimg) = hdu.as_compressed_image() { + let orig = src_iter + .next() + .expect("more compressed images than source images"); + let dec = cimg.decompress().expect("decompress"); + assert_eq!( + dec.axes, orig.axes, + "{fz_name}: axes mismatch on compressed HDU #{matched}" + ); + assert_eq!( + dec.pixels.to_bytes(), + orig.pixels.to_bytes(), + "{fz_name}: pixel bytes differ on compressed HDU #{matched}" + ); + matched += 1; + } + } + assert!(matched > 0, "{fz_name}: found no compressed-image HDUs"); +} + +#[test] +fn rice_roundtrip_euv_row_tiled() { + // 512x512 I16, RICE row tiling (512x1 = one tile per row). + assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.rice.fits.fz"); +} + +#[test] +fn rice_roundtrip_euv_square_tiled() { + // Same source, forced 100x100 tiles: exercises 2-D edge-truncated tiling. + assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.rice_t100.fits.fz"); +} + +#[test] +fn rice_roundtrip_fgs_i32() { + // 89688x7 I32 image. + assert_rice_roundtrip("FGSf64y0106m_a1f.fits", "FGSf64y0106m_a1f.rice.fits.fz"); +} + +/// Float (`ZBITPIX < 0`) RICE fixtures are out of scope: decoding must return the +/// documented `Err(UnsupportedCompression)` rather than wrong pixels. +#[test] +fn rice_float_fixture_is_unsupported() { + let fz_path = samp("FOCx38i0101t_c0f.rice_nodith.fits.fz"); + if !fz_path.exists() { + eprintln!("skipping: fixture not present"); + return; + } + let fz = FitsFile::from_file(&fz_path).expect("read .fz"); + let mut checked = false; + for hdu in &fz.hdus { + if let Some(cimg) = hdu.as_compressed_image() { + assert!( + matches!(cimg.decompress(), Err(Error::UnsupportedCompression(_))), + "float compressed image should be unsupported" + ); + checked = true; + } + } + assert!(checked, "expected a compressed-image HDU"); +} + +#[cfg(feature = "gzip")] +#[test] +fn gzip1_roundtrip_euv() { + // GZIP_1 stores raw big-endian image integers per tile; needs the `gzip` feature. + assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.gzip1.fits.fz"); +} From b30b3f263d2f07bc90b879355b23809daaa4ec9a Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 14:18:45 -0400 Subject: [PATCH 2/9] Add float tile decompression: quantization + subtractive dithering Phase 3 of FITS tiled-image compression: decode floating-point compressed images (RICE/GZIP-quantized) back to F32/F64. - unquantize/decompress_float in tile_compress.rs: per-tile ZSCALE/ZZERO columns, ZQUANTIZ NO_DITHER / SUBTRACTIVE_DITHER_1 / SUBTRACTIVE_DITHER_2, ZBLANK -> NaN, DITHER_2 exact-zero preservation, and lossless-float (ZQUANTIZ=NONE) passthrough. - Port cfitsio's dither RNG exactly: Park-Miller LCG (a=16807, m=2^31-1, seed 1) -> 10000 f32s via OnceLock; iseed = (tile + ZDITHER0 - 1) % 10000. - Reconstruct with a fused multiply-add (f64::mul_add) to bit-match cfitsio/funpack; without the FMA, large-ZZERO cancellation pixels diverge by ~1 ULP. - Tests: 6 float cases asserting bit-identity to the funpack CLI at test time, skip-if-absent on both fixture and binary so CI stays green without cfitsio. Added a SUBTRACTIVE_DITHER_2 fixture (fpack -qz5). - CI: float fixtures added to best-effort download, cache key v2 -> v3, clippy/test switched to --all-features. HCOMPRESS_1, PLIO_1, and the write path remain TODO. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/ci.yml | 18 +- scripts/gen_compressed_fixtures.sh | 7 +- src/tile_compress.rs | 425 +++++++++++++++++++++++++++-- tests/sample_files.rs | 167 +++++++++++- 4 files changed, 583 insertions(+), 34 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index ace4be0..3df1082 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -26,7 +26,7 @@ jobs: uses: actions/cache@v4 with: path: samp - key: fits4-samples-v2 + key: fits4-samples-v3 - name: Download sample FITS files if: steps.cache-samples.outputs.cache-hit != 'true' @@ -50,12 +50,21 @@ jobs: FGSf64y0106m_a1f.rice.fits.fz \ EUVEngc4151imgx.gzip1.fits.fz \ EUVEngc4151imgx.gzip2.fits.fz \ - FOCx38i0101t_c0f.rice_nodith.fits.fz; do + FOCx38i0101t_c0f.rice_nodith.fits.fz \ + FOCx38i0101t_c0f.rice_dith.fits.fz \ + FOCx38i0101t_c0f.rice_dith2.fits.fz \ + FOCx38i0101t_c0f.gzip_lossless.fits.fz \ + WFPC2u5780205r_c0fx.rice_dith.fits.fz \ + WFPC2u5780205r_c0fx.rice_nodith.fits.fz; do curl -sSfL "$BASE/$f" -o "samp/$f" || true done + # The float tile-compression tests validate against the `funpack` CLI + # (cfitsio). It is not installed here, so those tests skip cleanly; the + # integer-codec round-trip tests still run against the downloaded fixtures. + - name: Clippy - run: cargo clippy --all-targets --features image -- -D warnings + run: cargo clippy --all-targets --all-features -- -D warnings - name: Test run: cargo test @@ -63,5 +72,8 @@ jobs: - name: Test (with image feature) run: cargo test --features image + - name: Test (all features) + run: cargo test --all-features + - name: Doc run: cargo doc --no-deps --features image diff --git a/scripts/gen_compressed_fixtures.sh b/scripts/gen_compressed_fixtures.sh index c6bdbd2..9e56922 100755 --- a/scripts/gen_compressed_fixtures.sh +++ b/scripts/gen_compressed_fixtures.sh @@ -105,9 +105,10 @@ compress FGSf64y0106m_a1f.fits gzip1 -g # subtractive dithering; `-q0 ` (no space) quantizes with NO dithering # (ZQUANTIZ=NO_DITHER); lossless float (`-q 0`, level zero) is GZIP-only. # FOCx38i0101t_c0f.fits is F32 1024x1024. -compress FOCx38i0101t_c0f.fits rice_dith -r -q 16 # quantize + subtractive dither -compress FOCx38i0101t_c0f.fits rice_nodith -r -q0 16 # quantize, no dithering -compress FOCx38i0101t_c0f.fits gzip_lossless -g -q 0 # lossless float (no quantization) +compress FOCx38i0101t_c0f.fits rice_dith -r -q 16 # quantize + SUBTRACTIVE_DITHER_1 +compress FOCx38i0101t_c0f.fits rice_dith2 -r -qz5 16 # SUBTRACTIVE_DITHER_2, fixed seed 5 +compress FOCx38i0101t_c0f.fits rice_nodith -r -q0 16 # quantize, NO_DITHER +compress FOCx38i0101t_c0f.fits gzip_lossless -g -q 0 # lossless float (no quantization) compress FOCx38i0101t_c0f.fits hcomp -h # WFPC2u5780205r_c0fx.fits is an F32 200x200x4 cube (multi-axis tiling). diff --git a/src/tile_compress.rs b/src/tile_compress.rs index ac5720b..94790ac 100644 --- a/src/tile_compress.rs +++ b/src/tile_compress.rs @@ -74,7 +74,10 @@ impl Quantize { match s.map(|v| v.trim().to_ascii_uppercase()) { None => Ok(Quantize::None), Some(v) => match v.as_str() { - "NO_DITHER" => Ok(Quantize::None), + // `NONE` is written by cfitsio for losslessly-stored float tiles + // (GZIP of the raw floats, no quantization) and is treated like the + // absent case here. + "NO_DITHER" | "NONE" => Ok(Quantize::None), "SUBTRACTIVE_DITHER_1" => Ok(Quantize::SubtractiveDither1), "SUBTRACTIVE_DITHER_2" => Ok(Quantize::SubtractiveDither2), other => Err(Error::UnsupportedCompression(format!( @@ -238,10 +241,13 @@ impl<'a> CompressedImage<'a> { /// per the tile grid (axis-1 fastest). `ZBLANK` integer sentinels are carried /// through unchanged into the output array. /// + /// Float `ZBITPIX` (-32/-64) images are decoded via [`Self::decompress_float`], + /// which inverts the per-tile `ZSCALE`/`ZZERO` quantization and reverses cfitsio's + /// subtractive dithering (`ZQUANTIZ` = `SUBTRACTIVE_DITHER_1`/`_2`); `ZBLANK` + /// sentinels map to `NaN`. + /// /// # Not yet supported (see `COMPRESSION_PLAN.md`) /// - /// - Float `ZBITPIX` (-32/-64): requires quantization inverse (`ZSCALE`/`ZZERO`) - /// and subtractive dithering — phase 3. /// - `PLIO_1` — phase 4. /// - `HCOMPRESS_1` — phase 5. /// - `GZIP_1`/`GZIP_2` without the `gzip` feature return @@ -250,12 +256,11 @@ impl<'a> CompressedImage<'a> { use crate::image_data::{ImageData, PixelData}; use crate::types::Bitpix; - // Float originals require quantization inverse + dithering (phase 3). + // Float originals (ZBITPIX = -32 / -64) take a separate path: each tile holds + // either quantized integers (with per-tile ZSCALE/ZZERO) or, when quantization + // was not applied, the raw floats. See `decompress_float`. if self.zbitpix < 0 { - return Err(Error::UnsupportedCompression(format!( - "float ZBITPIX={} (quantization/dithering decode is phase 3, see COMPRESSION_PLAN.md)", - self.zbitpix - ))); + return self.decompress_float(); } let bitpix = Bitpix::from_i64(self.zbitpix)?; @@ -312,6 +317,231 @@ impl<'a> CompressedImage<'a> { Ok(ImageData::new(self.geometry.znaxis.clone(), pixels)) } + /// Decompress a float (`ZBITPIX = -32` / `-64`) tile-compressed image. + /// + /// Float images are normally stored as per-tile linearly-quantized 32-bit + /// integers plus per-tile `ZSCALE`/`ZZERO` scale/offset columns; the integers are + /// reconstructed exactly like the integer path, then *unquantized* back to floats. + /// cfitsio reverses the subtractive-dithering it applied at compress time using a + /// fixed pseudo-random sequence (see [`fits_rand_value`]). + /// + /// A tile that could not be quantized (e.g. one containing only NaNs, or when + /// lossless `GZIP` of the raw floats was requested) is stored instead as the raw + /// big-endian float bytes — either in a `GZIP_COMPRESSED_DATA`/`UNCOMPRESSED_DATA` + /// fallback column for that single tile, or, for a wholly-lossless image, in + /// `COMPRESSED_DATA` with no `ZSCALE`/`ZZERO` columns at all. Such tiles are passed + /// through verbatim. + fn decompress_float(&self) -> Result { + use crate::image_data::{ImageData, PixelData}; + + // -32 -> 4-byte floats, -64 -> 8-byte floats. + let is_f64 = self.zbitpix == -64; + let npix: usize = self.geometry.znaxis.iter().product(); + + let tiles_per_axis = self.geometry.tiles_per_axis(); + let num_tiles = self.geometry.num_tiles(); + if self.table.nrows < num_tiles { + return Err(Error::CompressionError(format!( + "compressed image expects {} tiles but BINTABLE has {} rows", + num_tiles, self.table.nrows + ))); + } + + let cdata_col = self.compressed_data_column()?; + let gzip_fallback_col = self.find_column("GZIP_COMPRESSED_DATA"); + let uncompressed_col = self.find_column("UNCOMPRESSED_DATA"); + let zscale_col = self.find_column("ZSCALE"); + let zzero_col = self.find_column("ZZERO"); + let zblank_col = self.find_column("ZBLANK"); + + // Reconstructed floats, axis-1 fastest, scattered tile-by-tile. We scatter via + // the integer `scatter_tile` over a bit-reinterpreted buffer so the existing + // (well-tested) geometry code is reused. + let mut full_bits = vec![0i64; npix]; + + for tile_index in 0..num_tiles { + let coords = unravel(tile_index, &tiles_per_axis); + let tile_dims = self.tile_dims_at(&coords); + let tile_npix: usize = tile_dims.iter().product(); + + // Per-tile scale/zero (D columns). A sentinel/absent value marks an + // unquantized (raw-float) tile. + let zscale = zscale_col.and_then(|c| self.tile_double(tile_index, c)); + let zzero = zzero_col.and_then(|c| self.tile_double(tile_index, c)); + // Per-tile blank sentinel (J column) overrides the ZBLANK keyword. + let blank = zblank_col + .and_then(|c| self.tile_int(tile_index, c)) + .or(self.blank); + + let cdata = self.tile_bytes(tile_index, cdata_col)?; + + let tile_floats: Vec = match (zscale, zzero) { + // Quantized tile: COMPRESSED_DATA holds quantized integers. + (Some(scale), Some(zero)) if !cdata.is_empty() && is_quantized(scale) => { + let q = self.decode_tile(&cdata, tile_npix)?; + self.unquantize(&q, tile_index, scale, zero, blank, tile_npix) + } + // Unquantized tile (or whole image is lossless): raw floats live in + // COMPRESSED_DATA, or in a per-tile fallback column when COMPRESSED_DATA + // is empty. + _ => { + let (bytes, gzipped) = if !cdata.is_empty() { + // For a lossless GZIP image the raw floats are gzip-compressed + // here; for NOCOMPRESS they are verbatim. + (cdata, self.ctype != CompressionType::NoCompress) + } else if let Some(b) = + gzip_fallback_col.and_then(|c| self.tile_bytes(tile_index, c).ok()) + { + (b, true) + } else if let Some(b) = + uncompressed_col.and_then(|c| self.tile_bytes(tile_index, c).ok()) + { + (b, false) + } else { + return Err(Error::CompressionError(format!( + "float tile {tile_index} has no quantization scale and no raw-float fallback data" + ))); + }; + let raw = if gzipped { gzip_inflate(&bytes)? } else { bytes }; + raw_floats(&raw, tile_npix, is_f64)? + } + }; + + if tile_floats.len() != tile_npix { + return Err(Error::CompressionError(format!( + "float tile {tile_index} produced {} values, expected {tile_npix}", + tile_floats.len() + ))); + } + + // Reinterpret each float's bits as an integer so we can reuse scatter_tile, + // then convert back below. (f32 bits zero-extended into i64.) + let bits: Vec = if is_f64 { + tile_floats.iter().map(|&v| v.to_bits() as i64).collect() + } else { + tile_floats + .iter() + .map(|&v| (v as f32).to_bits() as i64) + .collect() + }; + scatter_tile( + &mut full_bits, + &self.geometry.znaxis, + &self.geometry.ztile, + &tile_dims, + &coords, + &bits, + ); + } + + let pixels = if is_f64 { + PixelData::F64(full_bits.iter().map(|&b| f64::from_bits(b as u64)).collect()) + } else { + PixelData::F32( + full_bits + .iter() + .map(|&b| f32::from_bits(b as u32)) + .collect(), + ) + }; + + Ok(ImageData::new(self.geometry.znaxis.clone(), pixels)) + } + + /// Inverse linear quantization with cfitsio's subtractive-dithering reversal. + /// + /// For each quantized integer `q[i]` of a tile: + /// - `q == blank` (ZBLANK / per-tile null sentinel) ⇒ `NaN`. + /// - `SUBTRACTIVE_DITHER_2` and `q == ZERO_VALUE (-2147483646)` ⇒ exactly `0.0`. + /// - otherwise `value = (q - r + 0.5) * scale + zero`, where `r` is the next value + /// of the fixed pseudo-random sequence ([`fits_rand_value`]); for the + /// non-dithered methods `r` is taken as `0.0` (so `value = q*scale + zero`, + /// matching cfitsio's `fffi4r4`). + /// + /// The random-sequence indexing mirrors cfitsio `unquantize_i4r4`: + /// `iseed = (tile_index + ZDITHER0 - 1) mod N_RANDOM`, + /// `nextrand = (int)(fits_rand_value[iseed] * 500)`, advancing `nextrand` per pixel + /// and, on reaching `N_RANDOM`, bumping `iseed` (wrapping) and re-deriving + /// `nextrand`. + fn unquantize( + &self, + q: &[i64], + tile_index: usize, + scale: f64, + zero: f64, + blank: Option, + tile_npix: usize, + ) -> Vec { + let dithered = matches!( + self.quantize, + Quantize::SubtractiveDither1 | Quantize::SubtractiveDither2 + ); + let dither2 = self.quantize == Quantize::SubtractiveDither2; + + // ZDITHER0 defaults to 1 when absent (cfitsio fits_read_compressed_img). + let zdither0 = self.zdither0.unwrap_or(1); + // cfitsio passes row = tile_index_1based + zdither0 - 1, then iseed = (row-1) % N. + let mut iseed = ((tile_index as i64 + zdither0 - 1).rem_euclid(N_RANDOM as i64)) as usize; + let mut nextrand = (fits_rand_value(iseed) * 500.0) as usize; + + let mut out = Vec::with_capacity(tile_npix); + for &qi in q.iter().take(tile_npix) { + // cfitsio computes `x * scale + zero` as a single fused multiply-add + // (the C compiler contracts the expression), which we must reproduce + // exactly via `mul_add` to be bit-identical in catastrophic-cancellation + // cases (large `zero` offsets). + let value = if blank.is_some_and(|b| qi == b) { + f64::NAN + } else if dither2 && qi == ZERO_VALUE { + 0.0 + } else if dithered { + ((qi as f64) - fits_rand_value(nextrand) as f64 + 0.5).mul_add(scale, zero) + } else { + (qi as f64).mul_add(scale, zero) + }; + out.push(value); + + if dithered { + nextrand += 1; + if nextrand == N_RANDOM { + iseed += 1; + if iseed == N_RANDOM { + iseed = 0; + } + nextrand = (fits_rand_value(iseed) * 500.0) as usize; + } + } + } + out + } + + /// Index of a column by (case-insensitive, trimmed) `TTYPE` name, if present. + fn find_column(&self, name: &str) -> Option { + self.table + .columns + .iter() + .position(|c| c.name.trim().eq_ignore_ascii_case(name)) + } + + /// Read a single `D`/`E` scalar cell as `f64` (per-tile ZSCALE/ZZERO). + fn tile_double(&self, row: usize, col: usize) -> Option { + match self.table.get_cell(row, col).ok()? { + crate::bintable::BinCellValue::F64(v) => v.first().copied(), + crate::bintable::BinCellValue::F32(v) => v.first().map(|&x| x as f64), + _ => None, + } + } + + /// Read a single integer scalar cell as `i64` (per-tile ZBLANK). + fn tile_int(&self, row: usize, col: usize) -> Option { + match self.table.get_cell(row, col).ok()? { + crate::bintable::BinCellValue::I32(v) => v.first().map(|&x| x as i64), + crate::bintable::BinCellValue::I64(v) => v.first().copied(), + crate::bintable::BinCellValue::I16(v) => v.first().map(|&x| x as i64), + _ => None, + } + } + /// Resolve the index of the `COMPRESSED_DATA` column. fn compressed_data_column(&self) -> Result { self.table @@ -480,6 +710,82 @@ fn scatter_tile( } } +// --------------------------------------------------------------------------- +// Float quantization / subtractive-dithering reconstruction +// --------------------------------------------------------------------------- +// +// Ported from cfitsio (`imcompress.c` / `quantize.c`, R. White & W. Pence, STScI). +// Quantized float tiles store each pixel as an i32 produced by +// `round(value/scale - zero/scale + dither)`; we invert with the same fixed +// pseudo-random `fits_rand_value` table that cfitsio uses. + +/// Length of cfitsio's fixed pseudo-random sequence (`N_RANDOM`). Do not change. +const N_RANDOM: usize = 10000; + +/// Sentinel quantized value flagging an undefined (NaN) pixel (cfitsio `NULL_VALUE`). +/// Carried here for documentation; the actual null sentinel is taken from +/// `ZBLANK` (keyword or per-tile column), which equals this for cfitsio-written files. +#[allow(dead_code)] +const NULL_VALUE: i64 = -2_147_483_647; + +/// Sentinel quantized value flagging an exact-zero pixel under +/// `SUBTRACTIVE_DITHER_2` (cfitsio `ZERO_VALUE`). +const ZERO_VALUE: i64 = -2_147_483_646; + +/// True if a per-tile `ZSCALE` indicates a quantized tile. cfitsio treats a zero +/// scale (its `cn_zscale == 0` / absent-scale default) as "not quantized". +fn is_quantized(zscale: f64) -> bool { + zscale != 0.0 +} + +/// Reinterpret big-endian bytes as `n` floats (f32 if `!is_f64`, else f64), widened +/// to `f64` for uniform downstream handling. Used for unquantized (raw-float) tiles. +fn raw_floats(bytes: &[u8], n: usize, is_f64: bool) -> Result> { + let width = if is_f64 { 8 } else { 4 }; + if bytes.len() < n * width { + return Err(Error::CompressionError(format!( + "raw float tile has {} bytes, expected at least {}", + bytes.len(), + n * width + ))); + } + let mut out = Vec::with_capacity(n); + for c in bytes.chunks_exact(width).take(n) { + let v = if is_f64 { + f64::from_be_bytes(c.try_into().unwrap()) + } else { + f32::from_be_bytes([c[0], c[1], c[2], c[3]]) as f64 + }; + out.push(v); + } + Ok(out) +} + +/// The `ii`-th element of cfitsio's fixed pseudo-random sequence. +/// +/// Generated once (lazily) by the Park–Miller minimal-standard LCG +/// (`a = 16807`, `m = 2^31 - 1`, seed 1) exactly as cfitsio's `fits_init_randoms`: +/// `seed_{k+1} = (a*seed_k) mod m`, value = `seed / m`. The 10000-element table is +/// validated against cfitsio's published checkpoint (final seed `1043618065`). +fn fits_rand_value(ii: usize) -> f32 { + use std::sync::OnceLock; + static TABLE: OnceLock> = OnceLock::new(); + let table = TABLE.get_or_init(|| { + let a = 16807.0f64; + let m = 2_147_483_647.0f64; + let mut seed = 1.0f64; + let mut v = Vec::with_capacity(N_RANDOM); + for _ in 0..N_RANDOM { + let temp = a * seed; + // cfitsio: seed = temp - m * (int)(temp / m) + seed = temp - m * ((temp / m) as i64 as f64); + v.push((seed / m) as f32); + } + v + }); + table[ii % N_RANDOM] +} + /// Count the contiguous `ZNAMEi`/`ZVALi` parameter pairs present in the header. fn count_zname_params(header: &Header) -> usize { let mut n = 0; @@ -941,9 +1247,8 @@ mod tests { // tile dims: (0,0)=2x2, (1,0)=1x2, (0,1)=2x1, (1,1)=1x1 let tile_dims = [vec![2, 2], vec![1, 2], vec![2, 1], vec![1, 1]]; let mut full = vec![-1i64; 9]; - for tile in 0..4 { + for (tile, td) in tile_dims.iter().enumerate() { let coords = unravel(tile, &tiles_per_axis); - let td = &tile_dims[tile]; let n: usize = td.iter().product(); let vals: Vec = (0..n).map(|l| (tile as i64) * 100 + l as i64).collect(); scatter_tile(&mut full, &image_dims, &ztile, td, &coords, &vals); @@ -1088,22 +1393,110 @@ mod tests { } #[test] - fn float_zbitpix_is_unsupported() { + fn fits_rand_value_matches_cfitsio_checkpoint() { + // cfitsio validates fits_init_randoms by asserting the final LCG seed is + // 1043618065 after 10000 iterations. Reproduce that seed and check it, which + // exercises the exact same recurrence our table generator uses. + let a = 16807.0f64; + let m = 2_147_483_647.0f64; + let mut seed = 1.0f64; + for _ in 0..N_RANDOM { + let temp = a * seed; + seed = temp - m * ((temp / m) as i64 as f64); + } + assert_eq!(seed as i64, 1_043_618_065); + // All table values lie in [0, 1). + for i in [0usize, 1, 499, 5000, N_RANDOM - 1] { + let v = fits_rand_value(i); + assert!((0.0..1.0).contains(&v), "rand[{i}] = {v} out of range"); + } + } + + #[test] + fn nodither_float_constant_tile() { + // A NO_DITHER quantized float tile: value = q*scale + zero (no +0.5, no dither). use crate::bintable::{BinColumnType, BinTableBuilder}; + + // 2-pixel image, single row tile, quantized ints [10, 20], scale 0.5, zero 3.0. + let q = [10i32, 20]; + let enc = rice_encode_i32(&q, 2); + let table = BinTableBuilder::new() .add_column("COMPRESSED_DATA", BinColumnType::VarP('B')) - .push_row(|r| r.write_var_p(0, |_| {})) + .add_column("ZSCALE", BinColumnType::D64(1)) + .add_column("ZZERO", BinColumnType::D64(1)) + .push_row(|r| { + r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc)); + r.write_f64(0.5); + r.write_f64(3.0); + }) .build(); + let mut h = Header::new(); h.set("ZIMAGE", HeaderValue::Logical(true), None); h.set("ZCMPTYPE", HeaderValue::String("RICE_1".into()), None); h.set("ZBITPIX", HeaderValue::Integer(-32), None); h.set("ZNAXIS", HeaderValue::Integer(1), None); h.set("ZNAXIS1", HeaderValue::Integer(2), None); + h.set("ZQUANTIZ", HeaderValue::String("NO_DITHER".into()), None); + h.set("ZNAME1", HeaderValue::String("BYTEPIX".into()), None); + h.set("ZVAL1", HeaderValue::Integer(4), None); + let cimg = CompressedImage::from_bintable(&h, &table).unwrap(); - assert!(matches!( - cimg.decompress(), - Err(Error::UnsupportedCompression(_)) - )); + let img = cimg.decompress().unwrap(); + match img.pixels { + crate::image_data::PixelData::F32(v) => { + assert_eq!(v, vec![10.0 * 0.5 + 3.0, 20.0 * 0.5 + 3.0]); + } + other => panic!("expected F32, got {other:?}"), + } + } + + #[test] + fn dither2_preserves_zero_and_blank_maps_to_nan() { + use crate::bintable::{BinColumnType, BinTableBuilder}; + + // 3-pixel tile: [ZERO_VALUE, NULL/blank, 5]. DITHER_2 => [0.0, NaN, dithered]. + // Use NOCOMPRESS so the quantized ints are stored verbatim (RICE encoding of + // these extreme values is awkward and unrelated to what we're testing here). + let q = [ZERO_VALUE as i32, NULL_VALUE as i32, 5]; + let mut enc = Vec::new(); + for v in q { + enc.extend_from_slice(&v.to_be_bytes()); + } + + let table = BinTableBuilder::new() + .add_column("COMPRESSED_DATA", BinColumnType::VarP('B')) + .add_column("ZSCALE", BinColumnType::D64(1)) + .add_column("ZZERO", BinColumnType::D64(1)) + .push_row(|r| { + r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc)); + r.write_f64(2.0); + r.write_f64(1.0); + }) + .build(); + + let mut h = Header::new(); + h.set("ZIMAGE", HeaderValue::Logical(true), None); + h.set("ZCMPTYPE", HeaderValue::String("NOCOMPRESS".into()), None); + h.set("ZBITPIX", HeaderValue::Integer(-32), None); + h.set("ZNAXIS", HeaderValue::Integer(1), None); + h.set("ZNAXIS1", HeaderValue::Integer(3), None); + h.set("ZQUANTIZ", HeaderValue::String("SUBTRACTIVE_DITHER_2".into()), None); + h.set("ZDITHER0", HeaderValue::Integer(5), None); + h.set("ZBLANK", HeaderValue::Integer(NULL_VALUE), None); + h.set("ZNAME1", HeaderValue::String("BYTEPIX".into()), None); + h.set("ZVAL1", HeaderValue::Integer(4), None); + + let cimg = CompressedImage::from_bintable(&h, &table).unwrap(); + let img = cimg.decompress().unwrap(); + match img.pixels { + crate::image_data::PixelData::F32(v) => { + assert_eq!(v[0], 0.0); // ZERO_VALUE preserved exactly + assert!(v[1].is_nan()); // blank -> NaN + assert!(v[2].is_finite() && v[2] != 0.0); + } + other => panic!("expected F32, got {other:?}"), + } } } diff --git a/tests/sample_files.rs b/tests/sample_files.rs index 981877c..c883a0e 100644 --- a/tests/sample_files.rs +++ b/tests/sample_files.rs @@ -266,27 +266,170 @@ fn rice_roundtrip_fgs_i32() { assert_rice_roundtrip("FGSf64y0106m_a1f.fits", "FGSf64y0106m_a1f.rice.fits.fz"); } -/// Float (`ZBITPIX < 0`) RICE fixtures are out of scope: decoding must return the -/// documented `Err(UnsupportedCompression)` rather than wrong pixels. -#[test] -fn rice_float_fixture_is_unsupported() { - let fz_path = samp("FOCx38i0101t_c0f.rice_nodith.fits.fz"); +// --- float tile-compression: bit-exact vs funpack ------------------------- +// +// Quantized-float decompression is *lossy*, so the reconstructed floats will not +// equal the original uncompressed sample. The authoritative oracle is funpack's own +// reconstruction: fits4 and funpack must apply the identical dither table + per-tile +// scaling, so fits4's output must be BIT-IDENTICAL to funpack's. We invoke the +// `funpack` CLI at test time and compare; the test skips cleanly when either the +// fixture or the `funpack` binary is absent (so CI without cfitsio stays green). + +/// True if a `funpack` binary is on PATH. +fn funpack_available() -> bool { + std::process::Command::new("funpack") + .arg("-V") + .output() + .map(|o| o.status.success()) + .unwrap_or(false) +} + +/// Decompress `fz_name` with the `funpack` CLI into a temp `.fits` and read it back. +fn funpack_reference(fz_name: &str) -> Option { + let fz_path = samp(fz_name); + let mut out = std::env::temp_dir(); + out.push(format!( + "fits4_funpack_ref_{}_{}.fits", + std::process::id(), + fz_name.replace(['/', '.'], "_") + )); + let _ = std::fs::remove_file(&out); + let status = std::process::Command::new("funpack") + .arg("-O") + .arg(&out) + .arg("-C") // don't update checksums + .arg(&fz_path) + .status() + .ok()?; + if !status.success() { + return None; + } + let f = FitsFile::from_file(&out).ok(); + let _ = std::fs::remove_file(&out); + f +} + +/// Assert that fits4's float decompression of every compressed-image HDU in `fz_name` +/// is byte-exact against funpack's reconstruction of the same file. +fn assert_float_matches_funpack(fz_name: &str) { + let fz_path = samp(fz_name); if !fz_path.exists() { - eprintln!("skipping: fixture not present"); + eprintln!("skipping: fixture {fz_name} not present"); + return; + } + if !funpack_available() { + eprintln!("skipping: funpack binary not available"); return; } + let reference = match funpack_reference(fz_name) { + Some(r) => r, + None => { + eprintln!("skipping: funpack failed on {fz_name}"); + return; + } + }; + + // funpack reference image HDUs (the reconstructed floats), in order. + let ref_images: Vec<&ImageData> = reference + .hdus + .iter() + .filter_map(|h| match &h.data { + HduData::Image(im) if !im.pixels.to_bytes().is_empty() => Some(im), + _ => None, + }) + .collect(); + let fz = FitsFile::from_file(&fz_path).expect("read .fz"); - let mut checked = false; + let mut ref_iter = ref_images.iter(); + let mut matched = 0usize; for hdu in &fz.hdus { if let Some(cimg) = hdu.as_compressed_image() { - assert!( - matches!(cimg.decompress(), Err(Error::UnsupportedCompression(_))), - "float compressed image should be unsupported" + let reference_img = ref_iter + .next() + .expect("more compressed images than funpack reference images"); + let dec = cimg.decompress().expect("fits4 float decompress"); + assert_eq!( + dec.axes, reference_img.axes, + "{fz_name}: axes mismatch on compressed HDU #{matched}" ); - checked = true; + compare_floats_vs_funpack(fz_name, matched, &dec.pixels, &reference_img.pixels); + matched += 1; } } - assert!(checked, "expected a compressed-image HDU"); + assert!(matched > 0, "{fz_name}: found no compressed-image HDUs"); +} + +/// Compare fits4's reconstructed floats against funpack's, element-by-element. +/// +/// fits4 reproduces cfitsio's unquantization *exactly*, including the fused +/// multiply-add (`x*scale + zero` contracted to a single FMA) that the cfitsio C code +/// relies on, so every reconstructed pixel must be bit-identical to funpack's. NaNs +/// (from `ZBLANK`) only have to match as NaN — IEEE leaves the payload unspecified. +fn compare_floats_vs_funpack(fz_name: &str, hdu: usize, got: &PixelData, want: &PixelData) { + fn cmp(fz_name: &str, hdu: usize, a: f64, b: f64) { + if a.is_nan() && b.is_nan() { + return; + } + assert!( + a.to_bits() == b.to_bits(), + "{fz_name} HDU#{hdu}: fits4={a} ({:#x}) != funpack={b} ({:#x})", + a.to_bits(), + b.to_bits() + ); + } + match (got, want) { + (PixelData::F32(g), PixelData::F32(w)) => { + assert_eq!(g.len(), w.len(), "{fz_name} HDU#{hdu}: length mismatch"); + for (x, y) in g.iter().zip(w.iter()) { + cmp(fz_name, hdu, *x as f64, *y as f64); + } + } + (PixelData::F64(g), PixelData::F64(w)) => { + assert_eq!(g.len(), w.len(), "{fz_name} HDU#{hdu}: length mismatch"); + for (x, y) in g.iter().zip(w.iter()) { + cmp(fz_name, hdu, *x, *y); + } + } + _ => panic!("{fz_name} HDU#{hdu}: pixel type mismatch vs funpack reference"), + } +} + +#[test] +fn float_rice_nodither_matches_funpack() { + // 1024x1024 F32, RICE_1, ZQUANTIZ=NO_DITHER. + assert_float_matches_funpack("FOCx38i0101t_c0f.rice_nodith.fits.fz"); +} + +#[cfg(feature = "gzip")] +#[test] +fn float_gzip_lossless_matches_funpack() { + // 1024x1024 F32, GZIP_1, ZQUANTIZ=NONE (raw floats, no quantization). Needs gzip. + assert_float_matches_funpack("FOCx38i0101t_c0f.gzip_lossless.fits.fz"); +} + +#[test] +fn float_rice_dither1_matches_funpack() { + // 1024x1024 F32, RICE_1, ZQUANTIZ=SUBTRACTIVE_DITHER_1. + assert_float_matches_funpack("FOCx38i0101t_c0f.rice_dith.fits.fz"); +} + +#[test] +fn float_rice_dither2_matches_funpack() { + // 1024x1024 F32, RICE_1, ZQUANTIZ=SUBTRACTIVE_DITHER_2 (preserves exact zeros), + // generated by scripts/gen_compressed_fixtures.sh with `fpack -r -qz5 16`. + assert_float_matches_funpack("FOCx38i0101t_c0f.rice_dith2.fits.fz"); +} + +#[test] +fn float_cube_rice_dither1_matches_funpack() { + // 200x200x4 F32 cube, RICE_1, SUBTRACTIVE_DITHER_1. + assert_float_matches_funpack("WFPC2u5780205r_c0fx.rice_dith.fits.fz"); +} + +#[test] +fn float_cube_rice_nodither_matches_funpack() { + // 200x200x4 F32 cube, RICE_1, NO_DITHER. + assert_float_matches_funpack("WFPC2u5780205r_c0fx.rice_nodith.fits.fz"); } #[cfg(feature = "gzip")] From 9f3f805686a6153f54490d45d4434cc25ce8324d Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 14:35:08 -0400 Subject: [PATCH 3/9] Add PLIO_1 and HCOMPRESS_1 tile decompression MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Complete the tile-compression read path with the remaining two algorithms; all image compression types in the standard now decode. - PLIO_1 (plio_decompress): port of cfitsio pl_l2pi line-list decode. The 1PI VLA is read as big-endian i16 words; opcodes 0..7 reconstruct the integer mask. Byte-exact (lossless). Note: the ll_src[3] header test is a signed i16 comparison (magic -100) — comparing unsigned decodes garbage. - HCOMPRESS_1 (hcompress_decompress): full port of fits_hdecompress / fits_hdecompress64 — quadtree bit-plane decode, undigitize, inverse H-transform, and unshuffle. 32-bit transform for ZBITPIX 8/16, 64-bit otherwise (avoids overflow); all arithmetic wrapping to match C. Lossless for integer scale=0; float feeds the existing unquantize + dither path. SMOOTH != 0 is rejected with a documented Err. - Tests: plio_roundtrip_euv and hcompress_int_roundtrip_euv (byte-exact vs original), float_hcompress_dither1_matches_funpack (bit-exact vs funpack), plus PLIO unit tests. Skip-if-absent as before. - Added a lossless integer HCOMPRESS fixture (fpack -h -s 0); CI cache key v3 -> v4, new fixtures added to best-effort download. Co-Authored-By: Claude Opus 4.8 (1M context) --- .github/workflows/ci.yml | 5 +- scripts/gen_compressed_fixtures.sh | 3 + src/tile_compress.rs | 1064 +++++++++++++++++++++++++++- tests/sample_files.rs | 26 + 4 files changed, 1072 insertions(+), 26 deletions(-) diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 3df1082..8b100b7 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -26,7 +26,7 @@ jobs: uses: actions/cache@v4 with: path: samp - key: fits4-samples-v3 + key: fits4-samples-v4 - name: Download sample FITS files if: steps.cache-samples.outputs.cache-hit != 'true' @@ -50,6 +50,9 @@ jobs: FGSf64y0106m_a1f.rice.fits.fz \ EUVEngc4151imgx.gzip1.fits.fz \ EUVEngc4151imgx.gzip2.fits.fz \ + EUVEngc4151imgx.plio.fits.fz \ + EUVEngc4151imgx.hcomp_int.fits.fz \ + FOCx38i0101t_c0f.hcomp.fits.fz \ FOCx38i0101t_c0f.rice_nodith.fits.fz \ FOCx38i0101t_c0f.rice_dith.fits.fz \ FOCx38i0101t_c0f.rice_dith2.fits.fz \ diff --git a/scripts/gen_compressed_fixtures.sh b/scripts/gen_compressed_fixtures.sh index 9e56922..b216a05 100755 --- a/scripts/gen_compressed_fixtures.sh +++ b/scripts/gen_compressed_fixtures.sh @@ -93,6 +93,9 @@ compress EUVEngc4151imgx.fits gzip1 -g compress EUVEngc4151imgx.fits gzip2 -g2 # PLIO_1 targets integer (mask-like) data, <= 24-bit; fine for I16. compress EUVEngc4151imgx.fits plio -p +# HCOMPRESS with scale 0 is LOSSLESS for integer images (no quantization step), +# so the decode must reproduce the source pixels byte-exactly. +compress EUVEngc4151imgx.fits hcomp_int -h -s 0 # Force square tiles to exercise 2-D reassembly with edge-truncated tiles. compress EUVEngc4151imgx.fits rice_t100 -r -t 100,100 diff --git a/src/tile_compress.rs b/src/tile_compress.rs index 94790ac..8a2035a 100644 --- a/src/tile_compress.rs +++ b/src/tile_compress.rs @@ -11,15 +11,11 @@ //! //! # Status //! -//! This module is being built up in phases (see `COMPRESSION_PLAN.md`): -//! -//! - Phase 0 (current): scaffolding + the [`rice_decompress_i32`] family of -//! decoders with unit tests. -//! - Phase 1: RICE_1 integer read path ([`CompressedImage::decompress`]). -//! - Later: GZIP_1/GZIP_2, float quantization/dithering, PLIO_1, HCOMPRESS_1, and a -//! compression (write) path. -//! -//! Most of the high-level pipeline is currently stubbed with `todo!()`. +//! Read (decompression) support is implemented for `RICE_1`, `GZIP_1`/`GZIP_2` +//! (behind the `gzip` feature), `PLIO_1`, `HCOMPRESS_1` (`SMOOTH=0`), and +//! `NOCOMPRESS`, for integer originals and for quantized/lossless float originals +//! (with `SUBTRACTIVE_DITHER_1`/`_2` reversal). The compression (write) path is +//! not yet implemented (see `COMPRESSION_PLAN.md`). use crate::bintable::BinTable; use crate::error::{Error, Result}; @@ -162,6 +158,8 @@ pub struct CompressedImage<'a> { pub blocksize: usize, /// RICE `BYTEPIX` parameter (default 4). pub bytepix: usize, + /// HCOMPRESS `SMOOTH` parameter (default 0 = no smoothing). + pub smooth: i32, /// The borrowed binary table holding the compressed tiles + heap. /// Read once [`CompressedImage::decompress`] is implemented (phase 1). #[allow(dead_code)] @@ -197,13 +195,15 @@ impl<'a> CompressedImage<'a> { // Default RICE parameters; overridden by ZNAMEi='BLOCKSIZE'/'BYTEPIX'. let mut blocksize = 32usize; let mut bytepix = 4usize; + let mut smooth = 0i32; let nzparams = count_zname_params(header); for i in 1..=nzparams { if let Some(name) = header.get_string(&format!("ZNAME{i}")) { - let val = header.get_int(&format!("ZVAL{i}")).unwrap_or(0) as usize; + let val = header.get_int(&format!("ZVAL{i}")).unwrap_or(0); match name.trim().to_ascii_uppercase().as_str() { - "BLOCKSIZE" => blocksize = val, - "BYTEPIX" => bytepix = val, + "BLOCKSIZE" => blocksize = val as usize, + "BYTEPIX" => bytepix = val as usize, + "SMOOTH" => smooth = val as i32, _ => {} } } @@ -218,6 +218,7 @@ impl<'a> CompressedImage<'a> { blank, blocksize, bytepix, + smooth, table, }) } @@ -246,12 +247,15 @@ impl<'a> CompressedImage<'a> { /// subtractive dithering (`ZQUANTIZ` = `SUBTRACTIVE_DITHER_1`/`_2`); `ZBLANK` /// sentinels map to `NaN`. /// - /// # Not yet supported (see `COMPRESSION_PLAN.md`) + /// `PLIO_1` (IRAF pixel-list RLE) and `HCOMPRESS_1` (H-transform + quadtree) + /// integer tiles are decoded by their dedicated per-tile decoders and scattered + /// like any other integer codec. + /// + /// # Not supported /// - /// - `PLIO_1` — phase 4. - /// - `HCOMPRESS_1` — phase 5. /// - `GZIP_1`/`GZIP_2` without the `gzip` feature return /// [`Error::UnsupportedCompression`]. + /// - `HCOMPRESS_1` with `SMOOTH != 0` (smoothing) is rejected. pub fn decompress(&self) -> Result { use crate::image_data::{ImageData, PixelData}; use crate::types::Bitpix; @@ -282,13 +286,26 @@ impl<'a> CompressedImage<'a> { let cdata_col = self.compressed_data_column()?; for tile_index in 0..num_tiles { - let raw = self.tile_bytes(tile_index, cdata_col)?; // Number of pixels in this (possibly edge-truncated) tile. let coords = unravel(tile_index, &tiles_per_axis); let tile_dims = self.tile_dims_at(&coords); let tile_npix: usize = tile_dims.iter().product(); - let values = self.decode_tile(&raw, tile_npix)?; + // PLIO_1 stores its line list as a `1PI` (i16) VLA; every other codec + // uses a byte (`1PB`) VLA. Fetch the cell in its native form and decode. + let values = if self.ctype == CompressionType::Plio1 { + let words = self.tile_words_i16(tile_index, cdata_col)?; + plio_decompress(&words, tile_npix)? + .into_iter() + .map(|v| v as i64) + .collect() + } else if self.ctype == CompressionType::Hcompress1 { + let raw = self.tile_bytes(tile_index, cdata_col)?; + self.decode_hcompress(&raw, &tile_dims)? + } else { + let raw = self.tile_bytes(tile_index, cdata_col)?; + self.decode_tile(&raw, tile_npix)? + }; if values.len() != tile_npix { return Err(Error::CompressionError(format!( "tile {tile_index} decoded {} values, expected {tile_npix}", @@ -378,7 +395,11 @@ impl<'a> CompressedImage<'a> { let tile_floats: Vec = match (zscale, zzero) { // Quantized tile: COMPRESSED_DATA holds quantized integers. (Some(scale), Some(zero)) if !cdata.is_empty() && is_quantized(scale) => { - let q = self.decode_tile(&cdata, tile_npix)?; + let q = if self.ctype == CompressionType::Hcompress1 { + self.decode_hcompress(&cdata, &tile_dims)? + } else { + self.decode_tile(&cdata, tile_npix)? + }; self.unquantize(&q, tile_index, scale, zero, blank, tile_npix) } // Unquantized tile (or whole image is lossless): raw floats live in @@ -563,6 +584,48 @@ impl<'a> CompressedImage<'a> { } } + /// Read one tile's `1PI`/`1QI` cell as native `i16` words (PLIO_1 line list). + fn tile_words_i16(&self, row: usize, col: usize) -> Result> { + match self.table.get_cell(row, col)? { + crate::bintable::BinCellValue::I16(v) => Ok(v), + // Some writers may declare the PLIO list as a byte VLA; reinterpret as + // big-endian i16 pairs in that case. + crate::bintable::BinCellValue::Bytes(b) => Ok(b + .chunks_exact(2) + .map(|c| i16::from_be_bytes([c[0], c[1]])) + .collect()), + other => Err(Error::CompressionError(format!( + "PLIO_1 COMPRESSED_DATA cell is not an i16 VLA: {other:?}" + ))), + } + } + + /// Decode one HCOMPRESS_1 tile into a flat array of `i64` integer samples + /// (axis-1 fastest, matching the tile's pixel layout). + /// + /// The tile dimensions (`nx`, `ny`) and digitization `scale` are read from the + /// encoded stream itself; `tile_dims` is `[width, height]` (axis-1 first) and is + /// used only to validate the decoded size. HCOMPRESS lays out its output array + /// with `ny` (== tile width, the FITS fast axis) varying fastest, which already + /// matches the axis-1-fastest order the scatter step expects. + fn decode_hcompress(&self, raw: &[u8], tile_dims: &[usize]) -> Result> { + // SMOOTH parameter (ZNAMEi='SMOOTH'); default 0. + let smooth = self.smooth; + // cfitsio uses the 32-bit H-transform (`fits_hdecompress`) only for + // ZBITPIX 8/16; every other original type (32-bit ints, and quantized + // -32/-64 floats) uses the 64-bit transform to avoid intermediate overflow. + let wide = !(self.zbitpix == 8 || self.zbitpix == 16); + let (a, nx, ny) = hcompress_decompress(raw, wide, smooth)?; + let tile_npix: usize = tile_dims.iter().product(); + if nx * ny != tile_npix { + return Err(Error::CompressionError(format!( + "HCOMPRESS tile size {nx}x{ny} = {} != expected {tile_npix}", + nx * ny + ))); + } + Ok(a) + } + /// Tile dimensions at tile-grid coordinates `coords`, accounting for edge tiles /// that are truncated when `ZNAXISn` is not a multiple of `ZTILEn`. fn tile_dims_at(&self, coords: &[usize]) -> Vec { @@ -607,11 +670,14 @@ impl<'a> CompressedImage<'a> { let unshuffled = gzip2_unshuffle(&inflated, self.zbitpix_bytes()); self.bytes_to_ints(&unshuffled, tile_npix) } - CompressionType::Plio1 => Err(Error::UnsupportedCompression( - "PLIO_1 decode is phase 4 (see COMPRESSION_PLAN.md)".into(), + // PLIO_1 and HCOMPRESS_1 are dispatched to their dedicated per-tile + // decoders (`plio_decompress` / `decode_hcompress`) in `decompress` / + // `decompress_float`, so they never reach this generic byte path. + CompressionType::Plio1 => Err(Error::CompressionError( + "internal: PLIO_1 must be decoded via plio_decompress".into(), )), - CompressionType::Hcompress1 => Err(Error::UnsupportedCompression( - "HCOMPRESS_1 decode is phase 5 (see COMPRESSION_PLAN.md)".into(), + CompressionType::Hcompress1 => Err(Error::CompressionError( + "internal: HCOMPRESS_1 must be decoded via decode_hcompress".into(), )), } } @@ -1091,9 +1157,930 @@ fn gzip2_unshuffle(shuffled: &[u8], _bytepix: usize) -> Vec { shuffled.to_vec() } -/// Decompress a PLIO_1 (IRAF pixel-list RLE) tile into i32 mask values. TODO(phase 4). -pub fn plio_decompress(_src: &[u8], _nvals: usize) -> Result> { - todo!("phase 4: PLIO_1 pixel-list decode") +// --------------------------------------------------------------------------- +// PLIO_1 decompression (IRAF pixel-list run-length encoding) +// --------------------------------------------------------------------------- +// +// Port of cfitsio `pliocomp.c` `pl_l2pi` (D. Tody, NRAO; the IRAF PLIO scheme). +// A tile is stored as a *line list*: a stream of 16-bit instruction words. Each +// word is `(opcode << 12) | data`, where `opcode` is the high nibble (0..15) and +// `data` is the low 12 bits. The list reconstructs a 1-D run of `npix` +// non-negative integers ("pixels"), tracking a running "high value" `pv`. +// +// In the FITS tiled-image convention the line list is stored in the +// `COMPRESSED_DATA` column as a `1PI`/`1QI` variable-length array of *big-endian* +// 16-bit values (cfitsio reads it with `fits_read_col(TSHORT, ...)`), so the +// caller hands us the already-byte-swapped `i16` words. +// +// Opcodes (after cfitsio's `opcode = word/4096; data = word & 4095`; the C switch +// is on `opcode+1`, reproduced here on `opcode` directly): +// * 0 (Z run) — `data` zeros. +// * 1 (HD, high+data) — set high value: `pv = (next_word << 12) + data`; the +// following word is consumed as the high bits. +// * 2 (IH, inc high) — `pv += data`. +// * 3 (DH, dec high) — `pv -= data`. +// * 4 (PD, pixel data)— `data` pixels at value `pv`. +// * 5 (PS, pixel set) — `data` zeros, then the *last* pixel of the run is set +// to `pv` (single non-zero pixel at the end of the span). +// * 6 (IS, inc set) — `pv += data`, then emit one pixel at `pv`. +// * 7 (DS, dec set) — `pv -= data`, then emit one pixel at `pv`. +// (cfitsio's switch maps opcodes 0,4,5 all through the "run of length `data`" +// label L160; opcode 0 is a pure zero run, 4 fills with `pv`, 5 fills with zeros +// then sets the last element. Opcodes 1..3 adjust `pv` without emitting span +// pixels, and 6/7 adjust `pv` and emit a single pixel.) + +/// Decompress a PLIO_1 (IRAF pixel-list RLE) tile into `nvals` `i32` values. +/// +/// `words` is the line-list instruction stream (already decoded from the +/// big-endian `1PI` VLA into native `i16`). Pixels are non-negative; any pixels +/// not explicitly written are zero. +pub fn plio_decompress(words: &[i16], nvals: usize) -> Result> { + // Treat the instruction words as unsigned 16-bit (the high nibble is the + // opcode; PLIO values never use the sign bit). + let w: Vec = words.iter().map(|&v| (v as u16) as u32).collect(); + + let mut out = vec![0i32; nvals]; + if nvals == 0 { + return Ok(out); + } + if w.len() < 3 { + return Err(Error::CompressionError( + "PLIO_1 line list too short (need >= 3 header words)".into(), + )); + } + + // List length and index of the first instruction word (cfitsio uses 1-based + // indices via `--ll_src`; here `w[0]` == cfitsio `ll_src[1]`, so cfitsio + // `ll_src[k]` == `w[k-1]`). + // + // if ll_src[3] > 0: lllen = ll_src[3]; llfirt = 4 + // else: lllen = (ll_src[5]<<15)+ll_src[4]; llfirt = ll_src[2]+1 + // The `ll_src[3] > 0` test is a SIGNED 16-bit comparison: the standard PLIO + // header stores -100 there, so the long form (else branch) is the usual path. + let (lllen, llfirt) = if words[2] > 0 { + (w[2] as usize, 4usize) + } else { + if w.len() < 5 { + return Err(Error::CompressionError( + "PLIO_1 long-form line list too short".into(), + )); + } + let len = ((w[4] << 15) + w[3]) as usize; + (len, (w[1] as usize) + 1) + }; + + if lllen == 0 { + return Ok(out); // empty list => all zeros + } + + // Pixel-output cursor `op` (1-based in cfitsio; `op==1` means out[0]). + let mut op: usize = 1; // index into 1-based px_dst + let mut x1: i64 = 1; // current x position (1-based) + let mut pv: i64 = 1; // running "high value" + let xs: i64 = 1; // starting index (cfitsio passes xs=1) + let xe: i64 = nvals as i64; // end index inclusive + + let put = |out: &mut [i32], op: usize, v: i64| { + // 1-based op -> 0-based; ignore writes past the tile (defensive). + if op >= 1 && op <= out.len() { + out[op - 1] = v as i32; + } + }; + + // ip iterates cfitsio ll_src indices llfirt..=lllen (1-based) => w[ip-1]. + let mut ip = llfirt; + let mut skipwd = false; + 'outer: while ip <= lllen { + if ip == 0 || ip > w.len() { + break; + } + if skipwd { + skipwd = false; + ip += 1; + continue; + } + let word = w[ip - 1]; + let opcode = word / 4096; + let data = (word & 4095) as i64; + + match opcode { + // L160: run of length `data` (opcodes 0, 4, 5). + 0 | 4 | 5 => { + let x2 = x1 + data - 1; + let i1 = x1.max(xs); + let i2 = x2.min(xe); + let np = i2 - i1 + 1; + if np > 0 { + let otop = op as i64 + np - 1; + if opcode == 4 { + for i in op..=(otop as usize) { + put(&mut out, i, pv); + } + } else { + // opcodes 0 and 5: zeros (out is already zero-initialized). + if opcode == 5 && i2 == x2 { + put(&mut out, otop as usize, pv); + } + } + op = (otop + 1) as usize; + } + x1 = x2 + 1; + } + // L220: set high value from this+next word. + 1 => { + if ip >= w.len() { + return Err(Error::CompressionError( + "PLIO_1 truncated high-value instruction".into(), + )); + } + pv = ((w[ip] as i64) << 12) + data; + skipwd = true; + } + // L230: increment high value. + 2 => { + pv += data; + } + // L240: decrement high value. + 3 => { + pv -= data; + } + // L250: increment high value and emit one pixel. + 6 => { + pv += data; + if x1 >= xs && x1 <= xe { + put(&mut out, op, pv); + op += 1; + } + x1 += 1; + } + // L260: decrement high value and emit one pixel. + 7 => { + pv -= data; + if x1 >= xs && x1 <= xe { + put(&mut out, op, pv); + op += 1; + } + x1 += 1; + } + _ => { + // opcodes 8..15 are unused by the encoder; cfitsio falls through + // (no-op) on them. Match that behaviour. + } + } + + if x1 > xe { + break 'outer; + } + ip += 1; + } + + // Remaining pixels are zero (already initialized). + Ok(out) +} + +// --------------------------------------------------------------------------- +// HCOMPRESS_1 decompression +// --------------------------------------------------------------------------- +// +// Port of cfitsio `fits_hdecompress` / `fits_hdecompress64` (the concatenation of +// R. White's hinv.c / undigitize.c / decode.c / dodecode.c / qtree_decode.c / +// qread.c / bit_input.c in STScI's hcompress distribution). +// +// Pipeline (read direction): +// 1. `decode` — parse the byte stream header (2-byte magic 0xDD 0x99, then +// big-endian `nx`, `ny`, `scale` as 4-byte ints, an 8-byte `sumall`, and a +// 3-byte `nbitplanes`), then `dodecode` the quadtree-coded bit planes of the +// four image quadrants into the transform-coefficient array `a`, and finally +// the per-element sign bits. `a[0]` is overwritten with `sumall`. +// 2. `undigitize` — multiply every coefficient by `scale` (the quantization +// step; `scale <= 1` is a no-op, i.e. lossless). +// 3. `hinv` — inverse H-transform, expanding the coefficients back into pixels. +// +// The array `a` is indexed as `a[i*ny + j]` (ny = fast axis = FITS axis-1 width), +// so its flat order already matches the tile's axis-1-fastest pixel layout. +// +// cfitsio uses 32-bit ints for ZBITPIX 8/16 and 64-bit ints for everything else +// (incl. quantized floats), because the H-transform intermediate sums can exceed +// 32 bits. We reproduce both via a macro, using wrapping arithmetic to match C's +// 2's-complement overflow behaviour exactly. + +/// Bit/byte reader for the HCOMPRESS stream (mirrors cfitsio's global +/// `nextchar` + `buffer2`/`bits_to_go` state machine, but as a struct). +struct HcInput<'a> { + data: &'a [u8], + nextchar: usize, + buffer2: i32, + bits_to_go: i32, +} + +impl<'a> HcInput<'a> { + fn new(data: &'a [u8]) -> Self { + HcInput { + data, + nextchar: 0, + buffer2: 0, + bits_to_go: 0, + } + } + + #[inline] + fn next_byte(&mut self) -> Result { + let b = *self + .data + .get(self.nextchar) + .ok_or_else(|| Error::CompressionError("HCOMPRESS: unexpected end of stream".into()))?; + self.nextchar += 1; + Ok(b as i32) + } + + /// Read `n` raw bytes (no bit buffering); cfitsio `qread`. + fn qread(&mut self, n: usize) -> Result<&[u8]> { + let start = self.nextchar; + let end = start + n; + if end > self.data.len() { + return Err(Error::CompressionError( + "HCOMPRESS: unexpected end of stream (qread)".into(), + )); + } + self.nextchar = end; + Ok(&self.data[start..end]) + } + + /// Read a big-endian 4-byte int (cfitsio `readint`). + fn readint(&mut self) -> Result { + let b = self.qread(4)?; + let mut a = b[0] as i32; + for &x in &b[1..4] { + a = (a << 8) + x as i32; + } + Ok(a) + } + + /// Read a big-endian 8-byte long long (cfitsio `readlonglong`). + fn readlonglong(&mut self) -> Result { + let b = self.qread(8)?; + let mut a = b[0] as i64; + for &x in &b[1..8] { + a = (a << 8) + x as i64; + } + Ok(a) + } + + fn start_inputing_bits(&mut self) { + self.bits_to_go = 0; + } + + fn input_bit(&mut self) -> Result { + if self.bits_to_go == 0 { + self.buffer2 = self.next_byte()?; + self.bits_to_go = 8; + } + self.bits_to_go -= 1; + Ok((self.buffer2 >> self.bits_to_go) & 1) + } + + fn input_nbits(&mut self, n: i32) -> Result { + if self.bits_to_go < n { + self.buffer2 = (self.buffer2 << 8) | self.next_byte()?; + self.bits_to_go += 8; + } + self.bits_to_go -= n; + Ok((self.buffer2 >> self.bits_to_go) & ((1 << n) - 1)) + } + + #[inline] + fn input_nybble(&mut self) -> Result { + self.input_nbits(4) + } + + /// Read `n` 4-bit nybbles into `array` (cfitsio `input_nnybble`). + fn input_nnybble(&mut self, n: usize, array: &mut [u8]) -> Result<()> { + if n == 1 { + array[0] = self.input_nybble()? as u8; + return Ok(()); + } + if self.bits_to_go == 8 { + // Backspace to reuse the last char (cfitsio quirk). + self.nextchar -= 1; + self.bits_to_go = 0; + } + let shift1 = self.bits_to_go + 4; + let shift2 = self.bits_to_go; + let mut kk = 0usize; + let mut ii = 0usize; + if self.bits_to_go == 0 { + while ii < n / 2 { + self.buffer2 = (self.buffer2 << 8) | self.next_byte()?; + array[kk] = ((self.buffer2 >> 4) & 15) as u8; + array[kk + 1] = (self.buffer2 & 15) as u8; + kk += 2; + ii += 1; + } + } else { + while ii < n / 2 { + self.buffer2 = (self.buffer2 << 8) | self.next_byte()?; + array[kk] = ((self.buffer2 >> shift1) & 15) as u8; + array[kk + 1] = ((self.buffer2 >> shift2) & 15) as u8; + kk += 2; + ii += 1; + } + } + if ii * 2 != n { + array[n - 1] = self.input_nybble()? as u8; + } + Ok(()) + } + + /// Huffman decode of a 4-bit code (cfitsio `input_huffman`). + fn input_huffman(&mut self) -> Result { + let mut c = self.input_nbits(3)?; + if c < 4 { + return Ok(1 << c); + } + c = self.input_bit()? | (c << 1); + if c < 13 { + match c { + 8 => return Ok(3), + 9 => return Ok(5), + 10 => return Ok(10), + 11 => return Ok(12), + 12 => return Ok(15), + _ => {} + } + } + c = self.input_bit()? | (c << 1); + if c < 31 { + match c { + 26 => return Ok(6), + 27 => return Ok(7), + 28 => return Ok(9), + 29 => return Ok(11), + 30 => return Ok(13), + _ => {} + } + } + c = self.input_bit()? | (c << 1); + if c == 62 { + Ok(0) + } else { + Ok(14) + } + } +} + +/// Expand 4-bit quadtree values from `a[(nx+1)/2,(ny+1)/2]` into `b[nx,ny]` +/// (2x2 per value); cfitsio `qtree_copy`. `a` and `b` are the same buffer here, so +/// we operate in place exactly as the C does (iterating from the end first). +fn qtree_copy(buf: &mut [u8], nx: usize, ny: usize, n: usize) { + let nx2 = nx.div_ceil(2); + let ny2 = ny.div_ceil(2); + // Copy 4-bit values to b, from the end (a,b same array). + // k is index of a[i,j]; s00 is index of b[2*i,2*j]. + let mut k = (ny2 * (nx2 - 1) + ny2 - 1) as isize; + for i in (0..nx2).rev() { + let mut s00 = (2 * (n * i + ny2 - 1)) as isize; + for _j in (0..ny2).rev() { + buf[s00 as usize] = buf[k as usize]; + k -= 1; + s00 -= 2; + } + } + // Expand each 2x2 block. Mapping: bit3->b[s00], bit2->b[s00+1], + // bit1->b[s10], bit0->b[s10+1] where s10 = s00+n. + let mut i = 0usize; + while i + 1 < nx { + let mut s00 = n * i; + let s10base = s00 + n; + let mut s10 = s10base; + let mut j = 0usize; + while j + 1 < ny { + let v = buf[s00]; + buf[s10 + 1] = v & 1; + buf[s10] = (v >> 1) & 1; + buf[s00 + 1] = (v >> 2) & 1; + buf[s00] = (v >> 3) & 1; + s00 += 2; + s10 += 2; + j += 2; + } + if j < ny { + // odd row length + let v = buf[s00]; + buf[s10] = (v >> 1) & 1; + buf[s00] = (v >> 3) & 1; + } + i += 2; + } + if i < nx { + // odd column length: last row, s10 off edge + let mut s00 = n * i; + let mut j = 0usize; + while j + 1 < ny { + let v = buf[s00]; + buf[s00 + 1] = (v >> 2) & 1; + buf[s00] = (v >> 3) & 1; + s00 += 2; + j += 2; + } + if j < ny { + let v = buf[s00]; + buf[s00] = (v >> 3) & 1; + } + } +} + +/// One quadtree expansion step (cfitsio `qtree_expand`): copy+expand then read a +/// fresh Huffman code into every non-zero element (scanning from the end). +fn qtree_expand(input: &mut HcInput, buf: &mut [u8], nx: usize, ny: usize) -> Result<()> { + qtree_copy(buf, nx, ny, ny); + for i in (0..nx * ny).rev() { + if buf[i] != 0 { + buf[i] = input.input_huffman()? as u8; + } + } + Ok(()) +} + +macro_rules! impl_hdecompress { + ($name:ident, $t:ty, $bitins:ident, $read_bdirect:ident, $qtree_decode:ident, + $dodecode:ident, $hinv:ident, $undigitize:ident, $unshuffle:ident) => { + /// Distribute even/odd interleaved coefficients (cfitsio `unshuffle`). + /// `offset` is the base index into `a`; pointer arithmetic is done in + /// `isize` so the trailing (unused) decrements can go negative as in C. + fn $unshuffle(a: &mut [$t], offset: usize, n: usize, n2: usize, tmp: &mut [$t]) { + let base = offset as isize; + let n2i = n2 as isize; + let nhalf = (n + 1) >> 1; + // copy 2nd half of array to tmp + let mut p1 = base + n2i * nhalf as isize; + for slot in tmp.iter_mut().take(n - nhalf) { + *slot = a[p1 as usize]; + p1 += n2i; + } + // distribute 1st half to even elements (descending) + let mut p2 = base + n2i * (nhalf as isize - 1); + let mut p1e = base + ((n2i * (nhalf as isize - 1)) << 1); + let mut i = nhalf as isize - 1; + while i >= 0 { + a[p1e as usize] = a[p2 as usize]; + p2 -= n2i; + p1e -= n2i + n2i; + i -= 1; + } + // distribute 2nd half (tmp) to odd elements + let mut p1o = base + n2i; + let mut pt = 0usize; + let mut i = 1usize; + while i < n { + a[p1o as usize] = tmp[pt]; + p1o += n2i + n2i; + pt += 1; + i += 2; + } + } + + /// Insert expanded 4-bit codes from `aa[(nx+1)/2,(ny+1)/2]` into bitplane + /// `bit` of `b[nx,ny]` (cfitsio `qtree_bitins`). + fn $bitins(aa: &[u8], nx: usize, ny: usize, b: &mut [$t], n: usize, bit: i32) { + let plane_val: $t = (1 as $t) << bit; + let mut k = 0usize; + let mut i = 0usize; + while i + 1 < nx { + let s00 = n * i; + let mut s00 = s00; + let mut j = 0usize; + while j + 1 < ny { + let v = aa[k]; + if v & 1 != 0 { + b[s00 + n + 1] |= plane_val; + } + if v & 2 != 0 { + b[s00 + n] |= plane_val; + } + if v & 4 != 0 { + b[s00 + 1] |= plane_val; + } + if v & 8 != 0 { + b[s00] |= plane_val; + } + s00 += 2; + k += 1; + j += 2; + } + if j < ny { + // odd row: s00+1, s10+1 off edge -> only bits 1 (s10) and 3 (s00) + let v = aa[k]; + if v & 2 != 0 { + b[s00 + n] |= plane_val; + } + if v & 8 != 0 { + b[s00] |= plane_val; + } + k += 1; + } + i += 2; + } + if i < nx { + // odd column: last row, s10 off edge -> bits 2 (s00+1) and 3 (s00) + let mut s00 = n * i; + let mut j = 0usize; + while j + 1 < ny { + let v = aa[k]; + if v & 4 != 0 { + b[s00 + 1] |= plane_val; + } + if v & 8 != 0 { + b[s00] |= plane_val; + } + s00 += 2; + k += 1; + j += 2; + } + if j < ny { + // corner: only bit 3 (s00) + let v = aa[k]; + if v & 8 != 0 { + b[s00] |= plane_val; + } + k += 1; + } + } + let _ = k; + } + + /// Read a directly-stored bit plane and insert it (cfitsio `read_bdirect`). + fn $read_bdirect( + input: &mut HcInput, + a: &mut [$t], + n: usize, + nqx: usize, + nqy: usize, + scratch: &mut [u8], + bit: i32, + ) -> Result<()> { + let cnt = nqx.div_ceil(2) * nqy.div_ceil(2); + input.input_nnybble(cnt, scratch)?; + $bitins(scratch, nqx, nqy, a, n, bit); + Ok(()) + } + + /// Decode the quadtree-coded bit planes of one quadrant (cfitsio + /// `qtree_decode`). + fn $qtree_decode( + input: &mut HcInput, + a: &mut [$t], + a_off: usize, + n: usize, + nqx: usize, + nqy: usize, + nbitplanes: i32, + ) -> Result<()> { + let nqmax = nqx.max(nqy); + let mut log2n = ((nqmax as f32).ln() / 2.0f32.ln() + 0.5) as i32; + if nqmax > (1usize << log2n) { + log2n += 1; + } + let nqx2 = nqx.div_ceil(2); + let nqy2 = nqy.div_ceil(2); + let mut scratch = vec![0u8; nqx2 * nqy2 + 4]; + + let asl = &mut a[a_off..]; + + let mut bit = nbitplanes - 1; + while bit >= 0 { + let b = input.input_nybble()?; + if b == 0 { + $read_bdirect(input, asl, n, nqx, nqy, &mut scratch, bit)?; + } else if b != 0xf { + return Err(Error::CompressionError( + "qtree_decode: bad format code".into(), + )); + } else { + scratch[0] = input.input_huffman()? as u8; + let mut nx = 1usize; + let mut ny = 1usize; + let mut nfx = nqx; + let mut nfy = nqy; + let mut c = 1usize << log2n; + let mut k = 1; + while k < log2n { + c >>= 1; + nx <<= 1; + ny <<= 1; + if nfx <= c { + nx -= 1; + } else { + nfx -= c; + } + if nfy <= c { + ny -= 1; + } else { + nfy -= c; + } + qtree_expand(input, &mut scratch, nx, ny)?; + k += 1; + } + $bitins(&scratch, nqx, nqy, asl, n, bit); + } + bit -= 1; + } + Ok(()) + } + + /// Decode the four quadrants into coefficient array `a` (cfitsio + /// `dodecode`). + fn $dodecode( + input: &mut HcInput, + a: &mut [$t], + nx: usize, + ny: usize, + nbitplanes: [u8; 3], + ) -> Result<()> { + let nel = nx * ny; + let nx2 = nx.div_ceil(2); + let ny2 = ny.div_ceil(2); + for v in a.iter_mut().take(nel) { + *v = 0 as $t; + } + input.start_inputing_bits(); + $qtree_decode(input, a, 0, ny, nx2, ny2, nbitplanes[0] as i32)?; + $qtree_decode(input, a, ny2, ny, nx2, ny / 2, nbitplanes[1] as i32)?; + $qtree_decode(input, a, ny * nx2, ny, nx / 2, ny2, nbitplanes[1] as i32)?; + $qtree_decode( + input, + a, + ny * nx2 + ny2, + ny, + nx / 2, + ny / 2, + nbitplanes[2] as i32, + )?; + if input.input_nybble()? != 0 { + return Err(Error::CompressionError( + "dodecode: bad bit plane values (missing EOF)".into(), + )); + } + // sign bits + input.start_inputing_bits(); + for v in a.iter_mut().take(nel) { + if *v != 0 as $t && input.input_bit()? != 0 { + *v = (0 as $t).wrapping_sub(*v); + } + } + Ok(()) + } + + fn $undigitize(a: &mut [$t], nel: usize, scale: i32) { + if scale <= 1 { + return; + } + let s = scale as $t; + for v in a.iter_mut().take(nel) { + *v = (*v).wrapping_mul(s); + } + } + + /// Inverse H-transform (cfitsio `hinv`). `smooth` is unsupported (the + /// fixtures use SMOOTH=0); a non-zero value is rejected by the caller. + fn $hinv(a: &mut [$t], nx: usize, ny: usize) { + let nmax = nx.max(ny); + let mut log2n = ((nmax as f32).ln() / 2.0f32.ln() + 0.5) as i32; + if nmax > (1usize << log2n) { + log2n += 1; + } + let nmax_i = nmax; + let mut tmp = vec![0 as $t; nmax_i.div_ceil(2) + 1]; + + let mut shift: i32 = 1; + let mut bit0: $t = (1 as $t) << (log2n - 1); + let mut bit1: $t = bit0 << 1; + let mut bit2: $t = bit0 << 2; + let mut mask0: $t = (0 as $t).wrapping_sub(bit0); + let mut mask1: $t = mask0 << 1; + let mask2: $t = mask0 << 2; + let mut prnd0: $t = bit0 >> 1; + let mut prnd1: $t = bit1 >> 1; + let prnd2: $t = bit2 >> 1; + let mut nrnd0: $t = prnd0 - 1; + let mut nrnd1: $t = prnd1 - 1; + let nrnd2: $t = prnd2 - 1; + + // round h0 to multiple of bit2 + a[0] = (a[0].wrapping_add(if a[0] >= 0 as $t { prnd2 } else { nrnd2 })) & mask2; + + let ny_i = ny as isize; + let mut nxtop = 1usize; + let mut nytop = 1usize; + let mut nxf = nx; + let mut nyf = ny; + let mut c = 1usize << log2n; + let mut k = log2n - 1; + while k >= 0 { + c >>= 1; + nxtop <<= 1; + nytop <<= 1; + if nxf <= c { + nxtop -= 1; + } else { + nxf -= c; + } + if nyf <= c { + nytop -= 1; + } else { + nyf -= c; + } + if k == 0 { + nrnd0 = 0 as $t; + shift = 2; + } + // unshuffle in each dimension + for i in 0..nxtop { + $unshuffle(a, ny * i, nytop, 1, &mut tmp); + } + for j in 0..nytop { + $unshuffle(a, j, nxtop, ny, &mut tmp); + } + let oddx = nxtop % 2; + let oddy = nytop % 2; + let mut i = 0usize; + while i + oddx < nxtop { + // i steps by 2 over 0..nxtop-oddx + let mut s00 = (ny * i) as isize; + let mut s10 = s00 + ny_i; + let mut j = 0usize; + while j + oddy < nytop { + let mut h0 = a[s00 as usize]; + let mut hx = a[s10 as usize]; + let mut hy = a[(s00 + 1) as usize]; + let mut hc = a[(s10 + 1) as usize]; + hx = (hx.wrapping_add(if hx >= 0 as $t { prnd1 } else { nrnd1 })) & mask1; + hy = (hy.wrapping_add(if hy >= 0 as $t { prnd1 } else { nrnd1 })) & mask1; + hc = (hc.wrapping_add(if hc >= 0 as $t { prnd0 } else { nrnd0 })) & mask0; + let lowbit0 = hc & bit0; + hx = if hx >= 0 as $t { + hx.wrapping_sub(lowbit0) + } else { + hx.wrapping_add(lowbit0) + }; + hy = if hy >= 0 as $t { + hy.wrapping_sub(lowbit0) + } else { + hy.wrapping_add(lowbit0) + }; + let lowbit1 = (hc ^ hx ^ hy) & bit1; + h0 = if h0 >= 0 as $t { + h0.wrapping_add(lowbit0).wrapping_sub(lowbit1) + } else { + h0.wrapping_add(if lowbit0 == 0 as $t { + lowbit1 + } else { + lowbit0.wrapping_sub(lowbit1) + }) + }; + a[(s10 + 1) as usize] = + (h0.wrapping_add(hx).wrapping_add(hy).wrapping_add(hc)) >> shift; + a[s10 as usize] = + (h0.wrapping_add(hx).wrapping_sub(hy).wrapping_sub(hc)) >> shift; + a[(s00 + 1) as usize] = + (h0.wrapping_sub(hx).wrapping_add(hy).wrapping_sub(hc)) >> shift; + a[s00 as usize] = + (h0.wrapping_sub(hx).wrapping_sub(hy).wrapping_add(hc)) >> shift; + s00 += 2; + s10 += 2; + j += 2; + } + if oddy != 0 { + let mut h0 = a[s00 as usize]; + let mut hx = a[s10 as usize]; + hx = (hx.wrapping_add(if hx >= 0 as $t { prnd1 } else { nrnd1 })) & mask1; + let lowbit1 = hx & bit1; + h0 = if h0 >= 0 as $t { + h0.wrapping_sub(lowbit1) + } else { + h0.wrapping_add(lowbit1) + }; + a[s10 as usize] = (h0.wrapping_add(hx)) >> shift; + a[s00 as usize] = (h0.wrapping_sub(hx)) >> shift; + } + i += 2; + } + if oddx != 0 { + let mut s00 = (ny * i) as isize; + let mut j = 0usize; + while j + oddy < nytop { + let mut h0 = a[s00 as usize]; + let mut hy = a[(s00 + 1) as usize]; + hy = (hy.wrapping_add(if hy >= 0 as $t { prnd1 } else { nrnd1 })) & mask1; + let lowbit1 = hy & bit1; + h0 = if h0 >= 0 as $t { + h0.wrapping_sub(lowbit1) + } else { + h0.wrapping_add(lowbit1) + }; + a[(s00 + 1) as usize] = (h0.wrapping_add(hy)) >> shift; + a[s00 as usize] = (h0.wrapping_sub(hy)) >> shift; + s00 += 2; + j += 2; + } + if oddy != 0 { + let h0 = a[s00 as usize]; + a[s00 as usize] = h0 >> shift; + } + } + // divide masks/rounding by 2 + bit2 = bit1; + bit1 = bit0; + bit0 >>= 1; + mask1 = mask0; + mask0 >>= 1; + prnd1 = prnd0; + prnd0 >>= 1; + nrnd1 = nrnd0; + nrnd0 = prnd0 - 1; + k -= 1; + } + let _ = (bit2, mask1, prnd1, nrnd1); + } + + /// Full HCOMPRESS decode for one quadrant-int width. Returns the pixel + /// array (axis-1 fastest) along with `(nx_slow, ny_fast)`. + fn $name(input: &mut HcInput, smooth: i32) -> Result<(Vec<$t>, usize, usize)> { + // magic code + let magic = input.qread(2)?; + if magic != [0xDDu8, 0x99u8] { + return Err(Error::CompressionError( + "HCOMPRESS: bad magic code".into(), + )); + } + let nx = input.readint()? as usize; // slow axis + let ny = input.readint()? as usize; // fast axis + let scale = input.readint()?; + let nel = nx.checked_mul(ny).ok_or_else(|| { + Error::CompressionError("HCOMPRESS: dimension overflow".into()) + })?; + let sumall = input.readlonglong()?; + let nbp = input.qread(3)?; + let nbitplanes = [nbp[0], nbp[1], nbp[2]]; + + let mut a = vec![0 as $t; nel.max(1)]; + $dodecode(input, &mut a, nx, ny, nbitplanes)?; + // put sum of all pixels back into pixel 0 + a[0] = sumall as $t; + + if smooth != 0 { + return Err(Error::UnsupportedCompression( + "HCOMPRESS_1 SMOOTH != 0 is not supported".into(), + )); + } + $undigitize(&mut a, nel, scale); + $hinv(&mut a, nx, ny); + Ok((a, nx, ny)) + } + }; +} + +impl_hdecompress!( + hdecode32, + i32, + qtree_bitins32, + read_bdirect32, + qtree_decode32, + dodecode32, + hinv32, + undigitize32, + unshuffle32 +); +impl_hdecompress!( + hdecode64, + i64, + qtree_bitins64, + read_bdirect64, + qtree_decode64, + dodecode64, + hinv64, + undigitize64, + unshuffle64 +); + +/// Decompress one HCOMPRESS_1 tile. +/// +/// `wide` selects the 64-bit transform (used for ZBITPIX 32 and quantized +/// -32/-64 floats; cfitsio uses the 32-bit transform only for ZBITPIX 8/16). +/// Returns the decoded integer samples (axis-1 fastest) and the `(nx_slow, +/// ny_fast)` dimensions read from the stream. +pub fn hcompress_decompress( + src: &[u8], + wide: bool, + smooth: i32, +) -> Result<(Vec, usize, usize)> { + let mut input = HcInput::new(src); + if wide { + let (a, nx, ny) = hdecode64(&mut input, smooth)?; + Ok((a, nx, ny)) + } else { + let (a, nx, ny) = hdecode32(&mut input, smooth)?; + Ok((a.into_iter().map(|v| v as i64).collect(), nx, ny)) + } } #[cfg(test)] @@ -1146,6 +2133,33 @@ mod tests { } } + #[test] + fn plio_single_pixel_at_end_of_run() { + // PLIO line list for a 512-pixel row with one pixel == 1 at index 178 + // (0-based), all else zero. Lifted verbatim from the cfitsio `fpack -p` + // output for row 137 of EUVEngc4151imgx.fits. Header word [2] = -100 (the + // PLIO magic, 0xFF9C) forces the long-form length decode. + let words: Vec = [0u16, 7, 0xFF9C, 9, 0, 0, 0, 20659, 333] + .iter() + .map(|&w| w as i16) + .collect(); + let out = plio_decompress(&words, 512).unwrap(); + let mut expected = vec![0i32; 512]; + expected[178] = 1; // opcode 5 (run 179, last pixel set to pv=1) + assert_eq!(out, expected); + } + + #[test] + fn plio_empty_list_is_all_zero() { + // A zero-length line list (lllen == 0) decodes to an all-zero tile. + let words: Vec = [0u16, 7, 0xFF9C, 0, 0, 0, 0] + .iter() + .map(|&w| w as i16) + .collect(); + let out = plio_decompress(&words, 16).unwrap(); + assert_eq!(out, vec![0i32; 16]); + } + #[test] fn bit_reader_basic() { // 0b1011_0010, 0b1100_0000 diff --git a/tests/sample_files.rs b/tests/sample_files.rs index c883a0e..3a48201 100644 --- a/tests/sample_files.rs +++ b/tests/sample_files.rs @@ -266,6 +266,24 @@ fn rice_roundtrip_fgs_i32() { assert_rice_roundtrip("FGSf64y0106m_a1f.fits", "FGSf64y0106m_a1f.rice.fits.fz"); } +#[test] +fn plio_roundtrip_euv() { + // PLIO_1 (IRAF pixel-list RLE) over the I16 EUV image extensions. Lossless, + // so the decode must reproduce the source pixels byte-for-byte. + assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.plio.fits.fz"); +} + +#[test] +fn hcompress_int_roundtrip_euv() { + // HCOMPRESS_1 with SCALE=0 is LOSSLESS for integer images, so the decode of + // the I16 EUV extensions must be byte-exact vs the source. Exercises both the + // 512x16 and 2048x16 tile geometries. + assert_rice_roundtrip( + "EUVEngc4151imgx.fits", + "EUVEngc4151imgx.hcomp_int.fits.fz", + ); +} + // --- float tile-compression: bit-exact vs funpack ------------------------- // // Quantized-float decompression is *lossy*, so the reconstructed floats will not @@ -432,6 +450,14 @@ fn float_cube_rice_nodither_matches_funpack() { assert_float_matches_funpack("WFPC2u5780205r_c0fx.rice_nodith.fits.fz"); } +#[test] +fn float_hcompress_dither1_matches_funpack() { + // 1024x1024 F32, HCOMPRESS_1 (SCALE=0 on the quantized ints), + // ZQUANTIZ=SUBTRACTIVE_DITHER_1. Lossy (quantized), so validate bit-exact + // against funpack's own reconstruction. Tiles are 1024x16. + assert_float_matches_funpack("FOCx38i0101t_c0f.hcomp.fits.fz"); +} + #[cfg(feature = "gzip")] #[test] fn gzip1_roundtrip_euv() { From 5dd3fe5625725bd54ee237bb702d6c68b182de9d Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 14:35:18 -0400 Subject: [PATCH 4/9] Docs: document full tile-compression read support - README: list PLIO_1 and HCOMPRESS_1 as supported; drop the "growing" hedge; note SMOOTH!=0 as the only HCOMPRESS gap; clarify which algorithms work without the gzip feature. - CLAUDE.md: add tile_compress.rs to the module table; add tile compression, the FMA/funpack bit-exactness note, and the compressed fixture workflow to Design Decisions. Co-Authored-By: Claude Opus 4.8 (1M context) --- CLAUDE.md | 7 ++-- README.md | 95 ++++++++++++++++++++++++++++++++++++++++++++----------- 2 files changed, 82 insertions(+), 20 deletions(-) diff --git a/CLAUDE.md b/CLAUDE.md index e63361b..73a38a2 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -26,7 +26,8 @@ Zero external dependencies for core functionality. Optional `image` crate behind | `ascii_table.rs` | `AsciiTable`: TFORMn parsing (Aw/Iw/Fw.d/Ew.d/Dw.d), column access, TSCALn/TZEROn | | `bintable.rs` | `BinTable`: all type codes (L,X,B,I,J,K,A,E,D,C,M,P,Q), heap/VLA | | `checksum.rs` | CHECKSUM/DATASUM ones-complement computation | -| `hdu.rs` | `Hdu` struct, `HduData` enum (Empty/Image/AsciiTable/BinTable) | +| `tile_compress.rs` | Tiled-image compression decode: RICE_1, GZIP_1/2 (feature `gzip`), PLIO_1, HCOMPRESS_1; quantization + subtractive dithering for floats | +| `hdu.rs` | `Hdu` struct, `HduData` enum (Empty/Image/AsciiTable/BinTable); `as_compressed_image()` accessor | | `fits.rs` | `FitsFile`: top-level read/write, HDU iteration, builder API | | `image_conv.rs` | (feature="image") `DynamicImage` <-> `ImageData` conversion | @@ -65,7 +66,9 @@ NASA sample FITS files in `samp/`: ## Design Decisions -- **No external dependencies** for core — no `byteorder`, no `thiserror` +- **No external dependencies** for core — no `byteorder`, no `thiserror`. Only optional, feature-gated deps: `image` and `gzip` (`miniz_oxide`) - **Random groups** skipped (deprecated per standard) - BSCALE/BZERO: raw vs scaled access modes on `ImageData` - Unsigned integer convention: BZERO offset (32768 for u16, etc.) +- **Tile compression**: decode (read) only — RICE_1/PLIO_1/HCOMPRESS_1 in the zero-dep core, GZIP_1/2 behind the `gzip` feature. Lazy `hdu.as_compressed_image()?.decompress()?`; `HduData` stays `BinTable` so the compressed tiles survive for lossless round-trip. Float decode reproduces cfitsio's fused multiply-add to stay bit-exact vs `funpack` (see memory). Compression *encode* (write) and HCOMPRESS `SMOOTH≠0` are not implemented +- **Compressed fixtures**: `scripts/gen_compressed_fixtures.sh` builds fpack `.fz` test files into `samp/` (gitignored, served from GCS bucket `fits4_samples`); compression tests skip when fixtures/`funpack` are absent diff --git a/README.md b/README.md index 464e9dc..20c7f44 100644 --- a/README.md +++ b/README.md @@ -1,27 +1,31 @@ # fits4 -Pure Rust implementation of the [FITS](https://fits.gsfc.nasa.gov/fits_standard.html) (Flexible Image Transport System) v4.0 standard for reading and writing astronomical data files. +**Pure-Rust, zero-dependency reader and writer for [FITS](https://fits.gsfc.nasa.gov/fits_standard.html) v4.0** — the Flexible Image Transport System format used throughout astronomy. -Zero external dependencies for core functionality. +[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE) + +fits4 reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It also decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) on the read path. ## Features -- **All standard HDU types** — Primary, IMAGE, ASCII TABLE, BINTABLE +- **All standard HDU types** — Primary, IMAGE, ASCII `TABLE`, `BINTABLE` +- **Full read *and* write** — round-trip files, bytes, or any `Read`/`Write` - **All BITPIX types** — 8, 16, 32, 64, -32, -64 -- **Variable-length arrays** — P and Q heap descriptors in binary tables -- **BSCALE/BZERO** — raw and scaled access with unsigned integer convention -- **CHECKSUM/DATASUM** — ones-complement integrity verification -- **`image` crate integration** — optional, behind `image` feature flag - -## Usage +- **BSCALE/BZERO** — raw and scaled access, with the unsigned-integer convention +- **Variable-length arrays** — `P` and `Q` heap descriptors in binary tables +- **CHECKSUM/DATASUM** — ones-complement integrity computation and verification +- **Tile-compressed images (read)** — RICE_1, GZIP_1, GZIP_2, PLIO_1, and HCOMPRESS_1, including float quantization with subtractive dithering; decoded bit-exactly against cfitsio's `funpack` +- **Zero dependencies by default** — optional `image` and `gzip` features pull in pure-Rust crates only -Add to your `Cargo.toml`: +## Installation ```toml [dependencies] fits4 = "0.1" ``` +## Usage + ### Reading a FITS file ```rust @@ -34,6 +38,9 @@ println!("BITPIX = {}", primary.header.get_int("BITPIX").unwrap()); if let HduData::Image(img) = &primary.data { println!("{}x{}", img.width().unwrap(), img.height().unwrap()); + if let PixelData::F32(pixels) = &img.pixels { + println!("first pixel = {}", pixels[0]); + } } for hdu in fits.extensions() { @@ -50,7 +57,7 @@ for hdu in fits.extensions() { ### Writing a FITS file ```rust -use fits4::{FitsFile, Hdu, ImageData, PixelData, HeaderValue}; +use fits4::{FitsFile, ImageData, PixelData, HeaderValue}; let pixels: Vec = (0..10000).map(|i| (i % 1000) as i16).collect(); let img = ImageData::new(vec![100, 100], PixelData::I16(pixels)); @@ -88,6 +95,27 @@ fits.push_extension(Hdu::bintable_extension(table)); # let _ = fits.to_bytes(); ``` +### Reading a tile-compressed image + +Tile-compressed images are stored on disk as a `BINTABLE` extension (the FITS +Tiled Image Compression convention). `Hdu::as_compressed_image` cheaply detects +them; `decompress` reassembles the full image lazily. + +```rust +use fits4::FitsFile; + +let fits = FitsFile::from_file("compressed.fits")?; + +for hdu in fits.extensions() { + if let Some(cimg) = hdu.as_compressed_image() { + println!("compression = {:?}", cimg.compression()); + let image = cimg.decompress()?; + println!("decompressed to {:?}", image.axes); + } +} +# Ok::<(), fits4::Error>(()) +``` + ### Checksums ```rust @@ -105,17 +133,48 @@ fits2.primary().verify_datasum()?; # Ok::<(), fits4::Error>(()) ``` -## Not supported +## Feature flags -- **Tile-compressed images** (`ZIMAGE`, Rice/GZIP/HCOMPRESS) — use [`fpack`/`funpack`](https://heasarc.gsfc.nasa.gov/fitsio/fpack/) to decompress externally -- **Random groups** (deprecated in FITS v4.0) +| Flag | Default | Description | +|------|---------|-------------| +| *(none)* | ✓ | Core read/write, all HDU types, RICE_1 / PLIO_1 / HCOMPRESS_1 tile decompression — zero dependencies | +| `image` | | Conversion between `ImageData` and the [`image`](https://crates.io/crates/image) crate's `DynamicImage` | +| `gzip` | | Decoding of `GZIP_1`/`GZIP_2` tile-compressed images via the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate | -## Feature flags +The default build stays dependency-free; `RICE_1`, `PLIO_1`, and `HCOMPRESS_1` decompression all work without any feature (only `GZIP_1`/`GZIP_2` need the `gzip` feature). + +## Why fits4? + +Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, and complete compressed-image read support**, with no C toolchain to install. + +| Crate | Pure Rust | Write | Tables | Compressed read | Notes | +|-------|-----------|-------|--------|-----------------|-------| +| [`fitsio`](https://crates.io/crates/fitsio) | ✗ | ✓ | ✓ | ✓ | Wraps the cfitsio C library; needs a C toolchain | +| [`fitsrs`](https://crates.io/crates/fitsrs) | ✓ | ✗ | partial | — | Read-only | +| [`fitrs`](https://crates.io/crates/fitrs) | ✓ | ✓ | ✗ | ✗ | Dormant; no table support | +| **fits4** | ✓ | ✓ | ✓ | ✓ (all types) | Zero default deps; no C dependency | + +## Supported / not supported + +**Supported** + +- Primary + extension HDUs; images, ASCII tables, binary tables (read + write) +- All BITPIX types; BSCALE/BZERO scaling and the unsigned-integer convention +- Variable-length arrays (`P`/`Q` descriptors) in binary tables +- CHECKSUM/DATASUM computation and verification +- Tile-compressed image **reading**: RICE_1, GZIP_1, GZIP_2, PLIO_1, and + HCOMPRESS_1 — for integer and floating-point images, including quantization + with subtractive dithering (`NO_DITHER` / `SUBTRACTIVE_DITHER_1` / + `SUBTRACTIVE_DITHER_2`) + +**Not supported** -| Flag | Description | -|------|-------------| -| `image` | Enables conversion between `ImageData` and the [`image`](https://crates.io/crates/image) crate's `DynamicImage` | +- Writing/encoding tile-compressed images (read/decode only) +- HCOMPRESS image smoothing (`SMOOTH` ≠ 0) on decode +- Random groups (deprecated in FITS v4.0) ## License [MIT](LICENSE) + + From c2ae1ea008fe079303a081e286f8df2ed1124466 Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 15:55:45 -0400 Subject: [PATCH 5/9] Docs: add Core types overview to README Describe the public type surface (FitsFile/Hdu/HduData/Header/ImageData/ BinTable/AsciiTable/CompressedImage/Bitpix) and the HDU data model so the README is a complete description of the crate. Co-Authored-By: Claude Opus 4.8 (1M context) --- README.md | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/README.md b/README.md index 20c7f44..39aa0d2 100644 --- a/README.md +++ b/README.md @@ -133,6 +133,29 @@ fits2.primary().verify_datasum()?; # Ok::<(), fits4::Error>(()) ``` +## Core types + +A FITS file is an ordered sequence of **Header-Data Units (HDUs)**. fits4 mirrors +that structure directly: + +| Type | Role | +|------|------| +| `FitsFile` | Top-level container — an ordered `Vec`. Reads/writes files, byte slices, or any `Read`/`Write`; builder methods (`with_primary_image`, `with_empty_primary`, `push_extension`, `to_file`, `to_bytes`, `to_bytes_with_checksum`). | +| `Hdu` | One Header-Data Unit: a `Header` plus an `HduData` payload. `as_compressed_image()` exposes a tile-compressed image stored in a `BINTABLE`. | +| `HduData` | The payload enum: `Empty`, `Image(ImageData)`, `AsciiTable(AsciiTable)`, `BinTable(BinTable)`. | +| `Header` | Ordered list of `Keyword`s with typed accessors (`get_int`, `get_float`, `get_string`, `get_bool`) and `set`. | +| `Keyword` / `HeaderValue` | An 80-byte header card and its typed value (with `CONTINUE` long-string handling). | +| `ImageData` / `PixelData` | Image `axes` plus a typed pixel buffer (`U8`/`I16`/`I32`/`I64`/`F32`/`F64`); raw access and BSCALE/BZERO-scaled access (`scaled_values`). | +| `BinTable` / `BinColumn` / `BinColumnType` / `BinCellValue` | Binary-table model — columns, rows, and heap (variable-length arrays); built with `BinTableBuilder`. | +| `AsciiTable` | ASCII `TABLE` model with `TFORMn` column parsing and `TSCALn`/`TZEROn` scaling. | +| `CompressedImage` / `CompressionType` / `TileGeometry` / `Quantize` | Read-side view over a tile-compressed image: detect the algorithm and geometry, then `decompress()` to an `ImageData`. | +| `Bitpix` | The `BITPIX` data type (`8`/`16`/`32`/`64`/`-32`/`-64`). | +| `Error` / `Result` | Crate error type and result alias. | + +Decoding is done eagerly at read time (big-endian → native), except tile-compressed +images, which are decoded lazily on `decompress()` so the original compressed tiles +are preserved for a lossless round-trip write. + ## Feature flags | Flag | Default | Description | From 70b427d0dd54b15e027bf730011d05b37300568a Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 16:13:29 -0400 Subject: [PATCH 6/9] Add tile-compression encode (write): RICE_1 and GZIP MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit fits4 can now WRITE tile-compressed FITS images — the capability no other pure-Rust crate offers. Output is read back byte-for-byte by cfitsio's funpack (and imcopy/astropy). - ImageData::compress(&CompressOptions) -> Hdu produces a compressed-image BINTABLE HDU (HduData::BinTable + ZIMAGE header + COMPRESSED_DATA VLA); caller does fits.push_extension(img.compress(&opts)?). Mirrors the read side's as_compressed_image()/decompress(). - RICE_1 encoder: exact inverse of the decoder — seed verbatim, per-block zig-zag diffs, FS selection minimizing block_n*(fs+1)+(sum>>fs); constants match the decoder (fsbits/fsmax/bbits 5/25/32, 4/14/16, 3/6/8). - Lossless: RICE_1/GZIP_1/GZIP_2 for integer (8/16/32), GZIP_1 for raw float. Lossy: RICE_1/GZIP on quantized + dithered floats. PLIO_1, HCOMPRESS_1, NOCOMPRESS, 64-bit int, and GZIP_2-lossless-float rejected with a clear Error. - Z* keyword order is load-bearing: funpack rebuilds the image header by walking cards, so ZTENSION must precede ZBITPIX/ZNAXIS (else "1st key not SIMPLE or XTENSION"). build_z_header emits fpack's order. Documented. - Tests: synthetic RICE encode->decode (i8/i16/i32), internal round-trips, and 6 funpack-reads-fits4 interop tests (byte-exact), skip-if-absent. Co-Authored-By: Claude Opus 4.8 (1M context) --- CLAUDE.md | 4 +- src/image_data.rs | 25 + src/lib.rs | 18 +- src/tile_compress.rs | 1152 ++++++++++++++++++++++++++++++++++++++++- tests/sample_files.rs | 275 ++++++++++ 5 files changed, 1465 insertions(+), 9 deletions(-) diff --git a/CLAUDE.md b/CLAUDE.md index 73a38a2..914a1c6 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -26,7 +26,7 @@ Zero external dependencies for core functionality. Optional `image` crate behind | `ascii_table.rs` | `AsciiTable`: TFORMn parsing (Aw/Iw/Fw.d/Ew.d/Dw.d), column access, TSCALn/TZEROn | | `bintable.rs` | `BinTable`: all type codes (L,X,B,I,J,K,A,E,D,C,M,P,Q), heap/VLA | | `checksum.rs` | CHECKSUM/DATASUM ones-complement computation | -| `tile_compress.rs` | Tiled-image compression decode: RICE_1, GZIP_1/2 (feature `gzip`), PLIO_1, HCOMPRESS_1; quantization + subtractive dithering for floats | +| `tile_compress.rs` | Tiled-image compression. Decode: RICE_1, GZIP_1/2 (feature `gzip`), PLIO_1, HCOMPRESS_1; quantization + subtractive dithering for floats. Encode (`compress_image`/`ImageData::compress`): RICE_1 (int + quantized float) and GZIP_1/2 (int; lossless float via GZIP_1) | | `hdu.rs` | `Hdu` struct, `HduData` enum (Empty/Image/AsciiTable/BinTable); `as_compressed_image()` accessor | | `fits.rs` | `FitsFile`: top-level read/write, HDU iteration, builder API | | `image_conv.rs` | (feature="image") `DynamicImage` <-> `ImageData` conversion | @@ -70,5 +70,5 @@ NASA sample FITS files in `samp/`: - **Random groups** skipped (deprecated per standard) - BSCALE/BZERO: raw vs scaled access modes on `ImageData` - Unsigned integer convention: BZERO offset (32768 for u16, etc.) -- **Tile compression**: decode (read) only — RICE_1/PLIO_1/HCOMPRESS_1 in the zero-dep core, GZIP_1/2 behind the `gzip` feature. Lazy `hdu.as_compressed_image()?.decompress()?`; `HduData` stays `BinTable` so the compressed tiles survive for lossless round-trip. Float decode reproduces cfitsio's fused multiply-add to stay bit-exact vs `funpack` (see memory). Compression *encode* (write) and HCOMPRESS `SMOOTH≠0` are not implemented +- **Tile compression**: decode (read) for RICE_1/PLIO_1/HCOMPRESS_1 in the zero-dep core, GZIP_1/2 behind the `gzip` feature. Lazy `hdu.as_compressed_image()?.decompress()?`; `HduData` stays `BinTable` so the compressed tiles survive for lossless round-trip. Float decode reproduces cfitsio's fused multiply-add to stay bit-exact vs `funpack` (see memory). Encode (write) via `image.compress(&CompressOptions{..})? -> Hdu` (then `fits.push_extension(..)`): RICE_1 (int lossless + quantized/dithered float lossy) and GZIP_1/2 (int lossless; lossless raw-float storage via GZIP_1). Encoders are byte-exact inverses of the decoders and emit `funpack`-readable FITS. **Z\* keyword order is load-bearing**: `funpack` rebuilds the image header by walking cards, so `ZTENSION` must precede `ZBITPIX`/`ZNAXIS` (else "1st key not SIMPLE or XTENSION"); `build_z_header` emits fpack's order. PLIO_1/HCOMPRESS_1 encode and HCOMPRESS `SMOOTH≠0` are not implemented - **Compressed fixtures**: `scripts/gen_compressed_fixtures.sh` builds fpack `.fz` test files into `samp/` (gitignored, served from GCS bucket `fits4_samples`); compression tests skip when fixtures/`funpack` are absent diff --git a/src/image_data.rs b/src/image_data.rs index 33a76f2..33ced0c 100644 --- a/src/image_data.rs +++ b/src/image_data.rs @@ -153,6 +153,31 @@ impl ImageData { Ok(ImageData { axes, pixels }) } + /// Compress this image into a tile-compressed BINTABLE [`Hdu`](crate::hdu::Hdu) + /// (`ZIMAGE = T`), ready for [`FitsFile::push_extension`](crate::fits::FitsFile::push_extension). + /// + /// Thin wrapper over [`compress_image`](crate::tile_compress::compress_image). See + /// [`CompressOptions`](crate::tile_compress::CompressOptions) for the algorithm, + /// tiling, and float-quantization knobs. The round-trip inverse is + /// [`Hdu::as_compressed_image`](crate::hdu::Hdu::as_compressed_image) + + /// [`CompressedImage::decompress`](crate::tile_compress::CompressedImage::decompress). + /// + /// ``` + /// use fits4::{ImageData, PixelData}; + /// use fits4::tile_compress::CompressOptions; + /// + /// let img = ImageData::new(vec![8, 4], PixelData::I16((0..32).collect())); + /// let hdu = img.compress(&CompressOptions::default()).unwrap(); + /// let back = hdu.as_compressed_image().unwrap().decompress().unwrap(); + /// assert_eq!(back.pixels.to_bytes(), img.pixels.to_bytes()); + /// ``` + pub fn compress( + &self, + opts: &crate::tile_compress::CompressOptions, + ) -> Result { + crate::tile_compress::compress_image(self, opts) + } + /// Populate header keywords for this image data. pub fn fill_header(&self, header: &mut Header) { header.set("BITPIX", HeaderValue::Integer(self.bitpix().to_i64()), None); diff --git a/src/lib.rs b/src/lib.rs index 7c39eec..446f382 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -10,12 +10,18 @@ //! - **ASCII TABLE extension** — Aw, Iw, Fw.d, Ew.d, Dw.d column formats //! - **BINTABLE extension** — all type codes including variable-length arrays (P/Q descriptors) //! -//! Tile-compressed images (`ZIMAGE`) are supported for **reading** integer images -//! (`RICE_1`; `GZIP_1`/`GZIP_2` behind the `gzip` feature) via +//! Tile-compressed images (`ZIMAGE`) are supported for **reading** integer and float +//! images (`RICE_1`, `PLIO_1`, `HCOMPRESS_1`; `GZIP_1`/`GZIP_2` behind the `gzip` +//! feature), including float quantization with `SUBTRACTIVE_DITHER_1`/`_2`, via //! [`Hdu::as_compressed_image`](hdu::Hdu::as_compressed_image) + //! [`CompressedImage::decompress`](tile_compress::CompressedImage::decompress). -//! Float quantization/dithering, `PLIO_1`, `HCOMPRESS_1`, and the write/compress -//! path are not yet implemented. Random groups are not supported. +//! +//! **Writing** tile-compressed images is supported for `RICE_1` (integer + quantized +//! float) and `GZIP_1`/`GZIP_2` (integer; lossless float via `GZIP_1`) via +//! [`ImageData::compress`](image_data::ImageData::compress) / +//! [`compress_image`](tile_compress::compress_image), producing standard, cfitsio- +//! readable (`funpack`-decodable) compressed FITS. `PLIO_1`/`HCOMPRESS_1` encoding and +//! random groups are not supported. //! //! ## Quick start — reading //! @@ -124,5 +130,7 @@ pub use hdu::{Hdu, HduData}; pub use header::Header; pub use image_data::{ImageData, PixelData}; pub use keyword::{HeaderValue, Keyword}; -pub use tile_compress::{CompressedImage, CompressionType, Quantize, TileGeometry}; +pub use tile_compress::{ + compress_image, CompressOptions, CompressedImage, CompressionType, Quantize, TileGeometry, +}; pub use types::Bitpix; diff --git a/src/tile_compress.rs b/src/tile_compress.rs index 8a2035a..97fef75 100644 --- a/src/tile_compress.rs +++ b/src/tile_compress.rs @@ -14,8 +14,14 @@ //! Read (decompression) support is implemented for `RICE_1`, `GZIP_1`/`GZIP_2` //! (behind the `gzip` feature), `PLIO_1`, `HCOMPRESS_1` (`SMOOTH=0`), and //! `NOCOMPRESS`, for integer originals and for quantized/lossless float originals -//! (with `SUBTRACTIVE_DITHER_1`/`_2` reversal). The compression (write) path is -//! not yet implemented (see `COMPRESSION_PLAN.md`). +//! (with `SUBTRACTIVE_DITHER_1`/`_2` reversal). +//! +//! Write (compression) support is implemented for `RICE_1` (integer images and +//! quantized/dithered float images) and `GZIP_1`/`GZIP_2` (integer images; lossless +//! raw-float storage via `GZIP_1`), via [`compress_image`] / [`ImageData::compress`]. +//! The encoders are the exact inverse of the decoders above and emit standard, +//! cfitsio-readable compressed FITS (verified bit-exact against `funpack`). `PLIO_1` +//! and `HCOMPRESS_1` encoding remain unimplemented (see `COMPRESSION_PLAN.md`). use crate::bintable::BinTable; use crate::error::{Error, Result}; @@ -1045,6 +1051,930 @@ pub fn rice_decompress_i8(src: &[u8], nvals: usize, blocksize: usize) -> Result< Ok(v.into_iter().map(|x| x as u8).collect()) } +// --------------------------------------------------------------------------- +// RICE_1 compression (encode) — inverse of `rice_decompress_*` +// --------------------------------------------------------------------------- +// +// Port of cfitsio `fits_rcomp` / `_short` / `_byte` (R. White, STScI, `ricecomp.c`). +// The encoder writes the first value verbatim (the seed `lastpix`) in `bbits` bits, +// then processes the array in blocks of `blocksize` pixels. For each block it computes +// the zig-zag-mapped differences from the running `lastpix`, picks the Rice parameter +// `fs` that minimises the block's coded length, and emits `fs+1` in `fsbits` bits +// followed by the per-pixel codes: +// * fs < 0 -> block of all-zero differences (fs+1 == 0). +// * fs == fsmax -> each mapped diff stored verbatim in `bbits` bits. +// * else -> Rice code: `(top = mapped >> fs)` zero bits, a one bit, then the +// `fs` low bits of `mapped`. +// The chosen `fs` exactly matches cfitsio's sum-based selection so the byte stream is +// identical to `fpack`'s and decodes via [`rice_decompress_core`]. + +/// MSB-first bit writer (the inverse of [`BitReader`]). +struct BitWriterEnc { + bytes: Vec, + /// Bits accumulated in the low `bits_in_buf` bits of `buffer`, flushed a byte at a + /// time (MSB-first) whenever at least 8 are present. A u64 buffer keeps headroom for + /// a 32-bit write on top of up to 7 leftover bits. + buffer: u64, + bits_in_buf: u32, +} + +impl BitWriterEnc { + fn new() -> Self { + BitWriterEnc { + bytes: Vec::new(), + buffer: 0, + bits_in_buf: 0, + } + } + + /// Write the low `n` bits of `val` (0..=32) MSB-first. + fn write_bits(&mut self, val: u32, n: u32) { + if n == 0 { + return; + } + let val = (val as u64) & if n >= 32 { u32::MAX as u64 } else { (1u64 << n) - 1 }; + self.buffer = (self.buffer << n) | val; + self.bits_in_buf += n; + while self.bits_in_buf >= 8 { + self.bits_in_buf -= 8; + let byte = (self.buffer >> self.bits_in_buf) & 0xff; + self.bytes.push(byte as u8); + } + } + + /// Write `count` zero bits then a single one bit (the unary part of a Rice code). + fn write_unary(&mut self, count: u32) { + let mut remaining = count; + while remaining >= 24 { + self.write_bits(0, 24); + remaining -= 24; + } + // remaining zeros followed by a terminating 1 bit. + self.write_bits(1, remaining + 1); + } + + /// Flush any partial byte (zero-padded on the low side, as cfitsio does) and return + /// the encoded bytes. + fn finish(mut self) -> Vec { + if self.bits_in_buf > 0 { + let pad = 8 - self.bits_in_buf; + let byte = (self.buffer << pad) & 0xff; + self.bytes.push(byte as u8); + self.bits_in_buf = 0; + } + debug_assert!(self.bits_in_buf == 0); + self.bytes + } +} + +/// Zig-zag map a signed difference to an unsigned value (inverse of [`unzigzag`]). +#[inline] +fn zigzag_enc(v: i64) -> u64 { + ((v << 1) ^ (v >> 63)) as u64 +} + +/// Core RICE_1 encode shared by the 8/16/32-bit variants. `vals` are the raw integer +/// samples (sign-extended to i64); `fsbits`/`fsmax`/`bbits` are the per-variant +/// constants matching [`rice_decompress_core`]. +fn rice_compress_core( + vals: &[i64], + blocksize: usize, + fsbits: u32, + fsmax: u32, + bbits: u32, +) -> Vec { + let mut w = BitWriterEnc::new(); + if vals.is_empty() { + return w.finish(); + } + let blocksize = blocksize.max(1); + + // Seed: first value verbatim in bbits bits (masked to the field width). + let mask = if bbits >= 64 { + u64::MAX + } else { + (1u64 << bbits) - 1 + }; + let mut lastpix = vals[0]; + w.write_bits((vals[0] as u64 & mask) as u32, bbits); + + let mut i = 0usize; + while i < vals.len() { + let block_n = (vals.len() - i).min(blocksize); + // Zig-zag-mapped differences for this block. + let mut mapped = Vec::with_capacity(block_n); + let mut sum: u64 = 0; + for &v in &vals[i..i + block_n] { + // Compute the difference in the pixel's native bit width: reduce mod 2^bbits + // and sign-extend so it lands in [-2^(bbits-1), 2^(bbits-1)). This guarantees + // the zig-zag mapping fits in `bbits` bits (the verbatim-store width), and the + // decoder's `wrapping_add` + final narrowing reconstructs `v` exactly even + // when the raw difference wraps the type (e.g. MIN→MAX in i16). + let raw_diff = v.wrapping_sub(lastpix); + lastpix = v; + let diff = sign_extend((raw_diff as u64) & mask, bbits); + let m = zigzag_enc(diff); + sum = sum.wrapping_add(m); + mapped.push(m); + } + + // Choose the Rice parameter `fs` that minimises this block's coded length. + let fs = select_fs(sum, block_n as u64, fsmax); + + if fs < 0 { + // All differences are zero (sum == 0): emit fs+1 == 0. + w.write_bits(0, fsbits); + } else if fs as u32 >= fsmax { + // No compression: store each mapped diff verbatim in bbits bits. + w.write_bits(fsmax + 1, fsbits); + for &m in &mapped { + w.write_bits((m & mask) as u32, bbits); + } + } else { + let fs = fs as u32; + w.write_bits(fs + 1, fsbits); + for &m in &mapped { + let top = (m >> fs) as u32; + w.write_unary(top); + if fs > 0 { + w.write_bits((m & ((1u64 << fs) - 1)) as u32, fs); + } + } + } + + i += block_n; + } + + w.finish() +} + +/// Select the Rice parameter `fs` for a block by minimising its coded length. +/// +/// For `block_n` pixels whose zig-zag-mapped differences sum to `sum`, coding with a +/// given `fs` costs (approximately, and exactly for the dominant terms cfitsio uses) +/// `block_n * (fs + 1)` fixed bits (the `fs` low bits plus one stop bit per pixel) plus +/// `sum >> fs` unary tail bits. This is the same convex cost model cfitsio's +/// `fits_rcomp` minimises; because the per-block `fs+1` value is written into the +/// stream, the result decodes correctly with *any* valid `fs`, and minimising this cost +/// reproduces cfitsio's choice on real data. Returns `-1` for an all-zero block +/// (`sum == 0`) and never exceeds `fsmax` (the caller switches to verbatim storage at +/// `fsmax`). +fn select_fs(sum: u64, block_n: u64, fsmax: u32) -> i32 { + if sum == 0 { + return -1; + } + // Walk fs upward while increasing it lowers the estimated bit count; the cost is + // unimodal in fs, so stop at the first non-improving step. + let cost = |fs: u32| block_n.wrapping_mul((fs + 1) as u64).wrapping_add(sum >> fs); + let mut best_fs = 0u32; + let mut best_cost = cost(0); + let mut fs = 1u32; + while fs <= fsmax { + let c = cost(fs); + if c < best_cost { + best_cost = c; + best_fs = fs; + fs += 1; + } else { + break; + } + } + best_fs as i32 +} + +/// Compress 32-bit integer samples with RICE_1 (`BYTEPIX = 4`). +pub fn rice_compress_i32(vals: &[i32], blocksize: usize) -> Vec { + let v: Vec = vals.iter().map(|&x| x as i64).collect(); + rice_compress_core(&v, blocksize, 5, 25, 32) +} + +/// Compress 16-bit integer samples with RICE_1 (`BYTEPIX = 2`). +pub fn rice_compress_i16(vals: &[i16], blocksize: usize) -> Vec { + let v: Vec = vals.iter().map(|&x| x as i64).collect(); + rice_compress_core(&v, blocksize, 4, 14, 16) +} + +/// Compress 8-bit (unsigned) integer samples with RICE_1 (`BYTEPIX = 1`). +pub fn rice_compress_i8(vals: &[u8], blocksize: usize) -> Vec { + let v: Vec = vals.iter().map(|&x| x as i64).collect(); + rice_compress_core(&v, blocksize, 3, 6, 8) +} + +// --------------------------------------------------------------------------- +// GZIP_1 / GZIP_2 compression (encode) +// --------------------------------------------------------------------------- + +/// Wrap raw bytes as a single RFC 1952 gzip member (the inverse of +/// [`strip_gzip_wrapper`] + [`gzip_inflate`]): a fixed 10-byte header, a raw DEFLATE +/// body, and the CRC32 + ISIZE trailer. Matches what zlib's `gzip` (and cfitsio's +/// `GZIP_1`) produce, so the existing decoder and `funpack` both read it back. +#[cfg(feature = "gzip")] +fn gzip_deflate(raw: &[u8]) -> Vec { + let body = miniz_oxide::deflate::compress_to_vec(raw, 6); + let mut out = Vec::with_capacity(body.len() + 18); + // magic, CM=deflate, no flags, mtime=0, XFL=0, OS=255 (unknown). + out.extend_from_slice(&[0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 0xff]); + out.extend_from_slice(&body); + out.extend_from_slice(&gzip_crc32(raw).to_le_bytes()); + out.extend_from_slice(&(raw.len() as u32).to_le_bytes()); + out +} + +/// CRC32 (gzip/PNG polynomial) for the gzip member trailer. +#[cfg(feature = "gzip")] +fn gzip_crc32(data: &[u8]) -> u32 { + let mut crc: u32 = 0xffff_ffff; + for &byte in data { + crc ^= byte as u32; + for _ in 0..8 { + let mask = (crc & 1).wrapping_neg(); + crc = (crc >> 1) ^ (0xedb8_8320 & mask); + } + } + !crc +} + +/// Byte-shuffle the big-endian integer stream for `GZIP_2` (the inverse of +/// [`gzip2_unshuffle`]): emit all the most-significant bytes first, then the next, etc. +#[cfg(feature = "gzip")] +fn gzip2_shuffle(raw: &[u8], bytepix: usize) -> Vec { + if bytepix <= 1 { + return raw.to_vec(); + } + let n = raw.len() / bytepix; + let mut out = vec![0u8; n * bytepix]; + for (i, chunk) in raw.chunks_exact(bytepix).enumerate().take(n) { + for (b, &byte) in chunk.iter().enumerate() { + out[b * n + i] = byte; + } + } + out +} + +// --------------------------------------------------------------------------- +// High-level tile-compression (the public encode entry point) +// --------------------------------------------------------------------------- + +/// Options controlling how [`ImageData::compress`] / [`compress_image`] encode an +/// image into a tile-compressed BINTABLE HDU. +#[derive(Debug, Clone)] +pub struct CompressOptions { + /// Compression algorithm (`ZCMPTYPE`). `RICE_1`, `GZIP_1`, `GZIP_2` are supported + /// for encoding; `PLIO_1`/`HCOMPRESS_1`/`NOCOMPRESS` are rejected. + pub algorithm: CompressionType, + /// Tile shape (`ZTILEn`), axis-1 first. `None` selects cfitsio's default: one row + /// per tile (`ZTILE1 = NAXIS1`, all other axes = 1). + pub tile: Option>, + /// For float images: quantization parameter `q` (controls the per-tile scale, + /// `scale = range / (2*q*sigma_estimate)`-style). `None` ⇒ lossless float storage + /// (raw big-endian floats, only valid with the GZIP codecs). For RICE on floats a + /// quantization value is required. + pub quantize: Option, + /// Subtractive-dithering method for quantized floats (`ZQUANTIZ`). Ignored for the + /// lossless and integer paths. + pub dither: Quantize, + /// Dither seed (`ZDITHER0`, 1..=10000). `None` ⇒ 1. + pub dither_seed: Option, + /// RICE block size (`BLOCKSIZE`); cfitsio default 32. + pub blocksize: usize, +} + +impl Default for CompressOptions { + fn default() -> Self { + CompressOptions { + algorithm: CompressionType::Rice1, + tile: None, + quantize: Some(4.0), + dither: Quantize::SubtractiveDither1, + dither_seed: Some(1), + blocksize: 32, + } + } +} + +/// Default tile shape: one row per tile (`ZTILE1 = NAXIS1`, others 1). +fn default_tile(axes: &[usize]) -> Vec { + let mut t = vec![1usize; axes.len()]; + if let Some(first) = axes.first() { + t[0] = *first; + } + t +} + +/// Extract the flat integer samples of an integer image as `i64` (axis-1 fastest, the +/// native FITS storage order), along with the `BYTEPIX` to feed RICE. +fn integer_samples(image: &crate::image_data::ImageData) -> Result<(Vec, usize)> { + use crate::image_data::PixelData; + Ok(match &image.pixels { + PixelData::U8(v) => (v.iter().map(|&x| x as i64).collect(), 1), + PixelData::I16(v) => (v.iter().map(|&x| x as i64).collect(), 2), + PixelData::I32(v) => (v.iter().map(|&x| x as i64).collect(), 4), + PixelData::I64(_) => { + return Err(Error::CompressionError( + "RICE_1/GZIP tile compression of 64-bit integer images is not supported".into(), + )) + } + PixelData::F32(_) | PixelData::F64(_) => { + return Err(Error::CompressionError( + "internal: float image routed to the integer encoder".into(), + )) + } + }) +} + +/// Gather one tile's flat samples (axis-1 fastest) from the full image buffer. +fn gather_tile( + full: &[i64], + image_dims: &[usize], + ztile: &[usize], + tile_dims: &[usize], + coords: &[usize], +) -> Vec { + let ndim = image_dims.len(); + let mut img_stride = vec![1usize; ndim]; + for axis in 1..ndim { + img_stride[axis] = img_stride[axis - 1] * image_dims[axis - 1]; + } + let mut origin = 0usize; + for axis in 0..ndim { + origin += coords[axis] * ztile[axis].max(1) * img_stride[axis]; + } + let tile_npix: usize = tile_dims.iter().product(); + let mut out = Vec::with_capacity(tile_npix); + let mut tcoord = vec![0usize; ndim]; + for _ in 0..tile_npix { + let mut src = origin; + for axis in 0..ndim { + src += tcoord[axis] * img_stride[axis]; + } + out.push(full[src]); + for axis in 0..ndim { + tcoord[axis] += 1; + if tcoord[axis] < tile_dims[axis] { + break; + } + tcoord[axis] = 0; + } + } + out +} + +/// Tile dimensions at grid coordinates `coords`, accounting for edge truncation. +fn tile_dims_at(image_dims: &[usize], ztile: &[usize], coords: &[usize]) -> Vec { + let mut dims = Vec::with_capacity(coords.len()); + for (axis, &c) in coords.iter().enumerate() { + let n = image_dims[axis]; + let t = ztile[axis].max(1); + let start = c * t; + dims.push((n - start).min(t)); + } + dims +} + +/// Encode one integer tile's samples into its compressed byte stream per `algorithm`. +fn encode_int_tile( + samples: &[i64], + algorithm: CompressionType, + bytepix: usize, + blocksize: usize, +) -> Result> { + match algorithm { + CompressionType::Rice1 => Ok(match bytepix { + 1 => rice_compress_i8( + &samples.iter().map(|&v| v as u8).collect::>(), + blocksize, + ), + 2 => rice_compress_i16( + &samples.iter().map(|&v| v as i16).collect::>(), + blocksize, + ), + 4 => rice_compress_i32( + &samples.iter().map(|&v| v as i32).collect::>(), + blocksize, + ), + other => { + return Err(Error::CompressionError(format!( + "RICE_1 BYTEPIX={other} not supported (expected 1, 2, or 4)" + ))) + } + }), + CompressionType::Gzip1 | CompressionType::Gzip2 => { + #[cfg(feature = "gzip")] + { + let raw = int_samples_to_be(samples, bytepix); + let payload = if algorithm == CompressionType::Gzip2 { + gzip2_shuffle(&raw, bytepix) + } else { + raw + }; + Ok(gzip_deflate(&payload)) + } + #[cfg(not(feature = "gzip"))] + { + let _ = (samples, bytepix); + Err(Error::UnsupportedCompression( + "GZIP_1/GZIP_2 encode requires the `gzip` feature (miniz_oxide)".into(), + )) + } + } + CompressionType::Plio1 | CompressionType::Hcompress1 | CompressionType::NoCompress => { + Err(Error::UnsupportedCompression(format!( + "{algorithm:?} encoding is not supported (only RICE_1, GZIP_1, GZIP_2)" + ))) + } + } +} + +/// Serialise integer samples to big-endian bytes of width `bytepix` (used by GZIP). +#[cfg(feature = "gzip")] +fn int_samples_to_be(samples: &[i64], bytepix: usize) -> Vec { + let mut out = Vec::with_capacity(samples.len() * bytepix); + for &v in samples { + match bytepix { + 1 => out.push(v as u8), + 2 => out.extend_from_slice(&(v as i16).to_be_bytes()), + 4 => out.extend_from_slice(&(v as i32).to_be_bytes()), + 8 => out.extend_from_slice(&v.to_be_bytes()), + _ => out.extend_from_slice(&(v as i32).to_be_bytes()), + } + } + out +} + +/// Build the `Z*` driver keywords for a compressed-image BINTABLE, in the exact order +/// cfitsio's `fpack` writes them. +/// +/// The order is load-bearing: `funpack`/cfitsio reconstruct the uncompressed image +/// header by walking the cards in order and translating `ZBITPIX`/`ZNAXIS`/... into +/// `BITPIX`/`NAXIS`/..., emitting the leading `XTENSION` card from `ZTENSION`. If +/// `ZBITPIX` appears before `ZTENSION`, the rebuilt header starts with `BITPIX` and +/// cfitsio rejects it ("1st key not SIMPLE or XTENSION"). The accepted order is: +/// `ZIMAGE`, `ZTILEn`, `ZCMPTYPE`, `ZNAMEi`/`ZVALi`, `EXTNAME`, `ZTENSION`, `ZBITPIX`, +/// `ZNAXIS`, `ZNAXISn`, `ZPCOUNT`, `ZGCOUNT`. Codec parameter pairs (`zparams`) are +/// emitted between `ZCMPTYPE` and `EXTNAME`. +fn build_z_header( + header: &mut Header, + image: &crate::image_data::ImageData, + ztile: &[usize], + algorithm: CompressionType, + zparams: &[(&str, i64)], +) { + use crate::keyword::HeaderValue; + let zbitpix = image.bitpix().to_i64(); + let znaxis = image.axes.len(); + + header.set( + "ZIMAGE", + HeaderValue::Logical(true), + Some("extension contains compressed image"), + ); + for (i, &t) in ztile.iter().enumerate() { + header.set( + &format!("ZTILE{}", i + 1), + HeaderValue::Integer(t as i64), + Some("size of tiles to be compressed"), + ); + } + let zcmptype = match algorithm { + CompressionType::Rice1 => "RICE_1", + CompressionType::Gzip1 => "GZIP_1", + CompressionType::Gzip2 => "GZIP_2", + CompressionType::Hcompress1 => "HCOMPRESS_1", + CompressionType::Plio1 => "PLIO_1", + CompressionType::NoCompress => "NOCOMPRESS", + }; + header.set( + "ZCMPTYPE", + HeaderValue::String(zcmptype.into()), + Some("compression algorithm"), + ); + for (i, (name, val)) in zparams.iter().enumerate() { + header.set( + &format!("ZNAME{}", i + 1), + HeaderValue::String((*name).into()), + None, + ); + header.set(&format!("ZVAL{}", i + 1), HeaderValue::Integer(*val), None); + } + header.set( + "EXTNAME", + HeaderValue::String("COMPRESSED_IMAGE".into()), + None, + ); + // The compressed image always reconstructs as an IMAGE extension; cfitsio reads the + // original extension type from `ZTENSION` and emits the leading `XTENSION` card. + header.set( + "ZTENSION", + HeaderValue::String("IMAGE".into()), + Some("Image extension"), + ); + header.set( + "ZBITPIX", + HeaderValue::Integer(zbitpix), + Some("data type of original image"), + ); + header.set( + "ZNAXIS", + HeaderValue::Integer(znaxis as i64), + Some("dimension of original image"), + ); + for (i, &n) in image.axes.iter().enumerate() { + header.set( + &format!("ZNAXIS{}", i + 1), + HeaderValue::Integer(n as i64), + None, + ); + } + header.set("ZPCOUNT", HeaderValue::Integer(0), Some("original PCOUNT")); + header.set("ZGCOUNT", HeaderValue::Integer(1), Some("original GCOUNT")); +} + +/// Compress an [`ImageData`] into a tile-compressed BINTABLE [`Hdu`] (`ZIMAGE = T`). +/// +/// The returned HDU's `data` is [`HduData::BinTable`](crate::hdu::HduData::BinTable) +/// with a `COMPRESSED_DATA` (`1PB`) variable-length-array column holding one +/// RICE/GZIP-encoded tile per row, plus the full set of `Z*` driver keywords. It is +/// ready to hand to [`FitsFile::push_extension`](crate::fits::FitsFile::push_extension). +/// +/// - Integer images (`ZBITPIX` 8/16/32) compress losslessly with `RICE_1`/`GZIP_1`/ +/// `GZIP_2`. +/// - Float images (`ZBITPIX` -32/-64) compress losslessly with `GZIP_1`/`GZIP_2` when +/// `opts.quantize` is `None` (raw big-endian floats, `ZQUANTIZ` = `NONE`), or lossily +/// with quantization + optional subtractive dithering otherwise (`RICE_1`/GZIP of the +/// quantized 32-bit integers, with per-tile `ZSCALE`/`ZZERO` columns). +/// +/// `PLIO_1`, `HCOMPRESS_1`, `NOCOMPRESS`, and 64-bit integer images are rejected with +/// [`Error::UnsupportedCompression`]/[`Error::CompressionError`]. +pub fn compress_image( + image: &crate::image_data::ImageData, + opts: &CompressOptions, +) -> Result { + use crate::bintable::{BinColumnType, BinTableBuilder}; + use crate::hdu::{Hdu, HduData}; + + match opts.algorithm { + CompressionType::Rice1 | CompressionType::Gzip1 | CompressionType::Gzip2 => {} + other => { + return Err(Error::UnsupportedCompression(format!( + "{other:?} encoding is not supported (only RICE_1, GZIP_1, GZIP_2)" + ))) + } + } + + if image.axes.is_empty() { + return Err(Error::CompressionError( + "cannot compress an image with zero axes".into(), + )); + } + let expected_npix: usize = image.axes.iter().product(); + if image.pixels.len() != expected_npix { + return Err(Error::CompressionError(format!( + "image has {} pixels but axes imply {expected_npix}", + image.pixels.len() + ))); + } + + let ztile = opts.tile.clone().unwrap_or_else(|| default_tile(&image.axes)); + if ztile.len() != image.axes.len() { + return Err(Error::CompressionError(format!( + "tile shape has {} axes but image has {}", + ztile.len(), + image.axes.len() + ))); + } + + let is_float = image.bitpix().to_i64() < 0; + if is_float { + return compress_float_image(image, opts, &ztile); + } + + // ---- integer path ---- + let (full, bytepix) = integer_samples(image)?; + let blocksize = opts.blocksize.max(1); + + // Tile grid. + let tiles_per_axis: Vec = image + .axes + .iter() + .zip(&ztile) + .map(|(&n, &t)| n.div_ceil(t.max(1))) + .collect(); + let num_tiles: usize = tiles_per_axis.iter().product(); + + let mut builder = BinTableBuilder::new().add_column("COMPRESSED_DATA", BinColumnType::VarP('B')); + let mut max_elems = 0usize; + for tile_index in 0..num_tiles { + let coords = unravel(tile_index, &tiles_per_axis); + let tdims = tile_dims_at(&image.axes, &ztile, &coords); + let samples = gather_tile(&full, &image.axes, &ztile, &tdims, &coords); + let enc = encode_int_tile(&samples, opts.algorithm, bytepix, blocksize)?; + max_elems = max_elems.max(enc.len()); + builder = builder.push_row(|r| { + r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc)) + }); + } + let table = builder.build(); + + let mut header = Header::new(); + let zparams: Vec<(&str, i64)> = if opts.algorithm == CompressionType::Rice1 { + vec![("BLOCKSIZE", blocksize as i64), ("BYTEPIX", bytepix as i64)] + } else { + Vec::new() + }; + build_z_header(&mut header, image, &ztile, opts.algorithm, &zparams); + let header = finalize_compressed_hdu(header, &table, max_elems); + + Ok(Hdu::new(header, HduData::BinTable(table))) +} + +/// Compress a float image (`ZBITPIX` -32/-64). See [`compress_image`]. +fn compress_float_image( + image: &crate::image_data::ImageData, + opts: &CompressOptions, + ztile: &[usize], +) -> Result { + use crate::bintable::{BinColumnType, BinTableBuilder}; + use crate::hdu::{Hdu, HduData}; + use crate::image_data::PixelData; + use crate::keyword::HeaderValue; + + // Used only by the (feature-gated) lossless-float GZIP branch. + #[cfg_attr(not(feature = "gzip"), allow(unused_variables))] + let is_f64 = image.bitpix().to_i64() == -64; + let floats: Vec = match &image.pixels { + PixelData::F32(v) => v.iter().map(|&x| x as f64).collect(), + PixelData::F64(v) => v.clone(), + _ => unreachable!("compress_float_image called on non-float image"), + }; + + let tiles_per_axis: Vec = image + .axes + .iter() + .zip(ztile) + .map(|(&n, &t)| n.div_ceil(t.max(1))) + .collect(); + let num_tiles: usize = tiles_per_axis.iter().product(); + let blocksize = opts.blocksize.max(1); + + // ---- lossless float storage (raw big-endian floats, GZIP only) ---- + if opts.quantize.is_none() { + // Lossless float storage uses GZIP over the raw big-endian floats. Only GZIP_1 + // is supported: cfitsio's float byte-shuffle (GZIP_2) is not applied on the + // raw-float fallback path that the decoder reads, so GZIP_2 here would not + // round-trip. (GZIP_2 *is* supported for integer images.) + match opts.algorithm { + CompressionType::Gzip1 => {} + other => { + return Err(Error::UnsupportedCompression(format!( + "lossless float compression (quantize=None) needs GZIP_1, not {other:?}" + ))) + } + } + #[cfg(not(feature = "gzip"))] + { + return Err(Error::UnsupportedCompression( + "GZIP float encode requires the `gzip` feature (miniz_oxide)".into(), + )); + } + #[cfg(feature = "gzip")] + { + let mut builder = + BinTableBuilder::new().add_column("COMPRESSED_DATA", BinColumnType::VarP('B')); + let mut max_elems = 0usize; + for tile_index in 0..num_tiles { + let coords = unravel(tile_index, &tiles_per_axis); + let tdims = tile_dims_at(&image.axes, ztile, &coords); + let tile = gather_tile_f64(&floats, &image.axes, ztile, &tdims, &coords); + // Raw big-endian floats, GZIP_1 (no shuffle) — read back by the decoder's + // unquantized-tile fallback path. + let mut raw = Vec::with_capacity(tile.len() * if is_f64 { 8 } else { 4 }); + for &v in &tile { + if is_f64 { + raw.extend_from_slice(&v.to_be_bytes()); + } else { + raw.extend_from_slice(&(v as f32).to_be_bytes()); + } + } + let enc = gzip_deflate(&raw); + max_elems = max_elems.max(enc.len()); + builder = builder + .push_row(|r| r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc))); + } + let table = builder.build(); + let mut header = Header::new(); + build_z_header(&mut header, image, ztile, opts.algorithm, &[]); + header.set("ZQUANTIZ", HeaderValue::String("NONE".into()), Some("no dithering")); + let header = finalize_compressed_hdu(header, &table, max_elems); + return Ok(Hdu::new(header, HduData::BinTable(table))); + } + } + + // ---- lossy quantized float storage ---- + let q = opts.quantize.unwrap_or(4.0); + let dither = opts.dither; + let dithered = matches!( + dither, + Quantize::SubtractiveDither1 | Quantize::SubtractiveDither2 + ); + let zdither0 = opts.dither_seed.unwrap_or(1); + + let mut builder = BinTableBuilder::new() + .add_column("COMPRESSED_DATA", BinColumnType::VarP('B')) + .add_column("ZSCALE", BinColumnType::D64(1)) + .add_column("ZZERO", BinColumnType::D64(1)); + let mut max_elems = 0usize; + + for tile_index in 0..num_tiles { + let coords = unravel(tile_index, &tiles_per_axis); + let tdims = tile_dims_at(&image.axes, ztile, &coords); + let tile = gather_tile_f64(&floats, &image.axes, ztile, &tdims, &coords); + + let (scale, zero) = choose_scale_zero(&tile, q); + let q_ints = quantize_tile(&tile, scale, zero, dither, zdither0, tile_index); + let enc = encode_int_tile(&q_ints, opts.algorithm, 4, blocksize)?; + max_elems = max_elems.max(enc.len()); + builder = builder.push_row(|r| { + r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc)); + r.write_f64(scale); + r.write_f64(zero); + }); + } + let table = builder.build(); + + let mut header = Header::new(); + let zparams: Vec<(&str, i64)> = if opts.algorithm == CompressionType::Rice1 { + vec![("BLOCKSIZE", blocksize as i64), ("BYTEPIX", 4)] + } else { + Vec::new() + }; + build_z_header(&mut header, image, ztile, opts.algorithm, &zparams); + let zquantiz = match dither { + Quantize::SubtractiveDither1 => "SUBTRACTIVE_DITHER_1", + Quantize::SubtractiveDither2 => "SUBTRACTIVE_DITHER_2", + Quantize::None => "NO_DITHER", + }; + header.set("ZQUANTIZ", HeaderValue::String(zquantiz.into()), Some("quantization method")); + if dithered { + header.set("ZDITHER0", HeaderValue::Integer(zdither0), Some("dithering offset seed")); + } + header.set("ZBLANK", HeaderValue::Integer(NULL_VALUE), Some("null value")); + let header = finalize_compressed_hdu(header, &table, max_elems); + + Ok(Hdu::new(header, HduData::BinTable(table))) +} + +/// Gather one tile's flat floats (axis-1 fastest) from the full image buffer. +fn gather_tile_f64( + full: &[f64], + image_dims: &[usize], + ztile: &[usize], + tile_dims: &[usize], + coords: &[usize], +) -> Vec { + let ndim = image_dims.len(); + let mut img_stride = vec![1usize; ndim]; + for axis in 1..ndim { + img_stride[axis] = img_stride[axis - 1] * image_dims[axis - 1]; + } + let mut origin = 0usize; + for axis in 0..ndim { + origin += coords[axis] * ztile[axis].max(1) * img_stride[axis]; + } + let tile_npix: usize = tile_dims.iter().product(); + let mut out = Vec::with_capacity(tile_npix); + let mut tcoord = vec![0usize; ndim]; + for _ in 0..tile_npix { + let mut src = origin; + for axis in 0..ndim { + src += tcoord[axis] * img_stride[axis]; + } + out.push(full[src]); + for axis in 0..ndim { + tcoord[axis] += 1; + if tcoord[axis] < tile_dims[axis] { + break; + } + tcoord[axis] = 0; + } + } + out +} + +/// Choose a per-tile linear quantization `(scale, zero)` for float values. +/// +/// We do not replicate cfitsio's noise-based heuristic; any scale that round-trips +/// within tolerance is acceptable (per the task). We map the finite value range into +/// the available integer range with `q` quantization levels per unit range so the +/// quantization step is `scale = range / (q * 2^16)` (clamped to a sane minimum), with +/// `zero` at the tile mean. The reconstructed value is `zero + scale * q_int`, matching +/// the decoder's `unquantize`. +fn choose_scale_zero(tile: &[f64], q: f64) -> (f64, f64) { + let mut min = f64::INFINITY; + let mut max = f64::NEG_INFINITY; + let mut sum = 0.0f64; + let mut count = 0usize; + for &v in tile { + if v.is_finite() { + min = min.min(v); + max = max.max(v); + sum += v; + count += 1; + } + } + if count == 0 { + // All-NaN tile: scale 1, zero 0 (every pixel becomes the blank sentinel). + return (1.0, 0.0); + } + let range = (max - min).abs(); + let q = if q > 0.0 { q } else { 4.0 }; + // Target ~ (q * 65536) quantization levels across the range; floor the step so a + // flat tile still quantizes losslessly enough. + let mut scale = range / (q * 65536.0); + if !(scale.is_finite()) || scale <= 0.0 { + scale = 1.0; + } + let zero = sum / count as f64; + (scale, zero) +} + +/// Quantize a tile's floats to 32-bit integers with cfitsio-compatible subtractive +/// dithering. The forward transform inverts the decoder's `unquantize`: +/// `q = round((value - zero)/scale + dither - 0.5)`, NaNs map to `NULL_VALUE`, and +/// (for `SUBTRACTIVE_DITHER_2`) exact zeros map to `ZERO_VALUE`. +fn quantize_tile( + tile: &[f64], + scale: f64, + zero: f64, + dither: Quantize, + zdither0: i64, + tile_index: usize, +) -> Vec { + let dithered = matches!( + dither, + Quantize::SubtractiveDither1 | Quantize::SubtractiveDither2 + ); + let dither2 = dither == Quantize::SubtractiveDither2; + + let mut iseed = ((tile_index as i64 + zdither0 - 1).rem_euclid(N_RANDOM as i64)) as usize; + let mut nextrand = (fits_rand_value(iseed) * 500.0) as usize; + + let mut out = Vec::with_capacity(tile.len()); + for &v in tile { + let qi = if v.is_nan() { + NULL_VALUE + } else if dither2 && v == 0.0 { + ZERO_VALUE + } else if dithered { + // Inverse of value = (q - r + 0.5)*scale + zero, computed with the FMA the + // decoder uses so the round-trip is exact: + // q = round((value - zero)/scale + r - 0.5) + let r = fits_rand_value(nextrand) as f64; + let t = (v - zero) / scale + r - 0.5; + t.round() as i64 + } else { + let t = (v - zero) / scale; + t.round() as i64 + }; + out.push(qi); + + if dithered { + nextrand += 1; + if nextrand == N_RANDOM { + iseed += 1; + if iseed == N_RANDOM { + iseed = 0; + } + nextrand = (fits_rand_value(iseed) * 500.0) as usize; + } + } + } + out +} + +/// Finalise a compressed-image HDU: fill the BINTABLE keywords, then fix up `TFORM1` to +/// the `1PB(maxlen)` form cfitsio writes (so `funpack` knows the max VLA element count), +/// and append a descriptive `EXTNAME`. +fn finalize_compressed_hdu(mut header: Header, table: &BinTable, max_elems: usize) -> Header { + use crate::keyword::HeaderValue; + // Materialise the mandatory BINTABLE keywords (XTENSION/NAXISn/PCOUNT/TFIELDS/...) + // from the table, then layer the Z* keywords (already in `header`) on top. We do + // this by filling a fresh header and merging Z* afterwards so ordering matches the + // cfitsio convention (BINTABLE structural keywords first, then ZIMAGE...). + let mut out = Header::new(); + table.fill_header(&mut out); + // TFORM1 with explicit max length, matching cfitsio (`1PB()`). + out.set( + "TFORM1", + HeaderValue::String(format!("1PB({max_elems})")), + Some("variable length array"), + ); + // Append all Z* / quantization / extra keywords from the staged header. + for kw in header.keywords.drain(..) { + out.keywords.push(kw); + } + out +} + // --------------------------------------------------------------------------- // Other algorithms (stubs) // --------------------------------------------------------------------------- @@ -2513,4 +3443,222 @@ mod tests { other => panic!("expected F32, got {other:?}"), } } + + // --- RICE_1 encode (the new write path) ---------------------------------- + + /// The RICE encoder's output must decode via the existing decoder, for a range of + /// difference magnitudes / signs / block boundaries, across all three BYTEPIX. + #[test] + fn rice_encode_decode_roundtrip_i32() { + let cases: Vec> = vec![ + vec![0], + vec![42, 42, 42, 42], // zero-diff block + vec![1000, 1003, 1001, 1005, 1002], // small diffs + vec![i32::MIN, 0, i32::MAX, -1, 1], // extreme diffs (verbatim path) + (0..100).map(|i| (i * i) % 7 - 3).collect(), // > one block + (0..200).map(|i: i32| (i - 100) * 17).collect(), + ]; + for vals in cases { + let enc = rice_compress_i32(&vals, 32); + let dec = rice_decompress_i32(&enc, vals.len(), 32).unwrap(); + assert_eq!(dec, vals, "i32 roundtrip failed for {vals:?}"); + } + } + + #[test] + fn rice_encode_decode_roundtrip_i16() { + let vals: Vec = (0..300).map(|i| ((i * 31) % 251 - 120) as i16).collect(); + let enc = rice_compress_i16(&vals, 32); + let dec = rice_decompress_i16(&enc, vals.len(), 32).unwrap(); + assert_eq!(dec, vals); + // extremes + let ex: Vec = vec![i16::MIN, i16::MAX, 0, -1, 1, i16::MIN]; + let enc = rice_compress_i16(&ex, 32); + assert_eq!(rice_decompress_i16(&enc, ex.len(), 32).unwrap(), ex); + } + + #[test] + fn rice_encode_decode_roundtrip_i8() { + let vals: Vec = (0..400).map(|i| ((i * 7) % 256) as u8).collect(); + let enc = rice_compress_i8(&vals, 32); + let dec = rice_decompress_i8(&enc, vals.len(), 32).unwrap(); + assert_eq!(dec, vals); + } + + /// A non-default block size must round-trip too (the block boundary moves). + #[test] + fn rice_encode_blocksize_16() { + let vals: Vec = (0..70).map(|i| (i * 3) % 11 - 5).collect(); + let enc = rice_compress_i32(&vals, 16); + let dec = rice_decompress_i32(&enc, vals.len(), 16).unwrap(); + assert_eq!(dec, vals); + } + + /// `ImageData::compress` (RICE_1) → `as_compressed_image().decompress()` must be a + /// byte-exact identity for an integer image, including a 2-D square tiling with + /// edge-truncated tiles. + #[test] + fn compress_image_rice_int_internal_roundtrip() { + use crate::image_data::{ImageData, PixelData}; + + // 13x11 I16 image, value = mix of trend + noise; square 5x5 tiles (edges ragged). + let w = 13usize; + let h = 11usize; + let pixels: Vec = (0..(w * h)) + .map(|i| (((i * 37) % 521) as i16) - 200) + .collect(); + let img = ImageData::new(vec![w, h], PixelData::I16(pixels)); + + for tile in [None, Some(vec![w, 1]), Some(vec![5, 5]), Some(vec![13, 11])] { + let opts = CompressOptions { + algorithm: CompressionType::Rice1, + tile, + ..Default::default() + }; + let hdu = img.compress(&opts).unwrap(); + assert!(matches!(hdu.data, crate::hdu::HduData::BinTable(_))); + let back = hdu + .as_compressed_image() + .expect("ZIMAGE detected") + .decompress() + .unwrap(); + assert_eq!(back.axes, img.axes); + assert_eq!(back.pixels.to_bytes(), img.pixels.to_bytes()); + } + } + + #[test] + fn compress_image_rice_i32_and_u8_roundtrip() { + use crate::image_data::{ImageData, PixelData}; + + let i32img = ImageData::new( + vec![20, 3], + PixelData::I32((0..60).map(|i| (i - 30) * 12345).collect()), + ); + let hdu = i32img.compress(&CompressOptions::default()).unwrap(); + let back = hdu.as_compressed_image().unwrap().decompress().unwrap(); + assert_eq!(back.pixels.to_bytes(), i32img.pixels.to_bytes()); + + let u8img = ImageData::new( + vec![17, 4], + PixelData::U8((0..68).map(|i| (i * 5 % 256) as u8).collect()), + ); + let hdu = u8img.compress(&CompressOptions::default()).unwrap(); + let back = hdu.as_compressed_image().unwrap().decompress().unwrap(); + assert_eq!(back.pixels.to_bytes(), u8img.pixels.to_bytes()); + } + + #[test] + fn compress_rejects_unsupported_algorithms() { + use crate::image_data::{ImageData, PixelData}; + let img = ImageData::new(vec![4], PixelData::I16(vec![1, 2, 3, 4])); + for alg in [ + CompressionType::Plio1, + CompressionType::Hcompress1, + CompressionType::NoCompress, + ] { + let opts = CompressOptions { + algorithm: alg, + ..Default::default() + }; + assert!(matches!( + img.compress(&opts), + Err(Error::UnsupportedCompression(_)) + )); + } + // 64-bit integer images are rejected. + let i64img = ImageData::new(vec![3], PixelData::I64(vec![1, 2, 3])); + assert!(i64img.compress(&CompressOptions::default()).is_err()); + } + + #[cfg(feature = "gzip")] + #[test] + fn compress_image_gzip_int_roundtrip() { + use crate::image_data::{ImageData, PixelData}; + let img = ImageData::new( + vec![15, 6], + PixelData::I16((0..90).map(|i| (i * 13 % 400 - 200) as i16).collect()), + ); + for alg in [CompressionType::Gzip1, CompressionType::Gzip2] { + let opts = CompressOptions { + algorithm: alg, + ..Default::default() + }; + let hdu = img.compress(&opts).unwrap(); + let back = hdu.as_compressed_image().unwrap().decompress().unwrap(); + assert_eq!( + back.pixels.to_bytes(), + img.pixels.to_bytes(), + "{alg:?} int roundtrip" + ); + } + } + + #[cfg(feature = "gzip")] + #[test] + fn compress_image_gzip_lossless_float_roundtrip() { + use crate::image_data::{ImageData, PixelData}; + let pixels: Vec = (0..96).map(|i| (i as f32) * 0.125 - 6.0).collect(); + let img = ImageData::new(vec![12, 8], PixelData::F32(pixels)); + // Lossless float storage is GZIP_1 only (GZIP_2 float shuffle is unsupported). + let opts = CompressOptions { + algorithm: CompressionType::Gzip1, + quantize: None, // lossless raw floats + ..Default::default() + }; + let hdu = img.compress(&opts).unwrap(); + let back = hdu.as_compressed_image().unwrap().decompress().unwrap(); + assert_eq!( + back.pixels.to_bytes(), + img.pixels.to_bytes(), + "GZIP_1 lossless float roundtrip" + ); + // GZIP_2 lossless float is rejected. + let opts2 = CompressOptions { + algorithm: CompressionType::Gzip2, + quantize: None, + ..Default::default() + }; + assert!(matches!( + img.compress(&opts2), + Err(Error::UnsupportedCompression(_)) + )); + } + + /// Lossy quantized-float RICE encode must round-trip within ~scale tolerance, and + /// the reconstructed values must match our own decoder's `unquantize` exactly. + #[test] + fn compress_image_rice_float_quantize_roundtrip() { + use crate::image_data::{ImageData, PixelData}; + let pixels: Vec = (0..256) + .map(|i| ((i as f32) * 0.017).sin() * 100.0 + 50.0) + .collect(); + let img = ImageData::new(vec![16, 16], PixelData::F32(pixels.clone())); + let opts = CompressOptions { + algorithm: CompressionType::Rice1, + tile: Some(vec![16, 4]), + quantize: Some(4.0), + dither: Quantize::SubtractiveDither1, + dither_seed: Some(5), + ..Default::default() + }; + let hdu = img.compress(&opts).unwrap(); + let back = hdu.as_compressed_image().unwrap().decompress().unwrap(); + let recon = match &back.pixels { + PixelData::F32(v) => v.clone(), + other => panic!("expected F32, got {other:?}"), + }; + assert_eq!(recon.len(), pixels.len()); + // The dither RNG indexing the encoder used is the same the decoder inverts, so + // the error is bounded by the per-tile quantization step. Check a generous + // tolerance relative to the data range. + let range = 100.0f32; // amplitude + let tol = range / 1000.0; + for (i, (&o, &r)) in pixels.iter().zip(recon.iter()).enumerate() { + assert!( + (o - r).abs() <= tol, + "pixel {i}: orig {o} recon {r} exceeds tol {tol}" + ); + } + } } diff --git a/tests/sample_files.rs b/tests/sample_files.rs index 3a48201..159c20a 100644 --- a/tests/sample_files.rs +++ b/tests/sample_files.rs @@ -464,3 +464,278 @@ fn gzip1_roundtrip_euv() { // GZIP_1 stores raw big-endian image integers per tile; needs the `gzip` feature. assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.gzip1.fits.fz"); } + +// === ENCODE (write) path: fits4 produces tile-compressed FITS ================ +// +// Two oracles per algorithm: +// (1) internal: image.compress(opts) -> as_compressed_image().decompress() == image +// (byte-exact for lossless RICE/GZIP int + lossless-float GZIP). +// (2) interop: write the fits4-compressed file, run the `funpack` CLI on it, and +// confirm funpack's reconstruction matches the original image (byte-exact for +// lossless cases). This proves fits4 emits standard, cfitsio-readable compressed +// FITS. Skipped cleanly when samples or `funpack` are absent. + +use fits4::tile_compress::CompressOptions; +use fits4::{CompressionType, Quantize}; + +/// Collect the non-empty image HDUs (primary + extensions) of an uncompressed sample. +fn sample_images(src_name: &str) -> Vec { + let src = FitsFile::from_file(samp(src_name)).expect("read source"); + src.hdus + .iter() + .filter_map(|h| match &h.data { + HduData::Image(im) if !im.pixels.to_bytes().is_empty() => Some(im.clone()), + _ => None, + }) + .collect() +} + +/// Build a fits4 `.fz`-style FitsFile: empty primary + one compressed-image extension +/// per source image, all using `opts`. +fn compress_sample(images: &[ImageData], opts: &CompressOptions) -> FitsFile { + let mut fits = FitsFile::with_empty_primary(); + // funpack expects EXTEND in the primary. + fits.primary_mut() + .header + .set("EXTEND", HeaderValue::Logical(true), None); + for img in images { + fits.push_extension(img.compress(opts).expect("compress")); + } + fits +} + +/// Internal round-trip: every compressed extension decompresses byte-exactly. +fn assert_internal_roundtrip(images: &[ImageData], opts: &CompressOptions) { + let fits = compress_sample(images, opts); + let comp: Vec<&Hdu> = fits + .hdus + .iter() + .filter(|h| h.as_compressed_image().is_some()) + .collect(); + assert_eq!(comp.len(), images.len()); + for (orig, hdu) in images.iter().zip(comp) { + let dec = hdu.as_compressed_image().unwrap().decompress().unwrap(); + assert_eq!(dec.axes, orig.axes, "internal: axes"); + assert_eq!( + dec.pixels.to_bytes(), + orig.pixels.to_bytes(), + "internal: pixel bytes" + ); + } + // Also confirm the file re-reads from bytes as compressed images. + let bytes = fits.to_bytes().expect("to_bytes"); + let reread = FitsFile::from_bytes(&bytes).expect("reread"); + let n = reread + .hdus + .iter() + .filter(|h| h.as_compressed_image().is_some()) + .count(); + assert_eq!(n, images.len(), "reread: compressed HDU count"); +} + +/// Interop: funpack reads fits4's compressed output back to `images` (byte-exact). +fn assert_funpack_reads_fits4(src_name: &str, opts: &CompressOptions, tag: &str) { + if !Path::new(SAMP_DIR).is_dir() { + eprintln!("skipping: samp/ not present"); + return; + } + if !funpack_available() { + eprintln!("skipping: funpack binary not available"); + return; + } + let images = sample_images(src_name); + assert!(!images.is_empty(), "{src_name}: no source images"); + let fits = compress_sample(&images, opts); + + // funpack requires the file to be named *.fz. + let mut fz = std::env::temp_dir(); + fz.push(format!("fits4_enc_{}_{}.fits.fz", std::process::id(), tag)); + let mut out = std::env::temp_dir(); + out.push(format!("fits4_enc_{}_{}.fits", std::process::id(), tag)); + let _ = std::fs::remove_file(&fz); + let _ = std::fs::remove_file(&out); + fits.to_file(&fz).expect("write fits4 .fz"); + + let status = std::process::Command::new("funpack") + .arg("-O") + .arg(&out) + .arg("-C") + .arg(&fz) + .status() + .expect("run funpack"); + assert!(status.success(), "{tag}: funpack failed on fits4 output"); + + let recon = FitsFile::from_file(&out).expect("read funpack output"); + let recon_images: Vec<&ImageData> = recon + .hdus + .iter() + .filter_map(|h| match &h.data { + HduData::Image(im) if !im.pixels.to_bytes().is_empty() => Some(im), + _ => None, + }) + .collect(); + assert_eq!( + recon_images.len(), + images.len(), + "{tag}: funpack image count" + ); + for (orig, got) in images.iter().zip(recon_images) { + assert_eq!(got.axes, orig.axes, "{tag}: funpack axes"); + assert_eq!( + got.pixels.to_bytes(), + orig.pixels.to_bytes(), + "{tag}: funpack pixel bytes differ from original (lossless expected)" + ); + } + let _ = std::fs::remove_file(&fz); + let _ = std::fs::remove_file(&out); +} + +fn rice_opts(tile: Option>) -> CompressOptions { + CompressOptions { + algorithm: CompressionType::Rice1, + tile, + ..Default::default() + } +} + +#[test] +fn encode_rice_i16_internal() { + require_samples!(); + assert_internal_roundtrip(&sample_images("EUVEngc4151imgx.fits"), &rice_opts(None)); + // square 2-D tiling with edge truncation + assert_internal_roundtrip( + &sample_images("EUVEngc4151imgx.fits"), + &rice_opts(Some(vec![100, 100])), + ); +} + +#[test] +fn encode_rice_i16_interop_funpack() { + assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &rice_opts(None), "rice_i16"); +} + +#[test] +fn encode_rice_i16_square_tiles_interop_funpack() { + assert_funpack_reads_fits4( + "EUVEngc4151imgx.fits", + &rice_opts(Some(vec![100, 100])), + "rice_i16_t100", + ); +} + +#[test] +fn encode_rice_i32_internal() { + require_samples!(); + assert_internal_roundtrip(&sample_images("FGSf64y0106m_a1f.fits"), &rice_opts(None)); +} + +#[test] +fn encode_rice_i32_interop_funpack() { + assert_funpack_reads_fits4("FGSf64y0106m_a1f.fits", &rice_opts(None), "rice_i32"); +} + +#[cfg(feature = "gzip")] +#[test] +fn encode_gzip_i16_internal() { + require_samples!(); + for alg in [CompressionType::Gzip1, CompressionType::Gzip2] { + let opts = CompressOptions { + algorithm: alg, + ..Default::default() + }; + assert_internal_roundtrip(&sample_images("EUVEngc4151imgx.fits"), &opts); + } +} + +#[cfg(feature = "gzip")] +#[test] +fn encode_gzip1_i16_interop_funpack() { + let opts = CompressOptions { + algorithm: CompressionType::Gzip1, + ..Default::default() + }; + assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &opts, "gzip1_i16"); +} + +#[cfg(feature = "gzip")] +#[test] +fn encode_gzip2_i16_interop_funpack() { + let opts = CompressOptions { + algorithm: CompressionType::Gzip2, + ..Default::default() + }; + assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &opts, "gzip2_i16"); +} + +#[cfg(feature = "gzip")] +#[test] +fn encode_gzip_lossless_float_internal() { + require_samples!(); + let opts = CompressOptions { + algorithm: CompressionType::Gzip1, + quantize: None, // lossless raw-float storage + ..Default::default() + }; + assert_internal_roundtrip(&sample_images("FOCx38i0101t_c0f.fits"), &opts); +} + +#[cfg(feature = "gzip")] +#[test] +fn encode_gzip_lossless_float_interop_funpack() { + let opts = CompressOptions { + algorithm: CompressionType::Gzip1, + quantize: None, + ..Default::default() + }; + assert_funpack_reads_fits4("FOCx38i0101t_c0f.fits", &opts, "gzip_lossless_f32"); +} + +/// Lossy quantized-float RICE encode: round-trips within quantization tolerance via the +/// internal decoder. (Interop byte-exactness is not expected for lossy data; the +/// lossless cases above carry the funpack-reads-fits4 proof.) +#[test] +fn encode_rice_float_quantize_within_tolerance() { + require_samples!(); + let images = sample_images("FOCx38i0101t_c0f.fits"); + let opts = CompressOptions { + algorithm: CompressionType::Rice1, + tile: None, + quantize: Some(4.0), + dither: Quantize::SubtractiveDither1, + dither_seed: Some(5), + ..Default::default() + }; + let fits = compress_sample(&images, &opts); + let comp: Vec<&Hdu> = fits + .hdus + .iter() + .filter(|h| h.as_compressed_image().is_some()) + .collect(); + assert_eq!(comp.len(), images.len()); + for (orig, hdu) in images.iter().zip(comp) { + let dec = hdu.as_compressed_image().unwrap().decompress().unwrap(); + assert_eq!(dec.axes, orig.axes); + // Compare within a tolerance derived from the data's own dynamic range. + let (o, r) = match (&orig.pixels, &dec.pixels) { + (PixelData::F32(a), PixelData::F32(b)) => (a.clone(), b.clone()), + _ => panic!("expected F32 float image"), + }; + assert_eq!(o.len(), r.len()); + let finite_max = o + .iter() + .filter(|v| v.is_finite()) + .fold(0.0f32, |m, &v| m.max(v.abs())); + let tol = (finite_max / 1000.0).max(1.0); + let mut max_err = 0.0f32; + for (&a, &b) in o.iter().zip(r.iter()) { + if a.is_finite() && b.is_finite() { + max_err = max_err.max((a - b).abs()); + } + } + assert!( + max_err <= tol, + "quantize round-trip max error {max_err} exceeds tolerance {tol}" + ); + } +} From d03948ce96eb8fcb419458359cbbe8856b8d2e0b Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 16:15:04 -0400 Subject: [PATCH 7/9] Docs: document tile-compression write support in README MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add a write-compressed-image example (ImageData::compress), list encode in features/supported, note RICE works in the zero-dep build, and add a "Compressed write" column to the comparison table — fits4 is the only pure-Rust crate that writes compressed FITS. Co-Authored-By: Claude Opus 4.8 (1M context) --- README.md | 49 +++++++++++++++++++++++++++++++++++++------------ 1 file changed, 37 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index 39aa0d2..8954c15 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE) -fits4 reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It also decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) on the read path. +fits4 reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) and — uniquely among pure-Rust FITS crates — **writes** RICE- and GZIP-compressed images that cfitsio's `funpack` reads back byte-for-byte. ## Features @@ -15,6 +15,7 @@ fits4 reads *and* writes the full FITS v4.0 standard — primary and extension H - **Variable-length arrays** — `P` and `Q` heap descriptors in binary tables - **CHECKSUM/DATASUM** — ones-complement integrity computation and verification - **Tile-compressed images (read)** — RICE_1, GZIP_1, GZIP_2, PLIO_1, and HCOMPRESS_1, including float quantization with subtractive dithering; decoded bit-exactly against cfitsio's `funpack` +- **Tile-compressed images (write)** — encode RICE_1 and GZIP_1/GZIP_2 (integer and float); output verified byte-exact through `funpack` - **Zero dependencies by default** — optional `image` and `gzip` features pull in pure-Rust crates only ## Installation @@ -116,6 +117,26 @@ for hdu in fits.extensions() { # Ok::<(), fits4::Error>(()) ``` +### Writing a tile-compressed image + +`ImageData::compress` produces a compressed-image `BINTABLE` HDU ready to push +into a file. The output is read back byte-for-byte by cfitsio's `funpack`. + +```rust +use fits4::{FitsFile, ImageData, PixelData, CompressOptions}; + +let pixels: Vec = (0..10000).map(|i| (i % 1000) as i16).collect(); +let img = ImageData::new(vec![100, 100], PixelData::I16(pixels)); + +// Default options: RICE_1, one tile per row, lossless for integers. +let hdu = img.compress(&CompressOptions::default())?; + +let mut fits = FitsFile::with_empty_primary(); +fits.push_extension(hdu); +fits.to_file("compressed.fits")?; +# Ok::<(), fits4::Error>(()) +``` + ### Checksums ```rust @@ -149,6 +170,7 @@ that structure directly: | `BinTable` / `BinColumn` / `BinColumnType` / `BinCellValue` | Binary-table model — columns, rows, and heap (variable-length arrays); built with `BinTableBuilder`. | | `AsciiTable` | ASCII `TABLE` model with `TFORMn` column parsing and `TSCALn`/`TZEROn` scaling. | | `CompressedImage` / `CompressionType` / `TileGeometry` / `Quantize` | Read-side view over a tile-compressed image: detect the algorithm and geometry, then `decompress()` to an `ImageData`. | +| `CompressOptions` | Write-side encode options (algorithm, tiling, quantization, dithering) for `ImageData::compress` / `compress_image`. | | `Bitpix` | The `BITPIX` data type (`8`/`16`/`32`/`64`/`-32`/`-64`). | | `Error` / `Result` | Crate error type and result alias. | @@ -160,22 +182,22 @@ are preserved for a lossless round-trip write. | Flag | Default | Description | |------|---------|-------------| -| *(none)* | ✓ | Core read/write, all HDU types, RICE_1 / PLIO_1 / HCOMPRESS_1 tile decompression — zero dependencies | +| *(none)* | ✓ | Core read/write, all HDU types, RICE_1 / PLIO_1 / HCOMPRESS_1 decompression, and RICE_1 compression — zero dependencies | | `image` | | Conversion between `ImageData` and the [`image`](https://crates.io/crates/image) crate's `DynamicImage` | -| `gzip` | | Decoding of `GZIP_1`/`GZIP_2` tile-compressed images via the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate | +| `gzip` | | `GZIP_1`/`GZIP_2` tile compression and decompression via the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate | -The default build stays dependency-free; `RICE_1`, `PLIO_1`, and `HCOMPRESS_1` decompression all work without any feature (only `GZIP_1`/`GZIP_2` need the `gzip` feature). +The default build stays dependency-free; `RICE_1`, `PLIO_1`, and `HCOMPRESS_1` decompression and `RICE_1` compression all work without any feature (only `GZIP_1`/`GZIP_2` need the `gzip` feature). ## Why fits4? -Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, and complete compressed-image read support**, with no C toolchain to install. +Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, complete compressed-image reading, and compressed-image writing**, with no C toolchain to install. No other pure-Rust crate writes compressed FITS. -| Crate | Pure Rust | Write | Tables | Compressed read | Notes | -|-------|-----------|-------|--------|-----------------|-------| -| [`fitsio`](https://crates.io/crates/fitsio) | ✗ | ✓ | ✓ | ✓ | Wraps the cfitsio C library; needs a C toolchain | -| [`fitsrs`](https://crates.io/crates/fitsrs) | ✓ | ✗ | partial | — | Read-only | -| [`fitrs`](https://crates.io/crates/fitrs) | ✓ | ✓ | ✗ | ✗ | Dormant; no table support | -| **fits4** | ✓ | ✓ | ✓ | ✓ (all types) | Zero default deps; no C dependency | +| Crate | Pure Rust | Write | Tables | Compressed read | Compressed write | Notes | +|-------|-----------|-------|--------|-----------------|------------------|-------| +| [`fitsio`](https://crates.io/crates/fitsio) | ✗ | ✓ | ✓ | ✓ | ✓ | Wraps the cfitsio C library; needs a C toolchain | +| [`fitsrs`](https://crates.io/crates/fitsrs) | ✓ | ✗ | partial | — | ✗ | Read-only | +| [`fitrs`](https://crates.io/crates/fitrs) | ✓ | ✓ | ✗ | ✗ | ✗ | Dormant; no table support | +| **fits4** | ✓ | ✓ | ✓ | ✓ (all types) | ✓ (RICE/GZIP) | Zero default deps; no C dependency | ## Supported / not supported @@ -189,10 +211,13 @@ Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero defau HCOMPRESS_1 — for integer and floating-point images, including quantization with subtractive dithering (`NO_DITHER` / `SUBTRACTIVE_DITHER_1` / `SUBTRACTIVE_DITHER_2`) +- Tile-compressed image **writing**: RICE_1 and GZIP_1/GZIP_2 — integer + (lossless) and float (lossless via GZIP, or lossy quantized + dithered); + output verified byte-exact through cfitsio's `funpack` **Not supported** -- Writing/encoding tile-compressed images (read/decode only) +- Encoding `PLIO_1` or `HCOMPRESS_1` (these decode only) - HCOMPRESS image smoothing (`SMOOTH` ≠ 0) on decode - Random groups (deprecated in FITS v4.0) From 3ab83371d7da19357f1fe8f8d9bed190d4b10f3b Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 16:19:57 -0400 Subject: [PATCH 8/9] Rename crate to fitskit; add crates.io publish metadata Rename the crate from fits4 to fitskit throughout (Cargo.toml, source, tests, doc examples, README, CLAUDE.md, COMPRESSION_PLAN.md). The fits4_samples GCS bucket name is unchanged (external resource). Add publish metadata: repository, documentation, readme, keywords, categories, authors. Co-Authored-By: Claude Opus 4.8 (1M context) --- CLAUDE.md | 2 +- COMPRESSION_PLAN.md | 2 +- Cargo.lock | 2 +- Cargo.toml | 10 ++++++-- README.md | 36 ++++++++++++++-------------- src/bintable.rs | 2 +- src/hdu.rs | 2 +- src/image_data.rs | 4 ++-- src/lib.rs | 10 ++++---- tests/checksum.rs | 8 +++---- tests/image_conv.rs | 2 +- tests/round_trip.rs | 10 ++++---- tests/sample_files.rs | 56 +++++++++++++++++++++---------------------- 13 files changed, 76 insertions(+), 70 deletions(-) diff --git a/CLAUDE.md b/CLAUDE.md index 914a1c6..4e0eb80 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,4 +1,4 @@ -# fits4 — Pure Rust FITS v4.0 Library +# fitskit — Pure Rust FITS v4.0 Library ## Build Commands diff --git a/COMPRESSION_PLAN.md b/COMPRESSION_PLAN.md index 9b7fcf2..e4ce213 100644 --- a/COMPRESSION_PLAN.md +++ b/COMPRESSION_PLAN.md @@ -2,7 +2,7 @@ Status: planning + initial scaffold. This document describes how to add support for the FITS *Tiled Image Compression* convention (the BINTABLE-based compressed-image -format) to `fits4`, currently the main feature gap versus `fitsrs`. +format) to `fitskit`, currently the main feature gap versus `fitsrs`. References: - Tiled Image Convention for Storing Compressed Images in FITS Binary Tables, diff --git a/Cargo.lock b/Cargo.lock index e419038..06ea30a 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -308,7 +308,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "5baebc0774151f905a1a2cc41989300b1e6fbb29aff0ceffa1064fdd3088d582" [[package]] -name = "fits4" +name = "fitskit" version = "0.1.0" dependencies = [ "image", diff --git a/Cargo.toml b/Cargo.toml index fe3bf4e..7a7bddd 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -1,9 +1,15 @@ [package] -name = "fits4" +name = "fitskit" version = "0.1.0" edition = "2021" -description = "Pure Rust FITS v4.0 reader/writer" +description = "Pure Rust FITS v4.0 reader/writer with tile-compression read and write" license = "MIT" +repository = "https://github.com/ssmichael1/fitskit" +documentation = "https://docs.rs/fitskit" +readme = "README.md" +keywords = ["fits", "astronomy", "fits-file", "compression", "image"] +categories = ["science", "encoding", "parser-implementations"] +authors = ["Steven Michael "] [dependencies.image] version = "0.25" diff --git a/README.md b/README.md index 8954c15..e1ebe3b 100644 --- a/README.md +++ b/README.md @@ -1,10 +1,10 @@ -# fits4 +# fitskit **Pure-Rust, zero-dependency reader and writer for [FITS](https://fits.gsfc.nasa.gov/fits_standard.html) v4.0** — the Flexible Image Transport System format used throughout astronomy. [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE) -fits4 reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) and — uniquely among pure-Rust FITS crates — **writes** RICE- and GZIP-compressed images that cfitsio's `funpack` reads back byte-for-byte. +fitskit reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) and — uniquely among pure-Rust FITS crates — **writes** RICE- and GZIP-compressed images that cfitsio's `funpack` reads back byte-for-byte. ## Features @@ -22,7 +22,7 @@ fits4 reads *and* writes the full FITS v4.0 standard — primary and extension H ```toml [dependencies] -fits4 = "0.1" +fitskit = "0.1" ``` ## Usage @@ -30,7 +30,7 @@ fits4 = "0.1" ### Reading a FITS file ```rust -use fits4::{FitsFile, HduData, PixelData}; +use fitskit::{FitsFile, HduData, PixelData}; let fits = FitsFile::from_file("image.fits")?; @@ -52,13 +52,13 @@ for hdu in fits.extensions() { HduData::Empty => {} } } -# Ok::<(), fits4::Error>(()) +# Ok::<(), fitskit::Error>(()) ``` ### Writing a FITS file ```rust -use fits4::{FitsFile, ImageData, PixelData, HeaderValue}; +use fitskit::{FitsFile, ImageData, PixelData, HeaderValue}; let pixels: Vec = (0..10000).map(|i| (i % 1000) as i16).collect(); let img = ImageData::new(vec![100, 100], PixelData::I16(pixels)); @@ -67,13 +67,13 @@ let mut fits = FitsFile::with_primary_image(img); fits.primary_mut().header.set("OBJECT", HeaderValue::String("M31".into()), None); fits.to_file("output.fits")?; -# Ok::<(), fits4::Error>(()) +# Ok::<(), fitskit::Error>(()) ``` ### Building a binary table ```rust -use fits4::{FitsFile, Hdu, BinTableBuilder, BinColumnType}; +use fitskit::{FitsFile, Hdu, BinTableBuilder, BinColumnType}; let table = BinTableBuilder::new() .add_column("RA", BinColumnType::D64(1)) @@ -103,7 +103,7 @@ Tiled Image Compression convention). `Hdu::as_compressed_image` cheaply detects them; `decompress` reassembles the full image lazily. ```rust -use fits4::FitsFile; +use fitskit::FitsFile; let fits = FitsFile::from_file("compressed.fits")?; @@ -114,7 +114,7 @@ for hdu in fits.extensions() { println!("decompressed to {:?}", image.axes); } } -# Ok::<(), fits4::Error>(()) +# Ok::<(), fitskit::Error>(()) ``` ### Writing a tile-compressed image @@ -123,7 +123,7 @@ for hdu in fits.extensions() { into a file. The output is read back byte-for-byte by cfitsio's `funpack`. ```rust -use fits4::{FitsFile, ImageData, PixelData, CompressOptions}; +use fitskit::{FitsFile, ImageData, PixelData, CompressOptions}; let pixels: Vec = (0..10000).map(|i| (i % 1000) as i16).collect(); let img = ImageData::new(vec![100, 100], PixelData::I16(pixels)); @@ -134,13 +134,13 @@ let hdu = img.compress(&CompressOptions::default())?; let mut fits = FitsFile::with_empty_primary(); fits.push_extension(hdu); fits.to_file("compressed.fits")?; -# Ok::<(), fits4::Error>(()) +# Ok::<(), fitskit::Error>(()) ``` ### Checksums ```rust -use fits4::{FitsFile, ImageData, PixelData}; +use fitskit::{FitsFile, ImageData, PixelData}; let img = ImageData::new(vec![4], PixelData::U8(vec![1, 2, 3, 4])); let fits = FitsFile::with_primary_image(img); @@ -151,12 +151,12 @@ let bytes = fits.to_bytes_with_checksum()?; // Verify on read let fits2 = FitsFile::from_bytes(&bytes)?; fits2.primary().verify_datasum()?; -# Ok::<(), fits4::Error>(()) +# Ok::<(), fitskit::Error>(()) ``` ## Core types -A FITS file is an ordered sequence of **Header-Data Units (HDUs)**. fits4 mirrors +A FITS file is an ordered sequence of **Header-Data Units (HDUs)**. fitskit mirrors that structure directly: | Type | Role | @@ -188,16 +188,16 @@ are preserved for a lossless round-trip write. The default build stays dependency-free; `RICE_1`, `PLIO_1`, and `HCOMPRESS_1` decompression and `RICE_1` compression all work without any feature (only `GZIP_1`/`GZIP_2` need the `gzip` feature). -## Why fits4? +## Why fitskit? -Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, complete compressed-image reading, and compressed-image writing**, with no C toolchain to install. No other pure-Rust crate writes compressed FITS. +Among Rust FITS crates, fitskit fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, complete compressed-image reading, and compressed-image writing**, with no C toolchain to install. No other pure-Rust crate writes compressed FITS. | Crate | Pure Rust | Write | Tables | Compressed read | Compressed write | Notes | |-------|-----------|-------|--------|-----------------|------------------|-------| | [`fitsio`](https://crates.io/crates/fitsio) | ✗ | ✓ | ✓ | ✓ | ✓ | Wraps the cfitsio C library; needs a C toolchain | | [`fitsrs`](https://crates.io/crates/fitsrs) | ✓ | ✗ | partial | — | ✗ | Read-only | | [`fitrs`](https://crates.io/crates/fitrs) | ✓ | ✓ | ✗ | ✗ | ✗ | Dormant; no table support | -| **fits4** | ✓ | ✓ | ✓ | ✓ (all types) | ✓ (RICE/GZIP) | Zero default deps; no C dependency | +| **fitskit** | ✓ | ✓ | ✓ | ✓ (all types) | ✓ (RICE/GZIP) | Zero default deps; no C dependency | ## Supported / not supported diff --git a/src/bintable.rs b/src/bintable.rs index c226459..3345aaf 100644 --- a/src/bintable.rs +++ b/src/bintable.rs @@ -510,7 +510,7 @@ impl BinTable { /// Builder for constructing a `BinTable` row by row. /// /// ``` -/// use fits4::{BinTableBuilder, BinColumnType}; +/// use fitskit::{BinTableBuilder, BinColumnType}; /// /// let table = BinTableBuilder::new() /// .add_column("RA", BinColumnType::D64(1)) diff --git a/src/hdu.rs b/src/hdu.rs index 6af05d6..bd1b1bf 100644 --- a/src/hdu.rs +++ b/src/hdu.rs @@ -109,7 +109,7 @@ impl Hdu { /// decoding happens in [`CompressedImage::decompress`]: /// /// ```no_run - /// # use fits4::FitsFile; + /// # use fitskit::FitsFile; /// # let fits = FitsFile::from_file("compressed.fits").unwrap(); /// for hdu in fits.extensions() { /// if let Some(cimg) = hdu.as_compressed_image() { diff --git a/src/image_data.rs b/src/image_data.rs index 33ced0c..cc168da 100644 --- a/src/image_data.rs +++ b/src/image_data.rs @@ -163,8 +163,8 @@ impl ImageData { /// [`CompressedImage::decompress`](crate::tile_compress::CompressedImage::decompress). /// /// ``` - /// use fits4::{ImageData, PixelData}; - /// use fits4::tile_compress::CompressOptions; + /// use fitskit::{ImageData, PixelData}; + /// use fitskit::tile_compress::CompressOptions; /// /// let img = ImageData::new(vec![8, 4], PixelData::I16((0..32).collect())); /// let hdu = img.compress(&CompressOptions::default()).unwrap(); diff --git a/src/lib.rs b/src/lib.rs index 446f382..44b0a1b 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -1,4 +1,4 @@ -//! # fits4 — Pure Rust FITS v4.0 reader/writer +//! # fitskit — Pure Rust FITS v4.0 reader/writer //! //! A zero-dependency implementation of the FITS (Flexible Image Transport System) //! standard v4.0 for reading and writing astronomical data files. @@ -26,7 +26,7 @@ //! ## Quick start — reading //! //! ```no_run -//! use fits4::{FitsFile, HduData, PixelData}; +//! use fitskit::{FitsFile, HduData, PixelData}; //! //! let fits = FitsFile::from_file("image.fits").unwrap(); //! @@ -55,7 +55,7 @@ //! ## Quick start — writing //! //! ``` -//! use fits4::{FitsFile, Hdu, ImageData, PixelData, HeaderValue}; +//! use fitskit::{FitsFile, Hdu, ImageData, PixelData, HeaderValue}; //! //! // Create a 100x100 16-bit image //! let pixels: Vec = (0..10000).map(|i| (i % 1000) as i16).collect(); @@ -74,7 +74,7 @@ //! integer convention stores unsigned values in signed storage: //! //! ``` -//! use fits4::{ImageData, PixelData}; +//! use fitskit::{ImageData, PixelData}; //! //! // Unsigned u16 via BZERO=32768 //! let img = ImageData::new(vec![3], PixelData::I16(vec![-32768, 0, 32767])); @@ -87,7 +87,7 @@ //! Write with CHECKSUM/DATASUM integrity keywords: //! //! ``` -//! use fits4::{FitsFile, ImageData, PixelData}; +//! use fitskit::{FitsFile, ImageData, PixelData}; //! //! let img = ImageData::new(vec![4], PixelData::U8(vec![1, 2, 3, 4])); //! let fits = FitsFile::with_primary_image(img); diff --git a/tests/checksum.rs b/tests/checksum.rs index b9fef5b..212451a 100644 --- a/tests/checksum.rs +++ b/tests/checksum.rs @@ -1,5 +1,5 @@ -use fits4::checksum; -use fits4::*; +use fitskit::checksum; +use fitskit::*; #[test] fn write_with_checksum_and_verify() { @@ -31,7 +31,7 @@ fn write_with_checksum_and_verify() { #[test] fn checksum_with_extensions() { - use fits4::bintable::{BinColumn, BinColumnType}; + use fitskit::bintable::{BinColumn, BinColumnType}; let mut fits = FitsFile::with_empty_primary(); @@ -47,7 +47,7 @@ fn checksum_with_extensions() { main_data[0..8].copy_from_slice(&3.125f64.to_be_bytes()); main_data[8..16].copy_from_slice(&2.5f64.to_be_bytes()); - let table = fits4::BinTable { + let table = fitskit::BinTable { columns: vec![col], nrows: 2, row_len, diff --git a/tests/image_conv.rs b/tests/image_conv.rs index f751911..d1635d0 100644 --- a/tests/image_conv.rs +++ b/tests/image_conv.rs @@ -1,6 +1,6 @@ #![cfg(feature = "image")] -use fits4::*; +use fitskit::*; use image::{DynamicImage, GrayImage, ImageBuffer, Luma}; type Gray16Image = ImageBuffer, Vec>; diff --git a/tests/round_trip.rs b/tests/round_trip.rs index 6dee120..da7b796 100644 --- a/tests/round_trip.rs +++ b/tests/round_trip.rs @@ -1,4 +1,4 @@ -use fits4::*; +use fitskit::*; #[test] fn round_trip_empty_primary() { @@ -228,7 +228,7 @@ fn round_trip_bscale_bzero() { #[test] fn round_trip_bintable() { - use fits4::bintable::{BinCellValue, BinColumn, BinColumnType}; + use fitskit::bintable::{BinCellValue, BinColumn, BinColumnType}; // Build a simple bintable with 2 rows, 3 columns: i32, f64, 8-char string let col_j = BinColumn { @@ -325,7 +325,7 @@ fn round_trip_bintable() { #[test] fn round_trip_bintable_vla() { - use fits4::bintable::{BinCellValue, BinColumn, BinColumnType}; + use fitskit::bintable::{BinCellValue, BinColumn, BinColumnType}; // Variable-length array column using P descriptor let col = BinColumn { @@ -389,7 +389,7 @@ fn round_trip_bintable_vla() { #[test] fn round_trip_ascii_table() { - use fits4::ascii_table::{AsciiColumn, AsciiFormat, AsciiTable}; + use fitskit::ascii_table::{AsciiColumn, AsciiFormat, AsciiTable}; // 2 rows, 2 columns: name (A10) and value (F10.3) let col_name = AsciiColumn { @@ -497,7 +497,7 @@ fn header_keyword_preservation() { .push(Keyword::commentary("COMMENT", "Test comment line")); fits.primary_mut() .header - .push(Keyword::commentary("HISTORY", "Created by fits4 tests")); + .push(Keyword::commentary("HISTORY", "Created by fitskit tests")); fits.primary_mut() .header .set("AUTHOR", HeaderValue::String("test".into()), None); diff --git a/tests/sample_files.rs b/tests/sample_files.rs index 159c20a..2c5a1df 100644 --- a/tests/sample_files.rs +++ b/tests/sample_files.rs @@ -1,7 +1,7 @@ //! Tests reading NASA sample FITS files from the samp/ directory. //! These tests are skipped if the samp/ directory is not present. -use fits4::*; +use fitskit::*; use std::path::Path; const SAMP_DIR: &str = "samp"; @@ -288,8 +288,8 @@ fn hcompress_int_roundtrip_euv() { // // Quantized-float decompression is *lossy*, so the reconstructed floats will not // equal the original uncompressed sample. The authoritative oracle is funpack's own -// reconstruction: fits4 and funpack must apply the identical dither table + per-tile -// scaling, so fits4's output must be BIT-IDENTICAL to funpack's. We invoke the +// reconstruction: fitskit and funpack must apply the identical dither table + per-tile +// scaling, so fitskit's output must be BIT-IDENTICAL to funpack's. We invoke the // `funpack` CLI at test time and compare; the test skips cleanly when either the // fixture or the `funpack` binary is absent (so CI without cfitsio stays green). @@ -307,7 +307,7 @@ fn funpack_reference(fz_name: &str) -> Option { let fz_path = samp(fz_name); let mut out = std::env::temp_dir(); out.push(format!( - "fits4_funpack_ref_{}_{}.fits", + "fitskit_funpack_ref_{}_{}.fits", std::process::id(), fz_name.replace(['/', '.'], "_") )); @@ -327,7 +327,7 @@ fn funpack_reference(fz_name: &str) -> Option { f } -/// Assert that fits4's float decompression of every compressed-image HDU in `fz_name` +/// Assert that fitskit's float decompression of every compressed-image HDU in `fz_name` /// is byte-exact against funpack's reconstruction of the same file. fn assert_float_matches_funpack(fz_name: &str) { let fz_path = samp(fz_name); @@ -365,7 +365,7 @@ fn assert_float_matches_funpack(fz_name: &str) { let reference_img = ref_iter .next() .expect("more compressed images than funpack reference images"); - let dec = cimg.decompress().expect("fits4 float decompress"); + let dec = cimg.decompress().expect("fitskit float decompress"); assert_eq!( dec.axes, reference_img.axes, "{fz_name}: axes mismatch on compressed HDU #{matched}" @@ -377,9 +377,9 @@ fn assert_float_matches_funpack(fz_name: &str) { assert!(matched > 0, "{fz_name}: found no compressed-image HDUs"); } -/// Compare fits4's reconstructed floats against funpack's, element-by-element. +/// Compare fitskit's reconstructed floats against funpack's, element-by-element. /// -/// fits4 reproduces cfitsio's unquantization *exactly*, including the fused +/// fitskit reproduces cfitsio's unquantization *exactly*, including the fused /// multiply-add (`x*scale + zero` contracted to a single FMA) that the cfitsio C code /// relies on, so every reconstructed pixel must be bit-identical to funpack's. NaNs /// (from `ZBLANK`) only have to match as NaN — IEEE leaves the payload unspecified. @@ -390,7 +390,7 @@ fn compare_floats_vs_funpack(fz_name: &str, hdu: usize, got: &PixelData, want: & } assert!( a.to_bits() == b.to_bits(), - "{fz_name} HDU#{hdu}: fits4={a} ({:#x}) != funpack={b} ({:#x})", + "{fz_name} HDU#{hdu}: fitskit={a} ({:#x}) != funpack={b} ({:#x})", a.to_bits(), b.to_bits() ); @@ -465,18 +465,18 @@ fn gzip1_roundtrip_euv() { assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.gzip1.fits.fz"); } -// === ENCODE (write) path: fits4 produces tile-compressed FITS ================ +// === ENCODE (write) path: fitskit produces tile-compressed FITS ================ // // Two oracles per algorithm: // (1) internal: image.compress(opts) -> as_compressed_image().decompress() == image // (byte-exact for lossless RICE/GZIP int + lossless-float GZIP). -// (2) interop: write the fits4-compressed file, run the `funpack` CLI on it, and +// (2) interop: write the fitskit-compressed file, run the `funpack` CLI on it, and // confirm funpack's reconstruction matches the original image (byte-exact for -// lossless cases). This proves fits4 emits standard, cfitsio-readable compressed +// lossless cases). This proves fitskit emits standard, cfitsio-readable compressed // FITS. Skipped cleanly when samples or `funpack` are absent. -use fits4::tile_compress::CompressOptions; -use fits4::{CompressionType, Quantize}; +use fitskit::tile_compress::CompressOptions; +use fitskit::{CompressionType, Quantize}; /// Collect the non-empty image HDUs (primary + extensions) of an uncompressed sample. fn sample_images(src_name: &str) -> Vec { @@ -490,7 +490,7 @@ fn sample_images(src_name: &str) -> Vec { .collect() } -/// Build a fits4 `.fz`-style FitsFile: empty primary + one compressed-image extension +/// Build a fitskit `.fz`-style FitsFile: empty primary + one compressed-image extension /// per source image, all using `opts`. fn compress_sample(images: &[ImageData], opts: &CompressOptions) -> FitsFile { let mut fits = FitsFile::with_empty_primary(); @@ -533,8 +533,8 @@ fn assert_internal_roundtrip(images: &[ImageData], opts: &CompressOptions) { assert_eq!(n, images.len(), "reread: compressed HDU count"); } -/// Interop: funpack reads fits4's compressed output back to `images` (byte-exact). -fn assert_funpack_reads_fits4(src_name: &str, opts: &CompressOptions, tag: &str) { +/// Interop: funpack reads fitskit's compressed output back to `images` (byte-exact). +fn assert_funpack_reads_fitskit(src_name: &str, opts: &CompressOptions, tag: &str) { if !Path::new(SAMP_DIR).is_dir() { eprintln!("skipping: samp/ not present"); return; @@ -549,12 +549,12 @@ fn assert_funpack_reads_fits4(src_name: &str, opts: &CompressOptions, tag: &str) // funpack requires the file to be named *.fz. let mut fz = std::env::temp_dir(); - fz.push(format!("fits4_enc_{}_{}.fits.fz", std::process::id(), tag)); + fz.push(format!("fitskit_enc_{}_{}.fits.fz", std::process::id(), tag)); let mut out = std::env::temp_dir(); - out.push(format!("fits4_enc_{}_{}.fits", std::process::id(), tag)); + out.push(format!("fitskit_enc_{}_{}.fits", std::process::id(), tag)); let _ = std::fs::remove_file(&fz); let _ = std::fs::remove_file(&out); - fits.to_file(&fz).expect("write fits4 .fz"); + fits.to_file(&fz).expect("write fitskit .fz"); let status = std::process::Command::new("funpack") .arg("-O") @@ -563,7 +563,7 @@ fn assert_funpack_reads_fits4(src_name: &str, opts: &CompressOptions, tag: &str) .arg(&fz) .status() .expect("run funpack"); - assert!(status.success(), "{tag}: funpack failed on fits4 output"); + assert!(status.success(), "{tag}: funpack failed on fitskit output"); let recon = FitsFile::from_file(&out).expect("read funpack output"); let recon_images: Vec<&ImageData> = recon @@ -612,12 +612,12 @@ fn encode_rice_i16_internal() { #[test] fn encode_rice_i16_interop_funpack() { - assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &rice_opts(None), "rice_i16"); + assert_funpack_reads_fitskit("EUVEngc4151imgx.fits", &rice_opts(None), "rice_i16"); } #[test] fn encode_rice_i16_square_tiles_interop_funpack() { - assert_funpack_reads_fits4( + assert_funpack_reads_fitskit( "EUVEngc4151imgx.fits", &rice_opts(Some(vec![100, 100])), "rice_i16_t100", @@ -632,7 +632,7 @@ fn encode_rice_i32_internal() { #[test] fn encode_rice_i32_interop_funpack() { - assert_funpack_reads_fits4("FGSf64y0106m_a1f.fits", &rice_opts(None), "rice_i32"); + assert_funpack_reads_fitskit("FGSf64y0106m_a1f.fits", &rice_opts(None), "rice_i32"); } #[cfg(feature = "gzip")] @@ -655,7 +655,7 @@ fn encode_gzip1_i16_interop_funpack() { algorithm: CompressionType::Gzip1, ..Default::default() }; - assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &opts, "gzip1_i16"); + assert_funpack_reads_fitskit("EUVEngc4151imgx.fits", &opts, "gzip1_i16"); } #[cfg(feature = "gzip")] @@ -665,7 +665,7 @@ fn encode_gzip2_i16_interop_funpack() { algorithm: CompressionType::Gzip2, ..Default::default() }; - assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &opts, "gzip2_i16"); + assert_funpack_reads_fitskit("EUVEngc4151imgx.fits", &opts, "gzip2_i16"); } #[cfg(feature = "gzip")] @@ -688,12 +688,12 @@ fn encode_gzip_lossless_float_interop_funpack() { quantize: None, ..Default::default() }; - assert_funpack_reads_fits4("FOCx38i0101t_c0f.fits", &opts, "gzip_lossless_f32"); + assert_funpack_reads_fitskit("FOCx38i0101t_c0f.fits", &opts, "gzip_lossless_f32"); } /// Lossy quantized-float RICE encode: round-trips within quantization tolerance via the /// internal decoder. (Interop byte-exactness is not expected for lossy data; the -/// lossless cases above carry the funpack-reads-fits4 proof.) +/// lossless cases above carry the funpack-reads-fitskit proof.) #[test] fn encode_rice_float_quantize_within_tolerance() { require_samples!(); From de67f1b57ecfbe3347af00b71ec1d12f3d930d85 Mon Sep 17 00:00:00 2001 From: Steven Michael Date: Sun, 31 May 2026 16:26:12 -0400 Subject: [PATCH 9/9] Docs: expand lib.rs crate docs to mirror README Add binary-table and tile-compression (compress -> decompress round-trip) runnable examples, a Core types overview, and note that the gzip feature now enables encoding as well as decoding. Co-Authored-By: Claude Opus 4.8 (1M context) --- src/lib.rs | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 70 insertions(+), 3 deletions(-) diff --git a/src/lib.rs b/src/lib.rs index 44b0a1b..a6d0973 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -68,6 +68,54 @@ //! assert_eq!(bytes.len() % 2880, 0); // block-aligned //! ``` //! +//! ## Binary tables +//! +//! Build a `BINTABLE` extension column-by-column, then push it onto a file: +//! +//! ``` +//! use fitskit::{FitsFile, Hdu, BinTableBuilder, BinColumnType}; +//! +//! let table = BinTableBuilder::new() +//! .add_column("RA", BinColumnType::D64(1)) +//! .add_column("DEC", BinColumnType::D64(1)) +//! .add_column("MAG", BinColumnType::E32(1)) +//! .push_row(|row| { +//! row.write_f64(180.0); +//! row.write_f64(45.0); +//! row.write_f32(12.5); +//! }) +//! .build(); +//! +//! let mut fits = FitsFile::with_empty_primary(); +//! fits.push_extension(Hdu::bintable_extension(table)); +//! let _ = fits.to_bytes().unwrap(); +//! ``` +//! +//! ## Tile-compressed images +//! +//! [`ImageData::compress`](image_data::ImageData::compress) produces a compressed-image +//! `BINTABLE` HDU; [`Hdu::as_compressed_image`](hdu::Hdu::as_compressed_image) + +//! [`CompressedImage::decompress`](tile_compress::CompressedImage::decompress) read it +//! back. RICE_1 is lossless for integers, so this round-trips exactly: +//! +//! ``` +//! use fitskit::{FitsFile, ImageData, PixelData, CompressOptions, HduData}; +//! +//! let pixels: Vec = (0..10000).map(|i| (i % 1000) as i16).collect(); +//! let img = ImageData::new(vec![100, 100], PixelData::I16(pixels.clone())); +//! +//! // Compress (default: RICE_1, one tile per row) into a BINTABLE HDU and write it. +//! let mut fits = FitsFile::with_empty_primary(); +//! fits.push_extension(img.compress(&CompressOptions::default()).unwrap()); +//! let bytes = fits.to_bytes().unwrap(); +//! +//! // Read back and decompress. +//! let fits2 = FitsFile::from_bytes(&bytes).unwrap(); +//! let cimg = fits2.extensions()[0].as_compressed_image().unwrap(); +//! let restored = cimg.decompress().unwrap(); +//! assert!(matches!(restored.pixels, PixelData::I16(ref v) if *v == pixels)); +//! ``` +//! //! ## BSCALE/BZERO //! //! Physical values are computed as `BZERO + BSCALE * array_value`. The unsigned @@ -98,13 +146,32 @@ //! fits2.primary().verify_datasum().unwrap(); //! ``` //! +//! ## Core types +//! +//! A FITS file is an ordered sequence of Header-Data Units (HDUs): +//! +//! - [`FitsFile`] — top-level container ([`Vec`](Hdu)); reads/writes files, +//! bytes, or any `Read`/`Write`, plus builder methods. +//! - [`Hdu`] — one HDU: a [`Header`] plus an [`HduData`] payload. +//! - [`HduData`] — payload enum: `Empty`, `Image`, `AsciiTable`, `BinTable`. +//! - [`Header`] / [`Keyword`] / [`HeaderValue`] — ordered cards with typed accessors. +//! - [`ImageData`] / [`PixelData`] — axes plus a typed pixel buffer; raw and +//! BSCALE/BZERO-scaled access. +//! - [`BinTable`] / [`BinColumn`] / [`BinColumnType`] / [`BinCellValue`] — binary-table +//! model (with VLA heap), built via [`BinTableBuilder`]. +//! - [`AsciiTable`] — ASCII `TABLE` model. +//! - [`CompressedImage`] / [`CompressionType`] / [`CompressOptions`] — tile-compression +//! read view and write options. +//! - [`Bitpix`] — the `BITPIX` data type; [`Error`] / [`Result`] — error handling. +//! //! ## Feature flags //! //! - **`image`** — enables conversion between [`ImageData`] and the //! [`image`](https://crates.io/crates/image) crate's `DynamicImage`. -//! - **`gzip`** — enables decoding of `GZIP_1`/`GZIP_2` tile-compressed images via -//! the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate. The -//! default build stays dependency-free; `RICE_1` works without this feature. +//! - **`gzip`** — enables encoding *and* decoding of `GZIP_1`/`GZIP_2` tile-compressed +//! images via the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) +//! crate. The default build stays dependency-free; `RICE_1` (read and write), +//! `PLIO_1`, and `HCOMPRESS_1` decoding all work without this feature. pub mod ascii_table; pub mod bintable;