From a4489b38cb94589dad33c134610bad268e43c1c3 Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 13:56:42 -0400
Subject: [PATCH 1/9] Add FITS tiled-image compression: RICE_1 + GZIP read
 support

Implement Phase 1 of the FITS tiled-image compression convention
(ZIMAGE/ZCMPTYPE/ZTILEn/COMPRESSED_DATA), decoding compressed images
out of the BINTABLE form into ImageData:

- New src/tile_compress.rs: CompressionType/Quantize/TileGeometry,
  CompressedImage<'a> view (detect/from_bintable), RICE_1 decoder,
  tile reassembly (unravel/scatter_tile), and GZIP_1/GZIP_2 behind an
  optional, feature-gated `gzip` dependency (miniz_oxide) so the core
  stays zero-dependency.
- Lazy, backward-compatible API: Hdu::as_compressed_image() ->
  Option<CompressedImage>, then .decompress()/.compression()/.geometry().
  HduData is unchanged; compressed images stay stored as BinTable so the
  original tiles survive for lossless round-trip.
- Fix RICE_1 seed handling: the first value is a seed (lastpix), not a
  stored pixel; decode a full nvals diffs so array[0] = seed + diff[0].
  The previous off-by-one shifted every reconstructed pixel one position.
- Tests: byte-exact integration tests against fpack-generated fixtures
  (RICE row-tiled, square edge-truncated, I32, gzip1), skip-if-absent;
  corrected synthetic RICE unit tests that had masked the bug.
- scripts/gen_compressed_fixtures.sh generates/verifies fpack fixtures;
  CI downloads them best-effort (cache key v2).
- COMPRESSION_PLAN.md documents the staged rollout (float quant+dither,
  PLIO_1, HCOMPRESS_1, write path remain TODO).

Also includes accompanying in-progress crate refinements across
ascii_table/bintable/header/keyword/types and their tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .github/workflows/ci.yml           |   14 +-
 .gitignore                         |    3 +
 COMPRESSION_PLAN.md                |  229 ++++++
 Cargo.lock                         |    1 +
 Cargo.toml                         |    5 +
 scripts/gen_compressed_fixtures.sh |  146 ++++
 src/ascii_table.rs                 |   38 +-
 src/bintable.rs                    |  104 ++-
 src/checksum.rs                    |    6 +-
 src/error.rs                       |    9 +-
 src/hdu.rs                         |   51 +-
 src/header.rs                      |   20 +-
 src/image_conv.rs                  |   12 +-
 src/image_data.rs                  |    2 +-
 src/keyword.rs                     |   22 +-
 src/lib.rs                         |   38 +-
 src/tile_compress.rs               | 1109 ++++++++++++++++++++++++++++
 src/types.rs                       |    9 +-
 tests/checksum.rs                  |    2 +-
 tests/image_conv.rs                |    5 +-
 tests/round_trip.rs                |   34 +-
 tests/sample_files.rs              |  109 ++-
 22 files changed, 1863 insertions(+), 105 deletions(-)
 create mode 100644 COMPRESSION_PLAN.md
 create mode 100755 scripts/gen_compressed_fixtures.sh
 create mode 100644 src/tile_compress.rs
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index e70cdfb..ace4be0 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -26,7 +26,7 @@ jobs:
         uses: actions/cache@v4
         with:
           path: samp
-          key: fits4-samples-v1
+          key: fits4-samples-v2
 
       - name: Download sample FITS files
         if: steps.cache-samples.outputs.cache-hit != 'true'
@@ -41,6 +41,18 @@ jobs:
             WFPC2u5780205r_c0fx.fits; do
             curl -sSfL "$BASE/$f" -o "samp/$f"
           done
+          # Compressed (fpack) fixtures: best-effort — these may not yet be
+          # uploaded to the bucket, so a 404 must not fail CI. The tile-compression
+          # tests skip cleanly when a fixture is absent.
+          for f in \
+            EUVEngc4151imgx.rice.fits.fz \
+            EUVEngc4151imgx.rice_t100.fits.fz \
+            FGSf64y0106m_a1f.rice.fits.fz \
+            EUVEngc4151imgx.gzip1.fits.fz \
+            EUVEngc4151imgx.gzip2.fits.fz \
+            FOCx38i0101t_c0f.rice_nodith.fits.fz; do
+            curl -sSfL "$BASE/$f" -o "samp/$f" || true
+          done
 
       - name: Clippy
         run: cargo clippy --all-targets --features image -- -D warnings
diff --git a/.gitignore b/.gitignore
index ea8c4bf..65aa528 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,4 @@
 /target
+*.pdf
+samp
+.DS_Store
diff --git a/COMPRESSION_PLAN.md b/COMPRESSION_PLAN.md
new file mode 100644
index 0000000..9b7fcf2
--- /dev/null
+++ b/COMPRESSION_PLAN.md
@@ -0,0 +1,229 @@
+# FITS Tiled Image Compression — Implementation Plan
+
+Status: planning + initial scaffold. This document describes how to add support for
+the FITS *Tiled Image Compression* convention (the BINTABLE-based compressed-image
+format) to `fits4`, currently the main feature gap versus `fitsrs`.
+
+References:
+- Tiled Image Convention for Storing Compressed Images in FITS Binary Tables,
+  v2.3 — https://fits.gsfc.nasa.gov/registry/tilecompression/tilecompression2.3.pdf
+  (also arXiv:1201.1336)
+- HEASARC FITSIO compression docs — https://heasarc.gsfc.nasa.gov/docs/software/fitsio/compression.html
+- cfitsio `ricecomp.c` — https://github.com/HEASARC/cfitsio/blob/develop/ricecomp.c
+- astropy compressed codecs / `_tiled_compression.py`
+
+## 1. What the convention is
+
+A tile-compressed image is physically a **BINTABLE extension** (`XTENSION = 'BINTABLE'`)
+but is logically an image. The original image is divided into a rectangular grid of
+*tiles*; each tile is compressed independently and stored as a byte stream in a
+variable-length-array (VLA) cell of a binary-table column (typically `COMPRESSED_DATA`,
+a `1PB` or `1QB` column). One table row == one tile, in row-major (FORTRAN/NAXIS1-fastest)
+tile order.
+
+### Driving header keywords
+
+Mandatory:
+- `ZIMAGE = T` — marks this BINTABLE as a compressed image.
+- `ZCMPTYPE` — algorithm name: `RICE_1`, `GZIP_1`, `GZIP_2`, `HCOMPRESS_1`, `PLIO_1`
+  (also `RICE_ONE` alias, `NOCOMPRESS`).
+- `ZBITPIX` — BITPIX of the *original* (uncompressed) image (8/16/32/64/-32/-64).
+- `ZNAXIS` — NAXIS of the original image.
+- `ZNAXISn` — NAXISn of the original image (n = 1..ZNAXIS).
+
+Optional / conditional:
+- `ZTILEn` — tile size along axis n. Default = row-by-row: `ZTILE1 = ZNAXIS1`, all
+  others = 1.
+- `ZNAMEi` / `ZVALi` — algorithm parameter name/value pairs:
+  - RICE_1: `BLOCKSIZE` (default 32), `BYTEPIX` (default 4 — bytes per pixel of the
+    *integers* fed to Rice; note this is the post-quantization integer size).
+  - HCOMPRESS_1: `SCALE`, `SMOOTH`.
+- `ZQUANTIZ` — float quantization method: `NO_DITHER`, `SUBTRACTIVE_DITHER_1`,
+  `SUBTRACTIVE_DITHER_2`. Absent ⇒ no dithering (older files quantized without dither).
+- `ZDITHER0` — integer dither seed (1..10000) for subtractive dithering.
+- `ZBLANK` — integer value used in the integer stream to flag undefined/NaN pixels
+  (when given as a keyword it is constant for all tiles; can also be a per-tile column).
+- `ZSIMPLE`, `ZEXTEND`, `ZBLOCKED`, `ZPCOUNT`, `ZGCOUNT`, `ZHECKSUM`, `ZDATASUM` —
+  preserve the original primary/extension header context (mostly carried through;
+  `ZSIMPLE`/`ZTENSION` distinguish original primary vs extension).
+- `ZMASKCMP` — algorithm used to compress the null-pixel mask (rarely needed for read).
+- The original image's other header keywords are copied verbatim into the BINTABLE
+  header (so WCS etc. survive); on decompression we reconstruct an image header from
+  the `Z*` keywords plus the carried-through cards.
+
+### Binary-table columns
+
+- `COMPRESSED_DATA` (required) — VLA (`1PB`/`1QB`) holding the compressed tile bytes.
+  When a tile failed to compress/quantize, its descriptor count is 0 and the data
+  lives in the fallback column instead.
+- `GZIP_COMPRESSED_DATA` (optional) — VLA fallback: a gzip-compressed copy of the
+  *un-quantized* tile, used when lossy quantization was rejected for that tile.
+- `UNCOMPRESSED_DATA` (optional, older variant) — raw uncompressed tile fallback.
+- `ZSCALE`, `ZZERO` (optional, columns of D) — per-tile linear scale/zero for float
+  quantization: `float_value = ZZERO + ZSCALE * integer_value`. If absent, use the
+  `ZSCALE`/`ZZERO` *keywords*, or 1.0/0.0.
+- `ZBLANK` (optional column of J) — per-tile blank sentinel.
+
+### Reassembly
+
+For each row r (tile index): decode the tile's compressed bytes with the ZCMPTYPE
+algorithm into a flat array of `ZBITPIX`-typed integers (or, for the gzip fallback,
+the original dtype). For float originals, apply inverse quantization
+(`ZZERO + ZSCALE * q`, plus dither reversal). Then scatter the tile's pixels into the
+correct sub-rectangle of the full image buffer using the tile grid geometry derived
+from `ZNAXISn` / `ZTILEn`. Tiles iterate with axis 1 fastest.
+
+## 2. Float quantization & dithering
+
+Float (`ZBITPIX < 0`) images are usually stored as quantized 32-bit integers
+(`BYTEPIX` typically 4) plus per-tile `ZSCALE`/`ZZERO`. Decode:
+
+1. Decompress the integer stream `q[i]`.
+2. If `q[i] == ZBLANK` ⇒ output `NaN`.
+3. Otherwise `value = ZZERO + ZSCALE * q[i]` — with dither correction:
+   - `NO_DITHER` / absent: `value = ZZERO + ZSCALE * q[i]`.
+   - `SUBTRACTIVE_DITHER_1`: a per-pixel uniform dither `r ∈ [0,1)` was *added* to
+     `q` before rounding at compress time, so decode is
+     `value = ZZERO + ZSCALE * (q[i] - r + 0.5)` (cfitsio uses `(q - dither + 0.5)`).
+   - `SUBTRACTIVE_DITHER_2`: same as DITHER_1 except exact-zero pixels are preserved
+     (encoded with the special integer `-2147483647` ⇒ decode to exactly 0.0).
+   The dither value `r` comes from cfitsio's fixed pseudo-random sequence
+   (`fits_rand_value`, a 10000-element table generated by a specific LCG), indexed by
+   `(tile_row + ZDITHER0 - 1)` to pick a starting offset, advancing per pixel and
+   wrapping at 10000. This table/sequence must be reproduced exactly to round-trip
+   dithered files — see cfitsio `quantize.c` `fits_init_randoms`.
+
+This is the trickiest correctness item; for the first read-only pass we can support
+`NO_DITHER` and the absent case, and treat dithered files as a follow-up (decoding
+without dither reversal yields values off by up to ~0.5 ZSCALE — acceptable only as a
+clearly-flagged approximation, so better to error until the RNG table is implemented).
+
+## 3. Module design
+
+New module `src/tile_compress.rs`, gated behind nothing (pure Rust; gzip needs an
+inflate implementation — see open questions). Public surface:
+
+```rust
+pub enum CompressionType { Rice1, Gzip1, Gzip2, Hcompress1, Plio1, NoCompress }
+pub enum Quantize { None, SubtractiveDither1, SubtractiveDither2 }
+
+pub struct TileGeometry { znaxis: Vec<usize>, ztile: Vec<usize>, /* grids */ }
+
+pub struct CompressedImage<'a> {
+    pub ctype: CompressionType,
+    pub zbitpix: i64,
+    pub geometry: TileGeometry,
+    pub quantize: Quantize,
+    pub zdither0: i64,
+    pub blank: Option<i64>,
+    table: &'a BinTable,           // borrows the parsed BINTABLE
+    // resolved column indices: compressed_data, gzip_fallback, zscale, zzero, zblank
+}
+
+impl CompressedImage<'_> {
+    pub fn detect(header: &Header) -> bool;          // ZIMAGE == T
+    pub fn from_bintable(header: &Header, table: &BinTable) -> Result<Self>;
+    pub fn decompress(&self) -> Result<ImageData>;   // full image
+    fn decompress_tile(&self, row: usize) -> Result<Vec<i64-ish>>;
+}
+
+// algorithm primitives (pure functions, unit-testable in isolation):
+pub fn rice_decompress_i32(src: &[u8], nvals: usize, blocksize: usize) -> Result<Vec<i32>>;
+pub fn rice_decompress_i16(src: &[u8], nvals: usize, blocksize: usize) -> Result<Vec<i16>>;
+pub fn rice_decompress_i8 (src: &[u8], nvals: usize, blocksize: usize) -> Result<Vec<u8>>;
+pub fn gzip_inflate(src: &[u8]) -> Result<Vec<u8>>;          // GZIP_1
+// GZIP_2 = byte-shuffled gzip (de-shuffle after inflate)
+pub fn plio_decompress(src: &[u8], nvals: usize) -> Result<Vec<i32>>;
+pub fn hcompress_decompress(...) -> Result<Vec<i32>>;        // last
+```
+
+### Read-pipeline hookup
+
+In `hdu.rs::read_from`, after the BINTABLE branch parses a `BinTable`, check
+`CompressedImage::detect(&header)`. Options for surfacing it (see open questions):
+
+- **Option A (recommended):** add a new `HduData::Image(ImageData)` produced eagerly —
+  i.e. when `ZIMAGE=T`, decompress and store the result as `HduData::Image`, so callers
+  treat it like any other image. Keep the original BINTABLE accessible only if needed.
+  Downside: loses the raw table; re-writing won't round-trip the compressed bytes.
+- **Option B:** add a new variant `HduData::CompressedImage(BinTable)` (or keep the
+  `BinTable` and add a method `Hdu::as_image()` that lazily decompresses). Preserves
+  round-trip and is lazy, at the cost of a new public enum variant.
+- **Option C:** keep `HduData::BinTable`, and add `BinTable::as_compressed_image()`
+  / `Hdu::decompressed_image()` returning `Option<Result<ImageData>>`. Smallest API
+  change, fully backward compatible, lazy. Likely the best first step.
+
+The plan adopts **Option C** for the initial implementation (no enum change, no
+behavior change for existing users), with a possible convenience accessor on `FitsFile`.
+
+### Error handling
+
+Add `Error` variants: `UnsupportedCompression(String)`, `CompressionError(String)`.
+
+## 4. Staged implementation
+
+**Phase 0 (this pass — scaffold):** module skeleton, enums, `detect`, `from_bintable`
+header parsing, function stubs with `todo!()`, RICE_1 32/16/8-bit decoder + unit test.
+Wire `mod tile_compress;` into `lib.rs`. Build must stay green.
+
+**Phase 1 — RICE_1 read path (integers):** finish `decompress()` for integer
+`ZBITPIX` with `NO_DITHER`; tile scatter/reassembly; `ZBLANK` handling. Validate
+against an `fpack`-generated test file.
+
+**Phase 2 — GZIP_1 / GZIP_2:** integer + float read. Needs a DEFLATE decoder
+(see open questions re: dependency). GZIP_2 adds byte de-shuffle.
+
+**Phase 3 — float quantization + dithering:** ZSCALE/ZZERO inverse, NaN/ZBLANK,
+and the cfitsio random-number table for SUBTRACTIVE_DITHER_1/2.
+
+**Phase 4 — PLIO_1:** IRAF pixel-list run-length decoder (fully specified in the
+convention doc, simplest of the lossless coders; mainly used for integer masks).
+
+**Phase 5 — HCOMPRESS_1:** wavelet-ish H-transform decode + quantization. Most
+complex; lowest priority.
+
+**Phase 6 — writing/compression:** mirror each decoder with an encoder, build the
+BINTABLE + `Z*` keywords, VLA heap packing. RICE_1 + GZIP first; HCOMPRESS optional.
+Hook into `FitsFile` builder (e.g. `add_compressed_image(img, CompressionType::Rice1)`).
+
+## 5. Test files
+
+None of the existing `samp/*.fits` files are tile-compressed (verified: they are
+plain IMAGE / TABLE / BINTABLE HDUs). We need test fixtures:
+
+- Generate with cfitsio `fpack`:
+  - `fpack -r image.fits`  → RICE_1
+  - `fpack -g image.fits`  → GZIP_1
+  - `fpack -g2 image.fits` → GZIP_2
+  - `fpack -h image.fits`  → HCOMPRESS_1
+  - `fpack -p image.fits`  → PLIO_1 (integer images / masks)
+  Cover at least: I16 image, I32 image, F32 image (quantized, both dithered and
+  `-q 0`/no-dither), and a small 3-D cube to exercise multi-axis tiling.
+- Cross-check decompressed output against the original uncompressed FITS (byte-exact
+  for lossless integer codecs; within quantization tolerance for float).
+- Store fixtures alongside the existing samples (GCS bucket `fits4_samples`, per
+  MEMORY.md) so CI can fetch them; add a `tests/tile_compress.rs` integration test.
+
+## 6. Open questions for the human
+
+1. **API shape (most important):** Option C (lazy `Hdu::decompressed_image()` accessor,
+   no enum change) vs Option B (new `HduData::CompressedImage` variant) vs Option A
+   (eager — present compressed images transparently as `HduData::Image`). Transparent
+   presentation is friendliest for consumers like `fitsview`/`startracker` but breaks
+   round-trip writing. Recommendation: start with C, consider a transparent
+   `FitsFile::images()` helper later.
+2. **GZIP dependency:** GZIP_1/GZIP_2 require DEFLATE. The crate currently has *zero*
+   dependencies for core. Choices: (a) put gzip behind an optional `flate2`/`miniz_oxide`
+   feature; (b) write a small pure-Rust inflate (more work, keeps zero-dep promise);
+   (c) ship RICE_1/PLIO only in core and gate gzip/hcompress behind a feature.
+   Recommendation: optional `flate` feature using `miniz_oxide` (pure Rust, no C).
+3. **Dithering scope:** is byte-exact round-trip of `SUBTRACTIVE_DITHER_1/2` float
+   files required, or is read-only (and erroring on dithered files until Phase 3) okay
+   for the first release? This decides how soon we must port the cfitsio RNG table.
+4. **Surfacing scaled vs raw:** for float quantized images, do we return the
+   reconstructed `PixelData::F32`/`F64` directly (applying ZSCALE/ZZERO + dither), or
+   also expose the raw quantized integers + scale like the existing BSCALE/BZERO split
+   on `ImageData`? Recommend returning reconstructed floats (matches user expectation);
+   raw access can be a later addition.
+5. **Writing priority:** is the immediate goal read parity with `fitsrs`, or also
+   write/compress? Read-only RICE+GZIP closes most of the practical gap.
diff --git a/Cargo.lock b/Cargo.lock
index c3fefae..e419038 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -312,6 +312,7 @@ name = "fits4"
 version = "0.1.0"
 dependencies = [
  "image",
+ "miniz_oxide",
 ]
 
 [[package]]
diff --git a/Cargo.toml b/Cargo.toml
index a3a7e40..fe3bf4e 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -9,8 +9,13 @@ license = "MIT"
 version = "0.25"
 optional = true
 
+[dependencies.miniz_oxide]
+version = "0.8"
+optional = true
+
 [features]
 image = ["dep:image"]
+gzip = ["dep:miniz_oxide"]
 
 [profile.test]
 opt-level = 3
diff --git a/scripts/gen_compressed_fixtures.sh b/scripts/gen_compressed_fixtures.sh
new file mode 100755
index 0000000..c6bdbd2
--- /dev/null
+++ b/scripts/gen_compressed_fixtures.sh
@@ -0,0 +1,146 @@
+#!/usr/bin/env bash
+#
+# Generate tiled-image-compressed FITS fixtures for fits4 integration tests.
+#
+# fits4 reads the FITS tiled-image compression convention (RICE_1, GZIP_1,
+# GZIP_2, PLIO_1, HCOMPRESS_1) in pure Rust. To test that decoder against the
+# canonical encoder, we use `fpack` (shipped with cfitsio) to compress known
+# source images, then assert in our integration tests that decompressing them
+# reproduces the original pixels. `fpack` is a *dev/test* dependency only — the
+# fits4 crate itself stays pure Rust.
+#
+# Source images are the uncompressed NASA samples already in samp/. Outputs are
+# written to samp/ as <stem>.<algo>.fits.fz and round-trip-verified with funpack.
+#
+# Requirements (macOS): brew install cfitsio   (provides fpack + funpack)
+#            (conda):    conda install -c conda-forge cfitsio
+# Upload requires the gcloud SDK (`gcloud storage` or `gsutil`) with write
+# access to the fits4_samples bucket.
+#
+# Usage:
+#   scripts/gen_compressed_fixtures.sh            # generate + verify locally
+#   scripts/gen_compressed_fixtures.sh --upload   # also upload fixtures to GCS
+#
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
+SAMP_DIR="$REPO_ROOT/samp"
+BUCKET="gs://fits4_samples"
+
+UPLOAD=0
+[[ "${1:-}" == "--upload" ]] && UPLOAD=1
+
+# --- tool checks ------------------------------------------------------------
+for tool in fpack funpack; do
+  if ! command -v "$tool" >/dev/null 2>&1; then
+    echo "error: '$tool' not found on PATH. Install cfitsio:" >&2
+    echo "  macOS:  brew install cfitsio" >&2
+    echo "  conda:  conda install -c conda-forge cfitsio" >&2
+    exit 1
+  fi
+done
+
+if [[ ! -d "$SAMP_DIR" ]]; then
+  echo "error: $SAMP_DIR not found. Fetch the base samples first." >&2
+  exit 1
+fi
+
+GENERATED=()
+
+# compress <source-stem.fits> <algo-label> <fpack-flags...>
+#   produces samp/<stem>.<algo-label>.fits.fz and verifies it round-trips.
+compress() {
+  local src="$1"; shift
+  local label="$1"; shift
+  local src_path="$SAMP_DIR/$src"
+  local stem="${src%.fits}"
+  local out="$SAMP_DIR/${stem}.${label}.fits.fz"
+
+  if [[ ! -f "$src_path" ]]; then
+    echo "  skip: $src not present in samp/ — skipping $label" >&2
+    return 0
+  fi
+
+  rm -f "$out"
+  # -O names the output explicitly; remaining args are the algorithm flags.
+  fpack "$@" -O "$out" "$src_path"
+
+  # Verify: funpack -> /dev/stdout would re-add a primary; instead unpack to a
+  # temp file and confirm it succeeds (byte-exactness vs. source is asserted in
+  # the Rust tests, which compare fits4's decode against the original samples).
+  local tmp; tmp="$(mktemp -t funpack.XXXXXX).fits"
+  if funpack -O "$tmp" "$out" >/dev/null 2>&1; then
+    local before after
+    before=$(wc -c < "$src_path")
+    after=$(wc -c < "$out")
+    printf "  ok  %-40s %8s -> %8s bytes\n" "$(basename "$out")" "$before" "$after"
+    GENERATED+=("$(basename "$out")")
+  else
+    echo "  FAIL: funpack could not decompress $out" >&2
+    rm -f "$tmp"
+    return 1
+  fi
+  rm -f "$tmp"
+}
+
+echo "Generating compressed fixtures in $SAMP_DIR ..."
+
+# --- integer images (RICE / GZIP / PLIO) ------------------------------------
+# EUVEngc4151imgx.fits has I16 IMAGE extensions.
+compress EUVEngc4151imgx.fits     rice   -r
+compress EUVEngc4151imgx.fits     gzip1  -g
+compress EUVEngc4151imgx.fits     gzip2  -g2
+# PLIO_1 targets integer (mask-like) data, <= 24-bit; fine for I16.
+compress EUVEngc4151imgx.fits     plio   -p
+# Force square tiles to exercise 2-D reassembly with edge-truncated tiles.
+compress EUVEngc4151imgx.fits     rice_t100   -r -t 100,100
+
+# FGSf64y0106m_a1f.fits is an I32 image.
+compress FGSf64y0106m_a1f.fits    rice   -r
+compress FGSf64y0106m_a1f.fits    gzip1  -g
+
+# --- float images (quantize; with and without subtractive dither) -----------
+# Note: RICE/HCOMPRESS need quantization for floats. `-q <lvl>` quantizes with
+# subtractive dithering; `-q0 <lvl>` (no space) quantizes with NO dithering
+# (ZQUANTIZ=NO_DITHER); lossless float (`-q 0`, level zero) is GZIP-only.
+# FOCx38i0101t_c0f.fits is F32 1024x1024.
+compress FOCx38i0101t_c0f.fits    rice_dith     -r -q 16    # quantize + subtractive dither
+compress FOCx38i0101t_c0f.fits    rice_nodith   -r -q0 16   # quantize, no dithering
+compress FOCx38i0101t_c0f.fits    gzip_lossless -g -q 0     # lossless float (no quantization)
+compress FOCx38i0101t_c0f.fits    hcomp         -h
+
+# WFPC2u5780205r_c0fx.fits is an F32 200x200x4 cube (multi-axis tiling).
+compress WFPC2u5780205r_c0fx.fits rice_dith     -r -q 16
+compress WFPC2u5780205r_c0fx.fits rice_nodith   -r -q0 16
+
+echo
+echo "Generated ${#GENERATED[@]} fixtures:"
+printf '  %s\n' "${GENERATED[@]}"
+
+# --- optional upload --------------------------------------------------------
+if [[ "$UPLOAD" -eq 1 ]]; then
+  if command -v gcloud >/dev/null 2>&1; then
+    GSCMD=(gcloud storage cp)
+  elif command -v gsutil >/dev/null 2>&1; then
+    GSCMD=(gsutil cp)
+  else
+    echo "error: --upload requested but neither 'gcloud' nor 'gsutil' is installed." >&2
+    exit 1
+  fi
+  echo
+  echo "Uploading to $BUCKET ..."
+  for f in "${GENERATED[@]}"; do
+    "${GSCMD[@]}" "$SAMP_DIR/$f" "$BUCKET/$f"
+    echo "  uploaded $f"
+  done
+fi
+
+echo
+echo "Next steps:"
+echo "  1. Add these filenames to the download loop in .github/workflows/ci.yml"
+echo "     (and bump the cache key, e.g. fits4-samples-v1 -> v2)."
+echo "  2. Add assertions in tests/sample_files.rs using hdu.as_compressed_image()."
+if [[ "$UPLOAD" -eq 0 ]]; then
+  echo "  3. Re-run with --upload to push fixtures to $BUCKET."
+fi
diff --git a/src/ascii_table.rs b/src/ascii_table.rs
index 7c87081..f7595d1 100644
--- a/src/ascii_table.rs
+++ b/src/ascii_table.rs
@@ -49,7 +49,9 @@ impl AsciiFormat {
                     _ => unreachable!(),
                 }
             }
-            _ => Err(Error::InvalidTableFormat(format!("unknown ASCII TFORM code: {s}"))),
+            _ => Err(Error::InvalidTableFormat(format!(
+                "unknown ASCII TFORM code: {s}"
+            ))),
         }
     }
 
@@ -120,7 +122,9 @@ impl AsciiTable {
 
             let tscal = header.get_float(&format!("TSCAL{i}")).unwrap_or(1.0);
             let tzero = header.get_float(&format!("TZERO{i}")).unwrap_or(0.0);
-            let tunit = header.get_string(&format!("TUNIT{i}")).map(|s| s.to_string());
+            let tunit = header
+                .get_string(&format!("TUNIT{i}"))
+                .map(|s| s.to_string());
 
             columns.push(AsciiColumn {
                 name,
@@ -191,14 +195,22 @@ impl AsciiTable {
 
     /// Populate header keywords for this table.
     pub fn fill_header(&self, header: &mut Header) {
-        header.set("XTENSION", HeaderValue::String("TABLE".into()), Some("ASCII table extension"));
+        header.set(
+            "XTENSION",
+            HeaderValue::String("TABLE".into()),
+            Some("ASCII table extension"),
+        );
         header.set("BITPIX", HeaderValue::Integer(8), None);
         header.set("NAXIS", HeaderValue::Integer(2), None);
         header.set("NAXIS1", HeaderValue::Integer(self.row_len as i64), None);
         header.set("NAXIS2", HeaderValue::Integer(self.nrows as i64), None);
         header.set("PCOUNT", HeaderValue::Integer(0), None);
         header.set("GCOUNT", HeaderValue::Integer(1), None);
-        header.set("TFIELDS", HeaderValue::Integer(self.columns.len() as i64), None);
+        header.set(
+            "TFIELDS",
+            HeaderValue::Integer(self.columns.len() as i64),
+            None,
+        );
 
         for (i, col) in self.columns.iter().enumerate() {
             let idx = i + 1;
@@ -220,25 +232,13 @@ impl AsciiTable {
                 AsciiFormat::Ewd(w, d) => format!("E{w}.{d}"),
                 AsciiFormat::Dwd(w, d) => format!("D{w}.{d}"),
             };
-            header.set(
-                &format!("TFORM{idx}"),
-                HeaderValue::String(fmt_str),
-                None,
-            );
+            header.set(&format!("TFORM{idx}"), HeaderValue::String(fmt_str), None);
 
             if col.tscal != 1.0 {
-                header.set(
-                    &format!("TSCAL{idx}"),
-                    HeaderValue::Float(col.tscal),
-                    None,
-                );
+                header.set(&format!("TSCAL{idx}"), HeaderValue::Float(col.tscal), None);
             }
             if col.tzero != 0.0 {
-                header.set(
-                    &format!("TZERO{idx}"),
-                    HeaderValue::Float(col.tzero),
-                    None,
-                );
+                header.set(&format!("TZERO{idx}"), HeaderValue::Float(col.tzero), None);
             }
             if let Some(ref unit) = col.tunit {
                 header.set(
diff --git a/src/bintable.rs b/src/bintable.rs
index f794c60..c226459 100644
--- a/src/bintable.rs
+++ b/src/bintable.rs
@@ -52,7 +52,9 @@ impl BinColumnType {
         }
 
         if code_pos >= bytes.len() {
-            return Err(Error::InvalidTableFormat(format!("no type code in TFORM: {s}")));
+            return Err(Error::InvalidTableFormat(format!(
+                "no type code in TFORM: {s}"
+            )));
         }
 
         let repeat: usize = if code_pos == 0 {
@@ -92,7 +94,10 @@ impl BinColumnType {
             b'D' => Ok(BinColumnType::D64(repeat)),
             b'C' => Ok(BinColumnType::C64(repeat)),
             b'M' => Ok(BinColumnType::M128(repeat)),
-            _ => Err(Error::InvalidTableFormat(format!("unknown BINTABLE type code: {}", code as char))),
+            _ => Err(Error::InvalidTableFormat(format!(
+                "unknown BINTABLE type code: {}",
+                code as char
+            ))),
         }
     }
 
@@ -210,7 +215,9 @@ impl BinTable {
 
             let tscal = header.get_float(&format!("TSCAL{i}")).unwrap_or(1.0);
             let tzero = header.get_float(&format!("TZERO{i}")).unwrap_or(0.0);
-            let tunit = header.get_string(&format!("TUNIT{i}")).map(|s| s.to_string());
+            let tunit = header
+                .get_string(&format!("TUNIT{i}"))
+                .map(|s| s.to_string());
 
             columns.push(BinColumn {
                 name,
@@ -266,12 +273,8 @@ impl BinTable {
                 let vals: Vec<bool> = bytes[..*n].iter().map(|&b| b == b'T').collect();
                 Ok(BinCellValue::Logical(vals))
             }
-            BinColumnType::Bit(n) => {
-                Ok(BinCellValue::Bits(bytes.to_vec(), *n))
-            }
-            BinColumnType::Byte(n) => {
-                Ok(BinCellValue::Bytes(bytes[..*n].to_vec()))
-            }
+            BinColumnType::Bit(n) => Ok(BinCellValue::Bits(bytes.to_vec(), *n)),
+            BinColumnType::Byte(n) => Ok(BinCellValue::Bytes(bytes[..*n].to_vec())),
             BinColumnType::I16(n) => {
                 let vals: Vec<i16> = bytes
                     .chunks_exact(2)
@@ -359,7 +362,12 @@ impl BinTable {
     }
 
     /// Read variable-length array data from the heap.
-    fn read_heap_array(&self, elem_type: char, count: usize, offset: usize) -> Result<BinCellValue> {
+    fn read_heap_array(
+        &self,
+        elem_type: char,
+        count: usize,
+        offset: usize,
+    ) -> Result<BinCellValue> {
         if count == 0 {
             return match elem_type {
                 'B' => Ok(BinCellValue::Bytes(Vec::new())),
@@ -370,7 +378,9 @@ impl BinTable {
                 'D' => Ok(BinCellValue::F64(Vec::new())),
                 'A' => Ok(BinCellValue::String(String::new())),
                 'L' => Ok(BinCellValue::Logical(Vec::new())),
-                _ => Err(Error::InvalidTableFormat(format!("unsupported VLA element type: {elem_type}"))),
+                _ => Err(Error::InvalidTableFormat(format!(
+                    "unsupported VLA element type: {elem_type}"
+                ))),
             };
         }
 
@@ -381,7 +391,11 @@ impl BinTable {
             'K' | 'D' => 8,
             'C' => 8,
             'M' => 16,
-            _ => return Err(Error::InvalidTableFormat(format!("unsupported VLA element type: {elem_type}"))),
+            _ => {
+                return Err(Error::InvalidTableFormat(format!(
+                    "unsupported VLA element type: {elem_type}"
+                )))
+            }
         };
 
         let end = offset + count * elem_size;
@@ -395,7 +409,9 @@ impl BinTable {
         let data = &self.heap[offset..end];
 
         match elem_type {
-            'L' => Ok(BinCellValue::Logical(data.iter().map(|&b| b == b'T').collect())),
+            'L' => Ok(BinCellValue::Logical(
+                data.iter().map(|&b| b == b'T').collect(),
+            )),
             'B' => Ok(BinCellValue::Bytes(data.to_vec())),
             'A' => {
                 let s = std::str::from_utf8(data)
@@ -429,20 +445,30 @@ impl BinTable {
                     .map(|c| f64::from_be_bytes(c.try_into().unwrap()))
                     .collect(),
             )),
-            _ => Err(Error::InvalidTableFormat(format!("unsupported VLA type: {elem_type}"))),
+            _ => Err(Error::InvalidTableFormat(format!(
+                "unsupported VLA type: {elem_type}"
+            ))),
         }
     }
 
     /// Populate header keywords for this binary table.
     pub fn fill_header(&self, header: &mut Header) {
-        header.set("XTENSION", HeaderValue::String("BINTABLE".into()), Some("binary table extension"));
+        header.set(
+            "XTENSION",
+            HeaderValue::String("BINTABLE".into()),
+            Some("binary table extension"),
+        );
         header.set("BITPIX", HeaderValue::Integer(8), None);
         header.set("NAXIS", HeaderValue::Integer(2), None);
         header.set("NAXIS1", HeaderValue::Integer(self.row_len as i64), None);
         header.set("NAXIS2", HeaderValue::Integer(self.nrows as i64), None);
         header.set("PCOUNT", HeaderValue::Integer(self.heap.len() as i64), None);
         header.set("GCOUNT", HeaderValue::Integer(1), None);
-        header.set("TFIELDS", HeaderValue::Integer(self.columns.len() as i64), None);
+        header.set(
+            "TFIELDS",
+            HeaderValue::Integer(self.columns.len() as i64),
+            None,
+        );
 
         for (i, col) in self.columns.iter().enumerate() {
             let idx = i + 1;
@@ -458,18 +484,10 @@ impl BinTable {
             );
 
             if col.tscal != 1.0 {
-                header.set(
-                    &format!("TSCAL{idx}"),
-                    HeaderValue::Float(col.tscal),
-                    None,
-                );
+                header.set(&format!("TSCAL{idx}"), HeaderValue::Float(col.tscal), None);
             }
             if col.tzero != 0.0 {
-                header.set(
-                    &format!("TZERO{idx}"),
-                    HeaderValue::Float(col.tzero),
-                    None,
-                );
+                header.set(&format!("TZERO{idx}"), HeaderValue::Float(col.tzero), None);
             }
             if let Some(ref unit) = col.tunit {
                 header.set(
@@ -663,12 +681,30 @@ mod tests {
 
     #[test]
     fn parse_tform_codes() {
-        assert!(matches!(BinColumnType::parse("1J").unwrap(), BinColumnType::J32(1)));
-        assert!(matches!(BinColumnType::parse("10E").unwrap(), BinColumnType::E32(10)));
-        assert!(matches!(BinColumnType::parse("8A").unwrap(), BinColumnType::Char(8)));
-        assert!(matches!(BinColumnType::parse("D").unwrap(), BinColumnType::D64(1)));
-        assert!(matches!(BinColumnType::parse("1PJ").unwrap(), BinColumnType::VarP('J')));
-        assert!(matches!(BinColumnType::parse("1QD").unwrap(), BinColumnType::VarQ('D')));
+        assert!(matches!(
+            BinColumnType::parse("1J").unwrap(),
+            BinColumnType::J32(1)
+        ));
+        assert!(matches!(
+            BinColumnType::parse("10E").unwrap(),
+            BinColumnType::E32(10)
+        ));
+        assert!(matches!(
+            BinColumnType::parse("8A").unwrap(),
+            BinColumnType::Char(8)
+        ));
+        assert!(matches!(
+            BinColumnType::parse("D").unwrap(),
+            BinColumnType::D64(1)
+        ));
+        assert!(matches!(
+            BinColumnType::parse("1PJ").unwrap(),
+            BinColumnType::VarP('J')
+        ));
+        assert!(matches!(
+            BinColumnType::parse("1QD").unwrap(),
+            BinColumnType::VarQ('D')
+        ));
     }
 
     #[test]
@@ -684,7 +720,9 @@ mod tests {
 
     #[test]
     fn tform_round_trip() {
-        for s in &["1J", "10E", "8A", "1D", "3I", "1L", "20X", "1C", "1M", "1PJ", "1QD"] {
+        for s in &[
+            "1J", "10E", "8A", "1D", "3I", "1L", "20X", "1C", "1M", "1PJ", "1QD",
+        ] {
             let ct = BinColumnType::parse(s).unwrap();
             let out = ct.to_tform_string();
             let ct2 = BinColumnType::parse(&out).unwrap();
diff --git a/src/checksum.rs b/src/checksum.rs
index 0b812d2..a2bc8b5 100644
--- a/src/checksum.rs
+++ b/src/checksum.rs
@@ -149,7 +149,11 @@ pub fn decode_checksum(ascii: &str, complement: bool) -> u32 {
     }
 
     let sum = (hi << 16) | lo;
-    if complement { !sum } else { sum }
+    if complement {
+        !sum
+    } else {
+        sum
+    }
 }
 
 /// Compute DATASUM for a data byte buffer (should be block-padded).
diff --git a/src/error.rs b/src/error.rs
index 4275a7f..f30e6b7 100644
--- a/src/error.rs
+++ b/src/error.rs
@@ -14,6 +14,8 @@ pub enum Error {
     ChecksumMismatch { expected: u32, actual: u32 },
     UnsupportedExtension(String),
     InvalidTableFormat(String),
+    UnsupportedCompression(String),
+    CompressionError(String),
 }
 
 impl fmt::Display for Error {
@@ -29,10 +31,15 @@ impl fmt::Display for Error {
                 write!(f, "data size mismatch: expected {expected}, got {actual}")
             }
             Error::ChecksumMismatch { expected, actual } => {
-                write!(f, "checksum mismatch: expected {expected:#010x}, got {actual:#010x}")
+                write!(
+                    f,
+                    "checksum mismatch: expected {expected:#010x}, got {actual:#010x}"
+                )
             }
             Error::UnsupportedExtension(s) => write!(f, "unsupported extension: {s}"),
             Error::InvalidTableFormat(s) => write!(f, "invalid table format: {s}"),
+            Error::UnsupportedCompression(s) => write!(f, "unsupported compression: {s}"),
+            Error::CompressionError(s) => write!(f, "compression error: {s}"),
         }
     }
 }
diff --git a/src/hdu.rs b/src/hdu.rs
index c680774..6af05d6 100644
--- a/src/hdu.rs
+++ b/src/hdu.rs
@@ -32,7 +32,11 @@ impl Hdu {
     /// Create a primary HDU with image data.
     pub fn primary_image(image: ImageData) -> Self {
         let mut header = Header::new();
-        header.set("SIMPLE", HeaderValue::Logical(true), Some("conforms to FITS standard"));
+        header.set(
+            "SIMPLE",
+            HeaderValue::Logical(true),
+            Some("conforms to FITS standard"),
+        );
         image.fill_header(&mut header);
         Hdu {
             header,
@@ -43,7 +47,11 @@ impl Hdu {
     /// Create a primary HDU with no data.
     pub fn primary_empty() -> Self {
         let mut header = Header::new();
-        header.set("SIMPLE", HeaderValue::Logical(true), Some("conforms to FITS standard"));
+        header.set(
+            "SIMPLE",
+            HeaderValue::Logical(true),
+            Some("conforms to FITS standard"),
+        );
         header.set("BITPIX", HeaderValue::Integer(8), None);
         header.set("NAXIS", HeaderValue::Integer(0), None);
         Hdu {
@@ -55,7 +63,11 @@ impl Hdu {
     /// Create an IMAGE extension HDU.
     pub fn image_extension(image: ImageData) -> Self {
         let mut header = Header::new();
-        header.set("XTENSION", HeaderValue::String("IMAGE".into()), Some("image extension"));
+        header.set(
+            "XTENSION",
+            HeaderValue::String("IMAGE".into()),
+            Some("image extension"),
+        );
         image.fill_header(&mut header);
         header.set("PCOUNT", HeaderValue::Integer(0), None);
         header.set("GCOUNT", HeaderValue::Integer(1), None);
@@ -85,6 +97,39 @@ impl Hdu {
         }
     }
 
+    /// View this HDU as a tile-compressed image, if it is one.
+    ///
+    /// A tile-compressed image is stored on disk as a `BINTABLE` extension with
+    /// `ZIMAGE = T` (the FITS Tiled Image Compression convention). This is a *cheap*
+    /// detection step: it inspects the header and, when it matches, returns a
+    /// [`CompressedImage`] view that borrows the underlying [`BinTable`] — preserving
+    /// the original compressed tiles for lossless round-trip writing.
+    ///
+    /// Returns `None` when the HDU is not a compressed-image BINTABLE. The actual
+    /// decoding happens in [`CompressedImage::decompress`]:
+    ///
+    /// ```no_run
+    /// # use fits4::FitsFile;
+    /// # let fits = FitsFile::from_file("compressed.fits").unwrap();
+    /// for hdu in fits.extensions() {
+    ///     if let Some(cimg) = hdu.as_compressed_image() {
+    ///         let image = cimg.decompress().unwrap();
+    ///         println!("{:?}", image.axes);
+    ///     }
+    /// }
+    /// ```
+    pub fn as_compressed_image(&self) -> Option<crate::tile_compress::CompressedImage<'_>> {
+        if !crate::tile_compress::CompressedImage::detect(&self.header) {
+            return None;
+        }
+        match &self.data {
+            HduData::BinTable(table) => {
+                crate::tile_compress::CompressedImage::from_bintable(&self.header, table).ok()
+            }
+            _ => None,
+        }
+    }
+
     /// Read an HDU from a reader.
     pub fn read_from<R: Read + Seek>(reader: &mut R) -> Result<Self> {
         let header = Header::read_from(reader)?;
diff --git a/src/header.rs b/src/header.rs
index 82562ea..11be010 100644
--- a/src/header.rs
+++ b/src/header.rs
@@ -79,9 +79,7 @@ impl Header {
         let mut buf = [0u8; BLOCK_SIZE];
 
         'outer: loop {
-            reader
-                .read_exact(&mut buf)
-                .map_err(Error::Io)?;
+            reader.read_exact(&mut buf).map_err(Error::Io)?;
 
             for i in 0..RECORDS_PER_BLOCK {
                 let start = i * RECORD_SIZE;
@@ -89,7 +87,9 @@ impl Header {
                     buf[start..start + RECORD_SIZE].try_into().unwrap();
 
                 // Check for END keyword
-                if &record[..8] == b"END     " || record[..3] == *b"END" && record[3..8].iter().all(|&b| b == b' ') {
+                if &record[..8] == b"END     "
+                    || record[..3] == *b"END" && record[3..8].iter().all(|&b| b == b' ')
+                {
                     break 'outer;
                 }
 
@@ -162,9 +162,9 @@ impl Header {
         for i in 1..=naxis {
             let key = format!("NAXIS{i}");
             let n = self.require_int(&key)? as usize;
-            product = product.checked_mul(n).ok_or_else(|| {
-                Error::InvalidFormat("axis dimensions overflow".into())
-            })?;
+            product = product
+                .checked_mul(n)
+                .ok_or_else(|| Error::InvalidFormat("axis dimensions overflow".into()))?;
         }
 
         let pcount = self.get_int("PCOUNT").unwrap_or(0) as usize;
@@ -233,7 +233,11 @@ mod tests {
     #[test]
     fn header_round_trip() {
         let mut h = Header::new();
-        h.set("SIMPLE", HeaderValue::Logical(true), Some("conforms to standard"));
+        h.set(
+            "SIMPLE",
+            HeaderValue::Logical(true),
+            Some("conforms to standard"),
+        );
         h.set("BITPIX", HeaderValue::Integer(16), None);
         h.set("NAXIS", HeaderValue::Integer(0), None);
 
diff --git a/src/image_conv.rs b/src/image_conv.rs
index e9c395a..9e4d4b3 100644
--- a/src/image_conv.rs
+++ b/src/image_conv.rs
@@ -44,10 +44,14 @@ impl ImageData {
             _ => {
                 // Normalize to u16 range
                 let scaled = self.scaled_values(bscale, bzero);
-                let (min, max) = scaled.iter().fold((f64::MAX, f64::MIN), |(mn, mx), &v| {
-                    (mn.min(v), mx.max(v))
-                });
-                let range = if (max - min).abs() < 1e-30 { 1.0 } else { max - min };
+                let (min, max) = scaled
+                    .iter()
+                    .fold((f64::MAX, f64::MIN), |(mn, mx), &v| (mn.min(v), mx.max(v)));
+                let range = if (max - min).abs() < 1e-30 {
+                    1.0
+                } else {
+                    max - min
+                };
 
                 let pixels: Vec<u16> = scaled
                     .iter()
diff --git a/src/image_data.rs b/src/image_data.rs
index 16aed08..33a76f2 100644
--- a/src/image_data.rs
+++ b/src/image_data.rs
@@ -1,5 +1,5 @@
 use crate::error::{Error, Result};
-use crate::header::{Header};
+use crate::header::Header;
 use crate::keyword::HeaderValue;
 use crate::types::Bitpix;
 
diff --git a/src/keyword.rs b/src/keyword.rs
index 8e3908c..3ef44ac 100644
--- a/src/keyword.rs
+++ b/src/keyword.rs
@@ -204,7 +204,11 @@ impl Keyword {
 fn format_value(value: &HeaderValue) -> String {
     match value {
         HeaderValue::Logical(b) => {
-            if *b { "T".to_string() } else { "F".to_string() }
+            if *b {
+                "T".to_string()
+            } else {
+                "F".to_string()
+            }
         }
         HeaderValue::Integer(i) => format!("{i}"),
         HeaderValue::Float(f) => format_float(*f),
@@ -443,7 +447,9 @@ fn parse_complex_value(s: &str) -> Result<(HeaderValue, Option<String>)> {
     let inner = &s[1..close];
     let parts: Vec<&str> = inner.split(',').collect();
     if parts.len() != 2 {
-        return Err(Error::InvalidKeyword("complex value must have two components".into()));
+        return Err(Error::InvalidKeyword(
+            "complex value must have two components".into(),
+        ));
     }
 
     let rest = &s[close + 1..];
@@ -471,7 +477,11 @@ fn split_comment(s: &str) -> (&str, Option<String>) {
     if let Some(pos) = s.find('/') {
         let val = &s[..pos];
         let cmt = s[pos + 1..].trim();
-        let comment = if cmt.is_empty() { None } else { Some(cmt.to_string()) };
+        let comment = if cmt.is_empty() {
+            None
+        } else {
+            Some(cmt.to_string())
+        };
         (val, comment)
     } else {
         (s, None)
@@ -482,7 +492,11 @@ fn parse_trailing_comment(s: &str) -> Option<String> {
     let trimmed = s.trim_start();
     if let Some(rest) = trimmed.strip_prefix('/') {
         let cmt = rest.trim();
-        if cmt.is_empty() { None } else { Some(cmt.to_string()) }
+        if cmt.is_empty() {
+            None
+        } else {
+            Some(cmt.to_string())
+        }
     } else {
         None
     }
diff --git a/src/lib.rs b/src/lib.rs
index 06c65d7..7c39eec 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -10,7 +10,12 @@
 //! - **ASCII TABLE extension** — Aw, Iw, Fw.d, Ew.d, Dw.d column formats
 //! - **BINTABLE extension** — all type codes including variable-length arrays (P/Q descriptors)
 //!
-//! Tile-compressed images (`ZIMAGE`) and random groups are not supported.
+//! Tile-compressed images (`ZIMAGE`) are supported for **reading** integer images
+//! (`RICE_1`; `GZIP_1`/`GZIP_2` behind the `gzip` feature) via
+//! [`Hdu::as_compressed_image`](hdu::Hdu::as_compressed_image) +
+//! [`CompressedImage::decompress`](tile_compress::CompressedImage::decompress).
+//! Float quantization/dithering, `PLIO_1`, `HCOMPRESS_1`, and the write/compress
+//! path are not yet implemented. Random groups are not supported.
 //!
 //! ## Quick start — reading
 //!
@@ -91,28 +96,33 @@
 //!
 //! - **`image`** — enables conversion between [`ImageData`] and the
 //!   [`image`](https://crates.io/crates/image) crate's `DynamicImage`.
+//! - **`gzip`** — enables decoding of `GZIP_1`/`GZIP_2` tile-compressed images via
+//!   the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate. The
+//!   default build stays dependency-free; `RICE_1` works without this feature.
 
-pub mod error;
-pub mod types;
-pub mod keyword;
-pub mod header;
-pub mod io_utils;
-pub mod image_data;
 pub mod ascii_table;
 pub mod bintable;
 pub mod checksum;
-pub mod hdu;
+pub mod error;
 pub mod fits;
+pub mod hdu;
+pub mod header;
+pub mod image_data;
+pub mod io_utils;
+pub mod keyword;
+pub mod tile_compress;
+pub mod types;
 
 #[cfg(feature = "image")]
 pub mod image_conv;
 
+pub use ascii_table::AsciiTable;
+pub use bintable::{BinCellValue, BinColumn, BinColumnType, BinTable, BinTableBuilder};
 pub use error::{Error, Result};
-pub use types::Bitpix;
-pub use keyword::{HeaderValue, Keyword};
+pub use fits::FitsFile;
+pub use hdu::{Hdu, HduData};
 pub use header::Header;
 pub use image_data::{ImageData, PixelData};
-pub use ascii_table::AsciiTable;
-pub use bintable::{BinTable, BinTableBuilder, BinColumn, BinColumnType, BinCellValue};
-pub use hdu::{Hdu, HduData};
-pub use fits::FitsFile;
+pub use keyword::{HeaderValue, Keyword};
+pub use tile_compress::{CompressedImage, CompressionType, Quantize, TileGeometry};
+pub use types::Bitpix;
diff --git a/src/tile_compress.rs b/src/tile_compress.rs
new file mode 100644
index 0000000..ac5720b
--- /dev/null
+++ b/src/tile_compress.rs
@@ -0,0 +1,1109 @@
+//! Tiled image compression (the BINTABLE-based compressed-image convention).
+//!
+//! A tile-compressed image is stored physically as a `BINTABLE` extension but is
+//! logically an image. The original image is divided into a rectangular grid of
+//! *tiles*; each tile is compressed independently and stored as a byte stream in a
+//! variable-length-array column (typically `COMPRESSED_DATA`). One table row holds
+//! one tile, in row-major (axis-1-fastest) order.
+//!
+//! See the *Tiled Image Convention for Storing Compressed Images in FITS Binary
+//! Tables* (v2.3) and the cfitsio implementation.
+//!
+//! # Status
+//!
+//! This module is being built up in phases (see `COMPRESSION_PLAN.md`):
+//!
+//! - Phase 0 (current): scaffolding + the [`rice_decompress_i32`] family of
+//!   decoders with unit tests.
+//! - Phase 1: RICE_1 integer read path ([`CompressedImage::decompress`]).
+//! - Later: GZIP_1/GZIP_2, float quantization/dithering, PLIO_1, HCOMPRESS_1, and a
+//!   compression (write) path.
+//!
+//! Most of the high-level pipeline is currently stubbed with `todo!()`.
+
+use crate::bintable::BinTable;
+use crate::error::{Error, Result};
+use crate::header::Header;
+
+/// Compression algorithm named by the `ZCMPTYPE` keyword.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum CompressionType {
+    /// `RICE_1` (also `RICE_ONE`) — Rice coding of pixel differences. Most common.
+    Rice1,
+    /// `GZIP_1` — DEFLATE over the raw tile bytes.
+    Gzip1,
+    /// `GZIP_2` — DEFLATE over byte-shuffled tile bytes.
+    Gzip2,
+    /// `HCOMPRESS_1` — H-transform + quantization (lossy/lossless).
+    Hcompress1,
+    /// `PLIO_1` — IRAF pixel-list run-length encoding (integer masks).
+    Plio1,
+    /// `NOCOMPRESS` — tiles stored uncompressed.
+    NoCompress,
+}
+
+impl CompressionType {
+    /// Parse a `ZCMPTYPE` keyword value.
+    pub fn parse(s: &str) -> Result<Self> {
+        match s.trim().to_ascii_uppercase().as_str() {
+            "RICE_1" | "RICE_ONE" => Ok(CompressionType::Rice1),
+            "GZIP_1" => Ok(CompressionType::Gzip1),
+            "GZIP_2" => Ok(CompressionType::Gzip2),
+            "HCOMPRESS_1" => Ok(CompressionType::Hcompress1),
+            "PLIO_1" => Ok(CompressionType::Plio1),
+            "NOCOMPRESS" => Ok(CompressionType::NoCompress),
+            other => Err(Error::UnsupportedCompression(other.to_string())),
+        }
+    }
+}
+
+/// Float quantization / dithering method named by `ZQUANTIZ`.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum Quantize {
+    /// No dithering (or keyword absent).
+    None,
+    /// `SUBTRACTIVE_DITHER_1`.
+    SubtractiveDither1,
+    /// `SUBTRACTIVE_DITHER_2` (preserves exact zeros).
+    SubtractiveDither2,
+}
+
+impl Quantize {
+    /// Parse a `ZQUANTIZ` keyword value (absent ⇒ [`Quantize::None`]).
+    pub fn parse(s: Option<&str>) -> Result<Self> {
+        match s.map(|v| v.trim().to_ascii_uppercase()) {
+            None => Ok(Quantize::None),
+            Some(v) => match v.as_str() {
+                "NO_DITHER" => Ok(Quantize::None),
+                "SUBTRACTIVE_DITHER_1" => Ok(Quantize::SubtractiveDither1),
+                "SUBTRACTIVE_DITHER_2" => Ok(Quantize::SubtractiveDither2),
+                other => Err(Error::UnsupportedCompression(format!(
+                    "ZQUANTIZ={other}"
+                ))),
+            },
+        }
+    }
+}
+
+/// Tile-grid geometry derived from `ZNAXIS`/`ZNAXISn`/`ZTILEn`.
+#[derive(Debug, Clone)]
+pub struct TileGeometry {
+    /// Original image dimensions (`ZNAXISn`), axis-1 first.
+    pub znaxis: Vec<usize>,
+    /// Tile dimensions (`ZTILEn`), axis-1 first.
+    pub ztile: Vec<usize>,
+}
+
+impl TileGeometry {
+    /// Build geometry from the compressed-image header.
+    ///
+    /// Defaults to row-by-row tiling (`ZTILE1 = ZNAXIS1`, others = 1) when the
+    /// `ZTILEn` keywords are absent.
+    pub fn from_header(header: &Header) -> Result<Self> {
+        let znaxis = header.require_int("ZNAXIS")? as usize;
+        let mut dims = Vec::with_capacity(znaxis);
+        for i in 1..=znaxis {
+            dims.push(header.require_int(&format!("ZNAXIS{i}"))? as usize);
+        }
+
+        let mut tile = Vec::with_capacity(znaxis);
+        for i in 1..=znaxis {
+            let default = if i == 1 { dims[0] } else { 1 };
+            let t = header
+                .get_int(&format!("ZTILE{i}"))
+                .map(|v| v as usize)
+                .unwrap_or(default);
+            tile.push(t);
+        }
+
+        Ok(TileGeometry {
+            znaxis: dims,
+            ztile: tile,
+        })
+    }
+
+    /// Number of tiles along each axis.
+    pub fn tiles_per_axis(&self) -> Vec<usize> {
+        self.znaxis
+            .iter()
+            .zip(&self.ztile)
+            .map(|(&n, &t)| n.div_ceil(t.max(1)))
+            .collect()
+    }
+
+    /// Total number of tiles (== expected number of table rows).
+    pub fn num_tiles(&self) -> usize {
+        self.tiles_per_axis().iter().product()
+    }
+}
+
+/// A compressed image read from a BINTABLE, ready to be decompressed.
+///
+/// Borrows the already-parsed [`BinTable`]; the heavy lifting happens in
+/// [`CompressedImage::decompress`].
+#[derive(Debug)]
+pub struct CompressedImage<'a> {
+    /// Compression algorithm.
+    pub ctype: CompressionType,
+    /// BITPIX of the original (uncompressed) image.
+    pub zbitpix: i64,
+    /// Tile-grid geometry.
+    pub geometry: TileGeometry,
+    /// Float quantization/dither method.
+    pub quantize: Quantize,
+    /// Dither seed (`ZDITHER0`), if present.
+    pub zdither0: Option<i64>,
+    /// Integer blank sentinel (`ZBLANK` keyword), if present.
+    pub blank: Option<i64>,
+    /// RICE `BLOCKSIZE` parameter (default 32).
+    pub blocksize: usize,
+    /// RICE `BYTEPIX` parameter (default 4).
+    pub bytepix: usize,
+    /// The borrowed binary table holding the compressed tiles + heap.
+    /// Read once [`CompressedImage::decompress`] is implemented (phase 1).
+    #[allow(dead_code)]
+    pub(crate) table: &'a BinTable,
+}
+
+impl<'a> CompressedImage<'a> {
+    /// True if `header` describes a tile-compressed image (`ZIMAGE = T`).
+    pub fn detect(header: &Header) -> bool {
+        header.get_bool("ZIMAGE").unwrap_or(false)
+    }
+
+    /// Build a [`CompressedImage`] from a compressed-image BINTABLE header and the
+    /// parsed table. Parses the `Z*` driver keywords; does not yet decompress.
+    pub fn from_bintable(header: &Header, table: &'a BinTable) -> Result<Self> {
+        if !Self::detect(header) {
+            return Err(Error::CompressionError(
+                "header is not a tile-compressed image (ZIMAGE != T)".into(),
+            ));
+        }
+
+        let ctype = CompressionType::parse(
+            header
+                .get_string("ZCMPTYPE")
+                .ok_or_else(|| Error::MissingKeyword("ZCMPTYPE".into()))?,
+        )?;
+        let zbitpix = header.require_int("ZBITPIX")?;
+        let geometry = TileGeometry::from_header(header)?;
+        let quantize = Quantize::parse(header.get_string("ZQUANTIZ"))?;
+        let zdither0 = header.get_int("ZDITHER0");
+        let blank = header.get_int("ZBLANK");
+
+        // Default RICE parameters; overridden by ZNAMEi='BLOCKSIZE'/'BYTEPIX'.
+        let mut blocksize = 32usize;
+        let mut bytepix = 4usize;
+        let nzparams = count_zname_params(header);
+        for i in 1..=nzparams {
+            if let Some(name) = header.get_string(&format!("ZNAME{i}")) {
+                let val = header.get_int(&format!("ZVAL{i}")).unwrap_or(0) as usize;
+                match name.trim().to_ascii_uppercase().as_str() {
+                    "BLOCKSIZE" => blocksize = val,
+                    "BYTEPIX" => bytepix = val,
+                    _ => {}
+                }
+            }
+        }
+
+        Ok(CompressedImage {
+            ctype,
+            zbitpix,
+            geometry,
+            quantize,
+            zdither0,
+            blank,
+            blocksize,
+            bytepix,
+            table,
+        })
+    }
+
+    /// The compression algorithm (`ZCMPTYPE`). Cheap inspector.
+    pub fn compression(&self) -> CompressionType {
+        self.ctype
+    }
+
+    /// The tile-grid geometry (`ZNAXISn`/`ZTILEn`). Cheap inspector.
+    pub fn geometry(&self) -> &TileGeometry {
+        &self.geometry
+    }
+
+    /// Decompress all tiles and reassemble the full image into an [`ImageData`].
+    ///
+    /// Implemented for integer `ZBITPIX` (8/16/32/64) with `RICE_1` (and, behind the
+    /// `gzip` feature, `GZIP_1`/`GZIP_2`). Each tile is read from the
+    /// `COMPRESSED_DATA` variable-length-array column, decoded to a flat array of
+    /// integers, and scattered into the correct sub-rectangle of the output buffer
+    /// per the tile grid (axis-1 fastest). `ZBLANK` integer sentinels are carried
+    /// through unchanged into the output array.
+    ///
+    /// # Not yet supported (see `COMPRESSION_PLAN.md`)
+    ///
+    /// - Float `ZBITPIX` (-32/-64): requires quantization inverse (`ZSCALE`/`ZZERO`)
+    ///   and subtractive dithering — phase 3.
+    /// - `PLIO_1` — phase 4.
+    /// - `HCOMPRESS_1` — phase 5.
+    /// - `GZIP_1`/`GZIP_2` without the `gzip` feature return
+    ///   [`Error::UnsupportedCompression`].
+    pub fn decompress(&self) -> Result<crate::image_data::ImageData> {
+        use crate::image_data::{ImageData, PixelData};
+        use crate::types::Bitpix;
+
+        // Float originals require quantization inverse + dithering (phase 3).
+        if self.zbitpix < 0 {
+            return Err(Error::UnsupportedCompression(format!(
+                "float ZBITPIX={} (quantization/dithering decode is phase 3, see COMPRESSION_PLAN.md)",
+                self.zbitpix
+            )));
+        }
+
+        let bitpix = Bitpix::from_i64(self.zbitpix)?;
+        let npix: usize = self.geometry.znaxis.iter().product();
+
+        // Decode every tile into i64 values, in tile order (axis-1 fastest), then
+        // scatter into the full-image buffer.
+        let mut full = vec![0i64; npix];
+        let tiles_per_axis = self.geometry.tiles_per_axis();
+        let num_tiles = self.geometry.num_tiles();
+
+        if self.table.nrows < num_tiles {
+            return Err(Error::CompressionError(format!(
+                "compressed image expects {} tiles but BINTABLE has {} rows",
+                num_tiles, self.table.nrows
+            )));
+        }
+
+        let cdata_col = self.compressed_data_column()?;
+
+        for tile_index in 0..num_tiles {
+            let raw = self.tile_bytes(tile_index, cdata_col)?;
+            // Number of pixels in this (possibly edge-truncated) tile.
+            let coords = unravel(tile_index, &tiles_per_axis);
+            let tile_dims = self.tile_dims_at(&coords);
+            let tile_npix: usize = tile_dims.iter().product();
+
+            let values = self.decode_tile(&raw, tile_npix)?;
+            if values.len() != tile_npix {
+                return Err(Error::CompressionError(format!(
+                    "tile {tile_index} decoded {} values, expected {tile_npix}",
+                    values.len()
+                )));
+            }
+            scatter_tile(
+                &mut full,
+                &self.geometry.znaxis,
+                &self.geometry.ztile,
+                &tile_dims,
+                &coords,
+                &values,
+            );
+        }
+
+        // Narrow i64 -> the target integer storage.
+        let pixels = match bitpix {
+            Bitpix::U8 => PixelData::U8(full.iter().map(|&v| v as u8).collect()),
+            Bitpix::I16 => PixelData::I16(full.iter().map(|&v| v as i16).collect()),
+            Bitpix::I32 => PixelData::I32(full.iter().map(|&v| v as i32).collect()),
+            Bitpix::I64 => PixelData::I64(full),
+            Bitpix::F32 | Bitpix::F64 => unreachable!("float handled above"),
+        };
+
+        Ok(ImageData::new(self.geometry.znaxis.clone(), pixels))
+    }
+
+    /// Resolve the index of the `COMPRESSED_DATA` column.
+    fn compressed_data_column(&self) -> Result<usize> {
+        self.table
+            .columns
+            .iter()
+            .position(|c| c.name.trim().eq_ignore_ascii_case("COMPRESSED_DATA"))
+            .ok_or_else(|| {
+                Error::CompressionError("compressed image has no COMPRESSED_DATA column".into())
+            })
+    }
+
+    /// Read the raw compressed bytes for one tile (one table row) from the VLA column.
+    fn tile_bytes(&self, row: usize, col: usize) -> Result<Vec<u8>> {
+        match self.table.get_cell(row, col)? {
+            crate::bintable::BinCellValue::Bytes(b) => Ok(b),
+            other => Err(Error::CompressionError(format!(
+                "COMPRESSED_DATA cell is not a byte VLA: {other:?}"
+            ))),
+        }
+    }
+
+    /// Tile dimensions at tile-grid coordinates `coords`, accounting for edge tiles
+    /// that are truncated when `ZNAXISn` is not a multiple of `ZTILEn`.
+    fn tile_dims_at(&self, coords: &[usize]) -> Vec<usize> {
+        let mut dims = Vec::with_capacity(coords.len());
+        for (axis, &c) in coords.iter().enumerate() {
+            let n = self.geometry.znaxis[axis];
+            let t = self.geometry.ztile[axis].max(1);
+            let start = c * t;
+            dims.push((n - start).min(t));
+        }
+        dims
+    }
+
+    /// Decode one tile's bytes into a flat array of `tile_npix` integer values
+    /// (axis-1 fastest) according to the compression type.
+    fn decode_tile(&self, raw: &[u8], tile_npix: usize) -> Result<Vec<i64>> {
+        match self.ctype {
+            CompressionType::Rice1 => match self.bytepix {
+                1 => Ok(rice_decompress_i8(raw, tile_npix, self.blocksize)?
+                    .into_iter()
+                    .map(|v| v as i64)
+                    .collect()),
+                2 => Ok(rice_decompress_i16(raw, tile_npix, self.blocksize)?
+                    .into_iter()
+                    .map(|v| v as i64)
+                    .collect()),
+                4 => Ok(rice_decompress_i32(raw, tile_npix, self.blocksize)?
+                    .into_iter()
+                    .map(|v| v as i64)
+                    .collect()),
+                other => Err(Error::CompressionError(format!(
+                    "RICE_1 BYTEPIX={other} not supported (expected 1, 2, or 4)"
+                ))),
+            },
+            CompressionType::NoCompress => self.bytes_to_ints(raw, tile_npix),
+            CompressionType::Gzip1 => {
+                let inflated = gzip_inflate(raw)?;
+                self.bytes_to_ints(&inflated, tile_npix)
+            }
+            CompressionType::Gzip2 => {
+                let inflated = gzip_inflate(raw)?;
+                let unshuffled = gzip2_unshuffle(&inflated, self.zbitpix_bytes());
+                self.bytes_to_ints(&unshuffled, tile_npix)
+            }
+            CompressionType::Plio1 => Err(Error::UnsupportedCompression(
+                "PLIO_1 decode is phase 4 (see COMPRESSION_PLAN.md)".into(),
+            )),
+            CompressionType::Hcompress1 => Err(Error::UnsupportedCompression(
+                "HCOMPRESS_1 decode is phase 5 (see COMPRESSION_PLAN.md)".into(),
+            )),
+        }
+    }
+
+    /// Bytes-per-pixel of the original (uncompressed) integer image.
+    fn zbitpix_bytes(&self) -> usize {
+        (self.zbitpix.unsigned_abs() as usize) / 8
+    }
+
+    /// Interpret a big-endian byte buffer as `tile_npix` integers of width
+    /// `ZBITPIX` (used by NOCOMPRESS and the GZIP codecs, which store the raw image
+    /// integers, not RICE-coded differences).
+    fn bytes_to_ints(&self, bytes: &[u8], tile_npix: usize) -> Result<Vec<i64>> {
+        let width = self.zbitpix_bytes();
+        if width == 0 {
+            return Err(Error::CompressionError("invalid ZBITPIX width".into()));
+        }
+        let avail = bytes.len() / width;
+        if avail < tile_npix {
+            return Err(Error::CompressionError(format!(
+                "tile has {avail} integers, expected {tile_npix}"
+            )));
+        }
+        let mut out = Vec::with_capacity(tile_npix);
+        for c in bytes.chunks_exact(width).take(tile_npix) {
+            let v = match width {
+                1 => c[0] as i64, // ZBITPIX=8 is unsigned bytes
+                2 => i16::from_be_bytes([c[0], c[1]]) as i64,
+                4 => i32::from_be_bytes([c[0], c[1], c[2], c[3]]) as i64,
+                8 => i64::from_be_bytes(c.try_into().unwrap()),
+                _ => return Err(Error::CompressionError("invalid ZBITPIX width".into())),
+            };
+            out.push(v);
+        }
+        Ok(out)
+    }
+}
+
+/// Convert a linear tile index into per-axis tile coordinates (axis-1 fastest).
+fn unravel(mut index: usize, tiles_per_axis: &[usize]) -> Vec<usize> {
+    let mut coords = vec![0usize; tiles_per_axis.len()];
+    for axis in 0..tiles_per_axis.len() {
+        let n = tiles_per_axis[axis].max(1);
+        coords[axis] = index % n;
+        index /= n;
+    }
+    coords
+}
+
+/// Scatter a decoded tile's flat `values` (axis-1 fastest) into the full-image
+/// buffer `full` (also axis-1 fastest), placing it at tile-grid `coords`.
+///
+/// - `image_dims` are the full `ZNAXISn` (axis-1 first).
+/// - `ztile` are the nominal tile sizes (`ZTILEn`); used to locate the tile origin.
+/// - `tile_dims` are the (possibly edge-truncated) dimensions of *this* tile; used
+///   to size the copy. They equal `ztile` except for tiles on a high edge where
+///   `ZNAXISn` is not a multiple of `ZTILEn`.
+fn scatter_tile(
+    full: &mut [i64],
+    image_dims: &[usize],
+    ztile: &[usize],
+    tile_dims: &[usize],
+    coords: &[usize],
+    values: &[i64],
+) {
+    let ndim = image_dims.len();
+    // Per-axis stride into the full image buffer (axis-1 fastest => stride 1).
+    let mut img_stride = vec![1usize; ndim];
+    for axis in 1..ndim {
+        img_stride[axis] = img_stride[axis - 1] * image_dims[axis - 1];
+    }
+    // Origin pixel of this tile in the full image: coords stepped by the *nominal*
+    // tile size along each axis.
+    let mut origin = 0usize;
+    for axis in 0..ndim {
+        origin += coords[axis] * ztile[axis].max(1) * img_stride[axis];
+    }
+
+    // Iterate over every pixel of the tile via its multi-index (axis-1 fastest).
+    let tile_npix: usize = tile_dims.iter().product();
+    let mut tcoord = vec![0usize; ndim];
+    for &val in values.iter().take(tile_npix) {
+        let mut dst = origin;
+        for axis in 0..ndim {
+            dst += tcoord[axis] * img_stride[axis];
+        }
+        full[dst] = val;
+        // Increment the tile multi-index (axis-1 fastest).
+        for axis in 0..ndim {
+            tcoord[axis] += 1;
+            if tcoord[axis] < tile_dims[axis] {
+                break;
+            }
+            tcoord[axis] = 0;
+        }
+    }
+}
+
+/// Count the contiguous `ZNAMEi`/`ZVALi` parameter pairs present in the header.
+fn count_zname_params(header: &Header) -> usize {
+    let mut n = 0;
+    while header.find(&format!("ZNAME{}", n + 1)).is_some() {
+        n += 1;
+    }
+    n
+}
+
+// ---------------------------------------------------------------------------
+// RICE_1 decompression
+// ---------------------------------------------------------------------------
+//
+// Port of cfitsio `fits_rdecomp` / `_short` / `_byte` (R. White, STScI). The first
+// pixel is stored verbatim; subsequent pixels are stored as Rice-coded, zigzag-mapped
+// differences from the previous pixel, in blocks of `blocksize` pixels. Each block is
+// prefixed by `fsbits` bits giving `fs+1`:
+//   * fs < 0      -> all differences in the block are zero
+//   * fs == fsmax -> each difference is stored verbatim in `bbits` bits
+//   * else        -> Rice code: unary leading zeros give the high bits, then `fs`
+//                    low bits; combine as `(nzero << fs) | low`.
+
+/// A simple big-endian-ish MSB-first bit reader over a byte slice.
+struct BitReader<'a> {
+    data: &'a [u8],
+    byte_pos: usize,
+    /// Number of valid bits remaining in the current `buffer` (0..=8).
+    bits_in_buf: u32,
+    buffer: u32,
+}
+
+impl<'a> BitReader<'a> {
+    fn new(data: &'a [u8]) -> Self {
+        BitReader {
+            data,
+            byte_pos: 0,
+            bits_in_buf: 0,
+            buffer: 0,
+        }
+    }
+
+    /// Read `n` bits (0..=32) MSB-first as an unsigned value.
+    fn read_bits(&mut self, n: u32) -> Result<u32> {
+        let mut result: u32 = 0;
+        let mut need = n;
+        while need > 0 {
+            if self.bits_in_buf == 0 {
+                let b = *self
+                    .data
+                    .get(self.byte_pos)
+                    .ok_or_else(|| Error::CompressionError("RICE: unexpected end of stream".into()))?;
+                self.byte_pos += 1;
+                self.buffer = b as u32;
+                self.bits_in_buf = 8;
+            }
+            let take = need.min(self.bits_in_buf);
+            let shift = self.bits_in_buf - take;
+            let mask = if take == 32 { u32::MAX } else { (1u32 << take) - 1 };
+            let bits = (self.buffer >> shift) & mask;
+            result = (result << take) | bits;
+            self.bits_in_buf -= take;
+            need -= take;
+        }
+        Ok(result)
+    }
+
+    /// Count and consume leading zero bits, then consume the terminating one-bit.
+    /// Returns the number of zero bits seen.
+    fn count_leading_zeros(&mut self) -> Result<u32> {
+        let mut count = 0u32;
+        loop {
+            if self.bits_in_buf == 0 {
+                let b = *self
+                    .data
+                    .get(self.byte_pos)
+                    .ok_or_else(|| Error::CompressionError("RICE: unexpected end of stream".into()))?;
+                self.byte_pos += 1;
+                self.buffer = b as u32;
+                self.bits_in_buf = 8;
+            }
+            // Inspect the top valid bit.
+            let top = (self.buffer >> (self.bits_in_buf - 1)) & 1;
+            self.bits_in_buf -= 1;
+            if top == 1 {
+                return Ok(count);
+            }
+            count += 1;
+        }
+    }
+}
+
+/// Map a zigzag-encoded unsigned difference back to a signed difference.
+#[inline]
+fn unzigzag(u: u64) -> i64 {
+    if u & 1 == 0 {
+        (u >> 1) as i64
+    } else {
+        !((u >> 1) as i64)
+    }
+}
+
+/// Core RICE_1 decode shared by the 8/16/32-bit variants.
+///
+/// `fsbits`/`fsmax`/`bbits` are the per-variant constants. Returns reconstructed
+/// values as `i64` (caller narrows to the target type).
+fn rice_decompress_core(
+    src: &[u8],
+    nvals: usize,
+    blocksize: usize,
+    fsbits: u32,
+    fsmax: u32,
+    bbits: u32,
+) -> Result<Vec<i64>> {
+    if nvals == 0 {
+        return Ok(Vec::new());
+    }
+    let blocksize = blocksize.max(1);
+    let mut out = Vec::with_capacity(nvals);
+    let mut reader = BitReader::new(src);
+
+    // First value: bbits bits, verbatim (unsigned), interpreted as the raw integer.
+    //
+    // Note: the verbatim first value is the *seed* (`lastpix`), not itself a stored
+    // pixel. cfitsio computes `array[0] = lastpix + diff[0]` and decodes `nvals`
+    // differences in total (see `fits_rdecomp` in cfitsio `ricecomp.c`). Storing the
+    // seed as the first output pixel — and then decoding only `nvals - 1` diffs —
+    // shifts every value one position later, which is exactly wrong.
+    let first = reader.read_bits(bbits)? as u64;
+    let mut lastpix: i64 = sign_extend(first, bbits);
+
+    while out.len() < nvals {
+        let fs_plus_1 = reader.read_bits(fsbits)?;
+        let fs = fs_plus_1 as i64 - 1;
+        let remaining = nvals - out.len();
+        let block_n = remaining.min(blocksize);
+
+        if fs < 0 {
+            // All differences zero.
+            for _ in 0..block_n {
+                out.push(lastpix);
+            }
+        } else if fs as u32 == fsmax {
+            // Verbatim differences in bbits bits each.
+            for _ in 0..block_n {
+                let raw = reader.read_bits(bbits)? as u64;
+                let diff = unzigzag(raw);
+                lastpix = lastpix.wrapping_add(diff);
+                out.push(lastpix);
+            }
+        } else {
+            let fs = fs as u32;
+            for _ in 0..block_n {
+                let high = reader.count_leading_zeros()? as u64;
+                let low = if fs > 0 { reader.read_bits(fs)? as u64 } else { 0 };
+                let mapped = (high << fs) | low;
+                let diff = unzigzag(mapped);
+                lastpix = lastpix.wrapping_add(diff);
+                out.push(lastpix);
+            }
+        }
+    }
+
+    Ok(out)
+}
+
+/// Sign-extend the low `bits` bits of `v` into an `i64`.
+#[inline]
+fn sign_extend(v: u64, bits: u32) -> i64 {
+    if bits == 0 || bits >= 64 {
+        return v as i64;
+    }
+    let shift = 64 - bits;
+    ((v << shift) as i64) >> shift
+}
+
+/// Decompress a RICE_1 stream of 32-bit integers (`BYTEPIX = 4`).
+pub fn rice_decompress_i32(src: &[u8], nvals: usize, blocksize: usize) -> Result<Vec<i32>> {
+    let v = rice_decompress_core(src, nvals, blocksize, 5, 25, 32)?;
+    Ok(v.into_iter().map(|x| x as i32).collect())
+}
+
+/// Decompress a RICE_1 stream of 16-bit integers (`BYTEPIX = 2`).
+pub fn rice_decompress_i16(src: &[u8], nvals: usize, blocksize: usize) -> Result<Vec<i16>> {
+    let v = rice_decompress_core(src, nvals, blocksize, 4, 14, 16)?;
+    Ok(v.into_iter().map(|x| x as i16).collect())
+}
+
+/// Decompress a RICE_1 stream of 8-bit integers (`BYTEPIX = 1`).
+pub fn rice_decompress_i8(src: &[u8], nvals: usize, blocksize: usize) -> Result<Vec<u8>> {
+    let v = rice_decompress_core(src, nvals, blocksize, 3, 6, 8)?;
+    Ok(v.into_iter().map(|x| x as u8).collect())
+}
+
+// ---------------------------------------------------------------------------
+// Other algorithms (stubs)
+// ---------------------------------------------------------------------------
+
+/// Inflate a `GZIP_1` tile.
+///
+/// In the FITS Tiled Image Compression convention, `GZIP_1` tiles are full gzip
+/// members (RFC 1952: gzip header + DEFLATE body + CRC32/ISIZE trailer), exactly as
+/// produced by zlib's `gzip` routines and cfitsio. This routine strips the gzip
+/// wrapper and runs a raw DEFLATE inflate over the body.
+///
+/// Available only with the `gzip` feature enabled; otherwise this returns
+/// [`Error::UnsupportedCompression`].
+#[cfg(feature = "gzip")]
+pub fn gzip_inflate(src: &[u8]) -> Result<Vec<u8>> {
+    let body = strip_gzip_wrapper(src)?;
+    miniz_oxide::inflate::decompress_to_vec(body)
+        .map_err(|e| Error::CompressionError(format!("DEFLATE inflate failed: {e:?}")))
+}
+
+/// Stub when the `gzip` feature is disabled: GZIP_1/GZIP_2 tiles cannot be decoded.
+#[cfg(not(feature = "gzip"))]
+pub fn gzip_inflate(_src: &[u8]) -> Result<Vec<u8>> {
+    Err(Error::UnsupportedCompression(
+        "GZIP_1/GZIP_2 require the `gzip` feature (miniz_oxide); rebuild with --features gzip".into(),
+    ))
+}
+
+/// Parse and skip an RFC 1952 gzip member header, returning the DEFLATE body
+/// (excluding the 8-byte CRC32+ISIZE trailer). Used by [`gzip_inflate`].
+#[cfg(feature = "gzip")]
+fn strip_gzip_wrapper(src: &[u8]) -> Result<&[u8]> {
+    const FHCRC: u8 = 1 << 1;
+    const FEXTRA: u8 = 1 << 2;
+    const FNAME: u8 = 1 << 3;
+    const FCOMMENT: u8 = 1 << 4;
+
+    if src.len() < 18 || src[0] != 0x1f || src[1] != 0x8b {
+        return Err(Error::CompressionError(
+            "GZIP_1 tile is not a valid gzip member (bad magic)".into(),
+        ));
+    }
+    if src[2] != 8 {
+        return Err(Error::CompressionError(format!(
+            "GZIP_1 unsupported compression method {}",
+            src[2]
+        )));
+    }
+    let flags = src[3];
+    let mut pos = 10usize; // fixed header: magic(2)+CM(1)+FLG(1)+MTIME(4)+XFL(1)+OS(1)
+
+    let end = || Error::CompressionError("GZIP_1 truncated gzip header".to_string());
+
+    if flags & FEXTRA != 0 {
+        if pos + 2 > src.len() {
+            return Err(end());
+        }
+        let xlen = u16::from_le_bytes([src[pos], src[pos + 1]]) as usize;
+        pos += 2 + xlen;
+    }
+    if flags & FNAME != 0 {
+        while pos < src.len() && src[pos] != 0 {
+            pos += 1;
+        }
+        pos += 1; // skip NUL
+    }
+    if flags & FCOMMENT != 0 {
+        while pos < src.len() && src[pos] != 0 {
+            pos += 1;
+        }
+        pos += 1;
+    }
+    if flags & FHCRC != 0 {
+        pos += 2;
+    }
+    if pos + 8 > src.len() {
+        return Err(end());
+    }
+    // Body is everything between the header and the 8-byte trailer.
+    Ok(&src[pos..src.len() - 8])
+}
+
+/// `GZIP_2` stores the tile's integers with bytes shuffled into planes (all the
+/// most-significant bytes first, then the next, etc.) before gzip. After inflating,
+/// undo the shuffle to recover the original big-endian integer byte stream.
+///
+/// `bytepix` is the bytes-per-pixel of the original integers (1/2/4/8). For
+/// `bytepix <= 1` the shuffle is a no-op.
+#[cfg(feature = "gzip")]
+fn gzip2_unshuffle(shuffled: &[u8], bytepix: usize) -> Vec<u8> {
+    if bytepix <= 1 {
+        return shuffled.to_vec();
+    }
+    let n = shuffled.len() / bytepix;
+    let mut out = vec![0u8; n * bytepix];
+    for (i, chunk) in out.chunks_exact_mut(bytepix).enumerate().take(n) {
+        for (b, slot) in chunk.iter_mut().enumerate() {
+            *slot = shuffled[b * n + i];
+        }
+    }
+    out
+}
+
+/// `gzip2_unshuffle` is only reachable when the `gzip` feature is on (the GZIP_2
+/// branch errors out in `gzip_inflate` otherwise), so it is feature-gated to avoid a
+/// dead-code warning in the default build.
+#[cfg(not(feature = "gzip"))]
+#[allow(dead_code)]
+fn gzip2_unshuffle(shuffled: &[u8], _bytepix: usize) -> Vec<u8> {
+    shuffled.to_vec()
+}
+
+/// Decompress a PLIO_1 (IRAF pixel-list RLE) tile into i32 mask values. TODO(phase 4).
+pub fn plio_decompress(_src: &[u8], _nvals: usize) -> Result<Vec<i32>> {
+    todo!("phase 4: PLIO_1 pixel-list decode")
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    /// Minimal MSB-first bit writer mirroring [`BitReader`], for building test inputs.
+    struct BitWriter {
+        bytes: Vec<u8>,
+        cur: u8,
+        nbits: u32,
+    }
+    impl BitWriter {
+        fn new() -> Self {
+            BitWriter {
+                bytes: Vec::new(),
+                cur: 0,
+                nbits: 0,
+            }
+        }
+        fn put_bits(&mut self, val: u32, n: u32) {
+            for i in (0..n).rev() {
+                let bit = ((val >> i) & 1) as u8;
+                self.cur = (self.cur << 1) | bit;
+                self.nbits += 1;
+                if self.nbits == 8 {
+                    self.bytes.push(self.cur);
+                    self.cur = 0;
+                    self.nbits = 0;
+                }
+            }
+        }
+        fn finish(mut self) -> Vec<u8> {
+            if self.nbits > 0 {
+                self.cur <<= 8 - self.nbits;
+                self.bytes.push(self.cur);
+            }
+            self.bytes
+        }
+    }
+
+    fn zigzag(v: i64) -> u64 {
+        ((v << 1) ^ (v >> 63)) as u64
+    }
+
+    #[test]
+    fn unzigzag_round_trip() {
+        for v in [-5i64, -1, 0, 1, 2, 100, -100, i32::MIN as i64, i32::MAX as i64] {
+            assert_eq!(unzigzag(zigzag(v)), v);
+        }
+    }
+
+    #[test]
+    fn bit_reader_basic() {
+        // 0b1011_0010, 0b1100_0000
+        let data = [0b1011_0010u8, 0b1100_0000u8];
+        let mut r = BitReader::new(&data);
+        assert_eq!(r.read_bits(4).unwrap(), 0b1011);
+        assert_eq!(r.read_bits(4).unwrap(), 0b0010);
+        assert_eq!(r.read_bits(2).unwrap(), 0b11);
+    }
+
+    /// Hand-encode a small i32 RICE_1 stream (one block, fs known) and decode it.
+    ///
+    /// Mirrors cfitsio `fits_rcomp`: the first value is written verbatim as the *seed*
+    /// (`lastpix = vals[0]`), and `nvals` differences are written — the first of which
+    /// is `vals[0] - lastpix == 0`. The decoder then reconstructs `array[0] = seed + 0`.
+    #[test]
+    fn rice_i32_single_block_roundtrip() {
+        // Original values; encode diffs with a chosen fs so we control the layout.
+        let vals: Vec<i32> = vec![1000, 1003, 1001, 1005, 1002];
+        let fs: u32 = 2; // low bits per code
+        let encoded = rice_encode_i32(&vals, fs);
+
+        let decoded = rice_decompress_i32(&encoded, vals.len(), 32).unwrap();
+        assert_eq!(decoded, vals);
+    }
+
+    /// Zero-difference (constant) block: fs encoded as 0 (fs_plus_1 = 0 -> fs = -1).
+    #[test]
+    fn rice_i32_zero_block() {
+        let vals: Vec<i32> = vec![42, 42, 42, 42];
+        let mut w = BitWriter::new();
+        w.put_bits(vals[0] as u32, 32);
+        w.put_bits(0, 5); // fs_plus_1 = 0 => fs = -1 => all-zero block
+        let encoded = w.finish();
+        let decoded = rice_decompress_i32(&encoded, vals.len(), 32).unwrap();
+        assert_eq!(decoded, vals);
+    }
+
+    #[test]
+    fn tile_geometry_default_row_by_row() {
+        let mut h = Header::new();
+        h.set("ZNAXIS", HeaderValue::Integer(2), None);
+        h.set("ZNAXIS1", HeaderValue::Integer(100), None);
+        h.set("ZNAXIS2", HeaderValue::Integer(50), None);
+        let g = TileGeometry::from_header(&h).unwrap();
+        assert_eq!(g.znaxis, vec![100, 50]);
+        assert_eq!(g.ztile, vec![100, 1]); // row-by-row default
+        assert_eq!(g.num_tiles(), 50);
+    }
+
+    use crate::keyword::HeaderValue;
+
+    // --- tile reassembly geometry --------------------------------------------
+
+    #[test]
+    fn unravel_axis1_fastest() {
+        // 3x2 grid of tiles: index increments axis-1 (x) fastest.
+        let tpa = [3usize, 2];
+        assert_eq!(unravel(0, &tpa), vec![0, 0]);
+        assert_eq!(unravel(1, &tpa), vec![1, 0]);
+        assert_eq!(unravel(2, &tpa), vec![2, 0]);
+        assert_eq!(unravel(3, &tpa), vec![0, 1]);
+        assert_eq!(unravel(5, &tpa), vec![2, 1]);
+    }
+
+    #[test]
+    fn scatter_full_tiles_2d() {
+        // 4x4 image, 2x2 tiles => 2x2 grid. Each tile filled with its index*10+local.
+        let image_dims = [4usize, 4];
+        let ztile = [2usize, 2];
+        let tiles_per_axis = [2usize, 2];
+        let mut full = vec![-1i64; 16];
+        for tile in 0..4 {
+            let coords = unravel(tile, &tiles_per_axis);
+            // tile-local values 0..4 (axis-1 fastest within the tile)
+            let vals: Vec<i64> = (0..4).map(|l| (tile as i64) * 100 + l as i64).collect();
+            scatter_tile(&mut full, &image_dims, &ztile, &[2, 2], &coords, &vals);
+        }
+        // Expected layout (row-major, axis-1 fastest):
+        // row0: tile0[0] tile0[1] tile1[0] tile1[1]
+        // row1: tile0[2] tile0[3] tile1[2] tile1[3]
+        // row2: tile2[0] tile2[1] tile3[0] tile3[1]
+        // row3: tile2[2] tile2[3] tile3[2] tile3[3]
+        let expected = vec![
+            0, 1, 100, 101, // row0
+            2, 3, 102, 103, // row1
+            200, 201, 300, 301, // row2
+            202, 203, 302, 303, // row3
+        ];
+        assert_eq!(full, expected);
+    }
+
+    #[test]
+    fn scatter_edge_truncated_tiles() {
+        // 3x3 image with 2x2 tiles => 2x2 grid, but edge tiles are 1 wide/tall.
+        let image_dims = [3usize, 3];
+        let ztile = [2usize, 2];
+        let tiles_per_axis = [2usize, 2];
+        // tile dims: (0,0)=2x2, (1,0)=1x2, (0,1)=2x1, (1,1)=1x1
+        let tile_dims = [vec![2, 2], vec![1, 2], vec![2, 1], vec![1, 1]];
+        let mut full = vec![-1i64; 9];
+        for tile in 0..4 {
+            let coords = unravel(tile, &tiles_per_axis);
+            let td = &tile_dims[tile];
+            let n: usize = td.iter().product();
+            let vals: Vec<i64> = (0..n).map(|l| (tile as i64) * 100 + l as i64).collect();
+            scatter_tile(&mut full, &image_dims, &ztile, td, &coords, &vals);
+        }
+        // No -1 should remain; every pixel covered exactly once.
+        assert!(!full.contains(&-1));
+        // Spot check origins: tile0 at (0,0)->index0; tile1 (x-tile 1) origin x=2 row0 => index2.
+        assert_eq!(full[0], 0); // tile0 local0
+        assert_eq!(full[2], 100); // tile1 local0
+        assert_eq!(full[6], 200); // tile2 local0 (row2, x0 => index 6)
+        assert_eq!(full[8], 300); // tile3 local0 (row2, x2 => index 8)
+    }
+
+    // --- end-to-end RICE_1 decompress ----------------------------------------
+
+    /// Build a one-block (<=32 values) i32 RICE_1 stream for `vals` with low-bits `fs`.
+    ///
+    /// Mirrors cfitsio `fits_rcomp`: writes `vals[0]` verbatim as the seed, sets
+    /// `lastpix = vals[0]`, then emits exactly `vals.len()` differences — the first
+    /// being `vals[0] - lastpix == 0`.
+    fn rice_encode_i32(vals: &[i32], fs: u32) -> Vec<u8> {
+        let mut w = BitWriter::new();
+        w.put_bits(vals[0] as u32, 32);
+        w.put_bits(fs + 1, 5);
+        let mut last = vals[0];
+        for &v in vals {
+            let diff = (v - last) as i64;
+            last = v;
+            let mapped = zigzag(diff);
+            let high = (mapped >> fs) as u32;
+            let low = (mapped & ((1 << fs) - 1)) as u32;
+            for _ in 0..high {
+                w.put_bits(0, 1);
+            }
+            w.put_bits(1, 1);
+            w.put_bits(low, fs);
+        }
+        w.finish()
+    }
+
+    #[test]
+    fn end_to_end_rice1_i32_two_tiles() {
+        use crate::bintable::{BinColumnType, BinTableBuilder};
+
+        // 4x1 image, tile = 2x1 => two row tiles of 2 pixels each.
+        let tile0 = [1000i32, 1003];
+        let tile1 = [50i32, 47];
+        let enc0 = rice_encode_i32(&tile0, 2);
+        let enc1 = rice_encode_i32(&tile1, 2);
+
+        let table = BinTableBuilder::new()
+            .add_column("COMPRESSED_DATA", BinColumnType::VarP('B'))
+            .push_row(|r| r.write_var_p(enc0.len() as i32, |heap| heap.extend_from_slice(&enc0)))
+            .push_row(|r| r.write_var_p(enc1.len() as i32, |heap| heap.extend_from_slice(&enc1)))
+            .build();
+
+        let mut h = Header::new();
+        h.set("ZIMAGE", HeaderValue::Logical(true), None);
+        h.set("ZCMPTYPE", HeaderValue::String("RICE_1".into()), None);
+        h.set("ZBITPIX", HeaderValue::Integer(32), None);
+        h.set("ZNAXIS", HeaderValue::Integer(1), None);
+        h.set("ZNAXIS1", HeaderValue::Integer(4), None);
+        h.set("ZTILE1", HeaderValue::Integer(2), None);
+        // BYTEPIX = 4 for i32 RICE.
+        h.set("ZNAME1", HeaderValue::String("BYTEPIX".into()), None);
+        h.set("ZVAL1", HeaderValue::Integer(4), None);
+
+        let cimg = CompressedImage::from_bintable(&h, &table).unwrap();
+        assert_eq!(cimg.compression(), CompressionType::Rice1);
+        assert_eq!(cimg.geometry().num_tiles(), 2);
+
+        let img = cimg.decompress().unwrap();
+        assert_eq!(img.axes, vec![4]);
+        match img.pixels {
+            crate::image_data::PixelData::I32(v) => {
+                assert_eq!(v, vec![1000, 1003, 50, 47]);
+            }
+            other => panic!("expected I32, got {other:?}"),
+        }
+    }
+
+    #[cfg(feature = "gzip")]
+    #[test]
+    fn end_to_end_gzip1_i16_one_tile() {
+        use crate::bintable::{BinColumnType, BinTableBuilder};
+
+        // 4x1 i16 image, one row tile. GZIP_1 stores raw big-endian image integers.
+        let vals: [i16; 4] = [100, -200, 300, -400];
+        let mut raw = Vec::new();
+        for v in vals {
+            raw.extend_from_slice(&v.to_be_bytes());
+        }
+        // Wrap raw bytes as a gzip member (header + raw DEFLATE + trailer).
+        let body = miniz_oxide::deflate::compress_to_vec(&raw, 6);
+        let mut gz = vec![0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 0xff];
+        gz.extend_from_slice(&body);
+        let crc = crc32(&raw);
+        gz.extend_from_slice(&crc.to_le_bytes());
+        gz.extend_from_slice(&(raw.len() as u32).to_le_bytes());
+
+        let table = BinTableBuilder::new()
+            .add_column("COMPRESSED_DATA", BinColumnType::VarP('B'))
+            .push_row(|r| r.write_var_p(gz.len() as i32, |heap| heap.extend_from_slice(&gz)))
+            .build();
+
+        let mut h = Header::new();
+        h.set("ZIMAGE", HeaderValue::Logical(true), None);
+        h.set("ZCMPTYPE", HeaderValue::String("GZIP_1".into()), None);
+        h.set("ZBITPIX", HeaderValue::Integer(16), None);
+        h.set("ZNAXIS", HeaderValue::Integer(1), None);
+        h.set("ZNAXIS1", HeaderValue::Integer(4), None);
+
+        let cimg = CompressedImage::from_bintable(&h, &table).unwrap();
+        let img = cimg.decompress().unwrap();
+        match img.pixels {
+            crate::image_data::PixelData::I16(v) => assert_eq!(v, vec![100, -200, 300, -400]),
+            other => panic!("expected I16, got {other:?}"),
+        }
+    }
+
+    /// Minimal CRC32 (gzip/PNG polynomial) for building gzip test fixtures.
+    #[cfg(feature = "gzip")]
+    fn crc32(data: &[u8]) -> u32 {
+        let mut crc: u32 = 0xffff_ffff;
+        for &byte in data {
+            crc ^= byte as u32;
+            for _ in 0..8 {
+                let mask = (crc & 1).wrapping_neg();
+                crc = (crc >> 1) ^ (0xedb8_8320 & mask);
+            }
+        }
+        !crc
+    }
+
+    #[cfg(not(feature = "gzip"))]
+    #[test]
+    fn gzip_disabled_errors() {
+        assert!(matches!(
+            gzip_inflate(&[0x1f, 0x8b]),
+            Err(Error::UnsupportedCompression(_))
+        ));
+    }
+
+    #[test]
+    fn float_zbitpix_is_unsupported() {
+        use crate::bintable::{BinColumnType, BinTableBuilder};
+        let table = BinTableBuilder::new()
+            .add_column("COMPRESSED_DATA", BinColumnType::VarP('B'))
+            .push_row(|r| r.write_var_p(0, |_| {}))
+            .build();
+        let mut h = Header::new();
+        h.set("ZIMAGE", HeaderValue::Logical(true), None);
+        h.set("ZCMPTYPE", HeaderValue::String("RICE_1".into()), None);
+        h.set("ZBITPIX", HeaderValue::Integer(-32), None);
+        h.set("ZNAXIS", HeaderValue::Integer(1), None);
+        h.set("ZNAXIS1", HeaderValue::Integer(2), None);
+        let cimg = CompressedImage::from_bintable(&h, &table).unwrap();
+        assert!(matches!(
+            cimg.decompress(),
+            Err(Error::UnsupportedCompression(_))
+        ));
+    }
+}
diff --git a/src/types.rs b/src/types.rs
index a07e163..a593d0f 100644
--- a/src/types.rs
+++ b/src/types.rs
@@ -62,7 +62,14 @@ mod tests {
 
     #[test]
     fn bitpix_round_trip() {
-        for bp in [Bitpix::U8, Bitpix::I16, Bitpix::I32, Bitpix::I64, Bitpix::F32, Bitpix::F64] {
+        for bp in [
+            Bitpix::U8,
+            Bitpix::I16,
+            Bitpix::I32,
+            Bitpix::I64,
+            Bitpix::F32,
+            Bitpix::F64,
+        ] {
             assert_eq!(Bitpix::from_i64(bp.to_i64()).unwrap(), bp);
         }
     }
diff --git a/tests/checksum.rs b/tests/checksum.rs
index 5057cb2..b9fef5b 100644
--- a/tests/checksum.rs
+++ b/tests/checksum.rs
@@ -1,5 +1,5 @@
-use fits4::*;
 use fits4::checksum;
+use fits4::*;
 
 #[test]
 fn write_with_checksum_and_verify() {
diff --git a/tests/image_conv.rs b/tests/image_conv.rs
index 44282b9..f751911 100644
--- a/tests/image_conv.rs
+++ b/tests/image_conv.rs
@@ -64,10 +64,7 @@ fn u16_full_range() {
 
 #[test]
 fn f32_normalizes_to_u16() {
-    let img_data = ImageData::new(
-        vec![4],
-        PixelData::F32(vec![0.0, 1.0, 0.5, 0.25]),
-    );
+    let img_data = ImageData::new(vec![4], PixelData::F32(vec![0.0, 1.0, 0.5, 0.25]));
 
     let back = img_data.to_dynamic_image(1.0, 0.0).unwrap();
     match back {
diff --git a/tests/round_trip.rs b/tests/round_trip.rs
index 58855de..6dee120 100644
--- a/tests/round_trip.rs
+++ b/tests/round_trip.rs
@@ -228,7 +228,7 @@ fn round_trip_bscale_bzero() {
 
 #[test]
 fn round_trip_bintable() {
-    use fits4::bintable::{BinColumn, BinColumnType, BinCellValue};
+    use fits4::bintable::{BinCellValue, BinColumn, BinColumnType};
 
     // Build a simple bintable with 2 rows, 3 columns: i32, f64, 8-char string
     let col_j = BinColumn {
@@ -325,7 +325,7 @@ fn round_trip_bintable() {
 
 #[test]
 fn round_trip_bintable_vla() {
-    use fits4::bintable::{BinColumn, BinColumnType, BinCellValue};
+    use fits4::bintable::{BinCellValue, BinColumn, BinColumnType};
 
     // Variable-length array column using P descriptor
     let col = BinColumn {
@@ -454,13 +454,17 @@ fn multiple_extensions() {
     // Add image extension
     let img = ImageData::new(vec![2], PixelData::F32(vec![1.0, 2.0]));
     let mut img_hdu = Hdu::image_extension(img);
-    img_hdu.header.set("EXTNAME", HeaderValue::String("MYIMAGE".into()), None);
+    img_hdu
+        .header
+        .set("EXTNAME", HeaderValue::String("MYIMAGE".into()), None);
     fits.push_extension(img_hdu);
 
     // Add another image extension
     let img2 = ImageData::new(vec![3], PixelData::U8(vec![10, 20, 30]));
     let mut img_hdu2 = Hdu::image_extension(img2);
-    img_hdu2.header.set("EXTNAME", HeaderValue::String("OTHER".into()), None);
+    img_hdu2
+        .header
+        .set("EXTNAME", HeaderValue::String("OTHER".into()), None);
     fits.push_extension(img_hdu2);
 
     let bytes = fits.to_bytes().unwrap();
@@ -488,21 +492,33 @@ fn block_alignment() {
 #[test]
 fn header_keyword_preservation() {
     let mut fits = FitsFile::with_empty_primary();
-    fits.primary_mut().header.push(Keyword::commentary("COMMENT", "Test comment line"));
-    fits.primary_mut().header.push(Keyword::commentary("HISTORY", "Created by fits4 tests"));
-    fits.primary_mut().header.set("AUTHOR", HeaderValue::String("test".into()), None);
+    fits.primary_mut()
+        .header
+        .push(Keyword::commentary("COMMENT", "Test comment line"));
+    fits.primary_mut()
+        .header
+        .push(Keyword::commentary("HISTORY", "Created by fits4 tests"));
+    fits.primary_mut()
+        .header
+        .set("AUTHOR", HeaderValue::String("test".into()), None);
 
     let bytes = fits.to_bytes().unwrap();
     let fits2 = FitsFile::from_bytes(&bytes).unwrap();
 
     assert_eq!(fits2.primary().header.get_string("AUTHOR"), Some("test"));
 
-    let comments: Vec<_> = fits2.primary().header.iter()
+    let comments: Vec<_> = fits2
+        .primary()
+        .header
+        .iter()
         .filter(|k| k.name == "COMMENT")
         .collect();
     assert_eq!(comments.len(), 1);
 
-    let history: Vec<_> = fits2.primary().header.iter()
+    let history: Vec<_> = fits2
+        .primary()
+        .header
+        .iter()
         .filter(|k| k.name == "HISTORY")
         .collect();
     assert_eq!(history.len(), 1);
diff --git a/tests/sample_files.rs b/tests/sample_files.rs
index b561ead..981877c 100644
--- a/tests/sample_files.rs
+++ b/tests/sample_files.rs
@@ -30,7 +30,11 @@ fn read_euv_image() {
     assert_eq!(primary.header.get_int("NAXIS"), Some(0));
     assert!(matches!(primary.data, HduData::Empty));
 
-    assert!(fits.len() > 1, "expected extensions, got {} HDUs", fits.len());
+    assert!(
+        fits.len() > 1,
+        "expected extensions, got {} HDUs",
+        fits.len()
+    );
 
     let mut image_count = 0;
     let mut bintable_count = 0;
@@ -188,3 +192,106 @@ fn all_sample_files_readable() {
         );
     }
 }
+
+// --- Tiled-image compression: RICE_1 / GZIP_1 round-trip vs originals --------
+//
+// Each compressed `.fz` fixture has the SAME HDU ordering as its uncompressed
+// source: every `HduData::Image` HDU in the source appears, in order, as a
+// compressed-image `HduData::BinTable` (ZIMAGE=T) in the `.fz`. We decompress
+// each and assert byte-exact equality of the pixel buffers.
+//
+// These are guarded per-fixture with `Path::exists()` so they skip cleanly on a
+// machine without the (gitignored, GCS-hosted) fixtures.
+
+/// Assert that decompressing every compressed-image HDU in `fz_name` reproduces,
+/// byte-for-byte, the corresponding `HduData::Image` HDUs of `src_name`.
+fn assert_rice_roundtrip(src_name: &str, fz_name: &str) {
+    let fz_path = samp(fz_name);
+    if !fz_path.exists() {
+        eprintln!("skipping: fixture {fz_name} not present");
+        return;
+    }
+    let src = FitsFile::from_file(samp(src_name)).expect("read source");
+    let fz = FitsFile::from_file(&fz_path).expect("read .fz");
+
+    // Source image HDUs, in order.
+    let src_images: Vec<&ImageData> = src
+        .hdus
+        .iter()
+        .filter_map(|h| match &h.data {
+            HduData::Image(im) if !im.pixels.to_bytes().is_empty() => Some(im),
+            _ => None,
+        })
+        .collect();
+
+    // Compressed-image HDUs in the .fz, in order.
+    let mut matched = 0usize;
+    let mut src_iter = src_images.iter();
+    for hdu in &fz.hdus {
+        if let Some(cimg) = hdu.as_compressed_image() {
+            let orig = src_iter
+                .next()
+                .expect("more compressed images than source images");
+            let dec = cimg.decompress().expect("decompress");
+            assert_eq!(
+                dec.axes, orig.axes,
+                "{fz_name}: axes mismatch on compressed HDU #{matched}"
+            );
+            assert_eq!(
+                dec.pixels.to_bytes(),
+                orig.pixels.to_bytes(),
+                "{fz_name}: pixel bytes differ on compressed HDU #{matched}"
+            );
+            matched += 1;
+        }
+    }
+    assert!(matched > 0, "{fz_name}: found no compressed-image HDUs");
+}
+
+#[test]
+fn rice_roundtrip_euv_row_tiled() {
+    // 512x512 I16, RICE row tiling (512x1 = one tile per row).
+    assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.rice.fits.fz");
+}
+
+#[test]
+fn rice_roundtrip_euv_square_tiled() {
+    // Same source, forced 100x100 tiles: exercises 2-D edge-truncated tiling.
+    assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.rice_t100.fits.fz");
+}
+
+#[test]
+fn rice_roundtrip_fgs_i32() {
+    // 89688x7 I32 image.
+    assert_rice_roundtrip("FGSf64y0106m_a1f.fits", "FGSf64y0106m_a1f.rice.fits.fz");
+}
+
+/// Float (`ZBITPIX < 0`) RICE fixtures are out of scope: decoding must return the
+/// documented `Err(UnsupportedCompression)` rather than wrong pixels.
+#[test]
+fn rice_float_fixture_is_unsupported() {
+    let fz_path = samp("FOCx38i0101t_c0f.rice_nodith.fits.fz");
+    if !fz_path.exists() {
+        eprintln!("skipping: fixture not present");
+        return;
+    }
+    let fz = FitsFile::from_file(&fz_path).expect("read .fz");
+    let mut checked = false;
+    for hdu in &fz.hdus {
+        if let Some(cimg) = hdu.as_compressed_image() {
+            assert!(
+                matches!(cimg.decompress(), Err(Error::UnsupportedCompression(_))),
+                "float compressed image should be unsupported"
+            );
+            checked = true;
+        }
+    }
+    assert!(checked, "expected a compressed-image HDU");
+}
+
+#[cfg(feature = "gzip")]
+#[test]
+fn gzip1_roundtrip_euv() {
+    // GZIP_1 stores raw big-endian image integers per tile; needs the `gzip` feature.
+    assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.gzip1.fits.fz");
+}

From b30b3f263d2f07bc90b879355b23809daaa4ec9a Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 14:18:45 -0400
Subject: [PATCH 2/9] Add float tile decompression: quantization + subtractive
 dithering

Phase 3 of FITS tiled-image compression: decode floating-point
compressed images (RICE/GZIP-quantized) back to F32/F64.

- unquantize/decompress_float in tile_compress.rs: per-tile ZSCALE/ZZERO
  columns, ZQUANTIZ NO_DITHER / SUBTRACTIVE_DITHER_1 / SUBTRACTIVE_DITHER_2,
  ZBLANK -> NaN, DITHER_2 exact-zero preservation, and lossless-float
  (ZQUANTIZ=NONE) passthrough.
- Port cfitsio's dither RNG exactly: Park-Miller LCG (a=16807, m=2^31-1,
  seed 1) -> 10000 f32s via OnceLock; iseed = (tile + ZDITHER0 - 1) % 10000.
- Reconstruct with a fused multiply-add (f64::mul_add) to bit-match
  cfitsio/funpack; without the FMA, large-ZZERO cancellation pixels
  diverge by ~1 ULP.
- Tests: 6 float cases asserting bit-identity to the funpack CLI at test
  time, skip-if-absent on both fixture and binary so CI stays green
  without cfitsio. Added a SUBTRACTIVE_DITHER_2 fixture (fpack -qz5).
- CI: float fixtures added to best-effort download, cache key v2 -> v3,
  clippy/test switched to --all-features.

HCOMPRESS_1, PLIO_1, and the write path remain TODO.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .github/workflows/ci.yml           |  18 +-
 scripts/gen_compressed_fixtures.sh |   7 +-
 src/tile_compress.rs               | 425 +++++++++++++++++++++++++++--
 tests/sample_files.rs              | 167 +++++++++++-
 4 files changed, 583 insertions(+), 34 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index ace4be0..3df1082 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -26,7 +26,7 @@ jobs:
         uses: actions/cache@v4
         with:
           path: samp
-          key: fits4-samples-v2
+          key: fits4-samples-v3
 
       - name: Download sample FITS files
         if: steps.cache-samples.outputs.cache-hit != 'true'
@@ -50,12 +50,21 @@ jobs:
             FGSf64y0106m_a1f.rice.fits.fz \
             EUVEngc4151imgx.gzip1.fits.fz \
             EUVEngc4151imgx.gzip2.fits.fz \
-            FOCx38i0101t_c0f.rice_nodith.fits.fz; do
+            FOCx38i0101t_c0f.rice_nodith.fits.fz \
+            FOCx38i0101t_c0f.rice_dith.fits.fz \
+            FOCx38i0101t_c0f.rice_dith2.fits.fz \
+            FOCx38i0101t_c0f.gzip_lossless.fits.fz \
+            WFPC2u5780205r_c0fx.rice_dith.fits.fz \
+            WFPC2u5780205r_c0fx.rice_nodith.fits.fz; do
             curl -sSfL "$BASE/$f" -o "samp/$f" || true
           done
 
+      # The float tile-compression tests validate against the `funpack` CLI
+      # (cfitsio). It is not installed here, so those tests skip cleanly; the
+      # integer-codec round-trip tests still run against the downloaded fixtures.
+
       - name: Clippy
-        run: cargo clippy --all-targets --features image -- -D warnings
+        run: cargo clippy --all-targets --all-features -- -D warnings
 
       - name: Test
         run: cargo test
@@ -63,5 +72,8 @@ jobs:
       - name: Test (with image feature)
         run: cargo test --features image
 
+      - name: Test (all features)
+        run: cargo test --all-features
+
       - name: Doc
         run: cargo doc --no-deps --features image
diff --git a/scripts/gen_compressed_fixtures.sh b/scripts/gen_compressed_fixtures.sh
index c6bdbd2..9e56922 100755
--- a/scripts/gen_compressed_fixtures.sh
+++ b/scripts/gen_compressed_fixtures.sh
@@ -105,9 +105,10 @@ compress FGSf64y0106m_a1f.fits    gzip1  -g
 # subtractive dithering; `-q0 <lvl>` (no space) quantizes with NO dithering
 # (ZQUANTIZ=NO_DITHER); lossless float (`-q 0`, level zero) is GZIP-only.
 # FOCx38i0101t_c0f.fits is F32 1024x1024.
-compress FOCx38i0101t_c0f.fits    rice_dith     -r -q 16    # quantize + subtractive dither
-compress FOCx38i0101t_c0f.fits    rice_nodith   -r -q0 16   # quantize, no dithering
-compress FOCx38i0101t_c0f.fits    gzip_lossless -g -q 0     # lossless float (no quantization)
+compress FOCx38i0101t_c0f.fits    rice_dith     -r -q 16     # quantize + SUBTRACTIVE_DITHER_1
+compress FOCx38i0101t_c0f.fits    rice_dith2    -r -qz5 16   # SUBTRACTIVE_DITHER_2, fixed seed 5
+compress FOCx38i0101t_c0f.fits    rice_nodith   -r -q0 16    # quantize, NO_DITHER
+compress FOCx38i0101t_c0f.fits    gzip_lossless -g -q 0      # lossless float (no quantization)
 compress FOCx38i0101t_c0f.fits    hcomp         -h
 
 # WFPC2u5780205r_c0fx.fits is an F32 200x200x4 cube (multi-axis tiling).
diff --git a/src/tile_compress.rs b/src/tile_compress.rs
index ac5720b..94790ac 100644
--- a/src/tile_compress.rs
+++ b/src/tile_compress.rs
@@ -74,7 +74,10 @@ impl Quantize {
         match s.map(|v| v.trim().to_ascii_uppercase()) {
             None => Ok(Quantize::None),
             Some(v) => match v.as_str() {
-                "NO_DITHER" => Ok(Quantize::None),
+                // `NONE` is written by cfitsio for losslessly-stored float tiles
+                // (GZIP of the raw floats, no quantization) and is treated like the
+                // absent case here.
+                "NO_DITHER" | "NONE" => Ok(Quantize::None),
                 "SUBTRACTIVE_DITHER_1" => Ok(Quantize::SubtractiveDither1),
                 "SUBTRACTIVE_DITHER_2" => Ok(Quantize::SubtractiveDither2),
                 other => Err(Error::UnsupportedCompression(format!(
@@ -238,10 +241,13 @@ impl<'a> CompressedImage<'a> {
     /// per the tile grid (axis-1 fastest). `ZBLANK` integer sentinels are carried
     /// through unchanged into the output array.
     ///
+    /// Float `ZBITPIX` (-32/-64) images are decoded via [`Self::decompress_float`],
+    /// which inverts the per-tile `ZSCALE`/`ZZERO` quantization and reverses cfitsio's
+    /// subtractive dithering (`ZQUANTIZ` = `SUBTRACTIVE_DITHER_1`/`_2`); `ZBLANK`
+    /// sentinels map to `NaN`.
+    ///
     /// # Not yet supported (see `COMPRESSION_PLAN.md`)
     ///
-    /// - Float `ZBITPIX` (-32/-64): requires quantization inverse (`ZSCALE`/`ZZERO`)
-    ///   and subtractive dithering — phase 3.
     /// - `PLIO_1` — phase 4.
     /// - `HCOMPRESS_1` — phase 5.
     /// - `GZIP_1`/`GZIP_2` without the `gzip` feature return
@@ -250,12 +256,11 @@ impl<'a> CompressedImage<'a> {
         use crate::image_data::{ImageData, PixelData};
         use crate::types::Bitpix;
 
-        // Float originals require quantization inverse + dithering (phase 3).
+        // Float originals (ZBITPIX = -32 / -64) take a separate path: each tile holds
+        // either quantized integers (with per-tile ZSCALE/ZZERO) or, when quantization
+        // was not applied, the raw floats. See `decompress_float`.
         if self.zbitpix < 0 {
-            return Err(Error::UnsupportedCompression(format!(
-                "float ZBITPIX={} (quantization/dithering decode is phase 3, see COMPRESSION_PLAN.md)",
-                self.zbitpix
-            )));
+            return self.decompress_float();
         }
 
         let bitpix = Bitpix::from_i64(self.zbitpix)?;
@@ -312,6 +317,231 @@ impl<'a> CompressedImage<'a> {
         Ok(ImageData::new(self.geometry.znaxis.clone(), pixels))
     }
 
+    /// Decompress a float (`ZBITPIX = -32` / `-64`) tile-compressed image.
+    ///
+    /// Float images are normally stored as per-tile linearly-quantized 32-bit
+    /// integers plus per-tile `ZSCALE`/`ZZERO` scale/offset columns; the integers are
+    /// reconstructed exactly like the integer path, then *unquantized* back to floats.
+    /// cfitsio reverses the subtractive-dithering it applied at compress time using a
+    /// fixed pseudo-random sequence (see [`fits_rand_value`]).
+    ///
+    /// A tile that could not be quantized (e.g. one containing only NaNs, or when
+    /// lossless `GZIP` of the raw floats was requested) is stored instead as the raw
+    /// big-endian float bytes — either in a `GZIP_COMPRESSED_DATA`/`UNCOMPRESSED_DATA`
+    /// fallback column for that single tile, or, for a wholly-lossless image, in
+    /// `COMPRESSED_DATA` with no `ZSCALE`/`ZZERO` columns at all. Such tiles are passed
+    /// through verbatim.
+    fn decompress_float(&self) -> Result<crate::image_data::ImageData> {
+        use crate::image_data::{ImageData, PixelData};
+
+        // -32 -> 4-byte floats, -64 -> 8-byte floats.
+        let is_f64 = self.zbitpix == -64;
+        let npix: usize = self.geometry.znaxis.iter().product();
+
+        let tiles_per_axis = self.geometry.tiles_per_axis();
+        let num_tiles = self.geometry.num_tiles();
+        if self.table.nrows < num_tiles {
+            return Err(Error::CompressionError(format!(
+                "compressed image expects {} tiles but BINTABLE has {} rows",
+                num_tiles, self.table.nrows
+            )));
+        }
+
+        let cdata_col = self.compressed_data_column()?;
+        let gzip_fallback_col = self.find_column("GZIP_COMPRESSED_DATA");
+        let uncompressed_col = self.find_column("UNCOMPRESSED_DATA");
+        let zscale_col = self.find_column("ZSCALE");
+        let zzero_col = self.find_column("ZZERO");
+        let zblank_col = self.find_column("ZBLANK");
+
+        // Reconstructed floats, axis-1 fastest, scattered tile-by-tile. We scatter via
+        // the integer `scatter_tile` over a bit-reinterpreted buffer so the existing
+        // (well-tested) geometry code is reused.
+        let mut full_bits = vec![0i64; npix];
+
+        for tile_index in 0..num_tiles {
+            let coords = unravel(tile_index, &tiles_per_axis);
+            let tile_dims = self.tile_dims_at(&coords);
+            let tile_npix: usize = tile_dims.iter().product();
+
+            // Per-tile scale/zero (D columns). A sentinel/absent value marks an
+            // unquantized (raw-float) tile.
+            let zscale = zscale_col.and_then(|c| self.tile_double(tile_index, c));
+            let zzero = zzero_col.and_then(|c| self.tile_double(tile_index, c));
+            // Per-tile blank sentinel (J column) overrides the ZBLANK keyword.
+            let blank = zblank_col
+                .and_then(|c| self.tile_int(tile_index, c))
+                .or(self.blank);
+
+            let cdata = self.tile_bytes(tile_index, cdata_col)?;
+
+            let tile_floats: Vec<f64> = match (zscale, zzero) {
+                // Quantized tile: COMPRESSED_DATA holds quantized integers.
+                (Some(scale), Some(zero)) if !cdata.is_empty() && is_quantized(scale) => {
+                    let q = self.decode_tile(&cdata, tile_npix)?;
+                    self.unquantize(&q, tile_index, scale, zero, blank, tile_npix)
+                }
+                // Unquantized tile (or whole image is lossless): raw floats live in
+                // COMPRESSED_DATA, or in a per-tile fallback column when COMPRESSED_DATA
+                // is empty.
+                _ => {
+                    let (bytes, gzipped) = if !cdata.is_empty() {
+                        // For a lossless GZIP image the raw floats are gzip-compressed
+                        // here; for NOCOMPRESS they are verbatim.
+                        (cdata, self.ctype != CompressionType::NoCompress)
+                    } else if let Some(b) =
+                        gzip_fallback_col.and_then(|c| self.tile_bytes(tile_index, c).ok())
+                    {
+                        (b, true)
+                    } else if let Some(b) =
+                        uncompressed_col.and_then(|c| self.tile_bytes(tile_index, c).ok())
+                    {
+                        (b, false)
+                    } else {
+                        return Err(Error::CompressionError(format!(
+                            "float tile {tile_index} has no quantization scale and no raw-float fallback data"
+                        )));
+                    };
+                    let raw = if gzipped { gzip_inflate(&bytes)? } else { bytes };
+                    raw_floats(&raw, tile_npix, is_f64)?
+                }
+            };
+
+            if tile_floats.len() != tile_npix {
+                return Err(Error::CompressionError(format!(
+                    "float tile {tile_index} produced {} values, expected {tile_npix}",
+                    tile_floats.len()
+                )));
+            }
+
+            // Reinterpret each float's bits as an integer so we can reuse scatter_tile,
+            // then convert back below. (f32 bits zero-extended into i64.)
+            let bits: Vec<i64> = if is_f64 {
+                tile_floats.iter().map(|&v| v.to_bits() as i64).collect()
+            } else {
+                tile_floats
+                    .iter()
+                    .map(|&v| (v as f32).to_bits() as i64)
+                    .collect()
+            };
+            scatter_tile(
+                &mut full_bits,
+                &self.geometry.znaxis,
+                &self.geometry.ztile,
+                &tile_dims,
+                &coords,
+                &bits,
+            );
+        }
+
+        let pixels = if is_f64 {
+            PixelData::F64(full_bits.iter().map(|&b| f64::from_bits(b as u64)).collect())
+        } else {
+            PixelData::F32(
+                full_bits
+                    .iter()
+                    .map(|&b| f32::from_bits(b as u32))
+                    .collect(),
+            )
+        };
+
+        Ok(ImageData::new(self.geometry.znaxis.clone(), pixels))
+    }
+
+    /// Inverse linear quantization with cfitsio's subtractive-dithering reversal.
+    ///
+    /// For each quantized integer `q[i]` of a tile:
+    /// - `q == blank` (ZBLANK / per-tile null sentinel) ⇒ `NaN`.
+    /// - `SUBTRACTIVE_DITHER_2` and `q == ZERO_VALUE (-2147483646)` ⇒ exactly `0.0`.
+    /// - otherwise `value = (q - r + 0.5) * scale + zero`, where `r` is the next value
+    ///   of the fixed pseudo-random sequence ([`fits_rand_value`]); for the
+    ///   non-dithered methods `r` is taken as `0.0` (so `value = q*scale + zero`,
+    ///   matching cfitsio's `fffi4r4`).
+    ///
+    /// The random-sequence indexing mirrors cfitsio `unquantize_i4r4`:
+    /// `iseed = (tile_index + ZDITHER0 - 1) mod N_RANDOM`,
+    /// `nextrand = (int)(fits_rand_value[iseed] * 500)`, advancing `nextrand` per pixel
+    /// and, on reaching `N_RANDOM`, bumping `iseed` (wrapping) and re-deriving
+    /// `nextrand`.
+    fn unquantize(
+        &self,
+        q: &[i64],
+        tile_index: usize,
+        scale: f64,
+        zero: f64,
+        blank: Option<i64>,
+        tile_npix: usize,
+    ) -> Vec<f64> {
+        let dithered = matches!(
+            self.quantize,
+            Quantize::SubtractiveDither1 | Quantize::SubtractiveDither2
+        );
+        let dither2 = self.quantize == Quantize::SubtractiveDither2;
+
+        // ZDITHER0 defaults to 1 when absent (cfitsio fits_read_compressed_img).
+        let zdither0 = self.zdither0.unwrap_or(1);
+        // cfitsio passes row = tile_index_1based + zdither0 - 1, then iseed = (row-1) % N.
+        let mut iseed = ((tile_index as i64 + zdither0 - 1).rem_euclid(N_RANDOM as i64)) as usize;
+        let mut nextrand = (fits_rand_value(iseed) * 500.0) as usize;
+
+        let mut out = Vec::with_capacity(tile_npix);
+        for &qi in q.iter().take(tile_npix) {
+            // cfitsio computes `x * scale + zero` as a single fused multiply-add
+            // (the C compiler contracts the expression), which we must reproduce
+            // exactly via `mul_add` to be bit-identical in catastrophic-cancellation
+            // cases (large `zero` offsets).
+            let value = if blank.is_some_and(|b| qi == b) {
+                f64::NAN
+            } else if dither2 && qi == ZERO_VALUE {
+                0.0
+            } else if dithered {
+                ((qi as f64) - fits_rand_value(nextrand) as f64 + 0.5).mul_add(scale, zero)
+            } else {
+                (qi as f64).mul_add(scale, zero)
+            };
+            out.push(value);
+
+            if dithered {
+                nextrand += 1;
+                if nextrand == N_RANDOM {
+                    iseed += 1;
+                    if iseed == N_RANDOM {
+                        iseed = 0;
+                    }
+                    nextrand = (fits_rand_value(iseed) * 500.0) as usize;
+                }
+            }
+        }
+        out
+    }
+
+    /// Index of a column by (case-insensitive, trimmed) `TTYPE` name, if present.
+    fn find_column(&self, name: &str) -> Option<usize> {
+        self.table
+            .columns
+            .iter()
+            .position(|c| c.name.trim().eq_ignore_ascii_case(name))
+    }
+
+    /// Read a single `D`/`E` scalar cell as `f64` (per-tile ZSCALE/ZZERO).
+    fn tile_double(&self, row: usize, col: usize) -> Option<f64> {
+        match self.table.get_cell(row, col).ok()? {
+            crate::bintable::BinCellValue::F64(v) => v.first().copied(),
+            crate::bintable::BinCellValue::F32(v) => v.first().map(|&x| x as f64),
+            _ => None,
+        }
+    }
+
+    /// Read a single integer scalar cell as `i64` (per-tile ZBLANK).
+    fn tile_int(&self, row: usize, col: usize) -> Option<i64> {
+        match self.table.get_cell(row, col).ok()? {
+            crate::bintable::BinCellValue::I32(v) => v.first().map(|&x| x as i64),
+            crate::bintable::BinCellValue::I64(v) => v.first().copied(),
+            crate::bintable::BinCellValue::I16(v) => v.first().map(|&x| x as i64),
+            _ => None,
+        }
+    }
+
     /// Resolve the index of the `COMPRESSED_DATA` column.
     fn compressed_data_column(&self) -> Result<usize> {
         self.table
@@ -480,6 +710,82 @@ fn scatter_tile(
     }
 }
 
+// ---------------------------------------------------------------------------
+// Float quantization / subtractive-dithering reconstruction
+// ---------------------------------------------------------------------------
+//
+// Ported from cfitsio (`imcompress.c` / `quantize.c`, R. White & W. Pence, STScI).
+// Quantized float tiles store each pixel as an i32 produced by
+// `round(value/scale - zero/scale + dither)`; we invert with the same fixed
+// pseudo-random `fits_rand_value` table that cfitsio uses.
+
+/// Length of cfitsio's fixed pseudo-random sequence (`N_RANDOM`). Do not change.
+const N_RANDOM: usize = 10000;
+
+/// Sentinel quantized value flagging an undefined (NaN) pixel (cfitsio `NULL_VALUE`).
+/// Carried here for documentation; the actual null sentinel is taken from
+/// `ZBLANK` (keyword or per-tile column), which equals this for cfitsio-written files.
+#[allow(dead_code)]
+const NULL_VALUE: i64 = -2_147_483_647;
+
+/// Sentinel quantized value flagging an exact-zero pixel under
+/// `SUBTRACTIVE_DITHER_2` (cfitsio `ZERO_VALUE`).
+const ZERO_VALUE: i64 = -2_147_483_646;
+
+/// True if a per-tile `ZSCALE` indicates a quantized tile. cfitsio treats a zero
+/// scale (its `cn_zscale == 0` / absent-scale default) as "not quantized".
+fn is_quantized(zscale: f64) -> bool {
+    zscale != 0.0
+}
+
+/// Reinterpret big-endian bytes as `n` floats (f32 if `!is_f64`, else f64), widened
+/// to `f64` for uniform downstream handling. Used for unquantized (raw-float) tiles.
+fn raw_floats(bytes: &[u8], n: usize, is_f64: bool) -> Result<Vec<f64>> {
+    let width = if is_f64 { 8 } else { 4 };
+    if bytes.len() < n * width {
+        return Err(Error::CompressionError(format!(
+            "raw float tile has {} bytes, expected at least {}",
+            bytes.len(),
+            n * width
+        )));
+    }
+    let mut out = Vec::with_capacity(n);
+    for c in bytes.chunks_exact(width).take(n) {
+        let v = if is_f64 {
+            f64::from_be_bytes(c.try_into().unwrap())
+        } else {
+            f32::from_be_bytes([c[0], c[1], c[2], c[3]]) as f64
+        };
+        out.push(v);
+    }
+    Ok(out)
+}
+
+/// The `ii`-th element of cfitsio's fixed pseudo-random sequence.
+///
+/// Generated once (lazily) by the Park–Miller minimal-standard LCG
+/// (`a = 16807`, `m = 2^31 - 1`, seed 1) exactly as cfitsio's `fits_init_randoms`:
+/// `seed_{k+1} = (a*seed_k) mod m`, value = `seed / m`. The 10000-element table is
+/// validated against cfitsio's published checkpoint (final seed `1043618065`).
+fn fits_rand_value(ii: usize) -> f32 {
+    use std::sync::OnceLock;
+    static TABLE: OnceLock<Vec<f32>> = OnceLock::new();
+    let table = TABLE.get_or_init(|| {
+        let a = 16807.0f64;
+        let m = 2_147_483_647.0f64;
+        let mut seed = 1.0f64;
+        let mut v = Vec::with_capacity(N_RANDOM);
+        for _ in 0..N_RANDOM {
+            let temp = a * seed;
+            // cfitsio: seed = temp - m * (int)(temp / m)
+            seed = temp - m * ((temp / m) as i64 as f64);
+            v.push((seed / m) as f32);
+        }
+        v
+    });
+    table[ii % N_RANDOM]
+}
+
 /// Count the contiguous `ZNAMEi`/`ZVALi` parameter pairs present in the header.
 fn count_zname_params(header: &Header) -> usize {
     let mut n = 0;
@@ -941,9 +1247,8 @@ mod tests {
         // tile dims: (0,0)=2x2, (1,0)=1x2, (0,1)=2x1, (1,1)=1x1
         let tile_dims = [vec![2, 2], vec![1, 2], vec![2, 1], vec![1, 1]];
         let mut full = vec![-1i64; 9];
-        for tile in 0..4 {
+        for (tile, td) in tile_dims.iter().enumerate() {
             let coords = unravel(tile, &tiles_per_axis);
-            let td = &tile_dims[tile];
             let n: usize = td.iter().product();
             let vals: Vec<i64> = (0..n).map(|l| (tile as i64) * 100 + l as i64).collect();
             scatter_tile(&mut full, &image_dims, &ztile, td, &coords, &vals);
@@ -1088,22 +1393,110 @@ mod tests {
     }
 
     #[test]
-    fn float_zbitpix_is_unsupported() {
+    fn fits_rand_value_matches_cfitsio_checkpoint() {
+        // cfitsio validates fits_init_randoms by asserting the final LCG seed is
+        // 1043618065 after 10000 iterations. Reproduce that seed and check it, which
+        // exercises the exact same recurrence our table generator uses.
+        let a = 16807.0f64;
+        let m = 2_147_483_647.0f64;
+        let mut seed = 1.0f64;
+        for _ in 0..N_RANDOM {
+            let temp = a * seed;
+            seed = temp - m * ((temp / m) as i64 as f64);
+        }
+        assert_eq!(seed as i64, 1_043_618_065);
+        // All table values lie in [0, 1).
+        for i in [0usize, 1, 499, 5000, N_RANDOM - 1] {
+            let v = fits_rand_value(i);
+            assert!((0.0..1.0).contains(&v), "rand[{i}] = {v} out of range");
+        }
+    }
+
+    #[test]
+    fn nodither_float_constant_tile() {
+        // A NO_DITHER quantized float tile: value = q*scale + zero (no +0.5, no dither).
         use crate::bintable::{BinColumnType, BinTableBuilder};
+
+        // 2-pixel image, single row tile, quantized ints [10, 20], scale 0.5, zero 3.0.
+        let q = [10i32, 20];
+        let enc = rice_encode_i32(&q, 2);
+
         let table = BinTableBuilder::new()
             .add_column("COMPRESSED_DATA", BinColumnType::VarP('B'))
-            .push_row(|r| r.write_var_p(0, |_| {}))
+            .add_column("ZSCALE", BinColumnType::D64(1))
+            .add_column("ZZERO", BinColumnType::D64(1))
+            .push_row(|r| {
+                r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc));
+                r.write_f64(0.5);
+                r.write_f64(3.0);
+            })
             .build();
+
         let mut h = Header::new();
         h.set("ZIMAGE", HeaderValue::Logical(true), None);
         h.set("ZCMPTYPE", HeaderValue::String("RICE_1".into()), None);
         h.set("ZBITPIX", HeaderValue::Integer(-32), None);
         h.set("ZNAXIS", HeaderValue::Integer(1), None);
         h.set("ZNAXIS1", HeaderValue::Integer(2), None);
+        h.set("ZQUANTIZ", HeaderValue::String("NO_DITHER".into()), None);
+        h.set("ZNAME1", HeaderValue::String("BYTEPIX".into()), None);
+        h.set("ZVAL1", HeaderValue::Integer(4), None);
+
         let cimg = CompressedImage::from_bintable(&h, &table).unwrap();
-        assert!(matches!(
-            cimg.decompress(),
-            Err(Error::UnsupportedCompression(_))
-        ));
+        let img = cimg.decompress().unwrap();
+        match img.pixels {
+            crate::image_data::PixelData::F32(v) => {
+                assert_eq!(v, vec![10.0 * 0.5 + 3.0, 20.0 * 0.5 + 3.0]);
+            }
+            other => panic!("expected F32, got {other:?}"),
+        }
+    }
+
+    #[test]
+    fn dither2_preserves_zero_and_blank_maps_to_nan() {
+        use crate::bintable::{BinColumnType, BinTableBuilder};
+
+        // 3-pixel tile: [ZERO_VALUE, NULL/blank, 5]. DITHER_2 => [0.0, NaN, dithered].
+        // Use NOCOMPRESS so the quantized ints are stored verbatim (RICE encoding of
+        // these extreme values is awkward and unrelated to what we're testing here).
+        let q = [ZERO_VALUE as i32, NULL_VALUE as i32, 5];
+        let mut enc = Vec::new();
+        for v in q {
+            enc.extend_from_slice(&v.to_be_bytes());
+        }
+
+        let table = BinTableBuilder::new()
+            .add_column("COMPRESSED_DATA", BinColumnType::VarP('B'))
+            .add_column("ZSCALE", BinColumnType::D64(1))
+            .add_column("ZZERO", BinColumnType::D64(1))
+            .push_row(|r| {
+                r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc));
+                r.write_f64(2.0);
+                r.write_f64(1.0);
+            })
+            .build();
+
+        let mut h = Header::new();
+        h.set("ZIMAGE", HeaderValue::Logical(true), None);
+        h.set("ZCMPTYPE", HeaderValue::String("NOCOMPRESS".into()), None);
+        h.set("ZBITPIX", HeaderValue::Integer(-32), None);
+        h.set("ZNAXIS", HeaderValue::Integer(1), None);
+        h.set("ZNAXIS1", HeaderValue::Integer(3), None);
+        h.set("ZQUANTIZ", HeaderValue::String("SUBTRACTIVE_DITHER_2".into()), None);
+        h.set("ZDITHER0", HeaderValue::Integer(5), None);
+        h.set("ZBLANK", HeaderValue::Integer(NULL_VALUE), None);
+        h.set("ZNAME1", HeaderValue::String("BYTEPIX".into()), None);
+        h.set("ZVAL1", HeaderValue::Integer(4), None);
+
+        let cimg = CompressedImage::from_bintable(&h, &table).unwrap();
+        let img = cimg.decompress().unwrap();
+        match img.pixels {
+            crate::image_data::PixelData::F32(v) => {
+                assert_eq!(v[0], 0.0); // ZERO_VALUE preserved exactly
+                assert!(v[1].is_nan()); // blank -> NaN
+                assert!(v[2].is_finite() && v[2] != 0.0);
+            }
+            other => panic!("expected F32, got {other:?}"),
+        }
     }
 }
diff --git a/tests/sample_files.rs b/tests/sample_files.rs
index 981877c..c883a0e 100644
--- a/tests/sample_files.rs
+++ b/tests/sample_files.rs
@@ -266,27 +266,170 @@ fn rice_roundtrip_fgs_i32() {
     assert_rice_roundtrip("FGSf64y0106m_a1f.fits", "FGSf64y0106m_a1f.rice.fits.fz");
 }
 
-/// Float (`ZBITPIX < 0`) RICE fixtures are out of scope: decoding must return the
-/// documented `Err(UnsupportedCompression)` rather than wrong pixels.
-#[test]
-fn rice_float_fixture_is_unsupported() {
-    let fz_path = samp("FOCx38i0101t_c0f.rice_nodith.fits.fz");
+// --- float tile-compression: bit-exact vs funpack -------------------------
+//
+// Quantized-float decompression is *lossy*, so the reconstructed floats will not
+// equal the original uncompressed sample. The authoritative oracle is funpack's own
+// reconstruction: fits4 and funpack must apply the identical dither table + per-tile
+// scaling, so fits4's output must be BIT-IDENTICAL to funpack's. We invoke the
+// `funpack` CLI at test time and compare; the test skips cleanly when either the
+// fixture or the `funpack` binary is absent (so CI without cfitsio stays green).
+
+/// True if a `funpack` binary is on PATH.
+fn funpack_available() -> bool {
+    std::process::Command::new("funpack")
+        .arg("-V")
+        .output()
+        .map(|o| o.status.success())
+        .unwrap_or(false)
+}
+
+/// Decompress `fz_name` with the `funpack` CLI into a temp `.fits` and read it back.
+fn funpack_reference(fz_name: &str) -> Option<FitsFile> {
+    let fz_path = samp(fz_name);
+    let mut out = std::env::temp_dir();
+    out.push(format!(
+        "fits4_funpack_ref_{}_{}.fits",
+        std::process::id(),
+        fz_name.replace(['/', '.'], "_")
+    ));
+    let _ = std::fs::remove_file(&out);
+    let status = std::process::Command::new("funpack")
+        .arg("-O")
+        .arg(&out)
+        .arg("-C") // don't update checksums
+        .arg(&fz_path)
+        .status()
+        .ok()?;
+    if !status.success() {
+        return None;
+    }
+    let f = FitsFile::from_file(&out).ok();
+    let _ = std::fs::remove_file(&out);
+    f
+}
+
+/// Assert that fits4's float decompression of every compressed-image HDU in `fz_name`
+/// is byte-exact against funpack's reconstruction of the same file.
+fn assert_float_matches_funpack(fz_name: &str) {
+    let fz_path = samp(fz_name);
     if !fz_path.exists() {
-        eprintln!("skipping: fixture not present");
+        eprintln!("skipping: fixture {fz_name} not present");
+        return;
+    }
+    if !funpack_available() {
+        eprintln!("skipping: funpack binary not available");
         return;
     }
+    let reference = match funpack_reference(fz_name) {
+        Some(r) => r,
+        None => {
+            eprintln!("skipping: funpack failed on {fz_name}");
+            return;
+        }
+    };
+
+    // funpack reference image HDUs (the reconstructed floats), in order.
+    let ref_images: Vec<&ImageData> = reference
+        .hdus
+        .iter()
+        .filter_map(|h| match &h.data {
+            HduData::Image(im) if !im.pixels.to_bytes().is_empty() => Some(im),
+            _ => None,
+        })
+        .collect();
+
     let fz = FitsFile::from_file(&fz_path).expect("read .fz");
-    let mut checked = false;
+    let mut ref_iter = ref_images.iter();
+    let mut matched = 0usize;
     for hdu in &fz.hdus {
         if let Some(cimg) = hdu.as_compressed_image() {
-            assert!(
-                matches!(cimg.decompress(), Err(Error::UnsupportedCompression(_))),
-                "float compressed image should be unsupported"
+            let reference_img = ref_iter
+                .next()
+                .expect("more compressed images than funpack reference images");
+            let dec = cimg.decompress().expect("fits4 float decompress");
+            assert_eq!(
+                dec.axes, reference_img.axes,
+                "{fz_name}: axes mismatch on compressed HDU #{matched}"
             );
-            checked = true;
+            compare_floats_vs_funpack(fz_name, matched, &dec.pixels, &reference_img.pixels);
+            matched += 1;
         }
     }
-    assert!(checked, "expected a compressed-image HDU");
+    assert!(matched > 0, "{fz_name}: found no compressed-image HDUs");
+}
+
+/// Compare fits4's reconstructed floats against funpack's, element-by-element.
+///
+/// fits4 reproduces cfitsio's unquantization *exactly*, including the fused
+/// multiply-add (`x*scale + zero` contracted to a single FMA) that the cfitsio C code
+/// relies on, so every reconstructed pixel must be bit-identical to funpack's. NaNs
+/// (from `ZBLANK`) only have to match as NaN — IEEE leaves the payload unspecified.
+fn compare_floats_vs_funpack(fz_name: &str, hdu: usize, got: &PixelData, want: &PixelData) {
+    fn cmp(fz_name: &str, hdu: usize, a: f64, b: f64) {
+        if a.is_nan() && b.is_nan() {
+            return;
+        }
+        assert!(
+            a.to_bits() == b.to_bits(),
+            "{fz_name} HDU#{hdu}: fits4={a} ({:#x}) != funpack={b} ({:#x})",
+            a.to_bits(),
+            b.to_bits()
+        );
+    }
+    match (got, want) {
+        (PixelData::F32(g), PixelData::F32(w)) => {
+            assert_eq!(g.len(), w.len(), "{fz_name} HDU#{hdu}: length mismatch");
+            for (x, y) in g.iter().zip(w.iter()) {
+                cmp(fz_name, hdu, *x as f64, *y as f64);
+            }
+        }
+        (PixelData::F64(g), PixelData::F64(w)) => {
+            assert_eq!(g.len(), w.len(), "{fz_name} HDU#{hdu}: length mismatch");
+            for (x, y) in g.iter().zip(w.iter()) {
+                cmp(fz_name, hdu, *x, *y);
+            }
+        }
+        _ => panic!("{fz_name} HDU#{hdu}: pixel type mismatch vs funpack reference"),
+    }
+}
+
+#[test]
+fn float_rice_nodither_matches_funpack() {
+    // 1024x1024 F32, RICE_1, ZQUANTIZ=NO_DITHER.
+    assert_float_matches_funpack("FOCx38i0101t_c0f.rice_nodith.fits.fz");
+}
+
+#[cfg(feature = "gzip")]
+#[test]
+fn float_gzip_lossless_matches_funpack() {
+    // 1024x1024 F32, GZIP_1, ZQUANTIZ=NONE (raw floats, no quantization). Needs gzip.
+    assert_float_matches_funpack("FOCx38i0101t_c0f.gzip_lossless.fits.fz");
+}
+
+#[test]
+fn float_rice_dither1_matches_funpack() {
+    // 1024x1024 F32, RICE_1, ZQUANTIZ=SUBTRACTIVE_DITHER_1.
+    assert_float_matches_funpack("FOCx38i0101t_c0f.rice_dith.fits.fz");
+}
+
+#[test]
+fn float_rice_dither2_matches_funpack() {
+    // 1024x1024 F32, RICE_1, ZQUANTIZ=SUBTRACTIVE_DITHER_2 (preserves exact zeros),
+    // generated by scripts/gen_compressed_fixtures.sh with `fpack -r -qz5 16`.
+    assert_float_matches_funpack("FOCx38i0101t_c0f.rice_dith2.fits.fz");
+}
+
+#[test]
+fn float_cube_rice_dither1_matches_funpack() {
+    // 200x200x4 F32 cube, RICE_1, SUBTRACTIVE_DITHER_1.
+    assert_float_matches_funpack("WFPC2u5780205r_c0fx.rice_dith.fits.fz");
+}
+
+#[test]
+fn float_cube_rice_nodither_matches_funpack() {
+    // 200x200x4 F32 cube, RICE_1, NO_DITHER.
+    assert_float_matches_funpack("WFPC2u5780205r_c0fx.rice_nodith.fits.fz");
 }
 
 #[cfg(feature = "gzip")]

From 9f3f805686a6153f54490d45d4434cc25ce8324d Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 14:35:08 -0400
Subject: [PATCH 3/9] Add PLIO_1 and HCOMPRESS_1 tile decompression
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Complete the tile-compression read path with the remaining two
algorithms; all image compression types in the standard now decode.

- PLIO_1 (plio_decompress): port of cfitsio pl_l2pi line-list decode.
  The 1PI VLA is read as big-endian i16 words; opcodes 0..7 reconstruct
  the integer mask. Byte-exact (lossless). Note: the ll_src[3] header
  test is a signed i16 comparison (magic -100) — comparing unsigned
  decodes garbage.
- HCOMPRESS_1 (hcompress_decompress): full port of fits_hdecompress /
  fits_hdecompress64 — quadtree bit-plane decode, undigitize, inverse
  H-transform, and unshuffle. 32-bit transform for ZBITPIX 8/16, 64-bit
  otherwise (avoids overflow); all arithmetic wrapping to match C.
  Lossless for integer scale=0; float feeds the existing unquantize +
  dither path. SMOOTH != 0 is rejected with a documented Err.
- Tests: plio_roundtrip_euv and hcompress_int_roundtrip_euv (byte-exact
  vs original), float_hcompress_dither1_matches_funpack (bit-exact vs
  funpack), plus PLIO unit tests. Skip-if-absent as before.
- Added a lossless integer HCOMPRESS fixture (fpack -h -s 0); CI cache
  key v3 -> v4, new fixtures added to best-effort download.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 .github/workflows/ci.yml           |    5 +-
 scripts/gen_compressed_fixtures.sh |    3 +
 src/tile_compress.rs               | 1064 +++++++++++++++++++++++++++-
 tests/sample_files.rs              |   26 +
 4 files changed, 1072 insertions(+), 26 deletions(-)

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 3df1082..8b100b7 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -26,7 +26,7 @@ jobs:
         uses: actions/cache@v4
         with:
           path: samp
-          key: fits4-samples-v3
+          key: fits4-samples-v4
 
       - name: Download sample FITS files
         if: steps.cache-samples.outputs.cache-hit != 'true'
@@ -50,6 +50,9 @@ jobs:
             FGSf64y0106m_a1f.rice.fits.fz \
             EUVEngc4151imgx.gzip1.fits.fz \
             EUVEngc4151imgx.gzip2.fits.fz \
+            EUVEngc4151imgx.plio.fits.fz \
+            EUVEngc4151imgx.hcomp_int.fits.fz \
+            FOCx38i0101t_c0f.hcomp.fits.fz \
             FOCx38i0101t_c0f.rice_nodith.fits.fz \
             FOCx38i0101t_c0f.rice_dith.fits.fz \
             FOCx38i0101t_c0f.rice_dith2.fits.fz \
diff --git a/scripts/gen_compressed_fixtures.sh b/scripts/gen_compressed_fixtures.sh
index 9e56922..b216a05 100755
--- a/scripts/gen_compressed_fixtures.sh
+++ b/scripts/gen_compressed_fixtures.sh
@@ -93,6 +93,9 @@ compress EUVEngc4151imgx.fits     gzip1  -g
 compress EUVEngc4151imgx.fits     gzip2  -g2
 # PLIO_1 targets integer (mask-like) data, <= 24-bit; fine for I16.
 compress EUVEngc4151imgx.fits     plio   -p
+# HCOMPRESS with scale 0 is LOSSLESS for integer images (no quantization step),
+# so the decode must reproduce the source pixels byte-exactly.
+compress EUVEngc4151imgx.fits     hcomp_int   -h -s 0
 # Force square tiles to exercise 2-D reassembly with edge-truncated tiles.
 compress EUVEngc4151imgx.fits     rice_t100   -r -t 100,100
 
diff --git a/src/tile_compress.rs b/src/tile_compress.rs
index 94790ac..8a2035a 100644
--- a/src/tile_compress.rs
+++ b/src/tile_compress.rs
@@ -11,15 +11,11 @@
 //!
 //! # Status
 //!
-//! This module is being built up in phases (see `COMPRESSION_PLAN.md`):
-//!
-//! - Phase 0 (current): scaffolding + the [`rice_decompress_i32`] family of
-//!   decoders with unit tests.
-//! - Phase 1: RICE_1 integer read path ([`CompressedImage::decompress`]).
-//! - Later: GZIP_1/GZIP_2, float quantization/dithering, PLIO_1, HCOMPRESS_1, and a
-//!   compression (write) path.
-//!
-//! Most of the high-level pipeline is currently stubbed with `todo!()`.
+//! Read (decompression) support is implemented for `RICE_1`, `GZIP_1`/`GZIP_2`
+//! (behind the `gzip` feature), `PLIO_1`, `HCOMPRESS_1` (`SMOOTH=0`), and
+//! `NOCOMPRESS`, for integer originals and for quantized/lossless float originals
+//! (with `SUBTRACTIVE_DITHER_1`/`_2` reversal). The compression (write) path is
+//! not yet implemented (see `COMPRESSION_PLAN.md`).
 
 use crate::bintable::BinTable;
 use crate::error::{Error, Result};
@@ -162,6 +158,8 @@ pub struct CompressedImage<'a> {
     pub blocksize: usize,
     /// RICE `BYTEPIX` parameter (default 4).
     pub bytepix: usize,
+    /// HCOMPRESS `SMOOTH` parameter (default 0 = no smoothing).
+    pub smooth: i32,
     /// The borrowed binary table holding the compressed tiles + heap.
     /// Read once [`CompressedImage::decompress`] is implemented (phase 1).
     #[allow(dead_code)]
@@ -197,13 +195,15 @@ impl<'a> CompressedImage<'a> {
         // Default RICE parameters; overridden by ZNAMEi='BLOCKSIZE'/'BYTEPIX'.
         let mut blocksize = 32usize;
         let mut bytepix = 4usize;
+        let mut smooth = 0i32;
         let nzparams = count_zname_params(header);
         for i in 1..=nzparams {
             if let Some(name) = header.get_string(&format!("ZNAME{i}")) {
-                let val = header.get_int(&format!("ZVAL{i}")).unwrap_or(0) as usize;
+                let val = header.get_int(&format!("ZVAL{i}")).unwrap_or(0);
                 match name.trim().to_ascii_uppercase().as_str() {
-                    "BLOCKSIZE" => blocksize = val,
-                    "BYTEPIX" => bytepix = val,
+                    "BLOCKSIZE" => blocksize = val as usize,
+                    "BYTEPIX" => bytepix = val as usize,
+                    "SMOOTH" => smooth = val as i32,
                     _ => {}
                 }
             }
@@ -218,6 +218,7 @@ impl<'a> CompressedImage<'a> {
             blank,
             blocksize,
             bytepix,
+            smooth,
             table,
         })
     }
@@ -246,12 +247,15 @@ impl<'a> CompressedImage<'a> {
     /// subtractive dithering (`ZQUANTIZ` = `SUBTRACTIVE_DITHER_1`/`_2`); `ZBLANK`
     /// sentinels map to `NaN`.
     ///
-    /// # Not yet supported (see `COMPRESSION_PLAN.md`)
+    /// `PLIO_1` (IRAF pixel-list RLE) and `HCOMPRESS_1` (H-transform + quadtree)
+    /// integer tiles are decoded by their dedicated per-tile decoders and scattered
+    /// like any other integer codec.
+    ///
+    /// # Not supported
     ///
-    /// - `PLIO_1` — phase 4.
-    /// - `HCOMPRESS_1` — phase 5.
     /// - `GZIP_1`/`GZIP_2` without the `gzip` feature return
     ///   [`Error::UnsupportedCompression`].
+    /// - `HCOMPRESS_1` with `SMOOTH != 0` (smoothing) is rejected.
     pub fn decompress(&self) -> Result<crate::image_data::ImageData> {
         use crate::image_data::{ImageData, PixelData};
         use crate::types::Bitpix;
@@ -282,13 +286,26 @@ impl<'a> CompressedImage<'a> {
         let cdata_col = self.compressed_data_column()?;
 
         for tile_index in 0..num_tiles {
-            let raw = self.tile_bytes(tile_index, cdata_col)?;
             // Number of pixels in this (possibly edge-truncated) tile.
             let coords = unravel(tile_index, &tiles_per_axis);
             let tile_dims = self.tile_dims_at(&coords);
             let tile_npix: usize = tile_dims.iter().product();
 
-            let values = self.decode_tile(&raw, tile_npix)?;
+            // PLIO_1 stores its line list as a `1PI` (i16) VLA; every other codec
+            // uses a byte (`1PB`) VLA. Fetch the cell in its native form and decode.
+            let values = if self.ctype == CompressionType::Plio1 {
+                let words = self.tile_words_i16(tile_index, cdata_col)?;
+                plio_decompress(&words, tile_npix)?
+                    .into_iter()
+                    .map(|v| v as i64)
+                    .collect()
+            } else if self.ctype == CompressionType::Hcompress1 {
+                let raw = self.tile_bytes(tile_index, cdata_col)?;
+                self.decode_hcompress(&raw, &tile_dims)?
+            } else {
+                let raw = self.tile_bytes(tile_index, cdata_col)?;
+                self.decode_tile(&raw, tile_npix)?
+            };
             if values.len() != tile_npix {
                 return Err(Error::CompressionError(format!(
                     "tile {tile_index} decoded {} values, expected {tile_npix}",
@@ -378,7 +395,11 @@ impl<'a> CompressedImage<'a> {
             let tile_floats: Vec<f64> = match (zscale, zzero) {
                 // Quantized tile: COMPRESSED_DATA holds quantized integers.
                 (Some(scale), Some(zero)) if !cdata.is_empty() && is_quantized(scale) => {
-                    let q = self.decode_tile(&cdata, tile_npix)?;
+                    let q = if self.ctype == CompressionType::Hcompress1 {
+                        self.decode_hcompress(&cdata, &tile_dims)?
+                    } else {
+                        self.decode_tile(&cdata, tile_npix)?
+                    };
                     self.unquantize(&q, tile_index, scale, zero, blank, tile_npix)
                 }
                 // Unquantized tile (or whole image is lossless): raw floats live in
@@ -563,6 +584,48 @@ impl<'a> CompressedImage<'a> {
         }
     }
 
+    /// Read one tile's `1PI`/`1QI` cell as native `i16` words (PLIO_1 line list).
+    fn tile_words_i16(&self, row: usize, col: usize) -> Result<Vec<i16>> {
+        match self.table.get_cell(row, col)? {
+            crate::bintable::BinCellValue::I16(v) => Ok(v),
+            // Some writers may declare the PLIO list as a byte VLA; reinterpret as
+            // big-endian i16 pairs in that case.
+            crate::bintable::BinCellValue::Bytes(b) => Ok(b
+                .chunks_exact(2)
+                .map(|c| i16::from_be_bytes([c[0], c[1]]))
+                .collect()),
+            other => Err(Error::CompressionError(format!(
+                "PLIO_1 COMPRESSED_DATA cell is not an i16 VLA: {other:?}"
+            ))),
+        }
+    }
+
+    /// Decode one HCOMPRESS_1 tile into a flat array of `i64` integer samples
+    /// (axis-1 fastest, matching the tile's pixel layout).
+    ///
+    /// The tile dimensions (`nx`, `ny`) and digitization `scale` are read from the
+    /// encoded stream itself; `tile_dims` is `[width, height]` (axis-1 first) and is
+    /// used only to validate the decoded size. HCOMPRESS lays out its output array
+    /// with `ny` (== tile width, the FITS fast axis) varying fastest, which already
+    /// matches the axis-1-fastest order the scatter step expects.
+    fn decode_hcompress(&self, raw: &[u8], tile_dims: &[usize]) -> Result<Vec<i64>> {
+        // SMOOTH parameter (ZNAMEi='SMOOTH'); default 0.
+        let smooth = self.smooth;
+        // cfitsio uses the 32-bit H-transform (`fits_hdecompress`) only for
+        // ZBITPIX 8/16; every other original type (32-bit ints, and quantized
+        // -32/-64 floats) uses the 64-bit transform to avoid intermediate overflow.
+        let wide = !(self.zbitpix == 8 || self.zbitpix == 16);
+        let (a, nx, ny) = hcompress_decompress(raw, wide, smooth)?;
+        let tile_npix: usize = tile_dims.iter().product();
+        if nx * ny != tile_npix {
+            return Err(Error::CompressionError(format!(
+                "HCOMPRESS tile size {nx}x{ny} = {} != expected {tile_npix}",
+                nx * ny
+            )));
+        }
+        Ok(a)
+    }
+
     /// Tile dimensions at tile-grid coordinates `coords`, accounting for edge tiles
     /// that are truncated when `ZNAXISn` is not a multiple of `ZTILEn`.
     fn tile_dims_at(&self, coords: &[usize]) -> Vec<usize> {
@@ -607,11 +670,14 @@ impl<'a> CompressedImage<'a> {
                 let unshuffled = gzip2_unshuffle(&inflated, self.zbitpix_bytes());
                 self.bytes_to_ints(&unshuffled, tile_npix)
             }
-            CompressionType::Plio1 => Err(Error::UnsupportedCompression(
-                "PLIO_1 decode is phase 4 (see COMPRESSION_PLAN.md)".into(),
+            // PLIO_1 and HCOMPRESS_1 are dispatched to their dedicated per-tile
+            // decoders (`plio_decompress` / `decode_hcompress`) in `decompress` /
+            // `decompress_float`, so they never reach this generic byte path.
+            CompressionType::Plio1 => Err(Error::CompressionError(
+                "internal: PLIO_1 must be decoded via plio_decompress".into(),
             )),
-            CompressionType::Hcompress1 => Err(Error::UnsupportedCompression(
-                "HCOMPRESS_1 decode is phase 5 (see COMPRESSION_PLAN.md)".into(),
+            CompressionType::Hcompress1 => Err(Error::CompressionError(
+                "internal: HCOMPRESS_1 must be decoded via decode_hcompress".into(),
             )),
         }
     }
@@ -1091,9 +1157,930 @@ fn gzip2_unshuffle(shuffled: &[u8], _bytepix: usize) -> Vec<u8> {
     shuffled.to_vec()
 }
 
-/// Decompress a PLIO_1 (IRAF pixel-list RLE) tile into i32 mask values. TODO(phase 4).
-pub fn plio_decompress(_src: &[u8], _nvals: usize) -> Result<Vec<i32>> {
-    todo!("phase 4: PLIO_1 pixel-list decode")
+// ---------------------------------------------------------------------------
+// PLIO_1 decompression (IRAF pixel-list run-length encoding)
+// ---------------------------------------------------------------------------
+//
+// Port of cfitsio `pliocomp.c` `pl_l2pi` (D. Tody, NRAO; the IRAF PLIO scheme).
+// A tile is stored as a *line list*: a stream of 16-bit instruction words. Each
+// word is `(opcode << 12) | data`, where `opcode` is the high nibble (0..15) and
+// `data` is the low 12 bits. The list reconstructs a 1-D run of `npix`
+// non-negative integers ("pixels"), tracking a running "high value" `pv`.
+//
+// In the FITS tiled-image convention the line list is stored in the
+// `COMPRESSED_DATA` column as a `1PI`/`1QI` variable-length array of *big-endian*
+// 16-bit values (cfitsio reads it with `fits_read_col(TSHORT, ...)`), so the
+// caller hands us the already-byte-swapped `i16` words.
+//
+// Opcodes (after cfitsio's `opcode = word/4096; data = word & 4095`; the C switch
+// is on `opcode+1`, reproduced here on `opcode` directly):
+//   * 0  (Z run)         — `data` zeros.
+//   * 1  (HD, high+data) — set high value: `pv = (next_word << 12) + data`; the
+//                          following word is consumed as the high bits.
+//   * 2  (IH, inc high)  — `pv += data`.
+//   * 3  (DH, dec high)  — `pv -= data`.
+//   * 4  (PD, pixel data)— `data` pixels at value `pv`.
+//   * 5  (PS, pixel set) — `data` zeros, then the *last* pixel of the run is set
+//                          to `pv` (single non-zero pixel at the end of the span).
+//   * 6  (IS, inc set)   — `pv += data`, then emit one pixel at `pv`.
+//   * 7  (DS, dec set)   — `pv -= data`, then emit one pixel at `pv`.
+// (cfitsio's switch maps opcodes 0,4,5 all through the "run of length `data`"
+// label L160; opcode 0 is a pure zero run, 4 fills with `pv`, 5 fills with zeros
+// then sets the last element. Opcodes 1..3 adjust `pv` without emitting span
+// pixels, and 6/7 adjust `pv` and emit a single pixel.)
+
+/// Decompress a PLIO_1 (IRAF pixel-list RLE) tile into `nvals` `i32` values.
+///
+/// `words` is the line-list instruction stream (already decoded from the
+/// big-endian `1PI` VLA into native `i16`). Pixels are non-negative; any pixels
+/// not explicitly written are zero.
+pub fn plio_decompress(words: &[i16], nvals: usize) -> Result<Vec<i32>> {
+    // Treat the instruction words as unsigned 16-bit (the high nibble is the
+    // opcode; PLIO values never use the sign bit).
+    let w: Vec<u32> = words.iter().map(|&v| (v as u16) as u32).collect();
+
+    let mut out = vec![0i32; nvals];
+    if nvals == 0 {
+        return Ok(out);
+    }
+    if w.len() < 3 {
+        return Err(Error::CompressionError(
+            "PLIO_1 line list too short (need >= 3 header words)".into(),
+        ));
+    }
+
+    // List length and index of the first instruction word (cfitsio uses 1-based
+    // indices via `--ll_src`; here `w[0]` == cfitsio `ll_src[1]`, so cfitsio
+    // `ll_src[k]` == `w[k-1]`).
+    //
+    //   if ll_src[3] > 0:   lllen = ll_src[3];                 llfirt = 4
+    //   else:               lllen = (ll_src[5]<<15)+ll_src[4]; llfirt = ll_src[2]+1
+    // The `ll_src[3] > 0` test is a SIGNED 16-bit comparison: the standard PLIO
+    // header stores -100 there, so the long form (else branch) is the usual path.
+    let (lllen, llfirt) = if words[2] > 0 {
+        (w[2] as usize, 4usize)
+    } else {
+        if w.len() < 5 {
+            return Err(Error::CompressionError(
+                "PLIO_1 long-form line list too short".into(),
+            ));
+        }
+        let len = ((w[4] << 15) + w[3]) as usize;
+        (len, (w[1] as usize) + 1)
+    };
+
+    if lllen == 0 {
+        return Ok(out); // empty list => all zeros
+    }
+
+    // Pixel-output cursor `op` (1-based in cfitsio; `op==1` means out[0]).
+    let mut op: usize = 1; // index into 1-based px_dst
+    let mut x1: i64 = 1; // current x position (1-based)
+    let mut pv: i64 = 1; // running "high value"
+    let xs: i64 = 1; // starting index (cfitsio passes xs=1)
+    let xe: i64 = nvals as i64; // end index inclusive
+
+    let put = |out: &mut [i32], op: usize, v: i64| {
+        // 1-based op -> 0-based; ignore writes past the tile (defensive).
+        if op >= 1 && op <= out.len() {
+            out[op - 1] = v as i32;
+        }
+    };
+
+    // ip iterates cfitsio ll_src indices llfirt..=lllen (1-based) => w[ip-1].
+    let mut ip = llfirt;
+    let mut skipwd = false;
+    'outer: while ip <= lllen {
+        if ip == 0 || ip > w.len() {
+            break;
+        }
+        if skipwd {
+            skipwd = false;
+            ip += 1;
+            continue;
+        }
+        let word = w[ip - 1];
+        let opcode = word / 4096;
+        let data = (word & 4095) as i64;
+
+        match opcode {
+            // L160: run of length `data` (opcodes 0, 4, 5).
+            0 | 4 | 5 => {
+                let x2 = x1 + data - 1;
+                let i1 = x1.max(xs);
+                let i2 = x2.min(xe);
+                let np = i2 - i1 + 1;
+                if np > 0 {
+                    let otop = op as i64 + np - 1;
+                    if opcode == 4 {
+                        for i in op..=(otop as usize) {
+                            put(&mut out, i, pv);
+                        }
+                    } else {
+                        // opcodes 0 and 5: zeros (out is already zero-initialized).
+                        if opcode == 5 && i2 == x2 {
+                            put(&mut out, otop as usize, pv);
+                        }
+                    }
+                    op = (otop + 1) as usize;
+                }
+                x1 = x2 + 1;
+            }
+            // L220: set high value from this+next word.
+            1 => {
+                if ip >= w.len() {
+                    return Err(Error::CompressionError(
+                        "PLIO_1 truncated high-value instruction".into(),
+                    ));
+                }
+                pv = ((w[ip] as i64) << 12) + data;
+                skipwd = true;
+            }
+            // L230: increment high value.
+            2 => {
+                pv += data;
+            }
+            // L240: decrement high value.
+            3 => {
+                pv -= data;
+            }
+            // L250: increment high value and emit one pixel.
+            6 => {
+                pv += data;
+                if x1 >= xs && x1 <= xe {
+                    put(&mut out, op, pv);
+                    op += 1;
+                }
+                x1 += 1;
+            }
+            // L260: decrement high value and emit one pixel.
+            7 => {
+                pv -= data;
+                if x1 >= xs && x1 <= xe {
+                    put(&mut out, op, pv);
+                    op += 1;
+                }
+                x1 += 1;
+            }
+            _ => {
+                // opcodes 8..15 are unused by the encoder; cfitsio falls through
+                // (no-op) on them. Match that behaviour.
+            }
+        }
+
+        if x1 > xe {
+            break 'outer;
+        }
+        ip += 1;
+    }
+
+    // Remaining pixels are zero (already initialized).
+    Ok(out)
+}
+
+// ---------------------------------------------------------------------------
+// HCOMPRESS_1 decompression
+// ---------------------------------------------------------------------------
+//
+// Port of cfitsio `fits_hdecompress` / `fits_hdecompress64` (the concatenation of
+// R. White's hinv.c / undigitize.c / decode.c / dodecode.c / qtree_decode.c /
+// qread.c / bit_input.c in STScI's hcompress distribution).
+//
+// Pipeline (read direction):
+//   1. `decode`  — parse the byte stream header (2-byte magic 0xDD 0x99, then
+//      big-endian `nx`, `ny`, `scale` as 4-byte ints, an 8-byte `sumall`, and a
+//      3-byte `nbitplanes`), then `dodecode` the quadtree-coded bit planes of the
+//      four image quadrants into the transform-coefficient array `a`, and finally
+//      the per-element sign bits. `a[0]` is overwritten with `sumall`.
+//   2. `undigitize` — multiply every coefficient by `scale` (the quantization
+//      step; `scale <= 1` is a no-op, i.e. lossless).
+//   3. `hinv` — inverse H-transform, expanding the coefficients back into pixels.
+//
+// The array `a` is indexed as `a[i*ny + j]` (ny = fast axis = FITS axis-1 width),
+// so its flat order already matches the tile's axis-1-fastest pixel layout.
+//
+// cfitsio uses 32-bit ints for ZBITPIX 8/16 and 64-bit ints for everything else
+// (incl. quantized floats), because the H-transform intermediate sums can exceed
+// 32 bits. We reproduce both via a macro, using wrapping arithmetic to match C's
+// 2's-complement overflow behaviour exactly.
+
+/// Bit/byte reader for the HCOMPRESS stream (mirrors cfitsio's global
+/// `nextchar` + `buffer2`/`bits_to_go` state machine, but as a struct).
+struct HcInput<'a> {
+    data: &'a [u8],
+    nextchar: usize,
+    buffer2: i32,
+    bits_to_go: i32,
+}
+
+impl<'a> HcInput<'a> {
+    fn new(data: &'a [u8]) -> Self {
+        HcInput {
+            data,
+            nextchar: 0,
+            buffer2: 0,
+            bits_to_go: 0,
+        }
+    }
+
+    #[inline]
+    fn next_byte(&mut self) -> Result<i32> {
+        let b = *self
+            .data
+            .get(self.nextchar)
+            .ok_or_else(|| Error::CompressionError("HCOMPRESS: unexpected end of stream".into()))?;
+        self.nextchar += 1;
+        Ok(b as i32)
+    }
+
+    /// Read `n` raw bytes (no bit buffering); cfitsio `qread`.
+    fn qread(&mut self, n: usize) -> Result<&[u8]> {
+        let start = self.nextchar;
+        let end = start + n;
+        if end > self.data.len() {
+            return Err(Error::CompressionError(
+                "HCOMPRESS: unexpected end of stream (qread)".into(),
+            ));
+        }
+        self.nextchar = end;
+        Ok(&self.data[start..end])
+    }
+
+    /// Read a big-endian 4-byte int (cfitsio `readint`).
+    fn readint(&mut self) -> Result<i32> {
+        let b = self.qread(4)?;
+        let mut a = b[0] as i32;
+        for &x in &b[1..4] {
+            a = (a << 8) + x as i32;
+        }
+        Ok(a)
+    }
+
+    /// Read a big-endian 8-byte long long (cfitsio `readlonglong`).
+    fn readlonglong(&mut self) -> Result<i64> {
+        let b = self.qread(8)?;
+        let mut a = b[0] as i64;
+        for &x in &b[1..8] {
+            a = (a << 8) + x as i64;
+        }
+        Ok(a)
+    }
+
+    fn start_inputing_bits(&mut self) {
+        self.bits_to_go = 0;
+    }
+
+    fn input_bit(&mut self) -> Result<i32> {
+        if self.bits_to_go == 0 {
+            self.buffer2 = self.next_byte()?;
+            self.bits_to_go = 8;
+        }
+        self.bits_to_go -= 1;
+        Ok((self.buffer2 >> self.bits_to_go) & 1)
+    }
+
+    fn input_nbits(&mut self, n: i32) -> Result<i32> {
+        if self.bits_to_go < n {
+            self.buffer2 = (self.buffer2 << 8) | self.next_byte()?;
+            self.bits_to_go += 8;
+        }
+        self.bits_to_go -= n;
+        Ok((self.buffer2 >> self.bits_to_go) & ((1 << n) - 1))
+    }
+
+    #[inline]
+    fn input_nybble(&mut self) -> Result<i32> {
+        self.input_nbits(4)
+    }
+
+    /// Read `n` 4-bit nybbles into `array` (cfitsio `input_nnybble`).
+    fn input_nnybble(&mut self, n: usize, array: &mut [u8]) -> Result<()> {
+        if n == 1 {
+            array[0] = self.input_nybble()? as u8;
+            return Ok(());
+        }
+        if self.bits_to_go == 8 {
+            // Backspace to reuse the last char (cfitsio quirk).
+            self.nextchar -= 1;
+            self.bits_to_go = 0;
+        }
+        let shift1 = self.bits_to_go + 4;
+        let shift2 = self.bits_to_go;
+        let mut kk = 0usize;
+        let mut ii = 0usize;
+        if self.bits_to_go == 0 {
+            while ii < n / 2 {
+                self.buffer2 = (self.buffer2 << 8) | self.next_byte()?;
+                array[kk] = ((self.buffer2 >> 4) & 15) as u8;
+                array[kk + 1] = (self.buffer2 & 15) as u8;
+                kk += 2;
+                ii += 1;
+            }
+        } else {
+            while ii < n / 2 {
+                self.buffer2 = (self.buffer2 << 8) | self.next_byte()?;
+                array[kk] = ((self.buffer2 >> shift1) & 15) as u8;
+                array[kk + 1] = ((self.buffer2 >> shift2) & 15) as u8;
+                kk += 2;
+                ii += 1;
+            }
+        }
+        if ii * 2 != n {
+            array[n - 1] = self.input_nybble()? as u8;
+        }
+        Ok(())
+    }
+
+    /// Huffman decode of a 4-bit code (cfitsio `input_huffman`).
+    fn input_huffman(&mut self) -> Result<i32> {
+        let mut c = self.input_nbits(3)?;
+        if c < 4 {
+            return Ok(1 << c);
+        }
+        c = self.input_bit()? | (c << 1);
+        if c < 13 {
+            match c {
+                8 => return Ok(3),
+                9 => return Ok(5),
+                10 => return Ok(10),
+                11 => return Ok(12),
+                12 => return Ok(15),
+                _ => {}
+            }
+        }
+        c = self.input_bit()? | (c << 1);
+        if c < 31 {
+            match c {
+                26 => return Ok(6),
+                27 => return Ok(7),
+                28 => return Ok(9),
+                29 => return Ok(11),
+                30 => return Ok(13),
+                _ => {}
+            }
+        }
+        c = self.input_bit()? | (c << 1);
+        if c == 62 {
+            Ok(0)
+        } else {
+            Ok(14)
+        }
+    }
+}
+
+/// Expand 4-bit quadtree values from `a[(nx+1)/2,(ny+1)/2]` into `b[nx,ny]`
+/// (2x2 per value); cfitsio `qtree_copy`. `a` and `b` are the same buffer here, so
+/// we operate in place exactly as the C does (iterating from the end first).
+fn qtree_copy(buf: &mut [u8], nx: usize, ny: usize, n: usize) {
+    let nx2 = nx.div_ceil(2);
+    let ny2 = ny.div_ceil(2);
+    // Copy 4-bit values to b, from the end (a,b same array).
+    // k is index of a[i,j]; s00 is index of b[2*i,2*j].
+    let mut k = (ny2 * (nx2 - 1) + ny2 - 1) as isize;
+    for i in (0..nx2).rev() {
+        let mut s00 = (2 * (n * i + ny2 - 1)) as isize;
+        for _j in (0..ny2).rev() {
+            buf[s00 as usize] = buf[k as usize];
+            k -= 1;
+            s00 -= 2;
+        }
+    }
+    // Expand each 2x2 block. Mapping: bit3->b[s00], bit2->b[s00+1],
+    // bit1->b[s10], bit0->b[s10+1] where s10 = s00+n.
+    let mut i = 0usize;
+    while i + 1 < nx {
+        let mut s00 = n * i;
+        let s10base = s00 + n;
+        let mut s10 = s10base;
+        let mut j = 0usize;
+        while j + 1 < ny {
+            let v = buf[s00];
+            buf[s10 + 1] = v & 1;
+            buf[s10] = (v >> 1) & 1;
+            buf[s00 + 1] = (v >> 2) & 1;
+            buf[s00] = (v >> 3) & 1;
+            s00 += 2;
+            s10 += 2;
+            j += 2;
+        }
+        if j < ny {
+            // odd row length
+            let v = buf[s00];
+            buf[s10] = (v >> 1) & 1;
+            buf[s00] = (v >> 3) & 1;
+        }
+        i += 2;
+    }
+    if i < nx {
+        // odd column length: last row, s10 off edge
+        let mut s00 = n * i;
+        let mut j = 0usize;
+        while j + 1 < ny {
+            let v = buf[s00];
+            buf[s00 + 1] = (v >> 2) & 1;
+            buf[s00] = (v >> 3) & 1;
+            s00 += 2;
+            j += 2;
+        }
+        if j < ny {
+            let v = buf[s00];
+            buf[s00] = (v >> 3) & 1;
+        }
+    }
+}
+
+/// One quadtree expansion step (cfitsio `qtree_expand`): copy+expand then read a
+/// fresh Huffman code into every non-zero element (scanning from the end).
+fn qtree_expand(input: &mut HcInput, buf: &mut [u8], nx: usize, ny: usize) -> Result<()> {
+    qtree_copy(buf, nx, ny, ny);
+    for i in (0..nx * ny).rev() {
+        if buf[i] != 0 {
+            buf[i] = input.input_huffman()? as u8;
+        }
+    }
+    Ok(())
+}
+
+macro_rules! impl_hdecompress {
+    ($name:ident, $t:ty, $bitins:ident, $read_bdirect:ident, $qtree_decode:ident,
+     $dodecode:ident, $hinv:ident, $undigitize:ident, $unshuffle:ident) => {
+        /// Distribute even/odd interleaved coefficients (cfitsio `unshuffle`).
+        /// `offset` is the base index into `a`; pointer arithmetic is done in
+        /// `isize` so the trailing (unused) decrements can go negative as in C.
+        fn $unshuffle(a: &mut [$t], offset: usize, n: usize, n2: usize, tmp: &mut [$t]) {
+            let base = offset as isize;
+            let n2i = n2 as isize;
+            let nhalf = (n + 1) >> 1;
+            // copy 2nd half of array to tmp
+            let mut p1 = base + n2i * nhalf as isize;
+            for slot in tmp.iter_mut().take(n - nhalf) {
+                *slot = a[p1 as usize];
+                p1 += n2i;
+            }
+            // distribute 1st half to even elements (descending)
+            let mut p2 = base + n2i * (nhalf as isize - 1);
+            let mut p1e = base + ((n2i * (nhalf as isize - 1)) << 1);
+            let mut i = nhalf as isize - 1;
+            while i >= 0 {
+                a[p1e as usize] = a[p2 as usize];
+                p2 -= n2i;
+                p1e -= n2i + n2i;
+                i -= 1;
+            }
+            // distribute 2nd half (tmp) to odd elements
+            let mut p1o = base + n2i;
+            let mut pt = 0usize;
+            let mut i = 1usize;
+            while i < n {
+                a[p1o as usize] = tmp[pt];
+                p1o += n2i + n2i;
+                pt += 1;
+                i += 2;
+            }
+        }
+
+        /// Insert expanded 4-bit codes from `aa[(nx+1)/2,(ny+1)/2]` into bitplane
+        /// `bit` of `b[nx,ny]` (cfitsio `qtree_bitins`).
+        fn $bitins(aa: &[u8], nx: usize, ny: usize, b: &mut [$t], n: usize, bit: i32) {
+            let plane_val: $t = (1 as $t) << bit;
+            let mut k = 0usize;
+            let mut i = 0usize;
+            while i + 1 < nx {
+                let s00 = n * i;
+                let mut s00 = s00;
+                let mut j = 0usize;
+                while j + 1 < ny {
+                    let v = aa[k];
+                    if v & 1 != 0 {
+                        b[s00 + n + 1] |= plane_val;
+                    }
+                    if v & 2 != 0 {
+                        b[s00 + n] |= plane_val;
+                    }
+                    if v & 4 != 0 {
+                        b[s00 + 1] |= plane_val;
+                    }
+                    if v & 8 != 0 {
+                        b[s00] |= plane_val;
+                    }
+                    s00 += 2;
+                    k += 1;
+                    j += 2;
+                }
+                if j < ny {
+                    // odd row: s00+1, s10+1 off edge -> only bits 1 (s10) and 3 (s00)
+                    let v = aa[k];
+                    if v & 2 != 0 {
+                        b[s00 + n] |= plane_val;
+                    }
+                    if v & 8 != 0 {
+                        b[s00] |= plane_val;
+                    }
+                    k += 1;
+                }
+                i += 2;
+            }
+            if i < nx {
+                // odd column: last row, s10 off edge -> bits 2 (s00+1) and 3 (s00)
+                let mut s00 = n * i;
+                let mut j = 0usize;
+                while j + 1 < ny {
+                    let v = aa[k];
+                    if v & 4 != 0 {
+                        b[s00 + 1] |= plane_val;
+                    }
+                    if v & 8 != 0 {
+                        b[s00] |= plane_val;
+                    }
+                    s00 += 2;
+                    k += 1;
+                    j += 2;
+                }
+                if j < ny {
+                    // corner: only bit 3 (s00)
+                    let v = aa[k];
+                    if v & 8 != 0 {
+                        b[s00] |= plane_val;
+                    }
+                    k += 1;
+                }
+            }
+            let _ = k;
+        }
+
+        /// Read a directly-stored bit plane and insert it (cfitsio `read_bdirect`).
+        fn $read_bdirect(
+            input: &mut HcInput,
+            a: &mut [$t],
+            n: usize,
+            nqx: usize,
+            nqy: usize,
+            scratch: &mut [u8],
+            bit: i32,
+        ) -> Result<()> {
+            let cnt = nqx.div_ceil(2) * nqy.div_ceil(2);
+            input.input_nnybble(cnt, scratch)?;
+            $bitins(scratch, nqx, nqy, a, n, bit);
+            Ok(())
+        }
+
+        /// Decode the quadtree-coded bit planes of one quadrant (cfitsio
+        /// `qtree_decode`).
+        fn $qtree_decode(
+            input: &mut HcInput,
+            a: &mut [$t],
+            a_off: usize,
+            n: usize,
+            nqx: usize,
+            nqy: usize,
+            nbitplanes: i32,
+        ) -> Result<()> {
+            let nqmax = nqx.max(nqy);
+            let mut log2n = ((nqmax as f32).ln() / 2.0f32.ln() + 0.5) as i32;
+            if nqmax > (1usize << log2n) {
+                log2n += 1;
+            }
+            let nqx2 = nqx.div_ceil(2);
+            let nqy2 = nqy.div_ceil(2);
+            let mut scratch = vec![0u8; nqx2 * nqy2 + 4];
+
+            let asl = &mut a[a_off..];
+
+            let mut bit = nbitplanes - 1;
+            while bit >= 0 {
+                let b = input.input_nybble()?;
+                if b == 0 {
+                    $read_bdirect(input, asl, n, nqx, nqy, &mut scratch, bit)?;
+                } else if b != 0xf {
+                    return Err(Error::CompressionError(
+                        "qtree_decode: bad format code".into(),
+                    ));
+                } else {
+                    scratch[0] = input.input_huffman()? as u8;
+                    let mut nx = 1usize;
+                    let mut ny = 1usize;
+                    let mut nfx = nqx;
+                    let mut nfy = nqy;
+                    let mut c = 1usize << log2n;
+                    let mut k = 1;
+                    while k < log2n {
+                        c >>= 1;
+                        nx <<= 1;
+                        ny <<= 1;
+                        if nfx <= c {
+                            nx -= 1;
+                        } else {
+                            nfx -= c;
+                        }
+                        if nfy <= c {
+                            ny -= 1;
+                        } else {
+                            nfy -= c;
+                        }
+                        qtree_expand(input, &mut scratch, nx, ny)?;
+                        k += 1;
+                    }
+                    $bitins(&scratch, nqx, nqy, asl, n, bit);
+                }
+                bit -= 1;
+            }
+            Ok(())
+        }
+
+        /// Decode the four quadrants into coefficient array `a` (cfitsio
+        /// `dodecode`).
+        fn $dodecode(
+            input: &mut HcInput,
+            a: &mut [$t],
+            nx: usize,
+            ny: usize,
+            nbitplanes: [u8; 3],
+        ) -> Result<()> {
+            let nel = nx * ny;
+            let nx2 = nx.div_ceil(2);
+            let ny2 = ny.div_ceil(2);
+            for v in a.iter_mut().take(nel) {
+                *v = 0 as $t;
+            }
+            input.start_inputing_bits();
+            $qtree_decode(input, a, 0, ny, nx2, ny2, nbitplanes[0] as i32)?;
+            $qtree_decode(input, a, ny2, ny, nx2, ny / 2, nbitplanes[1] as i32)?;
+            $qtree_decode(input, a, ny * nx2, ny, nx / 2, ny2, nbitplanes[1] as i32)?;
+            $qtree_decode(
+                input,
+                a,
+                ny * nx2 + ny2,
+                ny,
+                nx / 2,
+                ny / 2,
+                nbitplanes[2] as i32,
+            )?;
+            if input.input_nybble()? != 0 {
+                return Err(Error::CompressionError(
+                    "dodecode: bad bit plane values (missing EOF)".into(),
+                ));
+            }
+            // sign bits
+            input.start_inputing_bits();
+            for v in a.iter_mut().take(nel) {
+                if *v != 0 as $t && input.input_bit()? != 0 {
+                    *v = (0 as $t).wrapping_sub(*v);
+                }
+            }
+            Ok(())
+        }
+
+        fn $undigitize(a: &mut [$t], nel: usize, scale: i32) {
+            if scale <= 1 {
+                return;
+            }
+            let s = scale as $t;
+            for v in a.iter_mut().take(nel) {
+                *v = (*v).wrapping_mul(s);
+            }
+        }
+
+        /// Inverse H-transform (cfitsio `hinv`). `smooth` is unsupported (the
+        /// fixtures use SMOOTH=0); a non-zero value is rejected by the caller.
+        fn $hinv(a: &mut [$t], nx: usize, ny: usize) {
+            let nmax = nx.max(ny);
+            let mut log2n = ((nmax as f32).ln() / 2.0f32.ln() + 0.5) as i32;
+            if nmax > (1usize << log2n) {
+                log2n += 1;
+            }
+            let nmax_i = nmax;
+            let mut tmp = vec![0 as $t; nmax_i.div_ceil(2) + 1];
+
+            let mut shift: i32 = 1;
+            let mut bit0: $t = (1 as $t) << (log2n - 1);
+            let mut bit1: $t = bit0 << 1;
+            let mut bit2: $t = bit0 << 2;
+            let mut mask0: $t = (0 as $t).wrapping_sub(bit0);
+            let mut mask1: $t = mask0 << 1;
+            let mask2: $t = mask0 << 2;
+            let mut prnd0: $t = bit0 >> 1;
+            let mut prnd1: $t = bit1 >> 1;
+            let prnd2: $t = bit2 >> 1;
+            let mut nrnd0: $t = prnd0 - 1;
+            let mut nrnd1: $t = prnd1 - 1;
+            let nrnd2: $t = prnd2 - 1;
+
+            // round h0 to multiple of bit2
+            a[0] = (a[0].wrapping_add(if a[0] >= 0 as $t { prnd2 } else { nrnd2 })) & mask2;
+
+            let ny_i = ny as isize;
+            let mut nxtop = 1usize;
+            let mut nytop = 1usize;
+            let mut nxf = nx;
+            let mut nyf = ny;
+            let mut c = 1usize << log2n;
+            let mut k = log2n - 1;
+            while k >= 0 {
+                c >>= 1;
+                nxtop <<= 1;
+                nytop <<= 1;
+                if nxf <= c {
+                    nxtop -= 1;
+                } else {
+                    nxf -= c;
+                }
+                if nyf <= c {
+                    nytop -= 1;
+                } else {
+                    nyf -= c;
+                }
+                if k == 0 {
+                    nrnd0 = 0 as $t;
+                    shift = 2;
+                }
+                // unshuffle in each dimension
+                for i in 0..nxtop {
+                    $unshuffle(a, ny * i, nytop, 1, &mut tmp);
+                }
+                for j in 0..nytop {
+                    $unshuffle(a, j, nxtop, ny, &mut tmp);
+                }
+                let oddx = nxtop % 2;
+                let oddy = nytop % 2;
+                let mut i = 0usize;
+                while i + oddx < nxtop {
+                    // i steps by 2 over 0..nxtop-oddx
+                    let mut s00 = (ny * i) as isize;
+                    let mut s10 = s00 + ny_i;
+                    let mut j = 0usize;
+                    while j + oddy < nytop {
+                        let mut h0 = a[s00 as usize];
+                        let mut hx = a[s10 as usize];
+                        let mut hy = a[(s00 + 1) as usize];
+                        let mut hc = a[(s10 + 1) as usize];
+                        hx = (hx.wrapping_add(if hx >= 0 as $t { prnd1 } else { nrnd1 })) & mask1;
+                        hy = (hy.wrapping_add(if hy >= 0 as $t { prnd1 } else { nrnd1 })) & mask1;
+                        hc = (hc.wrapping_add(if hc >= 0 as $t { prnd0 } else { nrnd0 })) & mask0;
+                        let lowbit0 = hc & bit0;
+                        hx = if hx >= 0 as $t {
+                            hx.wrapping_sub(lowbit0)
+                        } else {
+                            hx.wrapping_add(lowbit0)
+                        };
+                        hy = if hy >= 0 as $t {
+                            hy.wrapping_sub(lowbit0)
+                        } else {
+                            hy.wrapping_add(lowbit0)
+                        };
+                        let lowbit1 = (hc ^ hx ^ hy) & bit1;
+                        h0 = if h0 >= 0 as $t {
+                            h0.wrapping_add(lowbit0).wrapping_sub(lowbit1)
+                        } else {
+                            h0.wrapping_add(if lowbit0 == 0 as $t {
+                                lowbit1
+                            } else {
+                                lowbit0.wrapping_sub(lowbit1)
+                            })
+                        };
+                        a[(s10 + 1) as usize] =
+                            (h0.wrapping_add(hx).wrapping_add(hy).wrapping_add(hc)) >> shift;
+                        a[s10 as usize] =
+                            (h0.wrapping_add(hx).wrapping_sub(hy).wrapping_sub(hc)) >> shift;
+                        a[(s00 + 1) as usize] =
+                            (h0.wrapping_sub(hx).wrapping_add(hy).wrapping_sub(hc)) >> shift;
+                        a[s00 as usize] =
+                            (h0.wrapping_sub(hx).wrapping_sub(hy).wrapping_add(hc)) >> shift;
+                        s00 += 2;
+                        s10 += 2;
+                        j += 2;
+                    }
+                    if oddy != 0 {
+                        let mut h0 = a[s00 as usize];
+                        let mut hx = a[s10 as usize];
+                        hx = (hx.wrapping_add(if hx >= 0 as $t { prnd1 } else { nrnd1 })) & mask1;
+                        let lowbit1 = hx & bit1;
+                        h0 = if h0 >= 0 as $t {
+                            h0.wrapping_sub(lowbit1)
+                        } else {
+                            h0.wrapping_add(lowbit1)
+                        };
+                        a[s10 as usize] = (h0.wrapping_add(hx)) >> shift;
+                        a[s00 as usize] = (h0.wrapping_sub(hx)) >> shift;
+                    }
+                    i += 2;
+                }
+                if oddx != 0 {
+                    let mut s00 = (ny * i) as isize;
+                    let mut j = 0usize;
+                    while j + oddy < nytop {
+                        let mut h0 = a[s00 as usize];
+                        let mut hy = a[(s00 + 1) as usize];
+                        hy = (hy.wrapping_add(if hy >= 0 as $t { prnd1 } else { nrnd1 })) & mask1;
+                        let lowbit1 = hy & bit1;
+                        h0 = if h0 >= 0 as $t {
+                            h0.wrapping_sub(lowbit1)
+                        } else {
+                            h0.wrapping_add(lowbit1)
+                        };
+                        a[(s00 + 1) as usize] = (h0.wrapping_add(hy)) >> shift;
+                        a[s00 as usize] = (h0.wrapping_sub(hy)) >> shift;
+                        s00 += 2;
+                        j += 2;
+                    }
+                    if oddy != 0 {
+                        let h0 = a[s00 as usize];
+                        a[s00 as usize] = h0 >> shift;
+                    }
+                }
+                // divide masks/rounding by 2
+                bit2 = bit1;
+                bit1 = bit0;
+                bit0 >>= 1;
+                mask1 = mask0;
+                mask0 >>= 1;
+                prnd1 = prnd0;
+                prnd0 >>= 1;
+                nrnd1 = nrnd0;
+                nrnd0 = prnd0 - 1;
+                k -= 1;
+            }
+            let _ = (bit2, mask1, prnd1, nrnd1);
+        }
+
+        /// Full HCOMPRESS decode for one quadrant-int width. Returns the pixel
+        /// array (axis-1 fastest) along with `(nx_slow, ny_fast)`.
+        fn $name(input: &mut HcInput, smooth: i32) -> Result<(Vec<$t>, usize, usize)> {
+            // magic code
+            let magic = input.qread(2)?;
+            if magic != [0xDDu8, 0x99u8] {
+                return Err(Error::CompressionError(
+                    "HCOMPRESS: bad magic code".into(),
+                ));
+            }
+            let nx = input.readint()? as usize; // slow axis
+            let ny = input.readint()? as usize; // fast axis
+            let scale = input.readint()?;
+            let nel = nx.checked_mul(ny).ok_or_else(|| {
+                Error::CompressionError("HCOMPRESS: dimension overflow".into())
+            })?;
+            let sumall = input.readlonglong()?;
+            let nbp = input.qread(3)?;
+            let nbitplanes = [nbp[0], nbp[1], nbp[2]];
+
+            let mut a = vec![0 as $t; nel.max(1)];
+            $dodecode(input, &mut a, nx, ny, nbitplanes)?;
+            // put sum of all pixels back into pixel 0
+            a[0] = sumall as $t;
+
+            if smooth != 0 {
+                return Err(Error::UnsupportedCompression(
+                    "HCOMPRESS_1 SMOOTH != 0 is not supported".into(),
+                ));
+            }
+            $undigitize(&mut a, nel, scale);
+            $hinv(&mut a, nx, ny);
+            Ok((a, nx, ny))
+        }
+    };
+}
+
+impl_hdecompress!(
+    hdecode32,
+    i32,
+    qtree_bitins32,
+    read_bdirect32,
+    qtree_decode32,
+    dodecode32,
+    hinv32,
+    undigitize32,
+    unshuffle32
+);
+impl_hdecompress!(
+    hdecode64,
+    i64,
+    qtree_bitins64,
+    read_bdirect64,
+    qtree_decode64,
+    dodecode64,
+    hinv64,
+    undigitize64,
+    unshuffle64
+);
+
+/// Decompress one HCOMPRESS_1 tile.
+///
+/// `wide` selects the 64-bit transform (used for ZBITPIX 32 and quantized
+/// -32/-64 floats; cfitsio uses the 32-bit transform only for ZBITPIX 8/16).
+/// Returns the decoded integer samples (axis-1 fastest) and the `(nx_slow,
+/// ny_fast)` dimensions read from the stream.
+pub fn hcompress_decompress(
+    src: &[u8],
+    wide: bool,
+    smooth: i32,
+) -> Result<(Vec<i64>, usize, usize)> {
+    let mut input = HcInput::new(src);
+    if wide {
+        let (a, nx, ny) = hdecode64(&mut input, smooth)?;
+        Ok((a, nx, ny))
+    } else {
+        let (a, nx, ny) = hdecode32(&mut input, smooth)?;
+        Ok((a.into_iter().map(|v| v as i64).collect(), nx, ny))
+    }
 }
 
 #[cfg(test)]
@@ -1146,6 +2133,33 @@ mod tests {
         }
     }
 
+    #[test]
+    fn plio_single_pixel_at_end_of_run() {
+        // PLIO line list for a 512-pixel row with one pixel == 1 at index 178
+        // (0-based), all else zero. Lifted verbatim from the cfitsio `fpack -p`
+        // output for row 137 of EUVEngc4151imgx.fits. Header word [2] = -100 (the
+        // PLIO magic, 0xFF9C) forces the long-form length decode.
+        let words: Vec<i16> = [0u16, 7, 0xFF9C, 9, 0, 0, 0, 20659, 333]
+            .iter()
+            .map(|&w| w as i16)
+            .collect();
+        let out = plio_decompress(&words, 512).unwrap();
+        let mut expected = vec![0i32; 512];
+        expected[178] = 1; // opcode 5 (run 179, last pixel set to pv=1)
+        assert_eq!(out, expected);
+    }
+
+    #[test]
+    fn plio_empty_list_is_all_zero() {
+        // A zero-length line list (lllen == 0) decodes to an all-zero tile.
+        let words: Vec<i16> = [0u16, 7, 0xFF9C, 0, 0, 0, 0]
+            .iter()
+            .map(|&w| w as i16)
+            .collect();
+        let out = plio_decompress(&words, 16).unwrap();
+        assert_eq!(out, vec![0i32; 16]);
+    }
+
     #[test]
     fn bit_reader_basic() {
         // 0b1011_0010, 0b1100_0000
diff --git a/tests/sample_files.rs b/tests/sample_files.rs
index c883a0e..3a48201 100644
--- a/tests/sample_files.rs
+++ b/tests/sample_files.rs
@@ -266,6 +266,24 @@ fn rice_roundtrip_fgs_i32() {
     assert_rice_roundtrip("FGSf64y0106m_a1f.fits", "FGSf64y0106m_a1f.rice.fits.fz");
 }
 
+#[test]
+fn plio_roundtrip_euv() {
+    // PLIO_1 (IRAF pixel-list RLE) over the I16 EUV image extensions. Lossless,
+    // so the decode must reproduce the source pixels byte-for-byte.
+    assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.plio.fits.fz");
+}
+
+#[test]
+fn hcompress_int_roundtrip_euv() {
+    // HCOMPRESS_1 with SCALE=0 is LOSSLESS for integer images, so the decode of
+    // the I16 EUV extensions must be byte-exact vs the source. Exercises both the
+    // 512x16 and 2048x16 tile geometries.
+    assert_rice_roundtrip(
+        "EUVEngc4151imgx.fits",
+        "EUVEngc4151imgx.hcomp_int.fits.fz",
+    );
+}
+
 // --- float tile-compression: bit-exact vs funpack -------------------------
 //
 // Quantized-float decompression is *lossy*, so the reconstructed floats will not
@@ -432,6 +450,14 @@ fn float_cube_rice_nodither_matches_funpack() {
     assert_float_matches_funpack("WFPC2u5780205r_c0fx.rice_nodith.fits.fz");
 }
 
+#[test]
+fn float_hcompress_dither1_matches_funpack() {
+    // 1024x1024 F32, HCOMPRESS_1 (SCALE=0 on the quantized ints),
+    // ZQUANTIZ=SUBTRACTIVE_DITHER_1. Lossy (quantized), so validate bit-exact
+    // against funpack's own reconstruction. Tiles are 1024x16.
+    assert_float_matches_funpack("FOCx38i0101t_c0f.hcomp.fits.fz");
+}
+
 #[cfg(feature = "gzip")]
 #[test]
 fn gzip1_roundtrip_euv() {

From 5dd3fe5625725bd54ee237bb702d6c68b182de9d Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 14:35:18 -0400
Subject: [PATCH 4/9] Docs: document full tile-compression read support

- README: list PLIO_1 and HCOMPRESS_1 as supported; drop the "growing"
  hedge; note SMOOTH!=0 as the only HCOMPRESS gap; clarify which
  algorithms work without the gzip feature.
- CLAUDE.md: add tile_compress.rs to the module table; add tile
  compression, the FMA/funpack bit-exactness note, and the compressed
  fixture workflow to Design Decisions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 CLAUDE.md |  7 ++--
 README.md | 95 ++++++++++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 82 insertions(+), 20 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index e63361b..73a38a2 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -26,7 +26,8 @@ Zero external dependencies for core functionality. Optional `image` crate behind
 | `ascii_table.rs` | `AsciiTable`: TFORMn parsing (Aw/Iw/Fw.d/Ew.d/Dw.d), column access, TSCALn/TZEROn |
 | `bintable.rs` | `BinTable`: all type codes (L,X,B,I,J,K,A,E,D,C,M,P,Q), heap/VLA |
 | `checksum.rs` | CHECKSUM/DATASUM ones-complement computation |
-| `hdu.rs` | `Hdu` struct, `HduData` enum (Empty/Image/AsciiTable/BinTable) |
+| `tile_compress.rs` | Tiled-image compression decode: RICE_1, GZIP_1/2 (feature `gzip`), PLIO_1, HCOMPRESS_1; quantization + subtractive dithering for floats |
+| `hdu.rs` | `Hdu` struct, `HduData` enum (Empty/Image/AsciiTable/BinTable); `as_compressed_image()` accessor |
 | `fits.rs` | `FitsFile`: top-level read/write, HDU iteration, builder API |
 | `image_conv.rs` | (feature="image") `DynamicImage` <-> `ImageData` conversion |
 
@@ -65,7 +66,9 @@ NASA sample FITS files in `samp/`:
 
 ## Design Decisions
 
-- **No external dependencies** for core — no `byteorder`, no `thiserror`
+- **No external dependencies** for core — no `byteorder`, no `thiserror`. Only optional, feature-gated deps: `image` and `gzip` (`miniz_oxide`)
 - **Random groups** skipped (deprecated per standard)
 - BSCALE/BZERO: raw vs scaled access modes on `ImageData`
 - Unsigned integer convention: BZERO offset (32768 for u16, etc.)
+- **Tile compression**: decode (read) only — RICE_1/PLIO_1/HCOMPRESS_1 in the zero-dep core, GZIP_1/2 behind the `gzip` feature. Lazy `hdu.as_compressed_image()?.decompress()?`; `HduData` stays `BinTable` so the compressed tiles survive for lossless round-trip. Float decode reproduces cfitsio's fused multiply-add to stay bit-exact vs `funpack` (see memory). Compression *encode* (write) and HCOMPRESS `SMOOTH≠0` are not implemented
+- **Compressed fixtures**: `scripts/gen_compressed_fixtures.sh` builds fpack `.fz` test files into `samp/` (gitignored, served from GCS bucket `fits4_samples`); compression tests skip when fixtures/`funpack` are absent
diff --git a/README.md b/README.md
index 464e9dc..20c7f44 100644
--- a/README.md
+++ b/README.md
@@ -1,27 +1,31 @@
 # fits4
 
-Pure Rust implementation of the [FITS](https://fits.gsfc.nasa.gov/fits_standard.html) (Flexible Image Transport System) v4.0 standard for reading and writing astronomical data files.
+**Pure-Rust, zero-dependency reader and writer for [FITS](https://fits.gsfc.nasa.gov/fits_standard.html) v4.0** — the Flexible Image Transport System format used throughout astronomy.
 
-Zero external dependencies for core functionality.
+[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
+
+fits4 reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It also decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) on the read path.
 
 ## Features
 
-- **All standard HDU types** — Primary, IMAGE, ASCII TABLE, BINTABLE
+- **All standard HDU types** — Primary, IMAGE, ASCII `TABLE`, `BINTABLE`
+- **Full read *and* write** — round-trip files, bytes, or any `Read`/`Write`
 - **All BITPIX types** — 8, 16, 32, 64, -32, -64
-- **Variable-length arrays** — P and Q heap descriptors in binary tables
-- **BSCALE/BZERO** — raw and scaled access with unsigned integer convention
-- **CHECKSUM/DATASUM** — ones-complement integrity verification
-- **`image` crate integration** — optional, behind `image` feature flag
-
-## Usage
+- **BSCALE/BZERO** — raw and scaled access, with the unsigned-integer convention
+- **Variable-length arrays** — `P` and `Q` heap descriptors in binary tables
+- **CHECKSUM/DATASUM** — ones-complement integrity computation and verification
+- **Tile-compressed images (read)** — RICE_1, GZIP_1, GZIP_2, PLIO_1, and HCOMPRESS_1, including float quantization with subtractive dithering; decoded bit-exactly against cfitsio's `funpack`
+- **Zero dependencies by default** — optional `image` and `gzip` features pull in pure-Rust crates only
 
-Add to your `Cargo.toml`:
+## Installation
 
 ```toml
 [dependencies]
 fits4 = "0.1"
 ```
 
+## Usage
+
 ### Reading a FITS file
 
 ```rust
@@ -34,6 +38,9 @@ println!("BITPIX = {}", primary.header.get_int("BITPIX").unwrap());
 
 if let HduData::Image(img) = &primary.data {
     println!("{}x{}", img.width().unwrap(), img.height().unwrap());
+    if let PixelData::F32(pixels) = &img.pixels {
+        println!("first pixel = {}", pixels[0]);
+    }
 }
 
 for hdu in fits.extensions() {
@@ -50,7 +57,7 @@ for hdu in fits.extensions() {
 ### Writing a FITS file
 
 ```rust
-use fits4::{FitsFile, Hdu, ImageData, PixelData, HeaderValue};
+use fits4::{FitsFile, ImageData, PixelData, HeaderValue};
 
 let pixels: Vec<i16> = (0..10000).map(|i| (i % 1000) as i16).collect();
 let img = ImageData::new(vec![100, 100], PixelData::I16(pixels));
@@ -88,6 +95,27 @@ fits.push_extension(Hdu::bintable_extension(table));
 # let _ = fits.to_bytes();
 ```
 
+### Reading a tile-compressed image
+
+Tile-compressed images are stored on disk as a `BINTABLE` extension (the FITS
+Tiled Image Compression convention). `Hdu::as_compressed_image` cheaply detects
+them; `decompress` reassembles the full image lazily.
+
+```rust
+use fits4::FitsFile;
+
+let fits = FitsFile::from_file("compressed.fits")?;
+
+for hdu in fits.extensions() {
+    if let Some(cimg) = hdu.as_compressed_image() {
+        println!("compression = {:?}", cimg.compression());
+        let image = cimg.decompress()?;
+        println!("decompressed to {:?}", image.axes);
+    }
+}
+# Ok::<(), fits4::Error>(())
+```
+
 ### Checksums
 
 ```rust
@@ -105,17 +133,48 @@ fits2.primary().verify_datasum()?;
 # Ok::<(), fits4::Error>(())
 ```
 
-## Not supported
+## Feature flags
 
-- **Tile-compressed images** (`ZIMAGE`, Rice/GZIP/HCOMPRESS) — use [`fpack`/`funpack`](https://heasarc.gsfc.nasa.gov/fitsio/fpack/) to decompress externally
-- **Random groups** (deprecated in FITS v4.0)
+| Flag | Default | Description |
+|------|---------|-------------|
+| *(none)* | ✓ | Core read/write, all HDU types, RICE_1 / PLIO_1 / HCOMPRESS_1 tile decompression — zero dependencies |
+| `image` | | Conversion between `ImageData` and the [`image`](https://crates.io/crates/image) crate's `DynamicImage` |
+| `gzip` | | Decoding of `GZIP_1`/`GZIP_2` tile-compressed images via the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate |
 
-## Feature flags
+The default build stays dependency-free; `RICE_1`, `PLIO_1`, and `HCOMPRESS_1` decompression all work without any feature (only `GZIP_1`/`GZIP_2` need the `gzip` feature).
+
+## Why fits4?
+
+Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, and complete compressed-image read support**, with no C toolchain to install.
+
+| Crate | Pure Rust | Write | Tables | Compressed read | Notes |
+|-------|-----------|-------|--------|-----------------|-------|
+| [`fitsio`](https://crates.io/crates/fitsio) | ✗ | ✓ | ✓ | ✓ | Wraps the cfitsio C library; needs a C toolchain |
+| [`fitsrs`](https://crates.io/crates/fitsrs) | ✓ | ✗ | partial | — | Read-only |
+| [`fitrs`](https://crates.io/crates/fitrs) | ✓ | ✓ | ✗ | ✗ | Dormant; no table support |
+| **fits4** | ✓ | ✓ | ✓ | ✓ (all types) | Zero default deps; no C dependency |
+
+## Supported / not supported
+
+**Supported**
+
+- Primary + extension HDUs; images, ASCII tables, binary tables (read + write)
+- All BITPIX types; BSCALE/BZERO scaling and the unsigned-integer convention
+- Variable-length arrays (`P`/`Q` descriptors) in binary tables
+- CHECKSUM/DATASUM computation and verification
+- Tile-compressed image **reading**: RICE_1, GZIP_1, GZIP_2, PLIO_1, and
+  HCOMPRESS_1 — for integer and floating-point images, including quantization
+  with subtractive dithering (`NO_DITHER` / `SUBTRACTIVE_DITHER_1` /
+  `SUBTRACTIVE_DITHER_2`)
+
+**Not supported**
 
-| Flag | Description |
-|------|-------------|
-| `image` | Enables conversion between `ImageData` and the [`image`](https://crates.io/crates/image) crate's `DynamicImage` |
+- Writing/encoding tile-compressed images (read/decode only)
+- HCOMPRESS image smoothing (`SMOOTH` ≠ 0) on decode
+- Random groups (deprecated in FITS v4.0)
 
 ## License
 
 [MIT](LICENSE)
+</content>
+</invoke>

From c2ae1ea008fe079303a081e286f8df2ed1124466 Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 15:55:45 -0400
Subject: [PATCH 5/9] Docs: add Core types overview to README

Describe the public type surface (FitsFile/Hdu/HduData/Header/ImageData/
BinTable/AsciiTable/CompressedImage/Bitpix) and the HDU data model so the
README is a complete description of the crate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.md | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/README.md b/README.md
index 20c7f44..39aa0d2 100644
--- a/README.md
+++ b/README.md
@@ -133,6 +133,29 @@ fits2.primary().verify_datasum()?;
 # Ok::<(), fits4::Error>(())
 ```
 
+## Core types
+
+A FITS file is an ordered sequence of **Header-Data Units (HDUs)**. fits4 mirrors
+that structure directly:
+
+| Type | Role |
+|------|------|
+| `FitsFile` | Top-level container — an ordered `Vec<Hdu>`. Reads/writes files, byte slices, or any `Read`/`Write`; builder methods (`with_primary_image`, `with_empty_primary`, `push_extension`, `to_file`, `to_bytes`, `to_bytes_with_checksum`). |
+| `Hdu` | One Header-Data Unit: a `Header` plus an `HduData` payload. `as_compressed_image()` exposes a tile-compressed image stored in a `BINTABLE`. |
+| `HduData` | The payload enum: `Empty`, `Image(ImageData)`, `AsciiTable(AsciiTable)`, `BinTable(BinTable)`. |
+| `Header` | Ordered list of `Keyword`s with typed accessors (`get_int`, `get_float`, `get_string`, `get_bool`) and `set`. |
+| `Keyword` / `HeaderValue` | An 80-byte header card and its typed value (with `CONTINUE` long-string handling). |
+| `ImageData` / `PixelData` | Image `axes` plus a typed pixel buffer (`U8`/`I16`/`I32`/`I64`/`F32`/`F64`); raw access and BSCALE/BZERO-scaled access (`scaled_values`). |
+| `BinTable` / `BinColumn` / `BinColumnType` / `BinCellValue` | Binary-table model — columns, rows, and heap (variable-length arrays); built with `BinTableBuilder`. |
+| `AsciiTable` | ASCII `TABLE` model with `TFORMn` column parsing and `TSCALn`/`TZEROn` scaling. |
+| `CompressedImage` / `CompressionType` / `TileGeometry` / `Quantize` | Read-side view over a tile-compressed image: detect the algorithm and geometry, then `decompress()` to an `ImageData`. |
+| `Bitpix` | The `BITPIX` data type (`8`/`16`/`32`/`64`/`-32`/`-64`). |
+| `Error` / `Result` | Crate error type and result alias. |
+
+Decoding is done eagerly at read time (big-endian → native), except tile-compressed
+images, which are decoded lazily on `decompress()` so the original compressed tiles
+are preserved for a lossless round-trip write.
+
 ## Feature flags
 
 | Flag | Default | Description |

From 70b427d0dd54b15e027bf730011d05b37300568a Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 16:13:29 -0400
Subject: [PATCH 6/9] Add tile-compression encode (write): RICE_1 and GZIP
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

fits4 can now WRITE tile-compressed FITS images — the capability no other
pure-Rust crate offers. Output is read back byte-for-byte by cfitsio's
funpack (and imcopy/astropy).

- ImageData::compress(&CompressOptions) -> Hdu produces a compressed-image
  BINTABLE HDU (HduData::BinTable + ZIMAGE header + COMPRESSED_DATA VLA);
  caller does fits.push_extension(img.compress(&opts)?). Mirrors the read
  side's as_compressed_image()/decompress().
- RICE_1 encoder: exact inverse of the decoder — seed verbatim, per-block
  zig-zag diffs, FS selection minimizing block_n*(fs+1)+(sum>>fs); constants
  match the decoder (fsbits/fsmax/bbits 5/25/32, 4/14/16, 3/6/8).
- Lossless: RICE_1/GZIP_1/GZIP_2 for integer (8/16/32), GZIP_1 for raw
  float. Lossy: RICE_1/GZIP on quantized + dithered floats. PLIO_1,
  HCOMPRESS_1, NOCOMPRESS, 64-bit int, and GZIP_2-lossless-float rejected
  with a clear Error.
- Z* keyword order is load-bearing: funpack rebuilds the image header by
  walking cards, so ZTENSION must precede ZBITPIX/ZNAXIS (else "1st key not
  SIMPLE or XTENSION"). build_z_header emits fpack's order. Documented.
- Tests: synthetic RICE encode->decode (i8/i16/i32), internal round-trips,
  and 6 funpack-reads-fits4 interop tests (byte-exact), skip-if-absent.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 CLAUDE.md             |    4 +-
 src/image_data.rs     |   25 +
 src/lib.rs            |   18 +-
 src/tile_compress.rs  | 1152 ++++++++++++++++++++++++++++++++++++++++-
 tests/sample_files.rs |  275 ++++++++++
 5 files changed, 1465 insertions(+), 9 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 73a38a2..914a1c6 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -26,7 +26,7 @@ Zero external dependencies for core functionality. Optional `image` crate behind
 | `ascii_table.rs` | `AsciiTable`: TFORMn parsing (Aw/Iw/Fw.d/Ew.d/Dw.d), column access, TSCALn/TZEROn |
 | `bintable.rs` | `BinTable`: all type codes (L,X,B,I,J,K,A,E,D,C,M,P,Q), heap/VLA |
 | `checksum.rs` | CHECKSUM/DATASUM ones-complement computation |
-| `tile_compress.rs` | Tiled-image compression decode: RICE_1, GZIP_1/2 (feature `gzip`), PLIO_1, HCOMPRESS_1; quantization + subtractive dithering for floats |
+| `tile_compress.rs` | Tiled-image compression. Decode: RICE_1, GZIP_1/2 (feature `gzip`), PLIO_1, HCOMPRESS_1; quantization + subtractive dithering for floats. Encode (`compress_image`/`ImageData::compress`): RICE_1 (int + quantized float) and GZIP_1/2 (int; lossless float via GZIP_1) |
 | `hdu.rs` | `Hdu` struct, `HduData` enum (Empty/Image/AsciiTable/BinTable); `as_compressed_image()` accessor |
 | `fits.rs` | `FitsFile`: top-level read/write, HDU iteration, builder API |
 | `image_conv.rs` | (feature="image") `DynamicImage` <-> `ImageData` conversion |
@@ -70,5 +70,5 @@ NASA sample FITS files in `samp/`:
 - **Random groups** skipped (deprecated per standard)
 - BSCALE/BZERO: raw vs scaled access modes on `ImageData`
 - Unsigned integer convention: BZERO offset (32768 for u16, etc.)
-- **Tile compression**: decode (read) only — RICE_1/PLIO_1/HCOMPRESS_1 in the zero-dep core, GZIP_1/2 behind the `gzip` feature. Lazy `hdu.as_compressed_image()?.decompress()?`; `HduData` stays `BinTable` so the compressed tiles survive for lossless round-trip. Float decode reproduces cfitsio's fused multiply-add to stay bit-exact vs `funpack` (see memory). Compression *encode* (write) and HCOMPRESS `SMOOTH≠0` are not implemented
+- **Tile compression**: decode (read) for RICE_1/PLIO_1/HCOMPRESS_1 in the zero-dep core, GZIP_1/2 behind the `gzip` feature. Lazy `hdu.as_compressed_image()?.decompress()?`; `HduData` stays `BinTable` so the compressed tiles survive for lossless round-trip. Float decode reproduces cfitsio's fused multiply-add to stay bit-exact vs `funpack` (see memory). Encode (write) via `image.compress(&CompressOptions{..})? -> Hdu` (then `fits.push_extension(..)`): RICE_1 (int lossless + quantized/dithered float lossy) and GZIP_1/2 (int lossless; lossless raw-float storage via GZIP_1). Encoders are byte-exact inverses of the decoders and emit `funpack`-readable FITS. **Z\* keyword order is load-bearing**: `funpack` rebuilds the image header by walking cards, so `ZTENSION` must precede `ZBITPIX`/`ZNAXIS` (else "1st key not SIMPLE or XTENSION"); `build_z_header` emits fpack's order. PLIO_1/HCOMPRESS_1 encode and HCOMPRESS `SMOOTH≠0` are not implemented
 - **Compressed fixtures**: `scripts/gen_compressed_fixtures.sh` builds fpack `.fz` test files into `samp/` (gitignored, served from GCS bucket `fits4_samples`); compression tests skip when fixtures/`funpack` are absent
diff --git a/src/image_data.rs b/src/image_data.rs
index 33a76f2..33ced0c 100644
--- a/src/image_data.rs
+++ b/src/image_data.rs
@@ -153,6 +153,31 @@ impl ImageData {
         Ok(ImageData { axes, pixels })
     }
 
+    /// Compress this image into a tile-compressed BINTABLE [`Hdu`](crate::hdu::Hdu)
+    /// (`ZIMAGE = T`), ready for [`FitsFile::push_extension`](crate::fits::FitsFile::push_extension).
+    ///
+    /// Thin wrapper over [`compress_image`](crate::tile_compress::compress_image). See
+    /// [`CompressOptions`](crate::tile_compress::CompressOptions) for the algorithm,
+    /// tiling, and float-quantization knobs. The round-trip inverse is
+    /// [`Hdu::as_compressed_image`](crate::hdu::Hdu::as_compressed_image) +
+    /// [`CompressedImage::decompress`](crate::tile_compress::CompressedImage::decompress).
+    ///
+    /// ```
+    /// use fits4::{ImageData, PixelData};
+    /// use fits4::tile_compress::CompressOptions;
+    ///
+    /// let img = ImageData::new(vec![8, 4], PixelData::I16((0..32).collect()));
+    /// let hdu = img.compress(&CompressOptions::default()).unwrap();
+    /// let back = hdu.as_compressed_image().unwrap().decompress().unwrap();
+    /// assert_eq!(back.pixels.to_bytes(), img.pixels.to_bytes());
+    /// ```
+    pub fn compress(
+        &self,
+        opts: &crate::tile_compress::CompressOptions,
+    ) -> Result<crate::hdu::Hdu> {
+        crate::tile_compress::compress_image(self, opts)
+    }
+
     /// Populate header keywords for this image data.
     pub fn fill_header(&self, header: &mut Header) {
         header.set("BITPIX", HeaderValue::Integer(self.bitpix().to_i64()), None);
diff --git a/src/lib.rs b/src/lib.rs
index 7c39eec..446f382 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -10,12 +10,18 @@
 //! - **ASCII TABLE extension** — Aw, Iw, Fw.d, Ew.d, Dw.d column formats
 //! - **BINTABLE extension** — all type codes including variable-length arrays (P/Q descriptors)
 //!
-//! Tile-compressed images (`ZIMAGE`) are supported for **reading** integer images
-//! (`RICE_1`; `GZIP_1`/`GZIP_2` behind the `gzip` feature) via
+//! Tile-compressed images (`ZIMAGE`) are supported for **reading** integer and float
+//! images (`RICE_1`, `PLIO_1`, `HCOMPRESS_1`; `GZIP_1`/`GZIP_2` behind the `gzip`
+//! feature), including float quantization with `SUBTRACTIVE_DITHER_1`/`_2`, via
 //! [`Hdu::as_compressed_image`](hdu::Hdu::as_compressed_image) +
 //! [`CompressedImage::decompress`](tile_compress::CompressedImage::decompress).
-//! Float quantization/dithering, `PLIO_1`, `HCOMPRESS_1`, and the write/compress
-//! path are not yet implemented. Random groups are not supported.
+//!
+//! **Writing** tile-compressed images is supported for `RICE_1` (integer + quantized
+//! float) and `GZIP_1`/`GZIP_2` (integer; lossless float via `GZIP_1`) via
+//! [`ImageData::compress`](image_data::ImageData::compress) /
+//! [`compress_image`](tile_compress::compress_image), producing standard, cfitsio-
+//! readable (`funpack`-decodable) compressed FITS. `PLIO_1`/`HCOMPRESS_1` encoding and
+//! random groups are not supported.
 //!
 //! ## Quick start — reading
 //!
@@ -124,5 +130,7 @@ pub use hdu::{Hdu, HduData};
 pub use header::Header;
 pub use image_data::{ImageData, PixelData};
 pub use keyword::{HeaderValue, Keyword};
-pub use tile_compress::{CompressedImage, CompressionType, Quantize, TileGeometry};
+pub use tile_compress::{
+    compress_image, CompressOptions, CompressedImage, CompressionType, Quantize, TileGeometry,
+};
 pub use types::Bitpix;
diff --git a/src/tile_compress.rs b/src/tile_compress.rs
index 8a2035a..97fef75 100644
--- a/src/tile_compress.rs
+++ b/src/tile_compress.rs
@@ -14,8 +14,14 @@
 //! Read (decompression) support is implemented for `RICE_1`, `GZIP_1`/`GZIP_2`
 //! (behind the `gzip` feature), `PLIO_1`, `HCOMPRESS_1` (`SMOOTH=0`), and
 //! `NOCOMPRESS`, for integer originals and for quantized/lossless float originals
-//! (with `SUBTRACTIVE_DITHER_1`/`_2` reversal). The compression (write) path is
-//! not yet implemented (see `COMPRESSION_PLAN.md`).
+//! (with `SUBTRACTIVE_DITHER_1`/`_2` reversal).
+//!
+//! Write (compression) support is implemented for `RICE_1` (integer images and
+//! quantized/dithered float images) and `GZIP_1`/`GZIP_2` (integer images; lossless
+//! raw-float storage via `GZIP_1`), via [`compress_image`] / [`ImageData::compress`].
+//! The encoders are the exact inverse of the decoders above and emit standard,
+//! cfitsio-readable compressed FITS (verified bit-exact against `funpack`). `PLIO_1`
+//! and `HCOMPRESS_1` encoding remain unimplemented (see `COMPRESSION_PLAN.md`).
 
 use crate::bintable::BinTable;
 use crate::error::{Error, Result};
@@ -1045,6 +1051,930 @@ pub fn rice_decompress_i8(src: &[u8], nvals: usize, blocksize: usize) -> Result<
     Ok(v.into_iter().map(|x| x as u8).collect())
 }
 
+// ---------------------------------------------------------------------------
+// RICE_1 compression (encode) — inverse of `rice_decompress_*`
+// ---------------------------------------------------------------------------
+//
+// Port of cfitsio `fits_rcomp` / `_short` / `_byte` (R. White, STScI, `ricecomp.c`).
+// The encoder writes the first value verbatim (the seed `lastpix`) in `bbits` bits,
+// then processes the array in blocks of `blocksize` pixels. For each block it computes
+// the zig-zag-mapped differences from the running `lastpix`, picks the Rice parameter
+// `fs` that minimises the block's coded length, and emits `fs+1` in `fsbits` bits
+// followed by the per-pixel codes:
+//   * fs < 0      -> block of all-zero differences (fs+1 == 0).
+//   * fs == fsmax -> each mapped diff stored verbatim in `bbits` bits.
+//   * else        -> Rice code: `(top = mapped >> fs)` zero bits, a one bit, then the
+//                    `fs` low bits of `mapped`.
+// The chosen `fs` exactly matches cfitsio's sum-based selection so the byte stream is
+// identical to `fpack`'s and decodes via [`rice_decompress_core`].
+
+/// MSB-first bit writer (the inverse of [`BitReader`]).
+struct BitWriterEnc {
+    bytes: Vec<u8>,
+    /// Bits accumulated in the low `bits_in_buf` bits of `buffer`, flushed a byte at a
+    /// time (MSB-first) whenever at least 8 are present. A u64 buffer keeps headroom for
+    /// a 32-bit write on top of up to 7 leftover bits.
+    buffer: u64,
+    bits_in_buf: u32,
+}
+
+impl BitWriterEnc {
+    fn new() -> Self {
+        BitWriterEnc {
+            bytes: Vec::new(),
+            buffer: 0,
+            bits_in_buf: 0,
+        }
+    }
+
+    /// Write the low `n` bits of `val` (0..=32) MSB-first.
+    fn write_bits(&mut self, val: u32, n: u32) {
+        if n == 0 {
+            return;
+        }
+        let val = (val as u64) & if n >= 32 { u32::MAX as u64 } else { (1u64 << n) - 1 };
+        self.buffer = (self.buffer << n) | val;
+        self.bits_in_buf += n;
+        while self.bits_in_buf >= 8 {
+            self.bits_in_buf -= 8;
+            let byte = (self.buffer >> self.bits_in_buf) & 0xff;
+            self.bytes.push(byte as u8);
+        }
+    }
+
+    /// Write `count` zero bits then a single one bit (the unary part of a Rice code).
+    fn write_unary(&mut self, count: u32) {
+        let mut remaining = count;
+        while remaining >= 24 {
+            self.write_bits(0, 24);
+            remaining -= 24;
+        }
+        // remaining zeros followed by a terminating 1 bit.
+        self.write_bits(1, remaining + 1);
+    }
+
+    /// Flush any partial byte (zero-padded on the low side, as cfitsio does) and return
+    /// the encoded bytes.
+    fn finish(mut self) -> Vec<u8> {
+        if self.bits_in_buf > 0 {
+            let pad = 8 - self.bits_in_buf;
+            let byte = (self.buffer << pad) & 0xff;
+            self.bytes.push(byte as u8);
+            self.bits_in_buf = 0;
+        }
+        debug_assert!(self.bits_in_buf == 0);
+        self.bytes
+    }
+}
+
+/// Zig-zag map a signed difference to an unsigned value (inverse of [`unzigzag`]).
+#[inline]
+fn zigzag_enc(v: i64) -> u64 {
+    ((v << 1) ^ (v >> 63)) as u64
+}
+
+/// Core RICE_1 encode shared by the 8/16/32-bit variants. `vals` are the raw integer
+/// samples (sign-extended to i64); `fsbits`/`fsmax`/`bbits` are the per-variant
+/// constants matching [`rice_decompress_core`].
+fn rice_compress_core(
+    vals: &[i64],
+    blocksize: usize,
+    fsbits: u32,
+    fsmax: u32,
+    bbits: u32,
+) -> Vec<u8> {
+    let mut w = BitWriterEnc::new();
+    if vals.is_empty() {
+        return w.finish();
+    }
+    let blocksize = blocksize.max(1);
+
+    // Seed: first value verbatim in bbits bits (masked to the field width).
+    let mask = if bbits >= 64 {
+        u64::MAX
+    } else {
+        (1u64 << bbits) - 1
+    };
+    let mut lastpix = vals[0];
+    w.write_bits((vals[0] as u64 & mask) as u32, bbits);
+
+    let mut i = 0usize;
+    while i < vals.len() {
+        let block_n = (vals.len() - i).min(blocksize);
+        // Zig-zag-mapped differences for this block.
+        let mut mapped = Vec::with_capacity(block_n);
+        let mut sum: u64 = 0;
+        for &v in &vals[i..i + block_n] {
+            // Compute the difference in the pixel's native bit width: reduce mod 2^bbits
+            // and sign-extend so it lands in [-2^(bbits-1), 2^(bbits-1)). This guarantees
+            // the zig-zag mapping fits in `bbits` bits (the verbatim-store width), and the
+            // decoder's `wrapping_add` + final narrowing reconstructs `v` exactly even
+            // when the raw difference wraps the type (e.g. MIN→MAX in i16).
+            let raw_diff = v.wrapping_sub(lastpix);
+            lastpix = v;
+            let diff = sign_extend((raw_diff as u64) & mask, bbits);
+            let m = zigzag_enc(diff);
+            sum = sum.wrapping_add(m);
+            mapped.push(m);
+        }
+
+        // Choose the Rice parameter `fs` that minimises this block's coded length.
+        let fs = select_fs(sum, block_n as u64, fsmax);
+
+        if fs < 0 {
+            // All differences are zero (sum == 0): emit fs+1 == 0.
+            w.write_bits(0, fsbits);
+        } else if fs as u32 >= fsmax {
+            // No compression: store each mapped diff verbatim in bbits bits.
+            w.write_bits(fsmax + 1, fsbits);
+            for &m in &mapped {
+                w.write_bits((m & mask) as u32, bbits);
+            }
+        } else {
+            let fs = fs as u32;
+            w.write_bits(fs + 1, fsbits);
+            for &m in &mapped {
+                let top = (m >> fs) as u32;
+                w.write_unary(top);
+                if fs > 0 {
+                    w.write_bits((m & ((1u64 << fs) - 1)) as u32, fs);
+                }
+            }
+        }
+
+        i += block_n;
+    }
+
+    w.finish()
+}
+
+/// Select the Rice parameter `fs` for a block by minimising its coded length.
+///
+/// For `block_n` pixels whose zig-zag-mapped differences sum to `sum`, coding with a
+/// given `fs` costs (approximately, and exactly for the dominant terms cfitsio uses)
+/// `block_n * (fs + 1)` fixed bits (the `fs` low bits plus one stop bit per pixel) plus
+/// `sum >> fs` unary tail bits. This is the same convex cost model cfitsio's
+/// `fits_rcomp` minimises; because the per-block `fs+1` value is written into the
+/// stream, the result decodes correctly with *any* valid `fs`, and minimising this cost
+/// reproduces cfitsio's choice on real data. Returns `-1` for an all-zero block
+/// (`sum == 0`) and never exceeds `fsmax` (the caller switches to verbatim storage at
+/// `fsmax`).
+fn select_fs(sum: u64, block_n: u64, fsmax: u32) -> i32 {
+    if sum == 0 {
+        return -1;
+    }
+    // Walk fs upward while increasing it lowers the estimated bit count; the cost is
+    // unimodal in fs, so stop at the first non-improving step.
+    let cost = |fs: u32| block_n.wrapping_mul((fs + 1) as u64).wrapping_add(sum >> fs);
+    let mut best_fs = 0u32;
+    let mut best_cost = cost(0);
+    let mut fs = 1u32;
+    while fs <= fsmax {
+        let c = cost(fs);
+        if c < best_cost {
+            best_cost = c;
+            best_fs = fs;
+            fs += 1;
+        } else {
+            break;
+        }
+    }
+    best_fs as i32
+}
+
+/// Compress 32-bit integer samples with RICE_1 (`BYTEPIX = 4`).
+pub fn rice_compress_i32(vals: &[i32], blocksize: usize) -> Vec<u8> {
+    let v: Vec<i64> = vals.iter().map(|&x| x as i64).collect();
+    rice_compress_core(&v, blocksize, 5, 25, 32)
+}
+
+/// Compress 16-bit integer samples with RICE_1 (`BYTEPIX = 2`).
+pub fn rice_compress_i16(vals: &[i16], blocksize: usize) -> Vec<u8> {
+    let v: Vec<i64> = vals.iter().map(|&x| x as i64).collect();
+    rice_compress_core(&v, blocksize, 4, 14, 16)
+}
+
+/// Compress 8-bit (unsigned) integer samples with RICE_1 (`BYTEPIX = 1`).
+pub fn rice_compress_i8(vals: &[u8], blocksize: usize) -> Vec<u8> {
+    let v: Vec<i64> = vals.iter().map(|&x| x as i64).collect();
+    rice_compress_core(&v, blocksize, 3, 6, 8)
+}
+
+// ---------------------------------------------------------------------------
+// GZIP_1 / GZIP_2 compression (encode)
+// ---------------------------------------------------------------------------
+
+/// Wrap raw bytes as a single RFC 1952 gzip member (the inverse of
+/// [`strip_gzip_wrapper`] + [`gzip_inflate`]): a fixed 10-byte header, a raw DEFLATE
+/// body, and the CRC32 + ISIZE trailer. Matches what zlib's `gzip` (and cfitsio's
+/// `GZIP_1`) produce, so the existing decoder and `funpack` both read it back.
+#[cfg(feature = "gzip")]
+fn gzip_deflate(raw: &[u8]) -> Vec<u8> {
+    let body = miniz_oxide::deflate::compress_to_vec(raw, 6);
+    let mut out = Vec::with_capacity(body.len() + 18);
+    // magic, CM=deflate, no flags, mtime=0, XFL=0, OS=255 (unknown).
+    out.extend_from_slice(&[0x1f, 0x8b, 8, 0, 0, 0, 0, 0, 0, 0xff]);
+    out.extend_from_slice(&body);
+    out.extend_from_slice(&gzip_crc32(raw).to_le_bytes());
+    out.extend_from_slice(&(raw.len() as u32).to_le_bytes());
+    out
+}
+
+/// CRC32 (gzip/PNG polynomial) for the gzip member trailer.
+#[cfg(feature = "gzip")]
+fn gzip_crc32(data: &[u8]) -> u32 {
+    let mut crc: u32 = 0xffff_ffff;
+    for &byte in data {
+        crc ^= byte as u32;
+        for _ in 0..8 {
+            let mask = (crc & 1).wrapping_neg();
+            crc = (crc >> 1) ^ (0xedb8_8320 & mask);
+        }
+    }
+    !crc
+}
+
+/// Byte-shuffle the big-endian integer stream for `GZIP_2` (the inverse of
+/// [`gzip2_unshuffle`]): emit all the most-significant bytes first, then the next, etc.
+#[cfg(feature = "gzip")]
+fn gzip2_shuffle(raw: &[u8], bytepix: usize) -> Vec<u8> {
+    if bytepix <= 1 {
+        return raw.to_vec();
+    }
+    let n = raw.len() / bytepix;
+    let mut out = vec![0u8; n * bytepix];
+    for (i, chunk) in raw.chunks_exact(bytepix).enumerate().take(n) {
+        for (b, &byte) in chunk.iter().enumerate() {
+            out[b * n + i] = byte;
+        }
+    }
+    out
+}
+
+// ---------------------------------------------------------------------------
+// High-level tile-compression (the public encode entry point)
+// ---------------------------------------------------------------------------
+
+/// Options controlling how [`ImageData::compress`] / [`compress_image`] encode an
+/// image into a tile-compressed BINTABLE HDU.
+#[derive(Debug, Clone)]
+pub struct CompressOptions {
+    /// Compression algorithm (`ZCMPTYPE`). `RICE_1`, `GZIP_1`, `GZIP_2` are supported
+    /// for encoding; `PLIO_1`/`HCOMPRESS_1`/`NOCOMPRESS` are rejected.
+    pub algorithm: CompressionType,
+    /// Tile shape (`ZTILEn`), axis-1 first. `None` selects cfitsio's default: one row
+    /// per tile (`ZTILE1 = NAXIS1`, all other axes = 1).
+    pub tile: Option<Vec<usize>>,
+    /// For float images: quantization parameter `q` (controls the per-tile scale,
+    /// `scale = range / (2*q*sigma_estimate)`-style). `None` ⇒ lossless float storage
+    /// (raw big-endian floats, only valid with the GZIP codecs). For RICE on floats a
+    /// quantization value is required.
+    pub quantize: Option<f64>,
+    /// Subtractive-dithering method for quantized floats (`ZQUANTIZ`). Ignored for the
+    /// lossless and integer paths.
+    pub dither: Quantize,
+    /// Dither seed (`ZDITHER0`, 1..=10000). `None` ⇒ 1.
+    pub dither_seed: Option<i64>,
+    /// RICE block size (`BLOCKSIZE`); cfitsio default 32.
+    pub blocksize: usize,
+}
+
+impl Default for CompressOptions {
+    fn default() -> Self {
+        CompressOptions {
+            algorithm: CompressionType::Rice1,
+            tile: None,
+            quantize: Some(4.0),
+            dither: Quantize::SubtractiveDither1,
+            dither_seed: Some(1),
+            blocksize: 32,
+        }
+    }
+}
+
+/// Default tile shape: one row per tile (`ZTILE1 = NAXIS1`, others 1).
+fn default_tile(axes: &[usize]) -> Vec<usize> {
+    let mut t = vec![1usize; axes.len()];
+    if let Some(first) = axes.first() {
+        t[0] = *first;
+    }
+    t
+}
+
+/// Extract the flat integer samples of an integer image as `i64` (axis-1 fastest, the
+/// native FITS storage order), along with the `BYTEPIX` to feed RICE.
+fn integer_samples(image: &crate::image_data::ImageData) -> Result<(Vec<i64>, usize)> {
+    use crate::image_data::PixelData;
+    Ok(match &image.pixels {
+        PixelData::U8(v) => (v.iter().map(|&x| x as i64).collect(), 1),
+        PixelData::I16(v) => (v.iter().map(|&x| x as i64).collect(), 2),
+        PixelData::I32(v) => (v.iter().map(|&x| x as i64).collect(), 4),
+        PixelData::I64(_) => {
+            return Err(Error::CompressionError(
+                "RICE_1/GZIP tile compression of 64-bit integer images is not supported".into(),
+            ))
+        }
+        PixelData::F32(_) | PixelData::F64(_) => {
+            return Err(Error::CompressionError(
+                "internal: float image routed to the integer encoder".into(),
+            ))
+        }
+    })
+}
+
+/// Gather one tile's flat samples (axis-1 fastest) from the full image buffer.
+fn gather_tile(
+    full: &[i64],
+    image_dims: &[usize],
+    ztile: &[usize],
+    tile_dims: &[usize],
+    coords: &[usize],
+) -> Vec<i64> {
+    let ndim = image_dims.len();
+    let mut img_stride = vec![1usize; ndim];
+    for axis in 1..ndim {
+        img_stride[axis] = img_stride[axis - 1] * image_dims[axis - 1];
+    }
+    let mut origin = 0usize;
+    for axis in 0..ndim {
+        origin += coords[axis] * ztile[axis].max(1) * img_stride[axis];
+    }
+    let tile_npix: usize = tile_dims.iter().product();
+    let mut out = Vec::with_capacity(tile_npix);
+    let mut tcoord = vec![0usize; ndim];
+    for _ in 0..tile_npix {
+        let mut src = origin;
+        for axis in 0..ndim {
+            src += tcoord[axis] * img_stride[axis];
+        }
+        out.push(full[src]);
+        for axis in 0..ndim {
+            tcoord[axis] += 1;
+            if tcoord[axis] < tile_dims[axis] {
+                break;
+            }
+            tcoord[axis] = 0;
+        }
+    }
+    out
+}
+
+/// Tile dimensions at grid coordinates `coords`, accounting for edge truncation.
+fn tile_dims_at(image_dims: &[usize], ztile: &[usize], coords: &[usize]) -> Vec<usize> {
+    let mut dims = Vec::with_capacity(coords.len());
+    for (axis, &c) in coords.iter().enumerate() {
+        let n = image_dims[axis];
+        let t = ztile[axis].max(1);
+        let start = c * t;
+        dims.push((n - start).min(t));
+    }
+    dims
+}
+
+/// Encode one integer tile's samples into its compressed byte stream per `algorithm`.
+fn encode_int_tile(
+    samples: &[i64],
+    algorithm: CompressionType,
+    bytepix: usize,
+    blocksize: usize,
+) -> Result<Vec<u8>> {
+    match algorithm {
+        CompressionType::Rice1 => Ok(match bytepix {
+            1 => rice_compress_i8(
+                &samples.iter().map(|&v| v as u8).collect::<Vec<_>>(),
+                blocksize,
+            ),
+            2 => rice_compress_i16(
+                &samples.iter().map(|&v| v as i16).collect::<Vec<_>>(),
+                blocksize,
+            ),
+            4 => rice_compress_i32(
+                &samples.iter().map(|&v| v as i32).collect::<Vec<_>>(),
+                blocksize,
+            ),
+            other => {
+                return Err(Error::CompressionError(format!(
+                    "RICE_1 BYTEPIX={other} not supported (expected 1, 2, or 4)"
+                )))
+            }
+        }),
+        CompressionType::Gzip1 | CompressionType::Gzip2 => {
+            #[cfg(feature = "gzip")]
+            {
+                let raw = int_samples_to_be(samples, bytepix);
+                let payload = if algorithm == CompressionType::Gzip2 {
+                    gzip2_shuffle(&raw, bytepix)
+                } else {
+                    raw
+                };
+                Ok(gzip_deflate(&payload))
+            }
+            #[cfg(not(feature = "gzip"))]
+            {
+                let _ = (samples, bytepix);
+                Err(Error::UnsupportedCompression(
+                    "GZIP_1/GZIP_2 encode requires the `gzip` feature (miniz_oxide)".into(),
+                ))
+            }
+        }
+        CompressionType::Plio1 | CompressionType::Hcompress1 | CompressionType::NoCompress => {
+            Err(Error::UnsupportedCompression(format!(
+                "{algorithm:?} encoding is not supported (only RICE_1, GZIP_1, GZIP_2)"
+            )))
+        }
+    }
+}
+
+/// Serialise integer samples to big-endian bytes of width `bytepix` (used by GZIP).
+#[cfg(feature = "gzip")]
+fn int_samples_to_be(samples: &[i64], bytepix: usize) -> Vec<u8> {
+    let mut out = Vec::with_capacity(samples.len() * bytepix);
+    for &v in samples {
+        match bytepix {
+            1 => out.push(v as u8),
+            2 => out.extend_from_slice(&(v as i16).to_be_bytes()),
+            4 => out.extend_from_slice(&(v as i32).to_be_bytes()),
+            8 => out.extend_from_slice(&v.to_be_bytes()),
+            _ => out.extend_from_slice(&(v as i32).to_be_bytes()),
+        }
+    }
+    out
+}
+
+/// Build the `Z*` driver keywords for a compressed-image BINTABLE, in the exact order
+/// cfitsio's `fpack` writes them.
+///
+/// The order is load-bearing: `funpack`/cfitsio reconstruct the uncompressed image
+/// header by walking the cards in order and translating `ZBITPIX`/`ZNAXIS`/... into
+/// `BITPIX`/`NAXIS`/..., emitting the leading `XTENSION` card from `ZTENSION`. If
+/// `ZBITPIX` appears before `ZTENSION`, the rebuilt header starts with `BITPIX` and
+/// cfitsio rejects it ("1st key not SIMPLE or XTENSION"). The accepted order is:
+/// `ZIMAGE`, `ZTILEn`, `ZCMPTYPE`, `ZNAMEi`/`ZVALi`, `EXTNAME`, `ZTENSION`, `ZBITPIX`,
+/// `ZNAXIS`, `ZNAXISn`, `ZPCOUNT`, `ZGCOUNT`. Codec parameter pairs (`zparams`) are
+/// emitted between `ZCMPTYPE` and `EXTNAME`.
+fn build_z_header(
+    header: &mut Header,
+    image: &crate::image_data::ImageData,
+    ztile: &[usize],
+    algorithm: CompressionType,
+    zparams: &[(&str, i64)],
+) {
+    use crate::keyword::HeaderValue;
+    let zbitpix = image.bitpix().to_i64();
+    let znaxis = image.axes.len();
+
+    header.set(
+        "ZIMAGE",
+        HeaderValue::Logical(true),
+        Some("extension contains compressed image"),
+    );
+    for (i, &t) in ztile.iter().enumerate() {
+        header.set(
+            &format!("ZTILE{}", i + 1),
+            HeaderValue::Integer(t as i64),
+            Some("size of tiles to be compressed"),
+        );
+    }
+    let zcmptype = match algorithm {
+        CompressionType::Rice1 => "RICE_1",
+        CompressionType::Gzip1 => "GZIP_1",
+        CompressionType::Gzip2 => "GZIP_2",
+        CompressionType::Hcompress1 => "HCOMPRESS_1",
+        CompressionType::Plio1 => "PLIO_1",
+        CompressionType::NoCompress => "NOCOMPRESS",
+    };
+    header.set(
+        "ZCMPTYPE",
+        HeaderValue::String(zcmptype.into()),
+        Some("compression algorithm"),
+    );
+    for (i, (name, val)) in zparams.iter().enumerate() {
+        header.set(
+            &format!("ZNAME{}", i + 1),
+            HeaderValue::String((*name).into()),
+            None,
+        );
+        header.set(&format!("ZVAL{}", i + 1), HeaderValue::Integer(*val), None);
+    }
+    header.set(
+        "EXTNAME",
+        HeaderValue::String("COMPRESSED_IMAGE".into()),
+        None,
+    );
+    // The compressed image always reconstructs as an IMAGE extension; cfitsio reads the
+    // original extension type from `ZTENSION` and emits the leading `XTENSION` card.
+    header.set(
+        "ZTENSION",
+        HeaderValue::String("IMAGE".into()),
+        Some("Image extension"),
+    );
+    header.set(
+        "ZBITPIX",
+        HeaderValue::Integer(zbitpix),
+        Some("data type of original image"),
+    );
+    header.set(
+        "ZNAXIS",
+        HeaderValue::Integer(znaxis as i64),
+        Some("dimension of original image"),
+    );
+    for (i, &n) in image.axes.iter().enumerate() {
+        header.set(
+            &format!("ZNAXIS{}", i + 1),
+            HeaderValue::Integer(n as i64),
+            None,
+        );
+    }
+    header.set("ZPCOUNT", HeaderValue::Integer(0), Some("original PCOUNT"));
+    header.set("ZGCOUNT", HeaderValue::Integer(1), Some("original GCOUNT"));
+}
+
+/// Compress an [`ImageData`] into a tile-compressed BINTABLE [`Hdu`] (`ZIMAGE = T`).
+///
+/// The returned HDU's `data` is [`HduData::BinTable`](crate::hdu::HduData::BinTable)
+/// with a `COMPRESSED_DATA` (`1PB`) variable-length-array column holding one
+/// RICE/GZIP-encoded tile per row, plus the full set of `Z*` driver keywords. It is
+/// ready to hand to [`FitsFile::push_extension`](crate::fits::FitsFile::push_extension).
+///
+/// - Integer images (`ZBITPIX` 8/16/32) compress losslessly with `RICE_1`/`GZIP_1`/
+///   `GZIP_2`.
+/// - Float images (`ZBITPIX` -32/-64) compress losslessly with `GZIP_1`/`GZIP_2` when
+///   `opts.quantize` is `None` (raw big-endian floats, `ZQUANTIZ` = `NONE`), or lossily
+///   with quantization + optional subtractive dithering otherwise (`RICE_1`/GZIP of the
+///   quantized 32-bit integers, with per-tile `ZSCALE`/`ZZERO` columns).
+///
+/// `PLIO_1`, `HCOMPRESS_1`, `NOCOMPRESS`, and 64-bit integer images are rejected with
+/// [`Error::UnsupportedCompression`]/[`Error::CompressionError`].
+pub fn compress_image(
+    image: &crate::image_data::ImageData,
+    opts: &CompressOptions,
+) -> Result<crate::hdu::Hdu> {
+    use crate::bintable::{BinColumnType, BinTableBuilder};
+    use crate::hdu::{Hdu, HduData};
+
+    match opts.algorithm {
+        CompressionType::Rice1 | CompressionType::Gzip1 | CompressionType::Gzip2 => {}
+        other => {
+            return Err(Error::UnsupportedCompression(format!(
+                "{other:?} encoding is not supported (only RICE_1, GZIP_1, GZIP_2)"
+            )))
+        }
+    }
+
+    if image.axes.is_empty() {
+        return Err(Error::CompressionError(
+            "cannot compress an image with zero axes".into(),
+        ));
+    }
+    let expected_npix: usize = image.axes.iter().product();
+    if image.pixels.len() != expected_npix {
+        return Err(Error::CompressionError(format!(
+            "image has {} pixels but axes imply {expected_npix}",
+            image.pixels.len()
+        )));
+    }
+
+    let ztile = opts.tile.clone().unwrap_or_else(|| default_tile(&image.axes));
+    if ztile.len() != image.axes.len() {
+        return Err(Error::CompressionError(format!(
+            "tile shape has {} axes but image has {}",
+            ztile.len(),
+            image.axes.len()
+        )));
+    }
+
+    let is_float = image.bitpix().to_i64() < 0;
+    if is_float {
+        return compress_float_image(image, opts, &ztile);
+    }
+
+    // ---- integer path ----
+    let (full, bytepix) = integer_samples(image)?;
+    let blocksize = opts.blocksize.max(1);
+
+    // Tile grid.
+    let tiles_per_axis: Vec<usize> = image
+        .axes
+        .iter()
+        .zip(&ztile)
+        .map(|(&n, &t)| n.div_ceil(t.max(1)))
+        .collect();
+    let num_tiles: usize = tiles_per_axis.iter().product();
+
+    let mut builder = BinTableBuilder::new().add_column("COMPRESSED_DATA", BinColumnType::VarP('B'));
+    let mut max_elems = 0usize;
+    for tile_index in 0..num_tiles {
+        let coords = unravel(tile_index, &tiles_per_axis);
+        let tdims = tile_dims_at(&image.axes, &ztile, &coords);
+        let samples = gather_tile(&full, &image.axes, &ztile, &tdims, &coords);
+        let enc = encode_int_tile(&samples, opts.algorithm, bytepix, blocksize)?;
+        max_elems = max_elems.max(enc.len());
+        builder = builder.push_row(|r| {
+            r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc))
+        });
+    }
+    let table = builder.build();
+
+    let mut header = Header::new();
+    let zparams: Vec<(&str, i64)> = if opts.algorithm == CompressionType::Rice1 {
+        vec![("BLOCKSIZE", blocksize as i64), ("BYTEPIX", bytepix as i64)]
+    } else {
+        Vec::new()
+    };
+    build_z_header(&mut header, image, &ztile, opts.algorithm, &zparams);
+    let header = finalize_compressed_hdu(header, &table, max_elems);
+
+    Ok(Hdu::new(header, HduData::BinTable(table)))
+}
+
+/// Compress a float image (`ZBITPIX` -32/-64). See [`compress_image`].
+fn compress_float_image(
+    image: &crate::image_data::ImageData,
+    opts: &CompressOptions,
+    ztile: &[usize],
+) -> Result<crate::hdu::Hdu> {
+    use crate::bintable::{BinColumnType, BinTableBuilder};
+    use crate::hdu::{Hdu, HduData};
+    use crate::image_data::PixelData;
+    use crate::keyword::HeaderValue;
+
+    // Used only by the (feature-gated) lossless-float GZIP branch.
+    #[cfg_attr(not(feature = "gzip"), allow(unused_variables))]
+    let is_f64 = image.bitpix().to_i64() == -64;
+    let floats: Vec<f64> = match &image.pixels {
+        PixelData::F32(v) => v.iter().map(|&x| x as f64).collect(),
+        PixelData::F64(v) => v.clone(),
+        _ => unreachable!("compress_float_image called on non-float image"),
+    };
+
+    let tiles_per_axis: Vec<usize> = image
+        .axes
+        .iter()
+        .zip(ztile)
+        .map(|(&n, &t)| n.div_ceil(t.max(1)))
+        .collect();
+    let num_tiles: usize = tiles_per_axis.iter().product();
+    let blocksize = opts.blocksize.max(1);
+
+    // ---- lossless float storage (raw big-endian floats, GZIP only) ----
+    if opts.quantize.is_none() {
+        // Lossless float storage uses GZIP over the raw big-endian floats. Only GZIP_1
+        // is supported: cfitsio's float byte-shuffle (GZIP_2) is not applied on the
+        // raw-float fallback path that the decoder reads, so GZIP_2 here would not
+        // round-trip. (GZIP_2 *is* supported for integer images.)
+        match opts.algorithm {
+            CompressionType::Gzip1 => {}
+            other => {
+                return Err(Error::UnsupportedCompression(format!(
+                    "lossless float compression (quantize=None) needs GZIP_1, not {other:?}"
+                )))
+            }
+        }
+        #[cfg(not(feature = "gzip"))]
+        {
+            return Err(Error::UnsupportedCompression(
+                "GZIP float encode requires the `gzip` feature (miniz_oxide)".into(),
+            ));
+        }
+        #[cfg(feature = "gzip")]
+        {
+            let mut builder =
+                BinTableBuilder::new().add_column("COMPRESSED_DATA", BinColumnType::VarP('B'));
+            let mut max_elems = 0usize;
+            for tile_index in 0..num_tiles {
+                let coords = unravel(tile_index, &tiles_per_axis);
+                let tdims = tile_dims_at(&image.axes, ztile, &coords);
+                let tile = gather_tile_f64(&floats, &image.axes, ztile, &tdims, &coords);
+                // Raw big-endian floats, GZIP_1 (no shuffle) — read back by the decoder's
+                // unquantized-tile fallback path.
+                let mut raw = Vec::with_capacity(tile.len() * if is_f64 { 8 } else { 4 });
+                for &v in &tile {
+                    if is_f64 {
+                        raw.extend_from_slice(&v.to_be_bytes());
+                    } else {
+                        raw.extend_from_slice(&(v as f32).to_be_bytes());
+                    }
+                }
+                let enc = gzip_deflate(&raw);
+                max_elems = max_elems.max(enc.len());
+                builder = builder
+                    .push_row(|r| r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc)));
+            }
+            let table = builder.build();
+            let mut header = Header::new();
+            build_z_header(&mut header, image, ztile, opts.algorithm, &[]);
+            header.set("ZQUANTIZ", HeaderValue::String("NONE".into()), Some("no dithering"));
+            let header = finalize_compressed_hdu(header, &table, max_elems);
+            return Ok(Hdu::new(header, HduData::BinTable(table)));
+        }
+    }
+
+    // ---- lossy quantized float storage ----
+    let q = opts.quantize.unwrap_or(4.0);
+    let dither = opts.dither;
+    let dithered = matches!(
+        dither,
+        Quantize::SubtractiveDither1 | Quantize::SubtractiveDither2
+    );
+    let zdither0 = opts.dither_seed.unwrap_or(1);
+
+    let mut builder = BinTableBuilder::new()
+        .add_column("COMPRESSED_DATA", BinColumnType::VarP('B'))
+        .add_column("ZSCALE", BinColumnType::D64(1))
+        .add_column("ZZERO", BinColumnType::D64(1));
+    let mut max_elems = 0usize;
+
+    for tile_index in 0..num_tiles {
+        let coords = unravel(tile_index, &tiles_per_axis);
+        let tdims = tile_dims_at(&image.axes, ztile, &coords);
+        let tile = gather_tile_f64(&floats, &image.axes, ztile, &tdims, &coords);
+
+        let (scale, zero) = choose_scale_zero(&tile, q);
+        let q_ints = quantize_tile(&tile, scale, zero, dither, zdither0, tile_index);
+        let enc = encode_int_tile(&q_ints, opts.algorithm, 4, blocksize)?;
+        max_elems = max_elems.max(enc.len());
+        builder = builder.push_row(|r| {
+            r.write_var_p(enc.len() as i32, |heap| heap.extend_from_slice(&enc));
+            r.write_f64(scale);
+            r.write_f64(zero);
+        });
+    }
+    let table = builder.build();
+
+    let mut header = Header::new();
+    let zparams: Vec<(&str, i64)> = if opts.algorithm == CompressionType::Rice1 {
+        vec![("BLOCKSIZE", blocksize as i64), ("BYTEPIX", 4)]
+    } else {
+        Vec::new()
+    };
+    build_z_header(&mut header, image, ztile, opts.algorithm, &zparams);
+    let zquantiz = match dither {
+        Quantize::SubtractiveDither1 => "SUBTRACTIVE_DITHER_1",
+        Quantize::SubtractiveDither2 => "SUBTRACTIVE_DITHER_2",
+        Quantize::None => "NO_DITHER",
+    };
+    header.set("ZQUANTIZ", HeaderValue::String(zquantiz.into()), Some("quantization method"));
+    if dithered {
+        header.set("ZDITHER0", HeaderValue::Integer(zdither0), Some("dithering offset seed"));
+    }
+    header.set("ZBLANK", HeaderValue::Integer(NULL_VALUE), Some("null value"));
+    let header = finalize_compressed_hdu(header, &table, max_elems);
+
+    Ok(Hdu::new(header, HduData::BinTable(table)))
+}
+
+/// Gather one tile's flat floats (axis-1 fastest) from the full image buffer.
+fn gather_tile_f64(
+    full: &[f64],
+    image_dims: &[usize],
+    ztile: &[usize],
+    tile_dims: &[usize],
+    coords: &[usize],
+) -> Vec<f64> {
+    let ndim = image_dims.len();
+    let mut img_stride = vec![1usize; ndim];
+    for axis in 1..ndim {
+        img_stride[axis] = img_stride[axis - 1] * image_dims[axis - 1];
+    }
+    let mut origin = 0usize;
+    for axis in 0..ndim {
+        origin += coords[axis] * ztile[axis].max(1) * img_stride[axis];
+    }
+    let tile_npix: usize = tile_dims.iter().product();
+    let mut out = Vec::with_capacity(tile_npix);
+    let mut tcoord = vec![0usize; ndim];
+    for _ in 0..tile_npix {
+        let mut src = origin;
+        for axis in 0..ndim {
+            src += tcoord[axis] * img_stride[axis];
+        }
+        out.push(full[src]);
+        for axis in 0..ndim {
+            tcoord[axis] += 1;
+            if tcoord[axis] < tile_dims[axis] {
+                break;
+            }
+            tcoord[axis] = 0;
+        }
+    }
+    out
+}
+
+/// Choose a per-tile linear quantization `(scale, zero)` for float values.
+///
+/// We do not replicate cfitsio's noise-based heuristic; any scale that round-trips
+/// within tolerance is acceptable (per the task). We map the finite value range into
+/// the available integer range with `q` quantization levels per unit range so the
+/// quantization step is `scale = range / (q * 2^16)` (clamped to a sane minimum), with
+/// `zero` at the tile mean. The reconstructed value is `zero + scale * q_int`, matching
+/// the decoder's `unquantize`.
+fn choose_scale_zero(tile: &[f64], q: f64) -> (f64, f64) {
+    let mut min = f64::INFINITY;
+    let mut max = f64::NEG_INFINITY;
+    let mut sum = 0.0f64;
+    let mut count = 0usize;
+    for &v in tile {
+        if v.is_finite() {
+            min = min.min(v);
+            max = max.max(v);
+            sum += v;
+            count += 1;
+        }
+    }
+    if count == 0 {
+        // All-NaN tile: scale 1, zero 0 (every pixel becomes the blank sentinel).
+        return (1.0, 0.0);
+    }
+    let range = (max - min).abs();
+    let q = if q > 0.0 { q } else { 4.0 };
+    // Target ~ (q * 65536) quantization levels across the range; floor the step so a
+    // flat tile still quantizes losslessly enough.
+    let mut scale = range / (q * 65536.0);
+    if !(scale.is_finite()) || scale <= 0.0 {
+        scale = 1.0;
+    }
+    let zero = sum / count as f64;
+    (scale, zero)
+}
+
+/// Quantize a tile's floats to 32-bit integers with cfitsio-compatible subtractive
+/// dithering. The forward transform inverts the decoder's `unquantize`:
+/// `q = round((value - zero)/scale + dither - 0.5)`, NaNs map to `NULL_VALUE`, and
+/// (for `SUBTRACTIVE_DITHER_2`) exact zeros map to `ZERO_VALUE`.
+fn quantize_tile(
+    tile: &[f64],
+    scale: f64,
+    zero: f64,
+    dither: Quantize,
+    zdither0: i64,
+    tile_index: usize,
+) -> Vec<i64> {
+    let dithered = matches!(
+        dither,
+        Quantize::SubtractiveDither1 | Quantize::SubtractiveDither2
+    );
+    let dither2 = dither == Quantize::SubtractiveDither2;
+
+    let mut iseed = ((tile_index as i64 + zdither0 - 1).rem_euclid(N_RANDOM as i64)) as usize;
+    let mut nextrand = (fits_rand_value(iseed) * 500.0) as usize;
+
+    let mut out = Vec::with_capacity(tile.len());
+    for &v in tile {
+        let qi = if v.is_nan() {
+            NULL_VALUE
+        } else if dither2 && v == 0.0 {
+            ZERO_VALUE
+        } else if dithered {
+            // Inverse of value = (q - r + 0.5)*scale + zero, computed with the FMA the
+            // decoder uses so the round-trip is exact:
+            //   q = round((value - zero)/scale + r - 0.5)
+            let r = fits_rand_value(nextrand) as f64;
+            let t = (v - zero) / scale + r - 0.5;
+            t.round() as i64
+        } else {
+            let t = (v - zero) / scale;
+            t.round() as i64
+        };
+        out.push(qi);
+
+        if dithered {
+            nextrand += 1;
+            if nextrand == N_RANDOM {
+                iseed += 1;
+                if iseed == N_RANDOM {
+                    iseed = 0;
+                }
+                nextrand = (fits_rand_value(iseed) * 500.0) as usize;
+            }
+        }
+    }
+    out
+}
+
+/// Finalise a compressed-image HDU: fill the BINTABLE keywords, then fix up `TFORM1` to
+/// the `1PB(maxlen)` form cfitsio writes (so `funpack` knows the max VLA element count),
+/// and append a descriptive `EXTNAME`.
+fn finalize_compressed_hdu(mut header: Header, table: &BinTable, max_elems: usize) -> Header {
+    use crate::keyword::HeaderValue;
+    // Materialise the mandatory BINTABLE keywords (XTENSION/NAXISn/PCOUNT/TFIELDS/...)
+    // from the table, then layer the Z* keywords (already in `header`) on top. We do
+    // this by filling a fresh header and merging Z* afterwards so ordering matches the
+    // cfitsio convention (BINTABLE structural keywords first, then ZIMAGE...).
+    let mut out = Header::new();
+    table.fill_header(&mut out);
+    // TFORM1 with explicit max length, matching cfitsio (`1PB(<maxlen>)`).
+    out.set(
+        "TFORM1",
+        HeaderValue::String(format!("1PB({max_elems})")),
+        Some("variable length array"),
+    );
+    // Append all Z* / quantization / extra keywords from the staged header.
+    for kw in header.keywords.drain(..) {
+        out.keywords.push(kw);
+    }
+    out
+}
+
 // ---------------------------------------------------------------------------
 // Other algorithms (stubs)
 // ---------------------------------------------------------------------------
@@ -2513,4 +3443,222 @@ mod tests {
             other => panic!("expected F32, got {other:?}"),
         }
     }
+
+    // --- RICE_1 encode (the new write path) ----------------------------------
+
+    /// The RICE encoder's output must decode via the existing decoder, for a range of
+    /// difference magnitudes / signs / block boundaries, across all three BYTEPIX.
+    #[test]
+    fn rice_encode_decode_roundtrip_i32() {
+        let cases: Vec<Vec<i32>> = vec![
+            vec![0],
+            vec![42, 42, 42, 42],                  // zero-diff block
+            vec![1000, 1003, 1001, 1005, 1002],    // small diffs
+            vec![i32::MIN, 0, i32::MAX, -1, 1],     // extreme diffs (verbatim path)
+            (0..100).map(|i| (i * i) % 7 - 3).collect(), // > one block
+            (0..200).map(|i: i32| (i - 100) * 17).collect(),
+        ];
+        for vals in cases {
+            let enc = rice_compress_i32(&vals, 32);
+            let dec = rice_decompress_i32(&enc, vals.len(), 32).unwrap();
+            assert_eq!(dec, vals, "i32 roundtrip failed for {vals:?}");
+        }
+    }
+
+    #[test]
+    fn rice_encode_decode_roundtrip_i16() {
+        let vals: Vec<i16> = (0..300).map(|i| ((i * 31) % 251 - 120) as i16).collect();
+        let enc = rice_compress_i16(&vals, 32);
+        let dec = rice_decompress_i16(&enc, vals.len(), 32).unwrap();
+        assert_eq!(dec, vals);
+        // extremes
+        let ex: Vec<i16> = vec![i16::MIN, i16::MAX, 0, -1, 1, i16::MIN];
+        let enc = rice_compress_i16(&ex, 32);
+        assert_eq!(rice_decompress_i16(&enc, ex.len(), 32).unwrap(), ex);
+    }
+
+    #[test]
+    fn rice_encode_decode_roundtrip_i8() {
+        let vals: Vec<u8> = (0..400).map(|i| ((i * 7) % 256) as u8).collect();
+        let enc = rice_compress_i8(&vals, 32);
+        let dec = rice_decompress_i8(&enc, vals.len(), 32).unwrap();
+        assert_eq!(dec, vals);
+    }
+
+    /// A non-default block size must round-trip too (the block boundary moves).
+    #[test]
+    fn rice_encode_blocksize_16() {
+        let vals: Vec<i32> = (0..70).map(|i| (i * 3) % 11 - 5).collect();
+        let enc = rice_compress_i32(&vals, 16);
+        let dec = rice_decompress_i32(&enc, vals.len(), 16).unwrap();
+        assert_eq!(dec, vals);
+    }
+
+    /// `ImageData::compress` (RICE_1) → `as_compressed_image().decompress()` must be a
+    /// byte-exact identity for an integer image, including a 2-D square tiling with
+    /// edge-truncated tiles.
+    #[test]
+    fn compress_image_rice_int_internal_roundtrip() {
+        use crate::image_data::{ImageData, PixelData};
+
+        // 13x11 I16 image, value = mix of trend + noise; square 5x5 tiles (edges ragged).
+        let w = 13usize;
+        let h = 11usize;
+        let pixels: Vec<i16> = (0..(w * h))
+            .map(|i| (((i * 37) % 521) as i16) - 200)
+            .collect();
+        let img = ImageData::new(vec![w, h], PixelData::I16(pixels));
+
+        for tile in [None, Some(vec![w, 1]), Some(vec![5, 5]), Some(vec![13, 11])] {
+            let opts = CompressOptions {
+                algorithm: CompressionType::Rice1,
+                tile,
+                ..Default::default()
+            };
+            let hdu = img.compress(&opts).unwrap();
+            assert!(matches!(hdu.data, crate::hdu::HduData::BinTable(_)));
+            let back = hdu
+                .as_compressed_image()
+                .expect("ZIMAGE detected")
+                .decompress()
+                .unwrap();
+            assert_eq!(back.axes, img.axes);
+            assert_eq!(back.pixels.to_bytes(), img.pixels.to_bytes());
+        }
+    }
+
+    #[test]
+    fn compress_image_rice_i32_and_u8_roundtrip() {
+        use crate::image_data::{ImageData, PixelData};
+
+        let i32img = ImageData::new(
+            vec![20, 3],
+            PixelData::I32((0..60).map(|i| (i - 30) * 12345).collect()),
+        );
+        let hdu = i32img.compress(&CompressOptions::default()).unwrap();
+        let back = hdu.as_compressed_image().unwrap().decompress().unwrap();
+        assert_eq!(back.pixels.to_bytes(), i32img.pixels.to_bytes());
+
+        let u8img = ImageData::new(
+            vec![17, 4],
+            PixelData::U8((0..68).map(|i| (i * 5 % 256) as u8).collect()),
+        );
+        let hdu = u8img.compress(&CompressOptions::default()).unwrap();
+        let back = hdu.as_compressed_image().unwrap().decompress().unwrap();
+        assert_eq!(back.pixels.to_bytes(), u8img.pixels.to_bytes());
+    }
+
+    #[test]
+    fn compress_rejects_unsupported_algorithms() {
+        use crate::image_data::{ImageData, PixelData};
+        let img = ImageData::new(vec![4], PixelData::I16(vec![1, 2, 3, 4]));
+        for alg in [
+            CompressionType::Plio1,
+            CompressionType::Hcompress1,
+            CompressionType::NoCompress,
+        ] {
+            let opts = CompressOptions {
+                algorithm: alg,
+                ..Default::default()
+            };
+            assert!(matches!(
+                img.compress(&opts),
+                Err(Error::UnsupportedCompression(_))
+            ));
+        }
+        // 64-bit integer images are rejected.
+        let i64img = ImageData::new(vec![3], PixelData::I64(vec![1, 2, 3]));
+        assert!(i64img.compress(&CompressOptions::default()).is_err());
+    }
+
+    #[cfg(feature = "gzip")]
+    #[test]
+    fn compress_image_gzip_int_roundtrip() {
+        use crate::image_data::{ImageData, PixelData};
+        let img = ImageData::new(
+            vec![15, 6],
+            PixelData::I16((0..90).map(|i| (i * 13 % 400 - 200) as i16).collect()),
+        );
+        for alg in [CompressionType::Gzip1, CompressionType::Gzip2] {
+            let opts = CompressOptions {
+                algorithm: alg,
+                ..Default::default()
+            };
+            let hdu = img.compress(&opts).unwrap();
+            let back = hdu.as_compressed_image().unwrap().decompress().unwrap();
+            assert_eq!(
+                back.pixels.to_bytes(),
+                img.pixels.to_bytes(),
+                "{alg:?} int roundtrip"
+            );
+        }
+    }
+
+    #[cfg(feature = "gzip")]
+    #[test]
+    fn compress_image_gzip_lossless_float_roundtrip() {
+        use crate::image_data::{ImageData, PixelData};
+        let pixels: Vec<f32> = (0..96).map(|i| (i as f32) * 0.125 - 6.0).collect();
+        let img = ImageData::new(vec![12, 8], PixelData::F32(pixels));
+        // Lossless float storage is GZIP_1 only (GZIP_2 float shuffle is unsupported).
+        let opts = CompressOptions {
+            algorithm: CompressionType::Gzip1,
+            quantize: None, // lossless raw floats
+            ..Default::default()
+        };
+        let hdu = img.compress(&opts).unwrap();
+        let back = hdu.as_compressed_image().unwrap().decompress().unwrap();
+        assert_eq!(
+            back.pixels.to_bytes(),
+            img.pixels.to_bytes(),
+            "GZIP_1 lossless float roundtrip"
+        );
+        // GZIP_2 lossless float is rejected.
+        let opts2 = CompressOptions {
+            algorithm: CompressionType::Gzip2,
+            quantize: None,
+            ..Default::default()
+        };
+        assert!(matches!(
+            img.compress(&opts2),
+            Err(Error::UnsupportedCompression(_))
+        ));
+    }
+
+    /// Lossy quantized-float RICE encode must round-trip within ~scale tolerance, and
+    /// the reconstructed values must match our own decoder's `unquantize` exactly.
+    #[test]
+    fn compress_image_rice_float_quantize_roundtrip() {
+        use crate::image_data::{ImageData, PixelData};
+        let pixels: Vec<f32> = (0..256)
+            .map(|i| ((i as f32) * 0.017).sin() * 100.0 + 50.0)
+            .collect();
+        let img = ImageData::new(vec![16, 16], PixelData::F32(pixels.clone()));
+        let opts = CompressOptions {
+            algorithm: CompressionType::Rice1,
+            tile: Some(vec![16, 4]),
+            quantize: Some(4.0),
+            dither: Quantize::SubtractiveDither1,
+            dither_seed: Some(5),
+            ..Default::default()
+        };
+        let hdu = img.compress(&opts).unwrap();
+        let back = hdu.as_compressed_image().unwrap().decompress().unwrap();
+        let recon = match &back.pixels {
+            PixelData::F32(v) => v.clone(),
+            other => panic!("expected F32, got {other:?}"),
+        };
+        assert_eq!(recon.len(), pixels.len());
+        // The dither RNG indexing the encoder used is the same the decoder inverts, so
+        // the error is bounded by the per-tile quantization step. Check a generous
+        // tolerance relative to the data range.
+        let range = 100.0f32; // amplitude
+        let tol = range / 1000.0;
+        for (i, (&o, &r)) in pixels.iter().zip(recon.iter()).enumerate() {
+            assert!(
+                (o - r).abs() <= tol,
+                "pixel {i}: orig {o} recon {r} exceeds tol {tol}"
+            );
+        }
+    }
 }
diff --git a/tests/sample_files.rs b/tests/sample_files.rs
index 3a48201..159c20a 100644
--- a/tests/sample_files.rs
+++ b/tests/sample_files.rs
@@ -464,3 +464,278 @@ fn gzip1_roundtrip_euv() {
     // GZIP_1 stores raw big-endian image integers per tile; needs the `gzip` feature.
     assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.gzip1.fits.fz");
 }
+
+// === ENCODE (write) path: fits4 produces tile-compressed FITS ================
+//
+// Two oracles per algorithm:
+//   (1) internal: image.compress(opts) -> as_compressed_image().decompress() == image
+//       (byte-exact for lossless RICE/GZIP int + lossless-float GZIP).
+//   (2) interop:  write the fits4-compressed file, run the `funpack` CLI on it, and
+//       confirm funpack's reconstruction matches the original image (byte-exact for
+//       lossless cases). This proves fits4 emits standard, cfitsio-readable compressed
+//       FITS. Skipped cleanly when samples or `funpack` are absent.
+
+use fits4::tile_compress::CompressOptions;
+use fits4::{CompressionType, Quantize};
+
+/// Collect the non-empty image HDUs (primary + extensions) of an uncompressed sample.
+fn sample_images(src_name: &str) -> Vec<ImageData> {
+    let src = FitsFile::from_file(samp(src_name)).expect("read source");
+    src.hdus
+        .iter()
+        .filter_map(|h| match &h.data {
+            HduData::Image(im) if !im.pixels.to_bytes().is_empty() => Some(im.clone()),
+            _ => None,
+        })
+        .collect()
+}
+
+/// Build a fits4 `.fz`-style FitsFile: empty primary + one compressed-image extension
+/// per source image, all using `opts`.
+fn compress_sample(images: &[ImageData], opts: &CompressOptions) -> FitsFile {
+    let mut fits = FitsFile::with_empty_primary();
+    // funpack expects EXTEND in the primary.
+    fits.primary_mut()
+        .header
+        .set("EXTEND", HeaderValue::Logical(true), None);
+    for img in images {
+        fits.push_extension(img.compress(opts).expect("compress"));
+    }
+    fits
+}
+
+/// Internal round-trip: every compressed extension decompresses byte-exactly.
+fn assert_internal_roundtrip(images: &[ImageData], opts: &CompressOptions) {
+    let fits = compress_sample(images, opts);
+    let comp: Vec<&Hdu> = fits
+        .hdus
+        .iter()
+        .filter(|h| h.as_compressed_image().is_some())
+        .collect();
+    assert_eq!(comp.len(), images.len());
+    for (orig, hdu) in images.iter().zip(comp) {
+        let dec = hdu.as_compressed_image().unwrap().decompress().unwrap();
+        assert_eq!(dec.axes, orig.axes, "internal: axes");
+        assert_eq!(
+            dec.pixels.to_bytes(),
+            orig.pixels.to_bytes(),
+            "internal: pixel bytes"
+        );
+    }
+    // Also confirm the file re-reads from bytes as compressed images.
+    let bytes = fits.to_bytes().expect("to_bytes");
+    let reread = FitsFile::from_bytes(&bytes).expect("reread");
+    let n = reread
+        .hdus
+        .iter()
+        .filter(|h| h.as_compressed_image().is_some())
+        .count();
+    assert_eq!(n, images.len(), "reread: compressed HDU count");
+}
+
+/// Interop: funpack reads fits4's compressed output back to `images` (byte-exact).
+fn assert_funpack_reads_fits4(src_name: &str, opts: &CompressOptions, tag: &str) {
+    if !Path::new(SAMP_DIR).is_dir() {
+        eprintln!("skipping: samp/ not present");
+        return;
+    }
+    if !funpack_available() {
+        eprintln!("skipping: funpack binary not available");
+        return;
+    }
+    let images = sample_images(src_name);
+    assert!(!images.is_empty(), "{src_name}: no source images");
+    let fits = compress_sample(&images, opts);
+
+    // funpack requires the file to be named *.fz.
+    let mut fz = std::env::temp_dir();
+    fz.push(format!("fits4_enc_{}_{}.fits.fz", std::process::id(), tag));
+    let mut out = std::env::temp_dir();
+    out.push(format!("fits4_enc_{}_{}.fits", std::process::id(), tag));
+    let _ = std::fs::remove_file(&fz);
+    let _ = std::fs::remove_file(&out);
+    fits.to_file(&fz).expect("write fits4 .fz");
+
+    let status = std::process::Command::new("funpack")
+        .arg("-O")
+        .arg(&out)
+        .arg("-C")
+        .arg(&fz)
+        .status()
+        .expect("run funpack");
+    assert!(status.success(), "{tag}: funpack failed on fits4 output");
+
+    let recon = FitsFile::from_file(&out).expect("read funpack output");
+    let recon_images: Vec<&ImageData> = recon
+        .hdus
+        .iter()
+        .filter_map(|h| match &h.data {
+            HduData::Image(im) if !im.pixels.to_bytes().is_empty() => Some(im),
+            _ => None,
+        })
+        .collect();
+    assert_eq!(
+        recon_images.len(),
+        images.len(),
+        "{tag}: funpack image count"
+    );
+    for (orig, got) in images.iter().zip(recon_images) {
+        assert_eq!(got.axes, orig.axes, "{tag}: funpack axes");
+        assert_eq!(
+            got.pixels.to_bytes(),
+            orig.pixels.to_bytes(),
+            "{tag}: funpack pixel bytes differ from original (lossless expected)"
+        );
+    }
+    let _ = std::fs::remove_file(&fz);
+    let _ = std::fs::remove_file(&out);
+}
+
+fn rice_opts(tile: Option<Vec<usize>>) -> CompressOptions {
+    CompressOptions {
+        algorithm: CompressionType::Rice1,
+        tile,
+        ..Default::default()
+    }
+}
+
+#[test]
+fn encode_rice_i16_internal() {
+    require_samples!();
+    assert_internal_roundtrip(&sample_images("EUVEngc4151imgx.fits"), &rice_opts(None));
+    // square 2-D tiling with edge truncation
+    assert_internal_roundtrip(
+        &sample_images("EUVEngc4151imgx.fits"),
+        &rice_opts(Some(vec![100, 100])),
+    );
+}
+
+#[test]
+fn encode_rice_i16_interop_funpack() {
+    assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &rice_opts(None), "rice_i16");
+}
+
+#[test]
+fn encode_rice_i16_square_tiles_interop_funpack() {
+    assert_funpack_reads_fits4(
+        "EUVEngc4151imgx.fits",
+        &rice_opts(Some(vec![100, 100])),
+        "rice_i16_t100",
+    );
+}
+
+#[test]
+fn encode_rice_i32_internal() {
+    require_samples!();
+    assert_internal_roundtrip(&sample_images("FGSf64y0106m_a1f.fits"), &rice_opts(None));
+}
+
+#[test]
+fn encode_rice_i32_interop_funpack() {
+    assert_funpack_reads_fits4("FGSf64y0106m_a1f.fits", &rice_opts(None), "rice_i32");
+}
+
+#[cfg(feature = "gzip")]
+#[test]
+fn encode_gzip_i16_internal() {
+    require_samples!();
+    for alg in [CompressionType::Gzip1, CompressionType::Gzip2] {
+        let opts = CompressOptions {
+            algorithm: alg,
+            ..Default::default()
+        };
+        assert_internal_roundtrip(&sample_images("EUVEngc4151imgx.fits"), &opts);
+    }
+}
+
+#[cfg(feature = "gzip")]
+#[test]
+fn encode_gzip1_i16_interop_funpack() {
+    let opts = CompressOptions {
+        algorithm: CompressionType::Gzip1,
+        ..Default::default()
+    };
+    assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &opts, "gzip1_i16");
+}
+
+#[cfg(feature = "gzip")]
+#[test]
+fn encode_gzip2_i16_interop_funpack() {
+    let opts = CompressOptions {
+        algorithm: CompressionType::Gzip2,
+        ..Default::default()
+    };
+    assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &opts, "gzip2_i16");
+}
+
+#[cfg(feature = "gzip")]
+#[test]
+fn encode_gzip_lossless_float_internal() {
+    require_samples!();
+    let opts = CompressOptions {
+        algorithm: CompressionType::Gzip1,
+        quantize: None, // lossless raw-float storage
+        ..Default::default()
+    };
+    assert_internal_roundtrip(&sample_images("FOCx38i0101t_c0f.fits"), &opts);
+}
+
+#[cfg(feature = "gzip")]
+#[test]
+fn encode_gzip_lossless_float_interop_funpack() {
+    let opts = CompressOptions {
+        algorithm: CompressionType::Gzip1,
+        quantize: None,
+        ..Default::default()
+    };
+    assert_funpack_reads_fits4("FOCx38i0101t_c0f.fits", &opts, "gzip_lossless_f32");
+}
+
+/// Lossy quantized-float RICE encode: round-trips within quantization tolerance via the
+/// internal decoder. (Interop byte-exactness is not expected for lossy data; the
+/// lossless cases above carry the funpack-reads-fits4 proof.)
+#[test]
+fn encode_rice_float_quantize_within_tolerance() {
+    require_samples!();
+    let images = sample_images("FOCx38i0101t_c0f.fits");
+    let opts = CompressOptions {
+        algorithm: CompressionType::Rice1,
+        tile: None,
+        quantize: Some(4.0),
+        dither: Quantize::SubtractiveDither1,
+        dither_seed: Some(5),
+        ..Default::default()
+    };
+    let fits = compress_sample(&images, &opts);
+    let comp: Vec<&Hdu> = fits
+        .hdus
+        .iter()
+        .filter(|h| h.as_compressed_image().is_some())
+        .collect();
+    assert_eq!(comp.len(), images.len());
+    for (orig, hdu) in images.iter().zip(comp) {
+        let dec = hdu.as_compressed_image().unwrap().decompress().unwrap();
+        assert_eq!(dec.axes, orig.axes);
+        // Compare within a tolerance derived from the data's own dynamic range.
+        let (o, r) = match (&orig.pixels, &dec.pixels) {
+            (PixelData::F32(a), PixelData::F32(b)) => (a.clone(), b.clone()),
+            _ => panic!("expected F32 float image"),
+        };
+        assert_eq!(o.len(), r.len());
+        let finite_max = o
+            .iter()
+            .filter(|v| v.is_finite())
+            .fold(0.0f32, |m, &v| m.max(v.abs()));
+        let tol = (finite_max / 1000.0).max(1.0);
+        let mut max_err = 0.0f32;
+        for (&a, &b) in o.iter().zip(r.iter()) {
+            if a.is_finite() && b.is_finite() {
+                max_err = max_err.max((a - b).abs());
+            }
+        }
+        assert!(
+            max_err <= tol,
+            "quantize round-trip max error {max_err} exceeds tolerance {tol}"
+        );
+    }
+}

From d03948ce96eb8fcb419458359cbbe8856b8d2e0b Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 16:15:04 -0400
Subject: [PATCH 7/9] Docs: document tile-compression write support in README
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a write-compressed-image example (ImageData::compress), list encode
in features/supported, note RICE works in the zero-dep build, and add a
"Compressed write" column to the comparison table — fits4 is the only
pure-Rust crate that writes compressed FITS.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 README.md | 49 +++++++++++++++++++++++++++++++++++++------------
 1 file changed, 37 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 39aa0d2..8954c15 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@
 
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
 
-fits4 reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It also decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) on the read path.
+fits4 reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) and — uniquely among pure-Rust FITS crates — **writes** RICE- and GZIP-compressed images that cfitsio's `funpack` reads back byte-for-byte.
 
 ## Features
 
@@ -15,6 +15,7 @@ fits4 reads *and* writes the full FITS v4.0 standard — primary and extension H
 - **Variable-length arrays** — `P` and `Q` heap descriptors in binary tables
 - **CHECKSUM/DATASUM** — ones-complement integrity computation and verification
 - **Tile-compressed images (read)** — RICE_1, GZIP_1, GZIP_2, PLIO_1, and HCOMPRESS_1, including float quantization with subtractive dithering; decoded bit-exactly against cfitsio's `funpack`
+- **Tile-compressed images (write)** — encode RICE_1 and GZIP_1/GZIP_2 (integer and float); output verified byte-exact through `funpack`
 - **Zero dependencies by default** — optional `image` and `gzip` features pull in pure-Rust crates only
 
 ## Installation
@@ -116,6 +117,26 @@ for hdu in fits.extensions() {
 # Ok::<(), fits4::Error>(())
 ```
 
+### Writing a tile-compressed image
+
+`ImageData::compress` produces a compressed-image `BINTABLE` HDU ready to push
+into a file. The output is read back byte-for-byte by cfitsio's `funpack`.
+
+```rust
+use fits4::{FitsFile, ImageData, PixelData, CompressOptions};
+
+let pixels: Vec<i16> = (0..10000).map(|i| (i % 1000) as i16).collect();
+let img = ImageData::new(vec![100, 100], PixelData::I16(pixels));
+
+// Default options: RICE_1, one tile per row, lossless for integers.
+let hdu = img.compress(&CompressOptions::default())?;
+
+let mut fits = FitsFile::with_empty_primary();
+fits.push_extension(hdu);
+fits.to_file("compressed.fits")?;
+# Ok::<(), fits4::Error>(())
+```
+
 ### Checksums
 
 ```rust
@@ -149,6 +170,7 @@ that structure directly:
 | `BinTable` / `BinColumn` / `BinColumnType` / `BinCellValue` | Binary-table model — columns, rows, and heap (variable-length arrays); built with `BinTableBuilder`. |
 | `AsciiTable` | ASCII `TABLE` model with `TFORMn` column parsing and `TSCALn`/`TZEROn` scaling. |
 | `CompressedImage` / `CompressionType` / `TileGeometry` / `Quantize` | Read-side view over a tile-compressed image: detect the algorithm and geometry, then `decompress()` to an `ImageData`. |
+| `CompressOptions` | Write-side encode options (algorithm, tiling, quantization, dithering) for `ImageData::compress` / `compress_image`. |
 | `Bitpix` | The `BITPIX` data type (`8`/`16`/`32`/`64`/`-32`/`-64`). |
 | `Error` / `Result` | Crate error type and result alias. |
 
@@ -160,22 +182,22 @@ are preserved for a lossless round-trip write.
 
 | Flag | Default | Description |
 |------|---------|-------------|
-| *(none)* | ✓ | Core read/write, all HDU types, RICE_1 / PLIO_1 / HCOMPRESS_1 tile decompression — zero dependencies |
+| *(none)* | ✓ | Core read/write, all HDU types, RICE_1 / PLIO_1 / HCOMPRESS_1 decompression, and RICE_1 compression — zero dependencies |
 | `image` | | Conversion between `ImageData` and the [`image`](https://crates.io/crates/image) crate's `DynamicImage` |
-| `gzip` | | Decoding of `GZIP_1`/`GZIP_2` tile-compressed images via the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate |
+| `gzip` | | `GZIP_1`/`GZIP_2` tile compression and decompression via the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate |
 
-The default build stays dependency-free; `RICE_1`, `PLIO_1`, and `HCOMPRESS_1` decompression all work without any feature (only `GZIP_1`/`GZIP_2` need the `gzip` feature).
+The default build stays dependency-free; `RICE_1`, `PLIO_1`, and `HCOMPRESS_1` decompression and `RICE_1` compression all work without any feature (only `GZIP_1`/`GZIP_2` need the `gzip` feature).
 
 ## Why fits4?
 
-Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, and complete compressed-image read support**, with no C toolchain to install.
+Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, complete compressed-image reading, and compressed-image writing**, with no C toolchain to install. No other pure-Rust crate writes compressed FITS.
 
-| Crate | Pure Rust | Write | Tables | Compressed read | Notes |
-|-------|-----------|-------|--------|-----------------|-------|
-| [`fitsio`](https://crates.io/crates/fitsio) | ✗ | ✓ | ✓ | ✓ | Wraps the cfitsio C library; needs a C toolchain |
-| [`fitsrs`](https://crates.io/crates/fitsrs) | ✓ | ✗ | partial | — | Read-only |
-| [`fitrs`](https://crates.io/crates/fitrs) | ✓ | ✓ | ✗ | ✗ | Dormant; no table support |
-| **fits4** | ✓ | ✓ | ✓ | ✓ (all types) | Zero default deps; no C dependency |
+| Crate | Pure Rust | Write | Tables | Compressed read | Compressed write | Notes |
+|-------|-----------|-------|--------|-----------------|------------------|-------|
+| [`fitsio`](https://crates.io/crates/fitsio) | ✗ | ✓ | ✓ | ✓ | ✓ | Wraps the cfitsio C library; needs a C toolchain |
+| [`fitsrs`](https://crates.io/crates/fitsrs) | ✓ | ✗ | partial | — | ✗ | Read-only |
+| [`fitrs`](https://crates.io/crates/fitrs) | ✓ | ✓ | ✗ | ✗ | ✗ | Dormant; no table support |
+| **fits4** | ✓ | ✓ | ✓ | ✓ (all types) | ✓ (RICE/GZIP) | Zero default deps; no C dependency |
 
 ## Supported / not supported
 
@@ -189,10 +211,13 @@ Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero defau
   HCOMPRESS_1 — for integer and floating-point images, including quantization
   with subtractive dithering (`NO_DITHER` / `SUBTRACTIVE_DITHER_1` /
   `SUBTRACTIVE_DITHER_2`)
+- Tile-compressed image **writing**: RICE_1 and GZIP_1/GZIP_2 — integer
+  (lossless) and float (lossless via GZIP, or lossy quantized + dithered);
+  output verified byte-exact through cfitsio's `funpack`
 
 **Not supported**
 
-- Writing/encoding tile-compressed images (read/decode only)
+- Encoding `PLIO_1` or `HCOMPRESS_1` (these decode only)
 - HCOMPRESS image smoothing (`SMOOTH` ≠ 0) on decode
 - Random groups (deprecated in FITS v4.0)
 

From 3ab83371d7da19357f1fe8f8d9bed190d4b10f3b Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 16:19:57 -0400
Subject: [PATCH 8/9] Rename crate to fitskit; add crates.io publish metadata

Rename the crate from fits4 to fitskit throughout (Cargo.toml, source,
tests, doc examples, README, CLAUDE.md, COMPRESSION_PLAN.md). The
fits4_samples GCS bucket name is unchanged (external resource).

Add publish metadata: repository, documentation, readme, keywords,
categories, authors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 CLAUDE.md             |  2 +-
 COMPRESSION_PLAN.md   |  2 +-
 Cargo.lock            |  2 +-
 Cargo.toml            | 10 ++++++--
 README.md             | 36 ++++++++++++++--------------
 src/bintable.rs       |  2 +-
 src/hdu.rs            |  2 +-
 src/image_data.rs     |  4 ++--
 src/lib.rs            | 10 ++++----
 tests/checksum.rs     |  8 +++----
 tests/image_conv.rs   |  2 +-
 tests/round_trip.rs   | 10 ++++----
 tests/sample_files.rs | 56 +++++++++++++++++++++----------------------
 13 files changed, 76 insertions(+), 70 deletions(-)

diff --git a/CLAUDE.md b/CLAUDE.md
index 914a1c6..4e0eb80 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,4 +1,4 @@
-# fits4 — Pure Rust FITS v4.0 Library
+# fitskit — Pure Rust FITS v4.0 Library
 
 ## Build Commands
 
diff --git a/COMPRESSION_PLAN.md b/COMPRESSION_PLAN.md
index 9b7fcf2..e4ce213 100644
--- a/COMPRESSION_PLAN.md
+++ b/COMPRESSION_PLAN.md
@@ -2,7 +2,7 @@
 
 Status: planning + initial scaffold. This document describes how to add support for
 the FITS *Tiled Image Compression* convention (the BINTABLE-based compressed-image
-format) to `fits4`, currently the main feature gap versus `fitsrs`.
+format) to `fitskit`, currently the main feature gap versus `fitsrs`.
 
 References:
 - Tiled Image Convention for Storing Compressed Images in FITS Binary Tables,
diff --git a/Cargo.lock b/Cargo.lock
index e419038..06ea30a 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -308,7 +308,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "5baebc0774151f905a1a2cc41989300b1e6fbb29aff0ceffa1064fdd3088d582"
 
 [[package]]
-name = "fits4"
+name = "fitskit"
 version = "0.1.0"
 dependencies = [
  "image",
diff --git a/Cargo.toml b/Cargo.toml
index fe3bf4e..7a7bddd 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -1,9 +1,15 @@
 [package]
-name = "fits4"
+name = "fitskit"
 version = "0.1.0"
 edition = "2021"
-description = "Pure Rust FITS v4.0 reader/writer"
+description = "Pure Rust FITS v4.0 reader/writer with tile-compression read and write"
 license = "MIT"
+repository = "https://github.com/ssmichael1/fitskit"
+documentation = "https://docs.rs/fitskit"
+readme = "README.md"
+keywords = ["fits", "astronomy", "fits-file", "compression", "image"]
+categories = ["science", "encoding", "parser-implementations"]
+authors = ["Steven Michael <ssmichael@gmail.com>"]
 
 [dependencies.image]
 version = "0.25"
diff --git a/README.md b/README.md
index 8954c15..e1ebe3b 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,10 @@
-# fits4
+# fitskit
 
 **Pure-Rust, zero-dependency reader and writer for [FITS](https://fits.gsfc.nasa.gov/fits_standard.html) v4.0** — the Flexible Image Transport System format used throughout astronomy.
 
 [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
 
-fits4 reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) and — uniquely among pure-Rust FITS crates — **writes** RICE- and GZIP-compressed images that cfitsio's `funpack` reads back byte-for-byte.
+fitskit reads *and* writes the full FITS v4.0 standard — primary and extension HDUs, images, ASCII tables, and binary tables (including variable-length arrays) — with **no external dependencies** in the default build and **no C toolchain** required. It decodes every tile-compressed image type in the standard (RICE, GZIP, PLIO, HCOMPRESS) and — uniquely among pure-Rust FITS crates — **writes** RICE- and GZIP-compressed images that cfitsio's `funpack` reads back byte-for-byte.
 
 ## Features
 
@@ -22,7 +22,7 @@ fits4 reads *and* writes the full FITS v4.0 standard — primary and extension H
 
 ```toml
 [dependencies]
-fits4 = "0.1"
+fitskit = "0.1"
 ```
 
 ## Usage
@@ -30,7 +30,7 @@ fits4 = "0.1"
 ### Reading a FITS file
 
 ```rust
-use fits4::{FitsFile, HduData, PixelData};
+use fitskit::{FitsFile, HduData, PixelData};
 
 let fits = FitsFile::from_file("image.fits")?;
 
@@ -52,13 +52,13 @@ for hdu in fits.extensions() {
         HduData::Empty => {}
     }
 }
-# Ok::<(), fits4::Error>(())
+# Ok::<(), fitskit::Error>(())
 ```
 
 ### Writing a FITS file
 
 ```rust
-use fits4::{FitsFile, ImageData, PixelData, HeaderValue};
+use fitskit::{FitsFile, ImageData, PixelData, HeaderValue};
 
 let pixels: Vec<i16> = (0..10000).map(|i| (i % 1000) as i16).collect();
 let img = ImageData::new(vec![100, 100], PixelData::I16(pixels));
@@ -67,13 +67,13 @@ let mut fits = FitsFile::with_primary_image(img);
 fits.primary_mut().header.set("OBJECT", HeaderValue::String("M31".into()), None);
 
 fits.to_file("output.fits")?;
-# Ok::<(), fits4::Error>(())
+# Ok::<(), fitskit::Error>(())
 ```
 
 ### Building a binary table
 
 ```rust
-use fits4::{FitsFile, Hdu, BinTableBuilder, BinColumnType};
+use fitskit::{FitsFile, Hdu, BinTableBuilder, BinColumnType};
 
 let table = BinTableBuilder::new()
     .add_column("RA", BinColumnType::D64(1))
@@ -103,7 +103,7 @@ Tiled Image Compression convention). `Hdu::as_compressed_image` cheaply detects
 them; `decompress` reassembles the full image lazily.
 
 ```rust
-use fits4::FitsFile;
+use fitskit::FitsFile;
 
 let fits = FitsFile::from_file("compressed.fits")?;
 
@@ -114,7 +114,7 @@ for hdu in fits.extensions() {
         println!("decompressed to {:?}", image.axes);
     }
 }
-# Ok::<(), fits4::Error>(())
+# Ok::<(), fitskit::Error>(())
 ```
 
 ### Writing a tile-compressed image
@@ -123,7 +123,7 @@ for hdu in fits.extensions() {
 into a file. The output is read back byte-for-byte by cfitsio's `funpack`.
 
 ```rust
-use fits4::{FitsFile, ImageData, PixelData, CompressOptions};
+use fitskit::{FitsFile, ImageData, PixelData, CompressOptions};
 
 let pixels: Vec<i16> = (0..10000).map(|i| (i % 1000) as i16).collect();
 let img = ImageData::new(vec![100, 100], PixelData::I16(pixels));
@@ -134,13 +134,13 @@ let hdu = img.compress(&CompressOptions::default())?;
 let mut fits = FitsFile::with_empty_primary();
 fits.push_extension(hdu);
 fits.to_file("compressed.fits")?;
-# Ok::<(), fits4::Error>(())
+# Ok::<(), fitskit::Error>(())
 ```
 
 ### Checksums
 
 ```rust
-use fits4::{FitsFile, ImageData, PixelData};
+use fitskit::{FitsFile, ImageData, PixelData};
 
 let img = ImageData::new(vec![4], PixelData::U8(vec![1, 2, 3, 4]));
 let fits = FitsFile::with_primary_image(img);
@@ -151,12 +151,12 @@ let bytes = fits.to_bytes_with_checksum()?;
 // Verify on read
 let fits2 = FitsFile::from_bytes(&bytes)?;
 fits2.primary().verify_datasum()?;
-# Ok::<(), fits4::Error>(())
+# Ok::<(), fitskit::Error>(())
 ```
 
 ## Core types
 
-A FITS file is an ordered sequence of **Header-Data Units (HDUs)**. fits4 mirrors
+A FITS file is an ordered sequence of **Header-Data Units (HDUs)**. fitskit mirrors
 that structure directly:
 
 | Type | Role |
@@ -188,16 +188,16 @@ are preserved for a lossless round-trip write.
 
 The default build stays dependency-free; `RICE_1`, `PLIO_1`, and `HCOMPRESS_1` decompression and `RICE_1` compression all work without any feature (only `GZIP_1`/`GZIP_2` need the `gzip` feature).
 
-## Why fits4?
+## Why fitskit?
 
-Among Rust FITS crates, fits4 fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, complete compressed-image reading, and compressed-image writing**, with no C toolchain to install. No other pure-Rust crate writes compressed FITS.
+Among Rust FITS crates, fitskit fills a specific niche — **pure Rust, zero default dependencies, full read + write including tables, complete compressed-image reading, and compressed-image writing**, with no C toolchain to install. No other pure-Rust crate writes compressed FITS.
 
 | Crate | Pure Rust | Write | Tables | Compressed read | Compressed write | Notes |
 |-------|-----------|-------|--------|-----------------|------------------|-------|
 | [`fitsio`](https://crates.io/crates/fitsio) | ✗ | ✓ | ✓ | ✓ | ✓ | Wraps the cfitsio C library; needs a C toolchain |
 | [`fitsrs`](https://crates.io/crates/fitsrs) | ✓ | ✗ | partial | — | ✗ | Read-only |
 | [`fitrs`](https://crates.io/crates/fitrs) | ✓ | ✓ | ✗ | ✗ | ✗ | Dormant; no table support |
-| **fits4** | ✓ | ✓ | ✓ | ✓ (all types) | ✓ (RICE/GZIP) | Zero default deps; no C dependency |
+| **fitskit** | ✓ | ✓ | ✓ | ✓ (all types) | ✓ (RICE/GZIP) | Zero default deps; no C dependency |
 
 ## Supported / not supported
 
diff --git a/src/bintable.rs b/src/bintable.rs
index c226459..3345aaf 100644
--- a/src/bintable.rs
+++ b/src/bintable.rs
@@ -510,7 +510,7 @@ impl BinTable {
 /// Builder for constructing a `BinTable` row by row.
 ///
 /// ```
-/// use fits4::{BinTableBuilder, BinColumnType};
+/// use fitskit::{BinTableBuilder, BinColumnType};
 ///
 /// let table = BinTableBuilder::new()
 ///     .add_column("RA", BinColumnType::D64(1))
diff --git a/src/hdu.rs b/src/hdu.rs
index 6af05d6..bd1b1bf 100644
--- a/src/hdu.rs
+++ b/src/hdu.rs
@@ -109,7 +109,7 @@ impl Hdu {
     /// decoding happens in [`CompressedImage::decompress`]:
     ///
     /// ```no_run
-    /// # use fits4::FitsFile;
+    /// # use fitskit::FitsFile;
     /// # let fits = FitsFile::from_file("compressed.fits").unwrap();
     /// for hdu in fits.extensions() {
     ///     if let Some(cimg) = hdu.as_compressed_image() {
diff --git a/src/image_data.rs b/src/image_data.rs
index 33ced0c..cc168da 100644
--- a/src/image_data.rs
+++ b/src/image_data.rs
@@ -163,8 +163,8 @@ impl ImageData {
     /// [`CompressedImage::decompress`](crate::tile_compress::CompressedImage::decompress).
     ///
     /// ```
-    /// use fits4::{ImageData, PixelData};
-    /// use fits4::tile_compress::CompressOptions;
+    /// use fitskit::{ImageData, PixelData};
+    /// use fitskit::tile_compress::CompressOptions;
     ///
     /// let img = ImageData::new(vec![8, 4], PixelData::I16((0..32).collect()));
     /// let hdu = img.compress(&CompressOptions::default()).unwrap();
diff --git a/src/lib.rs b/src/lib.rs
index 446f382..44b0a1b 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -1,4 +1,4 @@
-//! # fits4 — Pure Rust FITS v4.0 reader/writer
+//! # fitskit — Pure Rust FITS v4.0 reader/writer
 //!
 //! A zero-dependency implementation of the FITS (Flexible Image Transport System)
 //! standard v4.0 for reading and writing astronomical data files.
@@ -26,7 +26,7 @@
 //! ## Quick start — reading
 //!
 //! ```no_run
-//! use fits4::{FitsFile, HduData, PixelData};
+//! use fitskit::{FitsFile, HduData, PixelData};
 //!
 //! let fits = FitsFile::from_file("image.fits").unwrap();
 //!
@@ -55,7 +55,7 @@
 //! ## Quick start — writing
 //!
 //! ```
-//! use fits4::{FitsFile, Hdu, ImageData, PixelData, HeaderValue};
+//! use fitskit::{FitsFile, Hdu, ImageData, PixelData, HeaderValue};
 //!
 //! // Create a 100x100 16-bit image
 //! let pixels: Vec<i16> = (0..10000).map(|i| (i % 1000) as i16).collect();
@@ -74,7 +74,7 @@
 //! integer convention stores unsigned values in signed storage:
 //!
 //! ```
-//! use fits4::{ImageData, PixelData};
+//! use fitskit::{ImageData, PixelData};
 //!
 //! // Unsigned u16 via BZERO=32768
 //! let img = ImageData::new(vec![3], PixelData::I16(vec![-32768, 0, 32767]));
@@ -87,7 +87,7 @@
 //! Write with CHECKSUM/DATASUM integrity keywords:
 //!
 //! ```
-//! use fits4::{FitsFile, ImageData, PixelData};
+//! use fitskit::{FitsFile, ImageData, PixelData};
 //!
 //! let img = ImageData::new(vec![4], PixelData::U8(vec![1, 2, 3, 4]));
 //! let fits = FitsFile::with_primary_image(img);
diff --git a/tests/checksum.rs b/tests/checksum.rs
index b9fef5b..212451a 100644
--- a/tests/checksum.rs
+++ b/tests/checksum.rs
@@ -1,5 +1,5 @@
-use fits4::checksum;
-use fits4::*;
+use fitskit::checksum;
+use fitskit::*;
 
 #[test]
 fn write_with_checksum_and_verify() {
@@ -31,7 +31,7 @@ fn write_with_checksum_and_verify() {
 
 #[test]
 fn checksum_with_extensions() {
-    use fits4::bintable::{BinColumn, BinColumnType};
+    use fitskit::bintable::{BinColumn, BinColumnType};
 
     let mut fits = FitsFile::with_empty_primary();
 
@@ -47,7 +47,7 @@ fn checksum_with_extensions() {
     main_data[0..8].copy_from_slice(&3.125f64.to_be_bytes());
     main_data[8..16].copy_from_slice(&2.5f64.to_be_bytes());
 
-    let table = fits4::BinTable {
+    let table = fitskit::BinTable {
         columns: vec![col],
         nrows: 2,
         row_len,
diff --git a/tests/image_conv.rs b/tests/image_conv.rs
index f751911..d1635d0 100644
--- a/tests/image_conv.rs
+++ b/tests/image_conv.rs
@@ -1,6 +1,6 @@
 #![cfg(feature = "image")]
 
-use fits4::*;
+use fitskit::*;
 use image::{DynamicImage, GrayImage, ImageBuffer, Luma};
 type Gray16Image = ImageBuffer<Luma<u16>, Vec<u16>>;
 
diff --git a/tests/round_trip.rs b/tests/round_trip.rs
index 6dee120..da7b796 100644
--- a/tests/round_trip.rs
+++ b/tests/round_trip.rs
@@ -1,4 +1,4 @@
-use fits4::*;
+use fitskit::*;
 
 #[test]
 fn round_trip_empty_primary() {
@@ -228,7 +228,7 @@ fn round_trip_bscale_bzero() {
 
 #[test]
 fn round_trip_bintable() {
-    use fits4::bintable::{BinCellValue, BinColumn, BinColumnType};
+    use fitskit::bintable::{BinCellValue, BinColumn, BinColumnType};
 
     // Build a simple bintable with 2 rows, 3 columns: i32, f64, 8-char string
     let col_j = BinColumn {
@@ -325,7 +325,7 @@ fn round_trip_bintable() {
 
 #[test]
 fn round_trip_bintable_vla() {
-    use fits4::bintable::{BinCellValue, BinColumn, BinColumnType};
+    use fitskit::bintable::{BinCellValue, BinColumn, BinColumnType};
 
     // Variable-length array column using P descriptor
     let col = BinColumn {
@@ -389,7 +389,7 @@ fn round_trip_bintable_vla() {
 
 #[test]
 fn round_trip_ascii_table() {
-    use fits4::ascii_table::{AsciiColumn, AsciiFormat, AsciiTable};
+    use fitskit::ascii_table::{AsciiColumn, AsciiFormat, AsciiTable};
 
     // 2 rows, 2 columns: name (A10) and value (F10.3)
     let col_name = AsciiColumn {
@@ -497,7 +497,7 @@ fn header_keyword_preservation() {
         .push(Keyword::commentary("COMMENT", "Test comment line"));
     fits.primary_mut()
         .header
-        .push(Keyword::commentary("HISTORY", "Created by fits4 tests"));
+        .push(Keyword::commentary("HISTORY", "Created by fitskit tests"));
     fits.primary_mut()
         .header
         .set("AUTHOR", HeaderValue::String("test".into()), None);
diff --git a/tests/sample_files.rs b/tests/sample_files.rs
index 159c20a..2c5a1df 100644
--- a/tests/sample_files.rs
+++ b/tests/sample_files.rs
@@ -1,7 +1,7 @@
 //! Tests reading NASA sample FITS files from the samp/ directory.
 //! These tests are skipped if the samp/ directory is not present.
 
-use fits4::*;
+use fitskit::*;
 use std::path::Path;
 
 const SAMP_DIR: &str = "samp";
@@ -288,8 +288,8 @@ fn hcompress_int_roundtrip_euv() {
 //
 // Quantized-float decompression is *lossy*, so the reconstructed floats will not
 // equal the original uncompressed sample. The authoritative oracle is funpack's own
-// reconstruction: fits4 and funpack must apply the identical dither table + per-tile
-// scaling, so fits4's output must be BIT-IDENTICAL to funpack's. We invoke the
+// reconstruction: fitskit and funpack must apply the identical dither table + per-tile
+// scaling, so fitskit's output must be BIT-IDENTICAL to funpack's. We invoke the
 // `funpack` CLI at test time and compare; the test skips cleanly when either the
 // fixture or the `funpack` binary is absent (so CI without cfitsio stays green).
 
@@ -307,7 +307,7 @@ fn funpack_reference(fz_name: &str) -> Option<FitsFile> {
     let fz_path = samp(fz_name);
     let mut out = std::env::temp_dir();
     out.push(format!(
-        "fits4_funpack_ref_{}_{}.fits",
+        "fitskit_funpack_ref_{}_{}.fits",
         std::process::id(),
         fz_name.replace(['/', '.'], "_")
     ));
@@ -327,7 +327,7 @@ fn funpack_reference(fz_name: &str) -> Option<FitsFile> {
     f
 }
 
-/// Assert that fits4's float decompression of every compressed-image HDU in `fz_name`
+/// Assert that fitskit's float decompression of every compressed-image HDU in `fz_name`
 /// is byte-exact against funpack's reconstruction of the same file.
 fn assert_float_matches_funpack(fz_name: &str) {
     let fz_path = samp(fz_name);
@@ -365,7 +365,7 @@ fn assert_float_matches_funpack(fz_name: &str) {
             let reference_img = ref_iter
                 .next()
                 .expect("more compressed images than funpack reference images");
-            let dec = cimg.decompress().expect("fits4 float decompress");
+            let dec = cimg.decompress().expect("fitskit float decompress");
             assert_eq!(
                 dec.axes, reference_img.axes,
                 "{fz_name}: axes mismatch on compressed HDU #{matched}"
@@ -377,9 +377,9 @@ fn assert_float_matches_funpack(fz_name: &str) {
     assert!(matched > 0, "{fz_name}: found no compressed-image HDUs");
 }
 
-/// Compare fits4's reconstructed floats against funpack's, element-by-element.
+/// Compare fitskit's reconstructed floats against funpack's, element-by-element.
 ///
-/// fits4 reproduces cfitsio's unquantization *exactly*, including the fused
+/// fitskit reproduces cfitsio's unquantization *exactly*, including the fused
 /// multiply-add (`x*scale + zero` contracted to a single FMA) that the cfitsio C code
 /// relies on, so every reconstructed pixel must be bit-identical to funpack's. NaNs
 /// (from `ZBLANK`) only have to match as NaN — IEEE leaves the payload unspecified.
@@ -390,7 +390,7 @@ fn compare_floats_vs_funpack(fz_name: &str, hdu: usize, got: &PixelData, want: &
         }
         assert!(
             a.to_bits() == b.to_bits(),
-            "{fz_name} HDU#{hdu}: fits4={a} ({:#x}) != funpack={b} ({:#x})",
+            "{fz_name} HDU#{hdu}: fitskit={a} ({:#x}) != funpack={b} ({:#x})",
             a.to_bits(),
             b.to_bits()
         );
@@ -465,18 +465,18 @@ fn gzip1_roundtrip_euv() {
     assert_rice_roundtrip("EUVEngc4151imgx.fits", "EUVEngc4151imgx.gzip1.fits.fz");
 }
 
-// === ENCODE (write) path: fits4 produces tile-compressed FITS ================
+// === ENCODE (write) path: fitskit produces tile-compressed FITS ================
 //
 // Two oracles per algorithm:
 //   (1) internal: image.compress(opts) -> as_compressed_image().decompress() == image
 //       (byte-exact for lossless RICE/GZIP int + lossless-float GZIP).
-//   (2) interop:  write the fits4-compressed file, run the `funpack` CLI on it, and
+//   (2) interop:  write the fitskit-compressed file, run the `funpack` CLI on it, and
 //       confirm funpack's reconstruction matches the original image (byte-exact for
-//       lossless cases). This proves fits4 emits standard, cfitsio-readable compressed
+//       lossless cases). This proves fitskit emits standard, cfitsio-readable compressed
 //       FITS. Skipped cleanly when samples or `funpack` are absent.
 
-use fits4::tile_compress::CompressOptions;
-use fits4::{CompressionType, Quantize};
+use fitskit::tile_compress::CompressOptions;
+use fitskit::{CompressionType, Quantize};
 
 /// Collect the non-empty image HDUs (primary + extensions) of an uncompressed sample.
 fn sample_images(src_name: &str) -> Vec<ImageData> {
@@ -490,7 +490,7 @@ fn sample_images(src_name: &str) -> Vec<ImageData> {
         .collect()
 }
 
-/// Build a fits4 `.fz`-style FitsFile: empty primary + one compressed-image extension
+/// Build a fitskit `.fz`-style FitsFile: empty primary + one compressed-image extension
 /// per source image, all using `opts`.
 fn compress_sample(images: &[ImageData], opts: &CompressOptions) -> FitsFile {
     let mut fits = FitsFile::with_empty_primary();
@@ -533,8 +533,8 @@ fn assert_internal_roundtrip(images: &[ImageData], opts: &CompressOptions) {
     assert_eq!(n, images.len(), "reread: compressed HDU count");
 }
 
-/// Interop: funpack reads fits4's compressed output back to `images` (byte-exact).
-fn assert_funpack_reads_fits4(src_name: &str, opts: &CompressOptions, tag: &str) {
+/// Interop: funpack reads fitskit's compressed output back to `images` (byte-exact).
+fn assert_funpack_reads_fitskit(src_name: &str, opts: &CompressOptions, tag: &str) {
     if !Path::new(SAMP_DIR).is_dir() {
         eprintln!("skipping: samp/ not present");
         return;
@@ -549,12 +549,12 @@ fn assert_funpack_reads_fits4(src_name: &str, opts: &CompressOptions, tag: &str)
 
     // funpack requires the file to be named *.fz.
     let mut fz = std::env::temp_dir();
-    fz.push(format!("fits4_enc_{}_{}.fits.fz", std::process::id(), tag));
+    fz.push(format!("fitskit_enc_{}_{}.fits.fz", std::process::id(), tag));
     let mut out = std::env::temp_dir();
-    out.push(format!("fits4_enc_{}_{}.fits", std::process::id(), tag));
+    out.push(format!("fitskit_enc_{}_{}.fits", std::process::id(), tag));
     let _ = std::fs::remove_file(&fz);
     let _ = std::fs::remove_file(&out);
-    fits.to_file(&fz).expect("write fits4 .fz");
+    fits.to_file(&fz).expect("write fitskit .fz");
 
     let status = std::process::Command::new("funpack")
         .arg("-O")
@@ -563,7 +563,7 @@ fn assert_funpack_reads_fits4(src_name: &str, opts: &CompressOptions, tag: &str)
         .arg(&fz)
         .status()
         .expect("run funpack");
-    assert!(status.success(), "{tag}: funpack failed on fits4 output");
+    assert!(status.success(), "{tag}: funpack failed on fitskit output");
 
     let recon = FitsFile::from_file(&out).expect("read funpack output");
     let recon_images: Vec<&ImageData> = recon
@@ -612,12 +612,12 @@ fn encode_rice_i16_internal() {
 
 #[test]
 fn encode_rice_i16_interop_funpack() {
-    assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &rice_opts(None), "rice_i16");
+    assert_funpack_reads_fitskit("EUVEngc4151imgx.fits", &rice_opts(None), "rice_i16");
 }
 
 #[test]
 fn encode_rice_i16_square_tiles_interop_funpack() {
-    assert_funpack_reads_fits4(
+    assert_funpack_reads_fitskit(
         "EUVEngc4151imgx.fits",
         &rice_opts(Some(vec![100, 100])),
         "rice_i16_t100",
@@ -632,7 +632,7 @@ fn encode_rice_i32_internal() {
 
 #[test]
 fn encode_rice_i32_interop_funpack() {
-    assert_funpack_reads_fits4("FGSf64y0106m_a1f.fits", &rice_opts(None), "rice_i32");
+    assert_funpack_reads_fitskit("FGSf64y0106m_a1f.fits", &rice_opts(None), "rice_i32");
 }
 
 #[cfg(feature = "gzip")]
@@ -655,7 +655,7 @@ fn encode_gzip1_i16_interop_funpack() {
         algorithm: CompressionType::Gzip1,
         ..Default::default()
     };
-    assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &opts, "gzip1_i16");
+    assert_funpack_reads_fitskit("EUVEngc4151imgx.fits", &opts, "gzip1_i16");
 }
 
 #[cfg(feature = "gzip")]
@@ -665,7 +665,7 @@ fn encode_gzip2_i16_interop_funpack() {
         algorithm: CompressionType::Gzip2,
         ..Default::default()
     };
-    assert_funpack_reads_fits4("EUVEngc4151imgx.fits", &opts, "gzip2_i16");
+    assert_funpack_reads_fitskit("EUVEngc4151imgx.fits", &opts, "gzip2_i16");
 }
 
 #[cfg(feature = "gzip")]
@@ -688,12 +688,12 @@ fn encode_gzip_lossless_float_interop_funpack() {
         quantize: None,
         ..Default::default()
     };
-    assert_funpack_reads_fits4("FOCx38i0101t_c0f.fits", &opts, "gzip_lossless_f32");
+    assert_funpack_reads_fitskit("FOCx38i0101t_c0f.fits", &opts, "gzip_lossless_f32");
 }
 
 /// Lossy quantized-float RICE encode: round-trips within quantization tolerance via the
 /// internal decoder. (Interop byte-exactness is not expected for lossy data; the
-/// lossless cases above carry the funpack-reads-fits4 proof.)
+/// lossless cases above carry the funpack-reads-fitskit proof.)
 #[test]
 fn encode_rice_float_quantize_within_tolerance() {
     require_samples!();

From de67f1b57ecfbe3347af00b71ec1d12f3d930d85 Mon Sep 17 00:00:00 2001
From: Steven Michael <ssmichael@gmail.com>
Date: Sun, 31 May 2026 16:26:12 -0400
Subject: [PATCH 9/9] Docs: expand lib.rs crate docs to mirror README

Add binary-table and tile-compression (compress -> decompress round-trip)
runnable examples, a Core types overview, and note that the gzip feature
now enables encoding as well as decoding.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
---
 src/lib.rs | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 70 insertions(+), 3 deletions(-)

diff --git a/src/lib.rs b/src/lib.rs
index 44b0a1b..a6d0973 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -68,6 +68,54 @@
 //! assert_eq!(bytes.len() % 2880, 0); // block-aligned
 //! ```
 //!
+//! ## Binary tables
+//!
+//! Build a `BINTABLE` extension column-by-column, then push it onto a file:
+//!
+//! ```
+//! use fitskit::{FitsFile, Hdu, BinTableBuilder, BinColumnType};
+//!
+//! let table = BinTableBuilder::new()
+//!     .add_column("RA", BinColumnType::D64(1))
+//!     .add_column("DEC", BinColumnType::D64(1))
+//!     .add_column("MAG", BinColumnType::E32(1))
+//!     .push_row(|row| {
+//!         row.write_f64(180.0);
+//!         row.write_f64(45.0);
+//!         row.write_f32(12.5);
+//!     })
+//!     .build();
+//!
+//! let mut fits = FitsFile::with_empty_primary();
+//! fits.push_extension(Hdu::bintable_extension(table));
+//! let _ = fits.to_bytes().unwrap();
+//! ```
+//!
+//! ## Tile-compressed images
+//!
+//! [`ImageData::compress`](image_data::ImageData::compress) produces a compressed-image
+//! `BINTABLE` HDU; [`Hdu::as_compressed_image`](hdu::Hdu::as_compressed_image) +
+//! [`CompressedImage::decompress`](tile_compress::CompressedImage::decompress) read it
+//! back. RICE_1 is lossless for integers, so this round-trips exactly:
+//!
+//! ```
+//! use fitskit::{FitsFile, ImageData, PixelData, CompressOptions, HduData};
+//!
+//! let pixels: Vec<i16> = (0..10000).map(|i| (i % 1000) as i16).collect();
+//! let img = ImageData::new(vec![100, 100], PixelData::I16(pixels.clone()));
+//!
+//! // Compress (default: RICE_1, one tile per row) into a BINTABLE HDU and write it.
+//! let mut fits = FitsFile::with_empty_primary();
+//! fits.push_extension(img.compress(&CompressOptions::default()).unwrap());
+//! let bytes = fits.to_bytes().unwrap();
+//!
+//! // Read back and decompress.
+//! let fits2 = FitsFile::from_bytes(&bytes).unwrap();
+//! let cimg = fits2.extensions()[0].as_compressed_image().unwrap();
+//! let restored = cimg.decompress().unwrap();
+//! assert!(matches!(restored.pixels, PixelData::I16(ref v) if *v == pixels));
+//! ```
+//!
 //! ## BSCALE/BZERO
 //!
 //! Physical values are computed as `BZERO + BSCALE * array_value`. The unsigned
@@ -98,13 +146,32 @@
 //! fits2.primary().verify_datasum().unwrap();
 //! ```
 //!
+//! ## Core types
+//!
+//! A FITS file is an ordered sequence of Header-Data Units (HDUs):
+//!
+//! - [`FitsFile`] — top-level container ([`Vec<Hdu>`](Hdu)); reads/writes files,
+//!   bytes, or any `Read`/`Write`, plus builder methods.
+//! - [`Hdu`] — one HDU: a [`Header`] plus an [`HduData`] payload.
+//! - [`HduData`] — payload enum: `Empty`, `Image`, `AsciiTable`, `BinTable`.
+//! - [`Header`] / [`Keyword`] / [`HeaderValue`] — ordered cards with typed accessors.
+//! - [`ImageData`] / [`PixelData`] — axes plus a typed pixel buffer; raw and
+//!   BSCALE/BZERO-scaled access.
+//! - [`BinTable`] / [`BinColumn`] / [`BinColumnType`] / [`BinCellValue`] — binary-table
+//!   model (with VLA heap), built via [`BinTableBuilder`].
+//! - [`AsciiTable`] — ASCII `TABLE` model.
+//! - [`CompressedImage`] / [`CompressionType`] / [`CompressOptions`] — tile-compression
+//!   read view and write options.
+//! - [`Bitpix`] — the `BITPIX` data type; [`Error`] / [`Result`] — error handling.
+//!
 //! ## Feature flags
 //!
 //! - **`image`** — enables conversion between [`ImageData`] and the
 //!   [`image`](https://crates.io/crates/image) crate's `DynamicImage`.
-//! - **`gzip`** — enables decoding of `GZIP_1`/`GZIP_2` tile-compressed images via
-//!   the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide) crate. The
-//!   default build stays dependency-free; `RICE_1` works without this feature.
+//! - **`gzip`** — enables encoding *and* decoding of `GZIP_1`/`GZIP_2` tile-compressed
+//!   images via the pure-Rust [`miniz_oxide`](https://crates.io/crates/miniz_oxide)
+//!   crate. The default build stays dependency-free; `RICE_1` (read and write),
+//!   `PLIO_1`, and `HCOMPRESS_1` decoding all work without this feature.
 
 pub mod ascii_table;
 pub mod bintable;