
Add SPZ v4 (NGSP / ZSTD multi-stream) read and write support #332

Open
udwinj wants to merge 2 commits into sparkjsdev:main from udwinj:add-spz-v4-support

@udwinj udwinj commented May 4, 2026

Summary

Adds support for SPZ v4 to spark — both reading and writing — bringing parity with the latest nianticlabs/spz reference encoder.

SPZ v4 replaces the single gzip-wrapped payload (v1–v3) with a 32-byte NGSP header followed by per-attribute ZSTD-compressed streams. The wire format is identical to upstream's saveSpz() so files round-trip cleanly between the C++ encoder and spark.

Changes

Dependencies

  • Adds @bokuweb/zstd-wasm (~50 KB WASM blob, lazy-loaded). Used for both ZSTD compression and decompression. Native CompressionStream("zstd") was considered but dropped because it isn't yet supported in Firefox or Safari.

src/SplatLoader.ts

  • getSplatFileType recognizes the NGSP magic at offset 0 (v4) in addition to the existing gzip-wrapped detection (v1–v3).

src/spz.ts — read path

  • SpzReader detects v4 in the constructor by inspecting the first 4 bytes; legacy files continue to flow through GunzipReader unchanged.
  • For v4, parseHeader() parses the 32-byte NGSP header, awaits ZSTD WASM init, and decompresses every attribute stream up front into v4Streams: Uint8Array[].
  • parseSplats() uses a small read() abstraction so v3 (gzip stream) and v4 (pre-decompressed buffers) share the same decode logic.
  • The smallest-three quaternion branch now triggers on version >= 3 (was === 3) since v4 uses the same encoding as v3.
  • The LOD tail (a spark-only extension to the gzip format) is correctly skipped for v4 files.
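The shared read() idea can be sketched as follows. This is a hedged illustration in Rust rather than the PR's TypeScript; the `Source` enum, its variants, `decode_alphas`, and the stream index used for alphas are all hypothetical names and assumptions, not spark's actual API.

```rust
/// Where decoded attribute bytes come from.
enum Source {
    /// v1-v3: one decompressed payload, consumed front to back.
    Sequential { data: Vec<u8>, pos: usize },
    /// v4: one pre-decompressed buffer per attribute, each with its own cursor.
    PerStream { streams: Vec<Vec<u8>>, pos: Vec<usize> },
}

impl Source {
    /// Return the next `len` bytes for attribute `stream`, or None if not
    /// enough data remains. `stream` is ignored for the sequential layout.
    fn read(&mut self, stream: usize, len: usize) -> Option<&[u8]> {
        match self {
            Source::Sequential { data, pos } => {
                let end = pos.checked_add(len)?;
                let out = data.get(*pos..end)?;
                *pos = end;
                Some(out)
            }
            Source::PerStream { streams, pos } => {
                let cursor = pos.get_mut(stream)?;
                let end = cursor.checked_add(len)?;
                let out = streams.get(stream)?.get(*cursor..end)?;
                *cursor = end;
                Some(out)
            }
        }
    }
}

/// Shared decode step: pull one alpha byte per splat and normalise to [0, 1].
/// Stream index 1 for alphas is an assumption (0 = positions, 1 = alphas).
fn decode_alphas(src: &mut Source, num_splats: usize) -> Option<Vec<f32>> {
    let bytes = src.read(1, num_splats)?;
    Some(bytes.iter().map(|&b| b as f32 / 255.0).collect())
}
```

With this shape the per-attribute decode functions never need to know which container they are reading from, as long as attributes are requested in the same order the sequential payload stores them.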

src/spz.ts — write path

  • SPZ_VERSION bumped to 4.
  • SpzWriter now stores each attribute in its own Uint8Array (positions, alphas, colors, scales, rotations, sh) and assembles [32-byte header][TOC][ZSTD streams] in finalize().
  • Setter API (setCenter / setAlpha / setRgb / setScale / setQuat / setSh) is unchanged, so transcodeSpz and other callers don't need updates.
  • TOC entries are [u64 compressedSize LE][u64 uncompressedSize LE], matching the reference encoder.
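The [32-byte header][TOC][ZSTD streams] assembly can be sketched like this. It is a minimal Rust illustration, not the PR's TypeScript finalize(): `CompressedStream` and `assemble_v4` are hypothetical names, the header is treated as an opaque 32-byte blob because its field layout is not spelled out here, and the streams are assumed to have been compressed already by an external ZSTD codec.

```rust
/// One attribute stream, already ZSTD-compressed by some external codec.
struct CompressedStream {
    data: Vec<u8>,
    uncompressed_size: u64,
}

/// Concatenate header, TOC, and streams in the order the PR describes.
fn assemble_v4(header: &[u8; 32], streams: &[CompressedStream]) -> Vec<u8> {
    let payload: usize = streams.iter().map(|s| s.data.len()).sum();
    let mut out = Vec::with_capacity(32 + 16 * streams.len() + payload);
    out.extend_from_slice(header);
    // TOC: one [u64 compressedSize LE][u64 uncompressedSize LE] pair per stream.
    for s in streams {
        out.extend_from_slice(&(s.data.len() as u64).to_le_bytes());
        out.extend_from_slice(&s.uncompressed_size.to_le_bytes());
    }
    // Compressed bytes follow in the same order as their TOC entries.
    for s in streams {
        out.extend_from_slice(&s.data);
    }
    out
}
```

Writing the TOC before the payload means every size must be known up front, which is why each attribute is buffered and compressed in full before anything is emitted.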

Backward compatibility

  • Reads of v1–v3 files are unchanged (same code path).
  • The only writer behavior change is the move from v3 to v4: the writer no longer emits gzip files. Callers that need legacy v3 output would require a flag, to be added in a follow-up.

Known gaps (out of scope)

  • Extensions (FlagHasExtensions = 0x2): not read or written. Reader correctly skips over them via the tocByteOffset field, so files containing extensions still load.
  • SH degree 4 (upstream SH_MAX_DEGREE = 4): pre-existing spark limitation — SH_DEGREE_TO_VECS only goes up to 3. Files with shDegree == 4 already failed to load before this PR; behavior is unchanged.
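To make the SH degree limit concrete, here is a small Rust sketch of a degree-to-vector-count lookup. It assumes the table holds the usual count of non-DC spherical-harmonics coefficient vectors, (degree + 1)^2 - 1; `sh_vecs` and its `max_degree` parameter are hypothetical and only model the ceiling described above.

```rust
/// Number of non-DC spherical-harmonics coefficient vectors per splat for a
/// given degree, using the closed form (degree + 1)^2 - 1.
/// `max_degree` models the decoder's supported ceiling (3 in spark, per the PR).
fn sh_vecs(degree: u8, max_degree: u8) -> Option<usize> {
    if degree > max_degree {
        // Mirrors the described behavior: unsupported degrees fail to load.
        return None;
    }
    let d = degree as usize + 1;
    Some(d * d - 1)
}
```

Under this assumption, degrees 1 through 3 give 3, 8, and 15 vectors, and supporting degree 4 would mean handling 24 vectors per splat.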

judwin and others added 2 commits May 4, 2026 14:41
SPZ v4 files use a 32-byte NGSP header with per-attribute ZSTD-compressed
streams instead of a single gzip-wrapped payload (v1-v3). This adds
@bokuweb/zstd-wasm for ZSTD codec support (works in all browsers via
WASM, no CompressionStream("zstd") browser dependency) and updates both
the reader and writer to handle v4.

Read path:
- getSplatFileType: detect v4 files that start with NGSP magic directly
  (not gzip-wrapped) and return SplatFileType.SPZ
- SpzReader: detect v4 in constructor via magic bytes; parse the 32-byte
  header and decompress all attribute streams upfront in parseHeader()
- SpzReader: unified read() abstraction in parseSplats() so v3 and v4
  share identical decode logic
- SpzReader: extend smallest-three quaternion path to version >= 3
  (was === 3), since v4 uses the same encoding as v3
- SpzReader: legacy v1-v3 gzip path is unchanged

Write path:
- SPZ_VERSION bumped from 3 to 4
- SpzWriter rewritten to keep per-attribute Uint8Array buffers and emit
  the v4 file layout: [32-byte header][TOC][concatenated ZSTD streams]
- Setter API (setCenter, setAlpha, setRgb, setScale, setQuat, setSh) is
  unchanged, so transcodeSpz and other callers don't need updates
- finalize() ZSTD-compresses each attribute stream independently and
  assembles the output, mirroring the C++ saveSpz() reference encoder

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
The viewer's load path goes through the spark-worker-rs WASM module,
which uses spark-lib's Rust SpzDecoder. Without this change, v4 files
fail with "Invalid gzip header" even though the TS SpzReader handles
them, because the worker never invokes the TS path.

This adds a parallel v4 path to SpzDecoder that mirrors the C++
reference implementation:

- New SpzFormat enum (Unknown / Gzip / Ngsp). The decoder picks the format
  from the first 4 bytes of input: NGSP magic = v4, gzip magic = legacy.
- For v4: accumulate raw bytes, parse the 32-byte NgspFileHeader, walk
  the TOC (numStreams × [u64 compressedSize][u64 uncompressedSize]),
  ZSTD-decompress each attribute stream with ruzstd, concatenate the
  decompressed bytes in stream order, then run the existing per-stage
  state machine (Centers/Alphas/Rgb/Scales/Quats/Sh).
- For v1-v3: gzip path is unchanged.
- Smallest-three quaternion branch now triggers on version >= 3, since
  v4 uses the same encoding as v3.

decoder.rs: MultiDecoder routes files starting with NGSP magic directly
to SpzDecoder (in addition to the existing gzip-wrapped detection).

Dependencies:
- ruzstd 0.7 (pure-Rust ZSTD decoder; works in WASM with no C bindings)

The Rust SpzEncoder is intentionally untouched — only the build-lod
CLI uses it. SPZ writing from spark.js goes through the TypeScript
SpzWriter, which already produces v4 files.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

udwinj commented May 4, 2026

Update: Rust WASM decoder now also supports SPZ v4

After initial testing, we found that the viewer's load path (SplatLoader → workerPool → worker.ts) goes through the Rust WASM decoder (rust/spark-lib/src/spz.rs), not the TypeScript SpzReader. The first commit's TS changes only covered the transcodeSpz / legacy worker path. v4 files were failing with "Invalid gzip header" because the Rust decoder was hitting the NGSP magic bytes and treating them as malformed gzip.

A second commit adds full v4 support to the Rust decoder.


What the second commit changes

New dependency — ruzstd 0.7.3

Pure-Rust ZSTD decoder, no C bindings, compiles cleanly to WASM.

# rust/Cargo.toml (workspace)
ruzstd = { version = "0.7.3", default-features = false, features = ["std"] }

rust/spark-lib/src/decoder.rs

MultiDecoder now recognises NGSP magic at the start of a file (in addition to the existing gzip-wrapped detection):

if magic == SPZ_MAGIC {
    // NGSP magic at file start — SPZ v4 (ZSTD multi-stream, not gzip-wrapped)
    return self.init_file_type(SplatFileType::SPZ);
}

rust/spark-lib/src/spz.rs

Format detection — new SpzFormat enum; the decoder self-detects on the first 4 bytes of input:

enum SpzFormat { Unknown, Gzip, Ngsp }
  • First 4 bytes = NGSP → v4 path (accumulate raw bytes, decode via try_decode_v4())
  • First 4 bytes = gzip magic → v1–v3 path (existing streaming gzip, unchanged)
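A self-contained sketch of this detection step might look like the following. The `SpzFormat` variants come from the PR; `detect_format` and the two constants are hypothetical names, and the actual decoder may structure this differently (for example, as part of its streaming push loop).

```rust
/// Container format inferred from the first bytes of a file.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SpzFormat {
    Unknown,
    Gzip,
    Ngsp,
}

/// ASCII "NGSP", the magic that opens a v4 file.
const NGSP_MAGIC: [u8; 4] = *b"NGSP";
/// The two-byte gzip member signature (RFC 1952).
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// Classify a buffer by its leading bytes. Returns `Unknown` both when the
/// bytes match neither magic and when too few bytes have arrived to decide,
/// so a streaming caller can retry once more input is buffered.
fn detect_format(bytes: &[u8]) -> SpzFormat {
    if bytes.starts_with(&NGSP_MAGIC) {
        SpzFormat::Ngsp
    } else if bytes.starts_with(&GZIP_MAGIC) {
        SpzFormat::Gzip
    } else {
        SpzFormat::Unknown
    }
}
```

Note that a gzip signature alone only says the payload is gzip-wrapped; confirming that it is a legacy SPZ file still requires inflating the first bytes and checking the inner header.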

try_decode_v4() — once all bytes are buffered, parses the 32-byte NGSP header, walks the TOC, ZSTD-decompresses each attribute stream with ruzstd, then feeds decompressed bytes into the existing per-stage state machine (Centers → Alphas → Rgb → Scales → Quats → Sh):

use std::io::Read; // brings read_to_end() into scope for the streaming decoder

fn try_decode_v4(&mut self) -> anyhow::Result<()> {
    // parse 32-byte header, validate magic + version
    // walk TOC: numStreams × [u64 compressedSize][u64 uncompressedSize]
    for (offset, size) in &compressed_offsets {
        // Decompress each attribute stream and append it to the shared buffer in stream order.
        let compressed = &self.raw[*offset..*offset + *size];
        let mut decoder = ruzstd::StreamingDecoder::new(compressed)?;
        decoder.read_to_end(&mut self.buffer)?;
    }
    // Hand the concatenated bytes to the existing per-stage state machine.
    self.init_state(version, num_splats, sh_degree, fractional_bits, flags)?;
    self.poll_sections()?;
    self.done = true;
    Ok(())
}
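The "walk TOC" step elided in the snippet above could be sketched as below. This is an illustration under stated assumptions, not the PR's code: `StreamEntry`, `read_u64_le`, and `walk_toc` are hypothetical names, the TOC offset and stream count are taken as already parsed from the header, and compressed streams are assumed to follow the TOC back to back.

```rust
use std::convert::TryFrom;

/// One TOC entry plus the byte offset of its compressed data in the file.
#[derive(Debug, PartialEq, Eq)]
struct StreamEntry {
    offset: usize,
    compressed_size: usize,
    uncompressed_size: usize,
}

/// Read a little-endian u64 at `pos`, or None if it would run past the end.
fn read_u64_le(raw: &[u8], pos: usize) -> Option<u64> {
    let end = pos.checked_add(8)?;
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(raw.get(pos..end)?);
    Some(u64::from_le_bytes(bytes))
}

/// Walk `num_streams` TOC entries starting at `toc_offset` and compute where
/// each compressed stream lives, rejecting anything that leaves the buffer.
fn walk_toc(raw: &[u8], toc_offset: usize, num_streams: usize) -> Result<Vec<StreamEntry>, String> {
    let toc_len = num_streams.checked_mul(16).ok_or("TOC size overflow")?;
    let mut cursor = toc_offset.checked_add(toc_len).ok_or("TOC offset overflow")?;
    if cursor > raw.len() {
        return Err("TOC extends past end of file".to_string());
    }
    let mut entries = Vec::with_capacity(num_streams);
    for i in 0..num_streams {
        let base = toc_offset + i * 16;
        let compressed = read_u64_le(raw, base).ok_or("truncated TOC")?;
        let uncompressed = read_u64_le(raw, base + 8).ok_or("truncated TOC")?;
        let compressed_size = usize::try_from(compressed).map_err(|e| e.to_string())?;
        let uncompressed_size = usize::try_from(uncompressed).map_err(|e| e.to_string())?;
        let end = cursor.checked_add(compressed_size).ok_or("stream size overflow")?;
        if end > raw.len() {
            return Err(format!("stream {} extends past end of file", i));
        }
        entries.push(StreamEntry { offset: cursor, compressed_size, uncompressed_size });
        cursor = end;
    }
    Ok(entries)
}
```

Validating every offset before slicing keeps a malformed or truncated file from panicking inside the WASM worker; the recorded uncompressed sizes could additionally be checked against the decoder's output.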

Quaternion branch extended to version >= 3 (was == 3), since v4 uses the same smallest-three encoding as v3.
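For readers unfamiliar with smallest-three encoding, here is a generic round-trip sketch. The bit layout (a 2-bit index of the dropped component in the top bits, then three 10-bit fields of 1 sign bit plus a 9-bit magnitude) is an assumption for illustration and should be checked against the reference encoder before relying on it; the function names are hypothetical.

```rust
use std::f32::consts::{FRAC_1_SQRT_2, SQRT_2};

/// Pack a unit quaternion into 32 bits: drop the largest-magnitude component
/// and store the other three, each as a sign bit plus a 9-bit magnitude.
fn pack_smallest_three(q: [f32; 4]) -> u32 {
    // Find the component with the largest absolute value.
    let mut largest = 0;
    for i in 1..4 {
        if q[i].abs() > q[largest].abs() {
            largest = i;
        }
    }
    // q and -q are the same rotation, so flip signs to make the dropped component positive.
    let flip = q[largest] < 0.0;
    let max_mag = 511.0_f32; // 2^9 - 1
    let mut packed = largest as u32;
    for i in 0..4 {
        if i == largest {
            continue;
        }
        let v = if flip { -q[i] } else { q[i] };
        // The remaining components are at most 1/sqrt(2), so scale by sqrt(2) to fill the range.
        let mag = (v.abs() * SQRT_2 * max_mag).round().min(max_mag) as u32;
        let neg = (v < 0.0) as u32;
        packed = (packed << 10) | (neg << 9) | mag;
    }
    packed
}

/// Inverse of `pack_smallest_three`: rebuild the dropped component from unit length.
fn unpack_smallest_three(mut packed: u32) -> [f32; 4] {
    let largest = (packed >> 30) as usize;
    let mut q = [0.0_f32; 4];
    let mut sum_sq = 0.0_f32;
    // Fields were pushed in ascending index order, so pop them in descending order.
    for i in (0..4).rev() {
        if i == largest {
            continue;
        }
        let mag = (packed & 0x1ff) as f32;
        let neg = ((packed >> 9) & 1) == 1;
        packed >>= 10;
        let v = FRAC_1_SQRT_2 * mag / 511.0;
        q[i] = if neg { -v } else { v };
        sum_sq += v * v;
    }
    q[largest] = (1.0 - sum_sq).max(0.0).sqrt();
    q
}
```

Because the decode only depends on the packed layout, a `version >= 3` check is enough to route both v3 and v4 files through the same unpacking code.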

The v1–v3 gzip path is byte-for-byte unchanged.
