Flatten a sphere. desphere is a clean-room, MIT-licensed, zero-dependency
transcoder from NIST SPHERE audio (the .sph format used by TIMIT, WSJ,
Switchboard, and friends) to plain RIFF/WAV. The CLI is sph2wav.
No tooling, no upload: a browser page converts one or many
.sphfiles to WAV or FLAC entirely client-side (WASM) — live at https://ucpresearch.github.io/desphere/ (source inweb/). Multiple files come back as a.zip. Prefersph2pipe/ffmpeg/soxif you have them; the page is for when you don't.
pip install desphere # pure Python, zero deps, works anywhere
pip install "desphere[fast]" # + optional Rust accelerator (big shorten files)
sph2wav utterance.sph # -> utterance.wav
sph2wav utterance.sph out.wav
sph2wav --info utterance.sph # inspect the SPHERE header
sph2wav utterance.sph - # WAV to stdoutfrom desphere import read_sphere, transcode, transcode_bytes
# Streaming API (the reference):
header, data = read_sphere("utterance.sph")
with open("utterance.wav", "wb") as f:
transcode(header, data, f)
# One-shot bytes API:
wav_bytes = transcode_bytes(open("utterance.sph", "rb").read())Both APIs transparently use the Rust accelerator when installed (and fall
back to pure Python) — same bytes out either way. With desphere[fast], the
heavy kernels (shorten decode, G.711 expansion) run in Rust; pure-Python PCM is
already C-speed. The CLI passes a stray .wav through unchanged (with a warning),
so pointing it at an already-decoded file is harmless.
libsndfile/soundfile cannot read SPHERE (especially shorten-compressed
SPHERE), and the tools that can — sph2pipe, the original shorten, FFmpeg's
decoder — are GPL/LGPL or otherwise awkwardly licensed. desphere is a
permissively licensed, dependency-free reimplementation.
desphere is implemented only from:
- The public NIST SPHERE format description (ASCII header, typed object fields).
- ITU-T G.711 for μ-law / a-law.
- Tony Robinson (1994), SHORTEN: Simple lossless and near-lossless waveform compression (CUED/F-INFENG/TR.156) for the shorten algorithm.
- Black-box testing: running
sox/ffmpeg/sph2pipeas binaries and comparing only their output — never reading their source.
Copyright protects source code, not file formats or algorithms (the idea/expression dichotomy), so consulting prose/tabular descriptions of a format — even third-party ones — is fine. The one rule we never break: we do not read GPL/LGPL source code, and we do not use writeups that embed verbatim GPL source. Authoritative non-source references (the TR, ITU specs, NIST docs) are the primary sources; prose writeups are at most a cross-check.
The design principle is support the obvious lossless path first, and fail loudly on anything not yet validated — never emit a plausible-but-wrong WAV.
| Coding | Status |
|---|---|
pcm, 16-bit, 01/10 byte order |
✅ supported |
pcm, 32-bit, 01/10 byte order |
✅ supported |
pcm, multi-channel (interleaved) |
✅ supported (validate against an oracle for exotic files) |
pcm, 8-bit / 24-bit |
⛔ rejected (UnsupportedFormat) — sign/packing not yet validated |
ulaw / alaw (G.711, 8-bit) |
✅ supported (Phase B) — verified byte-exact vs ffmpeg on all 256 codes |
pcm,embedded-shorten-v2.00 (16-bit) |
✅ supported (Phase C) — byte-exact vs ffmpeg, mono & stereo |
ulaw,embedded-shorten (shorten type 8), incl. bitshift |
✅ supported (Phase C) — byte-exact vs sph2pipe (real CALLHOME) |
| shorten QLPC (LPC) blocks | ✅ supported (Phase C) — byte-exact vs shorten encoder + ffmpeg (orders 1–20) |
Adding a coding means registering a decoder in desphere/codecs.py; until then
the gate raises a precise error.
We don't depend on real corpus files to test: tools/make_fixtures.py
generates a zoo of SPHERE variants — every byte order, bit depth, and channel
count — and commits them under tests/fixtures/ with a manifest recording the
exact expected output. Harder codings (μ-law/a-law, eventually shorten) are
produced best-effort by driving sox/ffmpeg/shorten as black-box binaries.
python tools/make_fixtures.py # regenerate fixtures + manifest
pytest # validate everythingThe Python in src/desphere is the reference spec; rust/ is a
Rust port that reproduces it bit-for-bit (same fixtures), for speed and for
the web. It builds three ways from one crate:
- a Rust library — a Rust client (praatfan is a likely consumer) imports it via a path/git dependency;
- WASM (
wasm-pack build rust --target web --features wasm) — a WASM web app and theweb/page consume it; nothing is sent to a server; - a Python extension
desphere-native(pyo3/maturin) — the optionaldesphere[fast]accelerator above.
Because the consumers live in the same ucpresearch org, they depend on desphere
directly (path/git); publishing to crates.io / npm is not required (and is
skipped). PyPI is the one registry we target (the pure-Python desphere,
plus the optional desphere-native wheels).
The virtualenv lives outside the (Syncthing-synced) repo and is symlinked in:
uv venv ~/local/scr/venvs/desphere
ln -s ~/local/scr/venvs/desphere .venv
uv pip install -e ".[dev]"
pytest
# Rust port:
cd rust && cargo test # byte-exact vs the same fixtures- Phase A (done): SPHERE header + 16/32-bit PCM, lossless.
- Phase B (done): μ-law / a-law decode (ITU-T G.711).
- Phase C (done): embedded-shorten of 16-bit PCM, lossless μ-law (type 8,
including the non-linear BITSHIFT remap), and QLPC (LPC) blocks — validated
byte-for-byte vs ffmpeg / sph2pipe / the
shortenencoder on real and synthetic streams, mono and stereo. Remaining (low priority): 8/24-bit linear PCM. Seedocs/STATUS.mdanddocs/SHORTEN.md. - Rust port (done): a bit-for-bit Rust reimplementation in
rust/powers thedesphere[fast]accelerator, the WASM web page, and any Rust client, mirroringpraatfan-core-clean's Python-first-then-Rust path. The Python implementation stays the readable reference. Porting guidance:docs/RUST_PORT.md. - Released:
desphereanddesphere-nativeare on PyPI (v0.1.0); the web page is live on GitHub Pages. Release process:RELEASING.md.