Pure Go implementation of the SNAC neural audio codec decoder for text-to-speech applications.
SNAC-Go decodes multi-scale neural audio codes into high-quality audio waveforms. It integrates with llama-go to enable complete Go-based text-to-speech using models like Orpheus TTS, with no Python runtime dependencies.
- Pure Go implementation - No CGO, Python, or PyTorch dependencies
- SNAC 24kHz model - Speech-optimised decoder (19.8M parameters)
- Hierarchical decoding - Multi-scale vector quantisation with 3 codebook levels
- Production-ready - Ground-truth verified against upstream Python SNAC (MSE < 1e-5)
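
To make the hierarchical layout concrete, here is a minimal sketch of a shape check for the three codebook levels. It assumes each level runs at twice the rate of the one before it (lengths in a 1:2:4 ratio) and that each codebook has 4096 entries. Both values are assumptions based on the upstream SNAC 24kHz configuration, not taken from snac-go. `validateCodes` and `codebookSize` are hypothetical names used for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// codebookSize is an assumed value for the 24kHz model, not read from snac-go.
const codebookSize = 4096

// validateCodes checks that the three levels have lengths in a 1:2:4 ratio
// (an assumption about the 24kHz model) and that every code is a valid index.
func validateCodes(codes [3][]int) error {
	n := len(codes[0])
	if n == 0 {
		return errors.New("coarse level is empty")
	}
	for level, seq := range codes {
		want := n << level // 1x, 2x and 4x the coarse length
		if len(seq) != want {
			return fmt.Errorf("level %d: got %d codes, want %d", level, len(seq), want)
		}
		for i, c := range seq {
			if c < 0 || c >= codebookSize {
				return fmt.Errorf("level %d, index %d: code %d out of range", level, i, c)
			}
		}
	}
	return nil
}

func main() {
	codes := [3][]int{
		{10, 20},
		{1, 2, 3, 4},
		{5, 6, 7, 8, 9, 10, 11, 12},
	}
	fmt.Println(validateCodes(codes)) // prints <nil>
}
```

A check like this could run before decoding, so that malformed model output fails with a clear message rather than deep inside the decoder.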
Install:

    go get github.com/tcpipuk/snac-go

Usage:

    package main

    import (
        "log"

        "github.com/tcpipuk/snac-go/snac"
    )

    func main() {
        // Load the SNAC decoder with the 24kHz model.
        decoder, err := snac.NewDecoder("hubertsiuzdak/snac_24khz")
        if err != nil {
            log.Fatal(err)
        }

        // tokens holds the 3 hierarchical codebook levels, typically produced
        // by a TTS model such as Orpheus. It is left empty here as a placeholder.
        var tokens [3][]int

        // Decode SNAC tokens to audio.
        audio, err := decoder.Decode(tokens)
        if err != nil {
            log.Fatal(err)
        }

        // Write to a WAV file. writeWAV is not defined in this snippet;
        // supply your own WAV writer.
        if err := writeWAV("output.wav", audio, 24000); err != nil {
            log.Fatal(err)
        }
    }

The Python SNAC decoder is roughly 474 lines, whilst this implementation is nearly 4,700. The gap exists because PyTorch does the heavy lifting in Python, whereas here convolutions and attention are implemented manually using Gorgonia. The goal is a pure Go TTS pipeline: llama-go runs Orpheus and snac-go decodes the audio. GoMLX would shrink the codebase significantly, but it is pre-v1.0 and its API is unstable. The current approach is pragmatic: a reliable pure Go decoder now, ready to migrate once GoMLX stabilises.
See the Architecture Guide for technical details on how the SNAC decoder works.
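
To give a sense of what implementing convolutions by hand involves, here is a deliberately simplified single-channel 1D convolution in plain Go. It is an illustration only and not the snac-go code: the real decoder works on Gorgonia tensors, and production layers also handle multiple channels, padding, dilation and bias. `conv1d` is a hypothetical name.

```go
package main

import "fmt"

// conv1d computes a "valid" single-channel 1D cross-correlation, which is
// the operation deep learning frameworks call convolution, with a stride.
// It returns nil if the stride or sizes make the operation undefined.
func conv1d(input, kernel []float32, stride int) []float32 {
	if stride < 1 || len(kernel) == 0 || len(input) < len(kernel) {
		return nil
	}
	outLen := (len(input)-len(kernel))/stride + 1
	out := make([]float32, outLen)
	for o := range out {
		var sum float32
		for k, w := range kernel {
			sum += input[o*stride+k] * w
		}
		out[o] = sum
	}
	return out
}

func main() {
	x := []float32{1, 2, 3, 4, 5}
	k := []float32{1, 0, -1}
	fmt.Println(conv1d(x, k, 1)) // prints [-2 -2 -2]
}
```

Each layer that PyTorch supplies in a single call needs this kind of explicit indexing, shape handling and testing when written manually, which is where much of the extra line count goes.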
- ✅ Architecture implementation complete (~4700 lines pure Go)
- ✅ Weight loading and conversion tooling ready
- ✅ Ground truth verification complete (151 tests passing, MSE < 1e-5)
- ✅ Production-ready decoder verified against Python SNAC
- ⏳ API subject to change before v1.0
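
The MSE threshold quoted above can be illustrated with a short sketch. This is not the project's test harness: `mse` is a hypothetical helper, and the sample values are made up to show the comparison against a reference waveform.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// mse returns the mean squared error between two equal-length waveforms.
func mse(got, want []float32) (float64, error) {
	if len(got) != len(want) {
		return 0, errors.New("waveforms differ in length")
	}
	if len(got) == 0 {
		return 0, errors.New("waveforms are empty")
	}
	var sum float64
	for i := range got {
		d := float64(got[i]) - float64(want[i])
		sum += d * d
	}
	return sum / float64(len(got)), nil
}

func main() {
	// Made-up samples standing in for Go output and a Python reference.
	got := []float32{0.5, -0.25, 0.125}
	want := []float32{0.5, -0.25, 0.126}
	m, err := mse(got, want)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(m < 1e-5) // prints true
}
```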
Licensed under the Apache License 2.0.