Neural Codecs

A pipeline library for batch encode → decode round-trips through neural audio codec models. Feed it a folder of WAV files, pick a codec, get reconstructed WAVs — useful for building codec-distorted datasets, evaluating codec quality, or preprocessing audio for TTS/ASR training.

Codec Support

ID	Name	Sample Rate	Install	Output Channels
1	`snac_24khz`	24 kHz	`pip install snac`	mono
2	`snac_32khz`	32 kHz	`pip install snac`	mono
3	`snac_44khz`	44.1 kHz	`pip install snac`	mono
4	`dac_16khz`	16 kHz	`pip install descript-audio-codec`	mono
5	`dac_24khz`	24 kHz	`pip install descript-audio-codec`	mono
6	`dac_44khz`	44.1 kHz	`pip install descript-audio-codec`	mono
7	`encodec_24khz`	24 kHz	`pip install transformers encodec`	mono
8	`encodec_48khz`	48 kHz	`pip install transformers encodec`	stereo
9	`soundstream_16khz`	16 kHz	`pip install soundstream` ⚠️	mono
10	`speechtokenizer`	16 kHz	pip + manual checkpoint	mono
—	`FunCodec`	16 kHz	external repo	—
—	`AudioDec`	24 / 48 kHz	external repo	—

⚠️ SoundStream (soundstream==0.0.1) pins numpy<2.0 and huggingface-hub<0.16. After installing it, run pip install --upgrade huggingface-hub to keep EnCodec working. For a fully clean setup, use a dedicated virtual environment for SoundStream.

Model weights for codecs 1–10 download automatically on first use, except SpeechTokenizer (ID 10), whose checkpoint must be downloaded manually as described in its section below.
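The table doubles as a lookup: the CLI accepts either the numeric ID or the name in the `--codec` flag. The sketch below illustrates that mapping only. `CODEC_NAMES` and `resolve_codec` are hypothetical names made up for this example, and the real `audio_codec/registry.py` may be organised differently.

```python
# Hypothetical lookup mirroring the codec table; not the library's actual registry.
CODEC_NAMES = {
    1: "snac_24khz",
    2: "snac_32khz",
    3: "snac_44khz",
    4: "dac_16khz",
    5: "dac_24khz",
    6: "dac_44khz",
    7: "encodec_24khz",
    8: "encodec_48khz",
    9: "soundstream_16khz",
    10: "speechtokenizer",
}


def resolve_codec(codec: str) -> str:
    """Accept a numeric ID ('7') or a name ('encodec_24khz') and return the name."""
    key = codec.strip().lower()
    if key.isdigit():
        try:
            return CODEC_NAMES[int(key)]
        except KeyError:
            raise ValueError(f"unknown codec ID: {codec}") from None
    if key in CODEC_NAMES.values():
        return key
    raise ValueError(f"unknown codec name: {codec}")
```

With this mapping, `resolve_codec("7")` and `resolve_codec("encodec_24khz")` refer to the same codec.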

Project Structure

Neural-Codecs/
├── audio_codec/
│   ├── config.py          ← AUTO_INSTALL_DEPS flag lives here
│   ├── registry.py        ← codec metadata (packages, import checks, hub names)
│   ├── installer.py       ← dep-check, auto-install, setup commands
│   ├── cli.py             ← neural-codec CLI entry point
│   └── codecs/
│       ├── snac.py
│       ├── dac.py
│       ├── encodec24.py
│       ├── encodec48.py
│       ├── soundstream.py
│       └── speechtokenizer.py
├── requirements/
│   ├── base.txt           ← torch, torchaudio, soundfile, numpy, tqdm
│   ├── snac.txt           ← IDs 1–3
│   ├── dac.txt            ← IDs 4–6
│   ├── encodec.txt        ← IDs 7–8
│   ├── soundstream.txt    ← ID 9
│   └── speechtokenizer.txt← ID 10
├── config/
│   └── config.json        ← SpeechTokenizer model config
├── checkpoints/           ← place SpeechTokenizer.pt here
├── audio_sample/          ← put your input WAV files here
└── pyproject.toml

Installation

git clone https://github.com/CodeVault-girish/NeuralCodecDecoder.git
cd NeuralCodecDecoder
pip install -e .

This registers the neural-codec CLI command. Base dependencies (torch, torchaudio, soundfile, tqdm) are installed automatically. Per-codec packages are installed on demand.

Auto-Install

Missing codec dependencies are installed automatically the first time you run a codec.

This behaviour is controlled by a single flag in audio_codec/config.py:

# audio_codec/config.py

# True (default): auto-install missing packages before decoding.
# False: print the install command and exit instead.
AUTO_INSTALL_DEPS = True

Value	Behaviour
`True`	First `decode_folder()` call installs any missing packages, then runs. No manual setup needed.
`False`	Prints the missing packages and exact `pip install` / `neural-codec setup` command, then exits cleanly.
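To make the two behaviours concrete, here is a minimal sketch of a dependency check driven by such a flag. It is not the code in `audio_codec/installer.py`: `missing_modules`, `pip_command`, and `handle_missing` are hypothetical helpers, and the only library calls used are the standard `importlib.util.find_spec` and `subprocess.check_call`. The import name and the pip package name can differ (for example, `descript-audio-codec` is imported as `dac`), so the sketch maps one to the other.

```python
import importlib.util
import subprocess
import sys


def missing_modules(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


def pip_command(packages):
    """Build a pip install command that targets the running interpreter."""
    return [sys.executable, "-m", "pip", "install", *packages]


def handle_missing(import_to_package, auto_install):
    """Check imports; install or report missing packages depending on the flag.

    import_to_package maps an import name to the pip package that provides it.
    Returns "ready", "installed", or "manual".
    """
    missing = missing_modules(import_to_package)
    if not missing:
        return "ready"
    cmd = pip_command([import_to_package[name] for name in missing])
    if auto_install:
        subprocess.check_call(cmd)  # raises CalledProcessError if pip fails
        return "installed"
    print("Missing packages. Run:", " ".join(cmd))
    return "manual"
```

Calling `handle_missing({"dac": "descript-audio-codec"}, auto_install=False)` prints the command instead of running it, which corresponds to the `False` row of the table.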

Quick Start

# 1. See all codecs with live install status
neural-codec list

# 2. Decode a folder — auto-installs deps on first run (AUTO_INSTALL_DEPS=True)
neural-codec decode --codec snac_24khz --input ./audio_sample --output ./out

# 3. Use a different codec
neural-codec decode --codec dac_16khz    --input ./audio_sample --output ./out
neural-codec decode --codec encodec_24khz --input ./audio_sample --output ./out

# 4. Use GPU
neural-codec decode --codec snac_24khz --input ./audio_sample --output ./out --device cuda

# 5. Use codec by numeric ID instead of name
neural-codec decode --codec 7 --input ./audio_sample --output ./out

# 6. Pre-install deps without decoding
neural-codec setup --codec snac_24khz
neural-codec setup --all

CLI Reference

`neural-codec list`

Shows every codec — ID, name, sample rate, install status, and required packages. Also shows whether AUTO_INSTALL_DEPS is currently enabled.

neural-codec list

`neural-codec setup`

Install dependencies for a codec without running it.

neural-codec setup --codec snac_24khz       # install by name
neural-codec setup --codec 1                # install by ID
neural-codec setup --all                    # install all pip-installable codecs

# External codecs — prints manual setup steps
neural-codec setup --codec funcodec
neural-codec setup --codec audiodec

`neural-codec decode`

Batch encode/decode all WAV files in a folder (recursive).

neural-codec decode --codec <NAME_OR_ID> --input <DIR> --output <DIR> [--device cpu|cuda]

Flag	Required	Description
`--codec`	yes	Codec name (`snac_24khz`) or numeric ID (`1`)
`--input`	yes	Folder with `.wav` files (searched recursively)
`--output`	yes	Folder where decoded files are written
`--device`	no	`cpu` (default) or `cuda`

Output files are named <original_stem>_<codec_name>.wav.
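The naming rule and the recursive search can be sketched in a few lines of Python. This is an illustration, not the library's code: `find_wavs` and `output_path_for` are hypothetical helpers, and the sketch assumes all outputs land directly in the output folder, since this README does not say whether the input subfolder layout is mirrored.

```python
from pathlib import Path


def find_wavs(input_dir):
    """Return every .wav file under input_dir, searched recursively, in sorted order."""
    return sorted(Path(input_dir).rglob("*.wav"))


def output_path_for(wav_path, output_dir, codec_name):
    """Apply the <original_stem>_<codec_name>.wav naming rule.

    Assumption: outputs are written flat into output_dir.
    """
    stem = Path(wav_path).stem
    return Path(output_dir) / f"{stem}_{codec_name}.wav"
```

For example, `output_path_for("audio_sample/spk1/hello.wav", "out", "snac_24khz")` gives `out/hello_snac_24khz.wav`. Under the flat-output assumption, two inputs with the same stem in different subfolders would map to the same output name.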

Python API

from audio_codec import decode_folder, decoder_list, setup_codec, setup_all

# show all codecs and their install status
decoder_list()

# decode a folder (auto-installs deps if AUTO_INSTALL_DEPS=True)
decode_folder("1",           "audio_sample/", "out/", "cpu")   # by ID
decode_folder("snac_24khz",  "audio_sample/", "out/", "cuda")  # by name

# explicitly install before decoding
setup_codec("snac_24khz")
setup_all()

# check if a codec's deps are satisfied (no side effects)
from audio_codec import deps_satisfied
from audio_codec.registry import CODEC_REGISTRY
print(deps_satisfied(CODEC_REGISTRY["7"]))  # True / False
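A common use of the API is running the same input folder through several codecs. The sketch below wraps that loop; `decode_with_codecs` is a hypothetical helper, and it takes the decode function as a parameter so it relies only on the positional call shape shown above (`codec, input_dir, output_dir, device`). Writing each codec to its own subfolder is a choice made for this example.

```python
from pathlib import Path


def decode_with_codecs(decode_fn, codecs, input_dir, output_root, device="cpu"):
    """Call decode_fn once per codec, giving each codec its own output subfolder.

    Returns a dict mapping each codec to "ok" or "failed: <message>".
    """
    results = {}
    for codec in codecs:
        out_dir = Path(output_root) / codec
        out_dir.mkdir(parents=True, exist_ok=True)
        try:
            decode_fn(codec, str(input_dir), str(out_dir), device)
            results[codec] = "ok"
        except Exception as exc:  # keep going so one failing codec does not stop the batch
            results[codec] = f"failed: {exc}"
    return results


# Usage with the real API (requires this package to be installed):
#   from audio_codec import decode_folder
#   print(decode_with_codecs(decode_folder, ["snac_24khz", "dac_16khz"],
#                            "audio_sample/", "out/"))
```

If `AUTO_INSTALL_DEPS` is `False` and the library exits via `SystemExit`, the `except Exception` clause will not catch it, so the loop would stop at the first codec with missing packages.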

Per-Codec Requirements

1 · 2 · 3 — SNAC

Models: snac_24khz · snac_32khz · snac_44khz
Weights: auto-download from HuggingFace (~30–80 MB each)

# requirements file
pip install -r requirements/snac.txt

# or via CLI
neural-codec setup --codec snac_24khz

Package	Notes
`snac`	SNAC model
`torchaudio`	resampling
`soundfile`	WAV I/O

neural-codec decode --codec snac_24khz --input ./wavs --output ./out
neural-codec decode --codec snac_32khz --input ./wavs --output ./out
neural-codec decode --codec snac_44khz --input ./wavs --output ./out

4 · 5 · 6 — DAC (Descript Audio Codec)

Models: dac_16khz · dac_24khz · dac_44khz
Weights: auto-download via dac.utils.download() (~75 MB each)

pip install -r requirements/dac.txt

neural-codec setup --codec dac_16khz

Package	Notes
`descript-audio-codec`	DAC model + bundles `audiotools`
`soundfile`	WAV I/O

neural-codec decode --codec dac_16khz --input ./wavs --output ./out
neural-codec decode --codec dac_24khz --input ./wavs --output ./out
neural-codec decode --codec dac_44khz --input ./wavs --output ./out

7 · 8 — EnCodec (Facebook)

Models: encodec_24khz (mono) · encodec_48khz (stereo)
Weights: auto-download from HuggingFace

pip install -r requirements/encodec.txt

neural-codec setup --codec encodec_24khz

Package	Notes
`transformers`	HuggingFace model loader
`encodec`	EnCodec core
`soundfile`	WAV I/O

neural-codec decode --codec encodec_24khz --input ./wavs --output ./out  # mono, 24 kHz
neural-codec decode --codec encodec_48khz --input ./wavs --output ./out  # stereo, 48 kHz

9 — SoundStream

Model: soundstream_16khz
Weights: auto-download from HuggingFace (naturalspeech2.pt, ~143 MB)

pip install -r requirements/soundstream.txt

# After installing, restore a newer huggingface-hub so EnCodec still works:
pip install --upgrade huggingface-hub

neural-codec setup --codec soundstream_16khz

Package	Version	Notes
`soundstream`	0.0.1	pins `numpy<2.0` and `huggingface-hub<0.16`
`soundfile`	latest	WAV I/O

Dependency conflict with EnCodec: soundstream==0.0.1 forces huggingface-hub<0.16, which breaks transformers. Fix: after installing soundstream, run pip install --upgrade huggingface-hub. Both codecs then work in the same environment. For a fully isolated setup, use a dedicated virtual environment:

python -m venv venv_soundstream
venv_soundstream\Scripts\activate         # Windows
source venv_soundstream/bin/activate      # Linux / Mac

pip install -r requirements/soundstream.txt
neural-codec decode --codec soundstream_16khz --input ./wavs --output ./out
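When both codecs share one environment, it helps to check which `huggingface-hub` version ended up installed. The sketch below does that with the standard library's `importlib.metadata`. The helper names are hypothetical, and the `(0, 16)` threshold is taken from the `huggingface-hub<0.16` pin described above; the exact minimum that `transformers` needs is an assumption here.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string):
    """Turn '0.15.1' into (0, 15, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def hub_is_too_old(version_string, minimum=(0, 16)):
    """True if the version falls below the assumed minimum for EnCodec to load."""
    return parse_release(version_string) < minimum


def installed_hub_version():
    """Return the installed huggingface-hub version string, or None if absent."""
    try:
        return version("huggingface-hub")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_hub_version()
    if found is None:
        print("huggingface-hub is not installed")
    elif hub_is_too_old(found):
        print(f"huggingface-hub {found} is older than 0.16; run: pip install --upgrade huggingface-hub")
    else:
        print(f"huggingface-hub {found} looks fine")
```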

10 — SpeechTokenizer

Model: speechtokenizer (16 kHz)
Weights: manual download required

pip install -r requirements/speechtokenizer.txt

Package	Notes
`speechtokenizer`	model loader
`beartype`	required runtime dependency of speechtokenizer
`soundfile`	WAV I/O

The pip packages alone are not enough — you must also download the checkpoint manually.

Step 1 — Install packages:

neural-codec setup --codec speechtokenizer

Step 2 — Download both files from HuggingFace:

Repository: fnlp/SpeechTokenizer, folder speechtokenizer_hubert_avg (files: SpeechTokenizer.pt and config.json)

Place them at these exact paths:

Neural-Codecs/
  checkpoints/
    SpeechTokenizer.pt     ← download from HuggingFace
  config/
    config.json            ← download from HuggingFace

Step 3 — Decode:

neural-codec decode --codec speechtokenizer --input ./wavs --output ./out

If the checkpoint is missing, the CLI prints the exact download URL and the expected path rather than failing silently.
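Before starting a long batch, you can verify the two files yourself. This is a minimal sketch, independent of the CLI's own check: `missing_speechtokenizer_files` is a hypothetical helper, and the two relative paths are the ones listed in Step 2.

```python
from pathlib import Path

# Paths relative to the repository root, as listed in Step 2.
REQUIRED_FILES = ("checkpoints/SpeechTokenizer.pt", "config/config.json")


def missing_speechtokenizer_files(repo_root="."):
    """Return the required files that are not present under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_speechtokenizer_files()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("SpeechTokenizer checkpoint and config found.")
```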

FunCodec (external)

Requires its own virtual environment due to dependency conflicts.

# Print full step-by-step instructions
neural-codec setup --codec funcodec

Manual summary:

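# Name the venv differently from the FunCodec clone directory;
# on case-insensitive filesystems "funcodec" and "FunCodec" collide.
python -m venv venv_funcodec
venv_funcodec\Scripts\activate                # Windows
source venv_funcodec/bin/activate             # Linux / Mac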

git clone https://github.com/alibaba-damo-academy/FunCodec.git
cd FunCodec && pip install -e .
pip install torch torchaudio numpy soundfile

cd egs/LibriTTS/codec && mkdir -p exp
git lfs install
git clone https://huggingface.co/alibaba-damo/audio_codec-encodec-en-libritts-16k-nq32ds640-pytorch \
    exp/audio_codec-encodec-en-libritts-16k-nq32ds640-pytorch

Build input list:

find /path/to/wavs -name "*.wav" \
  | awk -F/ '{printf "%s %s\n", $(NF-1)"_"$NF, $0}' > input.scp
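On systems without `find` and `awk` (for example, a plain Windows shell), the same list can be built in Python. The sketch below is an assumed equivalent of the pipeline above, not part of FunCodec or this repo: each line is `<parent_folder>_<file_name> <path>`, and `build_scp_lines` and `write_scp` are hypothetical helper names. Unlike `find`, it sorts the files so the output order is stable.

```python
from pathlib import Path


def build_scp_lines(wav_root):
    """Return one '<parent>_<filename> <path>' line per .wav file under wav_root."""
    lines = []
    for wav in sorted(Path(wav_root).rglob("*.wav")):
        utt_id = f"{wav.parent.name}_{wav.name}"  # same key the awk command builds
        lines.append(f"{utt_id} {wav}")
    return lines


def write_scp(wav_root, scp_path="input.scp"):
    """Write the list to scp_path and return the number of entries."""
    lines = build_scp_lines(wav_root)
    Path(scp_path).write_text("".join(line + "\n" for line in lines), encoding="utf-8")
    return len(lines)
```

As with the shell version, paths containing spaces would make the space-separated format ambiguous, so keep them out of the input tree.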

Encode:

model=audio_codec-encodec-en-libritts-16k-nq32ds640-pytorch
bash encoding_decoding.sh \
  --stage 1 --batch_size 1 --num_workers 1 --gpu_devices 0 \
  --model_dir exp/${model} --bit_width 16000 --file_sampling_rate 16000 \
  --wav_scp input.scp --out_dir outputs/codecs

Decode:

bash encoding_decoding.sh \
  --stage 2 --batch_size 1 --num_workers 1 --gpu_devices 0 \
  --model_dir exp/${model} --bit_width 16000 --file_sampling_rate 16000 \
  --wav_scp outputs/codecs/codecs.txt --out_dir outputs/recon_wavs

AudioDec (external)

# Print full step-by-step instructions
neural-codec setup --codec audiodec

Manual summary:

git clone https://github.com/facebookresearch/AudioDec.git
cd AudioDec && pip install -r requirements.txt

Download exp.zip, extract into AudioDec/, then copy AudioDec.py from this repo into AudioDec/.

# Encode + decode
python AudioDec.py --model libritts_v1 -i input/ -o output/   # 24 kHz
python AudioDec.py --model vctk_v1     -i input/ -o output/   # 48 kHz

Model	Sample Rate
`libritts_v1`	24 kHz
`vctk_v1`	48 kHz
