Fast inference engine for DACVAE, a neural audio codec that compresses and reconstructs audio using a convolutional encoder-decoder with a VAE bottleneck. This library accelerates DACVAE inference by up to 11.2x on NVIDIA GPUs through graph-level optimizations, with no custom kernels, no quality loss at FP32, and no changes to model weights.
Benchmark setup: NVIDIA H100 PCIe | facebook/dacvae-watermarked (107.7M params) | 100 s of audio @ 48 kHz
**FP32:**

| Method | Latency | Speedup | Real-time Factor (RTF) |
|---|---|---|---|
| PyTorch FP32 | 1,047 ms | 1.0x | 96x |
| + channels_last + wn_off | 549 ms | 1.9x | 182x |
| + torch.compile + graph | 209 ms | 5.0x | 478x |

**FP16 / BF16:**

| Method | Latency | Speedup | RTF | SNR vs FP32 |
|---|---|---|---|---|
| PyTorch FP16 | 775 ms | 1.4x | 129x | 40.4 dB |
| + channels_last + wn_off | 307 ms | 3.4x | 326x | 40.2 dB |
| + torch.compile + graph (FP16) | 93 ms | 11.2x | 1,071x | 40.2 dB |
| + torch.compile + graph (BF16) | 100 ms | 10.5x | 1,004x | 29.8 dB |
```bash
pip install git+https://github.com/kadirnar/fast-dacvae.git
```

```python
from dacvae import DACVAE
from dacvae.optimize import optimize_dacvae
import torch

# 100 s of dummy audio at 48 kHz (4,800,000 samples)
model = DACVAE.load("facebook/dacvae-watermarked").cuda().eval()
audio = torch.randn(1, 1, 4800000, device="cuda")

# FP32 — zero quality loss, ~209 ms
replay = optimize_dacvae(model, audio, dtype="fp32")
output = replay()

# FP16 — fastest, ~93 ms
replay = optimize_dacvae(model, audio, dtype="fp16")
output = replay()

# BF16 — ~100 ms
replay = optimize_dacvae(model, audio, dtype="bf16")
output = replay()
```

Requirements:

- PyTorch 2.9+
- NVIDIA GPU (Hopper/Ampere)
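
The SNR figures in the tables above compare reduced-precision output against the FP32 reconstruction. A hypothetical helper (not part of this library) showing one way to compute such a figure:

```python
import torch

def snr_db(reference: torch.Tensor, test: torch.Tensor) -> float:
    """Signal-to-noise ratio of `test` against an FP32 `reference`, in dB."""
    noise = reference - test.to(reference.dtype)
    return (10 * torch.log10(reference.pow(2).sum() / noise.pow(2).sum())).item()
```

Usage would look like `snr_db(fp32_output, fp16_output)`; around 40 dB, as in the FP16 rows, the precision error sits roughly four orders of magnitude below the signal power.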
License: Apache 2.0