v0.2.0 — the UX bundle

@ParkNorth ParkNorth released this 22 May 02:06
· 11 commits to main since this release

The UX bundle: turn "it works if you know what you're doing" into "it just works".

Headline features

  • 🎤 --karaoke shortcut — one flag, instant karaoke instrumental (sum of drums/bass/other, vocals removed).
  • 🔀 --mix-stems vocals,drums — write a single file that's the sum of whichever stems you list.
  • 🎧 --mp3 output with --bitrate 192k (32-320 kbps). Powered by the tiny lameenc wheel — no ffmpeg required.
  • ⚡ Auto execution-provider routing: providers="auto" (the new default) picks CoreML on macOS arm64, CUDA on Linux with an NVIDIA GPU, DirectML (DML) on Windows with DirectX 12, and CPU otherwise.
  • 🪶 fp16-weight downloads with --small / precision="fp16weights": 166 MB per model instead of 316 MB (1.91× smaller). Runtime memory and latency are unchanged, and the maximum absolute difference vs fp32 is ~6e-5.
  • 🎚️ Auto-resampling: input at any sample rate (8 kHz to 192 kHz, mono or multi-channel) is transparently resampled to 44.1 kHz for inference, then resampled back to the input rate before writing.
  • 📊 Progress bar via tqdm when stdout is a TTY (--quiet to silence, --verbose for the old chunk-by-chunk log).
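
The provider routing described above can be pictured roughly as follows. This is a sketch under stated assumptions, not the package's actual code: resolve_providers and its arguments are hypothetical names, and the rule "use the accelerator only if ONNX Runtime reports it as available" is an assumed stand-in for however the release really detects NVIDIA or DX12 hardware. The provider strings themselves are ONNX Runtime's standard identifiers.

```python
from typing import List, Optional

# ONNX Runtime's standard execution-provider identifiers.
COREML = "CoreMLExecutionProvider"
CUDA = "CUDAExecutionProvider"
DML = "DmlExecutionProvider"
CPU = "CPUExecutionProvider"


def resolve_providers(
    providers: Optional[str],
    system: str,
    machine: str,
    available: List[str],
) -> List[str]:
    """Hypothetical helper: turn a `providers` setting into an ordered list.

    `system` and `machine` are the values of platform.system() and
    platform.machine(); `available` is the list of providers that the
    installed ONNX Runtime build reports.
    """
    setting = "auto" if providers is None else providers.lower()
    if setting == "cpu":
        return [CPU]
    if setting != "auto":
        raise ValueError(f"unsupported providers setting: {providers!r}")

    # Pick the accelerator that matches the platform, as listed in the notes.
    preferred = None
    if system == "Darwin" and machine == "arm64":
        preferred = COREML
    elif system == "Linux":
        preferred = CUDA
    elif system == "Windows":
        preferred = DML

    # Assumption: fall back to CPU when the accelerator is not available.
    if preferred is not None and preferred in available:
        return [preferred, CPU]
    return [CPU]
```

In real use the last three arguments would come from platform.system(), platform.machine(), and onnxruntime.get_available_providers(). Keeping them as parameters makes the decision table easy to test without a GPU.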

Try it

pip install 'demucs-onnx[mp3]==0.2.0'
demucs-onnx separate song.mp3 out/ --karaoke --mp3
# -> out/karaoke.mp3 (drums + bass + other, vocals removed)
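
Conceptually, --karaoke and --mix-stems both reduce to a sample-wise sum of the selected stems. The sketch below illustrates that idea with plain Python lists; mix_stems and karaoke are hypothetical names, not functions from the package, and the notes do not say whether the real tool rescales or clips the summed signal, so this sketch does neither.

```python
from typing import Dict, Iterable, List


def mix_stems(stems: Dict[str, List[float]], names: Iterable[str]) -> List[float]:
    """Hypothetical helper: sum the named stems sample by sample."""
    selected = list(names)
    if not selected:
        raise ValueError("at least one stem name is required")
    missing = [name for name in selected if name not in stems]
    if missing:
        raise KeyError(f"unknown stems: {missing}")

    length = len(stems[selected[0]])
    if any(len(stems[name]) != length for name in selected):
        raise ValueError("all stems must have the same number of samples")

    # One output sample per input position: the sum across selected stems.
    return [sum(stems[name][i] for name in selected) for i in range(length)]


def karaoke(stems: Dict[str, List[float]]) -> List[float]:
    """Instrumental mix as described above: drums + bass + other, no vocals."""
    return mix_stems(stems, ("drums", "bass", "other"))
```

With this framing, --karaoke is just --mix-stems drums,bass,other under a shorter name.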

Backwards compatibility

Every v0.1.0 call still works. The only behavior change worth noting:
providers=None now means "auto" (it previously meant CPU only); pass "cpu" explicitly to force CPU.

See CHANGELOG.md for the full diff, and PyPI for the published artifact.