Skip to content

Open-Unmix (UMX-L) running client-side in the browser with WebAssembly

License

Notifications You must be signed in to change notification settings

olilarkin/free-music-demixer

 
 

Repository files navigation

free-music-demixer

A free client-side static website for music demixing (aka music source separation) using the AI model Open-Unmix (with UMX-L weights):

I transliterated the original PyTorch model Python code to C++ using Eigen. It compiles to WebAssembly with Emscripten. The UMX-L weights are quantized (mostly uint8, uint16 for the last 4 layers) and saved with the ggml binary file format. They are then gzipped. This reduces the 425 MB of UMX-L weights down to 45 MB, while achieving similar performance (verified empirically using BSS metrics).

This is based on umx.cpp, my other project. This repo focuses on the WASM and web aspects, while umx.cpp is more about maintaining 1:1 performance parity with the original Open-Unmix (supporting both umxhq and umxl).

Roadmap

  • Use less memory: I need to use up to 4 GB, but lots of it is wasteful (copying float* to std::vector to Eigen::MatrixXf etc.)
  • Implement Wiener Expectation-Maximization post-processing (adds ~1 dB performance overall); see umx.cpp issue #1

Dev instructions

Clone the repo with submodules:

git clone --recurse-submodules https://github.com/sevagh/free-music-demixer

To generate a weights file with Python, first create a Python venv, then:

python -m pip install -r ./scripts/requirements.txt
python ./scripts/convert-pth-to-ggml.py --model=umxl ./ggml-umxl
gzip -k ./ggml-umxl/ggml-model-umxhl-u8.bin

Build for WebAssembly with Emscripten using emcmake:

mkdir -p build-wasm && cd build-wasm && emcmake cmake .. && make

Build a regular library and the file_demixer binary (only tested on Linux):

mkdir -p build-cpp && cd build-cpp && cmake .. && make

Notes

The wav-file-encoder project has been vendored in; I manually compiled the Typescript file to Javascript with these commands:

npm install typescript
npx tsc --module es6 ../vendor/wav-file-encoder/src/WavFileEncoder.ts

Output quality

MUSDB18-HQ test track 'Zeno - Signs', demixed by this app:

vocals          ==> SDR:   6.550  SIR:  14.583  ISR:  13.820  SAR:   6.974
drums           ==> SDR:   6.538  SIR:  11.209  ISR:  11.163  SAR:   8.317
bass            ==> SDR:   1.646  SIR:   0.931  ISR:   5.261  SAR:   2.944
other           ==> SDR:   5.190  SIR:   6.623  ISR:  10.221  SAR:   8.599

About

Open-Unmix (UMX-L) running client-side in the browser with WebAssembly

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 77.8%
  • Python 13.5%
  • CMake 6.2%
  • Shell 2.5%