BetterInc/audio2midi

audio2midi

High-quality audio separation and MIDI transcription CLI.

Separate audio into stems (vocals, drums, bass, guitar, piano, other) and transcribe each to MIDI using state-of-the-art ML models.

Features

  • Best-in-class separation: Uses BS-RoFormer (SDR 12.97 dB) and Demucs models
  • 6-stem separation: Vocals, drums, bass, guitar, piano, other
  • Accurate transcription: Spotify's Basic Pitch for MIDI conversion
  • GPU acceleration: CUDA (NVIDIA), MPS (Apple Silicon), CPU fallback
  • BPM & key detection: Automatic tempo and musical key analysis
  • DAW-ready MIDI: Proper instrument assignments and multi-track export

Installation

pip install audio2midi

For GPU acceleration (NVIDIA):

pip install "audio2midi[gpu]"

RTX 50 Series (Blackwell) GPUs

RTX 5070/5080/5090 cards require a PyTorch build with CUDA 12.8 support. Install a nightly build:

pip install --pre torch torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128

Usage

Full Pipeline (Recommended)

Separate and transcribe in one command:

audio2midi convert song.mp3 -o output/

This will:

  1. Analyze BPM and key
  2. Separate into 6 stems
  3. Transcribe each stem to MIDI
  4. Output individual + combined MIDI files
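
The single-command pipeline above can also be driven from a script when several files need converting. The following is a minimal Python sketch, assuming the `audio2midi` executable is on `PATH` and that it exits with a nonzero status on failure (an assumption, not documented here). It uses only the `convert` subcommand and `-o` option shown above; the helper names `build_convert_command` and `convert_all` are hypothetical and not part of the package.

```python
import subprocess
from pathlib import Path


def build_convert_command(audio_path, output_dir):
    """Return the argument list for `audio2midi convert <file> -o <dir>`."""
    return ["audio2midi", "convert", str(audio_path), "-o", str(output_dir)]


def convert_all(audio_paths, output_dir):
    """Run the full pipeline on each file in turn; return the files that failed.

    Assumption: a nonzero exit status signals a failed conversion.
    """
    failed = []
    for path in audio_paths:
        result = subprocess.run(build_convert_command(path, output_dir))
        if result.returncode != 0:
            failed.append(Path(path))
    return failed
```

For example, `convert_all(["a.mp3", "b.mp3"], "output/")` would run the two conversions one after another and return a list of any inputs whose command did not exit cleanly.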

Separate Only

audio2midi separate song.mp3 --model htdemucs_6s

Available models:

  • htdemucs_6s - 6 stems (default)
  • htdemucs - 4 stems (faster)
  • bs_roformer - Best vocal separation

Transcribe Only

audio2midi transcribe vocals.wav -o vocals.mid --instrument vocals

Analyze Audio

audio2midi analyze song.mp3

Check Device

audio2midi device
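
For reference, the CUDA, then MPS, then CPU fallback order listed under Features can be checked directly with PyTorch. This is a rough sketch of that preference order, not necessarily how `audio2midi device` is implemented; the function name `pick_device` is hypothetical.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu", preferring GPU backends when available."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so no GPU backend can be used.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```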

Output Structure

output/
└── song/
    ├── analysis.json
    ├── stems/
    │   ├── vocals.wav
    │   ├── drums.wav
    │   ├── bass.wav
    │   ├── guitar.wav
    │   ├── piano.wav
    │   └── other.wav
    └── midi/
        ├── vocals.mid
        ├── drums.mid
        ├── bass.mid
        ├── guitar.mid
        ├── piano.mid
        ├── other.mid
        └── combined.mid
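
A quick way to confirm that a run produced every file in the tree above is to compare the directory against the expected layout. The sketch below assumes the 6-stem layout shown here (a 4-stem model would presumably omit some stems); the function names `expected_outputs` and `missing_outputs` are hypothetical helpers, not part of the package.

```python
from pathlib import Path

# Stem names taken from the output tree above (6-stem layout).
STEMS = ["vocals", "drums", "bass", "guitar", "piano", "other"]


def expected_outputs(output_dir, song_name):
    """List every file the tree above shows for one converted song."""
    root = Path(output_dir) / song_name
    paths = [root / "analysis.json"]
    paths += [root / "stems" / f"{stem}.wav" for stem in STEMS]
    paths += [root / "midi" / f"{stem}.mid" for stem in STEMS]
    paths.append(root / "midi" / "combined.mid")
    return paths


def missing_outputs(output_dir, song_name):
    """Return the expected files that do not exist on disk."""
    return [p for p in expected_outputs(output_dir, song_name) if not p.is_file()]
```

Calling `missing_outputs("output", "song")` after a conversion returns an empty list when all files are present.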

Requirements

  • Python 3.10+
  • CUDA-capable GPU recommended (roughly 10x faster than CPU)

License

MIT
