gudgud96/basic-pitch-torch

basic-pitch-torch

A PyTorch version of Spotify's Basic Pitch, a lightweight audio-to-MIDI converter. The weights provided in Spotify's repo were converted using this script. Hopefully this helps researchers who are more accustomed to PyTorch reuse the pretrained model.

Usage

To transcribe audio to MIDI, similar to Basic Pitch:

from basic_pitch_torch.inference import predict

model_output, midi_data, note_events = predict(audio_path)
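
As a rough sketch of working with the returned values: assuming `note_events` follows the upstream Basic Pitch convention of tuples starting with `(start_time_s, end_time_s, pitch_midi, amplitude, ...)` (an assumption, not confirmed for this port), the notes can be summarized with plain Python. The helper name `summarize_note_events` is hypothetical and the events below are made up for illustration.

```python
from typing import List, Sequence, Tuple


def summarize_note_events(note_events: Sequence[Tuple]) -> List[dict]:
    """Turn note-event tuples into dicts sorted by onset time.

    Assumes each event starts with (start_s, end_s, midi_pitch, amplitude);
    any trailing fields (e.g. pitch bends) are ignored.
    """
    notes = []
    for event in note_events:
        start_s, end_s, pitch, amplitude = event[:4]
        notes.append(
            {
                "start_s": float(start_s),
                "duration_s": float(end_s) - float(start_s),
                "pitch": int(pitch),
                "amplitude": float(amplitude),
            }
        )
    return sorted(notes, key=lambda n: n["start_s"])


# Made-up events for illustration only (not real model output).
example_events = [
    (0.50, 1.00, 64, 0.8, None),
    (0.00, 0.25, 60, 0.5, None),
]
summary = summarize_note_events(example_events)
```

In real use, `example_events` would be replaced by the `note_events` returned from `predict`, provided the tuple layout matches the assumption above.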

For loading the nn.Module:

import torch

from basic_pitch_torch.model import BasicPitchTorch

# Build the model and load the converted weights
pt_model = BasicPitchTorch()
pt_model.load_state_dict(torch.load('assets/basic_pitch_pytorch_icassp_2022.pth'))
pt_model.eval()  # switch to inference mode

# y_torch: the input audio as a torch.Tensor, prepared by the caller
with torch.no_grad():
    output_pt = pt_model(y_torch)
    contour_pt, note_pt, onset_pt = output_pt['contour'], output_pt['note'], output_pt['onset']

Result Validation

In tests/ we provide two levels of validation tests, using a test audio clip from GuitarSet:

  • On model output

    • Most of the discrepancies originate from floating-point division (e.g. normalized_log) and from errors propagating further down the network. The differences should be small enough to ignore during MIDI note creation.
    Contour abs diff - max: 0.0003006, min: 0.0, avg: 5.863e-06
    Onset abs diff   - max: 0.0002712, min: 0.0, avg: 1.431e-05
    Note abs diff    - max: 0.0002297, min: 0.0, avg: 6.6e-06
    
  • On MIDI transcription

    • The MIDI files transcribed by the TF and PT models are identical (see midi_data_pt.mid and midi_data_tf.mid)
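
The model-output comparison above boils down to statistics of the element-wise absolute difference between two arrays. A minimal sketch, assuming both outputs have been converted to NumPy arrays of the same shape; the function name `abs_diff_stats` is hypothetical, and the arrays are synthetic stand-ins rather than real model outputs:

```python
import numpy as np


def abs_diff_stats(reference: np.ndarray, candidate: np.ndarray) -> dict:
    """Return max, min and mean of the element-wise absolute difference."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    diff = np.abs(reference - candidate)
    return {
        "max": float(diff.max()),
        "min": float(diff.min()),
        "avg": float(diff.mean()),
    }


# Synthetic stand-ins for the TF and PT outputs (not real model data).
rng = np.random.default_rng(0)
output_tf = rng.random((4, 8))
output_pt = output_tf + 1e-4          # small uniform offset
output_pt[0, 0] = output_tf[0, 0]     # one element matches exactly

stats = abs_diff_stats(output_tf, output_pt)
print(f"abs diff - max: {stats['max']:.4g}, min: {stats['min']:.4g}, avg: {stats['avg']:.4g}")
```

With real outputs, the same function would be called once each for the contour, onset and note arrays.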

References

Bittner, Rachel M., et al. "A lightweight instrument-agnostic model for polyphonic note transcription and multipitch estimation." ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022.
