Skip to content


Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time


GitHub license CI codecov Documentation Status

practically universal music pre-processor

pumpp up the jams

The goal of this package is to make it easy to convert pairs of (audio, jams) into data that can be easily consumed by statistical algorithms. Some desired features:

  • Converting tags to sparse encoding vectors
  • Sampling (start, end, label) to frame-level annotations at a specific frame rate
  • Extracting input features (eg, Mel spectra or CQT) from audio
  • Converting between annotation spaces for a given task

Example usage

>>> import jams
>>> import pumpp

>>> audio_f = '/path/to/audio/myfile.ogg'
>>> jams_f = '/path/to/annotations/myfile.jamz'

>>> # Set up sampling and frame rate parameters
>>> sr, hop_length = 44100, 512

>>> # Create a feature extraction object
>>> p_cqt = pumpp.feature.CQT(name='cqt', sr=sr, hop_length=hop_length)

>>> # Create some annotation extractors
>>> p_beat = pumpp.task.BeatTransformer(sr=sr, hop_length=hop_length)
>>> p_chord = pumpp.task.SimpleChordTransformer(sr=sr, hop_length=hop_length)

>>> # Collect the operators in a pump
>>> pump = pumpp.Pump(p_cqt, p_beat, p_chord)

>>> # Apply the extractors to generate training data
>>> data = pump(audio_f=audio_f, jam=jams_fjams_f)

>>> # Or test data
>>> test_data = pump(audio_f='/my/test/audio.ogg')

>>> # Or in-memory
>>> y, sr = librosa.load(audio_f)
>>> test_data = pump(y=y, sr=sr)