In [3]:
%pip install cython

Note: you may need to restart the kernel to use updated packages.


In [2]:
%matplotlib inline
from ADTLib import ADT
import numpy, scipy, matplotlib.pyplot as plt, librosa, IPython.display as ipd
import stanford_mir; stanford_mir.init()

ModuleNotFoundError: No module named 'ADTLib'


# Drum Transcription using ADTLib

This notebook requires `ADTLib`. See [ADTLib repo](https://github.com/CarlSouthall/ADTLib) for installation instructions. If you experience problems, be sure to install the latest versions of `tensorflow` and `dask`.

Load the audio file into an array:

In [None]:
filename = 'audio/classic_rock_beat.mp3'
x, sr = librosa.load(filename)

Listen to the signal:

In [None]:
ipd.Audio(x, rate=sr)

## ADTLib

Use ADTLib to identify the time and drum type of each onset:

In [None]:
drum_onsets = ADT([filename])[0]

In [None]:
drum_onsets
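
The exact return type depends on the installed ADTLib version. The cells that follow index the result by the keys `'Kick'`, `'Snare'`, and `'Hihat'`, so a reasonable assumption is a dictionary that maps each drum name to an array of onset times in seconds. The sketch below uses made-up onset times (not the output for this recording) and a hypothetical helper, `summarize_onsets`, to show how such a structure could be summarized:

```python
import numpy as np

# Hypothetical onset times in seconds, standing in for the ADT output.
example_onsets = {
    'Kick': np.array([0.0, 1.0, 2.0, 3.0]),
    'Snare': np.array([0.5, 1.5, 2.5, 3.5]),
    'Hihat': np.array([0.0, 0.25, 0.5, 0.75, 1.0]),
}

def summarize_onsets(onsets):
    """Return {drum: (onset count, mean inter-onset interval in seconds)}."""
    summary = {}
    for drum, times in onsets.items():
        times = np.sort(np.asarray(times, dtype=float))
        # An inter-onset interval needs at least two onsets.
        mean_ioi = float(np.mean(np.diff(times))) if len(times) > 1 else float('nan')
        summary[drum] = (len(times), mean_ioi)
    return summary

summary = summarize_onsets(example_onsets)
for drum, (count, mean_ioi) in summary.items():
    print(f'{drum}: {count} onsets, mean interval {mean_ioi:.2f} s')
```

If the real output has a different layout, only the dictionary at the top needs to change.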

ADT also produces the file `classic_rock_beat_drumtab.pdf`, which looks like this:

In [None]:
ipd.Image('img/classic_rock_beat_drumtab.png')

## Listen to onsets

For each type of drum, create a click track from the onsets, and listen to it mixed with the original signal.

Bass/kick drum:

In [None]:
clicks = librosa.clicks(times=drum_onsets['Kick'], sr=sr, length=len(x))
ipd.Audio(x + clicks, rate=sr)

Snare drum:

In [None]:
clicks = librosa.clicks(times=drum_onsets['Snare'], sr=sr, length=len(x))
ipd.Audio(x + clicks, rate=sr)

Hi-hat:

In [None]:
clicks = librosa.clicks(times=drum_onsets['Hihat'], sr=sr, length=len(x))
ipd.Audio(x + clicks, rate=sr)
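
To make the idea of a click track concrete, here is a minimal NumPy sketch that places a short decaying sine burst at each onset time. It is not librosa's implementation; the function name `make_click_track`, the decay constant, and the demo values are assumptions chosen for illustration.

```python
import numpy as np

def make_click_track(times, sr, length, click_freq=1000.0, click_duration=0.1):
    """Place a short decaying sine burst at each onset time (in seconds)."""
    n_click = int(click_duration * sr)
    t = np.arange(n_click) / sr
    # Exponentially decaying sinusoid, so each click is short and percussive.
    click = np.sin(2 * np.pi * click_freq * t) * np.exp(-t / (click_duration / 5))
    track = np.zeros(length)
    for onset in times:
        start = int(round(onset * sr))
        if start < 0 or start >= length:
            continue  # onset falls outside the signal
        # Truncate the click if it would run past the end of the track.
        end = min(start + n_click, length)
        track[start:end] += click[:end - start]
    return track

sr_demo = 22050  # assumed sample rate for the demo
clicks_demo = make_click_track([0.5, 1.0, 1.95], sr=sr_demo, length=2 * sr_demo)
```

Because the track has the same length as the target signal, it can be added sample by sample to the audio, which is what `x + clicks` does in the cells above.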

## Visualize spectrum

For each drum type, let's compute an average drum beat from the original signal and visualize the spectrum for that average drum beat.

Create a function that plots the log-amplitude spectrum of an average drum beat for a particular drum type:

In [None]:
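def plot_avg_spectrum(x, onset_times):
    # Note: the sample rate `sr` is read from the notebook's global scope.

    # Compute average drum beat signal from 100 ms frames starting at each onset.
    frame_sz = int(0.100*sr)
    def normalize(z):
        return z/scipy.linalg.norm(z)
    onset_samples = librosa.time_to_samples(onset_times, sr=sr)
    # Skip onsets too close to the end of the signal to yield a full frame.
    onset_samples = [i for i in onset_samples if i + frame_sz <= len(x)]
    x_avg = numpy.mean([normalize(x[i:i+frame_sz]) for i in onset_samples], axis=0)

    # Compute average spectrum in decibels.
    X = numpy.fft.fft(x_avg)
    Xmag = librosa.amplitude_to_db(abs(X))

    # Plot spectrum up to the Nyquist frequency.
    f = numpy.arange(frame_sz)*sr/frame_sz
    Nd2 = int(frame_sz/2)
    plt.figure(figsize=(14, 5))
    plt.plot(f[:Nd2], Xmag[:Nd2])
    plt.xlim(right=f[Nd2])
    plt.ylim([-50, 20])
    plt.xlabel('Frequency (Hertz)')
    plt.ylabel('Magnitude (dB)')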
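
One way to sanity-check the averaging and FFT steps is to run them on a synthetic signal whose answer is known in advance. The sketch below is an illustration only: the 8 kHz sample rate, the 200 Hz tone, and the onset times are made up, and it uses `numpy.fft.rfft` with `numpy.fft.rfftfreq` instead of the full FFT so that the frequency axis comes directly from NumPy.

```python
import numpy as np

sr_demo = 8000                      # sample rate of the synthetic signal (Hz)
frame_sz = int(0.100 * sr_demo)     # 100 ms analysis frame
tone_hz = 200.0                     # frequency of the synthetic "drum"

# Build two seconds of silence with a decaying 200 Hz burst at each onset.
onset_times = np.array([0.25, 0.75, 1.25])
onset_samples = (onset_times * sr_demo).astype(int)
signal = np.zeros(2 * sr_demo)
t = np.arange(frame_sz) / sr_demo
burst = np.sin(2 * np.pi * tone_hz * t) * np.exp(-t / 0.03)
for start in onset_samples:
    signal[start:start + frame_sz] += burst

# Average the unit-norm frames that start at each onset.
frames = [signal[i:i + frame_sz] for i in onset_samples]
avg_frame = np.mean([frame / np.linalg.norm(frame) for frame in frames], axis=0)

# Real-input FFT and its matching frequency axis (bin k is k * sr / frame_sz).
magnitude = np.abs(np.fft.rfft(avg_frame))
freqs = np.fft.rfftfreq(frame_sz, d=1.0 / sr_demo)
peak_hz = freqs[np.argmax(magnitude)]
print(f'Spectral peak near {peak_hz:.0f} Hz')
```

With a 100 ms frame the bins are spaced 10 Hz apart, so the peak should land on or next to the bin closest to the tone frequency.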

Plot the spectrum for an average bass drum:

In [None]:
plot_avg_spectrum(x, drum_onsets['Kick'])

Snare drum:

In [None]:
plot_avg_spectrum(x, drum_onsets['Snare'])

Hi-hat:

In [None]:
plot_avg_spectrum(x, drum_onsets['Hihat'])
