# Cover Song Detection and Beat-Synchronous Chroma 
### George Tzanetakis, University of Victoria

In this notebook, we explore a simple method for cover song detection using a beat-synchronous 
chroma representation. This is a toy example used to become familiar with the key concepts. An 
actual cover song detection system would be more complicated. 

We will consider a small toy collection of 5 recordings. The first 
is the song Dreamer by the group Supertramp. The second recording 
is the same as the first but slowed down. The third recording 
is a different performance of the same song. The fourth recording 
is the song Goodbye Stranger, also by Supertramp, which has somewhat 
similar instrumentation but is a different song. Finally, the last recording 
is the beautiful jazz ballad Naima by John Coltrane. 


In [None]:
%matplotlib inline

import warnings
import matplotlib.pyplot as plt
import scipy.io.wavfile as wav
import numpy as np
import IPython.display as ipd
import pyrubberband as pyrb
import scipy


with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=DeprecationWarning)
    import librosa, librosa.display

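def load_wav(fname): 
    srate, audio = wav.read(fname)
    # assumes 16-bit PCM input; scale the samples to roughly [-1, 1]
    audio = audio.astype(np.float32) / 32767.0 
    # convert to mono by averaging the channels
    if audio.ndim == 2:
        audio = audio.mean(axis=1)
    # peak-normalize to 0.9 using the largest absolute sample value
    audio = (0.9 / np.max(np.abs(audio))) * audio
    return (audio, srate) 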


dreamer,srate = load_wav('dreamer.wav')
dreamer_live, srate = load_wav('dreamer_live.wav')
dreamer_slow = pyrb.time_stretch(dreamer, srate, 0.75)
goodbye_stranger, srate = load_wav('goodbye_stranger.wav')
naima,srate = load_wav('naima.wav')



In [None]:
ipd.Audio(dreamer,rate=srate)

In [None]:
ipd.Audio(dreamer_live,rate=srate)

In [None]:
ipd.Audio(dreamer_slow, rate=srate)

In [None]:
ipd.Audio(goodbye_stranger, rate=srate)

We can calculate a chromagram, which represents the energy in the different pitch classes across time, using a constant-Q transform (CQT). To calculate a beat-synchronous representation, the beats are extracted first. Several chroma vectors fall within each beat, so we take their median to aggregate them. The resulting representation is, in theory, tempo invariant. Even though it is not visually obvious in the plots, notice the significant reduction in the sizes of the chromagram matrices when beat synchronization is used. 
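To make the tempo-invariance claim concrete, here is a minimal sketch on synthetic data that uses only NumPy. The helper `beat_sync_median` and the toy chromagrams are hypothetical names introduced for illustration; the helper only loosely mirrors the role `librosa.util.sync` plays in the notebook and assumes the beat positions are already known.

```python
import numpy as np

def beat_sync_median(chroma, beat_frames):
    """Median-aggregate chroma frames between consecutive beat boundaries."""
    # Boundaries: start of the signal, every beat, end of the signal
    bounds = np.unique(np.concatenate(([0], beat_frames, [chroma.shape[1]])))
    segments = [chroma[:, a:b] for a, b in zip(bounds[:-1], bounds[1:])]
    return np.stack([np.median(seg, axis=1) for seg in segments], axis=1)

def make_chroma(pitch_classes, frames_per_beat):
    """Toy chromagram: each beat holds a single active pitch class."""
    cols = []
    for pc in pitch_classes:
        col = np.zeros(12)
        col[pc] = 1.0
        cols.extend([col] * frames_per_beat)
    return np.stack(cols, axis=1)

# Same "song" (C, E, G, C) at two tempi: 10 vs. 20 frames per beat
pitch_classes = [0, 4, 7, 0]
fast = make_chroma(pitch_classes, 10)   # shape (12, 40)
slow = make_chroma(pitch_classes, 20)   # shape (12, 80)

# Assumed beat positions (in frames); a beat tracker would estimate these
fast_beats = np.arange(10, 40, 10)
slow_beats = np.arange(20, 80, 20)

fast_sync = beat_sync_median(fast, fast_beats)
slow_sync = beat_sync_median(slow, slow_beats)

print(fast.shape, slow.shape)            # frame-level sizes differ
print(fast_sync.shape, slow_sync.shape)  # one column per beat interval
print(np.array_equal(fast_sync, slow_sync))
```

In this idealized setting the two beat-synchronous matrices coincide even though the frame-level chromagrams have different lengths. With real audio the match is only approximate, because it depends on how consistently the beats are tracked in each recording.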

In [None]:

def plot_chromagram(y):
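    # keep at most the first 8,000,000 samples of each recording
    y = y[0:8000000]
    # recent librosa versions require the audio to be passed as the keyword argument y
    C = librosa.feature.chroma_cqt(y=y, sr=srate, bins_per_octave=12, norm=2)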
    
    plt.figure(figsize=(12,4))
    plt.subplot(1, 2, 1)
    # Display the chromagram: the energy in each chromatic pitch class as a function of time
    # To make sure that the colors span the full range of chroma values, set vmin and vmax
    librosa.display.specshow(C, sr=srate, x_axis='time', y_axis='chroma', vmin=0, vmax=1)
    plt.title('Chromagram')
    plt.colorbar()
    plt.tight_layout()

    plt.subplot(1, 2, 2)

    # extract beats 
    tempo, beats = librosa.beat.beat_track(y=y, sr=srate)
    C_sync = librosa.util.sync(C, beats, aggregate=np.median)

    librosa.display.specshow(C_sync, y_axis='chroma', sr=srate,vmin=0.0, vmax=1.0, x_axis='time', 
                         x_coords=librosa.frames_to_time(librosa.util.fix_frames(beats), sr=srate))
    plt.title('Beat Synchronous Chromagram')
    plt.colorbar()
    plt.tight_layout()
    plt.show()
    # for the beat-synchronous use 280 beats
    return C,C_sync[:,0:280] 
    
    
dreamer_cqt, dreamer_bcqt = plot_chromagram(dreamer)
dreamer_slow_cqt, dreamer_slow_bcqt = plot_chromagram(dreamer_slow)
dreamer_live_cqt, dreamer_live_bcqt = plot_chromagram(dreamer_live)
goodbye_stranger_cqt, goodbye_stranger_bcqt = plot_chromagram(goodbye_stranger)
naima_cqt, naima_bcqt = plot_chromagram(naima)


print(dreamer_cqt.shape, dreamer_bcqt.shape)
print(dreamer_slow_cqt.shape, dreamer_slow_bcqt.shape)
print(dreamer_live_cqt.shape, dreamer_live_bcqt.shape)
print(goodbye_stranger_cqt.shape, goodbye_stranger_bcqt.shape)
print(naima_cqt.shape, naima_bcqt.shape)

In [None]:
cqt_list = [dreamer_cqt, dreamer_slow_cqt, dreamer_live_cqt, goodbye_stranger_cqt, naima_cqt]


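# chroma_cqt was called with norm=2, so each column is L2-normalized and the
# column-wise dot product is the cosine similarity between corresponding frames.
# The element-wise product requires both chromagrams to have the same shape.
sim_matrix = np.zeros([5,5])
for (i,s1) in enumerate(cqt_list):
    for (j,s2) in enumerate(cqt_list):
        b = np.mean(np.sum(s1 * s2,axis=0))
        sim_matrix[i,j] = b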
print(sim_matrix)

plt.imshow(sim_matrix)
plt.colorbar()

In [None]:
cqt_list = [dreamer_cqt, dreamer_slow_cqt, dreamer_live_cqt, goodbye_stranger_cqt, naima_cqt]
bcqt_list = [dreamer_bcqt, dreamer_slow_bcqt, dreamer_live_bcqt, goodbye_stranger_bcqt, naima_bcqt]


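# same measure on the beat-synchronous chromagrams: column k of each matrix
# now corresponds to beat interval k rather than to a fixed time frame
sim_matrix = np.zeros([5,5])
for (i,s1) in enumerate(bcqt_list):
    for (j,s2) in enumerate(bcqt_list):
        b = np.mean(np.sum(s1 * s2,axis=0))
        sim_matrix[i,j] = b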
print(sim_matrix)

plt.imshow(sim_matrix)
plt.colorbar()
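
The similarity used in the two cells above is the mean, over columns, of the dot product between corresponding chroma vectors. As a hedged sketch (not part of the original notebook), the hypothetical helper `chroma_similarity` below computes the same quantity but also truncates both matrices to their common length and renormalizes every column, so it can be applied to matrices with different numbers of columns. The inputs here are synthetic, not the recordings.

```python
import numpy as np

def chroma_similarity(a, b, eps=1e-9):
    """Mean per-column cosine similarity over the common length of a and b."""
    n = min(a.shape[1], b.shape[1])   # truncate to the shorter matrix
    a, b = a[:, :n], b[:, :n]
    # Renormalize columns so that each dot product is a cosine similarity
    a = a / (np.linalg.norm(a, axis=0, keepdims=True) + eps)
    b = b / (np.linalg.norm(b, axis=0, keepdims=True) + eps)
    return float(np.mean(np.sum(a * b, axis=0)))

# Synthetic stand-ins for two chromagrams of different lengths
rng = np.random.default_rng(0)
x = rng.random((12, 280))
y = rng.random((12, 300))

# Two one-hot chromagrams with no pitch class in common (C vs. E)
only_c = np.zeros((12, 50))
only_c[0, :] = 1.0
only_e = np.zeros((12, 50))
only_e[4, :] = 1.0

print(chroma_similarity(x, x))            # identical input: close to 1
print(chroma_similarity(x, y))            # unrelated random chroma
print(chroma_similarity(only_c, only_e))  # disjoint pitch classes: 0
```

Because chroma values are non-negative, this measure lies between 0 and 1. Unrelated recordings can still score well above 0, so the relative ordering within a row of the similarity matrix is more informative than the absolute values.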