# Known-Artist Live Song ID: A Hashprint Approach

This notebook is a Python implementation of the [original MATLAB version](http://pages.hmc.edu/ttsai/assets/livesongid.tar.gz). The pipeline consists of the following modules:

1. Preprocessing with Constant-Q Transform (CQT)
2. Learning PCA filters
3. Obtaining the hashprint representation by applying the PCA filters to the preprocessed CQT matrix
4. Generating a database from the hashprints of the pitch-shifted CQT matrices of the reference tracks
5. Matching query against database

In [1]:
GIT_DIR = '/home/zhwang/workspace/live-song-id/'
%cd $GIT_DIR

/home/zhwang/workspace/live-song-id


## Data Preprocessing

For a given artist, we first preprocess the reference tracks by computing the CQT of each audio file.

In [2]:
import os

import librosa
import numpy as np

from preprocess import *
from utils import pitch_shift_CQT

In [3]:
# audio_path - where audio files are stored
# list_dir - where list of softlinks are stored
# cqt_dir - where results will be stored
audio_path = '/home/nbanerjee/SoftLinks/'
list_dir = audio_path + '/Lists/'
cqt_dir = '/home/zhwang/ttemp/livesong_results/' # change this
artist = 'taylorswift'
out_dir = os.path.join(cqt_dir, artist+'_out')
file_paths = get_allpaths(artist, list_dir)

In [None]:
%cd $cqt_dir
%mkdir $out_dir
%cd $out_dir

# Write one .dat file per track and record its path in <artist>_cqtList.txt
with open(os.path.join(out_dir, artist + '_cqtList.txt'), 'w') as f:
    for cur_file in file_paths:
        print('==> Computing CQT of %s' % cur_file)
        y, sr = librosa.load(audio_path + cur_file + '.wav')
        # 121 bins at 24 bins per octave, starting at 130.81 Hz
        Q = librosa.cqt(y, sr=sr, fmin=130.81, n_bins=121, bins_per_octave=24)
        logQ = preprocess(Q, 3)
        cur_file_path = os.path.join(os.getcwd(), cur_file + '.dat')
        np.savetxt(cur_file_path, logQ)
        f.write(cur_file_path + '\n')
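
The `preprocess` helper comes from this repository and is not shown in the notebook. As a rough sketch only, assuming it takes the CQT magnitude, averages every 3 consecutive frames, and applies log compression (all three are assumptions, not confirmed here), the step might look like the following. The name `log_downsample` is hypothetical.

```python
import numpy as np

def log_downsample(Q, factor):
    """Hypothetical stand-in for preprocess(Q, 3); the real helper may differ."""
    mag = np.abs(Q)                                # the CQT is complex; keep the magnitude
    n_frames = (mag.shape[1] // factor) * factor   # drop trailing frames that do not fill a group
    grouped = mag[:, :n_frames].reshape(mag.shape[0], -1, factor)
    return np.log1p(grouped.mean(axis=2))          # average each group, then compress

# Toy complex "CQT" with 121 bins and 10 frames
rng = np.random.default_rng(0)
Q_demo = rng.standard_normal((121, 10)) + 1j * rng.standard_normal((121, 10))
logQ_demo = log_downsample(Q_demo, 3)
print(logQ_demo.shape)  # (121, 3)
```

Whatever the actual helper does, the output is saved as a real-valued matrix with one row per CQT bin, which is what the later cells load with `np.loadtxt`.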


## Learning PCA Filters

In [35]:
%cd $GIT_DIR
from PCA import *

/home/zhwang/workspace/live-song-id


Next, we generate the filters for the hashprint representation by applying PCA to the covariance matrix of this artist's reference songs. First, we accumulate the per-track covariance matrices so that they can be averaged:

In [36]:
num_features = 64
nbins = 121
m = 20 # number of frames

In [37]:
accum_cov = np.zeros((nbins * m, nbins * m))
count = 0

with open(os.path.join(out_dir, artist + '_cqtList.txt'), 'r') as f:
    for line in f:
        path = line.strip()  # drop the trailing newline
        print('==> Computing covariance matrix for %s' % os.path.basename(path))
        Q = np.loadtxt(path)
        A = getTDE(Q)
        if A.shape[1] > 1:
            # np.cov treats rows as variables, hence the transpose
            accum_cov += np.cov(A.T)
        count += 1

==> Computing covariance matrix for taylorswift_ref1.dat
==> Computing covariance matrix for taylorswift_ref2.dat
==> Computing covariance matrix for taylorswift_ref3.dat
==> Computing covariance matrix for taylorswift_ref4.dat
==> Computing covariance matrix for taylorswift_ref5.dat
==> Computing covariance matrix for taylorswift_ref6.dat
==> Computing covariance matrix for taylorswift_ref7.dat
==> Computing covariance matrix for taylorswift_ref8.dat
==> Computing covariance matrix for taylorswift_ref9.dat
==> Computing covariance matrix for taylorswift_ref10.dat
==> Computing covariance matrix for taylorswift_ref11.dat
==> Computing covariance matrix for taylorswift_ref12.dat
==> Computing covariance matrix for taylorswift_ref13.dat
==> Computing covariance matrix for taylorswift_ref14.dat
==> Computing covariance matrix for taylorswift_ref15.dat
==> Computing covariance matrix for taylorswift_ref16.dat
==> Computing covariance matrix for taylorswift_ref17.dat
==> Computing covarianc
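
`getTDE` is defined in the repository's `PCA` module and is not shown here. Judging from the `(nbins * m, nbins * m)` accumulator, it presumably stacks `m` consecutive CQT frames into one long vector per time step (a time-delay embedding). A minimal sketch under that assumption follows; the name `time_delay_embed` and the frame-major flattening order are assumptions, chosen to agree with the later `vec.reshape((m, -1))`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def time_delay_embed(Q, m):
    """Hypothetical stand-in for getTDE: one row per window of m frames."""
    # Q has shape (nbins, n_frames); windows has shape (nbins, n_windows, m)
    windows = sliding_window_view(Q, window_shape=m, axis=1)
    # Reorder to (n_windows, m, nbins) and flatten each window into a vector
    return windows.transpose(1, 2, 0).reshape(windows.shape[1], -1)

nbins_demo, m_demo = 6, 4
Q_demo = np.arange(nbins_demo * 10, dtype=float).reshape(nbins_demo, 10)
A_demo = time_delay_embed(Q_demo, m_demo)
print(A_demo.shape)    # (7, 24): 10 - 4 + 1 windows, 6 * 4 values each

# np.cov treats rows as variables, hence the transpose
cov_demo = np.cov(A_demo.T)
print(cov_demo.shape)  # (24, 24)
```

With `nbins = 121` and `m = 20`, the same construction gives vectors of length 2420, which matches the size of `accum_cov`.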

Next, we compute the PCA filters from the averaged covariance matrix, keeping the `num_features` eigenvectors with the largest eigenvalues.

In [39]:
evals, evecs = LA.eig(accum_cov / count)
ind = np.argsort(-evals)
evecs = (evecs[:, ind])[:, :num_features] # this turns out to be complex
evecs = np.absolute(evecs)
np.savetxt(os.path.join(out_dir, artist + '_model.dat'), evecs)
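
A covariance matrix is symmetric, so in exact arithmetic its eigenvalues and eigenvectors are real; complex output from a general-purpose `eig` can arise from rounding. One alternative, sketched here on toy data rather than as a change to the notebook's results, is `numpy.linalg.eigh`, which is designed for symmetric matrices and returns real output with eigenvalues in ascending order. Note that `np.absolute` also discards the signs of the eigenvector entries, whereas this variant keeps them.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 12))   # toy data: 200 samples, 12 variables
cov = np.cov(X.T)                    # symmetric 12 x 12 matrix

evals, evecs = np.linalg.eigh(cov)   # real output, eigenvalues ascending
order = np.argsort(evals)[::-1]      # largest eigenvalue first
top_k = 4                            # plays the role of num_features
filters = evecs[:, order[:top_k]]

print(filters.shape)                 # (12, 4)
print(np.isrealobj(filters))         # True
```

Whether signed filters reproduce the behavior of the MATLAB version is not something this notebook confirms, so treat the sketch as an option to evaluate.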

## Applying Filters to the CQT Matrix to Generate the Database

In [52]:
evecs = np.loadtxt(os.path.join(out_dir, artist + '_model.dat')).T

To stay consistent with the Keras model, we transpose the filters and the CQT images to have shape `(width, height)`.

In [53]:
pca_matrix = np.array([vec.reshape((m, -1)) for vec in evecs])
delta = 16
max_pitch_shift = 4

For each reference track, we pitch-shift the CQT matrix by one to four steps, both up and down, which together with the original gives nine versions. For each version, we compute the hashprint by first passing it through the convolutional network, then taking the delta feature and thresholding at zero.
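
The delta-and-threshold step can be illustrated without the network. The sketch below assumes the filter responses are already available as a `(num_features, n_frames)` array (random numbers stand in for them here) and mirrors the slicing used in this notebook's database cell: each feature is compared with its own value `delta` frames later, and the sign of the difference becomes one bit.

```python
import numpy as np

rng = np.random.default_rng(0)
num_features_demo, n_frames_demo, delta_demo = 4, 10, 3
feats = rng.standard_normal((num_features_demo, n_frames_demo))  # stand-in filter responses

# Difference between each frame and the frame delta steps later
delta_feats = feats[:, :n_frames_demo - delta_demo] - feats[:, delta_demo:]

# One bit per feature per frame: 1 where the response decreased over delta frames
hashprint = (delta_feats > 0).astype(np.uint8)
print(hashprint.shape)  # (4, 7)
```

The last `delta` frames have no partner to compare against, so the hashprint sequence is `delta` frames shorter than the feature sequence.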

The database maps each reference filename to the binary hashprints of all its pitch-shifted versions.
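
To make the layout of each database value concrete: index 0 along the first axis holds the unshifted track, indices 1 to `max_pitch_shift` hold the positive shifts, and the remaining indices hold the negative shifts. The sketch below reproduces that ordering with a hypothetical `shift_bins` helper that moves CQT rows and zero-fills the vacated ones. The repository's `pitch_shift_CQT` may use a different shift unit, direction convention, or edge handling.

```python
import numpy as np

def shift_bins(Q, shift):
    """Hypothetical pitch shift: move rows of a (nbins, n_frames) CQT by `shift` bins."""
    out = np.zeros_like(Q)
    if shift > 0:
        out[shift:, :] = Q[:-shift, :]   # content moves to higher bins (assumed convention)
    elif shift < 0:
        out[:shift, :] = Q[-shift:, :]   # content moves to lower bins
    else:
        out[:] = Q
    return out

max_shift_demo = 2
Q_demo = np.arange(12, dtype=float).reshape(4, 3)   # 4 bins, 3 frames

# Same ordering as the notebook: original, then +1..+max, then -1..-max
shifts = [0] + list(range(1, max_shift_demo + 1)) + [-i for i in range(1, max_shift_demo + 1)]
stack = np.stack([shift_bins(Q_demo, s) for s in shifts])
print(shifts)        # [0, 1, 2, -1, -2]
print(stack.shape)   # (5, 4, 3)
```

With `max_pitch_shift = 4`, the first axis of each stored array therefore has length 9.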

In [54]:
from model import *

db = {}

with open(os.path.join(out_dir, artist + '_cqtList.txt'), 'r') as f:
    for line in f:
        path = line.strip()  # drop the trailing newline
        name = os.path.basename(path)
        print('==> Generating database for %s' % name)
        Q = np.loadtxt(path).T

        # Index 0: original; 1..max_pitch_shift: shifts +1..+max; the rest: shifts -1..-max
        pitch_shift_Qs = np.empty((2 * max_pitch_shift + 1, ) + Q.shape)
        pitch_shift_Qs[0, :, :] = Q
        for i in range(1, max_pitch_shift + 1):
            pitch_shift_Qs[i, :, :] = pitch_shift_CQT(Q.T, i).T
        for i in range(1, max_pitch_shift + 1):
            pitch_shift_Qs[i + max_pitch_shift, :, :] = pitch_shift_CQT(Q.T, -i).T

        # The model is rebuilt per track because the input shape depends on track length
        conv_1d_net = build_model(pca_matrix, Q.shape, delta=delta)

        fpseqs = run_model(conv_1d_net, pitch_shift_Qs)
        delta_fp = fpseqs[:, :, :fpseqs.shape[2] - delta] - fpseqs[:, :, delta:]

        # Binarize: 1 where the delta feature is positive, 0 otherwise
        db[name] = np.where(delta_fp > 0, 1, 0)
        

==> Generating database for taylorswift_ref1.dat
==> Generating database for taylorswift_ref2.dat
==> Generating database for taylorswift_ref3.dat
==> Generating database for taylorswift_ref4.dat
==> Generating database for taylorswift_ref5.dat
==> Generating database for taylorswift_ref6.dat
==> Generating database for taylorswift_ref7.dat
==> Generating database for taylorswift_ref8.dat
==> Generating database for taylorswift_ref9.dat
==> Generating database for taylorswift_ref10.dat
==> Generating database for taylorswift_ref11.dat
==> Generating database for taylorswift_ref12.dat
==> Generating database for taylorswift_ref13.dat
==> Generating database for taylorswift_ref14.dat
==> Generating database for taylorswift_ref15.dat
==> Generating database for taylorswift_ref16.dat
==> Generating database for taylorswift_ref17.dat
==> Generating database for taylorswift_ref18.dat
==> Generating database for taylorswift_ref19.dat
==> Generating database for taylorswift_ref20.dat
==> Gener

After generating the database, we serialize it and save it to disk.

In [61]:
import pickle
db_path = os.path.join(out_dir, artist + '_db.pickle')

In [60]:
with open(db_path, 'wb') as handle:
    pickle.dump(db, handle, pickle.HIGHEST_PROTOCOL)

## Matching Query Against Database

First, we load the database from disk:

In [62]:
db = {}
with open(db_path, 'rb') as handle:
    db = pickle.load(handle)