## Rank by similarity

This notebook lets you pick a song to fingerprint, fingerprint several other songs, and then rank those songs by their similarity to the chosen one.

In [1]:
# Import relevant packages

from bokeh.io import output_notebook
import warnings
import sys
import AudioFP as afp

warnings.filterwarnings('ignore')
output_notebook()

### Setting parameters

The parameters below can be used to tune the audio fingerprinting algorithm.

In [2]:
# Parameters for tuning the audio fingerprinting algorithm

# Parameters used in generating spectrogram
#----------------------------------
afp.nperseg = 16 * 256  # window size
afp.overlap_ratio = 0.4  # degree of overlap between windows; larger -> more overlap, denser fingerprint
#----------------------------------

# Parameters used in finding local peaks
#-------------------------
afp.min_peak_sep = 15  # larger sep -> fewer peaks -> less accuracy, but faster fingerprinting
afp.min_peak_amp = 10  # larger min amp -> fewer peaks -> less accuracy, but faster fingerprinting
#-------------------------

# Parameters used in generating fingerprint
#------------------------------
afp.peak_connectivity = 15  # Number of neighboring peaks to use as targets for each anchor
afp.peak_time_delta_min = 0  # Minimum spacing in time between peaks for anchor and target
afp.peak_time_delta_max = 200  # Maximum spacing in time between peaks for anchor and target
#------------------------------
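
The three fingerprint parameters describe how anchor peaks are paired with target peaks. AudioFP's internals are not shown in this notebook, so the following is only a minimal sketch of one common pairing scheme. It assumes peaks are `(time_bin, freq_bin)` tuples, that the time-delta bounds are inclusive, and that the connectivity limit counts accepted targets. The function `pair_peaks` and its argument names are hypothetical and are not part of AudioFP.

```python
def pair_peaks(peaks, connectivity=15, dt_min=0, dt_max=200):
    """Pair each anchor peak with up to `connectivity` later peaks.

    Hypothetical helper, not AudioFP's implementation.
    `peaks` is an iterable of (time_bin, freq_bin) tuples.
    Returns a list of (anchor_freq, target_freq, time_delta) tuples.
    """
    peaks = sorted(peaks)  # sort by time so targets always follow the anchor
    pairs = []
    for i, (t1, f1) in enumerate(peaks):
        accepted = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > dt_max:
                break  # peaks are time-sorted, so the rest are even farther away
            if dt < dt_min:
                continue  # too close to the anchor (assumed inclusive lower bound)
            pairs.append((f1, f2, dt))
            accepted += 1
            if accepted == connectivity:
                break  # this anchor has enough targets
    return pairs


# Made-up peaks, for illustration only
example_peaks = [(0, 10), (5, 20), (300, 30), (310, 40)]
print(pair_peaks(example_peaks, connectivity=2))
# [(10, 20, 5), (30, 40, 10)]
```

Under these assumptions, raising the connectivity or widening the time-delta window produces more pairs per anchor, which would give a denser fingerprint at the cost of more work per song.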

### Fingerprint a song
We start by picking the song to compare against. When prompted, enter "a" to run automatically through all the functions that generate the fingerprint. If you have already saved that song's fingerprint, enter "s" at the next prompt to open the saved fingerprint.

In [3]:
# Choose the song to rank by similarity against
chosen_song = afp.AudioFP()

Enter "a" for automated fingerprinting or "m" to proceed manually: a
Enter "f" to read from audio file, "r" to record audio, or "s" to open saved fingerprint: s
Enter the filename (excluding the extention) where the fingerprint is saved: SoundHelix-Song-1
Do you want to see the details of the file? Enter "y" or "n": n


### Fingerprinting in a loop
Enter the number of songs you want to fingerprint. For each song, enter "a" when prompted to go through all the fingerprinting steps automatically.

In [4]:
# Pick how many other songs to compare and rank
num_songs = int(input('Enter number of songs to compare and rank: '))
afp_objs = [afp.AudioFP() for _ in range(num_songs)]  # each call prompts for one song

Enter number of songs to compare and rank: 5
Enter "a" for automated fingerprinting or "m" to proceed manually: a
Enter "f" to read from audio file, "r" to record audio, or "s" to open saved fingerprint: f
Enter the filename you want to read (excluding the extension): SoundHelix-Song-12
Do you want to show all plots? Enter "y" or "n": n
Do you want to save the fingerprint to file for later use? Enter "y" or "n": n
Not saving anything
Enter "a" for automated fingerprinting or "m" to proceed manually: a
Enter "f" to read from audio file, "r" to record audio, or "s" to open saved fingerprint: s
Enter the filename (excluding the extention) where the fingerprint is saved: SoundHelix-Song-4
Do you want to see the details of the file? Enter "y" or "n": y
Songname:  SoundHelix-Song-4
Framerate:  44100
Audio-fingerprint:
[  6322 158005  28308 173514  21783 101719 279508  32146 233224  89005
  47148 141494  40005 263561  11039 131802  24436 119782 301203  56272
 260805 108405 171153  90199  7338

### Comparing and ranking

All the songs fingerprinted in the step above are compared to the chosen song. The comparison is done by calculating the Jaccard similarity index of the two fingerprints. The closer this index is to 1, the more similar the songs.
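
For reference, the Jaccard similarity index of two sets is the size of their intersection divided by the size of their union. The sketch below computes that exact definition on plain Python sets. The `jaccard` method that the notebook calls on a fingerprint is not shown here and may compute the index differently (for example, as an estimate), so `jaccard_index` and the sample hash values are purely illustrative.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables of hashable items."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(set_a & set_b) / len(union)


# Made-up hash values, for illustration only
query = [6322, 158005, 28308, 173514]
candidate = [6322, 28308, 999999]

print(jaccard_index(query, query))      # 1.0 (a song compared with itself)
print(jaccard_index(query, candidate))  # 2 shared / 5 distinct = 0.4
```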

In [5]:
# Comparing songs and creating a ranked list
ranked_list = []
for song in afp_objs:
    ranked_list.append((song.songname, chosen_song.fingerprint.jaccard(song.fingerprint)))
ranked_list = sorted(ranked_list, key=lambda x: x[1], reverse=True)

# Print out the results
print('List of songs ranked in order of similarity to {}'.format(chosen_song.songname))
print('')
print('Rank, Song Name, Jaccard similarity index')
for rank, (name, score) in enumerate(ranked_list, start=1):
    print(rank, name, score)

List of songs ranked in order of similarity to SoundHelix-Song-1

Rank, Song Name, Jaccard similarity index
1 SoundHelix-Song-1 1.0
2 SoundHelix-Song-8 0.04296875
3 SoundHelix-Song-4 0.02734375
4 SoundHelix-Song-12 0.00390625
5 test_track 0.0
