<a href="https://colab.research.google.com/github/gened1080/audio-fingerprinting/blob/master/Rank_By_Similarity_Fall_2020.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Rank by similarity

This notebook lets you pick a song, fingerprint it, and then fingerprint a collection of other songs, which are ranked by their similarity to the chosen song.

In [2]:
%%bash
!(stat -t /usr/local/lib/*/dist-packages/google/colab > /dev/null 2>&1) && exit 
rm -rf audio-fingerprinting
git clone https://github.com/gened1080/audio-fingerprinting.git
pip install pydub
pip install datasketch
sudo apt-get install libasound-dev portaudio19-dev libportaudio2 libportaudiocpp0 ffmpeg
pip install pyaudio

Reading package lists...
Building dependency tree...
Reading state information...
libportaudio2 is already the newest version (19.6.0-1).
libportaudiocpp0 is already the newest version (19.6.0-1).
portaudio19-dev is already the newest version (19.6.0-1).
libasound2-dev is already the newest version (1.1.3-5ubuntu0.6).
ffmpeg is already the newest version (7:3.4.8-0ubuntu0.2).
0 upgraded, 0 newly installed, 0 to remove and 37 not upgraded.


Cloning into 'audio-fingerprinting'...


In [3]:
# Import relevant packages

from bokeh.io import output_notebook
import warnings
import sys
sys.path.append('/content/audio-fingerprinting')
import AudioFP as afp
import os

warnings.filterwarnings('ignore')
output_notebook()

### Fingerprint a song
Start by choosing the reference song. Run the cell below and follow the prompts to select a song and fingerprint it. If you have already saved that song's fingerprint, enter `s` to open the saved fingerprint instead.

In [4]:
# Choose the song to rank by similarity against
chosen_song = afp.AudioFP(process='a')

Enter "f" to read from audio file or "s" to open saved fingerprint: f
Enter the filename you want to read (excluding the extension): queen_under_pressure
Do you want to show all plots? Enter "y" or "n": n
Do you want to save the fingerprint to file for later use? Enter "y" or "n": n
Not saving anything


### Fingerprinting multiple songs
Read all the files in the `songs` folder and fingerprint every `.mp3` file.

In [5]:
# Build the path to the folder containing the songs
mypath = os.path.join(os.getcwd(), 'audio-fingerprinting', 'songs')
# Get list of all files in that folder (top level only)
all_files = next(os.walk(mypath))[2]
songfiles = []
# Collect the base names of all .mp3 files
for file in all_files:
    # splitext also handles file names that have no extension
    name, extension = os.path.splitext(file)
    if extension == '.mp3':
        songfiles.append(name)
num_songs = len(songfiles)
# Create one AudioFP object per .mp3 file
afp_objs = [afp.AudioFP(process='m') for _ in range(len(songfiles))]
# Generate audio fingerprints for all tracks
for song, songfile in zip(afp_objs, songfiles):
    channels, framerate = afp.AudioFP.read_audiofile(song, False, songfile)
    f, t, sgram = afp.AudioFP.generate_spectrogram(song, False, channels, framerate)
    fp, tp, peaks = afp.AudioFP.find_peaks(song, False, f, t, sgram)
    fp = afp.AudioFP.generate_fingerprint(song, False, fp, tp, peaks)
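
The file-selection step above can be tried on its own, without any audio files. The sketch below is not part of the `AudioFP` module: `list_mp3_basenames` is a hypothetical helper and the file names are made up for the demonstration. It uses `pathlib`, so files without an extension, such as `README`, are skipped.

```python
import tempfile
from pathlib import Path


def list_mp3_basenames(folder):
    """Return the sorted base names (no extension) of the .mp3 files in folder."""
    return sorted(
        p.stem for p in Path(folder).iterdir()
        if p.is_file() and p.suffix == ".mp3"
    )


# Demonstrate on a throwaway folder containing made-up, empty files.
with tempfile.TemporaryDirectory() as tmp:
    for fname in ["song_a.mp3", "song.b.mp3", "notes.txt", "README"]:
        (Path(tmp) / fname).touch()
    found = list_mp3_basenames(tmp)

print(found)
```

Only the two `.mp3` entries are returned, and a name containing an extra dot keeps everything before the final `.mp3`.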

### Comparing and ranking

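Each song fingerprinted in the step above is compared to the chosen song by calculating the Jaccard similarity index. The closer this index is to 1, the more similar the songs. A true Jaccard index lies between 0 and 1; the slightly negative scores in the output below suggest that `afp.calc_jaccard` returns an estimate rather than an exact value.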
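
To make the index concrete, here is a minimal sketch of the exact Jaccard similarity between two sets: the size of the intersection divided by the size of the union. It is an illustration only and not necessarily how `afp.calc_jaccard` computes its score. The function `jaccard_index` and the small integer "fingerprints" are hypothetical, and treating two empty sets as identical is a convention chosen for this example.

```python
def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B| for two iterables of hashable items."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention chosen here: two empty sets count as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


# Hypothetical fingerprints, represented as sets of integer hashes.
fp_query = {101, 202, 303, 404}
fp_same = {101, 202, 303, 404}
fp_partial = {101, 202, 505, 606}
fp_other = {707, 808}

print(jaccard_index(fp_query, fp_same))     # identical sets
print(jaccard_index(fp_query, fp_partial))  # 2 shared hashes out of 6 distinct
print(jaccard_index(fp_query, fp_other))    # no shared hashes
```

Identical sets score 1, disjoint sets score 0, and partial overlap falls in between, which is the scale the ranking relies on.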

In [6]:
# Comparing songs and creating a ranked list
ranked_list = []
for song in afp_objs:
    ranked_list.append((song.songname, afp.calc_jaccard(chosen_song, song)))
ranked_list = sorted(ranked_list, key=lambda x: x[1], reverse=True)

# Print out the results
print('List of songs ranked in order of similarity to {}'.format(chosen_song.songname))
print('')
print('Rank, Song Name, Jaccard similarity index')
for rank, (name, score) in enumerate(ranked_list, start=1):
    print(rank, name, score)

List of songs ranked in order of similarity to queen_under_pressure

Rank, Song Name, Jaccard similarity index
1 queen_under_pressure 1.0
2 queen_david_bowie_under_pressure_classic_queen_mix 1.0
3 SoundHelix-Song-12 0.011923905441552642
4 SoundHelix-Song-1 0.0001483757026003439
5 vanilla_ice_ice_ice_baby -0.0031594663881128983
6 SoundHelix-Song-8 -0.006092039038546704
7 SoundHelix-Song-4 -0.016741140637581076
