## Comparing a signal with itself while adding noise

In this notebook, we compare an audio signal with a copy of itself to which noise has been added. We use the AudioFP class for fingerprinting.

In [1]:
# Import relevant packages

from bokeh.io import output_notebook
import warnings
import sys
import AudioFP as afp

warnings.filterwarnings('ignore')
output_notebook()

### Setting parameters

The parameters below can be used to tune the audio fingerprinting algorithm. 

In [2]:
# Parameters for tuning the audio fingerprinting algorithm

# Parameters used in generating spectrogram
#----------------------------------
afp.nperseg = 16 * 256  # window size
afp.overlap_ratio = 0.4  # degree of overlap, larger number->more overlap, denser fingerprint
#----------------------------------

# Parameters used in finding local peaks
#-------------------------
afp.min_peak_sep = 15  # larger separation -> fewer peaks -> less accuracy, but faster fingerprinting
afp.min_peak_amp = 10  # larger minimum amplitude -> fewer peaks -> less accuracy, but faster fingerprinting
#-------------------------

# Parameters used in generating fingerprint
#------------------------------
afp.peak_connectivity = 15  # Number of neighboring peaks to use as target for each anchor
afp.peak_time_delta_min = 0  # Minimum spacing in time between peaks for anchor and target
afp.peak_time_delta_max = 200  # Maximum spacing in time between peaks for anchor and target
#------------------------------
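
To get a feel for the spectrogram parameters, the arithmetic below works out the time and frequency resolution they imply. This is only a sketch: the 44100 Hz framerate is an assumed value, and it assumes that "overlap_ratio" is the fraction of each window shared with the next one and is truncated to a whole number of samples. AudioFP may compute these quantities differently.

```python
# Assumed sampling rate in Hz (not taken from the notebook)
framerate = 44100

# Values copied from the parameter cell above
nperseg = 16 * 256
overlap_ratio = 0.4

# Assumption: overlap is a fraction of the window, truncated to whole samples
noverlap = int(nperseg * overlap_ratio)   # 1638 samples
hop = nperseg - noverlap                  # 2458 samples between window starts

window_seconds = nperseg / framerate      # duration covered by one window
hop_seconds = hop / framerate             # time step between spectrogram columns
freq_resolution_hz = framerate / nperseg  # spacing between frequency bins
n_freq_bins = nperseg // 2 + 1            # 2049 bins for a real-valued signal


def n_frames(n_samples):
    """Number of full windows that fit in a signal of n_samples samples."""
    return max(0, (n_samples - noverlap) // hop)


print(f"window: {window_seconds * 1000:.1f} ms, hop: {hop_seconds * 1000:.1f} ms")
print(f"frequency resolution: {freq_resolution_hz:.2f} Hz over {n_freq_bins} bins")
print(f"frames in 10 s of audio: {n_frames(10 * framerate)}")
```

A larger "nperseg" gives finer frequency resolution but coarser time resolution, while a larger "overlap_ratio" shortens the hop and produces more spectrogram columns per second.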

### Fingerprint a song

We start by fingerprinting a song. When prompted, enter "a" to run automatically through all the functions that generate the fingerprint.

In [3]:
# Create AudioFP object for first song (enter "a" for automated fingerprinting)
song1 = afp.AudioFP()

Enter "a" for automated fingerprinting or "m" to proceed manually: a
Enter "f" to read from audio file, "r" to record audio, or "s" to open saved fingerprint: f
Enter the filename you want to read (excluding the extension): vanilla_ice_ice_ice_baby
Do you want to show all plots? Enter "y" or "n": n
Do you want to save the fingerprint to file for later use? Enter "y" or "n": n
Not saving anything
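
The parameters "peak_connectivity", "peak_time_delta_min" and "peak_time_delta_max" describe how anchor peaks are paired with target peaks. The sketch below shows one common way such pairing is done in audio fingerprinting. It is an illustration under assumptions, not the code in "AudioFP.py": the function name `pair_peaks`, the `(time_index, freq_index)` peak format, and the `(f1, f2, dt)` hash layout are all hypothetical.

```python
def pair_peaks(peaks, connectivity=15, dt_min=0, dt_max=200):
    """Pair each anchor peak with up to `connectivity` later peaks.

    peaks: iterable of (time_index, freq_index) tuples (hypothetical format).
    Returns a set of (anchor_freq, target_freq, time_delta) tuples.
    """
    peaks = sorted(peaks)  # order by time, then frequency
    hashes = set()
    for i, (t1, f1) in enumerate(peaks):
        # Candidate targets are the next `connectivity` peaks after the anchor
        for t2, f2 in peaks[i + 1 : i + 1 + connectivity]:
            dt = t2 - t1
            # Keep the pair only if its time spacing is within the allowed range
            if dt_min <= dt <= dt_max:
                hashes.add((f1, f2, dt))
    return hashes


example_peaks = [(0, 10), (5, 20), (300, 30)]
print(pair_peaks(example_peaks, connectivity=2))
```

In this example only the first two peaks are paired, because the third peak is more than 200 time steps away from both of them.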


### Add noise to the signal and fingerprint

Next, we create another AudioFP object. This time, when prompted, enter "m" to proceed with the fingerprinting steps manually. We start with an empty object and manually set its "framerate" and "songname" properties based on the original signal. We then use the function "add_noise", defined in the AudioFP module, to generate Gaussian white noise at a specified decibel level and add it to the signal. See [this page](https://chchearing.org/noise/common-environmental-noise-levels/) for common noise levels in decibels. The function "add_noise" takes the audio signal and its framerate as inputs, in that order, and returns the signal with the noise added. Finally, we go through the steps to generate a fingerprint of the noisy signal.

In [4]:
# Create another AudioFP object from the same file and add noise (enter "m" to proceed manually)
song2 = afp.AudioFP()
plot = False  # boolean to display results
song2.songname = 'noisy_' + song1.songname 
channels, song2.framerate = afp.AudioFP.read_audiofile(song2, plot, song1.songname)
# Add noise to the signal
channels = afp.add_noise(channels, song2.framerate)
# Create audio fingerprint
f, t, sgram = afp.AudioFP.generate_spectrogram(song2, plot, channels, song2.framerate)
fp, tp, peaks = afp.AudioFP.find_peaks(song2, plot, f, t, sgram)
afp.AudioFP.generate_fingerprint(song2, plot, fp, tp, peaks)

Enter "a" for automated fingerprinting or "m" to proceed manually: m
Enter the noise level you want to add in dB: 50
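
As a rough illustration of what adding Gaussian white noise at a given decibel level can look like, here is a minimal NumPy sketch. It is not the implementation of "add_noise" in AudioFP: the function name `add_white_noise`, the `ref_amplitude` parameter, and the choice to interpret the decibel value as the noise RMS relative to that reference amplitude are all assumptions made for this example.

```python
import numpy as np


def add_white_noise(signal, noise_db, ref_amplitude=1.0, seed=None):
    """Return `signal` plus zero-mean Gaussian white noise.

    Assumption: noise_db is the noise RMS in dB relative to ref_amplitude,
    i.e. noise_rms = ref_amplitude * 10 ** (noise_db / 20).
    """
    rng = np.random.default_rng(seed)
    noise_rms = ref_amplitude * 10 ** (noise_db / 20)
    # For zero-mean Gaussian noise, the standard deviation equals the RMS
    noise = rng.normal(0.0, noise_rms, size=np.shape(signal))
    return np.asarray(signal, dtype=float) + noise


# One second of a 440 Hz tone at an assumed 44100 Hz framerate
framerate = 44100
t = np.arange(framerate) / framerate
clean = 1000.0 * np.sin(2 * np.pi * 440 * t)

noisy = add_white_noise(clean, noise_db=50, seed=0)
print("measured noise RMS:", np.sqrt(np.mean((noisy - clean) ** 2)))
```

Under this convention, every additional 20 dB multiplies the noise amplitude by 10, so the noise level entered at the prompt has a strong effect on how many spectrogram peaks survive.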


### Comparing fingerprints

To compare two fingerprints, we calculate what is known as the Jaccard similarity. Mathematically, the Jaccard similarity is the size of the intersection divided by the size of the union of two given sets. Thus, two identical sets have a Jaccard similarity index of 1, while two sets with no elements in common have an index of 0. However, there isn't any specific rule for interpreting a value between 0 and 1. One would have to use [bootstrapping](https://en.wikipedia.org/wiki/Bootstrapping_(statistics)) to determine how much similarity an arbitrary score actually indicates. Below, we use ranges chosen by intuition from a small set of songs. The function "compare_fingerprints" is defined in the AudioFP module. If you want to see how the ranges are defined, take a look at the file "AudioFP.py".

In [5]:
afp.compare_fingerprints(song1, song2)

vanilla_ice_ice_ice_baby and noisy_vanilla_ice_ice_ice_baby are quite similar
Jaccard similarity =  0.1796875
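
The Jaccard similarity itself is straightforward to compute on Python sets. The sketch below is a generic illustration, not the body of "compare_fingerprints": the function name `jaccard_similarity` is hypothetical, and returning 1.0 for two empty sets is a convention chosen here.

```python
def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical
        return 1.0
    return len(a & b) / len(union)


print(jaccard_similarity({1, 2, 3}, {2, 3, 4}))  # 2 shared out of 4 total -> 0.5
print(jaccard_similarity({1, 2}, {1, 2}))        # identical sets -> 1.0
print(jaccard_similarity({1, 2}, {3, 4}))        # nothing in common -> 0.0
```

Because the union grows whenever either fingerprint contains hashes that the other lacks, added noise lowers the score even when most of the original hashes are still present.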
