1_Mel_Spectrograms

# BIRD SONG CLASSIFICATION

Calla Jamieson, 
Data Science Candidate
April 11, 2021
___

__Project Overview__

Some weeks ago I asked, "_How can we use Machine Learning to identify bird song?_", which raised three more questions:

* Can we identify the species of bird from an audio sample?

* Can we identify bird songs out of range? (not on migratory route, out of season)

* Can we identify whether a particular song is unique to an individual bird?

As it turns out, by converting the audio waves to Mel Spectrograms, we can.

_A brief overview of audio signals in the context of Mel spectrograms can be found here:_

<div align="center">https://medium.com/analytics-vidhya/understanding-the-mel-spectrogram-fca2afa2ce53</div>

I will use librosa to convert the sound files to Mel spectrograms.
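
To give a feel for what the Mel scale does, here is a minimal sketch of the Hz-to-Mel mapping. It assumes the common HTK-style formula; librosa's default Mel scale uses a slightly different (Slaney) variant, so the numbers are illustrative only. The function names `hz_to_mel` and `mel_to_hz` are written here for illustration and are not the librosa calls.

```python
import math

def hz_to_mel(f_hz):
    # HTK-style formula (an assumption; librosa defaults to the Slaney variant)
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    # inverse of the formula above
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# The same 100 Hz step covers more Mels at low frequencies than at high ones,
# which is why Mel spectrograms give more resolution to the lower bands.
low_gap = hz_to_mel(200) - hz_to_mel(100)
high_gap = hz_to_mel(8100) - hz_to_mel(8000)
print(round(low_gap, 1), round(high_gap, 1))
```

The shrinking gap at higher frequencies roughly mirrors how pitch differences are perceived.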

The inputs:

* 264 .flac files

The outputs:

* 264 Mel spectrogram images
* A list of arrays containing the raw spectrogram data, each in the shape of (128, number of time frames)





Start by importing the packages we're going to be using.

In [None]:

# Basic packages
import numpy as np
import pandas as pd
from glob import glob

# Audio processing
import librosa
import librosa.util
import librosa.display

# For visualizations
import matplotlib.pyplot as plt
import seaborn as sns

import os

# For reproducible results
np.random.seed(101001010)

%matplotlib inline

pd.set_option('display.max_rows', 300)
pd.set_option("display.max_columns", 40)

In [None]:
# Set up the input and output directories
audio_fpath = 'D:\\kaggle_birds\\songs\\'
image_fpath = 'D:\\kaggle_birds\\mel_spec\\'
audio_clips = os.listdir(audio_fpath)
mel_spec = os.listdir(image_fpath)
print("No. of files in audio folder = ",len(audio_clips))
print("No. of files in mel_spec folder = ", len(mel_spec))

In [None]:
def make_mels(i):
    '''
    make_mels - Loads the audio file at path i, computes a 128-band Mel spectrogram,
    appends the raw spectrogram to the global list song_files, and saves the
    dB-scaled spectrogram image to image_fpath.
    '''
    # load the audio file, resampled to 44.1 kHz
    y, sr = librosa.load(i, sr=44100)

    # set the number of Mel filter banks to 128 to capture as much frequency detail as possible
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=10000)

    # store the raw spectrogram data
    song_files.append(S)

    # convert to decibel units
    S_dB = librosa.power_to_db(S, ref=np.max)

    # make the spectrogram image
    img = librosa.display.specshow(S_dB, x_axis='time', y_axis='mel', sr=sr, fmax=10000)

    # collect the figure
    fig = img.get_figure()

    # parse the file name
    fname = fname_parser(i)
    print(fname)

    # write the image file (matplotlib appends .png when no extension is given)
    fig.savefig(image_fpath + fname)

    # show the spectrogram, then close the figure so figures do not pile up in memory
    plt.show()
    plt.close(fig)
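
The decibel conversion step can be sketched in plain NumPy. This is a simplified illustration of what `librosa.power_to_db(S, ref=np.max)` computes, assuming librosa's documented defaults of `amin=1e-10` and `top_db=80.0`; the helper name `power_to_db_sketch` and the toy array are hypothetical.

```python
import numpy as np

def power_to_db_sketch(S, amin=1e-10, top_db=80.0):
    # Simplified stand-in for librosa.power_to_db with ref=np.max (not the library code)
    S = np.asarray(S, dtype=float)
    ref = S.max()
    # 10 * log10(S / ref), with a floor of amin to avoid log(0)
    log_spec = 10.0 * np.log10(np.maximum(amin, S))
    log_spec -= 10.0 * np.log10(np.maximum(amin, ref))
    # clip everything more than top_db below the peak
    return np.maximum(log_spec, log_spec.max() - top_db)

# toy power values: the loudest bin maps to 0 dB, quieter bins go negative
power = np.array([[1.0, 0.1], [0.01, 1e-12]])
db = power_to_db_sketch(power)
print(db)
```

Because the reference is the maximum of each spectrogram, every image peaks at 0 dB, so absolute loudness differences between recordings are not preserved in the images.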
    

## Loading the Audio Files

In [None]:
# walk the directory tree and collect the full path of every file
def list_files(root_dir):
    r = []
    for subdir, _, files in os.walk(root_dir):
        for file in files:
            r.append(os.path.join(subdir, file))
    return r
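
If the songs folder ever contains files other than audio, a variant that keeps only one extension may be useful. The sketch below is an alternative, not the function used in this notebook; the name `list_audio_files`, the `suffix` parameter, and the temporary folder layout are assumptions made for the example.

```python
import tempfile
from pathlib import Path

def list_audio_files(root_dir, suffix=".flac"):
    # hypothetical helper: recursively collect files with the given extension
    root = Path(root_dir)
    return sorted(
        str(p) for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == suffix
    )

# small demonstration on a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "species_a").mkdir()
    (Path(tmp) / "species_a" / "xc1.flac").touch()
    (Path(tmp) / "species_a" / "notes.txt").touch()
    found = list_audio_files(tmp)
    found_names = [Path(p).name for p in found]

print(found_names)
```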

In [None]:
# check the list for the input flac files
audio_file_list = list_files(audio_fpath)
audio_file_list

### Make a function to parse filenames from the file path

In [None]:

# define filename extractor 

def fname_parser(i):
    '''
    fname_parser - Parses the filename from the path. The text prefix (xc: xeno-canto) is stripped in the first split
    and is added again after extracting the core filename. Returns the filename for use in writing files. Author's code.
    '''
    fname = i
    a_name = fname.split('xc')
    b_name = a_name[1]
    c_name = b_name.split('.')
    d_name = c_name[0]
    song_name = 'xc'+d_name
    return song_name
print(fname_parser.__doc__)
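
One caveat with splitting on 'xc' is that it misfires if a folder name in the path also contains those letters. A possible alternative, sketched below under the assumption that the filenames already start with the xc prefix, is to take the file stem directly. The name `fname_from_path` and the `boxcar` folder are hypothetical; `PureWindowsPath` is used so the example behaves the same on any operating system.

```python
from pathlib import PureWindowsPath

def fname_from_path(path):
    # hypothetical alternative: take the filename without its extension
    return PureWindowsPath(path).stem

# 'boxcar' is a made-up folder name that contains 'xc'
example = fname_from_path('D:\\kaggle_birds\\songs\\boxcar\\xc12345.flac')
print(example)
```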

In [None]:
# store the list of song names
song_names = [fname_parser(i) for i in audio_file_list]


In [None]:
# the list of mel spectrograms
mel_file_list = list_files(image_fpath)
mel_file_list

# Converting the audio files into Mel Spectrograms

In [None]:
'''
Converting the audio files into Mel Spectrograms

    Initialize container for raw data
    Set up loop
    Apply the filterbanks and collect the data
    Convert Scale to dB
    Parse the filename
    Write the spectrogram image
'''
song_files = []

#Set up loop
for i in audio_file_list:
    
    make_mels(i)

print(make_mels.__doc__)
  
   
    

Now that we have converted the audio files to Mel spectrograms, let's get those spectrograms into DataFrames.

In [None]:
# convert each spectrogram array to a DataFrame and write it to CSV to get a feel for the scope
for i in range(len(song_files)):
    df = pd.DataFrame(data= song_files[i])
    df.to_csv(f'{i}.csv', index = False)

In [None]:
song_0 = pd.read_csv('0.csv')
song_0

In [None]:
song_0.info()

In [None]:
# Calculating the data points from this one image
data_points = 128*2188*32
data_points
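
As a rough cross-check on those numbers, the sketch below assumes that 128 is the number of Mel bands and 2188 is the number of time frames in this spectrogram, and that librosa's default `hop_length=512` with centered frames was used, in which case the frame count is `1 + n_samples // hop_length`. The factor of 32 in the cell above is not explained in the notebook, so it is left out here. The helper names are hypothetical.

```python
def n_frames(n_samples, hop_length=512):
    # frame count for a centered STFT (librosa's default behaviour)
    return 1 + n_samples // hop_length

def clip_seconds(frames, sr=44100, hop_length=512):
    # approximate clip duration implied by a given number of frames
    return (frames - 1) * hop_length / sr

# values in one spectrogram, assuming a (128, 2188) array
values_per_spectrogram = 128 * 2188

print(n_frames(44100))                 # frames in one second of audio at 44.1 kHz
print(round(clip_seconds(2188), 1))    # approximate length of this clip in seconds
print(values_per_spectrogram)
```

Under these assumptions, 2188 frames corresponds to a clip of roughly 25 seconds.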