#Frog Call Identification Method
##Building A Classification Model
Functions that need to be included (from libraries, etc.):
* Fast Fourier Transform for the spectrogram
* 2D peak detector (a 1D detector can work as well)
* MFCC (mel-frequency cepstral coefficients)
* KNN (machine learning classifier)



The method also requires:
* .wav reader
* Linear Algebra (Matrix multiplication, inversions, scalar/matrix multiplication/division)
* Matrix concatenation
* Reading data from .txt files

The code in this tutorial is written in Python.


#Database
The database of frog calls from which we build our classification model consists of 15 species, with between 9 and 22 calls per species. Each wave file is named with the acronym of the species' common name followed by the number of the call. For example, the first bull frog call is BF01.wav, the second is BF02.wav, etc. Each call also has a user-input .txt file with the same name. These files include information on the season, type of call, and other environmental factors that can help separate species from one another.
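
As a small illustration of the naming scheme, the sketch below splits a file name such as BF01.wav into its species acronym and call number. The helper name `parse_call_name` is hypothetical, and the sketch assumes that every name is a run of letters followed by a run of digits.

```python
import os
import re


def parse_call_name(file_name):
    """Split a name like 'BF01.wav' into ('BF', 1).

    Hypothetical helper: assumes letters (species acronym) followed by
    digits (call number), then the file extension.
    """
    stem, _ext = os.path.splitext(os.path.basename(file_name))
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", stem)
    if match is None:
        raise ValueError("unexpected call file name: " + file_name)
    return match.group(1), int(match.group(2))


species, call_number = parse_call_name("BF02.wav")
print(species, call_number)  # prints: BF 2
```

The same stem (for example BF02) can then be used to look up the matching user-input .txt file.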

##Getting Started
This method uses a for loop to run through the list of .wav files. Before the loop, however, we construct a placeholder matrix that will grow into the feature vector matrix for the entire database of calls. This matrix is what the last steps use to build a KNN classifier.

```python
import os

import numpy as np
import scipy.io.wavfile as wav
# logfbank and mfcc are assumed to come from python_speech_features
from python_speech_features import logfbank, mfcc

# One placeholder row of 18 zeros (1 label + 4 user inputs + 13 MFCCs).
# The feature rows of each call are concatenated below it, and the
# placeholder row is dropped at the end.
total_feature_vect_matrix = np.zeros((1, 18))

calls_dir = "PathName/calls"
for file in os.listdir(calls_dir):
    if file.endswith(".wav"):
        # get and read the .wav file
        file_name = os.path.join(calls_dir, file)

        (rate, sig) = wav.read(file_name)
        
```

##Spectrogram
Now that we have read the .wav file, we can use the FFT to get the log spectrogram of the sound file.

```python
        # getting log spec
        fbank_feat_not_norm = logfbank(sig, rate)
```
We also want to normalize the spectrogram so that quiet recordings and loud recordings are on equal footing when we find the energy peaks.

```python
        # normalize by the largest value so that the maximum becomes 1
        max_log = np.amax(fbank_feat_not_norm)
        fbank_feat = (1 / max_log) * fbank_feat_not_norm
        # Multiplying a matrix by a scalar in Python is straightforward:
        #     scalar * matrix
        #
        # fbank_feat is now a 2D array representing our spectrogram.
        # The length of its first dimension is the number of time
        # slices, which we need later.
        logSizeY = len(fbank_feat[:, 1])  # length of the TIME axis
         
         
```

        



##Peak Finder
Finding a good 2D peak detector in Python was difficult, so I went with a 1D peak detector, which scans the spectrogram matrix row by row (one frequency band at a time) and returns the indices (column numbers, i.e. time slices) where the peaks occur. This may report the same column more than once; since we don't want duplicate features, we eliminate the repeats.

```python
        # integer array, so the peak positions can be used as indices later
        spec_peaks_array = np.array([], dtype=int)

        for n in range(0, 26):  # the spectrogram has 26 rows (frequency bands)
            # Thresholds: mph is the minimum peak height (since we
            # normalized, the absolute max is 1); mpd is the minimum peak
            # distance, so there must be at least 10 columns between
            # peaks (this lets us ignore plateaus).
            ind = detect_peaks(fbank_feat[:, n], mph=0.8, mpd=10)

            spec_peaks_array = np.concatenate((ind, spec_peaks_array), axis=0)
            # this is now an array with all the columns that contain peaks

        # remove duplicates; sorting keeps the time slices in order
        spec_peaks = sorted(set(spec_peaks_array.tolist()))
        
```
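
To make the roles of the two thresholds concrete, here is a minimal pure-Python sketch of a 1D peak search applied band by band. It is not the `detect_peaks` function used above: `find_peaks_1d` and `peak_frames` are hypothetical names, the spectrogram is a tiny made-up list of lists (frames by bands), and edge cases such as plateaus are handled only roughly.

```python
def find_peaks_1d(values, mph, mpd):
    """Return indices of local maxima that are at least `mph` high and
    at least `mpd` samples apart (taller peaks win)."""
    candidates = [
        i for i in range(1, len(values) - 1)
        if values[i] >= mph
        and values[i] > values[i - 1]
        and values[i] >= values[i + 1]
    ]
    kept = []
    # Visit candidates from tallest to shortest and keep one only if it
    # is far enough from every peak already kept.
    for i in sorted(candidates, key=lambda idx: values[idx], reverse=True):
        if all(abs(i - j) >= mpd for j in kept):
            kept.append(i)
    return sorted(kept)


def peak_frames(spectrogram, mph=0.8, mpd=10):
    """Scan each band of a frames-by-bands spectrogram and return the
    sorted, de-duplicated frame indices that hold a peak."""
    frames = set()
    num_bands = len(spectrogram[0])
    for band in range(num_bands):
        series = [row[band] for row in spectrogram]
        frames.update(find_peaks_1d(series, mph, mpd))
    return sorted(frames)


# Tiny synthetic spectrogram: 7 frames, 2 bands, already normalized.
toy = [
    [0.1, 0.2],
    [0.9, 0.3],
    [0.2, 0.95],
    [0.1, 0.2],
    [0.1, 0.1],
    [1.0, 0.85],
    [0.3, 0.2],
]
print(peak_frames(toy, mph=0.8, mpd=2))  # prints [1, 2, 5]
```

Frame 5 holds a peak in both bands but appears only once in the result, which is the de-duplication step described above.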

###Ratio
For quiet calls, background calls, etc., the threshold of mph = 0.8 might be too high. We don't want to miss a frog call if there is one in the background, so we use this ratio to make sure there are enough peaks for us to find the frogs. We want at least 5 peaks per 100 columns.
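
The fallback logic can be sketched on its own, independent of the peak detector. In this sketch `peaks_with_fallback` and `find_frames` are hypothetical names: `find_frames` stands in for the row-by-row peak search and is assumed to take a height threshold and return a list of column indices.

```python
def peaks_with_fallback(find_frames, num_frames,
                        mph=0.8, fallback_mph=0.53, min_ratio=0.05):
    """Run `find_frames(mph)`; if fewer than `min_ratio` peaks per column
    are found, run it once more with the lower `fallback_mph`."""
    frames = find_frames(mph)
    ratio = len(frames) / float(num_frames)
    if ratio < min_ratio:
        frames = find_frames(fallback_mph)
    return frames


# Stand-in peak search over made-up (column, height) pairs.
toy_peaks = [(10, 0.6), (40, 0.9), (70, 0.55)]


def toy_search(mph):
    return [frame for frame, height in toy_peaks if height >= mph]


# Only one peak clears 0.8 in 100 columns, so the search is rerun at 0.53.
print(peaks_with_fallback(toy_search, num_frames=100))  # prints [10, 40, 70]
```

Passing the search in as a function keeps the threshold in one place, so the loop does not have to be written out twice.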

```python
        ratio = len(spec_peaks) / float(logSizeY)
        if ratio < 0.05:
            # Too few peaks: rerun the peak-finding loop above with mph = 0.53.
            pass
```

##MFCC

Now we compute the MFCCs for the .wav file, then collect the MFCC vector for each time slice (column) listed in spec_peaks. These are the features.

```python

        mfcc_feat_not_norm = mfcc(sig, rate)
        # normalize the MFCCs
        max_mfcc = np.amax(mfcc_feat_not_norm)
        mfcc_feat = (1 / max_mfcc) * mfcc_feat_not_norm
        mfcc_size = len(mfcc_feat[:, 1])
        # mfcc_size must equal logSizeY so that the time slices line up.
        # Both functions (spectrogram and MFCC) should have a default,
        # changeable time step; use the same one for both.
```


##Getting the feature vector matrix
Now we have to pick out the MFCCs we need, add in the user inputs, and add the label.

```python
        MFCC_features_UI = np.zeros([len(spec_peaks), 18])

        for counter in range(0, len(spec_peaks)):
```
This is a matrix full of zeros and the beginning of the for loop that fills it. To add the labels, I used if and elif statements to give a different label to each species, e.g.:

```python

            if file[:2] == "BF":  # file is a string
                label = [1]
            elif file[:2] == "BT":
                label = [2]
``` 
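
An alternative to a long if/elif chain is a dictionary lookup. The sketch below is only an illustration: `SPECIES_LABELS` and `label_for` are hypothetical names, and only the two prefixes shown above are filled in.

```python
# Only the two prefixes shown in the tutorial are listed; the remaining
# species would be added in the same way.
SPECIES_LABELS = {
    "BF": 1,
    "BT": 2,
}


def label_for(file_name):
    """Return the numeric label as a one-element list, e.g. [1] for 'BF01.wav'."""
    prefix = file_name[:2]
    if prefix not in SPECIES_LABELS:
        raise KeyError("no label defined for species prefix " + repr(prefix))
    return [SPECIES_LABELS[prefix]]


print(label_for("BT07.wav"))  # prints [2]
```

Returning a one-element list keeps the result ready for the concatenation step, and an unknown prefix fails loudly instead of silently reusing the previous label.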

Getting the MFCC

```python

            time_slice = spec_peaks[counter]
            temp = mfcc_feat[time_slice, :]
```

To get the user inputs, which are stored in .txt files, we open the matching file, read its first line, split it on spaces, and save the first four values as an array:

```python

            for texts in os.listdir("/Users/katrinasmart/Desktop/python_shiz/old_data_UI"):
                if file[:5] == texts[:5]:  # same call name, e.g. "BF01."
                    text_name = "old_data_UI/" + texts
                    with open(text_name, 'r') as t:
                        line = t.readline().strip()  # first line only
                    UI1 = line.split(' ')
                    # a list comprehension works in both Python 2 and 3
                    UI = [int(ui) for ui in UI1[:4]]
            UI = np.array(UI)
```
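
The parsing step can be tried in isolation. In this sketch `read_user_inputs` is a hypothetical helper, `io.StringIO` stands in for an opened .txt file, and the values are made up.

```python
import io


def read_user_inputs(text_file, count=4):
    """Read the first line of an open text file and return its first
    `count` whitespace-separated values as integers."""
    first_line = text_file.readline().strip()
    return [int(value) for value in first_line.split()[:count]]


# io.StringIO stands in for open("BF01.txt"); the values are made up.
example = io.StringIO("1 0 2 3 extra\nsecond line is ignored\n")
print(read_user_inputs(example))  # prints [1, 0, 2, 3]
```

Calling `split()` without an argument also tolerates repeated spaces or tabs between the values.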

Concatenate it all!

```python
            MFCC_features_UI[counter] = np.concatenate((label, UI, temp.transpose()), axis=0)
```
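
The snippets above do not show how each call's rows reach the database-wide matrix. The sketch below shows one way this could work, assuming the per-call matrix is concatenated onto `total_feature_vect_matrix` at the end of every loop iteration and the placeholder row is dropped once all calls are processed. All numeric values are made up.

```python
import numpy as np

label = [1]                        # species label, as a one-element list
UI = np.array([1, 0, 2, 3])        # four user inputs (made-up values)
temp = np.linspace(0.0, 1.0, 13)   # stand-in for one row of 13 MFCCs

row = np.concatenate((label, UI, temp), axis=0)   # 1 + 4 + 13 = 18 values

# Per-call matrix: one row per peak (two peaks here, same row for brevity).
MFCC_features_UI = np.vstack([row, row])

# The running database matrix starts with a placeholder row of zeros.
total_feature_vect_matrix = np.zeros((1, 18))
total_feature_vect_matrix = np.concatenate(
    (total_feature_vect_matrix, MFCC_features_UI), axis=0)

# Drop the placeholder row once every call has been added.
all_rows = total_feature_vect_matrix[1:, :]
print(row.shape, all_rows.shape)  # prints (18,) (2, 18)
```

The column layout (label in column 0, user inputs in columns 1 to 4, MFCCs from column 5 on) is what the later slicing relies on.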


##PCA 
Principal Component Analysis time! You can find a function to do it for you; however, the calculation is pretty straightforward. The steps are:
* Take the transpose of the matrix (x) whose basis you want to change (transpose = xT)
* Compute xT * x
* Get the eigenvectors of the resulting matrix
* The new projected matrix (your original matrix with a change of basis) is x * [eigenvectors]

Note: Python indices begin at 0, but in our feature matrix row 0 is the placeholder row of zeros, so we leave it off by starting at row 1. Also, when slicing matrices, [5:] means "begin at 5 and go to the end", and [0:5] means "begin at 0 and stop before 5 (at 4)".

```python
x = all_data_MFCC_unshifted_label_UI[1:,5:] #we don't want to change the label or the user inputs!  
xT = x.T #transpose
xTx = np.dot(xT,x) #how to multiply matrices with python. np = numpy
eig_val_cov, eig_vec_cov = np.linalg.eig(xTx) #function to get e_vectors and values


projection = np.dot(all_data_MFCC_unshifted_label_UI[1:,5:], eig_vec_cov) #matrix with change of basis


np.savetxt('PCA_matrix_UI', eig_vec_cov, fmt = '%2.14f') #SAVE THIS!!!
```
The PCA matrix MUST be saved. We will use this to change the basis of new calls we have to ID. 
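
The following sketch runs the same change of basis on random stand-in data and then applies the eigenvector matrix to a new row, which is how the saved matrix would be reused. Two things here are assumptions rather than part of the method above: `np.linalg.eigh` is swapped in for `np.linalg.eig` because xT * x is symmetric, and the eigenvectors are reordered by decreasing eigenvalue. Textbook PCA also subtracts the column means first, which this tutorial does not do, so the sketch leaves that step out as well.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 13))          # stand-in for the 13 MFCC columns

xTx = np.dot(x.T, x)                   # 13 x 13 symmetric matrix
# eigh is NumPy's routine for symmetric matrices; it returns real
# eigenvalues in ascending order and orthonormal eigenvectors.
eig_val, eig_vec = np.linalg.eigh(xTx)

# Reorder so the direction with the largest eigenvalue comes first.
order = np.argsort(eig_val)[::-1]
eig_val, eig_vec = eig_val[order], eig_vec[:, order]

projection = np.dot(x, eig_vec)        # training data in the new basis

# A new call is projected with the same eigenvector matrix.
new_mfcc = rng.normal(size=(1, 13))
new_projection = np.dot(new_mfcc, eig_vec)

print(projection.shape, new_projection.shape)  # prints (50, 13) (1, 13)
```

Because the eigenvector matrix is orthonormal, the projection only rotates the data: distances between feature vectors are unchanged, which matters for a distance-based classifier such as KNN.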

Now we put the labels and the user inputs back on. In case the other user inputs are unavailable, we also want to build a model with season as the only user input (0 = winter, 1 = spring, 2 = summer, 3 = fall). Season is the first user input.

```python

data_shifted_UI = np.concatenate((all_data_MFCC_unshifted_label_UI[1:,0:5], projection), axis = 1)
data_shifted = np.concatenate((all_data_MFCC_unshifted_label_UI[1:,0:2], projection), axis = 1)

```


##Model 
KNN (k-nearest neighbors) is a very common machine learning method, so finding a library that implements it should not be difficult.
First we have to change the data from an array to a list.

```python
from sklearn.neighbors import KNeighborsClassifier  # assumed: scikit-learn

training_data_UI = data_shifted_UI[:, 1:]  # we don't want labels included in the list
T_data_UI = []
for i in range(0, len(training_data_UI[:, 1])):
    data = training_data_UI[i, :]
    data = data.tolist()
    T_data_UI.append(data)

# The labels also have to be a list, not an array.
# They are the first column of data_shifted_UI.
training_labels_UI = data_shifted_UI[:, 0]
training_labels_UI = training_labels_UI.tolist()

# build and fit the KNN model
neigh_UI = KNeighborsClassifier(n_neighbors=1)
neigh_UI.fit(T_data_UI, training_labels_UI)

```
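
To show what a classifier with n_neighbors=1 decides, here is a minimal pure-Python sketch of the 1-nearest-neighbour rule. It is an illustration only, not a replacement for the library classifier; `nearest_label` and `squared_distance` are hypothetical names and the data is made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_label(train_vectors, train_labels, query):
    """Return the label of the training vector closest to `query`."""
    best_index = min(
        range(len(train_vectors)),
        key=lambda i: squared_distance(train_vectors[i], query),
    )
    return train_labels[best_index]


# Two made-up species, two feature vectors each.
T_data = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
T_labels = [1, 1, 2, 2]

print(nearest_label(T_data, T_labels, [0.8, 1.0]))  # prints 2
```

With a single neighbour, every new feature vector simply takes the label of the most similar training vector, so one noisy training row can change the result.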

Now the object neigh_UI needs to be saved so that it can be used for new calls. The same process needs to be repeated with data_shifted, which is the feature vector matrix minus three user inputs.
    