# Introduction

This notebook introduces a simple way to prepare the data from the MP3 audio files before feeding it to an LSTM-based model. The model is then trained to classify each sample as one of the 264 bird species. Please note that this is a work in progress and that I have made this code available solely to help everyone get started with the competition. Of course, feel free to upvote this notebook if you find it useful!



* **V9:** The entire pipeline now works from data preparation to model predictions and submission.
* **V11:** As the model is failing to classify "no call" samples due to the lack of existing data, I am adding a check at prediction time. This check automatically classifies a sample as "no call" if none of the model's outputs is activated with high confidence. *score: 0.50*
* **V12:** Attempt to include all bird categories in the dataset/model and increase the confidence level required to take the model's output into account. *score: 0.54*
* **V13:** The previous version taught us that the model did not learn to predict anything with high confidence. Therefore, this version will try to reduce the number of bird species being taught. The goal in this version is just to beat the baseline. *score: 0.53*
* **V17:** Attempting classification of many bird species at the same time did not prove to be successful. In this version, we try to classify only 5 species while **all** other species will be categorised as other/nocall. 
* **V18:** We try to recognise 20 species instead of just 5, as V17 did not learn at all despite the use of class weights.
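The "no call" check introduced in V11 can be sketched as follows. This is only a minimal illustration, assuming the model returns one vector of softmax probabilities per sample; the function name `apply_nocall_threshold`, the threshold value of 0.5 and the constant `NOCALL_ID` are hypothetical and not taken from the notebook.

```python
import numpy as np

NOCALL_ID = 0  # assumption: class 0 stands for "nocall", as in the ID dictionaries built later

def apply_nocall_threshold(probabilities, threshold=0.5):
    """Return one class ID per row, falling back to NOCALL_ID when no output is confident."""
    probabilities = np.asarray(probabilities)
    predicted = probabilities.argmax(axis=1)            # most likely class per sample
    confident = probabilities.max(axis=1) >= threshold  # is that class confident enough?
    return np.where(confident, predicted, NOCALL_ID)

# Two fake predictions: the first is confident, the second is not
example_probabilities = np.array([[0.1, 0.8, 0.1],
                                  [0.3, 0.4, 0.3]])
thresholded = apply_nocall_threshold(example_probabilities)
print(thresholded)
```

Raising the threshold (as attempted in V12) sends more samples to "nocall", so it only helps if the model is actually confident on true bird calls.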

In [1]:
import numpy as np
import pandas as pd
import wave
from scipy.io import wavfile
import os
import librosa
import warnings
from sklearn.utils import shuffle
from sklearn.utils import class_weight
import sklearn.preprocessing  # loads the submodule used by minmax_scale below (and binds the sklearn name)
from tqdm import tqdm

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras import Input
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import ModelCheckpoint, ReduceLROnPlateau, EarlyStopping
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Dense, Flatten, Dropout, Activation, LSTM, SimpleRNN, Conv1D

import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

# Data preparation

In this section, we first create two dictionaries that translate each bird species into an ID code and vice versa. As we won't be able to use the entirety of the data in this notebook for processing-time reasons, we shuffle the `DataFrame` containing the training data before preparing it. We then create a new `DataFrame` in which we store all the samples, each 5 seconds long and downsampled to 10 data points per second (50 data points per sample).
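To make the sample format concrete, here is a small sketch of the downsampling and windowing on synthetic noise rather than a real recording. The helper name `split_into_windows` is hypothetical, and the sampling rate of 22050 Hz is assumed because it is librosa's default when loading a file. Note that keeping one value out of every `step` is a crude decimation without any filtering.

```python
import numpy as np

def split_into_windows(signal, sample_rate, points_per_second=10, window_seconds=5):
    """Keep `points_per_second` values per second, then cut into fixed-length windows."""
    step = int(sample_rate / points_per_second)    # distance between kept values
    downsampled = signal[::step]
    window = points_per_second * window_seconds    # 50 data points per 5 s window
    window_count = len(downsampled) // window      # incomplete trailing window is dropped
    return downsampled[:window_count * window].reshape(window_count, window)

sample_rate = 22050                                # assumption: librosa's default rate
synthetic_signal = np.random.randn(sample_rate * 12)  # 12 s of noise standing in for audio
windows = split_into_windows(synthetic_signal, sample_rate)
print(windows.shape)  # two complete 5 s windows of 50 points each
```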

In [2]:
# train_df = pd.read_csv('/kaggle/input/birdsong-recognition/train.csv')
train_df = pd.read_csv('./rawdata/train.csv')

It looks like each ebird code is associated with a "reasonable" number of samples. To make the problem easier to start with, we only keep the high-quality recordings (rating of 4 or more) and then only the bird species that still have exactly 100 such recordings.

In [3]:
train_df = train_df.query("rating>=4")

# value_counts keeps each count attached to its own species label
# (unique() and groupby() do not guarantee the same ordering)
birds_count = train_df.ebird_code.value_counts().to_dict()
most_represented_birds = [key for key,value in birds_count.items() if value == 100]

train_df = train_df.query("ebird_code in @most_represented_birds")
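The selection logic can be checked on a toy `DataFrame`. The sketch below is illustrative only: the species labels and the target count of 2 are made up, and it relies on `value_counts`, which returns the counts indexed by species.

```python
import pandas as pd

# Toy data: "a" appears twice, "b" three times, "c" once
toy_df = pd.DataFrame({"ebird_code": ["b", "a", "b", "c", "b", "a"]})

counts = toy_df["ebird_code"].value_counts()        # counts indexed by species
toy_selected = counts[counts == 2].index.tolist()   # species with exactly 2 samples
toy_filtered = toy_df[toy_df["ebird_code"].isin(toy_selected)]

print(toy_selected, len(toy_filtered))
```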

We now have only 49 bird species left in our dataset.

In [5]:
len(train_df.ebird_code.unique())

49

In [11]:
birds_to_recognise = shuffle(most_represented_birds)[:20]
print(birds_to_recognise)

['warvir', 'gnwtea', 'normoc', 'whcspa', 'spotow', 'comrav', 'astfly', 'orcwar', 'houspa', 'redcro', 'bkhgro', 'linspa', 'chispa', 'norcar', 'grhowl', 'mallar3', 'wewpew', 'rewbla', 'easmea', 'bewwre']


We build 2 dictionaries to translate the ebird code into an integer and vice versa.

In [12]:
sequence_length = 50

ebird_to_id = {}
id_to_ebird = {}
ebird_to_id["nocall"] = 0
id_to_ebird[0] = "nocall"
for idx, unique_ebird_code in enumerate(birds_to_recognise):
    # IDs are stored as integers everywhere, consistently with the "nocall" entry
    ebird_to_id[unique_ebird_code] = idx+1
    id_to_ebird[idx+1] = str(unique_ebird_code)
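A quick way to validate such a pair of dictionaries is a round-trip check. The sketch below uses three species codes from the list above and hypothetical `toy_` names so that it does not overwrite the real dictionaries.

```python
toy_birds = ["warvir", "gnwtea", "normoc"]

# ID 0 is reserved for "nocall"; species IDs start at 1
toy_ebird_to_id = {"nocall": 0}
toy_ebird_to_id.update({code: idx for idx, code in enumerate(toy_birds, start=1)})
toy_id_to_ebird = {idx: code for code, idx in toy_ebird_to_id.items()}

# Translating a code to its ID and back should return the original code
round_trip_ok = all(toy_id_to_ebird[toy_ebird_to_id[code]] == code
                    for code in toy_ebird_to_id)
print(round_trip_ok)
```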

In [13]:
train_df = shuffle(train_df)
train_df.head()

Unnamed: 0,rating,playback_used,ebird_code,channels,date,pitch,duration,filename,speed,species,...,xc_id,url,country,author,primary_label,longitude,length,time,recordist,license
19916,5.0,no,whcspa,2 (stereo),2019-05-19,Not specified,15,XC478421.mp3,level,White-crowned Sparrow,...,478421,https://www.xeno-canto.org/478421,Canada,Guy Breton,Zonotrichia leucophrys_White-crowned Sparrow,-71.309,0-3(s),08:30,Guy Breton,Creative Commons Attribution-NonCommercial-Sha...
6413,5.0,no,comrav,2 (stereo),2013-03-10,Not specified,21,XC124411.mp3,Not specified,Northern Raven,...,124411,https://www.xeno-canto.org/124411,Slovakia,Ľubomír Haluška,Corvus corax_Common Raven,19.2856,Not specified,15:00,Ľubomír Haluška,Creative Commons Attribution-NonCommercial-Sha...
12842,5.0,no,marwre,1 (mono),2018-05-06,both,5,XC417865.mp3,accelerating,Marsh Wren,...,417865,https://www.xeno-canto.org/417865,United States,Sue Riffe,Cistothorus palustris_Marsh Wren,-83.2065,0-3(s),08:55,Sue Riffe,Creative Commons Attribution-NonCommercial-Sha...
17953,5.0,no,sonspa,1 (mono),2017-04-21,Not specified,169,XC393994.mp3,Not specified,Song Sparrow,...,393994,https://www.xeno-canto.org/393994,United States,Dominic Garcia-Hall,Melospiza melodia_Song Sparrow,-74.0217,Not specified,08:00,Dominic Garcia-Hall,Creative Commons Attribution-NonCommercial-Sha...
18247,5.0,no,spotow,1 (mono),2016-08-15,Not specified,35,XC391164.mp3,Not specified,Spotted Towhee,...,391164,https://www.xeno-canto.org/391164,United States,Antonio Xeira,Pipilo maculatus_Spotted Towhee,-109.3172,Not specified,08:00,Antonio Xeira,Creative Commons Attribution-NonCommercial-Sha...


In [14]:
len(train_df)

4900

In [15]:
def get_sample(filename, bird):
    wave_data, wave_rate = librosa.load(filename)
    data_point_per_second = int(sequence_length/5)
    
    #Take 10 data points every second
    prepared_sample = wave_data[0::int(wave_rate/data_point_per_second)]
    #We normalize each sample before extracting 5s samples from it
    normalized_sample = sklearn.preprocessing.minmax_scale(prepared_sample, axis=0)
    
    #only take 5s samples and add them to the dataframe
    song_sample = []
    sample_length = 5*data_point_per_second
    samples_from_file = []
    for idx in range(0,len(normalized_sample),sample_length): 
        song_sample = normalized_sample[idx:idx+sample_length]
        if len(song_sample)>=sample_length:
            samples_from_file.append({"song_sample":np.asarray(song_sample).astype(np.float32),
                                            "bird":int(ebird_to_id[bird])})
    return samples_from_file

The following cell sets all the samples from non-selected birds to the "nocall" ID code. This lets us focus on the classification of the 20 selected bird species, while all the other bird species are categorised as "nocall".
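The relabelling itself can be illustrated on a toy `DataFrame`. This is a sketch, not the notebook's own code: the `label` column and the `toy_` names are hypothetical, and it uses `Series.where`, which keeps a value where the condition holds and substitutes the fallback elsewhere.

```python
import pandas as pd

toy_selected_birds = ["warvir", "normoc"]
toy_recordings = pd.DataFrame({"ebird_code": ["warvir", "amerob", "normoc", "blujay"]})

# Keep the code of selected species, replace every other species with "nocall"
is_selected = toy_recordings["ebird_code"].isin(toy_selected_birds)
toy_recordings["label"] = toy_recordings["ebird_code"].where(is_selected, "nocall")

print(toy_recordings["label"].tolist())
```

Because every non-selected species ends up in the same class, "nocall" can become much larger than any single bird class, which is one reason class weights come into play.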

In [None]:
%%time
warnings.filterwarnings("ignore")

#We limit the number of audio files being sampled to 5000 in this notebook to save time
#However, we have already limited the number of bird species
sample_limit = 5000
sample_list = []
with tqdm(total=sample_limit) as pbar:
    for idx, row in train_df[:sample_limit].iterrows():
        pbar.update(1)
        try:
            audio_file_path = "./rawdata/train_audio/"
            audio_file_path += row.ebird_code
            
            if row.ebird_code in birds_to_recognise:
                sample_list += get_sample('{}/{}'.format(audio_file_path, row.filename), row.ebird_code)
            else:
                sample_list += get_sample('{}/{}'.format(audio_file_path, row.filename), "nocall")
        except Exception:
            #Any loading or processing error ends up here, not only corrupted files
            print("{} is corrupted".format(audio_file_path))
            
            
#Generate some fake random samples to represent "no calls", being 5% of the total samples
#The count is based on sample_list, as samples_df has not been built yet
number_of_nocalls = int(len(sample_list)*0.05)
for idx in range(0,number_of_nocalls):
    synthetic_nocall = sklearn.preprocessing.minmax_scale(np.random.randn(sequence_length), axis=0)
    sample_list += [{"song_sample":synthetic_nocall.astype(np.float32),
                    "bird":ebird_to_id["nocall"]}]
    
samples_df = pd.DataFrame(sample_list)

  0%|                                                                                 | 4/5000 [00:00<05:00, 16.64it/s]

/kaggle/input/birdsong-recognition/train_audio/whcspa is corrupted
/kaggle/input/birdsong-recognition/train_audio/comrav is corrupted
/kaggle/input/birdsong-recognition/train_audio/marwre is corrupted
/kaggle/input/birdsong-recognition/train_audio/sonspa is corrupted

[Output truncated: the progress bar advances to 234/5000 files at about 15.6 it/s, and the same "... is corrupted" message keeps being printed for further species directories under /kaggle/input/birdsong-recognition/train_audio/.]

In [None]:
samples_df = shuffle(samples_df)
samples_df[:10]

In [None]:
import matplotlib.pyplot as plt  # needed here if not already imported earlier in the notebook

plt.plot(samples_df.iloc[0].song_sample)
plt.show()

# Model creation

In [None]:
training_percentage = 0.9
training_item_count = int(len(samples_df) * training_percentage)
validation_item_count = len(samples_df) - training_item_count
training_df = samples_df[:training_item_count]
validation_df = samples_df[training_item_count:]

Just a simple LSTM-based architecture that we will be able to improve later. Besides the LSTM layers and the Dense layer, the key elements are the input layer, with 50 units representing the 50 data points of our 5 s samples, and the output layer, with one unit per bird class in our training set (+1 for "nocall").

The architecture is mostly a placeholder: as shown in this [post](https://www.kaggle.com/c/birdsong-recognition/discussion/158943) by [Nanashi](https://www.kaggle.com/jesucristo), CNN-based models may perform better on this kind of problem.

Also, I have realised that several birds can sing at the same time in our samples, which means that the output layer and the loss will have to change to allow several outputs rather than just one.

In [None]:
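from tensorflow.keras.layers import LSTM, Dense, Dropout  # in case they are not imported above

model = Sequential()
# The training data below is shaped (n, 1, sequence_length): one timestep of sequence_length features.
model.add(LSTM(32, return_sequences=True, recurrent_dropout=0.2, input_shape=(None, sequence_length)))
model.add(LSTM(32))
model.add(Dense(128))
model.add(Dropout(0.3))
# One softmax output per class in ebird_to_id.
model.add(Dense(len(ebird_to_id), activation="softmax"))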

model.summary()
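
The multi-label change mentioned above could be prototyped along the following lines. This is only a sketch under assumptions: the helper names `multi_hot` and `decode_multi_label`, the 0.5 threshold and the toy class ids are all hypothetical and not part of this notebook. On the Keras side, the usual counterpart would be a `sigmoid` output layer trained with `binary_crossentropy` instead of `softmax` with `categorical_crossentropy`.

```python
import numpy as np

def multi_hot(label_ids_per_sample, num_classes):
    """Encode a list of label-id lists as a (n_samples, num_classes) 0/1 matrix."""
    targets = np.zeros((len(label_ids_per_sample), num_classes), dtype=np.float32)
    for row, label_ids in enumerate(label_ids_per_sample):
        # An empty list leaves the row at all zeros (no bird singing).
        targets[row, np.asarray(list(label_ids), dtype=np.intp)] = 1.0
    return targets

def decode_multi_label(probabilities, threshold=0.5):
    """Return, per sample, the class ids whose independent probability exceeds the threshold."""
    return [np.flatnonzero(p > threshold).tolist() for p in probabilities]

# Toy data: sample 0 contains classes 0 and 2, sample 1 contains class 1, sample 2 is silent.
targets = multi_hot([[0, 2], [1], []], num_classes=4)

# Made-up sigmoid outputs for two samples.
decoded = decode_multi_label(np.array([[0.9, 0.1, 0.7, 0.2],
                                       [0.1, 0.2, 0.3, 0.4]]))
print(targets)
print(decoded)
```

With this encoding, a sample in which no output passes the threshold decodes to an empty list, which could be mapped to "nocall".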

In [None]:
callbacks = [ReduceLROnPlateau(monitor='val_loss', patience=2, verbose=1, factor=0.7),
             EarlyStopping(monitor='val_loss', patience=10),
             ModelCheckpoint(filepath='best_model.h5', monitor='val_loss', save_best_only=True)]
model.compile(loss="categorical_crossentropy", optimizer='adam')

This cell formats the data for the model, with the expected outputs one-hot encoded. It still needs to be cleaned up.

In [None]:
num_classes = len(ebird_to_id)

# Inputs are shaped (n_samples, 1, sequence_length); targets are one-hot encoded class ids.
X_train = np.asarray([np.asarray(x) for x in training_df["song_sample"]], dtype=np.float32).reshape(training_item_count, 1, sequence_length)
groundtruth = np.asarray(training_df["bird"].tolist(), dtype=np.float32)
Y_train = to_categorical(groundtruth, num_classes=num_classes, dtype='float32')

X_validation = np.asarray([np.asarray(x) for x in validation_df["song_sample"]], dtype=np.float32).reshape(validation_item_count, 1, sequence_length)
validation_groundtruth = np.asarray(validation_df["bird"].tolist(), dtype=np.float32)
Y_validation = to_categorical(validation_groundtruth, num_classes=num_classes, dtype='float32')
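
One possible clean-up of this formatting step is to wrap it in a small helper. The sketch below is self-contained and uses NumPy only: `build_inputs` and the toy data are hypothetical, and indexing `np.eye` stands in for `to_categorical`.

```python
import numpy as np

def build_inputs(samples, labels, num_classes):
    """Stack equal-length samples into (n, 1, sequence_length) and one-hot encode the labels."""
    x = np.stack([np.asarray(s, dtype=np.float32) for s in samples])
    x = x.reshape(len(samples), 1, -1)
    # Row i of the identity matrix is the one-hot vector for class i.
    y = np.eye(num_classes, dtype=np.float32)[np.asarray(labels, dtype=np.intp)]
    return x, y

# Three toy samples of 50 data points each (5 s at 10 points per second).
toy_samples = [np.linspace(0.0, 1.0, 50) for _ in range(3)]
x, y = build_inputs(toy_samples, labels=[0, 2, 1], num_classes=3)
print(x.shape, y.shape)
```

The same helper could then be called once for the training split and once for the validation split.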

In [None]:
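# compute_class_weight returns the weights in the order of `classes`, so pair them explicitly
# rather than indexing the weight array by class id.
classes = np.unique(samples_df.bird.values)
class_weights = class_weight.compute_class_weight(class_weight="balanced", classes=classes, y=samples_df.bird.values)
class_weights_dict = {int(c): float(w) for c, w in zip(classes, class_weights)}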
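
For reference, the "balanced" heuristic gives each class the weight `n_samples / (n_classes * count(class))`, so rare classes weigh more in the loss. The NumPy sketch below reproduces that formula on toy labels; `balanced_class_weights` is a hypothetical helper written for illustration, not the scikit-learn function.

```python
import numpy as np

def balanced_class_weights(labels):
    """Return {class_id: weight} with weight = n_samples / (n_classes * class_count)."""
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    weights = labels.size / (classes.size * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}

# Toy labels: class 0 appears three times, class 1 once, class 2 twice.
toy_weights = balanced_class_weights([0, 0, 0, 1, 2, 2])
print(toy_weights)
```

Here the rarest class (1) receives the largest weight and the most frequent class (0) the smallest.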

# Model training

In [None]:
history = model.fit(X_train, Y_train,
          epochs = 200, 
          batch_size = 32, 
          validation_data=(X_validation, Y_validation),
          class_weight=class_weights_dict,
          callbacks=callbacks)

In [None]:
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Loss over epochs')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='best')
plt.show()

Running predictions on our validation set just to check if our model displays any anomalies.

In [None]:
preds = model.predict(X_validation)

# Compare predicted and true species for the first 30 validation samples.
# Rows are collected in a list because DataFrame.append was removed in pandas 2.0.
rows = [{"prediction": id_to_ebird[np.argmax(pred)],
         "groundtruth": id_to_ebird[np.argmax(truth)],
         "correct_prediction": np.argmax(pred) == np.argmax(truth)}
        for pred, truth in zip(preds[:30], Y_validation[:30])]
validation_results_df = pd.DataFrame(rows, columns=["prediction", "groundtruth", "correct_prediction"])
validation_results_df
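
A quick way to spot anomalies, such as the model collapsing onto a single class, is to compare how often each class is predicted with how often it actually occurs. This is a minimal NumPy sketch with a hypothetical helper, `prediction_summary`, and made-up probabilities rather than real model outputs.

```python
import numpy as np

def prediction_summary(probabilities, one_hot_truth):
    """Summarise accuracy and the per-class counts of predicted and true labels."""
    predicted = np.argmax(probabilities, axis=1)
    truth = np.argmax(one_hot_truth, axis=1)
    num_classes = one_hot_truth.shape[1]
    return {
        "accuracy": float(np.mean(predicted == truth)),
        "predicted_counts": np.bincount(predicted, minlength=num_classes).tolist(),
        "true_counts": np.bincount(truth, minlength=num_classes).tolist(),
    }

# Made-up outputs for four samples and three classes.
toy_probs = np.array([[0.7, 0.2, 0.1],
                      [0.6, 0.3, 0.1],
                      [0.5, 0.1, 0.4],
                      [0.2, 0.2, 0.6]])
toy_truth = np.eye(3)[[0, 1, 2, 2]]
summary = prediction_summary(toy_probs, toy_truth)
print(summary)
```

If `predicted_counts` is concentrated on one class while `true_counts` is spread out, the model is probably not learning to separate the species.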

# Predictions

We load the weights for the best-performing model on our validation set.

In [None]:
model.load_weights("best_model.h5")

As with the training samples, we load each audio file and predict on 5-second sequences. The prediction function avoids reloading the .mp3 audio file for every row, as that would significantly increase the processing time. It then writes each prediction to the `birds` column of the DataFrame it receives, which is used to generate the submission file.

In [None]:
import sklearn.preprocessing  # makes sklearn.preprocessing.minmax_scale available

def predict_submission(df, audio_file_path):
    previous_filename = ""
    data_point_per_second = 10
    sample_length = 5 * data_point_per_second
    sample = None

    for idx, row in df.iterrows():
        # Some files may be corrupted, so any failure falls back to "nocall".
        try:
            # Only reload the audio when the row refers to a new file.
            if previous_filename != row.audio_id:
                filename = '{}/{}.mp3'.format(audio_file_path, row.audio_id)
                wave_data, wave_rate = librosa.load(filename)
                # Downsample to data_point_per_second points per second, then scale to [0, 1].
                prepared_sample = wave_data[0::int(wave_rate / data_point_per_second)]
                sample = sklearn.preprocessing.minmax_scale(prepared_sample, axis=0)
            previous_filename = row.audio_id

            # The "site" column tells us whether we are running the test set or the examples.
            if "site" in df.columns:
                if row.site == "site_1" or row.site == "site_2":
                    song_sample = np.array(sample[int(row.seconds - 5) * data_point_per_second:int(row.seconds) * data_point_per_second])
                elif row.site == "site_3":
                    # For now, only the first 5 s of site_3 samples are used, as they are labelled at file level.
                    song_sample = np.array(sample[0:sample_length])
            else:
                # Same as the first condition, kept separate for later; used for the example file.
                song_sample = np.array(sample[int(row.seconds - 5) * data_point_per_second:int(row.seconds) * data_point_per_second])

            input_data = np.reshape(np.asarray([song_sample]), (1, sequence_length)).astype(np.float32)
            prediction = model.predict(np.array([input_data]))

            # Only accept the prediction if at least one output is activated with "some" confidence.
            if np.max(prediction[0]) > 0.3:
                df.at[idx, "birds"] = id_to_ebird[np.argmax(prediction)]
            else:
                df.at[idx, "birds"] = "nocall"
        except Exception:
            df.at[idx, "birds"] = "nocall"
    return df
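
The two key steps of this function, slicing the 5-second window that ends at `seconds` and falling back to "nocall" when no output is confident enough, can be isolated and tested without audio files or a model. The sketch below assumes the 10 data points per second and the 0.3 threshold used above. The helper names `window_for_row` and `label_from_prediction` and the toy `toy_id_to_ebird` mapping are hypothetical.

```python
import numpy as np

def window_for_row(sample, seconds, points_per_second=10, window_seconds=5):
    """Return the slice of `sample` covering the window that ends at `seconds`."""
    end = int(seconds) * points_per_second
    start = int(seconds - window_seconds) * points_per_second
    return np.asarray(sample[start:end])

def label_from_prediction(probabilities, id_to_label, threshold=0.3):
    """Return the most likely label, or "nocall" if no output exceeds the threshold."""
    probabilities = np.asarray(probabilities)
    if np.max(probabilities) > threshold:
        return id_to_label[int(np.argmax(probabilities))]
    return "nocall"

toy_signal = np.arange(300)  # stands in for 30 s of downsampled audio
toy_id_to_ebird = {0: "amered", 1: "sonspa", 2: "barswa", 3: "nocall"}  # toy mapping

window = window_for_row(toy_signal, seconds=10)
confident = label_from_prediction([0.1, 0.7, 0.1, 0.1], toy_id_to_ebird)
uncertain = label_from_prediction([0.25, 0.25, 0.25, 0.25], toy_id_to_ebird)
print(len(window), confident, uncertain)
```

Keeping these steps in separate functions would also make it easier to tune the threshold, or to handle `site_3` files with more than their first 5 seconds.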

Below, we test our prediction function on the examples provided.

In [None]:
audio_file_path = "/kaggle/input/birdsong-recognition/example_test_audio"
example_df = pd.read_csv("/kaggle/input/birdsong-recognition/example_test_audio_summary.csv")
# Adjusting the example filenames and creating the audio_id column to match the test file.
example_df["audio_id"] = [ "BLKFR-10-CPL_20190611_093000.pt540" if filename=="BLKFR-10-CPL" else "ORANGE-7-CAP_20190606_093000.pt623" for filename in example_df["filename"]]

if os.path.exists(audio_file_path):
    example_df = predict_submission(example_df, audio_file_path)
example_df

In [None]:
test_file_path = "/kaggle/input/birdsong-recognition/test_audio"
test_df = pd.read_csv("/kaggle/input/birdsong-recognition/test.csv")
submission_df = pd.read_csv("/kaggle/input/birdsong-recognition/sample_submission.csv")

if os.path.exists(test_file_path):
    submission_df = predict_submission(test_df, test_file_path)

submission_df[["row_id","birds"]].to_csv('submission.csv', index=False)
submission_df.head()

### Thanks for reading this notebook! If you found this notebook helpful, please give it an upvote. It is always greatly appreciated!