## <font size='5' color='red'>Objective</font>
In this competition, researchers from the Cornell Lab of Ornithology’s Center for Conservation Bioacoustics (CCB) want the Kaggle community to help them build an AI solution that identifies bird species from recordings of their calls. Birds are excellent indicators of deteriorating habitat quality and environmental pollution. If successful, your work will help researchers better understand changes in habitat quality, levels of pollution, and the effectiveness of restoration efforts.

![](https://i.ytimg.com/vi/0LJY0a1dmhg/maxresdefault.jpg)

## <font size='5' color='blue'>Contents</font> 


* [Basic Exploratory Data analysis](#1)  
    * [Getting started]()
    * [Bird species in data]()
    * [Countries from which samples are taken]()
    * [Dates on which samples are collected]()
    * [Popular time of the day]()
    * [Duration of samples]()
    * [Pitch and Volume]()
    * [Sampling rate]()
    * [Channels]()
 
 
* [Audio Data analysis](#2)   
     * [Playing audio]()
     * [Visualizing audio in 2D]()
     * [Spectrogram analysis]()
 
 
* [Feature Extraction](#3)    
     * [Spectral Centroid]()
     * [Spectral Bandwidth]()
     * [Spectral Rolloff]()
     * [Zero-Crossing Rate]()
     * [Mel-Frequency Cepstral Coefficients(MFCCs)]()
     * [Chroma feature]()
     
* [Compare sound features](#4)

## <font size='4' color='red'>Importing Libraries</font><a id='1'></a>


In [None]:
!pip install librosa

In [None]:
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import matplotlib.pyplot as plt
import IPython.display as ipd
import plotly.express as px
import librosa.display
import pandas as pd
import numpy as np
import librosa
import warnings
import IPython
import os

plt.style.use("ggplot")

In [None]:
warnings.filterwarnings(action='ignore')

## <font size='4' color='red'>Getting a Basic Idea</font><a id='2'></a>


In [None]:

train = pd.read_csv("../input/birdsong-recognition/train.csv")

In [None]:
train.info()

In [None]:
train.head(3)

In [None]:
print("Train dataset has {} rows and {} columns".format(*train.shape))

## <font size='4' color='red'>Bird Species</font>

Bird species name (target column)

In [None]:
print("There are {} unique species of birds in train dataset".format(train.species.nunique()))

In [None]:
species=train.species.value_counts()

In [None]:

fig = go.Figure(data=[
    go.Bar(y=species.values, x=species.index,marker_color='deeppink')
])

fig.update_layout(title='Distribution of Bird Species')
fig.show()


- Almost half of the species have exactly 100 samples each.
- `Redhead` has the fewest samples, only 9.

## <font size='4' color='red'>Country</font>

Country in which the observation is made

In [None]:
country = train.country.value_counts()[:20]
fig = go.Figure(data=[
    go.Bar(x=country.index, y=country.values,marker_color='deeppink')
])

fig.update_layout(title='Countries from which data is obtained')
fig.show()

- Most samples were recorded in the USA.
- North American countries dominate the list.

## <font size='4' color='red'>Date</font>
Date on which the observation was made

In [None]:
plt.figure(figsize=(12, 8))
train['date'].value_counts().sort_index().plot(color='pink',alpha=1)

- Dates range from 1992 to 2019.
- Most samples were taken between 2013 and 2015.

## <font size='4' color='red'>Time</font>
Time of day at which the observation was made (24-hour format)

In [None]:


hist_data = pd.to_datetime(train.time,errors='coerce').dropna().dt.hour.values.tolist()
fig = go.Figure(data=[go.Histogram(x=hist_data, histnorm='probability',marker_color='deeppink')])
fig.update_layout(title='Time of the day at which data is obtained')

fig.show()



- 8:00 am seems to be the peak time of observation.
- Most samples were recorded in the morning.

## <font size='4' color='red'>Duration</font>
Duration of the observation



In [None]:


hist_data = train.duration.values.tolist()
fig = go.Figure(data=[go.Histogram(x=hist_data,marker_color='deeppink')])
fig.update_layout(title='Duration of the observation')

fig.show()


## <font size='4' color='red'>Rating</font>
Rating given to the observation (0-5)

In [None]:

hist_data = train.rating.values.tolist()
fig = go.Figure(data=[go.Histogram(x=hist_data,marker_color='deeppink')])
fig.update_layout(title='Rating of the observation')

fig.show()


## <font size='4' color='red'>Bird Seen</font>
If the bird was seen during the recording.

In [None]:
colors = ['gold', 'mediumturquoise', 'darkorange', 'lightgreen']
df = train.bird_seen.value_counts()
fig = px.pie(df,df.index,df.values,labels={'index':'Bird Seen'})
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=20,
                  marker=dict(colors=colors, line=dict(color='#000000', width=2)))
fig.update_layout(title='Bird Seen')

fig.show()

- About 80% of the time, the bird was seen while the audio was being recorded.

## <font size='4' color='red'>Pitch and Volume</font>
Pitch and Volume of the recording

In [None]:

fig = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'}, {'type':'domain'}]])
df = train.volume.value_counts()
fig.add_trace(go.Pie(labels=df.index, values=df.values, name="Volume"),
              1, 1)

df = train.pitch.value_counts()
fig.add_trace(go.Pie(labels=df.index ,values=df.values, name="Pitch"),
              1, 2)

# Use `hole` to create a donut-like pie chart
fig.update_traces(hole=.4, hoverinfo="label+percent+name")
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=20,
                  marker=dict(colors=colors, line=dict(color='#000000', width=2)))

fig.update_layout(
    title_text="Volume and Pitch of Observation",
    # Add annotations in the center of the donut pies.
    annotations=[dict(text='Volume', x=0.18, y=0.5, font_size=20, showarrow=False),
                 dict(text='Pitch', x=0.82, y=0.5, font_size=20, showarrow=False)])
fig.show()

## <font size='4' color='red'>Sampling rate</font>
Sampling rate, or sampling frequency, is the number of audio samples recorded per second.
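
As a small illustration of what the sampling rate implies, the sketch below works out how many samples a clip contains and the highest frequency it can represent (the Nyquist frequency, half the sampling rate). The rate and duration are assumed example values, not figures taken from the dataset.

```python
# Assumed example values, not taken from the dataset.
sampling_rate_hz = 44_100   # samples per second
duration_s = 2.5            # clip length in seconds

# Number of samples stored for a clip of this length.
num_samples = int(sampling_rate_hz * duration_s)

# Nyquist frequency: the highest frequency this sampling rate can represent.
nyquist_hz = sampling_rate_hz / 2

print(num_samples, nyquist_hz)
```

A higher sampling rate therefore means both more data per second and a wider range of frequencies that can be captured.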


In [None]:
rec = train.sampling_rate.value_counts()
fig = go.Figure(data=[
    go.Bar(x=rec.index, y=rec.values,marker_color='deeppink')
])

fig.update_layout(title='Sampling rate of the recordings')
fig.show()

- 44.1 kHz is the most common sampling rate.

## <font size='3' color='red'>Channel</font>
A channel is the path along which a signal is transported. One channel is usually referred to as mono, while more channels can indicate stereo, surround sound and the like.
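
Multi-channel audio is often reduced to a single channel before analysis; `librosa.load`, used later in this notebook, does this by default (`mono=True`). A minimal sketch of such a downmix, assuming simple averaging and a hypothetical helper name `to_mono`:

```python
def to_mono(channels):
    """Average equal-length channels sample by sample into one mono signal."""
    return [sum(frame) / len(frame) for frame in zip(*channels)]

# Toy stereo signal with three samples per channel (made-up values).
left = [0.2, 0.4, -0.6]
right = [0.0, 0.2, -0.2]

mono = to_mono([left, right])
print(mono)
```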

In [None]:
rec = train.channels.value_counts()
fig = go.Figure(data=[
    go.Bar(x=rec.index, y=rec.values,marker_color='deeppink')
])

fig.update_layout(title='Number of channels')
fig.show()

## <font size='4' color='red'>Length</font>
Length of the audio signal

In [None]:
df=train.length.value_counts()
fig = px.pie(df,df.index,df.values,labels={'index':'length of audio'})
fig.update_layout(title='Length of audio signal')
fig.update_traces(hoverinfo='label+percent', textinfo='value', textfont_size=20,
                  marker=dict(colors=colors, line=dict(color='#000000', width=2)))

fig.show()

## <font size='4' color='red'>Geographical Analysis</font>

In [None]:
df=train.groupby(['latitude','longitude'],as_index=False)['ebird_code'].agg('count')

In [None]:
df=df[df.latitude!='Not specified']
fig = go.Figure()
fig.add_trace(go.Scattergeo(
        lon = df['longitude'],
        lat = df['latitude'],
        text = df['ebird_code'],
        marker = dict(
            size = df['ebird_code'],
            line_color='rgb(40,40,40)',
            line_width=0.5,
            sizemode = 'area'
        )))


fig.update_layout(
        title_text = 'Bird Samples collected From Parts of World',
        showlegend = True,
        geo = dict(
            landcolor = 'rgb(217, 217, 217)',
        )
    )

fig.show()


- Most of the bird samples were collected in the USA, so let's have a look at the US states.

### <font size='3' color='red'>Samples from USA</font>

In [None]:
fig = go.Figure()
fig.add_trace(go.Scattergeo(
        locationmode = 'USA-states',
        lon = df['longitude'],
        lat = df['latitude'],
        text = df['ebird_code'],
        marker = dict(
            size = df['ebird_code'],
            line_color='rgb(40,40,40)',
            line_width=0.5,
            sizemode = 'area'
        )))


fig.update_layout(
        title_text = 'Bird Samples collected From USA',
        showlegend = True,
        geo = dict(
            scope = 'usa',
            landcolor = 'rgb(217, 217, 217)',
        )
    )

fig.show()


## <font size='4' color='red'>Playing some audio</font><a id='2'></a>

In [None]:
path="../input/birdsong-recognition/train_audio/"
birds=train.ebird_code.unique()[:6]
file=train[train.ebird_code==birds[0]]['filename'].values[0]

In [None]:

for i in range(0,2):
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    print(birds[i])
    IPython.display.display(ipd.Audio(audio_path))


## <font size='4' color='red'>Visualizing Audio</font>

In this section, we will visualize our audio signals as 2D plots of amplitude over time.

In [None]:

plt.figure(figsize=(17,20 ))


for i in range(0,6):
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(6,2,i+1)
    x , sr = librosa.load(audio_path)
    librosa.display.waveplot(x, sr=sr,color='r')
    plt.gca().set_title(birds[i])
    plt.gca().get_xaxis().set_visible(False)



## <font size='4' color='red'>Spectrogram</font>

A spectrogram is a visual way of representing the strength, or “loudness”, of a signal over time at the various frequencies present in a waveform. Not only can one see whether there is more or less energy at, for example, 2 Hz vs 10 Hz, but one can also see how energy levels vary over time. A spectrogram is usually depicted as a heat map.
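
To make the idea concrete, here is a dependency-free sketch of a spectrogram: the signal is cut into frames and a naive DFT is taken of each one. It is only an illustration under simplifying assumptions (toy sampling rate, no window function, non-overlapping frames); `librosa.stft` in the cell below is the practical tool. The helper names are hypothetical.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of a naive DFT."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(signal, frame_size, hop):
    """One magnitude spectrum per frame; time runs along the outer list."""
    return [
        dft_magnitudes(signal[start:start + frame_size])
        for start in range(0, len(signal) - frame_size + 1, hop)
    ]

sr = 64  # assumed toy sampling rate in Hz
# One second of a 4 Hz tone followed by one second of a 16 Hz tone.
signal = [math.sin(2 * math.pi * 4 * n / sr) for n in range(sr)]
signal += [math.sin(2 * math.pi * 16 * n / sr) for n in range(sr)]

# With frame_size == sr, bin k corresponds to k Hz.
spec = spectrogram(signal, frame_size=sr, hop=sr)
peak_bins = [frame.index(max(frame)) for frame in spec]
print(peak_bins)
```

The strongest bin moves from 4 Hz in the first frame to 16 Hz in the second, which is exactly the change over time that a spectrogram makes visible.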

In [None]:
plt.figure(figsize=(17,20 ))


for i in range(0,6):
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(6,2,i+1)
    x , sr = librosa.load(audio_path)
    x = librosa.stft(x)
    Xdb = librosa.amplitude_to_db(abs(x))
    librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')
    plt.gca().set_title(birds[i])
    plt.gca().get_xaxis().set_visible(False)
    plt.colorbar()

## <font size='5' color='blue'>Feature extraction from Audio signal</font><a id='3'></a>


Spectral (frequency-based) features are obtained by converting the time-domain signal into the frequency domain using the Fourier transform. Examples include the fundamental frequency, frequency components, spectral centroid, spectral flux, spectral density and spectral roll-off.


### <font size='4' color='red'>Spectral Centroid</font>

The spectral centroid indicates the frequency around which the energy of a spectrum is centered; in other words, it indicates where the "center of mass" of a sound is located. It is computed as a weighted mean:
![](https://miro.medium.com/max/710/1*DkT47WzLrjigT_KVhDoMuQ.png)

where S(k) is the spectral magnitude at frequency bin k, f(k) is the frequency at bin k.
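
The weighted mean can be sketched for a single frame in a few lines of plain Python. The magnitudes and frequencies below are made-up toy values, and `spectral_centroid` here is an illustrative helper, not the librosa function of the same name.

```python
def spectral_centroid(magnitudes, freqs):
    """Magnitude-weighted mean frequency of one spectrum frame."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    return sum(f * s for f, s in zip(freqs, magnitudes)) / total

freqs = [100.0, 200.0, 300.0, 400.0]   # f(k), in Hz
mags = [1.0, 1.0, 0.0, 2.0]            # S(k)

centroid = spectral_centroid(mags, freqs)
print(centroid)
```

Because half of the total magnitude sits at 400 Hz, the centroid is pulled towards the upper end of the range.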

In [None]:
from sklearn.preprocessing import minmax_scale

# Scale a feature to the range [0, 1] so it can be overlaid on the waveform
def normalize(x, axis=0):
    return minmax_scale(x, axis=axis)

In [None]:

plt.figure(figsize=(17,20 ))


for i in range(0,6):
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(6,2,i+1)
    x , sr = librosa.load(audio_path)
    spectral_centroids = librosa.feature.spectral_centroid(x, sr=sr)[0]
    frames = range(len(spectral_centroids))
    t = librosa.frames_to_time(frames)
    librosa.display.waveplot(x, sr=sr, alpha=0.4)
    plt.plot(t, normalize(spectral_centroids), color='b')
    plt.gca().set_title(birds[i])
    plt.gca().get_xaxis().set_visible(False)
    



### <font size='4' color='red'>Spectral Rolloff</font>
It is a measure of the shape of the signal's spectrum. The roll-off frequency is the frequency below which a given fraction of the total spectral energy lies (85% by default in librosa), so it marks where the high-frequency content tails off.
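
A minimal sketch of the roll-off computation for one frame, assuming a non-negative spectrum and its bin frequencies are already available. The values are made up, and the helper is illustrative rather than librosa's implementation.

```python
def spectral_rolloff(spectrum, freqs, roll_percent=0.85):
    """Lowest bin frequency at which the cumulative energy reaches roll_percent."""
    threshold = roll_percent * sum(spectrum)
    cumulative = 0.0
    for f, s in zip(freqs, spectrum):
        cumulative += s
        if cumulative >= threshold:
            return f
    return freqs[-1]

freqs = [0.0, 1000.0, 2000.0, 3000.0, 4000.0]   # bin frequencies in Hz
spectrum = [4.0, 3.0, 2.0, 0.5, 0.5]            # energy per bin (toy values)

rolloff = spectral_rolloff(spectrum, freqs)
print(rolloff)
```

Here the first three bins already hold 90% of the energy, so the 85% roll-off lands on the 2000 Hz bin.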

In [None]:
plt.figure(figsize=(17,20 ))


for i in range(0,6):
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(6,3,i+1)
    x , sr = librosa.load(audio_path)
    spectral_centroids = librosa.feature.spectral_centroid(x, sr=sr)[0]
    frames = range(len(spectral_centroids))
    t = librosa.frames_to_time(frames)
    spectral_rolloff = librosa.feature.spectral_rolloff(x+0.01, sr=sr)[0]
    librosa.display.waveplot(x, sr=sr, alpha=0.4)
    plt.plot(t, normalize(spectral_rolloff), color='r')
    plt.gca().set_title(birds[i])
    plt.gca().get_xaxis().set_visible(False)
    

###  <font size='4' color='red'>Spectral Bandwidth</font>
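In optics, the spectral bandwidth is defined as the width of the band of light at one-half the peak maximum (full width at half maximum, FWHM), shown in the figure by the two vertical red lines and λSB on the wavelength axis. For audio, the same idea is applied to the frequency spectrum: the spectral bandwidth measures how widely the energy is spread around the spectral centroid, and the parameter `p` in the code below sets the order of that spread.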
![](https://miro.medium.com/max/1030/1*oUtYY0-j6iEc78Dew3d0uA.png)
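
For audio, `librosa.feature.spectral_bandwidth` computes a p-th order spread of the spectrum around its centroid. The sketch below follows that general form for a single frame with normalized magnitudes; it uses toy values and an illustrative helper, and is not a reproduction of librosa's code.

```python
def spectral_bandwidth(magnitudes, freqs, p=2):
    """p-th order spread of a spectrum frame around its centroid."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame
    weights = [s / total for s in magnitudes]          # normalize to sum to 1
    centroid = sum(f * w for f, w in zip(freqs, weights))
    spread = sum(w * abs(f - centroid) ** p for f, w in zip(freqs, weights))
    return spread ** (1.0 / p)

freqs = [100.0, 200.0, 300.0]
narrow = spectral_bandwidth([0.0, 1.0, 0.0], freqs)   # all energy in one bin
wide = spectral_bandwidth([1.0, 0.0, 1.0], freqs)     # energy at both edges
print(narrow, wide)
```

Both toy spectra share the same centroid (200 Hz), yet their bandwidths differ, which is why the two features are usually looked at together.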

In [None]:

plt.figure(figsize=(17,20 ))


for i in range(0,6):
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(6,3,i+1)
    x , sr = librosa.load(audio_path)
    spectral_centroids = librosa.feature.spectral_centroid(x, sr=sr)[0]
    frames = range(len(spectral_centroids))
    t = librosa.frames_to_time(frames)
    spectral_bandwidth_2 = librosa.feature.spectral_bandwidth(x+0.01, sr=sr)[0]
    spectral_bandwidth_3 = librosa.feature.spectral_bandwidth(x+0.01, sr=sr, p=3)[0]
    spectral_bandwidth_4 = librosa.feature.spectral_bandwidth(x+0.01, sr=sr, p=4)[0]
    librosa.display.waveplot(x, sr=sr, alpha=0.4)
    plt.plot(t, normalize(spectral_bandwidth_2), color='r')
    plt.plot(t, normalize(spectral_bandwidth_3), color='g')
    plt.plot(t, normalize(spectral_bandwidth_4), color='y')
    plt.gca().set_title(birds[i])
    plt.gca().get_xaxis().set_visible(False)
    plt.legend(('p = 2', 'p = 3', 'p = 4'))

### <font size='4' color='red'>Zero-Crossing Rate</font>
A very simple way to measure the smoothness of a signal is to count the number of zero crossings within a segment of that signal. A voiced sound oscillates slowly (for example, a 100 Hz tone crosses zero about 200 times per second, twice per cycle), whereas an unvoiced fricative can have 3000 zero crossings per second.
![](https://miro.medium.com/max/1400/1*E_XSqizmLNksjknrD8oV2w.png)
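
Counting zero crossings amounts to counting sign changes between consecutive samples. The sketch below does this for one second of a synthetic 100 Hz sine, which changes sign twice per cycle; the sampling rate and the small phase offset are assumptions chosen so that no sample lands exactly on zero.

```python
import math

def count_zero_crossings(samples):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))

sr = 8000     # assumed sampling rate in Hz
freq = 100    # tone frequency in Hz

# One second of a sine wave, shifted by a quarter of a sample.
signal = [math.sin(2 * math.pi * freq * (n + 0.25) / sr) for n in range(sr)]

crossings = count_zero_crossings(signal)
print(crossings)
```

The count comes out just under 200 because the one-second window ends right before the final crossing.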

In [None]:
x , sr = librosa.load(audio_path)
plt.figure(figsize=(14, 5))
librosa.display.waveplot(x, sr=sr)
# Zooming in
n0 = 9000
n1 = 9100
plt.figure(figsize=(14, 5))
plt.plot(x[n0:n1])
plt.grid()

Zooming in...

In [None]:
zero_crossings = librosa.zero_crossings(x[n0:n1], pad=False)
print(sum(zero_crossings))

There are seven points at which the wave crosses zero.

### <font size='4' color='red'>Mel-Frequency Cepstral Coefficients(MFCCs)</font>
The Mel-frequency cepstral coefficients (MFCCs) of a signal are a small set of features (usually about 10–20) which concisely describe the overall shape of a spectral envelope. They are widely used to model the characteristics of the human voice.
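
MFCCs are built on the mel scale, which spaces frequency bands roughly the way pitch is perceived: narrow bands at low frequencies and wide bands at high ones. The sketch below shows only that first ingredient, using the HTK-style conversion formula as an assumption (librosa's default mel scale is a slightly different, Slaney-style variant). The helper names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """HTK-style Hz to mel conversion."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(f_min, f_max, n_bands):
    """Band edges in Hz that are equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]

edges = mel_band_edges(0.0, 8000.0, n_bands=10)
widths = [b - a for a, b in zip(edges, edges[1:])]
print([round(e) for e in edges])
```

In a full MFCC pipeline, the spectrum's energy is summed within such bands, the logarithm is taken, and a discrete cosine transform compresses the result into a handful of coefficients.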


In [None]:
plt.figure(figsize=(17, 20))

for i in range(0,6):
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(6,2,i+1)
    x , sr = librosa.load(audio_path)
    mfccs = librosa.feature.mfcc(x, sr=sr)
    librosa.display.specshow(mfccs, sr=sr, x_axis='time')
    plt.gca().set_title(birds[i])
    plt.gca().get_xaxis().set_visible(False)
    

### <font size='4' color='red'>Chroma feature</font>
A chroma feature or vector is typically a 12-element feature vector indicating how much energy of each pitch class, {C, C#, D, D#, E, …, B}, is present in the signal. In short, it provides a robust way to describe similarity between pieces of music.
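
The core of a chroma feature is folding every frequency onto one of the 12 pitch classes, regardless of octave. A minimal sketch, assuming equal temperament with A4 = 440 Hz and a hard assignment of each frequency to its nearest note (`librosa.feature.chroma_stft` uses a smoother filter bank instead); the helper names are hypothetical.

```python
import math

PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def pitch_class(freq_hz, a4_hz=440.0):
    """Nearest equal-tempered pitch class for a frequency in Hz."""
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)   # MIDI note number, A4 = 69
    return PITCH_CLASSES[round(midi) % 12]

def chroma_vector(freqs, magnitudes):
    """Sum the magnitude of each frequency into its pitch class."""
    chroma = dict.fromkeys(PITCH_CLASSES, 0.0)
    for f, s in zip(freqs, magnitudes):
        if f > 0:
            chroma[pitch_class(f)] += s
    return chroma

# Toy spectrum: two A's an octave apart and one C (made-up magnitudes).
chroma = chroma_vector([220.0, 440.0, 261.63], [1.0, 2.0, 0.5])
print(chroma['A'], chroma['C'])
```

The two A's land in the same bin even though they are an octave apart, which is what makes chroma insensitive to octave.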



In [None]:
plt.figure(figsize=(17, 20))

for i in range(0,6):
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(6,3,i+1)
    x , sr = librosa.load(audio_path)
    chromagram = librosa.feature.chroma_stft(x, sr=sr)
    librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', cmap='coolwarm')
    plt.gca().set_title(birds[i])
    plt.gca().get_xaxis().set_visible(False)
    

## <font size='4' color='blue'>Compare features for Species</font><a id='4'></a>

In [None]:
fig=plt.figure(figsize=(15,15))
k=1
for i in range(5):
    
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(5,3,k)
    k+=1
    x , sr = librosa.load(audio_path)
    # Compute the centroid here so it can be drawn in the panel titled 'Spectral Centroid'
    spectral_centroids = librosa.feature.spectral_centroid(x, sr=sr)[0]
    frames = range(len(spectral_centroids))
    t = librosa.frames_to_time(frames)
    librosa.display.waveplot(x, sr=sr, alpha=0.4)
    plt.plot(t, normalize(spectral_centroids), color='b')
    plt.gca().set_title('Spectral Centroid')
    plt.gca().set_ylabel(birds[i])
    plt.gca().get_xaxis().set_visible(False)

    plt.subplot(5,3,k)
    k+=1
    spectral_rolloff = librosa.feature.spectral_rolloff(x+0.01, sr=sr)[0]
    librosa.display.waveplot(x, sr=sr, alpha=0.4)
    plt.plot(t, normalize(spectral_rolloff), color='r')
    plt.gca().set_title('Spectral Rolloff ')
    plt.gca().get_xaxis().set_visible(False)

    plt.subplot(5,3,k)
    k+=1
    #spectral_centroids = librosa.feature.spectral_centroid(x, sr=sr)[0]
    #frames = range(len(spectral_centroids))
    #t = librosa.frames_to_time(frames)
    spectral_bandwidth_2 = librosa.feature.spectral_bandwidth(x+0.01, sr=sr)[0]
    spectral_bandwidth_3 = librosa.feature.spectral_bandwidth(x+0.01, sr=sr, p=3)[0]
    spectral_bandwidth_4 = librosa.feature.spectral_bandwidth(x+0.01, sr=sr, p=4)[0]
    librosa.display.waveplot(x, sr=sr, alpha=0.4)
    plt.plot(t, normalize(spectral_bandwidth_2), color='r')
    plt.plot(t, normalize(spectral_bandwidth_3), color='g')
    plt.plot(t, normalize(spectral_bandwidth_4), color='y')
    plt.gca().set_title('Spectral Bandwidth')
    plt.gca().get_xaxis().set_visible(False)
    plt.legend(('p = 2', 'p = 3', 'p = 4'))

    
#plt.gca().set_title('Comparing audio features for bird species')
plt.tight_layout()
plt.show()

Now, let's compare the spectrogram, MFCC features and chroma features for some bird species.

In [None]:
fig=plt.figure(figsize=(15,15))
k=1
for i in range(5):
    
    file=train[train.ebird_code==birds[i]]['filename'].values[0]
    audio_path=os.path.join(path,birds[i],file)
    plt.subplot(5,3,k)
    k+=1
    x , sr = librosa.load(audio_path)
    s = librosa.stft(x)
    Xdb = librosa.amplitude_to_db(abs(s))
    librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')
    plt.gca().set_title('Spectrogram')
    plt.gca().set_ylabel(birds[i])
    plt.gca().get_xaxis().set_visible(False)

    plt.subplot(5,3,k)
    k+=1
    mfccs = librosa.feature.mfcc(x, sr=sr)
    librosa.display.specshow(mfccs, sr=sr, x_axis='time')
    plt.gca().set_title('MFCC features')
    plt.gca().get_xaxis().set_visible(False)

    plt.subplot(5,3,k)
    k+=1
    chromagram = librosa.feature.chroma_stft(x, sr=sr)
    librosa.display.specshow(chromagram, x_axis='time', y_axis='chroma', cmap='coolwarm')
    plt.gca().set_title('Chroma feature')
    plt.gca().get_xaxis().set_visible(False)
  

    
#fig.suptitle('Comparing audio features for bird species')
plt.tight_layout()
plt.show()

