This notebook records how I generate the training data for the CF call classifier. The classifier takes 200 millisecond audio snippets recorded at 250 kHz and outputs two things: the horseshoe bat group size label and the call types detected. The group size label refers to the number of horseshoe bats in the snippet ('none', 'single' or 'multi'). The call type refers to the species or species group detected in the snippet: 'ferrumequinum', 'BlEuMe' or 'Myotis'.

### Audio snippet:

The audio snippet length is set to 200 milliseconds because this is the shortest duration that can be used to reliably classify a scenario. A longer window risks 'averaging out' changes in behaviour, while a shorter one may not contain enough calls to give a proper picture. The exact value of 200 milliseconds was chosen based on Neetash (N) saying that many flight events can occur within a couple of video frames at 22 Hz, which corresponds to about 100 milliseconds. I chose 200 milliseconds to match the video time frame and to be able to fit at least a couple of long *R. ferrumequinum* calls into the window.
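As a quick sanity check on these numbers, here is a minimal sketch of the arithmetic. The sampling rate, snippet length and video frame rate are taken from the text above; the variable names are illustrative and not part of the actual pipeline.

```python
# Values taken from the text; the names are illustrative only.
SAMPLING_RATE_HZ = 250_000    # audio sampling rate
SNIPPET_DURATION_S = 0.2      # snippet length
VIDEO_FRAME_RATE_HZ = 22      # video frame rate

# Number of audio samples in one snippet.
samples_per_snippet = round(SAMPLING_RATE_HZ * SNIPPET_DURATION_S)

# Duration of two video frames, and number of frames spanned by one snippet.
two_frames_ms = 2 * 1000 / VIDEO_FRAME_RATE_HZ
frames_per_snippet = SNIPPET_DURATION_S * VIDEO_FRAME_RATE_HZ

print(samples_per_snippet)             # 50000
print(round(two_frames_ms, 1))         # 90.9
print(round(frames_per_snippet, 1))    # 4.4
```

So each snippet holds 50,000 samples and spans a little over four video frames, while two frames last roughly 91 milliseconds.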



### The classification labels:

#### Horseshoe group size labels:

The group size refers to the number of horseshoe bats expected to be echolocating in the snippet.

'None' : Zero horseshoe bats. The snippet is silent apart from any *Myotis*-like FM calls, which may or may not be present.

'Single' : One horseshoe bat, of any species, with or without *Myotis*-like calls. A snippet is considered to have a single bat when there are no overlapping calls with different peak frequencies, or when the 'legs' of different calls don't cross each other.

'Multiple' : $\geq$ 2 horseshoe bats of any species, with or without *Myotis*-like calls. When call peak frequencies differ and the 'legs' of calls cross each other, the snippet is considered to be a multi-bat snippet.

*Myotis* calls are not counted towards the number of bats because *Myotis* do not fly within the dome cave. The calls recorded come from bats flying in the corridor leading to the exit/entrance of Orlova Chuka.

#### Call type:

The call type refers to the species or species group that emitted the calls.

'ferrumequinum' : The calls of *Rhinolophus ferrumequinum*, with the peak CF frequency at 80 kHz.

'BlEuMe' : The calls of either *R. blasii*, *R. euryale* or *R. mehelyi*, with their peak frequencies around 100-110 kHz.

'Myotis' : Any FM-like call. Most of the FM calls are emitted by the majority resident *Myotis myotis* and *Myotis blythii*.
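
The definitions above imply some constraints between the group size label and the horseshoe call types, which can be sketched as a small consistency check. This is only my reading of the definitions, not a rule stated in the notebook: 'none' should have no horseshoe call type, 'single' exactly one (one bat belongs to one species group), and 'multi' at least one. The function name and arguments are hypothetical, and the *Myotis* flag is left unconstrained because *Myotis* calls do not count towards group size.

```python
def labels_are_consistent(groupsize_label, ferrum_call, bleume_call):
    """Check 0/1 horseshoe call flags against the group size label.

    Hypothetical helper: the rules below are an interpretation of the
    label definitions, not part of the original labelling procedure.
    """
    n_horseshoe_types = ferrum_call + bleume_call
    if groupsize_label == 'none':
        return n_horseshoe_types == 0    # no horseshoe bat calls at all
    if groupsize_label == 'single':
        return n_horseshoe_types == 1    # one bat, hence one species group
    if groupsize_label == 'multi':
        return n_horseshoe_types >= 1    # same or different species groups
    return False                         # unknown label


print(labels_are_consistent('none', 0, 0))      # True
print(labels_are_consistent('single', 1, 1))    # False
print(labels_are_consistent('multi', 0, 1))     # True
```

A check like this could be run over the label table before training to catch annotation slips.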


In [1]:
import matplotlib.pyplot as plt
import numpy as np 
import pandas as pd

In [2]:
audio_labels = pd.read_csv('audio_labels.csv')

In [3]:
audio_labels.tail()

Unnamed: 0,date_recorded,file_name,time_start,time_end,duration,channel_num,groupsize_label,Ferrum_call,BlEuMi_call,Myotis_call,Unnamed: 10
303,,T0000027.WAV,53.5,53.7,,2,none,0,0,0,
304,,T0000027.WAV,58.0,58.2,,1,none,0,0,0,
305,,T0000029.WAV,9.5,9.7,,0,multi,1,1,0,
306,,T0000029.WAV,9.9,10.1,,1,multi,1,1,0,
307,,T0000035.WAV,0.9,1.1,,1,single,1,0,0,


In [4]:
%matplotlib notebook

In [5]:
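def make_combinationname(df):
    # df is a single row: join the group size label with the Ferrum, BlEuMi
    # and Myotis call flags into one string, e.g. 'multi110'.
    combined = df['groupsize_label']+str(df['Ferrum_call'])+str(df['BlEuMi_call'])+str(df['Myotis_call'])
    return combined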

In [6]:
audio_labels['combination_name'] = audio_labels.apply(make_combinationname, axis=1)
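
The same combination names can also be built column-wise instead of row by row. The sketch below shows this on a small made-up table; `toy_labels` and its rows are purely illustrative stand-ins for `audio_labels`, and only the column names are taken from the real data.

```python
import pandas as pd

# Toy stand-in for audio_labels; these rows are made up for illustration.
toy_labels = pd.DataFrame({
    'groupsize_label': ['none', 'single', 'multi'],
    'Ferrum_call': [0, 1, 1],
    'BlEuMi_call': [0, 0, 1],
    'Myotis_call': [1, 0, 0],
})

# Start from the group size label and append each 0/1 flag as text.
combination_name = toy_labels['groupsize_label']
for column in ['Ferrum_call', 'BlEuMi_call', 'Myotis_call']:
    combination_name = combination_name + toy_labels[column].astype(str)

print(combination_name.tolist())    # ['none001', 'single100', 'multi110']
```

For a table of a few hundred rows either approach is fast enough; the column-wise version mainly avoids calling a Python function once per row.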
    

In [7]:
combination_counts = audio_labels['combination_name'].value_counts()
channel_counts = audio_labels['channel_num'].value_counts()
file_counts = audio_labels['file_name'].value_counts()

print(combination_counts)

multi010     49
single010    43
none001      38
single100    35
single011    33
none000      32
single101    31
multi110     21
multi011     20
multi111      6
Name: combination_name, dtype: int64
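
To see how balanced the group size classes are, the combination counts can be collapsed back to the group size label. The sketch below hard-codes the counts printed above so that it runs on its own; the name `printed_counts` is illustrative, and in the notebook the same could be done directly on the `combination_counts` Series.

```python
from collections import Counter

# Counts copied from the value_counts() output above.
printed_counts = {
    'multi010': 49, 'single010': 43, 'none001': 38, 'single100': 35,
    'single011': 33, 'none000': 32, 'single101': 31, 'multi110': 21,
    'multi011': 20, 'multi111': 6,
}

groupsize_totals = Counter()
for name, count in printed_counts.items():
    groupsize = name[:-3]    # strip the three trailing 0/1 call flags
    groupsize_totals[groupsize] += count

total = sum(groupsize_totals.values())
for groupsize, count in groupsize_totals.most_common():
    print(f'{groupsize}: {count} ({count / total:.1%})')
# single: 142 (46.1%)
# multi: 96 (31.2%)
# none: 70 (22.7%)
```

The total of 308 snippets matches the sum of the per-channel counts.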


In [8]:
print(channel_counts)

0    147
1    126
2     35
Name: channel_num, dtype: int64


In [9]:
print(file_counts)

T0000393.WAV    20
T0000971.WAV    19
T0000094.WAV    12
T0000979.WAV    10
T0000406.WAV     9
T0000126.WAV     8
T0000407.WAV     8
T0000373.WAV     7
T0000106.WAV     7
T0000086.WAV     7
T0000109.WAV     7
T0000093.WAV     7
T0000831.WAV     6
T0000134.WAV     6
T0000371.WAV     6
T0000115.WAV     6
T0000844.WAV     5
T0000895.WAV     5
T0000405.WAV     5
T0000117.WAV     5
T0000370.WAV     4
T0000071.WAV     4
T0000027.WAV     4
T0000955.WAV     4
T0000907.WAV     4
T0000112.WAV     4
T0000128.WAV     4
T0000113.WAV     4
T0000823.WAV     4
T0000937.WAV     4
                ..
T0000939.WAV     1
T0000022.WAV     1
T0000948.WAV     1
T0000966.WAV     1
T0000904.WAV     1
T0000944.WAV     1
T0000914.WAV     1
T0000980.WAV     1
T0000967.WAV     1
T0000926.WAV     1
T0000825.WAV     1
T0000908.WAV     1
T0000859.WAV     1
T0000917.WAV     1
T0000922.WAV     1
T0000853.WAV     1
T0000913.WAV     1
T0000024.WAV     1
T0000949.WAV     1
T0000856.WAV     1
T0000924.WAV     1
T0000092.WAV