#Flowchart


![Here is the flowchart of the steps that I have taken in my notebook](https://drive.google.com/uc?export=view&id=1a8YUCGo4CMkiCC_JugKPt9686BQVZ_bJ)

# Contents

- [Preprocessing](#Preprocessing-Stage)  
  - [Importing packages](#Importing-the-necessary-packages)  
  - [Standardize](#Standardize-the-tsv-files)  
  - [Extract positive and negative calls from audio using tsv's](#Extract)
  - [Apply PCEN](#PCEN)
  - [Apply Wavelet Denoising](#Wavelet-denoising)  
  - [Plot Spectrograms](#Spec-generate)
  - [Same steps for test data](#test-data)  
- [Training Stage](#Training-Stage)  
   - [Basic CNN model](#Basic-CNN-model)
   - [VGG16-model](#VGG-16-model)

#**Preprocessing Stage**



###Downloading and extracting the PodcastR2, PodcastR3 and Podcast_Test files
####Since these files contain SRKW calls, only PodcastR2 and PodcastR3 have been downloaded and used for training

In [None]:
!apt-get -qq install awscli
!aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/detection/train/OrcasoundLab07052019_PodCastRound2.tar.gz ./ 
!aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/detection/train/OrcasoundLab09272017_PodCastRound3.tar.gz ./
!aws --no-sign-request s3 cp s3://acoustic-sandbox/labeled-data/detection/test/OrcasoundLab09272017_Test.tar.gz ./
!tar -xzf OrcasoundLab09272017_PodCastRound3.tar.gz
!tar -xzf OrcasoundLab07052019_PodCastRound2.tar.gz
!tar -xzf OrcasoundLab09272017_Test.tar.gz
!pip -q install ketos==2.0.0b4
!pip -q install pysoundfile
!pip install pydub



##Preprocessing on positive train dataset

###Importing the necessary packages 

In [None]:
import pandas as pd
from ketos.data_handling import selection_table as sl
from ketos.data_handling.parsing import load_audio_representation
import numpy as np
from os import listdir
from os.path import isfile, join
from scipy import signal
import soundfile as sf
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
from pydub import AudioSegment
import librosa
import os
from skimage.restoration import (denoise_wavelet, estimate_sigma)


In [None]:
#Load a tab-separated annotation file and return it together with the mean call duration (in seconds)
def duration_mean(filename):
    annot_train = pd.read_csv(filename, sep='\t')
    mean = annot_train['duration_s'].mean()
    return annot_train, mean
  

In [None]:
#Function to add the end time (start + duration) as a new column; the table is modified in place
def add_end(annot):
    annot["end"] = annot["start"] + annot["duration_s"]


In [None]:
#Function to extract the file names and start times from an annotation table
def fname_stime(annot):
    file_name = annot.iloc[:, 0].values
    start_time = annot.iloc[:, 1].values
    return file_name, start_time


In [None]:
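#Function to cut a fixed 2-second clip out of the .wav files for every row of a table (used for both positive and negative calls)
def extract_audio(label, table, path, position):
    # Column 0 holds the .wav file name; `position` is the column index of the start time in seconds
    file_name = table.iloc[:, 0].values
    start_time = table.iloc[:, position].values
    for i, audio_file in enumerate(file_name, start=1):
        # Files are opened relative to the current working directory
        sound = AudioSegment.from_file(audio_file)
        start_ms = start_time[i - 1] * 1000  # pydub slices in milliseconds
        print(start_ms)
        end_ms = start_ms + 2000  # fixed 2-second clip
        call = sound[start_ms:end_ms]
        call.export(path + label + "MMMcalls{0}.wav".format(i), format="wav")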


In [None]:
def apply_per_channel_energy_norm(data, sampling_rate):
    '''Compute Per-Channel Energy Normalization (PCEN) of a mel spectrogram'''
    # Magnitude (power=1) mel-scaled spectrogram; the signal is passed as `y=`,
    # which newer librosa releases require
    S = librosa.feature.melspectrogram(y=data, sr=sampling_rate, power=1)
    # PCEN is applied directly to the magnitude spectrogram, so no dB conversion is needed
    pcen_S = librosa.pcen(S, sr=sampling_rate)
    return pcen_S
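
To make the normalization step more concrete, here is a minimal pure-Python sketch of the PCEN formula for a single frequency channel. It is not librosa's implementation: the helper name `pcen_channel`, the parameter values, and the choice to start the smoother at the first frame are assumptions made for illustration only.

```python
def pcen_channel(energies, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """PCEN for one frequency channel (hypothetical helper, illustrative parameters).

    M[t]    = (1 - s) * M[t-1] + s * E[t]                        (smoothed energy)
    PCEN[t] = (E[t] / (eps + M[t])**alpha + delta)**r - delta**r
    """
    out = []
    m = energies[0]  # assumption: start the smoother at the first frame
    for e in energies:
        m = (1.0 - s) * m + s * e
        out.append((e / (eps + m) ** alpha + delta) ** r - delta ** r)
    return out


# Two steady signals whose levels differ by a factor of 100.
quiet = pcen_channel([1.0] * 50)
loud = pcen_channel([100.0] * 50)
print(round(quiet[-1], 3), round(loud[-1], 3))
```

The two printed values end up close to each other even though the input levels differ by a factor of 100, which is the automatic gain control effect PCEN is used for.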

In [None]:
def wavelet_denoising(data):
    '''
    Wavelet denoising using scikit-image (BayesShrink, soft thresholding).
    NOTE: Wavelet denoising is an effective method for SNR improvement in environments with
          a wide range of noise types competing for the same subspace.
    '''
    # BayesShrink estimates a separate threshold for each wavelet sub-band.
    # VisuShrink (method='VisuShrink') is designed to eliminate noise with high
    # probability, but this results in a visually over-smooth appearance; passing a
    # reduced sigma, e.g. estimate_sigma(data) / 2 or / 4, lowers its threshold.
    # Newer scikit-image releases replace `multichannel=False` with `channel_axis=None`.
    im_bayes = denoise_wavelet(data, multichannel=False, convert2ycbcr=False,
                               method='BayesShrink', mode='soft')
    return im_bayes
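
The thresholding idea behind the function above can be shown without any wavelet library. The sketch below applies soft thresholding with a VisuShrink-style universal threshold to a made-up list of coefficients. The helper names and the numbers are hypothetical, and the wavelet transform itself is left out.

```python
import math


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (soft thresholding)."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def universal_threshold(sigma, n):
    """VisuShrink-style universal threshold: sigma * sqrt(2 * ln(n))."""
    return sigma * math.sqrt(2.0 * math.log(n))


# Hypothetical detail coefficients: two strong components among small noise values.
coeffs = [0.1, -0.2, 5.0, 0.05, -4.0, 0.15, -0.1, 0.2]
t = universal_threshold(sigma=0.2, n=len(coeffs))
denoised = soft_threshold(coeffs, t)
print(round(t, 3), denoised)
```

Small coefficients are set to zero while the two strong ones are kept, each reduced by the threshold. Dividing `sigma` by 2 or 4 lowers the threshold and keeps more detail.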

In [None]:
def plot_and_save(denoised_data, f_name, out_dir='.'):
    # Render the spectrogram without axes and save it as <name>_0000.png in out_dir
    fig, ax = plt.subplots()

    # Add this line to show plots else ignore warnings
    # plt.ion()

    ax.imshow(denoised_data)
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    fig.set_size_inches(10, 10)
    # `quality` only applies to JPEG output and was removed from newer matplotlib, so it is not passed
    fig.savefig(
        join(out_dir, f"{f_name[:-4]}_0000.png"),
        dpi=80,
        bbox_inches="tight",
        pad_inches=0.0)
    plt.close(fig)

In [None]:
def final_plot(base_path, plot_path, folder_path):
    # Read every clip in base_path/folder_path and write its denoised PCEN
    # spectrogram to base_path/plot_path
    plotPath = join(base_path, plot_path)
    folderpath = join(base_path, folder_path)
    os.makedirs(plotPath, exist_ok=True)
    onlyfiles = [f for f in listdir(folderpath) if isfile(join(folderpath, f))]

    for file in onlyfiles:
        # librosa resamples to its default rate on load
        data, sr = librosa.load(join(folderpath, file), res_type='kaiser_best')
        f_name = os.path.basename(file)
        pcen_S = apply_per_channel_energy_norm(data, sr)

        denoised_data = wavelet_denoising(pcen_S)
        plot_and_save(denoised_data, f_name, plotPath)



In [None]:
!pwd

/content/Round3_OS_09_27_2017/wav


####The tsv files contain parameters such as start, duration_s, etc., but they are not in the format Ketos accepts. The labels need some changes first, so edited copies of these files have been uploaded from the local machine


In [None]:
annot_train2,mean2=duration_mean('/content/podcast2.tsv')
annot_train3,mean3 = duration_mean('/content/podcast3.tsv')
annot_test,mean_test = duration_mean('/content/v10_test.tsv')

print(mean2)
annot_train2.head()


2.1110548004254963


,wav_filename,start,duration_s,location,date,data_source,data_source_id,label
0,1562337136_0004.wav,49.765625,2.45,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs
1,1562337136_0004.wav,41.046007,1.658854,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs
2,1562337136_0004.wav,37.345486,1.743924,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs
3,1562337136_0004.wav,42.917535,2.594618,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs
4,1562337136_0004.wav,45.980035,2.041667,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs


####Here is how the .tsv files and their labels look once the end time has been added, a format that Ketos can accept

In [None]:
add_end(annot_train2)
add_end(annot_train3)
add_end(annot_test)
annot_train2.head()

,wav_filename,start,duration_s,location,date,data_source,data_source_id,label,end
0,1562337136_0004.wav,49.765625,2.45,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,52.215625
1,1562337136_0004.wav,41.046007,1.658854,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,42.704861
2,1562337136_0004.wav,37.345486,1.743924,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,39.08941
3,1562337136_0004.wav,42.917535,2.594618,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,45.512153
4,1562337136_0004.wav,45.980035,2.041667,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,48.021701
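
As a small worked example of the two calculations used so far (the `end` column and the mean duration), here is a standard-library sketch on two rows copied from the table above. The in-memory TSV and its reduced column set are assumptions for illustration; the notebook itself does this with pandas.

```python
import csv
import io
from statistics import mean

# Two rows copied from the podcast2 table, reduced to the columns used here.
TSV = (
    "wav_filename\tstart\tduration_s\tlabel\n"
    "1562337136_0004.wav\t49.765625\t2.45\tSRKWs\n"
    "1562337136_0004.wav\t41.046007\t1.658854\tSRKWs\n"
)

rows = list(csv.DictReader(io.StringIO(TSV), delimiter="\t"))
for row in rows:
    row["start"] = float(row["start"])
    row["duration_s"] = float(row["duration_s"])
    row["end"] = row["start"] + row["duration_s"]  # same rule as add_end

mean_duration = mean(row["duration_s"] for row in rows)
print([round(row["end"], 6) for row in rows], round(mean_duration, 6))
```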


###Standardizing the tsv files




In [None]:
map_to_ketos_annot_std ={'wav_filename': 'filename'} 
std_annot_train2 = sl.standardize(table=annot_train2, signal_labels=["SRKWs"], mapper=map_to_ketos_annot_std, trim_table=True)
std_annot_train3 = sl.standardize(table=annot_train3, signal_labels=["SRKWs"], mapper=map_to_ketos_annot_std, trim_table=True)

std_annot_test = sl.standardize(table=annot_test, signal_labels=["SRKWs"], mapper=map_to_ketos_annot_std, trim_table=True)


###Here we can see how each of these tsv files looks after standardizing

In [None]:
std_annot_train2.head()


filename,annot_id,start,label,end
1562337136_0004.wav,0,49.765625,1,52.215625
1562337136_0004.wav,1,41.046007,1,42.704861
1562337136_0004.wav,2,37.345486,1,39.08941
1562337136_0004.wav,3,42.917535,1,45.512153
1562337136_0004.wav,4,45.980035,1,48.021701


In [None]:
std_annot_train3.head()

filename,annot_id,start,label,end
OS_9_27_2017_08_14_00__0002.wav,0,6.110451,1,7.856295
OS_9_27_2017_08_14_00__0004.wav,0,12.717882,1,15.167882
OS_9_27_2017_08_14_00__0004.wav,1,29.825347,1,31.637326
OS_9_27_2017_08_14_00__0004.wav,2,43.504514,1,45.103819
OS_9_27_2017_08_14_00__0004.wav,3,48.404514,1,50.344097


In [None]:
std_annot_test.head()

filename,annot_id,start,label,end
OS_9_27_2017_08_14_00__0001.wav,0,11.643564,1,14.093564
OS_9_27_2017_08_14_00__0001.wav,1,15.594059,1,17.759901
OS_9_27_2017_08_14_00__0001.wav,2,53.9,1,56.35
OS_9_27_2017_08_14_00__0001.wav,3,59.781486,1,61.25
OS_9_27_2017_08_19_00__0002.wav,0,6.592882,1,7.826389
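
The shape of the standardized tables above (one row per annotation, indexed by file name and a per-file annotation counter, with the text label replaced by an integer) can be imitated in plain Python. This is only a sketch of the visible result, not Ketos code: `standardize_rows` is a hypothetical helper, and mapping any other label to 0 is an assumption.

```python
from collections import defaultdict


def standardize_rows(rows, signal_label="SRKWs"):
    """Mimic the shape of the standardized table shown above (not Ketos code)."""
    next_id = defaultdict(int)  # running annotation counter per file
    table = {}
    for row in rows:
        filename = row["wav_filename"]
        annot_id = next_id[filename]
        next_id[filename] += 1
        table[(filename, annot_id)] = {
            "start": row["start"],
            "label": 1 if row["label"] == signal_label else 0,  # assumption for other labels
            "end": row["end"],
        }
    return table


# Rows taken from the test annotations shown earlier.
rows = [
    {"wav_filename": "OS_9_27_2017_08_14_00__0001.wav", "start": 11.643564, "end": 14.093564, "label": "SRKWs"},
    {"wav_filename": "OS_9_27_2017_08_14_00__0001.wav", "start": 15.594059, "end": 17.759901, "label": "SRKWs"},
    {"wav_filename": "OS_9_27_2017_08_19_00__0002.wav", "start": 6.592882, "end": 7.826389, "label": "SRKWs"},
]
std = standardize_rows(rows)
for key, value in std.items():
    print(key, value)
```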


###Saving these standardized tsv files

In [None]:
# header=False writes data rows only; the default write mode overwrites the file,
# so re-running this cell does not append duplicate rows
std_annot_train2.to_csv('standardized_train2.tsv', sep='\t', header=False)
std_annot_train3.to_csv('standardized_train3.tsv', sep='\t', header=False)
std_annot_test.to_csv('standardized_test.tsv', sep='\t', header=False)

In [None]:
%cd /content/

/content


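###Reloading the standardized tsv files

In [None]:
# The files were written without a header row, so the column names are supplied here
std_cols = ['filename', 'annot_id', 'start', 'label', 'end']
annot_id2 = pd.read_csv('/content/standardized_train2.tsv', sep='\t', header=None, names=std_cols)
annot_id3 = pd.read_csv('/content/standardized_train3.tsv', sep='\t', header=None, names=std_cols)
annot_idtest = pd.read_csv('/content/standardized_test.tsv', sep='\t', header=None, names=std_cols)
#annot_val=pd.read_csv('/content/standardized_val.tsv', sep='\t')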
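
Because the files above are written with `header=False`, the way they are read back matters. The sketch below shows the difference on an in-memory copy of two standardized rows. The column names in `cols` are an assumption based on the standardized tables shown earlier.

```python
import io

import pandas as pd

# Two standardized rows written the way the notebook writes them: no header row.
TSV_NO_HEADER = (
    "1562337136_0004.wav\t0\t49.765625\t1\t52.215625\n"
    "1562337136_0004.wav\t1\t41.046007\t1\t42.704861\n"
)

# Default read: the first data row is consumed as the header and is lost as data.
lossy = pd.read_csv(io.StringIO(TSV_NO_HEADER), sep="\t")

# header=None with explicit (assumed) column names keeps every row.
cols = ["filename", "annot_id", "start", "label", "end"]
full = pd.read_csv(io.StringIO(TSV_NO_HEADER), sep="\t", header=None, names=cols)

print(len(lossy), len(full))
```

With the default read, one annotation per file silently goes missing, so passing `header=None` is the safer choice whenever a table was saved without its header.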

####Extracting the .wav file names and start times from these .tsv files. Pydub uses them to cut short segments of sound (one set containing the calls and the other not)

In [None]:
filename,start_time=fname_stime(annot_train2)
print(filename[0])
print(start_time[0])
annot_train2.head()

1562337136_0004.wav
49.765625


,wav_filename,start,duration_s,location,date,data_source,data_source_id,label,end
0,1562337136_0004.wav,49.765625,2.45,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,52.215625
1,1562337136_0004.wav,41.046007,1.658854,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,42.704861
2,1562337136_0004.wav,37.345486,1.743924,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,39.08941
3,1562337136_0004.wav,42.917535,2.594618,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,45.512153
4,1562337136_0004.wav,45.980035,2.041667,orcasound_lab,2019-07-05,Orcasound_PodCast_Round2,1562337136,SRKWs,48.021701


####We can verify that the file name and start time printed above match the first row of the table

####We now change to the directory from which we want to extract the calls using pydub

In [None]:
!pwd

/content


In [None]:
%cd /content/Round2_OS_07_05/wav

/content/Round2_OS_07_05/wav


In [None]:
!mkdir pod_calls

In [None]:
extract_audio('round2_calls',annot_train2,"/content/Round2_OS_07_05/wav/pod_calls/",1)

49765.625
41046.0069444444
37345.4861111111
42917.534722222204
45980.034722222204
52700.5208333333
55295.1388888889
1147.64052741152
26115.197779319897
29995.0728660652
34725.0520471894
52485.426786953496
36554.4760582929
13883.906030855502
17708.3800841515
21964.095371669
19672.159887798
27846.966527196604
29329.9511854951
34229.951185495105
56773.709902370996
37544.4560669456
46172.41980474201
6233.23983169705
42204.9438990182
56264.09537166899
58972.44491458839
13916.549789621302
7261.1576011157595
8286.26220362622
10550.0348675035
11318.8633193863
34554.5676429568
52237.62203626221
53647.1408647141
14451.171875
47755.859375
56895.5078125
31508.298465829805
37607.6708507671
1250.0
58064.3398354815
13398.4375
27945.3125
17035.15625
38233.3984375
46798.828125
57363.1450488145
11881.54296875
20336.9140625
34300.0
41412.04351204351
46430.0699300699
51488.28125
53785.15625
57134.765625
129.12860154603
5552.52986647927
8522.487702037952
21091.0049191848
24534.434293745606
30216.0927617709

####Each clip starts at the annotated start time and lasts two seconds, a length chosen from the mean call duration (about 2.1 s)
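
The clip boundaries follow from simple arithmetic: pydub slices audio in milliseconds, so a start time in seconds is multiplied by 1000 and the clip ends 2000 ms later. The sketch below is a hypothetical helper, not part of the notebook; the rounding to whole milliseconds and the optional clamp to the file length are assumptions added for illustration.

```python
def clip_bounds_ms(start_s, clip_s=2.0, file_duration_s=None):
    """Return (start_ms, end_ms) for a fixed-length clip (hypothetical helper)."""
    start_ms = int(round(start_s * 1000))
    end_ms = start_ms + int(round(clip_s * 1000))
    if file_duration_s is not None:
        # Keep the clip inside the recording when the call starts near the end.
        end_ms = min(end_ms, int(round(file_duration_s * 1000)))
    return start_ms, end_ms


print(clip_bounds_ms(49.765625))                   # (49766, 51766)
print(clip_bounds_ms(59.5, file_duration_s=60.0))  # (59500, 60000)
```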

In [None]:
%cd /content/Round3_OS_09_27_2017/wav

/content/Round3_OS_09_27_2017/wav


####Similarly, we extract the calls for podcast3

In [None]:
extract_audio('round3_calls',annot_train3,"/content/Round2_OS_07_05/wav/pod_calls/",1)

6110.451306413301
12717.881944444402
29825.347222222197
43504.5138888889
48404.5138888889
3530.3819444444403
18842.8819444444
21692.7083333333
38281.25
45980.034722222204
54104.1666666667
9311.163895486938
19058.7885985748
22259.501187648504
32977.0387965162
30880.2083333333
11994.791666666699
36111.9791666667
37898.4375
45894.965277777796
49042.534722222204
11532.2265625
16365.234375
20145.5078125
39094.7265625
5790.0390625
8326.171875
2631.8359375
24500.0
38519.444444444394
53729.8611111111
56605.208333333394
5869.79166666667
2807.2916666666697
0.0
33087.6088677751
36750.0
50115.3998416469
23132.4228028504
25557.2050673001
29776.326207442606
40833.3333333333
45440.41963578779
47671.2193190816
0.0
19600.0
26093.3566433566
30351.8259518259
37368.6868686869
44195.1825951826
6139.27738927739
12135.7808857809
47591.297591297596
50351.592851592904
53545.8984375
57134.765625
59240.234375
17728.740157480304
20564.5669291339
24500.0
26950.0
50061.0236220473
56784.05511811029
59468.52494475739

####Verify that the calls we extracted match the Round 3 annotations

In [None]:
annot_train3.head()

,wav_filename,start,duration_s,location,date,data_source,data_source_id,label,end
0,OS_9_27_2017_08_14_00__0002.wav,6.110451,1.745843,orcasound_lab,2017-09-27,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,7.856295
1,OS_9_27_2017_08_14_00__0004.wav,12.717882,2.45,orcasound_lab,2017-09-27,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,15.167882
2,OS_9_27_2017_08_14_00__0004.wav,29.825347,1.811979,orcasound_lab,2017-09-27,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,31.637326
3,OS_9_27_2017_08_14_00__0004.wav,43.504514,1.599306,orcasound_lab,2017-09-27,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,45.103819
4,OS_9_27_2017_08_14_00__0004.wav,48.404514,1.939583,orcasound_lab,2017-09-27,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,50.344097


In [None]:
%cd /content/Round2_OS_07_05

/content/Round2_OS_07_05


###In the Round2 folder we create two folders, train and test, and inside each of them a calls folder and a nocalls folder

In [None]:
!mkdir train
!mkdir test
%cd train
!mkdir calls
!mkdir nocalls
%cd /content/Round2_OS_07_05/test

/content/Round2_OS_07_05/train
/content/Round2_OS_07_05/test


In [None]:
!pwd
!mkdir calls
!mkdir nocalls

/content/Round2_OS_07_05/test
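
The same folder layout can also be created from Python in one step, which avoids errors when a cell is run twice. This is a minimal sketch under assumptions: `make_dataset_dirs` is a hypothetical helper, and a temporary directory stands in for `/content/Round2_OS_07_05`.

```python
import tempfile
from pathlib import Path


def make_dataset_dirs(root):
    """Create <root>/{train,test}/{calls,nocalls}; safe to call repeatedly."""
    created = []
    for split in ("train", "test"):
        for cls in ("calls", "nocalls"):
            path = Path(root) / split / cls
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created


# A temporary directory stands in for /content/Round2_OS_07_05 in this sketch.
with tempfile.TemporaryDirectory() as tmp:
    dirs = make_dataset_dirs(tmp)
    make_dataset_dirs(tmp)  # a second call does not raise
    print([str(d.relative_to(tmp)) for d in dirs])
```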


###Now we plot the spectrograms, without x and y axes or labels, into the calls folder

In [None]:
final_plot('/content/Round2_OS_07_05/','train/calls','wav/pod_calls/')


##Generation of spectrograms from the background sounds

####Since we have generated the positive calls, it's time to generate the negative ones.

####The tables below show the time windows in podcast-2 and podcast-3 that do not contain calls
####Each row gives the start time and the end time of a 2-second window without calls

In [None]:
positives_train2 = sl.select(annotations=std_annot_train2, length=2.0)
file_durations_train2 = sl.file_duration_table('/content/Round2_OS_07_05/wav')
negatives_train2=sl.create_rndm_backgr_selections(annotations=std_annot_train2, files=file_durations_train2, length=2.0, num=len(positives_train2), trim_table=True)
negatives_train2.head()

filename,sel_id,start,end,label
1562337136_0004.wav,0,2.298165,4.298165,0
1562337136_0004.wav,1,6.010853,8.010853,0
1562337136_0004.wav,2,6.188857,8.188857,0
1562337136_0004.wav,3,7.792858,9.792858,0
1562337136_0004.wav,4,9.84633,11.84633,0


####Selecting background windows for podcast-3 from the times where the tsv file lists no calls

In [None]:
positives_train3 = sl.select(annotations=std_annot_train3, length=2.0)
file_durations_train33 = sl.file_duration_table('/content/Round3_OS_09_27_2017/wav')
negatives_train33=sl.create_rndm_backgr_selections(annotations=std_annot_train3, files=file_durations_train33, length=2.0, num=len(positives_train3), trim_table=True)
negatives_train33.head()

filename,sel_id,start,end,label
OS_9_27_2017_08_03_00__0002.wav,0,2.158776,4.158776,0
OS_9_27_2017_08_03_00__0002.wav,1,14.006486,16.006486,0
OS_9_27_2017_08_03_00__0002.wav,2,14.741979,16.741979,0
OS_9_27_2017_08_03_00__0002.wav,3,20.980461,22.980461,0
OS_9_27_2017_08_03_00__0002.wav,4,27.223733,29.223733,0
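
The idea behind these background selections can be sketched with rejection sampling: draw random 2-second windows and keep only those that do not overlap any annotated call. This is an illustration of the concept, not the algorithm Ketos uses. The helper names, the seed, and the 60-second file duration are assumptions.

```python
import random


def overlaps(a_start, a_end, b_start, b_end):
    """True when the intervals [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def random_background_windows(annotations, file_duration, length=2.0, num=3,
                              seed=0, max_tries=10_000):
    """Rejection-sample up to `num` windows of `length` seconds that avoid all annotations."""
    rng = random.Random(seed)
    windows = []
    for _ in range(max_tries):
        if len(windows) == num:
            break
        start = rng.uniform(0.0, file_duration - length)
        end = start + length
        if not any(overlaps(start, end, s, e) for s, e in annotations):
            windows.append((start, end))
    return sorted(windows)


# Annotated call intervals (seconds) taken from the first rows of the podcast2 table.
calls = [(37.345486, 39.08941), (41.046007, 42.704861), (42.917535, 45.512153),
         (45.980035, 48.021701), (49.765625, 52.215625)]
background = random_background_windows(calls, file_duration=60.0)
print(background)
```

As in the tables above, the sampled windows may overlap each other; only overlap with annotated calls is rejected.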


###The steps for generating the audio are then the same as for the positive calls above

In [None]:
!pwd

/content/Round2_OS_07_05/test


In [None]:
%cd '/content/Round2_OS_07_05/wav'

/content/Round2_OS_07_05/wav


In [None]:
!mkdir neg_pod_calls

In [None]:
%cd /content

/content


In [None]:
# The default write mode overwrites the file, so re-running this cell does not append duplicate rows
negatives_train2.to_csv('negative2.tsv', sep='\t', header=False)
negatives_train33.to_csv('negative3.tsv', sep='\t', header=False)

In [None]:
# The files were written without a header row, so the column names are supplied here;
# otherwise read_csv would consume the first selection as the header
neg_cols = ['filename', 'sel_id', 'start', 'end', 'label']
negatives_train2save=pd.read_csv('/content/negative2.tsv',sep='\t',header=None,names=neg_cols)
negatives_train33save=pd.read_csv('/content/negative3.tsv',sep='\t',header=None,names=neg_cols)

In [None]:
negatives_train2save.head()

,filename,sel_id,start,end,label
0,1562337136_0004.wav,0,2.298165,4.298165,0
1,1562337136_0004.wav,1,6.010853,8.010853,0
2,1562337136_0004.wav,2,6.188857,8.188857,0
3,1562337136_0004.wav,3,7.792858,9.792858,0
4,1562337136_0004.wav,4,9.84633,11.84633,0


In [None]:
%cd /content/Round2_OS_07_05/wav

/content/Round2_OS_07_05/wav


In [None]:
extract_audio('r2negcalls_calls',negatives_train2save,"/content/Round2_OS_07_05/wav/neg_pod_calls/",2)

6010.8529725514445
6188.856797807477
7792.858411320733
9846.329504146193
15501.313189515136
20852.964010487107
28397.712024804765
30528.055250253143
31015.90190296255
210.69653228761356
2262.0799993745563
2775.9005795531666
11272.958445136836
13655.939090937223
29503.269692502698
52442.956270787356
52602.97581844435
54188.35433202276
7142.147110353619
18595.524119757018
8055.565776961003
26318.12871113249
29293.885330106692
37498.45265504422
38926.42768716533
39325.911781961884
40496.77921788179
42764.87354936992
57153.68337984371
332.78345261842895
5908.016547605912
6884.528782925685
9724.253188265864
10633.081503438803
13332.658111923822
51279.23979920615
51882.32098215457
52683.87306256716
10045.958808007981
15654.36904170366
32977.163157723175
35497.97441538618
3548.2242655294267
19916.863885922172
26046.795970633353
32152.302305399422
44165.96891045077
46417.90350484593
16912.80048959652
17564.703301607155
20096.300710284086
20399.935733997092
21142.785854574526
22845.93114268114


In [None]:
%cd '/content/Round3_OS_09_27_2017/wav'

/content/Round3_OS_09_27_2017/wav


In [None]:
extract_audio('r3neg_calls',negatives_train33save,"/content/Round2_OS_07_05/wav/neg_pod_calls/",2)

14006.485940130699
14741.978525566661
20980.46092330884
27223.732854684982
5383.800880543632
6234.457162627535
9683.17182066035
18944.270901962765
36161.10500646843
38208.41663585668
56311.49017830336
57894.38600064052
2340.3729105924504
4724.257184502491
30097.29403650681
34542.63508236423
35472.285047801044
47109.79971733207
49940.056544631174
28257.520050070496
28709.25186706055
30579.44923457049
56355.49624443283
8927.92903603879
20043.510933028763
29153.42532173816
36557.524197319035
40905.97077553798
3176.9912175012114
8764.06045745864
27483.993847459602
39560.385497884454
48412.336518691576
1063.9999308801862
20582.324940310737
21506.467274295912
24963.583597590125
27810.795401317246
41595.888438476824
46953.670978116854
56678.3119465178
57123.47595561153
11150.953315250206
12466.97376179327
23453.22364776229
34709.6550003622
45289.68168463956
46162.829009708395
46735.08166323875
53851.270369287246
56099.63358034525
1024.9738229816785
2094.721433845848
2192.7149341479435
14339.2

In [None]:
# Re-read the file with explicit column names so that the first selection is not consumed as the header
pd.read_csv('/content/negative3.tsv', sep='\t', header=None, names=['filename', 'sel_id', 'start', 'end', 'label']).head()

,filename,sel_id,start,end,label
0,OS_9_27_2017_08_03_00__0002.wav,0,2.158776,4.158776,0
1,OS_9_27_2017_08_03_00__0002.wav,1,14.006486,16.006486,0
2,OS_9_27_2017_08_03_00__0002.wav,2,14.741979,16.741979,0
3,OS_9_27_2017_08_03_00__0002.wav,3,20.980461,22.980461,0
4,OS_9_27_2017_08_03_00__0002.wav,4,27.223733,29.223733,0


In [None]:
%cd '/content/Round2_OS_07_05/train/train/nocalls'

/content/Round2_OS_07_05/train/train/nocalls


In [None]:

final_plot('/content/Round2_OS_07_05/','train/nocalls','wav/neg_pod_calls')



###Now we have saved the images (spectrograms) of the negative calls

##We perform similar steps with the test data as well:

*   Extract the call annotations from the Orcasound test tsv file
*   Standardize that tsv file
*   Generate 2-second clips that contain the calls
*   Generate 2-second background clips from the tsv file
*   Generate spectrograms from these sounds



---







In [None]:
%cd /content/Round2_OS_07_05/wav
!mkdir pod_calls_test
!mkdir neg_pod_calls_test

/content/Round2_OS_07_05/wav


In [None]:
%cd '/content/OrcasoundLab09272017_Test/wav'

/content/OrcasoundLab09272017_Test/wav


In [None]:
annot_test.head()

,wav_filename,start,duration_s,location,date,data_source,data_source_id,label,end
0,OS_9_27_2017_08_14_00__0001.wav,11.643564,2.45,orcasound_lab,9/27/2017,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,14.093564
1,OS_9_27_2017_08_14_00__0001.wav,15.594059,2.165842,orcasound_lab,9/27/2017,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,17.759901
2,OS_9_27_2017_08_14_00__0001.wav,53.9,2.45,orcasound_lab,9/27/2017,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,56.35
3,OS_9_27_2017_08_14_00__0001.wav,59.781486,1.468514,orcasound_lab,9/27/2017,Orcasound_PodCast_Round3,OS_9_27_2017_08_14,SRKWs,61.25
4,OS_9_27_2017_08_19_00__0002.wav,6.592882,1.233507,orcasound_lab,9/27/2017,Orcasound_PodCast_Round3,OS_9_27_2017_08_19,SRKWs,7.826389


In [None]:
extract_audio('test_calls',annot_test,"/content/Round2_OS_07_05/wav/pod_calls_test/",1)

11643.564359999998
15594.05941
53900.0
59781.486399999994
6592.881944
23011.28472
29519.09722
50769.44444
20709.25197
22725.19685
41650.0
43280.118109999996
52125.19685
54671.65354
56687.59843
60295.078740000004
0.0
2666.724257
4688.355218
7815.618521
16091.77609
18457.118179999998
20991.77609
23187.802349999998
28045.473390000003
30011.23013
31704.38839
47311.921220000004
51957.94748
60069.86869
4432.118056
7945.486111
10506.07639
12760.41667
16299.30556
17685.9375
20331.59722
23351.5625
33253.64583
34538.19444
38221.70139
46550.0
49042.534719999996
54027.60417
578.472222
5282.8125
12888.02083
19217.1875
25010.416670000002
34300.0
36750.0
49000.0
1745.84323
6401.4251779999995
9800.0
18727.07838
22971.41726
28307.878070000002
29496.99129
32577.434680000002
35595.80364
49000.0
51207.52177
52930.08709
935.7638890000001
3360.2430560000003
5486.979167
7350.0
10165.79861
12250.0
14631.94444
16767.1875
18800.34722
20842.013890000002
22245.65972
24500.0
25988.71528
27052.083329999998
29697.74

In [None]:
%cd /content/Round2_OS_07_05/Train_phase/test/calls

/content/Round2_OS_07_05/Train_phase/test/calls


In [None]:
final_plot('/content/Round2_OS_07_05/','Train_phase/test/calls/','wav/pod_calls_test')


In [None]:
positives_test = sl.select(annotations=std_annot_test, length=2.0)
file_durations_test = sl.file_duration_table('/content/OrcasoundLab09272017_Test/wav')
negatives_test=sl.create_rndm_backgr_selections(annotations=std_annot_test, files=file_durations_test, length=2.0, num=len(positives_test), trim_table=True)

In [None]:
%cd '/content/'
# Write the test background selections once, in /content, where they are read back later
negatives_test.to_csv('negg.tsv', sep='\t',header=False)

/content


In [None]:
%cd '/content/OrcasoundLab09272017_Test/wav'

/content/OrcasoundLab09272017_Test/wav


In [None]:
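# negg.tsv has no header row, so the column names are supplied to keep the first selection
neg=pd.read_csv('/content/negg.tsv',sep='\t',header=None,names=['filename','sel_id','start','end','label'])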
extract_audio('r3neg_calls_test',neg,"/content/Round2_OS_07_05/wav/neg_pod_calls_test/",2)

22621.371638687484
27183.743214646172
45684.80657240487
781.3629465117629
9588.556285768276
31994.968540712107
45867.856407870815
11955.627594657415
34551.485412144735
40213.63348528283
44048.86046983222
49590.64207272027
57250.28521486697
28534.590431875527
51461.448316592854
16436.15236058332
32309.8920917173
36868.660064320815
1669.252463339376
2591.110830586274
11028.55729674684
19540.862843569925
23720.361496930534
32202.071910700284
43438.86375915365
15518.917076644866
57100.70200581362
57119.765248499505
36913.11168697757
53327.98366271879
19332.510692521053
21700.411356918263
24534.04836710536
32882.363348755694
2711.736083771484
5442.020037046973
9328.765750779326
11745.23514107409
35737.76694362152
40170.20142987246
42732.238979566435
55517.24428130513
3316.495816089514
26029.134813220375
33417.808559136225
34109.45382520811
43543.55589960755
53648.32791076299
13301.579604180348
18136.68931272241
20036.55965742746
12694.22207370542
13219.336796381867
32498.358229006044
32538.

In [None]:
# Re-read the file with explicit column names so that the first selection is not consumed as the header
pd.read_csv('/content/negg.tsv', sep='\t', header=None, names=['filename', 'sel_id', 'start', 'end', 'label']).head()

,filename,sel_id,start,end,label
0,OS_9_27_2017_08_14_00__0001.wav,0,6.927193,8.927193,0
1,OS_9_27_2017_08_14_00__0001.wav,1,22.621372,24.621372,0
2,OS_9_27_2017_08_19_00__0002.wav,0,27.183743,29.183743,0
3,OS_9_27_2017_08_19_00__0002.wav,1,45.684807,47.684807,0
4,OS_9_27_2017_08_25_00__0003.wav,0,0.781363,2.781363,0


In [None]:
%cd /content/Round2_OS_07_05/Train_phase/test/nocalls

/content/Round2_OS_07_05/Train_phase/test/nocalls


In [None]:
final_plot('/content/Round2_OS_07_05/','Train_phase/test/nocalls','wav/neg_pod_calls_test')


#Saving and training

In [None]:

!zip -r /content/pcen_and_wavelet_trainsave_two_sec_save.zip /content/Round2_OS_07_05/Train_phase

  adding: content/Round2_OS_07_05/Train_phase/ (stored 0%)
  adding: content/Round2_OS_07_05/Train_phase/.ipynb_checkpoints/ (stored 0%)
  adding: content/Round2_OS_07_05/Train_phase/test/ (stored 0%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/ (stored 0%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls98_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls65_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls63_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls16_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls80_0000.png (deflated 7%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls81_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_te
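
The archive can also be built from Python with the standard library, which keeps the paths inside the zip relative to a chosen root instead of starting at `content/`. The sketch below runs in a temporary directory; the placeholder file name and the archive name `spectrograms` are made up for illustration.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the spectrogram folder; the file name below is made up.
    src = Path(tmp) / "Train_phase" / "test" / "nocalls"
    src.mkdir(parents=True)
    (src / "example_0000.png").write_bytes(b"placeholder")

    archive = shutil.make_archive(
        base_name=str(Path(tmp) / "spectrograms"),  # ".zip" is appended automatically
        format="zip",
        root_dir=tmp,            # paths in the archive are relative to this folder
        base_dir="Train_phase",  # only this sub-folder is archived
    )
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
    print(names)
```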

In [None]:
!zip -r /content/pcen_and_wavelet_test_two_sec_save.zip /content/Round2_OS_07_05/Train_phase/test

  adding: content/Round2_OS_07_05/Train_phase/test/ (stored 0%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/ (stored 0%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls98_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls65_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls63_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls16_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls80_0000.png (deflated 7%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls81_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls28_0000.png (deflated 8%)
  adding: content/Round2_OS_07_05/Train_phase/test/nocalls/r3neg_calls_testMMMcalls84_0000.png (defla