This notebook will handle the 'odd' calls resulting from the strict FM-rate based segmentation of the itsFM package [1]. Because of the weird segmentation the measurements are also irrelevant for the study. This notebook will 'fuse' the poor segmentations (all 'regions' post cf1 will be fused into fm2 after manual checking), and measurements will be run anew for these correctly segmented calls. 

Here is an example call that has been poorly segmented because of the strict FMrate based segmentation. This leads to multiple FM or CF segments being detected because the FM rate goes up and down over the  FM hook for instance.
![](./hooked_fm_eg.png)

The other odd calls could also be the result of poor ```itsfm``` parameter choices, and may require removal of a call too. In these cases, a second check takes place.

Author: Thejasvi Beleyur 
Date: 2020-06-26

In [1]:
import datetime as dt
print(f"This notebook run was initiated at : {dt.datetime.now()}")
import itsfm
import matplotlib.pyplot as plt 
import pandas as pd 
import numpy as np 
import soundfile as sf
import sys 


This notebook run was initiated at : 2020-07-10 19:26:50.918637


In [2]:
%matplotlib notebook

In [3]:
df = pd.read_csv('odd_calls_msmts.csv')
df_by_calls = df.groupby(['video_annot_id'])
df.head()

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,audio_file,duration,peak_amplitude,peak_freq_resolution,peak_frequency,region_id,rms,start,stop,terminal_frequency,terminal_frequency_threshold,video_annot_id,num_bats
0,0,0,segment_matching_annotaudio_Aditya_2018-08-16_...,0.001556,0.259338,1282.051282,103470.437018,fm1,0.125976,0.0,0.001556,93830.33419,-10,Aditya_2018-08-16_21502300_100,2
1,1,1,segment_matching_annotaudio_Aditya_2018-08-16_...,0.017596,0.406403,113.636364,105876.335531,cf1,0.172472,0.001556,0.019152,105705.842237,-10,Aditya_2018-08-16_21502300_100,2
2,2,2,segment_matching_annotaudio_Aditya_2018-08-16_...,0.001764,0.110443,1131.221719,104875.283447,fm2,0.069835,0.019152,0.020916,92403.628118,-10,Aditya_2018-08-16_21502300_100,2
3,3,3,segment_matching_annotaudio_Aditya_2018-08-16_...,0.001088,0.089539,1824.817518,91911.764706,cf2,0.04511,0.020916,0.022004,91911.764706,-10,Aditya_2018-08-16_21502300_100,2
4,4,0,segment_matching_annotaudio_Aditya_2018-08-16_...,0.000192,0.013245,10000.0,104166.666667,fm1,0.00666,0.001288,0.00148,98958.333333,-10,Aditya_2018-08-16_21502300_102,1


In [4]:
len(df_by_calls.groups.keys())

63

In [5]:
print(pd.DataFrame(data=df_by_calls.groups.keys()))

                                 0
0   Aditya_2018-08-16_21502300_100
1   Aditya_2018-08-16_21502300_102
2    Aditya_2018-08-16_21502300_19
3        Aditya_2018-08-16_2324_14
4       Aditya_2018-08-16_2324_148
..                             ...
58         Aditya_2018-08-19_23_16
59         Aditya_2018-08-19_23_17
60         Aditya_2018-08-19_23_18
61         Aditya_2018-08-19_23_39
62  Aditya_2018-08-20_0300-0400_56

[63 rows x 1 columns]


### Actions to be taken for each 'odd' call

In [6]:
import correct_bad_segs.re_measure_bad_segs as rmbg
from itsfm import measure_hbc_call
from itsfm.measurement_functions import measure_peak_amplitude, measure_peak_frequency, measure_rms, measure_terminal_frequency

In [7]:
call_id = 'Aditya_2018-08-17_23_56'

df_by_calls.get_group(call_id)['region_id']

280    fm1
281    cf1
282    fm2
283    cf2
284    fm3
Name: region_id, dtype: object

In [8]:
# make a dictionary containing the relevant call IDs as keys and old_regions_list + reassigned_list as entries
fm2cf2_to_fm2 = ([['fm2','cf2']] , ['fm2'])
cf1fm1_to_fm1 = ([['cf1','fm1']], ['fm1'])
fm1cf2_to_fm1 = ([['fm1','cf2']], ['fm1'])

cf1fm2cf2_to_cf1 = ([['cf1','fm2','cf2']], ['cf1'])

fm1cf2fm2_to_fm1 = ([['fm1', 'cf2', 'fm2']], ['fm1'])
fm2cf2fm3_to_fm2 = ([['fm2', 'cf2', 'fm3']], ['fm2'])
fm2cf2fm3cf3_to_fm2 = ([['fm2', 'cf2', 'fm3','cf3']], ['fm2'])
fm1cf2fm2cf3fm3_to_fm1 = ([['fm1','cf2','fm2','cf3','fm3']], ['fm1'])
fm1cf1fm2_to_fm1 = ([['fm1', 'cf1', 'fm2']], ['fm1'])
fm2cf2fm3cf3_to_fm2 = ([['fm2', 'cf2', 'fm3','cf3']], ['fm2'])

action_delete = ([],[])

In [9]:
corrections = {    
    'Aditya_2018-08-16_21502300_100'  : fm2cf2_to_fm2,
    'Aditya_2018-08-16_21502300_102'  : ([['fm1','cf1'], ['fm2','cf2','fm3']],['cf1', 'fm1']),
    'Aditya_2018-08-16_21502300_19'   : ([['fm2','fm3'],['fm2']]),
    'Aditya_2018-08-16_2324_14'       : cf1fm1_to_fm1,
    'Aditya_2018-08-16_2324_148'      : fm1cf2fm2_to_fm1,
    'Aditya_2018-08-16_2324_154'      : fm2cf2_to_fm2,
    'Aditya_2018-08-16_2324_197'      : action_delete,
    'Aditya_2018-08-16_2324_207'      : fm2cf2fm3cf3_to_fm2,
    'Aditya_2018-08-16_2324_223'      : fm2cf2_to_fm2,
    'Aditya_2018-08-16_2324_230'      : fm2cf2_to_fm2,
    'Aditya_2018-08-16_2324_258'      : fm2cf2_to_fm2,
    'Aditya_2018-08-16_2324_28'       : fm1cf2fm2cf3fm3_to_fm1,
    'Aditya_2018-08-16_2324_68'       : fm2cf2_to_fm2,
    'Aditya_2018-08-16_2324_90'       : cf1fm1_to_fm1,
    'Aditya_2018-08-16_2324_94'       : fm1cf2fm2_to_fm1,
    'Aditya_2018-08-17_01_15'         : cf1fm1_to_fm1,
    'Aditya_2018-08-17_01_28'         : fm1cf2_to_fm1,
    'Aditya_2018-08-17_01_29'         : fm2cf2fm3cf3_to_fm2,
    'Aditya_2018-08-17_01_35'         : fm1cf2fm2_to_fm1,
    'Aditya_2018-08-17_01_40'         : ([['cf1','fm2','cf2','fm3','cf3']], ['cf1']),
    'Aditya_2018-08-17_01_46'         : fm2cf2fm3_to_fm2,
    'Aditya_2018-08-17_01_60'         : fm1cf2fm2_to_fm1,
    'Aditya_2018-08-17_01_80'         : fm2cf2_to_fm2,
    'Aditya_2018-08-17_01_81'         : cf1fm1_to_fm1,
    'Aditya_2018-08-17_12_100'        : ([['cf1','fm1','cf2','fm2','cf3','fm3','cf4']], 
                                        ['cf1']),
    'Aditya_2018-08-17_12_108'        : action_delete,
    'Aditya_2018-08-17_12_110'        : action_delete,
    'Aditya_2018-08-17_12_117'        : action_delete,
    'Aditya_2018-08-17_12_121'        : ([['fm1','cf1','fm2']], ['fm1']),
    'Aditya_2018-08-17_12_76'         : ([['fm1','cf1','fm2']], ['fm1']),
    'Aditya_2018-08-17_34_40'         : fm2cf2_to_fm2,
    'Aditya_2018-08-17_34_65'         : fm2cf2fm3_to_fm2,
    'Aditya_2018-08-17_34_71'         : ([['cf1','fm1'], ['fm2','cf3']], ['fm1', 'fm2']),
    'Aditya_2018-08-17_34_72'         : ([['fm1','cf2','fm2','cf3','fm3']], ['fm1']),
    'Aditya_2018-08-17_45_116'        : fm2cf2_to_fm2,
    'Aditya_2018-08-17_45_126'        : action_delete,
    'Aditya_2018-08-17_45_127'        : fm1cf2fm2_to_fm1,
    'Aditya_2018-08-17_45_173'        : cf1fm1_to_fm1,
    'Aditya_2018-08-17_45_200'        : cf1fm2cf2_to_cf1,
    'Aditya_2018-08-17_45_203'        : action_delete,
    'Aditya_2018-08-17_45_269'        : cf1fm1_to_fm1,
    'Aditya_2018-08-17_45_29'         : fm2cf2_to_fm2,
    'Aditya_2018-08-17_45_337'        : ([['fm2','cf2','fm3','cf3','fm4','cf4']], ['fm2']),
    'Aditya_2018-08-17_45_342'        : ([['cf1','fm1'], ['fm2','cf3']], ['fm1', 'fm2']),
    'Aditya_2018-08-19_0120-0200_110' : fm2cf2fm3_to_fm2,
    'Aditya_2018-08-19_0120-0200_112' : cf1fm1_to_fm1,
    'Aditya_2018-08-19_0120-0200_59'  : fm1cf2_to_fm1,
    'Aditya_2018-08-19_0120-0200_90'  : ([['cf1','fm1'], ['fm2','cf3']], ['fm1', 'fm2']),
    'Aditya_2018-08-19_23_16'         : ([['fm1', 'cf1', 'fm2','cf2']], ['cf1']),
    'Aditya_2018-08-19_23_17'         : fm2cf2fm3_to_fm2,
    'Aditya_2018-08-19_23_18'         : fm1cf1fm2_to_fm1,
    'Aditya_2018-08-19_23_39'         : fm2cf2fm3cf3_to_fm2,
    'Aditya_2018-08-20_0300-0400_56'  : fm2cf2fm3_to_fm2,
    'Aditya_2018-08-17_23_111'        : cf1fm2cf2_to_cf1,
    'Aditya_2018-08-17_23_133'        : fm2cf2_to_fm2,
    'Aditya_2018-08-17_23_14'         : ([['cf1', 'fm1'], ['fm2', 'cf3', 'fm3']], ['cf1', 'fm2']),
    'Aditya_2018-08-17_23_145'        : cf1fm2cf2_to_cf1,
    'Aditya_2018-08-17_23_15'         : cf1fm1_to_fm1,
    'Aditya_2018-08-17_23_173'        : action_delete,
    'Aditya_2018-08-17_23_196'        : ([['cf1','fm1'], ['fm2','cf3','fm3','cf4']], ['fm1','fm2']),
    'Aditya_2018-08-17_23_56'         : fm2cf2fm3_to_fm2,
    'Aditya_2018-08-17_23_70'         : ([['cf1','fm1'], ['fm2','cf2','fm3']], ['cf1', 'fm2']),
    'Aditya_2018-08-17_23_84'         : ([['cf1','fm2','cf2']], ['cf1'])
                }

In [10]:
len(corrections.keys())

63

In [11]:
audio_folder = '../hp_ind_calls/'

all_measurements  = []


for call_id, correction_data in corrections.items():
    try:
        eg = df_by_calls.get_group(call_id)
        old_regions_correction, reassigned_correction = correction_data
        if not np.logical_and(len(old_regions_correction)==0,
                          len(reassigned_correction)==0):
            
            cf, fm, corr = rmbg.create_correct_boolean_masks(eg, audio_folder, rmbg.multi_fuse_old_to_new_regions,
                                              old_regions_list=old_regions_correction,
                                              reassigned_regions_list=reassigned_correction)

            audio, fs = rmbg.load_audio_from_call_region(corr.reset_index(drop=True), audio_folder)

            custom_measures = [measure_peak_amplitude, measure_peak_frequency, measure_rms, measure_terminal_frequency]
            measurement_results = measure_hbc_call(audio, fs,
                                                    cf, fm,
                                                    measurements=custom_measures)
            eg2_reset = eg.reset_index()
            measurement_results['audio_file'] = eg2_reset['audio_file'][0]
            measurement_results['num_bats'] = eg2_reset['num_bats'][0]
            measurement_results['video_annot_id'] = eg2_reset['video_annot_id'][0]
            all_measurements.append(measurement_results)
    except:
        print(f"FAILED TO PROCESS {call_id}")

Match found!
Match found!
Match found!


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  corrected_region['region_id'] = reassigned_region
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  corrected_region['start'] = new_start
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  corrected_region['stop'] = new_stop


Match found!
FAILED TO PROCESS Aditya_2018-08-16_21502300_19
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Match found!
Mat

In [12]:
# put all corrected call measurements together :
corrected_call_measurements = pd.concat(all_measurements)
corrected_call_measurements.to_csv('correctly_handled_call_measurements.csv')

## References

- ### [1] Beleyur, T, itsfm : Identify, Track and Segment sound (by) Frequency (and its) Modulation, (Python package

In [13]:

print(f"This notebook run ended at : {dt.datetime.now()}")

This notebook run ended at : 2020-07-10 19:26:54.725636
