# Install AVA

#### In your terminal:

>git clone https://github.com/pearsonlab/autoencoded-vocal-analysis \
>cd autoencoded-vocal-analysis \
>pip install .

Taken from https://github.com/pearsonlab/autoencoded-vocal-analysis

In [1]:
# If importing ava raises a ModuleNotFoundError, un-comment the lines below and set ava_path to where you cloned the repo
# import sys
# ava_path = '/Users/rep359/code/autoencoded-vocal-analysis/' # path to the cloned ava directory
# sys.path.append(ava_path) # adds the location of ava to your Python path

import os
from ava.segmenting.segment import segment, tune_segmenting_params
from ava.segmenting.amplitude_segmentation import get_onsets_offsets
from ava.segmenting.utils import get_spec, softmax, clean_segments_by_hand

import vocalization_segmenting as vs

# Preprocessing Step 1: set initial segmentation parameters

For this step, you will need a single audio recording with vocalizations in it. A set of default segmentation parameters has been chosen as a starting point; you will optimize these parameters for your particular dataset in step 2.

In [2]:
#path to an audio file of interest
wav_path = '2020_07_22_15_52_33_369348_merged.wav'

In [3]:
percentiles, sr = vs.get_spec_min_max(wav_path, start_s=867.442, stop_s=867.442+15) #I picked a portion of the file where there were lots of vocalizations

seg_params = {
    'min_freq': 500, # minimum frequency (Hz)
    'max_freq': 62.5e3, # maximum frequency (Hz); here equal to fs / 2
    'nperseg': 512, # FFT window length (samples)
    'noverlap': 256, # overlap between FFT windows (samples)
    'spec_min_val': -8, # minimum log-spectrogram value
    'spec_max_val': -7.25, # maximum log-spectrogram value
    'fs': 125000, # audio samplerate (Hz)
    'th_1': 2, # segmenting threshold 1
    'th_2': 5, # segmenting threshold 2
    'th_3': 2, # segmenting threshold 3
    'min_dur': 0.03, # minimum syllable duration (s)
    'max_dur': 0.3, # maximum syllable duration (s)
    'smoothing_timescale': 0.007, # amplitude smoothing timescale (s)
    'softmax': False, # apply softmax to the frequency bins to calculate
                      # amplitude
    'temperature': 0.5, # softmax temperature parameter
    'algorithm': get_onsets_offsets, # segmenting function (imported above)
}
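
To get a feel for what 'nperseg', 'noverlap' and 'fs' imply, here is a small sketch of the spectrogram resolution they produce. It assumes the parameters are used as in a standard short-time Fourier transform; `stft_resolution` is a hypothetical helper I made up for illustration, not part of AVA.

```python
def stft_resolution(fs, nperseg, noverlap):
    """Time and frequency resolution of a standard STFT (assumed, not AVA code)."""
    hop = nperseg - noverlap              # samples between consecutive windows
    return {
        'time_step_s': hop / fs,          # spacing between spectrogram columns
        'window_s': nperseg / fs,         # duration of one FFT window
        'freq_step_hz': fs / nperseg,     # spacing between frequency bins
        'nyquist_hz': fs / 2,             # highest representable frequency
    }

# Values copied from seg_params above
res = stft_resolution(fs=125000, nperseg=512, noverlap=256)
for name, value in res.items():
    print(name, value)
```

With the values above, spectrogram columns are about 2 ms apart and frequency bins are about 244 Hz wide, so a 'min_dur' of 0.03 s spans roughly 15 columns.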

# Preprocessing Step 2: tune parameters + segment

Click through the initial parameter tuning cycle below (no need to change anything the first time around), then navigate to your audio directory and look at 'tuning.pdf'. Adjust 'th_1' and 'th_3' to threshold values that make sense given the vocalization amplitude traces. I usually keep them the same; more info [here](https://autoencoded-vocal-analysis.readthedocs.io/en/latest/segment.html). Re-run the tuning cycle. Do the onsets (blue) and offsets (red) look correct? If so, move on. If not, continue tuning.
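
If it helps to see why the threshold values matter, here is a deliberately simplified sketch of amplitude-threshold segmentation. It uses a single threshold plus the duration limits, so it is not AVA's `get_onsets_offsets` (which uses three thresholds); `threshold_segments` and the toy amplitude trace are hypothetical and only for illustration.

```python
def threshold_segments(amplitude, dt, threshold, min_dur, max_dur):
    """Return (onset, offset) times where the trace stays above threshold.

    Simplified single-threshold illustration, not AVA's algorithm.
    amplitude: sequence of amplitude values, one per time step
    dt: time between consecutive values, in seconds
    """
    segments = []
    start = None
    # A trailing -inf closes any segment still open at the end of the trace.
    for i, value in enumerate(list(amplitude) + [float('-inf')]):
        above = value > threshold
        if above and start is None:
            start = i                       # upward crossing: candidate onset
        elif not above and start is not None:
            onset, offset = start * dt, i * dt
            if min_dur <= offset - onset <= max_dur:
                segments.append((onset, offset))   # keep plausible durations only
            start = None
    return segments

# Toy trace: one 50 ms event and one 10 ms blip
trace = [0, 0, 3, 3, 3, 3, 3, 0, 0, 3, 0]
segments = threshold_segments(trace, dt=0.01, threshold=2, min_dur=0.03, max_dur=0.3)
print(segments)
```

In this toy example the 10 ms blip is dropped by 'min_dur'. Raising the threshold above the blip's amplitude would remove it as well, which is the kind of trade-off to look for in the tuning plots.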

In [4]:
audio_directories = [os.path.dirname(os.path.abspath(wav_path))] # list of audio directories
tuning_fn = os.path.join(audio_directories[0], 'tuning.pdf') # tuning plot is saved next to the audio
seg_params = tune_segmenting_params(audio_directories, seg_params, img_fn=tuning_fn)

Tune segmenting parameters
---------------------------
Set value for min_freq: [500] 
Set value for max_freq: [62500.0] 
Set value for nperseg: [512] 
Set value for noverlap: [256] 
Set value for spec_min_val: [-8] 
Set value for spec_max_val: [-7.25] 
Set value for fs: [125000] 
Set value for th_1: [2] 
Set value for th_2: [5] 
Set value for th_3: [2] 
Set value for min_dur: [0.03] 
Set value for max_dur: [0.3] 
Set value for smoothing_timescale: [0.007] 
Set value for temperature: [0.5] 
searching
searching
searching
Continue? [y] or [s]top tuning or [r]etune params: s


In [None]:
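audio_dir = os.path.dirname(os.path.abspath(wav_path)) # directory containing the audio file
segment(audio_dir, os.path.join(audio_dir, 'segment'), seg_params) # segments are written to the 'segment' subdirectory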

# Processing (parallel): get onsets/offsets for many files

In [None]:
audio_dirs = ['cohort2_combined_audio',
              'cohort4_combined_audio',
              'cohort5_combined_audio']

segment_dirs = [os.path.join(audio_dir, 'segments') for audio_dir in audio_dirs]

from joblib import Parallel, delayed
from itertools import repeat

gen = zip(audio_dirs, segment_dirs, repeat(seg_params))
Parallel(n_jobs=-1, verbose=11)(delayed(segment)(*args) for args in gen)
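
To check what each parallel job will receive before launching it, the argument tuples can be built as a list and inspected first. This is a minimal sketch using only the standard library; the directory names are copied from the cell above, `params` is a hypothetical stand-in for `seg_params`, and `jobs` is a name I introduce here.

```python
import os
from itertools import repeat

audio_dirs = ['cohort2_combined_audio', 'cohort4_combined_audio']
segment_dirs = [os.path.join(audio_dir, 'segments') for audio_dir in audio_dirs]
params = {'th_1': 2, 'th_3': 2}  # hypothetical stand-in for seg_params

# zip stops at the shortest input, so repeat(params) is paired with each directory
jobs = list(zip(audio_dirs, segment_dirs, repeat(params)))
for audio_dir, segment_dir, job_params in jobs:
    print(audio_dir, '->', segment_dir)
```

Unlike the generator returned by `zip`, a list can be printed and then reused, which is handy if the parallel cell needs to be re-run.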

# Vocalization extraction: compute spectral flatness

In [None]:
from glob import glob
fns = glob('*/*.wav') # get paths to all the audio files, one directory level down

In [None]:
# loop through all your files and separate vocalizations from non-vocalizations using spectral flatness
for fn in fns:
    vs.filter_segments(fn)
    print()
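
For reference, spectral flatness is commonly defined as the geometric mean of a power spectrum divided by its arithmetic mean. The sketch below shows that textbook definition only; `spectral_flatness` is a hypothetical helper, and how `vs.filter_segments` computes or thresholds flatness is not shown in this notebook.

```python
import math

def spectral_flatness(power_spectrum, eps=1e-12):
    """Geometric mean / arithmetic mean of a power spectrum (textbook definition).

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate energy concentrated in a few frequency bins.
    """
    values = [p + eps for p in power_spectrum]       # eps avoids log(0)
    log_mean = sum(math.log(v) for v in values) / len(values)
    arithmetic_mean = sum(values) / len(values)
    return math.exp(log_mean) / arithmetic_mean

flat = spectral_flatness([1.0] * 8)                  # equal power in every bin
peaked = spectral_flatness([1.0] + [1e-6] * 7)       # almost all power in one bin
print(flat, peaked)
```

Tonal sounds generally give low flatness and broadband noise gives values closer to 1, which is why flatness can help separate the two.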