# Raw Audio Files to Audio Clips for Feature Extraction 
Kartik Nanda, Feb 2020

Read in the downloaded (raw) audio files and save them as clips of constant duration. These clips are
then used for feature extraction and further analysis.

In [63]:
import numpy as np
import glob
import os
from pathlib import Path
import librosa
import librosa.display
import soundfile as sf
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')

## Define Constants 

In [64]:
# Important constants (TODO: move these to a separate, shared file)
n_fft = 2048            # STFT window length: number of samples per FFT
sample_rate = [44100, 8192, 1024]  # candidate sample rates in Hz; only sample_rate[0] is used below
cdur = 4                # clip duration in seconds
n_chroma = 12           # num of chroma bins to produce
n_mels = 128            # num of mel freq bands output
n_bands = 6             # num of contrast bands
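
With these constants, one clip holds `sr * cdur` samples, and a recording yields as many clips as whole multiples of that length fit into it. A small worked example of the arithmetic follows. The constants are repeated so the snippet runs on its own, and `samples_per_clip` and `whole_clips` are illustrative names that do not appear elsewhere in this notebook.

```python
# Constants repeated from the cell above so this snippet is self-contained.
sample_rate = [44100, 8192, 1024]
cdur = 4

# Samples in one clip at each sample rate (illustrative dict).
samples_per_clip = {sr: sr * cdur for sr in sample_rate}
print(samples_per_clip)   # {44100: 176400, 8192: 32768, 1024: 4096}

def whole_clips(n_samples, sr, cdur):
    """Number of complete cdur-second clips in n_samples samples; the remainder is not counted."""
    return n_samples // (sr * cdur)

# 2258116 samples at 44100 Hz is the blue_whale recording in the log further down.
print(whole_clips(2258116, 44100, cdur))   # 12
```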

## Path Setup

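We use the data in the datasets folder (underwater audio). The downloaded recordings are read from `datasets/downloaded` and the fixed-length clips are written to `datasets/data_uw_sounds`; feature extraction from those clips is a later step.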

The folder structure:

    audio_anomaly_model/audio_baseline/ contains this script
    audio_anomaly_model/datasets/ contains the dataset
    audio_anomaly_model/features/ contains the features output


In [65]:
# Define paths
path_proj = Path.cwd().parent                        # The project directory
path_raw_audio = path_proj/'datasets'/'downloaded'   # The downloaded audio files
path_dataset = path_proj/'datasets'/'data_uw_sounds' # Dataset for Under-Water Audio 
path_features = path_proj/'features'                 # The output folder - where the features are saved (not used here)
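
The clips are later written into `path_dataset` with `sf.write`, which, to my knowledge, does not create missing directories, so the folder has to exist before the first write. A minimal sketch, assuming it is acceptable to create the folder on the fly (`ensure_dir` is a hypothetical helper, not part of this notebook):

```python
from pathlib import Path

def ensure_dir(path):
    """Create `path` and any missing parents if needed; return it as a Path."""
    path = Path(path)
    path.mkdir(parents=True, exist_ok=True)   # no error if it already exists
    return path

# Intended use in this notebook (assumes path_dataset is defined as in the cell above):
# ensure_dir(path_dataset)
```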

In [66]:
# List of all the directories (one per class) which contain downloaded audio files
file_paths = sorted(p for p in path_raw_audio.glob('*') if p.is_dir())
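
The next cell cuts each loaded recording into fixed-length clips. The slicing on its own, without any file I/O, can be sketched as below. This is an illustration on synthetic data rather than the notebook's actual loop, and `split_into_clips` is a hypothetical helper name.

```python
import numpy as np

def split_into_clips(x, sr, cdur):
    """Cut a 1-D signal into consecutive clips of cdur seconds; drop the trailing partial clip."""
    clip_len = sr * cdur
    num_clips = len(x) // clip_len
    return [x[i * clip_len:(i + 1) * clip_len] for i in range(num_clips)]

# Synthetic stand-in for a loaded recording: 9.5 seconds of noise at 1024 Hz.
sr, cdur = 1024, 4
x = np.random.default_rng(0).standard_normal(int(9.5 * sr))

clips = split_into_clips(x, sr, cdur)
print(len(clips), clips[0].shape)   # 2 (4096,)
```

The last 1.5 seconds of the synthetic signal are discarded, which mirrors how the integer clip count in the notebook ignores any remainder shorter than `cdur`.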

In [67]:
# Read in the downloaded audio files, split them into cdur-second clips, and save each clip as a separate file
sr = sample_rate[0]
clip_len = sr * cdur                              # samples per clip
for p in file_paths:
    j = 0                                         # running clip index within this folder (class)
    files = list(p.glob('*.*'))
    print('*************** Folder (class): ', p.name, ' *******************')
    for f in files:
        if f.suffix.lower() != '.mp3':
            print('Audio File (downloaded): ', f.name)
            X, _sr = librosa.load(f, sr=sr)       # resample to sr while loading
            print('Org. Sample rate: ', librosa.get_samplerate(f), ', Audio clips resampled at:', _sr)
            print('Number of audio samples: ', X.shape[0])
            num_clips = X.shape[0] // clip_len    # the trailing partial clip is dropped
            print('Writing ... number of clips: ', num_clips)
            for i in range(num_clips):
                outfile_name = p.name + '-' + str(i+j) + '.wav'
                sf.write(path_dataset/outfile_name, X[i*clip_len:(i+1)*clip_len], sr)
            j += num_clips                        # advance the index; safe even when num_clips is 0
        else:
            print('File (', f.name, ') is mp3, skipped ...')


*************** Folder (class):  beaked_whale  *******************
File ( 404315__mbari-mars__beaked-whale-clicks.mp3 ) is mp3, skipped ...
*************** Folder (class):  blue_whale  *******************
Audio File (downloaded):  403844__mbari-mars__blue-whale-a-and-b-calls-audible-only-with-appropriate-speakers.wav
Org. Sample rate:  48000 , Audio clips resampled at: 44100
Number of audio samples:  2258116
Writing ... number of clips:  12
*************** Folder (class):  dolphin  *******************
Audio File (downloaded):  265257__aguasonic__ag25feb015-trk05.wav
Org. Sample rate:  48000 , Audio clips resampled at: 44100
Number of audio samples:  1877837
Writing ... number of clips:  10
Audio File (downloaded):  324114__listeningtowhales__07-bottlenose-dolphins-with-humpback-whales.aiff
Org. Sample rate:  44100 , Audio clips resampled at: 44100
Number of audio samples:  3429804
Writing ... number of clips:  19
Audio File (downloaded):  385796__geraldfiebig__atlantic-spotted-dolphins