## Parse and Convert Audio Files
---

#### Import Libraries

In [3]:
import os
import pydub
import pandas as pd
from scipy.io import wavfile
from pydub import AudioSegment
from pydub.silence import split_on_silence, detect_silence
from pydub.playback import play
import time
# import dolby

#### Define ffmpeg and ffprobe Path for PyDub Converter

In [5]:
pydub.AudioSegment.converter = '/Users/YOUR-USER/anaconda3/bin/ffmpeg'
pydub.AudioSegment.ffmpeg = '/Users/YOUR-USER/anaconda3/bin/ffmpeg'
pydub.AudioSegment.ffprobe = '/Users/YOUR-USER/anaconda3/bin/ffprobe'
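
The hard-coded paths above only work on one machine. As a minimal sketch, assuming ffmpeg and ffprobe may already be on the system `PATH`, the locations could be looked up first, falling back to the Anaconda paths otherwise. The helper name `resolve_binary` is hypothetical and not part of pydub.

```python
import shutil

def resolve_binary(name, fallback=None):
    """Return the full path of `name` if it is on PATH, otherwise `fallback`."""
    return shutil.which(name) or fallback

# Fallbacks are the placeholder paths from the cell above
ffmpeg_path = resolve_binary("ffmpeg", "/Users/YOUR-USER/anaconda3/bin/ffmpeg")
ffprobe_path = resolve_binary("ffprobe", "/Users/YOUR-USER/anaconda3/bin/ffprobe")

print(ffmpeg_path)
print(ffprobe_path)
```

The resulting strings can then be assigned to the same `pydub.AudioSegment` attributes used in the cell above.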

We used the pydub Python package to parse and segment our mp3 files. Pydub accepts mp3 files and has methods that detect silence, detect audio, and split audio on silence. We used the last of these, `split_on_silence`, to pull samples of audio from our data. We passed in two parameters: `min_silence_len` and `silence_thresh`. These determine how long a pause must last, in milliseconds, before it is registered as silence, and how loud something must be, in dBFS, before it is registered as audio. We experimented with several `min_silence_len` and `silence_thresh` values and settled on 2.5 seconds and -16 dBFS for our dataset. Note that the conversion cell further down passes `min_silence_len = 4_000` and a `silence_thresh` set 2 dB below each file's average loudness.
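
To make the interaction of the two parameters concrete, here is a simplified sketch in plain Python. It is not pydub's implementation (pydub measures RMS loudness over windows of audio); it assumes a toy list with one loudness value in dBFS per millisecond, and the function name `find_silences` is hypothetical.

```python
def find_silences(loudness_dbfs, min_silence_len, silence_thresh):
    """Return (start, end) index pairs for runs quieter than `silence_thresh`
    that last at least `min_silence_len` samples. `end` is exclusive."""
    silences = []
    run_start = None
    for idx, level in enumerate(loudness_dbfs):
        if level < silence_thresh:
            # Quiet sample: open a run if none is open
            if run_start is None:
                run_start = idx
        else:
            # Loud sample: close the run and keep it only if long enough
            if run_start is not None and idx - run_start >= min_silence_len:
                silences.append((run_start, idx))
            run_start = None
    # Handle a quiet run that reaches the end of the recording
    if run_start is not None and len(loudness_dbfs) - run_start >= min_silence_len:
        silences.append((run_start, len(loudness_dbfs)))
    return silences

# Toy data: speech, a short pause, speech, then a long pause
levels = [-10] * 5 + [-40] * 2 + [-12] * 4 + [-45] * 6
print(find_silences(levels, min_silence_len=3, silence_thresh=-16))
# prints [(11, 17)]
```

The two-sample pause is ignored because it is shorter than `min_silence_len`, while the six-sample pause is reported. Raising `silence_thresh` treats more of the signal as silence; raising `min_silence_len` produces fewer, longer chunks.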

### Clean up the audio
We used Dolby's dialogue enhancement API to clean up the audio in three ways:
<details><summary>Noise Reduction</summary>
    Capturing audio can pick up a variety of noises that distract from the dialog the listener wants to focus on.  The Noise Processing API helps you fix audio recorded in noisy environments or with poor equipment to create consistency across your recordings.</details>
<details><summary>Sibilance Attenuation</summary>
    To create studio sound for the dialog in recorded audio, it is useful to detect and reduce the sibilance captured by the microphone.  Sibilance is characteristic of harsh consonant sounds like "s", "sh", "x", "ch", "t", and "th". </details>
<details><summary>Tone Repair</summary>
    To give the audio more presence and authority, it is valuable to modify the tone.  The Tone Processing API helps with equalization to shape the audio from your recorded files to match a listener profile.  This is especially important when the original audio is limited in tone, as is the case with police radio, where voices often sound thin and tinny.</details>

The Dolby API fetches its input from a URL and saves the repaired audio to a URL, so the audio files must be hosted online.  We used Amazon Web Services (AWS) S3 storage to upload and host the files, which were then saved back into the repo.  The code below passes each audio file through noise reduction, then sibilance attenuation, and then tone correction.

In [None]:
# File path may vary
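# Keep only the mp3 files (this also skips hidden files such as .DS_Store)
files = [f for f in os.listdir('../testing/mp3_downloads/') if f.endswith('.mp3')]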

# Folder names on the AWS server.  Audio files will be opened and saved into the subsequent
# folder as they flow through the dolby API processes.
chain = ["rough","noise","sibilance","tone"]

# These processes are described above, see https://dlby.io/documentation/noise/getting-started
# for documentation. This cell needs the `dolby` import that is commented out in the imports cell.
process = [dolby.noise, dolby.sibilance, dolby.tone]

for i,link in enumerate(chain[1:]):
    
    # URL link is to AWS server
    for name in files:
        token = {
            "input": {
                "url": f"https://username.s3-us-west-1.amazonaws.com/{chain[i]}/{name}",
                "auth": {
                    "key": "XXXX",                # AWS Key
                    "secret": "XXXXXxxXX",        # AWS Secret
                    # "token": "XXXX"             # AWS Token
                }
            },
            "output": {
                "url": f"s3://username/{link}/{name}",
                "auth": {
                    "key": "XXXX",                # AWS Key
                    "secret": "XXXXxxXXXX",       # AWS Secret
                    # "token": "XXXxXXXX"         # AWS Token
                }
            }
        }

        process[i](key='XxxXXxX',       # Dolby API Key
                    url='https://api.dolby.com',
                    input_file = f"https://username.s3-us-west-1.amazonaws.com/{chain[i]}/{name}",
                    output_file= f"s3://username/{link}/{name}",   # AWS folders with Audio
                    more = token
                    )
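
The folder hand-off in the loop above can be sketched without calling the API. This is an illustration only: it pairs each folder in `chain` with the next one and builds the same input and output URLs for a single file. The function name `stage_urls` and the file name `example.mp3` are hypothetical, and the bucket name `username` is the placeholder from the cell above.

```python
chain = ["rough", "noise", "sibilance", "tone"]

def stage_urls(name, bucket="username", region="us-west-1"):
    """Return (stage, input_url, output_url) for each processing step of one file."""
    stages = []
    # zip pairs each folder with its successor: rough->noise, noise->sibilance, ...
    for source, target in zip(chain, chain[1:]):
        input_url = f"https://{bucket}.s3-{region}.amazonaws.com/{source}/{name}"
        output_url = f"s3://{bucket}/{target}/{name}"
        stages.append((target, input_url, output_url))
    return stages

for stage, input_url, output_url in stage_urls("example.mp3"):
    print(f"{stage}: {input_url} -> {output_url}")
```

Each step reads from the folder the previous step wrote to, so the output of noise reduction becomes the input of sibilance attenuation, and so on.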



#### Parse Through Files and Convert to .wav

In [4]:
# Starting time
t0 = time.time()

# Insert file path with the cleaned mp3 files
file_path = '../testing/mp3_clean/'

# Loop through all files in file_path
for n, file in enumerate(os.listdir(file_path)):
    
    # Starting time
    t1 = time.time()
    
    # If file extension is .mp3
    if file.endswith('.mp3'):
        
        # Define audio to be split
        sound_file = AudioSegment.from_mp3(file_path + file)
        
        # Print file n, length and avg loudness for each file
        print(f'Analyzing File: {n}')
        print(f'File Length: {len(sound_file) / 1000}s')
        print(f'File Average Loudness: {round(sound_file.dBFS, 2)} dBFS')
        
        # Split files into smaller chunks
        chunks = split_on_silence(sound_file,                           # sound_file to be split
                                  min_silence_len = 4_000,              # min_silence_len in ms
                                  silence_thresh = sound_file.dBFS - 2, # silence thresh based on avg loudness
                                  keep_silence = 100)                   # keep 100ms of silence before and after
        
        # Loop through all chunks
        for i, chunk in enumerate(chunks):
            
            # Export chunks that are longer than 5 seconds and shorter than 60 seconds to .wav
            if (len(chunk) > 5_000) and (len(chunk) < 60_000):
                chunk.export(f"../testing/wav_clean_converted_files/{file}_{i}.wav", format="wav")
        
        # Print runtime for this file
        print(f'File {n} RunTime: {round(time.time() - t1, 2)}s')
        print(' ')
        
# Print total runtime
print(f'Total RunTime:{round(time.time() - t0, 2)}s')

Analyzing File: 0
File Length: 962.014s
File Average Loudness: -29.04 dBFS
File 0 RunTime: 150.41s
 
Analyzing File: 1
File Length: 871.996s
File Average Loudness: -28.41 dBFS
File 1 RunTime: 155.04s
 
Analyzing File: 2
File Length: 818.0s
File Average Loudness: -28.67 dBFS
File 2 RunTime: 134.34s
 
Analyzing File: 3
File Length: 771.999s
File Average Loudness: -28.2 dBFS
File 3 RunTime: 127.67s
 
Analyzing File: 4
File Length: 856.009s
File Average Loudness: -28.64 dBFS
File Average Loudness: -28.64 dBFS
File 4 RunTime: 132.14s
 
Total RunTime:699.6s
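
The duration filter and output naming in the cell above can be checked in isolation. The sketch below works on chunk lengths in milliseconds rather than real audio. One variation from the cell above, labeled here as an assumption: it drops the `.mp3` extension from the output name using `os.path.splitext`, whereas the original keeps it (producing names ending in `.mp3_0.wav`). The helper names and the file name `feed_01.mp3` are hypothetical.

```python
import os

MIN_MS, MAX_MS = 5_000, 60_000  # same bounds as the export condition above

def chunk_output_path(mp3_name, index, out_dir="../testing/wav_clean_converted_files"):
    """Build the .wav path for chunk `index`, without the source extension."""
    stem, _ = os.path.splitext(mp3_name)
    return f"{out_dir}/{stem}_{index}.wav"

def select_chunks(chunk_lengths_ms):
    """Return indices of chunks longer than MIN_MS and shorter than MAX_MS."""
    return [i for i, length in enumerate(chunk_lengths_ms) if MIN_MS < length < MAX_MS]

lengths = [1_200, 7_500, 60_000, 42_000]  # toy chunk lengths in ms
kept = select_chunks(lengths)
print(kept)  # prints [1, 3]
print([chunk_output_path("feed_01.mp3", i) for i in kept])
```

Both bounds are strict, so a chunk of exactly 60 seconds is discarded, matching the condition in the loop above.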


**Code Adapted from:** [Mitchell Bohman, Nour Zahlan, and Masiur Abik](https://github.com/mchbmn/radio-to-location) and [Joseph Hopkins, Carol Chiu, Anthony Chapman, and Kwamae Delva](https://github.com/delvakwa/police_radio_to_mapping)