# Audio Common
**Thanks to the SF Study Group practitioners: @aamir7117 @marii @simonjhb @ste @ThomM @zachcaceres**

This module contains the data that is "common" to all other modules, such as basic types.

## Setup

### Settings

In [1]:
#Export
from pathlib import Path
import mimetypes
import torchaudio
from IPython.display import Audio

### Constants and definitions

In [2]:
#Export
audio_extensions = tuple(str.lower(k) for k,v in mimetypes.types_map.items() if v.startswith('audio/'))

In [3]:
#Export 
def getFastAiWorkingDirectory(folder, printWorkingDir=True):
    '''Returns the standard fast.ai working directory for a specific dataset'''
    path = Path.home()/'.fastai/data'/folder
    if printWorkingDir:
        # exists is a method: without the call the check is always truthy
        if path.exists(): print(f'Working directory: {path}')
        else: print('Missing data folder')
    return path

### Helper functions

In [4]:
#Export
def toNumpy(t):
    'Convenience function to simplify numpy interoperability'
    if t.is_cuda: t = t.cpu() #Should do .cpu before .numpy
    return t.numpy()

## AudioData

This is the base class of our audio data. It contains two basic pieces of information about the "sound":
* sig: the actual signal
* sr: the sample rate

**IMPORTANT:** the audio signal is expected to be one-dimensional, i.e. mono. If you have stereo recordings, you should downmix them to mono first: `AudioData` only flattens its input with `reshape(-1)`, it does not mix channels. Later, we could handle this as a preprocessing step, and/or handle stereo files natively.
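
As a minimal sketch of what downmixing means, assume the channels are available as equal-length sequences of samples. The helper name `downmix_to_mono` is hypothetical and not part of this module. Averaging sample by sample gives a mono signal, whereas simply flattening the channels only concatenates them:

```python
def downmix_to_mono(channels):
    """Average equal-length channels sample by sample (hypothetical helper)."""
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels)]

left = [0.0, 0.5, 1.0]
right = [1.0, 0.5, 0.0]

# Downmix: one output sample per input frame
mono = downmix_to_mono([left, right])

# Flattening keeps every sample of both channels, one channel after the other
flattened = left + right

print(len(mono), len(flattened))
```

For a torch tensor laid out as `[channels, samples]` (an assumption about what the loader returns), `sig.mean(dim=0)` would likely play the same role.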

In [5]:
#Export
class AudioData:
    '''Holds the basic information about an audio signal'''
    def __init__(self, sig, sr=16000): 
        self.sig = sig.reshape(-1) # We want single dimension data
        self.sr = sr

    def _repr_html_(self): 
        return self.ipy_audio._repr_html_()

    def hear(self, title=None):
        if title is not None: print(title)
        display(self.ipy_audio)

    @property
    def ipy_audio(self):
        return Audio(data=toNumpy(self.sig), rate=self.sr)
        
    @classmethod
    def load(cls, fileName, **kwargs):
        p = Path(fileName)
        # Only hand existing files with an audio extension to torchaudio
        if p.exists() and str(p).lower().endswith(audio_extensions):
            signal,samplerate = torchaudio.load(str(fileName))
            return cls(signal,samplerate) # cls keeps subclasses working
        raise Exception(f"Error while processing {fileName}: file not found or does not have valid extension: {audio_extensions}")

## Tests

VERY rudimentary. Ideally we would have sample data that we know will fail (e.g. non-audio data, audio data with the wrong extension, stereo samples).
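
One possible way to get a sample that is known to fail the mono check is to generate it, sketched here with only the standard library `wave` module. The helper names `write_stereo_wav` and `wav_channels`, the fixture file name and the sample values are all hypothetical; this is not part of the notebook's test suite:

```python
import struct
import tempfile
import wave
from pathlib import Path

def write_stereo_wav(path, n_frames=200, sr=16000):
    """Write a tiny 16-bit stereo WAV file to use as a failing fixture."""
    with wave.open(str(path), 'wb') as w:
        w.setnchannels(2)   # stereo on purpose
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(sr)
        # Each frame holds one left and one right sample, interleaved
        frames = b''.join(struct.pack('<hh', i, -i) for i in range(n_frames))
        w.writeframes(frames)

def wav_channels(path):
    """Read the channel count back from the WAV header."""
    with wave.open(str(path), 'rb') as w:
        return w.getnchannels()

tmp_dir = Path(tempfile.mkdtemp())
stereo_file = tmp_dir / 'stereo_fixture.wav'
write_stereo_wav(stereo_file)
n_channels = wav_channels(stereo_file)
print(f'{stereo_file.name}: {n_channels} channels')
```

A file like this could then be added to the test inputs as a sample that is not expected to pass as mono audio.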

### Sample data for our tests

In [6]:
#Export
def getSampleAudioDataFiles():
    from fastai import datasets #No cheating - used in data block notebook
    path = getFastAiWorkingDirectory("ST-AEDS-20180100_1-OS")
    data_url = 'http://www.openslr.org/resources/45/ST-AEDS-20180100_1-OS'
    datasets.download_data(url=data_url, fname=path)
    return path

In [7]:
path = getSampleAudioDataFiles()

Working directory: /home/ste/.fastai/data/ST-AEDS-20180100_1-OS


In [8]:
sample_file = path.ls()[0] # arbitrary choice of file
sample_file

PosixPath('/home/ste/.fastai/data/ST-AEDS-20180100_1-OS/m0005_us_m0005_00373.wav')

In [9]:
sampleAudioData = AudioData.load(sample_file)
display(sampleAudioData)
print(f'sig.shape={sampleAudioData.sig.shape} sr={sampleAudioData.sr}')

sig.shape=torch.Size([66560]) sr=16000


### More reasonable tests

In [10]:
def is_mono(a): assert 1 == len(a.sig.shape), "Not single dim"
def is_16kHz(a): assert 16000 == a.sr, "Not 16kHz"
def has_data(a): assert a.sig.shape[0] > 100, "Not more than 100 samples"

In [11]:
allTests = lambda x: [f(x) for f in [is_mono, is_16kHz, has_data]]

In [12]:
def test_AudioData_create_from_audio_file_path(f):
    a = AudioData.load(f)
    allTests(a)
    print(f"{f} passed loading from file")

In [13]:
def test_AudioData_create_from_data(f):
    signal,samplerate = torchaudio.load(f)
    a = AudioData(signal,samplerate)
    allTests(a)
    print(f"{f} passed loading from data")

In [14]:
inps = [sample_file, "badpath"] ## Should have bad_samples too
for inp in inps:
    try:
        test_AudioData_create_from_audio_file_path(inp)
        test_AudioData_create_from_data(inp)
    except Exception as e: print(e)

/home/ste/.fastai/data/ST-AEDS-20180100_1-OS/m0005_us_m0005_00373.wav passed loading from file
/home/ste/.fastai/data/ST-AEDS-20180100_1-OS/m0005_us_m0005_00373.wav passed loading from data
Error while processing badpath: file not found or does not have valid extension: ('.aif', '.aifc', '.aiff', '.au', '.mp2', '.mp3', '.ra', '.snd', '.wav')


<span style="color:red">**Careful: calling torchaudio.load() on a non-audio file crashes the kernel!**</span> That's why we check the extension in `AudioData.load`.
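
The guard can be sketched in isolation with only the standard library. `is_loadable_audio` is a hypothetical helper name; it mirrors the condition used in `AudioData.load` and, like it, only looks at the path and the extension, not at the file contents:

```python
import mimetypes
import tempfile
from pathlib import Path

# Same construction as the module-level constant defined earlier
audio_extensions = tuple(k.lower() for k, v in mimetypes.types_map.items()
                         if v.startswith('audio/'))

def is_loadable_audio(file_name):
    """True if the path exists and ends with a known audio extension."""
    p = Path(file_name)
    return p.exists() and str(p).lower().endswith(audio_extensions)

tmp_dir = Path(tempfile.mkdtemp())
text_file = tmp_dir / 'notes.txt'
text_file.write_text('not audio')
fake_wav = tmp_dir / 'clip.wav'
fake_wav.write_bytes(b'')  # right extension, but no audio inside

print(is_loadable_audio(text_file))                # rejected: wrong extension
print(is_loadable_audio(tmp_dir / 'missing.wav'))  # rejected: file does not exist
print(is_loadable_audio(fake_wav))                 # accepted: only the extension is checked
```

The last case shows the limit of this check: a non-audio file saved with an audio extension still gets through.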

## Export

In [15]:
!python notebook2script.py 08za_AudioCommon.ipynb

Converted 08za_AudioCommon.ipynb to exp/nb_08za.py
