## Intro

When Librosa loads an audio file, it resamples the signal to a default sample rate of 22,050 Hz unless told otherwise. The sample rate is the number of samples taken from the audio signal per second. Therefore,
to get the length of a clip in seconds, we take the length of the loaded array and divide it by the sample rate used for that load.
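
As a minimal sketch of that arithmetic, the snippet below uses a synthetic array in place of a real loaded signal. The array length and the 16 kHz rate are assumptions chosen for illustration, not values read from the dataset.

```python
import numpy as np

sr = 16000  # assumed sample rate in Hz
y = np.zeros(8000, dtype=np.float32)  # stand-in for a loaded mono signal

# Duration in seconds = number of samples / samples per second
duration_seconds = len(y) / sr
print(duration_seconds)  # 0.5
```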

In [None]:
# system packages
import os
import warnings
import timeit

# data stuff
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# tensorflow and keras
import tensorflow as tf
from tensorflow.keras.utils import to_categorical

## sklearn stuff
from sklearn.preprocessing import LabelEncoder

## Audio Stuff
import librosa
import librosa.display
import torchaudio

# My functions and classes
from utilities import view_melspec, read_metadata_file, Batch_generator

warnings.filterwarnings("ignore", category=UserWarning)

In [None]:
# Global settings shared by every cell below
audio_dir = './data/fma_small/'
global_dur = 0.5   # seconds of audio to load per file
global_sr = 16000  # target sample rate in Hz

The next cell uses a super handy Librosa function that recursively searches the audio directory for every audio file.
My audio folder has a bunch of subfolders like '000', '099', etc., and the audio files are
one level further down.

In [None]:
filepaths = librosa.util.files.find_files(audio_dir)
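
For readers without Librosa at hand, a rough standard-library equivalent might look like the sketch below. The helper name `find_mp3_files` and the file names in the demonstration are hypothetical, and the sketch only matches `.mp3` files, whereas `find_files` also picks up other audio extensions.

```python
import os
import tempfile
from pathlib import Path

def find_mp3_files(root):
    """Return a sorted list of every .mp3 path under root, at any depth."""
    return sorted(str(p) for p in Path(root).rglob("*.mp3"))

# Tiny demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as tmp:
    for sub, name in [("000", "000002.mp3"), ("099", "099134.mp3")]:
        os.makedirs(os.path.join(tmp, sub), exist_ok=True)
        Path(tmp, sub, name).touch()
    Path(tmp, "000", "notes.txt").touch()  # ignored: not an mp3
    found = find_mp3_files(tmp)
    print([os.path.basename(p) for p in found])
```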

Here is what a typical load looks like. Passing `sr=global_sr` resamples the audio to that rate; specifying `sr=None` instead would preserve the file's native sample rate.
We can also limit how many seconds are read with the `duration` argument of `librosa.load`.

In [None]:
example_source, sr = librosa.load(filepaths[0], sr=global_sr, duration=global_dur)
# Pass the signal by keyword; recent Librosa versions reject it as a positional argument
librosa.get_duration(y=example_source, sr=sr)

In [None]:
plt.plot(example_source)
plt.title(f"The first {librosa.get_duration(y=example_source, sr=sr)} seconds of an audio file")
plt.xlabel("Sample Position")
plt.ylabel("Amplitude")
plt.show()

In [None]:
view_melspec(example_source, sr)

## How long does a typical load take?

In [None]:
%%timeit

librosa.load(filepaths[0], sr=None, duration=global_dur)
# librosa.get_samplerate(filepaths[0])

On my Linux virtual machine, loading one second of audio takes between 143 and 214 milliseconds (down to 129 ms if we don't force a sample rate).
FMA Small has 8000 songs, so multiplying the per-file time in milliseconds by 8 gives the total load time in seconds. That works out to roughly 19 to 29 minutes for the whole FMA Small dataset, or about 17 minutes without resampling.

However, on my MacBook it takes about 16 ms to load one second of audio, and about 34 ms for five seconds. This is much faster than the Ubuntu machine! It brings the estimate down to about 134 seconds, or roughly two minutes.
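
The estimates above come from simple scaling, which can be sketched as follows. The helper name `estimate_total_seconds` is hypothetical, and the per-file timings are the rounded figures quoted in the text, so the results may differ slightly from the totals given there.

```python
def estimate_total_seconds(ms_per_file, n_files=8000):
    """Scale a per-file load time in milliseconds up to the whole dataset."""
    return ms_per_file * n_files / 1000.0

# Rounded per-file timings taken from the measurements quoted above
timings_ms = [("Linux VM, fast", 143), ("Linux VM, slow", 214), ("MacBook", 16)]
for label, ms in timings_ms:
    total = estimate_total_seconds(ms)
    print(f"{label}: {total:.0f} s (about {total / 60:.1f} min)")
```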

In [None]:
%%timeit 

torchaudio.info(filepaths[0])

This runs a lot faster, at about 270 microseconds on my Ubuntu virtual machine, so it is a better tool for
checking file integrity. On the MacBook it takes about 330 microseconds, a bit slower.


## Check for integrity

This is the only point where I use `torchaudio`, and only for its `torchaudio.info` function. This function
tries to open the file and read its metadata without loading the audio into memory. This allows us to check
for corrupted mp3 files. If this leads to a very long list of files, double-check the integrity
of your download.

In [None]:
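bad_files = []
too_short_duration = 5.0  # seconds
for file in filepaths:
    try:
        # Keep only the first element of what torchaudio.info returns here
        info_obj = torchaudio.info(file)[0]

        # Add a file to the bad list if it is shorter than too_short_duration.
        # Dividing length by rate * channels converts the sample count to seconds.
        if info_obj.length / (info_obj.rate * info_obj.channels) < too_short_duration:
            bad_files.append(file)

    except RuntimeError:
        # The file could not be opened, so treat it as corrupted
        bad_files.append(file)
bad_files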
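
The same filtering logic can be sketched independently of any audio library, which makes it easy to try out. Everything here is hypothetical: `find_bad_files`, the `probe_duration` callback, and the fake file table are stand-ins for a real metadata reader, not part of `torchaudio`.

```python
def find_bad_files(paths, probe_duration, min_duration=5.0):
    """Collect paths that fail to open or are shorter than min_duration seconds."""
    bad = []
    for path in paths:
        try:
            if probe_duration(path) < min_duration:
                bad.append(path)
        except RuntimeError:
            # An unreadable file counts as bad
            bad.append(path)
    return bad

# Hypothetical stand-in for a metadata reader: path -> duration in seconds
fake_durations = {"a.mp3": 30.0, "b.mp3": 2.5}

def fake_probe(path):
    if path not in fake_durations:
        raise RuntimeError(f"could not open {path}")
    return fake_durations[path]

result = find_bad_files(["a.mp3", "b.mp3", "c.mp3"], fake_probe)
print(result)  # ['b.mp3', 'c.mp3']
```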

## Estimating memory cost

In [None]:
# Find out how much one second costs in memory
example_source, sr = librosa.load(filepaths[0], sr=None, duration=1)

# Estimate the cost of holding one second from each of the 8000 songs
print(f"{example_source.nbytes * 8000 / 10**9:.2f} GB of memory for one second of all 8000 songs")  # gigabytes
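
The same estimate can be worked out without loading anything, as in the sketch below. It assumes mono `float32` samples (4 bytes each), which is what `librosa.load` returns by default; the helper name `estimate_memory_gb` and the 30-second clip length in the second call are assumptions for illustration.

```python
def estimate_memory_gb(sr, duration_s, n_files=8000, bytes_per_sample=4, channels=1):
    """Estimate memory in GB for n_files clips of duration_s seconds at rate sr."""
    n_samples = int(sr * duration_s) * channels
    return n_samples * bytes_per_sample * n_files / 10**9

print(estimate_memory_gb(16000, 0.5))   # half a second per file at 16 kHz
print(estimate_memory_gb(16000, 30.0))  # assuming 30-second clips at 16 kHz
```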

## Managing Metadata 

The following code takes the giant metadata file that comes with the FMA datasets and selects
the track IDs and the genres.

In [None]:
metadata_path = os.path.join('data','fma_metadata', 'tracks.csv')

# See utilities.py for explanations
reduced_meta = read_metadata_file(metadata_path, filepaths, bad_files)

In [None]:
## check to make sure each path points to the right file
reduced_meta['track_id'].equals(
    reduced_meta['path'].apply(lambda x: int(os.path.split(x)[1][:-4]))
)
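
The check above relies on each file name being a zero-padded track ID followed by `.mp3`. A slightly more explicit version of that conversion could look like this sketch; the helper name `track_id_from_path` and the example path are hypothetical.

```python
import os

def track_id_from_path(path):
    """Turn a path such as '.../000/000002.mp3' into the integer track ID 2."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    return int(stem)

print(track_id_from_path("./data/fma_small/000/000002.mp3"))  # 2
```

Using `os.path.splitext` instead of slicing off the last four characters keeps the conversion correct if an extension of a different length ever appears.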

In [None]:
reduced_meta.head()

## Batch Generator

Since the data takes a lot of memory, I will use a custom generator that loads batches from the hard disk as needed.

In [None]:
test_loader = Batch_generator(reduced_meta.iloc[:10, :], batch_size=2, sr=global_sr, duration=global_dur)

In [None]:
test_loader._stack_melspecs(filepaths[:2]).shape
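
To illustrate the general idea of loading batches on demand, here is a minimal sketch. It is not the `Batch_generator` from `utilities.py`: the class `SimpleBatchLoader`, the `load_fn` callback, and the fake feature shape are all hypothetical, and a real loader would read and transform audio instead of returning zeros.

```python
import math
import numpy as np

class SimpleBatchLoader:
    """Index-based loader that builds each batch only when it is requested."""

    def __init__(self, paths, labels, load_fn, batch_size=2):
        self.paths = list(paths)
        self.labels = list(labels)
        self.load_fn = load_fn  # maps a path to a fixed-shape array
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, counting a smaller final batch
        return math.ceil(len(self.paths) / self.batch_size)

    def __getitem__(self, idx):
        start = idx * self.batch_size
        stop = start + self.batch_size
        x = np.stack([self.load_fn(p) for p in self.paths[start:stop]])
        y = np.array(self.labels[start:stop])
        return x, y

def fake_load(path):
    # Stand-in for reading a file and computing a spectrogram
    return np.zeros((128, 16), dtype=np.float32)

paths = [f"track_{i}.mp3" for i in range(5)]
loader = SimpleBatchLoader(paths, [0, 1, 0, 1, 0], fake_load, batch_size=2)
x, y = loader[0]
print(len(loader), x.shape, y.shape)  # 3 (2, 128, 16) (2,)
```

Only one batch of arrays exists in memory at a time, which is the property that matters when the full dataset does not fit in RAM.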