# 0. FUNCTIONS

# Imported libraries & modules

In [22]:
import random
import numpy as np

# For handling data:
import csv
import pandas as pd

# For handling audio:
import librosa
import librosa.display  # Explicit import for `librosa.display.specshow` (used below)

# For handling plotting:
import matplotlib.pyplot as plt

# For handling graphical display:
import IPython.display as ipd

# For handling neural networks:
import keras
import tensorflow as tf
from tensorflow.keras.utils import to_categorical

# For file handling:
import os

**Some preliminary issues faced & their solutions**

**ISSUE**: Dependencies for `keras`

I was unable to import `keras` without having `tensorflow` installed.

---

**ISSUE**: Changing certain OS settings to install `tensorflow`

I was unable to install `tensorflow` without referring to the following:

https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=powershell#enable-long-paths-in-windows-10-version-1607-and-later

I followed the PowerShell solution given.

---

**ISSUE**: Procedure entry point not in the dynamic link library

I faced the following error from the Windows OS (in a dialog box):

```
The procedure entry point could not be located in the dynamic link library <DLL path>.
```

NOTE: `<DLL path>` is a placeholder for the actual path.

To solve this, I simply restarted and updated the OS. To verify the integrity of the system files, I ran:

```
sfc /scannow
```

NOTE: The above needs to be run as an administrator in Command Prompt.

This solution was found here...

https://www.drivereasy.com/knowledge/fixed-entry-point-not-found-error-in-windows/

# Obtaining spectrograms & melspectrograms
Obtaining spectrograms for each audio file...

In [28]:
def get_all_spectrograms(df, audio_folder, spectrograms_storage_file_name, sr, n_fft, hop_length, n_frames):
    spectrograms = []
    spectrograms_by_file_id = {}
    
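    try:
        # Obtaining stored spectrograms (if possible):
        # NOTE: `np.save` appends '.npy' to names lacking it, so the storage file name should end in '.npy'.
        spectrograms = list(np.load(spectrograms_storage_file_name))
        
        option = input('Regenerate spectrograms? (Yes/No) ')
        if option == 'Yes':
            # Jump to the `except` block to regenerate:
            raise Exception
    except Exception:
        # Reached if the stored file could not be loaded or regeneration was requested.
        # Getting the spectrogram for each audio file:
        spectrograms = []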
        
        #________________________
        # FOR PROGRESS BAR:
        prev_i = 0 
        max_i = len(df['TRACK'])
        #________________________
        
        for i, file_id in enumerate(df['TRACK']):
            #________________________
            # FOR PROGRESS BAR:
            if i // (max_i/12) > prev_i // (max_i/12):
                print('.', end='')
                prev_i = i
            #________________________
            
            # Loading the file using `librosa`:
            signal, _ = librosa.load(os.path.join(audio_folder, f'{file_id}.LOFI.mp3'), sr=sr)
            # NOTE 1: 'Signal' here indicates the audio file as a whole.
            # NOTE 2: `audio_folder` is passed in as an argument.
            
            # Short-time Fourier transform:
            stft = librosa.stft(signal, hop_length=hop_length, n_fft=n_fft)
    
            # Obtain the spectrogram:
            spectrogram = np.abs(stft)
    
            #........................
            # Evening out irregularities in dimensions...
            
            # Pad spectrogram if necessary:
            if spectrogram.shape[1] < n_frames:
                spectrogram = np.pad(spectrogram, ((0, 0), (0, n_frames-spectrogram.shape[1])))
            # Truncate spectrogram if necessary
            spectrogram = spectrogram[:, :n_frames] 
            #........................
            
            # Storing the spectrogram data:
            spectrograms.append(spectrogram)
    
        print('\nDone')
        # NOTE ON PROGRESS BAR: One dot is printed each time `i` crosses a twelfth of the total, so at most 11 dots appear
        
        # Save the spectrogram data:
        np.save(spectrograms_storage_file_name, np.array(spectrograms))
    
    #================================================
    # Storing spectrograms by file ID for easy access later:
    for file_id, spectrogram in zip(df['TRACK'], spectrograms):
        spectrograms_by_file_id[file_id] = spectrogram

    return spectrograms_by_file_id
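
The pad-or-truncate step inside the function above can be tried in isolation. The following is a minimal sketch using only NumPy; the helper name `fix_n_frames` and the toy arrays are hypothetical and not part of the project code.

```python
import numpy as np

def fix_n_frames(spectrogram, n_frames):
    """Zero-pad or truncate a (n_bins, n_frames_in) array along the time axis."""
    missing = n_frames - spectrogram.shape[1]
    if missing > 0:
        # Pad only on the right of the time axis; the frequency axis is untouched.
        spectrogram = np.pad(spectrogram, ((0, 0), (0, missing)))
    # Slicing is a no-op when the array already has exactly `n_frames` columns.
    return spectrogram[:, :n_frames]

short = np.ones((3, 2))   # fewer frames than wanted
long = np.ones((3, 7))    # more frames than wanted

padded = fix_n_frames(short, 5)
truncated = fix_n_frames(long, 5)
print(padded.shape, truncated.shape)
```

Both results have the same shape, which is what allows the list of spectrograms to be stacked into a single array before saving.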

**WHY FOCUS ON FINDING AND STORING SPECTROGRAMS?**

In the course of this project, I wanted to experiment with different kinds and configurations of melspectrograms. Throughout these experiments, the basic spectrograms stayed the same; only the functions applied to them to obtain melspectrograms changed. Since loading the audio files is the most time-consuming step in obtaining usable data, I minimised it by loading each file only once to compute its spectrogram and storing the result. After that, the audio-related data can be accessed and reworked freely. For this reason, the spectrograms are generated (or loaded from storage) first. The function of interest for later use is `get_melspectrogram` (defined later); once the basic spectrograms are available, regenerating melspectrograms is much faster.
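
The load-once idea described here can be sketched as a small "load or compute" helper. This is an illustration under assumptions rather than the project's code: `load_or_compute`, `expensive` and the temporary file path are hypothetical names, and an explicit `regenerate` flag stands in for the interactive prompt.

```python
import os
import tempfile
import numpy as np

def load_or_compute(cache_path, compute, regenerate=False):
    """Return the cached array if present, otherwise compute it and cache it."""
    if not regenerate and os.path.exists(cache_path):
        return np.load(cache_path)
    result = np.asarray(compute())
    np.save(cache_path, result)
    return result

calls = []  # records how often the expensive step actually runs

def expensive():
    # Stand-in for loading audio files and computing spectrograms.
    calls.append(1)
    return np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as folder:
    # The explicit '.npy' suffix keeps the save and load names identical.
    path = os.path.join(folder, 'spectrograms.npy')
    first = load_or_compute(path, expensive)
    second = load_or_compute(path, expensive)

print(len(calls))
```

The second call reads the stored array instead of recomputing it, so the expensive step runs only once.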

---

Obtaining melspectrograms for each audio file...

In [2]:
# Function to get melspectrogram for a given audio file ID:
def get_melspectrogram(file_id, spectrograms_by_file_id, sr, n_fft, hop_length, n_mels):
    spectrogram = spectrograms_by_file_id[file_id]
    
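    # Converting the above to log-scaled amplitudes (dB):
    log_spectrogram = librosa.amplitude_to_db(spectrogram)
    # Melspectrogram computed from the log-scaled spectrogram:
    # NOTE: Mel filters are applied here to dB values; the more common order is to apply them
    # to a power spectrogram and convert to dB afterwards (e.g. with `librosa.power_to_db`).
    melspectrogram = librosa.feature.melspectrogram(S=log_spectrogram, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels)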

    return melspectrogram

#================================================
# Load all melspectrograms for our purpose, else if not available then create them:
def get_all_melspectrograms(df, audio_folder, spectrograms_storage_file_name, melspectrograms_storage_file_name,
                            segments_per_file=4, sr=44100, n_fft=2048, hop_length=512, n_mels=64, n_frames=2000):
    '''
    df: Dataframe containing necessary annotated data
    audio_folder: Folder containing the audio files
    spectrograms_storage_file_name: Name of the file in which spectrograms are to be stored
    melspectrograms_storage_file_name: Name of the file in which melspectrograms are to be stored
    
    segments_per_file: Number of segments to divide each file's melspectrogram into
    sr: Sampling rate
    n_fft: FFT size (number of samples per analysis frame)
    hop_length: Hop length (number of samples between successive frames)
    n_mels: Number of mel bands to be used
    n_frames: Number of frames each spectrogram is padded or truncated to
    '''

    # Obtaining necessary data:
    spectrograms_by_file_id = get_all_spectrograms(df, audio_folder, spectrograms_storage_file_name, sr, n_fft, hop_length, n_frames) # Dictionary of spectrograms associated to file IDs
    n_files = len(df) # Number of files being considered = Number of rows in the dataframe
    n_segments = n_files * segments_per_file # Segments in total (all files combined)
    
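    try:
        # Obtaining stored melspectrograms (if possible):
        data = list(np.load(melspectrograms_storage_file_name))
        
        option = input('Regenerate melspectrograms? (Yes/No) ')
        if option == 'Yes':
            # Jump to the `except` block to regenerate:
            raise Exception
    except Exception:
        # Reached if the stored file could not be loaded or regeneration was requested.
        # Getting the melspectrogram for each audio file:
        
        data = []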
    
        #________________________
        # FOR PROGRESS BAR:
        prev_i = 0 
        #________________________
        
        for i, file_id in enumerate(df['TRACK']):
            #________________________
            # FOR PROGRESS BAR:
            if i//(n_files/12) > prev_i//(n_files/12):
                print('.', end='')
                prev_i = i
            #________________________
            
            # Obtain the melspectrogram for the current audio file:
            melspectrogram = get_melspectrogram(file_id, spectrograms_by_file_id, sr, n_fft, hop_length, n_mels)
        
            # Store a previously set number of segments of the melspectrogram obtained:
            segment_length = n_frames // segments_per_file
            for j in range(segments_per_file): # `j` avoids shadowing the outer loop's `i`
                data.append(melspectrogram[:, j*segment_length:(j+1)*segment_length])
            # NOTE: Target labels are stored in `df['TARGET']`
    
        print('\nDone')
        # NOTE ON PROGRESS BAR: One dot is printed each time `i` crosses a twelfth of the total, so at most 11 dots appear
        
        # Save the melspectrograms for later:
        np.save(melspectrograms_storage_file_name, np.array(data))

    # Parameters:
    params = {}
    params['segments_per_file'] = segments_per_file
    params['sr'] = sr
    params['n_fft'] = n_fft
    params['hop_length'] = hop_length
    params['n_mels'] = n_mels
    params['n_frames'] = n_frames

    return params, data
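
The segmentation used above slices each melspectrogram into equal-width chunks along the time axis; with the defaults (`n_frames=2000`, `segments_per_file=4`) each segment spans 500 frames. A minimal NumPy sketch of that slicing follows; the helper `split_into_segments` and the toy array are hypothetical.

```python
import numpy as np

def split_into_segments(melspectrogram, segments_per_file):
    """Split a (n_mels, n_frames) array into equal-width chunks along time."""
    n_frames = melspectrogram.shape[1]
    width = n_frames // segments_per_file  # any remainder frames are dropped
    return [melspectrogram[:, k * width:(k + 1) * width]
            for k in range(segments_per_file)]

# Toy "melspectrogram": 2 mel bands, 10 frames.
toy = np.arange(2 * 10).reshape(2, 10)
segments = split_into_segments(toy, segments_per_file=4)

print(len(segments), segments[0].shape)
```

When `n_frames` is not a multiple of `segments_per_file`, the trailing frames are left out, as the integer division in the original loop implies.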

Checking the items in `data` (for verifying the code's success)...

In [31]:
def get_random_melspectrogram(data, sr=44100, hop_length=512):
    # Checking a random melspectrogram from `data`...
    librosa.display.specshow(data[np.random.randint(0, len(data))], sr=sr, hop_length=hop_length)
    plt.title('Melspectrogram')
    plt.xlabel('Time')
    plt.ylabel('Mel bands')
    plt.colorbar()
    # NOTE: Colour represents the value in each mel band (y-axis) at each time frame (x-axis)
    plt.show()

# Preparing datasets
Preparing data for the following:

- Viewing and working with the data and target labels in simple formats
- Working with neural networks (abstracting aspects like batches and data shuffling)

**SOME NOTES**:

- `to_categorical` was imported as `from tensorflow.keras.utils import to_categorical`
- `to_categorical` converts integer labels to the appropriate 1-hot encoding
- The below is mostly for convenience; we can do without it
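
To make the 1-hot encoding concrete, here is a NumPy-only sketch of the kind of output `to_categorical` produces for integer labels. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer labels to rows of an identity matrix (1-hot vectors)."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the 1-hot vector for class i.
    return np.eye(num_classes, dtype='float32')[labels]

encoded = one_hot([0, 2, 1], num_classes=3)
print(encoded)
```

Each row contains a single 1 at the position given by the label, which is the target format a softmax output layer is usually trained against.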

In [1]:
# Shuffling the data (along with the corresponding labels of course):
def get_shuffled_data(df, data, segments_per_file):
    # Total labels:
    labels = []
    for label in df['TARGET']:
        labels += [label]*segments_per_file
    
    # Shuffling the data for unbiased training and testing (hence better convergence of model):
    # Joining melspectrograms and labels to shuffle data and labels in corresponding order...
    D = list(zip(data, labels))
    # Shuffling list items...
    random.shuffle(D)
    # Separating melspectrograms and their labels for future convenience...
    shuffled_data = np.array([d[0] for d in D])
    shuffled_labels = np.array([d[1] for d in D])

    return shuffled_data, shuffled_labels

#================================================
# Dividing the data and labels into training and validation datasets:
def get_data_in_splits(data, labels, validation_start):
    # Specifying proportions for datasets:
    validation_start = round(validation_start*len(labels)) # Might as well be `len(data)`
    
    # Training data:
    train_data = data[:validation_start] # Feature values
    train_labels = labels[:validation_start] # Target values
    
    # Validation data:
    validation_data = data[validation_start:] # Feature values
    validation_labels = labels[validation_start:] # Target values
    
    print(f'Training data shape = {train_data.shape}, Validation data shape = {validation_data.shape}')

    return train_data, train_labels, validation_data, validation_labels

#================================================
# Get datasets wrapped in a `tf.data.Dataset` object for convenience when working with neural networks:
def get_data(df, data, num_classes, segments_per_file=4, validation_start=0.7, batch_size=32):
    # NOTE: `num_classes` = Number of target classes
    
    data, labels = get_shuffled_data(df, data, segments_per_file)
    
    # Dividing the data and labels into training and validation datasets:
    train_data, train_labels, validation_data, validation_labels = get_data_in_splits(data, labels, validation_start)
    
    #------------------------------------
    # Dictionary of training and validation data and labels in simpler data types:
    data_and_labels = {}
    data_and_labels['train_data'] = train_data
    data_and_labels['validation_data'] = validation_data
    data_and_labels['train_labels'] = train_labels
    data_and_labels['validation_labels'] = validation_labels

    #------------------------------------
    # Preparing the dataset for working in neural networks:
    train_dataset = tf.data.Dataset.from_tensor_slices((train_data, to_categorical(train_labels, num_classes=num_classes)))
    '''
    NOTE:
    Shuffling rows in the training dataset helps the model converge during training.
    This is not necessary in our case, since our dataset was already shuffled earlier.
    If it were necessary, we would do it as follows:
    
    `train_dataset = train_dataset.shuffle(buffer_size=1024)`
    '''
    train_dataset = train_dataset.batch(batch_size)
    
    # Preparing the validation dataset:
    validation_dataset = tf.data.Dataset.from_tensor_slices((validation_data, to_categorical(validation_labels, num_classes=num_classes)))
    validation_dataset = validation_dataset.batch(batch_size)

    # Parameters:
    params = {}
    params['segments_per_file'] = segments_per_file
    params['validation_start'] = validation_start
    params['num_classes'] = num_classes
    params['batch_size'] = batch_size

    return params, data_and_labels, train_dataset, validation_dataset

**NOTE ON SHUFFLING DATA BEFORE DIVIDING IT**:

Shuffling the data before dividing it into training and validation datasets reduced overfitting and improved the model's accuracy (both training and validation). It therefore seems that the original dataset's rows were arranged in some order that the model could overfit to; shuffling the rows avoids this issue.
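
One thing this leaves open: since each track contributes `segments_per_file` segments, shuffling at the segment level can place segments of the same track in both the training and the validation set. A possible variation, which is not the approach used above, is to shuffle and split at the file level and then expand to segment indices. The sketch below assumes segments are stored file by file, as in `get_all_melspectrograms`; the function `split_by_file` and the `seed` parameter are hypothetical.

```python
import numpy as np

def split_by_file(n_files, segments_per_file, validation_start, seed=0):
    """Shuffle files, split them, then return segment indices for each split."""
    rng = np.random.default_rng(seed)
    file_order = rng.permutation(n_files)
    cut = round(validation_start * n_files)

    def segment_indices(files):
        # Segment k of file f sits at index f * segments_per_file + k.
        offsets = np.arange(segments_per_file)
        return (files[:, None] * segments_per_file + offsets).ravel()

    train_idx = segment_indices(file_order[:cut])
    val_idx = segment_indices(file_order[cut:])
    rng.shuffle(train_idx)  # mix segments of different files within training
    return train_idx, val_idx

train_idx, val_idx = split_by_file(n_files=10, segments_per_file=4,
                                   validation_start=0.7)
print(len(train_idx), len(val_idx))
```

The returned index arrays could then be used to select rows from the segment array and the repeated labels, e.g. `np.array(data)[train_idx]`.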

# Saving & loading models

In [5]:
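def save_model(model, file_name):
    # Stores only the model's weights (not its architecture), keyed by position:
    W = {}
    for i, weights in enumerate(model.get_weights()):
        W[i] = weights
    # NOTE: The dictionary is stored as a pickled object; `np.save` appends '.npy' if `file_name` lacks it.
    np.save(file_name, W)

def load_model(model, file_name):
    # NOTE: `model` must already be built with the same architecture as the saved one.
    # `allow_pickle=True` is needed because the weights were stored as a dictionary:
    V = np.load(file_name, allow_pickle=True).tolist()
    W = []
    for i in range(len(V)):
        W.append(V[i])
    model.set_weights(W)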
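
As an alternative sketch, a list of weight arrays (such as the one returned by `model.get_weights()`) can be stored in a `.npz` archive, which avoids pickling. The helpers `save_weights` and `load_weights` and the toy arrays are hypothetical; a plain list of NumPy arrays stands in for real model weights so that the example runs without Keras.

```python
import os
import tempfile
import numpy as np

def save_weights(weights, file_name):
    # Keys are zero-padded positions so that sorting them restores the order.
    np.savez(file_name, **{f'w{i:04d}': w for i, w in enumerate(weights)})

def load_weights(file_name):
    # The archive is closed on exit; indexing reads each array into memory.
    with np.load(file_name) as archive:
        return [archive[key] for key in sorted(archive.files)]

# Stand-in for `model.get_weights()`: arrays of differing shapes.
weights = [np.ones((2, 3)), np.zeros(3), np.arange(4.0)]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'weights.npz')  # explicit suffix: same name for save and load
    save_weights(weights, path)
    restored = load_weights(path)

print(len(restored))
```

The restored list could then be handed to `model.set_weights(...)` on a model with the same architecture.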