# Data Processing Walkthrough
This notebook details the data processing used to train the Vocal Pitch Modulator.

It goes through in detail (with plots and prints) how the data is organized.

## Global variables/Imports
Run these cells before running any of the following sections.

In [None]:
%load_ext autoreload
%autoreload 1

import os
import csv

import scipy.io as sio
from scipy.io import wavfile
from scipy.io.wavfile import write

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import subplots

%aimport VPM
from VPM import *
%aimport Utils
from Utils import *

In [None]:
# Constants that should not change without the dataset being changed
n_pitches = 16
n_vowels = 12
n_people = 3

# These dictionaries are more for reference than anything
label_to_vowel = { 0: "bed",  1: "bird",   2: "boat",  3: "book", 
                   4: "cat",  5: "dog",    6: "feet",  7: "law",  
                   8: "moo",  9: "nut",   10: "pig",  11: "say" }

vowel_to_label = { "bed": 0,  "bird": 1,  "boat":  2, "book":  3,
                   "cat": 4,  "dog":  5,  "feet":  6, "law":   7,
                   "moo": 8,  "nut":  9,  "pig":  10, "say":  11}

noteidx_to_pitch = {  0: "A2",   1: "Bb2",  2: "B2",   3: "C3",
                      4: "Db3",  5: "D3",   6: "Eb3",  7: "E3", 
                      8: "F3",   9: "Gb3", 10: "G3",  11: "Ab3",
                     12: "A3",  13: "Bb3", 14: "B3",  15: "C4" }

### Getting data references
Read the reference CSV into the relevant data structures.

`data_ref_list` is the list of filenames in the dataset in a 3d array format.
A specific file is accessed with `data_ref_list[vowel_idx][pitch_idx][person_idx]`.

`flat_data_ref_list` is the same list of filenames as a 1d array. To access a specific file, use `flat_data_ref_list[flat_ref_idx(vowel, pitch, person)]`.

In [None]:
# e.g. data_ref_list[vowel_to_label["dog"]][5][1]
data_ref_list = create_data_ref_list(os.path.join("Data", 'dataset_files.csv'),
                                     n_pitches, n_vowels, n_people)
# print(data_ref_list)
# e.g. flat_data_ref_list[flat_ref_idx(3, 1, 2)]
flat_data_ref_list = flatten_3d_array(data_ref_list, 
                                      n_vowels, n_pitches, n_people)

The following are the accessor functions used to compute indices from flat to 3d and vice versa.

`flat_ref_idx` returns the flat index for a given `(vowel, pitch, person)`, while `nd_ref_idx` returns `(vowel, pitch, person)` for a given flat index.

In [None]:
# Returns a flat_ref_idx, given a vowel, pitch, person
flat_ref_idx = lambda vowel, pitch, person: flat_3d_array_idx(
    vowel, pitch, person, n_vowels, n_pitches, n_people)
# Returns vowel, pitch, person, given a flat_ref_idx
nd_ref_idx = lambda idx: nd_array_idx(idx, n_vowels, n_pitches, n_people)
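
The helpers `flat_3d_array_idx` and `nd_array_idx` come from the project's own modules and are not shown in this notebook. As a rough sketch of what such a mapping typically looks like, assuming row-major order (person varies fastest, vowel slowest), the two functions below are hypothetical stand-ins and may differ from the real implementations:

```python
# Hypothetical stand-ins; the project's own helpers may order the axes differently.
def flat_idx_sketch(vowel, pitch, person, n_vowels, n_pitches, n_people):
    # Row-major order: person varies fastest, vowel slowest.
    return (vowel * n_pitches + pitch) * n_people + person

def nd_idx_sketch(idx, n_vowels, n_pitches, n_people):
    # Invert the row-major mapping with two divmod steps.
    rest, person = divmod(idx, n_people)
    vowel, pitch = divmod(rest, n_pitches)
    return vowel, pitch, person

idx = flat_idx_sketch(4, 3, 2, 12, 16, 3)
print(idx)                             # 203
print(nd_idx_sketch(idx, 12, 16, 3))   # (4, 3, 2)
```

Under this assumption the two functions are exact inverses of each other, which is the property the accessors above rely on.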

In [None]:
print("Data ref list ({}):".format(len(flat_data_ref_list)), 
      flat_data_ref_list)

### Data-label Pitch Index pairs
Generate the data-label pitch index pairs. The result is an array in which each element is a triple `[shift_amt, input_pitch_idx, label_pitch_idx]`.


In [None]:
data_label_pairs, _ = create_data_label_pairs(n_pitches)

In [None]:
print("Total data-label pairs ({}):".format(len(data_label_pairs)), 
      data_label_pairs)
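
The pairing rule used by `create_data_label_pairs` is not shown here. The sketch below illustrates one plausible construction, assuming every ordered (input, label) pitch combination is kept, including a shift of 0. Because the pitch indices are consecutive semitones (A2 to C4), the shift amount is simply the index difference. `pairs_sketch` is a hypothetical name, and the real function may filter or order the pairs differently.

```python
def pairs_sketch(n_pitches):
    # Each entry is [shift_amt, input_pitch_idx, label_pitch_idx],
    # with shift_amt = label_pitch_idx - input_pitch_idx (in semitones).
    return [[label - inp, inp, label]
            for inp in range(n_pitches)
            for label in range(n_pitches)]

pairs = pairs_sketch(16)
print(len(pairs))   # 256
print(pairs[:3])    # [[0, 0, 0], [1, 0, 1], [2, 0, 2]]
```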

### Get All .wav Data
Get the wav file data into a single matrix, where each element `all_wav_data[idx]` is the wavfile content of the file at `flat_data_ref_list[idx]`. To retrieve the 3d indices of a specific index, use `vowel, pitch, person = nd_ref_idx(idx)`.


In [None]:
all_wav_data = load_wav_files(os.path.join("Data", "dataset"), 
                              flat_data_ref_list)

In [None]:
print("All wav data shape: {}\nTrack shape: {}".format(
      all_wav_data.shape, all_wav_data[0].shape))
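
For readers without the helper modules at hand, loading a list of wav files into one matrix can be sketched with `scipy.io.wavfile.read`. This is only an approximation of what `load_wav_files` might do: `load_wav_files_sketch` is a hypothetical name, and it assumes every track has the same number of samples so the tracks can be stacked.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

def load_wav_files_sketch(directory, filenames):
    # wavfile.read returns (sample_rate, samples); keep only the samples.
    tracks = [wavfile.read(os.path.join(directory, name))[1]
              for name in filenames]
    # np.stack raises ValueError if the tracks differ in length.
    return np.stack(tracks)

# Demo on two short silent files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    names = ["a.wav", "b.wav"]
    for name in names:
        wavfile.write(os.path.join(tmp, name), 8000,
                      np.zeros(4000, dtype=np.int16))
    demo = load_wav_files_sketch(tmp, names)

print(demo.shape)   # (2, 4000)
```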

### Create all spectrograms
Get the spectrograms for each wav in `all_wav_data`. The spectrogram at `all_spectrograms[idx]` is the spectrogram of the wav at `all_wav_data[idx]`.

In [None]:
all_spectrograms = np.array([ stft(waveform, plot=False) 
                              for waveform in all_wav_data ])

In [None]:
print("All spectrograms has shape: {} (n_wavs, n_freq_bins, n_windows)\n"
      .format(all_spectrograms.shape))

print("FFT Spectrogram of vowel 4, pitch 3, person 2 ({}):"
      .format(flat_data_ref_list[flat_ref_idx(4, 3, 2)]))
plot_ffts_spectrogram(all_spectrograms[flat_ref_idx(4, 3, 2)], sample_rate,
                      flat_data_ref_list[flat_ref_idx(4, 3, 2)])
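
The `stft` helper is defined outside this notebook. To make the `(n_freq_bins, n_windows)` layout concrete, here is a minimal NumPy sketch of a short-time Fourier transform. The window size, hop length, and Hann window are assumptions chosen for illustration, not the values the project uses.

```python
import numpy as np

def stft_sketch(waveform, window_size=1024, hop=256):
    # Hypothetical parameters; the project's stft may use different values.
    window = np.hanning(window_size)
    n_windows = 1 + (len(waveform) - window_size) // hop
    # Slice the signal into overlapping frames and apply the window.
    frames = np.stack([waveform[i * hop : i * hop + window_size] * window
                       for i in range(n_windows)])
    # Real FFT of each frame, transposed to (n_freq_bins, n_windows).
    return np.fft.rfft(frames, axis=1).T

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)   # one second of a 440 Hz sine
spec = stft_sketch(tone)
print(spec.shape)   # (513, 28)
```

Each column holds the complex spectrum of one window, so `np.abs(spec)` gives the magnitudes that a spectrogram plot would show.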

### Create Mel Spectrograms and MFCC
Get the mel spectrogram and MFCC for each FFT spectrogram (`ffts`) in `all_spectrograms`, using the same indexing as above.

In [None]:
all_mels, all_mfcc = map(np.array, map(list, zip(*
                         [ ffts_to_mel(ffts, n_mels = 128) 
                           for ffts in all_spectrograms ])))
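
The nested `map`/`zip` expression above splits a list of `(mel, mfcc)` pairs into two stacked arrays. The same unpacking pattern can be seen in isolation below. `fake_ffts_to_mel` is a hypothetical stand-in with made-up output shapes, used only so the example runs without the project's helper modules.

```python
import numpy as np

def fake_ffts_to_mel(ffts):
    # Stand-in returning a (mel, mfcc) pair with made-up shapes.
    return np.zeros((128, 10)), np.zeros((20, 10))

results = [fake_ffts_to_mel(None) for _ in range(3)]   # list of pairs
mels, mfccs = zip(*results)                            # pair of tuples
all_mels_demo, all_mfcc_demo = np.array(mels), np.array(mfccs)
print(all_mels_demo.shape, all_mfcc_demo.shape)   # (3, 128, 10) (3, 20, 10)
```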

In [None]:
print("All mels has shape: {} (n_wavs, n_mels, n_windows)"
      .format(all_mels.shape))
print("All mfccs has shape: {} (n_wavs, n_mfcc, n_windows)\n"
      .format(all_mfcc.shape))

print("Mel Spectrogram of vowel 4, pitch 3, person 2 ({}):"
      .format(flat_data_ref_list[flat_ref_idx(4, 3, 2)]))
plot_mel_spectrogram(all_mels[flat_ref_idx(4, 3, 2)], sample_rate,
                     flat_data_ref_list[flat_ref_idx(4, 3, 2)])
print("MFCC of vowel 4, pitch 3, person 2 ({}):"
      .format(flat_data_ref_list[flat_ref_idx(4, 3, 2)]))
plot_mfcc(all_mfcc[flat_ref_idx(4, 3, 2)], sample_rate,
          flat_data_ref_list[flat_ref_idx(4, 3, 2)])