# Final Project - Data Collection for iOS

If you are using an iOS device to collect your training data for the final project, you can use the [Voice Recorder app from the App Store](https://apps.apple.com/us/app/voice-recorder-voz/id1336782987).

The iOS version of the app does not offer the option to record at 44 kHz. Instead, you will be **recording at 48 kHz** while keeping all other settings as listed in the project description.

You will then need to downsample each recording before you create your data files.
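As a quick sanity check on the downsampling step, the sketch below estimates how many samples a recording should contain after going from 48 kHz to 44 kHz. The helper name `expected_num_samples` is hypothetical, and resampling libraries may round slightly differently, so treat the result as approximate.

```python
def expected_num_samples(n_samples, orig_sr=48000, target_sr=44000):
    """Approximate length of a signal after resampling (hypothetical helper).

    Uses integer ceiling division to avoid floating-point rounding.
    """
    if n_samples < 0 or orig_sr <= 0 or target_sr <= 0:
        raise ValueError("n_samples must be non-negative and rates must be positive")
    return -(-n_samples * target_sr // orig_sr)


# One second recorded at 48 kHz holds 48000 samples.
one_second = expected_num_samples(48000)
print(one_second)
```

Comparing this estimate with `len(resample_y)` for a recording is one way to confirm that the downsampling ran with the intended rates.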

* You can use the following code to downsample and save your training data:

In [1]:
import numpy as np
import os
import librosa
import pickle
from IPython.display import display, Audio

In [2]:
mydir = 'change-this-to-your-data-directory-local-path' # set this to the local path of your data directory

### Option 1

Use the code below to play one file at a time, and manually label each recording.

This code will output and save the data files in the desired format for assignment submission.

In [None]:
labels = np.array([])
data = {}
statements = np.array([])
i=0
for file in os.listdir(mydir):
    if file.endswith(".wav"): # Will only read .wav files
        filename = mydir+'/'+file
        
        y, sr = librosa.load(filename, sr=48000) # files are recorded at 48kHz
        # downsample from 48 kHz to 44 kHz (recent librosa versions require keyword arguments here)
        resample_y = librosa.resample(y, orig_sr=48000, target_sr=44000)
        data[i] = resample_y
        display(Audio(resample_y, rate=44000, autoplay=True)) # play the downsampled recording
        l = input('Type the emotion label (1,2,3,4,5,6,7,8) in this recording and then press Enter...\n')
        labels = np.hstack((labels, l))
        s = input('Type the sentence (1 or 2) being read in this recording and then press Enter...\n')
        statements = np.hstack((statements,s))
        i+=1

print('-------------------------------------------------------')
print('----------------------DONE-----------------------------')
print('-------------------------------------------------------')
if np.sum(labels=='')>0:
    print('ATTENTION, ',np.sum(labels==''), ' LABEL/S IS/ARE MISSING')
    
if np.sum(statements=='')>0:
    print('ATTENTION, ',np.sum(statements==''), ' STATEMENT/S IS/ARE MISSING')
    
print('There are ', len(data),' recordings')
print('There are ', len(labels[labels!='']),' labels')
print('There are ', len(statements[statements!='']),' statement recordings')

# Saves the files to your current directory
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)
np.save('labels', labels)
np.save('statements', statements)
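
Because Option 1 relies on typed input, a mistyped label is easy to miss. The sketch below shows one possible way to validate an entry before storing it. The helper `parse_choice` is hypothetical and not part of the assignment code, and the allowed values are taken from the prompts above.

```python
def parse_choice(text, allowed):
    """Return `text` as an int if it is one of `allowed`, else None (hypothetical helper)."""
    try:
        value = int(text.strip())
    except ValueError:
        # Empty or non-numeric input
        return None
    return value if value in allowed else None


EMOTION_LABELS = range(1, 9)  # the prompt asks for 1 through 8
STATEMENT_IDS = (1, 2)        # the prompt asks for 1 or 2

print(parse_choice(" 3 ", EMOTION_LABELS))  # valid emotion label
print(parse_choice("9", EMOTION_LABELS))    # out of range
print(parse_choice("", STATEMENT_IDS))      # empty input
```

Inside the loop, the `input(...)` call could be repeated until `parse_choice` returns something other than `None`.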

### Option 2

Use the **coding system** from data collection to automatically create and save your data.

The code below will help you with that, and it will output and save the data files in the desired format for assignment submission.

In [None]:
labels = np.array([])
data = {}
statements = np.array([])
i=0
for file in os.listdir(mydir):
    if file.endswith(".wav"): # Will only read .wav files
        filewav = file
        filename = mydir+'/'+file
        y, sr = librosa.load(filename, sr=48000) # files are recorded at 48kHz
        # downsample from 48 kHz to 44 kHz (recent librosa versions require keyword arguments here)
        resample_y = librosa.resample(y, orig_sr=48000, target_sr=44000)
        data[i] = resample_y
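        labels = np.hstack((labels, int(filewav[4]))) # emotion label is the character at index 4 of the file name
        statements = np.hstack((statements,int(filewav[6]))) # statement number is the character at index 6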
        i+=1

print('-------------------------------------------------------')
print('----------------------DONE-----------------------------')
print('-------------------------------------------------------')

# Saves the files to your current directory
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)
np.save('labels', labels)
np.save('statements', statements)
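
Option 2 assumes that every file name follows the coding system exactly, with the emotion digit at index 4 and the statement digit at index 6. A minimal sketch of a check that flags files that do not match is shown below. The helper `parse_coded_name` and the example file names are hypothetical, and the valid ranges (1 to 8 and 1 or 2) are assumed from the prompts in Option 1.

```python
def parse_coded_name(file_name):
    """Return (emotion, statement) read from indices 4 and 6, or None if invalid.

    Hypothetical helper: assumes emotions are 1-8 and statements are 1 or 2.
    """
    if len(file_name) <= 6:
        return None
    emotion, statement = file_name[4], file_name[6]
    if emotion in "12345678" and statement in "12":
        return int(emotion), int(statement)
    return None


# Hypothetical file names, used only to illustrate the check.
for name in ["abc_3_1.wav", "abc_9_1.wav", "rec.wav"]:
    print(name, parse_coded_name(name))
```

Running such a check over `os.listdir(mydir)` before building the data files would point out misnamed recordings instead of failing with a `ValueError` partway through the loop.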