# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint

In [None]:
#@title Explanation Video
from IPython.display import HTML

HTML("""<video width="800" height="500" controls>
  <source src="https://cdn.talentsprint.com/aiml/aiml_2020_b14_hyd/experiment_details_backup/Hackathon_Voice_based.mp4" type="video/mp4">
</video>
""")

* Hackathon: Voice-command-based food ordering system
The goal of the hackathon is to train your model on different types of voice data: studio data, noisy data, and finally your own team's data.

## Grading = 40 Marks

### **Objectives:**

Stage 0 - Obtain Features from Audio samples using Pre-trained Network

Stage 1 (17 Marks) - Train a classifier on Studio data and deploy the model in the server 

Stage 2 (10 Marks) - Use 'Noisy_data' and 'Studio_data' together, train a classifier on the same, and deploy the model in the server.

Stage 3 (13 Marks) - Collect your voice samples (team data) and refine the classifier trained on Studio_data and Noisy_data. Deploy the model in the server.

## Dataset Description

The data contains voice samples of classes - Zero, One, Two, Three, Four, Five, Six, Seven, Eight, Nine, Yes and No. Each class is denoted by a numerical label from 0 to 11.
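As a small illustration, the label-to-class mapping described above can be written as a lookup table. This is only a sketch: the names LABEL_NAMES and label_to_name are hypothetical, the label 10 for "Yes" is confirmed by the file-naming example in this section, and 11 for "No" is assumed from the order in which the classes are listed.

```python
# Hypothetical lookup table: numerical label (0 to 11) -> spoken word.
# Assumes the labels follow the order in which the classes are listed.
LABEL_NAMES = ["Zero", "One", "Two", "Three", "Four", "Five",
               "Six", "Seven", "Eight", "Nine", "Yes", "No"]

def label_to_name(label):
    """Return the spoken word for a numerical label."""
    return LABEL_NAMES[label]

print(label_to_name(10))
```

A table like this is mainly useful for turning classifier predictions back into readable words when inspecting results.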

The audio files in the Studio dataset contain very few noise samples, whereas the audio files in the Noisy dataset contain more. In both datasets, noise and speech are mixed, and all files are in WAV format.

The audio files recorded for the studio and noisy data are saved with the following naming convention: 

* Class representation + user_id + sample_ID (or n + sample_ID)

* For example, the voice sample "Yes" recorded by user b2 is saved as 10_b2_35.wav, where 35 is the sample ID.

* The '10' at the start of the filename is the label of that sample.
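The naming convention above can be unpacked with a few lines of standard-library code. This is a minimal sketch, assuming every filename has exactly three underscore-separated fields as in the 10_b2_35.wav example; the helper name parse_sample_name and the folder in the sample path are hypothetical.

```python
import os

def parse_sample_name(path):
    """Split a sample filename into (label, user_id, sample_id)."""
    # Drop the folder and the .wav extension, e.g. "10_b2_35"
    stem = os.path.splitext(os.path.basename(path))[0]
    # Assumes three fields: label, user id, sample id
    label, user_id, sample_id = stem.split("_", 2)
    return int(label), user_id, sample_id

print(parse_sample_name("studio_data/10_b2_35.wav"))
```

Only the first field (the label) is needed for training; the other two are kept as strings because their exact format is not specified here.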


In [None]:
#@title Please run the setup to download the dataset

from IPython import get_ipython
ipython = get_ipython()
  
notebook= "Hackathon1 - Voice Food Ordering System" #name of the notebook

def setup():
    ipython.magic("sx wget https://cdn.talentsprint.com/aiml/Experiment_related_data/Week8/Hackathon2/Noisy_data.zip")
    ipython.magic("sx wget https://cdn.iiith.talentsprint.com/aiml/Hackathon_data/studio_rev_data.zip")
    ipython.magic("sx wget https://cdn.talentsprint.com/aiml/Experiment_related_data/Week8/Hackathon2/net_speech_89.pt")
    ipython.magic("sx unzip studio_rev_data.zip")
    ipython.magic("sx unzip Noisy_data.zip")
    ipython.magic("sx pip install torch torchvision")
    ipython.magic("sx pip install librosa")
    print ("Setup completed successfully")

setup()

Setup completed successfully


In [None]:
import torch
from torch.autograd import Variable
import numpy as np
import librosa
import os
import warnings
from time import sleep
import sys
warnings.filterwarnings('ignore')

## **Stage 0:** Obtain Features from Audio samples using Pre-trained Network
---

### Pretrained Network for deep features


The following function loads a pre-trained network that produces deep features for an audio sample. The network was trained on delta MFCC features of mono audio sampled at 8000 Hz.

In [None]:
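def get_network():

    net = torch.nn.Sequential()

    # Load the saved network and move it to the CPU
    saved_net = torch.load("net_speech_89.pt").cpu()

    # Keep only the first 17 modules (layer0 to layer16); the last of these
    # outputs the 50-dimensional deep feature vector
    for index, module in enumerate(saved_net):
        net.add_module("layer"+str(index),module)
        if (index+1)%17 == 0 :
            break
    return net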

In [None]:
get_network()

Sequential(
  (layer0): Linear(in_features=900, out_features=800, bias=True)
  (layer1): ReLU()
  (layer2): Linear(in_features=800, out_features=700, bias=True)
  (layer3): ReLU()
  (layer4): Linear(in_features=700, out_features=600, bias=True)
  (layer5): ReLU()
  (layer6): Linear(in_features=600, out_features=500, bias=True)
  (layer7): ReLU()
  (layer8): Linear(in_features=500, out_features=400, bias=True)
  (layer9): ReLU()
  (layer10): Linear(in_features=400, out_features=300, bias=True)
  (layer11): ReLU()
  (layer12): Linear(in_features=300, out_features=200, bias=True)
  (layer13): ReLU()
  (layer14): Linear(in_features=200, out_features=100, bias=True)
  (layer15): ReLU()
  (layer16): Linear(in_features=100, out_features=50, bias=True)
)
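The printed architecture can be summarised numerically. The sketch below only uses the layer widths shown in the output above and counts weights plus biases per Linear layer; the variable names are illustrative.

```python
# Layer widths read off the printed Sequential: 900 -> 800 -> ... -> 100 -> 50
widths = [900, 800, 700, 600, 500, 400, 300, 200, 100, 50]

# Each Linear layer has in_features * out_features weights plus out_features biases
n_params = sum(i * o + o for i, o in zip(widths, widths[1:]))

print(len(widths) - 1, "Linear layers,", n_params, "parameters")
```

The first width (900) is the length of the input vector the network expects, and the last (50) is the length of the deep feature vector it returns.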

### Obtaining Features from Audio samples
Generate features from an audio sample in '.wav' format:
* Generate delta MFCC features of order 1 and 2
* Pass them through the deep neural network shown above to obtain deep features.

Parameters: filepath (path of the audio sample),
                       sr (sampling rate; all the provided samples are sampled at 8000 Hz)
         
  Caution: Do not change the default parameters
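One reason the defaults should stay fixed is that they determine the length of the vector fed to the pre-trained network: 30 coefficients, 15 frames and 2 delta orders give 30 × 15 × 2 = 900 values, which matches in_features=900 of the first layer. The plain-Python sketch below illustrates this arithmetic together with the zero-padding or truncation to 15 frames that get_features applies; the helper names fix_frame_count and input_length are hypothetical.

```python
def fix_frame_count(row, frames=15):
    """Zero-pad or truncate one row of coefficients to exactly `frames` values."""
    if len(row) < frames:
        return row + [0.0] * (frames - len(row))
    return row[:frames]

def input_length(n_mfcc=30, frames=15, delta_orders=2):
    """Length of the flattened vector passed to the network."""
    return n_mfcc * frames * delta_orders

# A short (9-frame) and a long (40-frame) row both end up with 15 values
print(len(fix_frame_count([1.0] * 9)), len(fix_frame_count([1.0] * 40)))
print(input_length())
```

Changing n_mfcc or frames would change this length, and the first Linear layer would then reject the input.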

"""
    extract MFCC feature
    :param y: np.ndarray [shape=(n,)], real-valued the input signal (audio time series)
    :param sr: sample rate of 'y'
    :param size: the length (seconds) of random crop from original audio, default as 3 seconds
    :return: MFCC feature
    """

In [None]:
def get_features(filepath, sr=8000, n_mfcc=30, n_mels=128, frames = 15):

    # Load and decode the audio as a time series y, represented as a
    # one-dimensional NumPy floating-point array.
    # sr is the sampling rate of y: the number of samples per second of audio.
    y, sr = librosa.load(filepath, sr=sr)

    # Compute a mel-scaled power spectrogram. Given a time series (y, sr),
    # librosa first computes the spectrogram over short overlapping windows
    # (STFT) and then maps it onto the mel scale.
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)

    # Convert the power spectrogram (amplitude squared) to decibel (dB) units,
    # normalized so that its maximum is the 0 dB point.
    log_S = librosa.power_to_db(S, ref=np.max)

    # Mel-frequency cepstral coefficients (MFCCs) from the log-power mel spectrogram
    features = librosa.feature.mfcc(S=log_S, n_mfcc=n_mfcc)

    # Zero-pad or truncate along the time axis to exactly `frames` frames
    if features.shape[1] < frames :
        features = np.hstack((features, np.zeros((n_mfcc, frames - features.shape[1]))))
    elif features.shape[1] > frames:
        features = features[:, :frames]

    # 1st-order delta MFCCs: a local estimate of the derivative of the input
    # along the time axis, computed using Savitzky-Golay filtering.
    delta1_mfcc = librosa.feature.delta(features, order=1)

    # 2nd-order delta MFCCs
    delta2_mfcc = librosa.feature.delta(features, order=2)

    # Concatenate both delta sets into one row vector of length 2 * n_mfcc * frames
    features = np.hstack((delta1_mfcc.flatten(), delta2_mfcc.flatten()))
    features = features.flatten()[np.newaxis, :]
    features = Variable(torch.from_numpy(features)).float()

    # Pass the vector through the pre-trained network to obtain deep features
    deep_net = get_network()
    deep_features = deep_net(features)
    return deep_features.data.numpy().flatten()

### All the voice samples needed for training are present across the folders "Noisy_data" and "studio_data"

In [None]:
%ls

net_speech_89.pt  Noisy_data.zip  studio_data/
Noisy_data/       sample_data/    studio_rev_data.zip


## **Stage 1**: Train a classifier on the Studio data and deploy the model in the server

---


### a) Extract features of Studio data (5 Marks)

 Load 'Studio data' and extract deep features

 **Evaluation Criteria:**

 * Complete the code in the load_data function
 * The function should take path of the folder containing audio samples as input
 * It should return the features of all audio samples in the specified folder as a single array (list of lists or 2-D NumPy array), along with their respective labels

In [None]:
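def load_data(dirname):
    features = []
    labels = []
    # Walk the folder tree and process every .wav file found
    for root, directories, files in os.walk(dirname):
        for filename in files:
            if not filename.lower().endswith('.wav'):
                continue  # skip anything that is not an audio sample
            filepath = os.path.join(root, filename)
            features.append(get_features(filepath))
            # The label is the number before the first underscore, e.g. 10_b2_35.wav -> 10
            labels.append(int(filename.split('_', 1)[0]))
    return features, labels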


Load data from studio_data folder for extracting all features and labels

In [None]:
studio_recorded_features, studio_recorded_labels = load_data('/content/studio_data')

In [None]:
len(studio_recorded_features)

8178

In [None]:
len(studio_recorded_labels)

8178

In [None]:
# convert the list to numpy array
studio_recorded_features = np.array(studio_recorded_features)

In [None]:
studio_recorded_features.shape

(8178, 50)

In [None]:
X = studio_recorded_features
y = studio_recorded_labels

### b) Train and classify on the studio_data (5 Marks)

The goal here is to train a classifier on the voice samples in the studio data and evaluate its accuracy.

**Evaluation Criteria:**
* Train the classifier
* Expected accuracy is above 85%

In [None]:
# YOUR CODE HERE
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
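To see how many samples end up in each part, the split sizes can be estimated by hand. This is a sketch under the assumption that the test share is rounded up and the remainder goes to training; the function name split_sizes is hypothetical.

```python
import math

def split_sizes(n_samples, test_size=0.25):
    """Return (n_train, n_test), rounding the test share up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test

# 8178 studio samples, as reported by len(studio_recorded_features)
print(split_sizes(8178))
```

Fixing random_state, as the cell above does, keeps the same samples in each part across runs, so different classifiers are compared on identical test data.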



### Gradient-Boosted Trees

In [None]:
pip install xgboost



In [None]:
import xgboost as xgb
from sklearn.metrics import accuracy_score

In [None]:
D_train = xgb.DMatrix(X_train, label=y_train)
D_test = xgb.DMatrix(X_test, label=y_test)

In [None]:
param = {
    'eta': 0.1, # learning rate (0.1 to 0.3 are common)
    'max_depth': 50, # maximum depth of each tree; unrelated to the number of features (50), and values this large risk overfitting
    'objective': 'multi:softprob',
    'num_class': 12
}
steps = 10

In [None]:
xg_model = xgb.train(param, D_train, steps)

In [None]:
xg_yhats = xg_model.predict(D_test)

In [None]:
xg_yhats[0] # supports >2 classes

array([0.04087899, 0.04072345, 0.04103322, 0.04089827, 0.04065049,
       0.0409448 , 0.04105444, 0.04085384, 0.5504155 , 0.04090514,
       0.04095814, 0.04068373], dtype=float32)

In [None]:
xg_yhat = np.asarray([np.argmax(line) for line in xg_yhats])

In [None]:
xg_yhat[0]

8
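With the 'multi:softprob' objective, each prediction is a row of 12 class probabilities, and the predicted label is the index of the largest one. The sketch below repeats that step in plain Python on the row printed above; the variable names are illustrative.

```python
# Class probabilities printed above for the first test sample
probs = [0.04087899, 0.04072345, 0.04103322, 0.04089827, 0.04065049,
         0.0409448, 0.04105444, 0.04085384, 0.5504155, 0.04090514,
         0.04095814, 0.04068373]

# The predicted label is the index of the largest probability
predicted = max(range(len(probs)), key=probs.__getitem__)

print(predicted)
print(round(sum(probs), 3))  # the probabilities sum to 1 (up to rounding)
```

For the whole prediction matrix, NumPy's argmax along axis 1 performs the same reduction in a single call.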

In [None]:
accuracy_score(y_test, xg_yhat)

0.9202933985330073

In [None]:
# Save
from joblib import dump
dump(xg_model, 'studio_data_xg.sav') 
from google.colab import files
files.download('/content/studio_data_xg.sav')

In [None]:
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
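The scaling cell learns its statistics from the training set only and then reuses them for the test set. The plain-Python sketch below shows the same idea for a single feature column, using z = (x - mean) / std with the population standard deviation (assumed here to match what StandardScaler computes); the function names and the numbers are made up for illustration.

```python
import statistics

def fit_scaler(column):
    """Learn the mean and population standard deviation from training values."""
    return statistics.mean(column), statistics.pstdev(column)

def transform(column, mean, std):
    """Apply z = (x - mean) / std using previously learned statistics."""
    return [(x - mean) / std for x in column]

train_col = [1.0, 2.0, 3.0]   # made-up values of one feature in the training set
test_col = [4.0]              # made-up value of the same feature in the test set

mean, std = fit_scaler(train_col)        # statistics come from the training data only
print(transform(train_col, mean, std))
print(transform(test_col, mean, std))    # test data reuses the training statistics
```

Any model trained on scaled features expects inputs scaled with the same statistics at prediction time.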

In [None]:
from sklearn.ensemble import RandomForestClassifier  
classifierRD = RandomForestClassifier(n_estimators = 100, criterion = 'entropy', random_state = 0)
classifierRD.fit(X_train, y_train)

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='entropy', max_depth=None, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=100,
                       n_jobs=None, oob_score=False, random_state=0, verbose=0,
                       warm_start=False)

In [None]:
y_pred = classifierRD.predict(X_test)


In [None]:
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
print()
print(accuracy_score(y_test, y_pred))

[[144   0   3   1   0   0   2   0   1   0   2   1]
 [  0 171   0   0   1   2   0   5   0   1   1   4]
 [  0   0 158   1   0   1   0   2   0   0   0   5]
 [  0   0   0 165   0   1   0   0   4   1   0   0]
 [  0   0   1   0 153   0   0   1   0   0   0   1]
 [  0   4   0   0   0 153   1   5   0   4   1   0]
 [  0   1   0   2   0   1 141   2  11   5   7   0]
 [  0   1   0   0   0   5   0 145   2   3   4   2]
 [  0   1   0   3   0   0   7   0 180   0   5   0]
 [  0   1   0   0   0   3   2   6   3 157   2   2]
 [  0   0   0   0   0   0   6   1   4   5 146   2]
 [  0   2   4   0   1   0   0   4   1   0   1 163]]

0.917359413202934
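The accuracy printed above can be recovered from the confusion matrix itself: it is the sum of the diagonal (correct predictions) divided by the total number of test samples. The sketch below does this in plain Python and also computes per-class recall to find the class that is recognised least reliably; the variable names are illustrative.

```python
# Random-forest confusion matrix printed above (rows: true label, columns: predicted)
cm = [
    [144, 0, 3, 1, 0, 0, 2, 0, 1, 0, 2, 1],
    [0, 171, 0, 0, 1, 2, 0, 5, 0, 1, 1, 4],
    [0, 0, 158, 1, 0, 1, 0, 2, 0, 0, 0, 5],
    [0, 0, 0, 165, 0, 1, 0, 0, 4, 1, 0, 0],
    [0, 0, 1, 0, 153, 0, 0, 1, 0, 0, 0, 1],
    [0, 4, 0, 0, 0, 153, 1, 5, 0, 4, 1, 0],
    [0, 1, 0, 2, 0, 1, 141, 2, 11, 5, 7, 0],
    [0, 1, 0, 0, 0, 5, 0, 145, 2, 3, 4, 2],
    [0, 1, 0, 3, 0, 0, 7, 0, 180, 0, 5, 0],
    [0, 1, 0, 0, 0, 3, 2, 6, 3, 157, 2, 2],
    [0, 0, 0, 0, 0, 0, 6, 1, 4, 5, 146, 2],
    [0, 2, 4, 0, 1, 0, 0, 4, 1, 0, 1, 163],
]

correct = sum(cm[i][i] for i in range(len(cm)))   # diagonal entries
total = sum(map(sum, cm))                         # all test samples
accuracy = correct / total

# Recall per class: correct predictions divided by the samples of that class
recall = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]
worst = min(range(len(cm)), key=recall.__getitem__)

print(correct, total, accuracy)
print("lowest recall: class", worst, round(recall[worst], 3))
```

Reading the weakest row of the matrix shows which other classes that word is most often confused with.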


In [None]:
from sklearn.metrics import f1_score

print(f1_score(y_test, y_pred, average='macro'))
print(f1_score(y_test, y_pred, average='micro'))
print(f1_score(y_test, y_pred, average='weighted'))

0.820796094373553
0.8200652938221998
0.8206406992708644


In [None]:
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors = 1) # default metric='minkowski' with p=2, i.e. Euclidean distance
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[560   6  12   9   8   7  14  12   4  11   9  13]
 [ 14 570  13  11  12  17   6  12   2  18  11   9]
 [  6  20 518   5  14   7  13   6   6   8   6  11]
 [  8   7   8 561   7   7  16   0  31  13  15   2]
 [  5  32  11   6 532  13   6   9   2   6   4   8]
 [ 10  42  11   4  11 530  14  21   9  14   9   3]
 [ 12  18  18  14   5  11 528  14  31  11  22   3]
 [  9  28   9   2   6  12  13 495  12  17  21  12]
 [  6  19   4  36   5   5  30   6 551  12  16   4]
 [  3  18   3  17   8  30   5  26  12 520  14  13]
 [  6  25  11   3   9  13  23  27  15   5 495  15]
 [ 11  29  19   4  15  12   2   7   7  14   8 536]]


0.8031140130587644

In [None]:
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0) # Random state to get the same results
classifier.fit(X_train,y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy = accuracy_score(y_test, y_pred)
print(accuracy)

[[343   6  44  19   5   4  41 128  29  13  25   8]
 [ 13 297  18   6  16  17  30 203  26  21  32  16]
 [ 33   7 305   9  13  10  27 163  18   6  20   9]
 [ 25   8  13 372   1   5  42 118  61  15  13   2]
 [ 15  39  55   8 285   8  17 151  11   9  19  17]
 [  9  36  15  13   6 277  31 203  25  36  21   6]
 [ 26   9  18  41   3   5 261 182  84   6  50   2]
 [ 16  20  19   9   2  18  34 432  17  17  48   4]
 [ 23   7   7  64   3   5  71 113 363  10  25   3]
 [ 16  16   7  30   1  25  43 176  26 309  17   3]
 [ 28   8  15  11   0   2  47 199  58  13 262   4]
 [ 37  14  42   7  19   4  11 203  15   9  24 279]]
0.47526368658965346


In [None]:
from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[321   4  27  14   2   4  28 192  23   8  36   6]
 [ 10 272   7   4  12   6  17 294  14  16  33  10]
 [ 17   7 285   7  11   6  19 222  11   5  23   7]
 [ 22   7   7 338   1   7  37 150  62  18  24   2]
 [  8  32  43   5 284   7   8 199  11   9  14  14]
 [  7  31  10   4   4 252  23 274  18  24  27   4]
 [ 22   3   6  20   4   5 245 237  59   6  80   0]
 [  7  12  13   4   1  13  22 498  11   9  42   4]
 [ 20   8   6  38   2   4  59 150 352   8  44   3]
 [  9  13   6  20   0  25  26 243  16 279  30   2]
 [ 20   9  10   9   0   2  32 265  34   7 255   4]
 [ 29  19  41   6  14   1  10 241   6   6  28 263]]


0.4575590155700653

In [None]:
classifier = SVC(kernel = 'rbf', random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

In [None]:
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion = 'entropy', random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[549   9  12  13   4   9   9  18   9   4  13  16]
 [  6 547  10  19  17  14  12  28   2  13  10  17]
 [ 16   6 510   6  17  16  10  15   6   8   4   6]
 [  6   2   7 564  11   7  15  11  27  12   7   6]
 [  5  11  17   9 524   9   7  25   6  10   2   9]
 [ 10  24   7   9  16 531  14  36   8  11   8   4]
 [ 18  16   9  13   7  15 522  22  28   9  24   4]
 [ 11  16  12  10   7  18   5 507  10  15  16   9]
 [  5   8   5  39   8   8  33  19 538  11  15   5]
 [ 10  10   5  18   9  28  10  30  16 515   8  10]
 [  5   8  13   3  19  10  26  30   9  12 496  16]
 [ 10  14  18  17  16   9   4  15   2  15   3 541]]


0.7965846308387745

### c) Save and download your model (2 Marks)

Save your trained model

**Hint:** You can use joblib package to save the model.

In [None]:
# YOUR CODE HERE to save the trained model
# Save
from joblib import dump
dump(classifierRD, 'studio_data_RD.sav') 

# Load
from joblib import load
#clf = load('studio_data_RD.sav') 

Download your trained model using the code below
* Give the path of model file to download through the browser

In [None]:
from google.colab import files
files.download('/content/studio_data_RD.sav')

### d) Deploy your model in the server (5 Marks).

(This can be done on the day of the Hackathon, once the login username and password are provided by the mentors in the lab.)

To deploy your model on the server, see the hackathon document (2-Server Access and File transfer For Voice based food ordering) for details.

To order food in the user interface, see the document (3-Hackathon 1 Application Interface Documentation) for details.


**Evaluation Criteria:**

There are two stages in the food ordering application
        
* Ordering Item
* Providing the number of servings
    
If both stages are cleared with correct predictions, you will get
full marks.
Otherwise, no marks will be awarded.


#### Now deploy the model trained on studio_data in the server to order food correctly. 


## **Stage 2**: Use 'Noisy_data' and 'Studio_data' together, train a Classifier on the same and deploy the model in the server 

---

### a) Extract features and classify the model (3 Marks)

The goal here is to train your model on voice samples collected in both noisy and studio data

**Evaluation Criteria:**
* Load 'Noisy_data' and extract features
* Combine noisy features with the studio features
* Train the classifier


Load data from Noisy_data folder for extracting all features and labels

In [None]:
import zipfile

# Extract the noisy-data archive (setup() may already have unzipped it)
with zipfile.ZipFile('/content/Noisy_data.zip') as noisy_zip:
    noisy_zip.extractall('/content/Noisy_data/')

### Obtaining Features from Noisy samples

The noisy samples are processed exactly like the studio samples: the get_features function defined in Stage 0 generates delta MFCC features of order 1 and 2 from each '.wav' file and passes them through the pre-trained network to obtain deep features. All the provided samples are sampled at 8000 Hz.

  Caution: Do not change the default parameters

In [None]:
noisy_recorded_features, noisy_recorded_labels = load_data('/content/Noisy_data')


In [None]:
print(len(noisy_recorded_features))
print(len(noisy_recorded_labels))
print('----')
print(len(studio_recorded_features))
print(len(studio_recorded_labels))

23678
23678
----
8178
8178


In [None]:
studio_recorded_features = studio_recorded_features.tolist()

In [None]:
# Combine the features of Studio and Noisy data
# using the * unpacking operator to concatenate the two lists
studio_noisy_recorded_features = [*studio_recorded_features, *noisy_recorded_features]
studio_noisy_recorded_labels = [*studio_recorded_labels, *noisy_recorded_labels] 
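The unpacking used above builds a new list containing the elements of both inputs, which is equivalent to concatenating them with +. The short sketch below shows this on made-up stand-ins for the feature lists.

```python
studio = [[0.1, 0.2], [0.3, 0.4]]   # made-up stand-ins for studio feature vectors
noisy = [[0.5, 0.6]]                # made-up stand-in for noisy feature vectors

combined = [*studio, *noisy]        # unpack both lists into a new list

print(combined == studio + noisy)
print(len(combined), len(studio), len(noisy))
```

Because features and labels are combined in the same order (studio first, then noisy), each feature vector stays aligned with its label.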

In [None]:
print(len(studio_noisy_recorded_features))
print(len(studio_noisy_recorded_labels))

31856
31856


Train a classifier on the features obtained from noisy_data and studio_data

In [None]:
# YOUR CODE HERE
# convert the list to numpy array
studio_noisy_recorded_features = np.array(studio_noisy_recorded_features)

In [None]:
X = studio_noisy_recorded_features
y = studio_noisy_recorded_labels

In [None]:
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)

### Gradient-Boosted Trees

In [None]:
D_train = xgb.DMatrix(X_train, label=y_train)
D_test = xgb.DMatrix(X_test, label=y_test)
param = {
    'eta': 0.1, # learning rate (0.1 to 0.3 are common)
    'max_depth': 50, # maximum depth of each tree; unrelated to the number of features (50), and values this large risk overfitting
    'objective': 'multi:softprob',
    'num_class': 12
}
steps = 10
xg_model_SN = xgb.train(param, D_train, steps)
xg_yhats = xg_model_SN.predict(D_test)
xg_yhat = np.asarray([np.argmax(line) for line in xg_yhats])
print(accuracy_score(y_test, xg_yhat))


# Save
from joblib import dump
#dump(xg_model_SN, 'studio_noisy_data_xg.sav') 
from google.colab import files
#files.download('/content/studio_noisy_data_xg.sav')

In [None]:
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

In [None]:
from sklearn.ensemble import RandomForestClassifier  
classifier_SN_RD = RandomForestClassifier(n_estimators = 50, criterion = 'entropy', random_state = 0)
classifier_SN_RD.fit(X_train, y_train)

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='entropy', max_depth=None, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=50,
                       n_jobs=None, oob_score=False, random_state=0, verbose=0,
                       warm_start=False)

In [None]:
# Predicting the Test set results
y_pred = classifier_SN_RD.predict(X_test)

In [None]:
#Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[568   3  13   6   3   5  15  15   5  11   5  16]
 [ 10 566   9   4  14  13  12  31   2  14   8  12]
 [  7  10 529   3  16   4  12  13   8   6   5   7]
 [  8   3   4 566   9   7  13   6  26  15  16   2]
 [  5  21  11   8 529   8   4  24   6   8   0  10]
 [  8  30   3   4   7 531  11  41  11  16   9   7]
 [ 11   9   8  10   5   7 535  28  35  10  27   2]
 [  5   9  17   2   1  14  10 531  10   8  21   8]
 [  6  10   1  31   3   6  34  12 564  13  10   4]
 [  7  10   3  14  10  27   8  26  11 533  11   9]
 [  6   8   8   4   2  12  23  34  16   7 514  13]
 [ 11  16  17   9   8  12   2  21   5   9   7 547]]


0.8178051230537419

In [None]:
print(f1_score(y_test, y_pred, average='macro'))
print(f1_score(y_test, y_pred, average='micro'))
print(f1_score(y_test, y_pred, average='weighted'))


0.820796094373553
0.8200652938221998
0.8206406992708644


In [None]:
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors = 1) # default metric='minkowski' with p=2, i.e. Euclidean distance
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[560   6  12   9   8   7  14  12   4  11   9  13]
 [ 14 570  13  11  12  17   6  12   2  18  11   9]
 [  6  20 518   5  14   7  13   6   6   8   6  11]
 [  8   7   8 561   7   7  16   0  31  13  15   2]
 [  5  32  11   6 532  13   6   9   2   6   4   8]
 [ 10  42  11   4  11 530  14  21   9  14   9   3]
 [ 12  18  18  14   5  11 528  14  31  11  22   3]
 [  9  28   9   2   6  12  13 495  12  17  21  12]
 [  6  19   4  36   5   5  30   6 551  12  16   4]
 [  3  18   3  17   8  30   5  26  12 520  14  13]
 [  6  25  11   3   9  13  23  27  15   5 495  15]
 [ 11  29  19   4  15  12   2   7   7  14   8 536]]


0.8031140130587644

In [None]:
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0) # Random state to get the same results
classifier.fit(X_train,y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy = accuracy_score(y_test, y_pred)
print(accuracy)

[[343   6  44  19   5   4  41 128  29  13  25   8]
 [ 13 297  18   6  16  17  30 203  26  21  32  16]
 [ 33   7 305   9  13  10  27 163  18   6  20   9]
 [ 25   8  13 372   1   5  42 118  61  15  13   2]
 [ 15  39  55   8 285   8  17 151  11   9  19  17]
 [  9  36  15  13   6 277  31 203  25  36  21   6]
 [ 26   9  18  41   3   5 261 182  84   6  50   2]
 [ 16  20  19   9   2  18  34 432  17  17  48   4]
 [ 23   7   7  64   3   5  71 113 363  10  25   3]
 [ 16  16   7  30   1  25  43 176  26 309  17   3]
 [ 28   8  15  11   0   2  47 199  58  13 262   4]
 [ 37  14  42   7  19   4  11 203  15   9  24 279]]
0.47526368658965346


In [None]:
from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[321   4  27  14   2   4  28 192  23   8  36   6]
 [ 10 272   7   4  12   6  17 294  14  16  33  10]
 [ 17   7 285   7  11   6  19 222  11   5  23   7]
 [ 22   7   7 338   1   7  37 150  62  18  24   2]
 [  8  32  43   5 284   7   8 199  11   9  14  14]
 [  7  31  10   4   4 252  23 274  18  24  27   4]
 [ 22   3   6  20   4   5 245 237  59   6  80   0]
 [  7  12  13   4   1  13  22 498  11   9  42   4]
 [ 20   8   6  38   2   4  59 150 352   8  44   3]
 [  9  13   6  20   0  25  26 243  16 279  30   2]
 [ 20   9  10   9   0   2  32 265  34   7 255   4]
 [ 29  19  41   6  14   1  10 241   6   6  28 263]]


0.4575590155700653

In [None]:
classifier = SVC(kernel = 'rbf', random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

In [None]:
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion = 'entropy', random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[549   9  12  13   4   9   9  18   9   4  13  16]
 [  6 547  10  19  17  14  12  28   2  13  10  17]
 [ 16   6 510   6  17  16  10  15   6   8   4   6]
 [  6   2   7 564  11   7  15  11  27  12   7   6]
 [  5  11  17   9 524   9   7  25   6  10   2   9]
 [ 10  24   7   9  16 531  14  36   8  11   8   4]
 [ 18  16   9  13   7  15 522  22  28   9  24   4]
 [ 11  16  12  10   7  18   5 507  10  15  16   9]
 [  5   8   5  39   8   8  33  19 538  11  15   5]
 [ 10  10   5  18   9  28  10  30  16 515   8  10]
 [  5   8  13   3  19  10  26  30   9  12 496  16]
 [ 10  14  18  17  16   9   4  15   2  15   3 541]]


0.7965846308387745

### b) Save and download your model (2 Marks)
Save your trained model

**Hint:** You can use joblib package to save the model.

In [None]:
# YOUR CODE HERE to save the trained model
# Save
from joblib import dump
dump(classifier_SN_RD, 'studio_noisy_data_RD.sav') 

Download your trained model using the code below
* Give the path of model file to download through the browser

In [None]:
from google.colab import files
files.download('/content/studio_noisy_data_RD.sav')

### c) Deploy your model in the server (5 Marks).

(This can be done on the day of the Hackathon, once the login username and password are given by the mentors in the lab.)

To deploy your model on the server, see the hackathon document (2-Server Access and File transfer For Voice based food ordering) for details.

To order food in the user interface, see the document (3-Hackathon 1 Application Interface Documentation) for details.

**Evaluation Criteria:**

There are two stages in the food ordering application
        
* Ordering Item
* Providing the number of servings
    
If both stages are cleared with correct predictions, you will get
full marks.
Otherwise, no marks will be awarded.


#### Now deploy the model trained on studio_data and noisy_data in the server to order food correctly. 

## **Stage 3:** Collect your voice samples and refine the classifier trained on studio_data and Noisy_data
---

### a) Collect your Team Voice Samples and extract features (5 Marks)

(This can be done on the day of the Hackathon, once the login username and password are given by the mentors in the lab.)

* To collect the team data, ensure the server is active (refer to the document: 2-Server Access and File transfer For Voice based food ordering)

* Refer to the document "3-Hackathon_1 Application Interface Documentation" for collecting your team voice samples. These will be stored on your server.

**Evaluation Criteria:**
* Load 'Team_data' and extract features
* Combine features of team data with the extracted features of studio and noisy data

In [None]:
!mkdir team_data

In [None]:
# Replace the group ID in the URL below (b15h1g15 here) with the username given to you in the lab
!wget -r -A .wav https://aiml-sandbox1.talentsprint.com/audio_recorder/b15h1g15/team_data/ -nH --cut-dirs=100  -P ./team_data
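
After the download, a quick check that every class has at least one recording can catch gaps before feature extraction. The sketch below assumes the team files follow the same `label_user_sampleID.wav` naming convention as the studio and noisy data (for example `10_b2_35.wav`); the helper names and the example file list are hypothetical.

```python
from collections import Counter
from pathlib import Path


def label_from_filename(filename):
    """Return the integer class label encoded at the start of a file name."""
    return int(Path(filename).stem.split("_")[0])


def class_counts(file_names):
    """Count how many recordings exist per class label."""
    return Counter(label_from_filename(name) for name in file_names)


# Hypothetical file names; in the notebook, list the real ones with
# sorted(p.name for p in Path("team_data").glob("*.wav"))
example_files = ["10_b2_35.wav", "0_b2_1.wav", "0_b2_2.wav", "7_b2_9.wav"]
counts = class_counts(example_files)
missing = [label for label in range(12) if counts[label] == 0]
print(counts)
print("Classes with no recordings:", missing)
```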

In [None]:
team_recorded_features, team_recorded_labels = load_data('/content/team_data')

In [None]:
# Combine the features and labels of all voice samples (studio_data, noisy_data and team_data)
studio_noisy_team_recorded_features = [*studio_noisy_recorded_features, *team_recorded_features]
studio_noisy_team_recorded_labels = [*studio_noisy_recorded_labels, *team_recorded_labels] 


In [None]:
print(len(studio_noisy_recorded_features))
print(len(studio_noisy_recorded_labels))
print('----')
print(len(team_recorded_features))
print(len(team_recorded_labels))

print('----')
print(len(studio_noisy_team_recorded_features))
print(len(studio_noisy_team_recorded_labels))


31856
31856
----
197
197
----
32053
32053
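
The counts above add up as expected (31856 + 197 = 32053). The same check can be done programmatically while stacking the features into a single array. This is a minimal sketch on tiny stand-in lists, assuming each feature entry is a 1-D vector of equal length; all names and sizes here are hypothetical.

```python
import numpy as np

# Hypothetical stand-ins: 5 studio/noisy samples and 2 team samples, 4 features each
studio_noisy_features = [np.zeros(4) for _ in range(5)]
studio_noisy_labels = [0, 1, 2, 3, 4]
team_features = [np.ones(4) for _ in range(2)]
team_labels = [10, 11]

# Stack into one (n_samples, n_features) matrix and one label vector
X_all = np.vstack([*studio_noisy_features, *team_features])
y_all = np.array([*studio_noisy_labels, *team_labels])

# Features and labels must stay aligned, and no sample may be lost
assert X_all.shape[0] == y_all.shape[0] == len(studio_noisy_labels) + len(team_labels)
print(X_all.shape, y_all.shape)
```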


### b) Classify and download the model (3 Marks)

The goal here is to train your model on all voice samples collected in noisy, studio and team data


In [None]:
# YOUR CODE HERE for refining your classifier
# convert the list to numpy array
studio_noisy_team_recorded_features = np.array(studio_noisy_team_recorded_features)

In [None]:
# Use the combined studio + noisy + team features and labels for Stage 3
X = studio_noisy_team_recorded_features
y = np.array(studio_noisy_team_recorded_labels)

In [None]:
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)
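
A plain random split does not guarantee that every class is represented in the same proportion in the training and test sets. One option, shown here only as a sketch on synthetic data, is the `stratify` argument of `train_test_split`. The demo arrays are hypothetical; in the notebook the call would use `X` and `y`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 12 classes with 20 samples each, 5 features
rng = np.random.default_rng(0)
y_demo = np.repeat(np.arange(12), 20)
X_demo = rng.normal(size=(len(y_demo), 5))

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)

# With stratification, each class sends the same share of samples to the test set
print(np.bincount(y_te))
```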

### Gradient-Boosted Trees

In [None]:
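import numpy as np
import xgboost as xgb
from sklearn.metrics import accuracy_score

# Wrap the train/test splits in XGBoost's DMatrix format
D_train = xgb.DMatrix(X_train, label=y_train)
D_test = xgb.DMatrix(X_test, label=y_test)
param = {
    'eta': 0.1,  # learning rate (0.1 to 0.3 are common)
    'max_depth': 50,  # matched to the 50 input features; XGBoost's default is 6
    'objective': 'multi:softprob',  # predict one probability per class
    'num_class': 12
}
steps = 10  # number of boosting rounds
xg_model_SNT = xgb.train(param, D_train, steps)
xg_yhats = xg_model_SNT.predict(D_test)  # shape: (n_samples, num_class)
xg_yhat = np.argmax(xg_yhats, axis=1)  # most probable class per sample
print(accuracy_score(y_test, xg_yhat))


# Save (uncomment the dump/download lines to keep the XGBoost model)
from joblib import dump
#dump(xg_model_SNT, 'studio_noisy_team_data_xg.sav')
from google.colab import files
#files.download('/content/studio_noisy_team_data_xg.sav')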
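
With the `multi:softprob` objective, `predict` returns one probability per class for every sample, so the predicted label is the index of the largest value in each row. The sketch below illustrates that step on a small hand-written probability matrix (hypothetical values, 4 classes instead of 12) and shows that a vectorised `argmax` gives the same result as a row-by-row loop.

```python
import numpy as np

# Hypothetical softprob output for 3 samples and 4 classes (rows sum to 1)
probs = np.array([
    [0.10, 0.70, 0.10, 0.10],
    [0.25, 0.25, 0.40, 0.10],
    [0.60, 0.20, 0.10, 0.10],
])

# Vectorised argmax over the class axis
pred_vectorised = probs.argmax(axis=1)

# Equivalent row-by-row list comprehension
pred_loop = np.asarray([np.argmax(row) for row in probs])

print(pred_vectorised)  # [1 2 0]
print(np.array_equal(pred_vectorised, pred_loop))  # True
```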

In [None]:
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
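
Because the classifiers below are trained on standardized features, anything that later calls `predict` must apply the same fitted scaler first. If the server passes unscaled features to the loaded model (an assumption; the hackathon documents are not quoted here), one way to keep the two steps together is a scikit-learn `Pipeline`. This is a sketch on hypothetical stand-in data, not the notebook's own approach.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 80 samples, 6 features, 4 classes
rng = np.random.default_rng(0)
X_raw = rng.normal(loc=5.0, scale=3.0, size=(80, 6))
y_demo = rng.integers(0, 4, size=80)

# The scaler is fitted inside the pipeline, so predict() accepts raw features
pipeline = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=10, criterion="entropy", random_state=0),
)
pipeline.fit(X_raw, y_demo)

predictions = pipeline.predict(X_raw[:5])
print(predictions.shape)
```

A fitted pipeline can be saved with `joblib.dump` in the same way as a bare classifier, so the scaler travels with the model.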

In [None]:
from sklearn.ensemble import RandomForestClassifier  
classifier_SNT_RD = RandomForestClassifier(n_estimators = 100, criterion = 'entropy', random_state = 0)
classifier_SNT_RD.fit(X_train, y_train)

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='entropy', max_depth=None, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=100,
                       n_jobs=None, oob_score=False, random_state=0, verbose=0,
                       warm_start=False)

In [None]:
# Predicting the Test set results
y_pred = classifier_SNT_RD.predict(X_test)

In [None]:
#Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[568   3  11   6   4   7  15  15   6   9   5  16]
 [ 12 566   9   8  12   9  10  28   4  17   8  12]
 [  6  10 530   3  17   4  10  15   7   4   7   7]
 [ 10   3   6 567   9   7  13   4  29  13  14   0]
 [  3  19  13   6 534  11   4  20   6   8   2   8]
 [  6  31   5   4   5 533  11  40  11  16   9   7]
 [  9  11  16  12   6   7 535  25  35   7  22   2]
 [  5   5  17   4   1  13  10 533  10   8  22   8]
 [  6   8   0  29   5   6  36  12 565  13  10   4]
 [  7  16   3  14  10  23   6  25  12 535   7  11]
 [  7   8   8   3   5  12  20  32  16   6 516  14]
 [  9  13  21   7   7  12   4  21   5   7   9 549]]


0.8200652938222

In [None]:
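from sklearn.metrics import f1_score

# Macro, micro and weighted F1 for the random forest predictions
print(f1_score(y_test, y_pred, average='macro'))
print(f1_score(y_test, y_pred, average='micro'))
print(f1_score(y_test, y_pred, average='weighted'))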

0.820796094373553
0.8200652938221998
0.8206406992708644
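
The micro-averaged F1 printed above equals the accuracy, which is expected for single-label multiclass problems. The sketch below recomputes the metrics directly from the random forest confusion matrix shown earlier (rows are true classes, columns are predicted classes), which also makes per-class weaknesses visible. Only NumPy is used; the variable names are illustrative.

```python
import numpy as np

# Random forest confusion matrix copied from the output above
cm = np.array([
    [568, 3, 11, 6, 4, 7, 15, 15, 6, 9, 5, 16],
    [12, 566, 9, 8, 12, 9, 10, 28, 4, 17, 8, 12],
    [6, 10, 530, 3, 17, 4, 10, 15, 7, 4, 7, 7],
    [10, 3, 6, 567, 9, 7, 13, 4, 29, 13, 14, 0],
    [3, 19, 13, 6, 534, 11, 4, 20, 6, 8, 2, 8],
    [6, 31, 5, 4, 5, 533, 11, 40, 11, 16, 9, 7],
    [9, 11, 16, 12, 6, 7, 535, 25, 35, 7, 22, 2],
    [5, 5, 17, 4, 1, 13, 10, 533, 10, 8, 22, 8],
    [6, 8, 0, 29, 5, 6, 36, 12, 565, 13, 10, 4],
    [7, 16, 3, 14, 10, 23, 6, 25, 12, 535, 7, 11],
    [7, 8, 8, 3, 5, 12, 20, 32, 16, 6, 516, 14],
    [9, 13, 21, 7, 7, 12, 4, 21, 5, 7, 9, 549],
])

true_pos = np.diag(cm)
recall = true_pos / cm.sum(axis=1)     # share of each true class that was found
precision = true_pos / cm.sum(axis=0)  # share of each predicted class that was right
f1 = 2 * precision * recall / (precision + recall)

accuracy = true_pos.sum() / cm.sum()
print("accuracy (= micro F1):", accuracy)
print("macro F1:", f1.mean())
print("lowest-recall class:", int(recall.argmin()))
```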


In [None]:
from sklearn.neighbors import KNeighborsClassifier
classifier = KNeighborsClassifier(n_neighbors = 1) # default metric='minkowski' with p=2, i.e. Euclidean distance
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[560   6  12   9   8   7  14  12   4  11   9  13]
 [ 14 570  13  11  12  17   6  12   2  18  11   9]
 [  6  20 518   5  14   7  13   6   6   8   6  11]
 [  8   7   8 561   7   7  16   0  31  13  15   2]
 [  5  32  11   6 532  13   6   9   2   6   4   8]
 [ 10  42  11   4  11 530  14  21   9  14   9   3]
 [ 12  18  18  14   5  11 528  14  31  11  22   3]
 [  9  28   9   2   6  12  13 495  12  17  21  12]
 [  6  19   4  36   5   5  30   6 551  12  16   4]
 [  3  18   3  17   8  30   5  26  12 520  14  13]
 [  6  25  11   3   9  13  23  27  15   5 495  15]
 [ 11  29  19   4  15  12   2   7   7  14   8 536]]


0.8031140130587644

In [None]:
from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[321   4  27  14   2   4  28 192  23   8  36   6]
 [ 10 272   7   4  12   6  17 294  14  16  33  10]
 [ 17   7 285   7  11   6  19 222  11   5  23   7]
 [ 22   7   7 338   1   7  37 150  62  18  24   2]
 [  8  32  43   5 284   7   8 199  11   9  14  14]
 [  7  31  10   4   4 252  23 274  18  24  27   4]
 [ 22   3   6  20   4   5 245 237  59   6  80   0]
 [  7  12  13   4   1  13  22 498  11   9  42   4]
 [ 20   8   6  38   2   4  59 150 352   8  44   3]
 [  9  13   6  20   0  25  26 243  16 279  30   2]
 [ 20   9  10   9   0   2  32 265  34   7 255   4]
 [ 29  19  41   6  14   1  10 241   6   6  28 263]]


0.4575590155700653

In [None]:
classifier = SVC(kernel = 'rbf', random_state = 0)
classifier.fit(X_train, y_train)
# Predict with the RBF model so the metrics below describe this classifier
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

In [None]:
from sklearn.tree import DecisionTreeClassifier
classifier = DecisionTreeClassifier(criterion = 'entropy', random_state = 0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
print(cm)
accuracy_score(y_test, y_pred)

[[549   9  12  13   4   9   9  18   9   4  13  16]
 [  6 547  10  19  17  14  12  28   2  13  10  17]
 [ 16   6 510   6  17  16  10  15   6   8   4   6]
 [  6   2   7 564  11   7  15  11  27  12   7   6]
 [  5  11  17   9 524   9   7  25   6  10   2   9]
 [ 10  24   7   9  16 531  14  36   8  11   8   4]
 [ 18  16   9  13   7  15 522  22  28   9  24   4]
 [ 11  16  12  10   7  18   5 507  10  15  16   9]
 [  5   8   5  39   8   8  33  19 538  11  15   5]
 [ 10  10   5  18   9  28  10  30  16 515   8  10]
 [  5   8  13   3  19  10  26  30   9  12 496  16]
 [ 10  14  18  17  16   9   4  15   2  15   3 541]]


0.7965846308387745
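
The fit, predict and score cells above repeat the same steps for each classifier. A compact alternative, sketched here on synthetic data, is to loop over a dictionary of candidate models; computing predictions inside the loop keeps each score tied to the model that was just fitted. The demo dataset and the dictionary keys are hypothetical; in the notebook the loop would use the scaled `X_train` and `X_test`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data for the demonstration
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=6, n_classes=3, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25, random_state=0)

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=100, criterion="entropy", random_state=0),
    "knn_1": KNeighborsClassifier(n_neighbors=1),
    "svc_linear": SVC(kernel="linear", random_state=0),
    "svc_rbf": SVC(kernel="rbf", random_state=0),
    "decision_tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
}

scores = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    # Predict inside the loop so each score belongs to the model just fitted
    scores[name] = accuracy_score(y_te, model.predict(X_te))

# Print the models from best to worst test accuracy
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:15s} {score:.3f}")
```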

Save your trained model

**Hint:** You can use joblib package to save the model

In [None]:
# YOUR CODE HERE to save the trained model
from joblib import dump
dump(classifier_SNT_RD, 'studio_noisy_team_data_RD.sav')

Download your trained model using the code below
* Give the path of model file to download through the browser

In [None]:
from google.colab import files
files.download('/content/studio_noisy_team_data_RD.sav')

### c) Deploy your model in the server (5 Marks).

(This can be done on the day of the Hackathon, once the mentors in the lab have given you the login username and password.)

Deploy your model on the server. See the hackathon document (2-Server Access and File transfer For Voice based food ordering) for details.

To order food in the user interface, see the document (3-Hackathon 1 Application Interface Documentation) for details.


**Evaluation Criteria:**

There are two stages in the food ordering application
        
* Ordering Item
* Providing the number of servings
    
If both stages are cleared with correct predictions, you will get
full marks.
Otherwise, no marks will be awarded.


#### Now deploy the model trained on studio_data, noisy_data and team_data in the server to order food correctly.