# Modelling and Deployment using MLOps 

Now that the audio inputs and their labels are available as arrays, they are easier to consume and to process with standard machine learning and natural language processing techniques. The text labels of the audio files can be converted into integers with label encoding, or into binary vectors with one-hot encoding, so that a model can learn from them. The encoded labels define the targets for the output layer of the neural network, and both the training and validation sets end up represented as n-dimensional arrays.
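
As an illustration of the two encodings mentioned above, here is a minimal sketch using scikit-learn's `LabelEncoder` and NumPy. The example labels are invented for the illustration and are not taken from the dataset.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels, used only to illustrate the encodings.
labels = ["habari", "asante", "habari", "karibu"]

# Label encoding: each distinct label is mapped to an integer
# (LabelEncoder sorts the classes, so "asante" becomes 0).
encoder = LabelEncoder()
int_labels = encoder.fit_transform(labels)

# One-hot encoding: row i holds a single 1 in the column of its class.
one_hot = np.eye(len(encoder.classes_), dtype=np.float32)[int_labels]

# The fitted encoder maps predicted integers back to the original labels.
decoded = encoder.inverse_transform(int_labels)
```

Integer labels pair naturally with a sparse categorical loss, while one-hot vectors pair with a plain categorical cross-entropy loss.
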
At this stage we also apply the remaining pre-processing steps, such as dropping unused columns and normalizing features, to arrive at the final training data. The dataset is then split into training, validation, and test sets, as with the other models.
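
The normalization and splitting steps could look roughly like the sketch below. It uses synthetic data and scikit-learn directly; the 70/15/15 split and the variable names are assumptions for the example, not the behaviour of the project's `Modeler` class.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 4))  # synthetic features
y = rng.integers(0, 3, size=100)                   # synthetic class labels

# Hold out 30 samples, then split them evenly into validation and test sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=30, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=15, random_state=0)

# Fit the scaler on the training set only and reuse it for the other sets,
# so that no statistics leak from the validation or test data.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)
X_test_s = scaler.transform(X_test)
```
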
We can use deep neural architectures such as CNNs, RNNs, and LSTMs, optionally trained with a CTC (Connectionist Temporal Classification) loss, to build models for speech applications such as speech recognition. The model is trained on fixed-length audio chunks of a few seconds, each transformed into an n-dimensional array and paired with its label, and then predicts output labels for test audio. Because there are more than two possible output labels, this is a multi-class classification problem.
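
To make the idea of fixed-length chunks concrete, the following sketch splits a one-dimensional signal into equal-length pieces and zero-pads the last one. The helper name `chunk_audio`, the 8 kHz sample rate, and the one-second chunk length are assumptions chosen for the example.

```python
import numpy as np

def chunk_audio(signal: np.ndarray, sample_rate: int, seconds: float) -> np.ndarray:
    """Split a 1-D signal into equal-length chunks, zero-padding the last one."""
    chunk_len = int(sample_rate * seconds)
    n_chunks = max(1, -(-len(signal) // chunk_len))  # ceiling division
    padded = np.zeros(n_chunks * chunk_len, dtype=signal.dtype)
    padded[:len(signal)] = signal
    return padded.reshape(n_chunks, chunk_len)

sample_rate = 8000
signal = np.ones(int(2.5 * sample_rate), dtype=np.float32)  # 2.5 s of dummy audio
chunks = chunk_audio(signal, sample_rate, seconds=1.0)      # shape: (3, 8000)
```

Every row of `chunks` has the same length, so the rows can be stacked into a single batch for training.
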


In [2]:
import os
import sys

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Make the project's helper modules in ../scripts importable.
sys.path.append(os.path.abspath(os.path.join('../scripts')))
from deep_learner import DeepLearn
from modeling import Modeler

# Deep Learning Model

**Objective**: Build a deep learning model that converts speech to text.

In [6]:
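import ast

swahili_df = pd.read_csv("../data/swahili.csv")
amharic_df = pd.read_csv("../data/amharic.csv")
# Add a sequential integer key to each row.
swahili_df['key'] = list(range(len(swahili_df)))
amharic_df['key'] = list(range(len(amharic_df)))
# The arrays are stored in the CSV as strings; parse them back into lists.
# ast.literal_eval accepts only Python literals, so it is safer than eval.
swahili_df['input_array'] = swahili_df['input_array'].apply(ast.literal_eval)
amharic_df['input_array'] = amharic_df['input_array'].apply(ast.literal_eval)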

In [8]:
pre_model = Modeler()

In [9]:
swahili_preprocessed = pre_model.preprocessing_learn(swahili_df,'text','input_array')

In [10]:
amharic_preprocessed = pre_model.preprocessing_learn(amharic_df,'text','input_array')

In [11]:
train_df,val_df,test_df = swahili_preprocessed

In [12]:
train_df

| Unnamed: 0 | key | duration | rate | rmse | chroma_stft | spec_cent | spec_bw | rolloff | zcr | mfcc | input_array |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3.14 | 44100 | 0.140494 | 0.315876 | 1866.409879 | 1589.100426 | 3359.724832 | 0.054397 | -8.14289 | [0.0002518580586183816, 2.563117777754087e-05,... |
| 1 | 1 | 3.14 | 44100 | 0.162743 | 0.321692 | 1900.99163 | 1655.337451 | 3481.883645 | 0.04955 | -6.665521 | [-0.0026284900959581137, -0.002864581765606999... |
| 2 | 2 | 3.14 | 44100 | 0.15949 | 0.456093 | 1420.839749 | 1589.16423 | 2979.181985 | 0.032641 | -1.883745 | [0.020822471007704735, 0.023905068635940552, 0... |
| 3 | 3 | 3.14 | 44100 | 0.164615 | 0.355232 | 2063.848819 | 1775.474533 | 3859.115658 | 0.049307 | -7.063959 | [0.002116759540513158, 0.0019456746522337198, ... |
| 4 | 4 | 3.14 | 44100 | 0.168792 | 0.451515 | 1852.997999 | 1686.271162 | 3497.119321 | 0.048648 | -1.036142 | [0.029533347114920616, 0.03362444043159485, 0.... |
| 5 | 5 | 3.14 | 44100 | 0.15486 | 0.319131 | 2039.119501 | 1727.509904 | 3764.72168 | 0.053856 | -7.143945 | [-0.010967539623379707, -0.012645195238292217,... |
| 6 | 6 | 3.14 | 44100 | 0.135778 | 0.319711 | 1894.964967 | 1746.052445 | 3657.568359 | 0.044638 | -6.545257 | [0.008869340643286705, 0.010183916427195072, 0... |
| 7 | 7 | 3.14 | 44100 | 0.13046 | 0.343494 | 2256.247714 | 1754.663927 | 4012.988713 | 0.07634 | -7.412744 | [0.005443760193884373, 0.005900760181248188, 0... |
| 8 | 8 | 3.14 | 44100 | 0.168273 | 0.410232 | 1853.55527 | 1664.910052 | 3527.881898 | 0.05576 | -2.86487 | [-0.020646054297685623, -0.02523760311305523, ... |
| 9 | 9 | 3.14 | 44100 | 0.162844 | 0.321559 | 1967.845186 | 1731.059926 | 3754.42465 | 0.054829 | -6.871382 | [0.005849024746567011, 0.006830638274550438, 0... |


## LSTM Deep Learning

In [13]:
model = tf.keras.models.Sequential([
    # Shape [batch, time, features] => [batch, time, lstm_units]
    tf.keras.layers.LSTM(32, return_sequences=True),
    # Shape => [batch, time, 1]
    tf.keras.layers.Dense(units=1)
])

2022-06-01 13:43:19.513748: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-06-01 13:43:19.513891: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-06-01 13:43:19.514019: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (martin-HP-EliteBook-Folio-9470m): /proc/driver/nvidia/version does not exist


In [14]:
learn = DeepLearn(input_width=1, label_width=1, shift=1, epochs=5,
                  train_df=train_df, val_df=val_df, test_df=test_df,
                  label_columns=['input_array'])
predictions = learn.model(model_=model)

ValueError: setting an array element with a sequence.
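
NumPy typically raises this error when it is asked to build a numeric array from elements that are themselves sequences, for example a DataFrame column whose cells hold Python lists. Whether that is the exact cause inside `DeepLearn` is an assumption, since its source is not shown here. The sketch below reproduces the situation with plain NumPy and shows one possible remedy: padding or truncating every list to a common length so that the column becomes a regular 2-D array. The helper `pad_to_matrix` and the values are hypothetical.

```python
import numpy as np

# A column of Python lists is stored as a 1-D object array like this one.
column = np.empty(3, dtype=object)
column[0] = [0.1, 0.2, 0.3]
column[1] = [0.4, 0.5]
column[2] = [0.6, 0.7, 0.8, 0.9]

# Casting such a column straight to float fails, because each cell is a list.
try:
    column.astype(np.float32)
    cast_failed = False
except (ValueError, TypeError):
    cast_failed = True

def pad_to_matrix(sequences, length: int) -> np.ndarray:
    """Zero-pad or truncate each sequence to `length` and stack the results."""
    out = np.zeros((len(sequences), length), dtype=np.float32)
    for i, seq in enumerate(sequences):
        clipped = np.asarray(seq[:length], dtype=np.float32)
        out[i, :len(clipped)] = clipped
    return out

features = pad_to_matrix(column, length=4)  # regular array of shape (3, 4)
```
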

### Evaluation

**Objective**: Evaluate the model.

In [11]:
predictions

[[[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]]]
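
For a speech-to-text model, one common way to evaluate transcriptions is the word error rate (WER): the word-level edit distance between the reference and the predicted text, divided by the number of reference words. The notebook does not state which metric it uses, so the helper `word_error_rate` below is only a sketch, and the example sentences are invented.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)

# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors in 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

A WER of 0 means a perfect transcription; values can exceed 1 when the hypothesis contains many inserted words.
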

## CNN Deep Learning

In [19]:
# This cell defines the same recurrent stack as the LSTM section.
model = tf.keras.models.Sequential([
    # Shape [batch, time, features] => [batch, time, lstm_units]
    tf.keras.layers.LSTM(32, return_sequences=True),
    # Shape => [batch, time, 1]
    tf.keras.layers.Dense(units=1)
])


In [9]:
learn = DeepLearn(input_width=1, label_width=1, shift=1, epochs=5,
                  train_df=train_df, val_df=val_df, test_df=test_df,
                  label_columns=['text'])
predictions = learn.model(model_=model)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


2022-06-01 10:57:05,090:logger:Successfully executed the model


### Evaluation

**Objective**: Evaluate the model.

In [11]:
predictions

[[[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]]]
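
The model cell in this section defines an LSTM followed by a dense layer. For comparison, a convolutional classifier over a sequence of audio feature frames might be sketched as below. This is not the model used in the notebook: the function `build_cnn`, the layer sizes, and the input shape of 100 frames with 13 features each are all assumptions chosen for illustration.

```python
import tensorflow as tf

def build_cnn(num_classes: int, time_steps: int, n_features: int) -> tf.keras.Model:
    """Small 1-D CNN that maps [batch, time, features] to class probabilities."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(time_steps, n_features)),
        # Convolutions slide along the time axis and learn local patterns.
        tf.keras.layers.Conv1D(32, kernel_size=5, padding="same", activation="relu"),
        tf.keras.layers.MaxPooling1D(pool_size=2),
        tf.keras.layers.Conv1D(64, kernel_size=5, padding="same", activation="relu"),
        # Average over time so that the output is one vector per example.
        tf.keras.layers.GlobalAveragePooling1D(),
        # Softmax output for multi-class classification.
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

cnn_model = build_cnn(num_classes=3, time_steps=100, n_features=13)
cnn_model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
```

The sparse categorical cross-entropy loss expects integer class labels, such as those produced by a label encoder.
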

## CTC Deep Learning

In [19]:
# This cell defines the same recurrent stack as the LSTM section.
model = tf.keras.models.Sequential([
    # Shape [batch, time, features] => [batch, time, lstm_units]
    tf.keras.layers.LSTM(32, return_sequences=True),
    # Shape => [batch, time, 1]
    tf.keras.layers.Dense(units=1)
])


In [9]:
learn = DeepLearn(input_width=1, label_width=1, shift=1, epochs=5,
                  train_df=train_df, val_df=val_df, test_df=test_df,
                  label_columns=['text'])
predictions = learn.model(model_=model)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


2022-06-01 10:57:05,090:logger:Successfully executed the model


### Evaluation

**Objective**: Evaluate the model.

In [11]:
predictions

[[[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]],
 [[0.11674367636442184]]]
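
With CTC, a network emits one score vector per time frame over the character set plus a special blank symbol, and the transcription is recovered by collapsing repeated symbols and then dropping blanks. The model cell in this section does not show that step, so the sketch below only illustrates the greedy (best-path) decoding rule with NumPy. The function name, the two-letter alphabet, the convention that class 0 is the blank, and the scores are all invented for the example.

```python
import numpy as np

BLANK = 0  # assumed convention: class 0 is the CTC blank symbol

def ctc_greedy_decode(scores: np.ndarray, alphabet: str) -> str:
    """Best-path decoding of a [time, classes] score matrix.

    Class k (for k >= 1) corresponds to alphabet[k - 1].
    """
    best_path = scores.argmax(axis=1)
    decoded = []
    previous = BLANK
    for idx in best_path:
        # Keep a symbol only if it is not blank and not a repeat of the previous frame.
        if idx != BLANK and idx != previous:
            decoded.append(alphabet[idx - 1])
        previous = idx
    return "".join(decoded)

# Per-frame winners: blank, a, a, blank, a, b, b, blank
path = [0, 1, 1, 0, 1, 2, 2, 0]
scores = np.eye(3)[path]  # one-hot scores standing in for network outputs
text = ctc_greedy_decode(scores, alphabet="ab")
```

The blank between the two runs of `a` is what allows the decoder to output a doubled letter; without it, the repeated frames would collapse into a single `a`.
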