# Modelling and Deployment using MLOps 

Now that the audio input data and the corresponding labels are in array format, they are easier to consume and to process with natural language processing techniques. We can convert the audio file labels into integers using label encoding, or into one-hot vectors, so that a model can learn from them. The encoded labels define the targets that the output layer of the neural network predicts. Both the training and validation datasets are then represented as n-dimensional arrays.
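To make the two encodings concrete, here is a minimal sketch. The label strings are hypothetical and are not taken from the Swahili or Amharic data; `LabelEncoder` is the scikit-learn class imported later in this notebook, and the one-hot step uses plain NumPy indexing.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical string labels, one per audio clip (not from the source data).
labels = np.array(["happy", "sad", "angry", "sad", "happy"])

# Label encoding: each distinct class is mapped to an integer id
# (classes are sorted alphabetically before ids are assigned).
encoder = LabelEncoder()
int_labels = encoder.fit_transform(labels)

# One-hot encoding: row i has a single 1 in the column of its class id.
num_classes = len(encoder.classes_)
one_hot = np.eye(num_classes, dtype=np.float32)[int_labels]

print(list(encoder.classes_))  # ['angry', 'happy', 'sad']
print(int_labels.tolist())     # [1, 2, 0, 2, 1]
print(one_hot.shape)           # (5, 3)
```

Integer labels are compact and pair with sparse loss functions, while one-hot vectors match an output layer that has one unit per class.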
At this stage, we apply further pre-processing steps, such as dropping columns and normalization, to arrive at the final training data for building models. The next stage, splitting the dataset into training, test, and validation sets, is the same as for other models. The diagram below is a generic representation of a convolutional neural network.
We can use deep neural network architectures such as CNNs, RNNs, and LSTMs to build and train models for speech applications such as speech recognition, emotion recognition, music genre classification, voice biometrics, and many more. A model trained on fixed-length audio chunks of a few seconds each, transformed into n-dimensional arrays with their respective labels, predicts output labels for test audio input. Because there are more than two possible output labels, this is a multi-class classification problem.
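The following sketch illustrates the two ideas in the paragraph above: cutting a waveform into fixed-length chunks, and turning per-class scores into a multi-class prediction. It is only an illustration under stated assumptions: the sample rate, chunk length, random signal, and score values are all made up, and `chunk_audio` and `softmax` are hypothetical helpers, not functions from this project's `scripts` directory.

```python
import numpy as np

def chunk_audio(signal, sample_rate, chunk_seconds=3.0):
    """Split a 1-D signal into equal-length chunks, zero-padding the last one."""
    chunk_len = int(sample_rate * chunk_seconds)
    n_chunks = max(1, -(-len(signal) // chunk_len))  # ceiling division
    padded = np.zeros(n_chunks * chunk_len, dtype=signal.dtype)
    padded[:len(signal)] = signal
    return padded.reshape(n_chunks, chunk_len)

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Assumed values: 7 seconds of random "audio" at 8 kHz.
sample_rate = 8000
signal = np.random.default_rng(0).normal(size=sample_rate * 7).astype(np.float32)
chunks = chunk_audio(signal, sample_rate)
print(chunks.shape)  # (3, 24000)

# Hypothetical raw scores for 3 chunks over 4 classes, standing in for the
# output layer of a trained model.
logits = np.array([[2.0, 0.5, 0.1, -1.0],
                   [0.2, 0.1, 3.0, 0.0],
                   [0.0, 1.5, 0.3, 0.2]])
probabilities = softmax(logits)
predicted = probabilities.argmax(axis=1)
print(predicted.tolist())  # [0, 2, 1]
```

Each row of `probabilities` sums to 1, so the predicted class for a chunk is simply the index of the largest entry.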


In [2]:
import os
import sys

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Make the project-local helper modules in ../scripts importable.
sys.path.append(os.path.abspath('../scripts'))
from deep_learner import DeepLearn
from modeling import Modeler
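
The normalization and train/validation/test split described earlier can be sketched with `train_test_split` and `StandardScaler`. This is a minimal example under assumptions: the feature matrix and labels are random placeholders, the 60/20/20 ratio is an arbitrary choice, and it is not necessarily what `Modeler` does internally.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 9))      # hypothetical features: 100 clips, 9 columns
y = rng.integers(0, 4, size=100)   # hypothetical integer class labels

# 60 % train, 20 % validation, 20 % test: hold out 40 %, then halve it.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=42)

# Fit the scaler on the training split only, so that no statistics from the
# validation or test data leak into training.
scaler = StandardScaler().fit(X_train)
X_train, X_val, X_test = (scaler.transform(a) for a in (X_train, X_val, X_test))

print(X_train.shape, X_val.shape, X_test.shape)  # (60, 9) (20, 9) (20, 9)
```

Fixing `random_state` keeps the split reproducible between notebook runs.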

# Deep Learning Model

**Objective**: Build a deep learning model that converts speech to text.

In [3]:
swahili_df = pd.read_csv("../data/swahili.csv")
amharic_df = pd.read_csv("../data/amharic.csv")

In [4]:
pre_model = Modeler()

In [5]:
swahili_preprocessed = pre_model.preprocessor_audio(swahili_df,'key','text')

In [6]:
amharic_preprocessed = pre_model.preprocessor_audio(amharic_df,'key','text')

In [7]:
swahili_preprocessed

(array([[ 0.        ,  0.        ,  1.35012129,  2.8589185 , -0.45403679,
         -0.19338387, -0.46378703, -0.26720165,  2.91466043],
        [ 0.        ,  0.        , -0.6333291 , -0.48754577, -1.59989106,
         -0.03052994, -0.95728472, -1.26394851, -0.49499576],
        [ 0.        ,  0.        ,  0.88145886, -0.45618295,  0.26780785,
          0.57184897,  0.65307092,  0.3661349 , -0.14130002],
        [ 0.        ,  0.        ,  0.05497682, -0.34597687, -1.16898739,
         -1.4174735 , -1.45522741, -0.94115513, -0.51615832],
        [ 0.        ,  0.        ,  1.02100115,  0.40278805,  0.87121582,
          1.33068917,  1.10749208, -0.19965215, -0.24215389],
        [ 0.        ,  0.        , -1.66993095,  0.1033607 ,  2.08049324,
          0.97513217,  1.77539254,  2.57029898, -0.42481577],
        [ 0.        ,  0.        , -1.25093302, -0.5033259 , -0.1902634 ,
          0.82800176,  0.2326571 , -0.67805882,  0.02949438],
        [ 0.        ,  0.        , -0.87937071, 