# Machine Learning

Machine learning (“ML”) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without explicit instructions, relying instead on patterns and inference derived from data. In simple terms, machine learning teaches a machine or a piece of software to learn on its own, in an attempt to mimic human intelligence.

# Deep Learning 

Deep learning (“DL”) is a subtype of machine learning. DL can process a wider range of data sources, requires less data preprocessing by humans (e.g. manual feature engineering), and can sometimes produce more accurate results than traditional ML approaches (although it requires a larger amount of data to do so). However, it is more expensive in terms of execution time, hardware costs, and data quantities.

# Working of Machine Learning and Deep Learning

<img src = "http://deepsphere.ai/Demo/ML vs DL.png">

# Architecture of Convolutional Neural Networks

<img src = "http://deepsphere.ai/Demo/DLAC1.png">

# Recurrent Neural Networks

- Recurrent Neural Networks (RNNs) are the neural network architecture of choice for problems that deal with sequential data.

- They are increasingly popular due to their strong results in Natural Language Processing (NLP).

- Within NLP they are used for a wide variety of tasks, such as translation, text classification, and automatic text generation.

<img src = "http://deepsphere.ai/Demo/RNN.png">

<img src = "http://deepsphere.ai/Demo/RNN_1.png">

# Recurrent Neuron

Let’s start with a simple task: a character-level RNN and the word “hello”. We provide the first four letters, i.e. h, e, l, l, and ask the network to predict the last letter, i.e. “o”. The vocabulary of this task is just four letters: {h, e, l, o}. In real-world natural language processing, the vocabulary can include every word in the entire Wikipedia database, or all the words in a language. Here, for simplicity, we use a very small vocabulary.

<img src = "http://deepsphere.ai/Demo/Recurrent Neuron.png">

Let’s see how the above structure can be used to predict the fifth letter in the word “hello”. In the above structure, the blue RNN block applies a recurrence formula to the input vector and to its previous state. The letter “h” has nothing preceding it, so let’s take the letter “e”. When “e” is supplied to the network, the recurrence formula is applied to “e” and to the previous state, which was produced from the letter “h”. These are the time steps of the input: if the input at time t is “e”, the input at time t-1 was “h”. Applying the recurrence formula to both gives a new state.

The formula for the current state can be written as:


<img src = "http://deepsphere.ai/Demo/Recurrent Neuron_1.png">

Here, $h_t$ is the new state, $h_{t-1}$ is the previous state, and $x_t$ is the current input. The network carries forward the state derived from the previous input rather than the input itself, because the recurrent neuron has already applied its transformation to that input. Each successive input is called a time step.

In this case we have four inputs to give to the network. The recurrence formula applies the same function and the same weights at every time step.

Taking the simplest form of a recurrent neural network, let’s say that the activation function is tanh, the weight at the recurrent neuron is $W_{hh}$, and the weight at the input neuron is $W_{xh}$. We can then write the equation for the state at time t as:

$$h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t)$$

The recurrent neuron in this case takes only the immediately previous state into consideration. For longer sequences, the equation can involve multiple such states. Once the final state is calculated, we can go on to produce the output.

Once the current state is calculated, the output can be calculated as follows, where $W_{hy}$ denotes the weight at the output layer:

$$y_t = W_{hy} h_t$$

<img src = "http://deepsphere.ai/Demo/Recurrent Neuron_2.png">
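The recurrence described in this section can be sketched in a few lines of NumPy. This is a minimal, untrained illustration, not the implementation behind the figures: the hidden size of 3, the random weight initialisation, and the output weight `W_hy` are assumptions made only for the example.

```python
import numpy as np

# Vocabulary from the text: the four distinct letters of "hello".
vocab = ["h", "e", "l", "o"]
char_to_index = {ch: i for i, ch in enumerate(vocab)}

def one_hot(ch):
    """Return a one-hot column vector for a character in the vocabulary."""
    vec = np.zeros((len(vocab), 1))
    vec[char_to_index[ch]] = 1.0
    return vec

hidden_size = 3  # assumed size of the hidden state, chosen only for illustration
rng = np.random.default_rng(0)

# Randomly initialised (untrained) weights; W_hy is an assumed output weight.
W_xh = rng.normal(scale=0.1, size=(hidden_size, len(vocab)))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_hy = rng.normal(scale=0.1, size=(len(vocab), hidden_size))

def rnn_forward(chars):
    """Apply the same recurrence at every time step; return all states and outputs."""
    h = np.zeros((hidden_size, 1))  # initial state: nothing precedes the first letter
    states, outputs = [], []
    for ch in chars:
        h = np.tanh(W_hh @ h + W_xh @ one_hot(ch))  # state update with shared weights
        y = W_hy @ h                                 # output computed from the state
        states.append(h)
        outputs.append(y)
    return states, outputs

states, outputs = rnn_forward("hell")
# The network is untrained, so this guess is arbitrary.
predicted = vocab[int(np.argmax(outputs[-1]))]
print(len(states), outputs[-1].shape, predicted)
```

Because the weights are random, the predicted letter carries no meaning here; training would adjust `W_xh`, `W_hh`, and `W_hy` so that the last output favours “o”.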

# Backward Propagation in a Recurrent Neuron 

Backpropagation Through Time, or BPTT, is the training algorithm used to update weights in recurrent neural networks such as LSTMs. To frame sequence prediction problems effectively for recurrent neural networks, you need a strong conceptual understanding of what Backpropagation Through Time does and how configurable variations such as Truncated Backpropagation Through Time affect the skill, stability, and speed of training.

<img src="http://deepsphere.ai/Demo/BPTT1.png">

<img src="http://deepsphere.ai/Demo/BPTT.png">
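To make the idea concrete, here is a small sketch of BPTT for a scalar RNN with a tanh activation. The squared-error loss on the final state, the function names, and the `truncate` parameter are assumptions chosen for illustration; a finite-difference check is included to confirm that the hand-written gradient agrees with a numerical one.

```python
import math

def forward(w_h, w_x, xs):
    """Run a scalar RNN h_t = tanh(w_h * h_{t-1} + w_x * x_t); return all states."""
    hs = [0.0]  # hs[0] is the initial state
    for x in xs:
        hs.append(math.tanh(w_h * hs[-1] + w_x * x))
    return hs

def loss(w_h, w_x, xs, target):
    """Squared error between the final state and the target."""
    return 0.5 * (forward(w_h, w_x, xs)[-1] - target) ** 2

def bptt(w_h, w_x, xs, target, truncate=None):
    """Backpropagate the final-step error through time.

    If `truncate` is given, only that many most recent time steps are visited,
    which is a simple form of truncated BPTT.
    """
    hs = forward(w_h, w_x, xs)
    grad_w_h = 0.0
    grad_w_x = 0.0
    dh = hs[-1] - target  # derivative of the loss with respect to the final state
    steps = len(xs) if truncate is None else min(truncate, len(xs))
    for t in range(len(xs), len(xs) - steps, -1):
        da = dh * (1.0 - hs[t] ** 2)  # back through tanh
        grad_w_h += da * hs[t - 1]    # the same weight is shared by every step
        grad_w_x += da * xs[t - 1]
        dh = da * w_h                 # pass the error to the previous state
    return grad_w_h, grad_w_x

xs = [0.5, -0.3, 0.8, 0.1]
w_h, w_x, target = 0.7, 0.4, 0.2

g_h, g_x = bptt(w_h, w_x, xs, target)

# Numerical gradients by central differences, used only as a sanity check.
eps = 1e-6
num_g_h = (loss(w_h + eps, w_x, xs, target) - loss(w_h - eps, w_x, xs, target)) / (2 * eps)
num_g_x = (loss(w_h, w_x + eps, xs, target) - loss(w_h, w_x - eps, xs, target)) / (2 * eps)
print(g_h, num_g_h)
print(g_x, num_g_x)
```

Calling `bptt(w_h, w_x, xs, target, truncate=2)` visits only the last two time steps, which is cheaper but ignores the contribution of earlier inputs.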

# Vanishing Gradient Problem

- The gradient descent algorithm searches for a minimum of the cost function, which corresponds to a good set of weights for the network. Information travels through the neural network from the input neurons to the output neurons, while the error is calculated and propagated back through the network to update the weights.

- It works quite similarly for RNNs, but here we’ve got a little bit more going on.

- Firstly, information travels through time in RNNs, which means that information from previous time points is used as input for the next time points. Secondly, you can calculate the cost function, or your error, at each time point.

- During training, your cost function compares your outcomes (red circles on the image below) to your desired output. As a result, you have these values throughout the time series, for every single one of these red circles.

- When the error at a late time point is propagated back to earlier time points, it is multiplied by the recurrent weight at every step. If these factors are smaller than 1.0, the gradient shrinks exponentially, so the earliest time steps receive almost no update. This is the vanishing gradient problem.

<img src="http://deepsphere.ai/Demo/Vanishing_Gradient.png">
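A rough numeric illustration of the effect, assuming a scalar RNN with a tanh activation: at each step back in time, the error signal is multiplied by the recurrent weight and by the slope of tanh at that step. The values 0.9 and 0.8 below are illustrative assumptions, not measurements from any trained model.

```python
def gradient_scale(recurrent_weight, tanh_slope, steps):
    """Relative size of an error signal after travelling back `steps` time steps."""
    return abs(recurrent_weight * tanh_slope) ** steps

# Each step multiplies the signal by 0.9 * 0.8 = 0.72, so it decays exponentially.
scales = {steps: gradient_scale(0.9, 0.8, steps) for steps in (1, 10, 50)}
for steps, scale in scales.items():
    print(steps, scale)
```

After 50 steps the signal is many orders of magnitude smaller than after one step, which is why early inputs barely influence the weight update.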

# Exploding Gradient Problem

- An error gradient is the direction and magnitude calculated during the training of a neural network; it is used to update the network weights in the right direction and by the right amount.

- In deep networks or recurrent neural networks, error gradients can accumulate during an update and result in very large gradients. These in turn result in large updates to the network weights and, consequently, an unstable network.

- At an extreme, the values of the weights can become so large that they overflow and result in NaN values.

- The explosion occurs through exponential growth, when gradients are repeatedly multiplied through network layers by values larger than 1.0.
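The growth described above, and one commonly used mitigation, can be sketched as follows. Gradient clipping by norm is not discussed in this document; it is shown here only as an assumed example of how large gradients are often kept in check, and the factor 1.5 and the helper name `clip_by_norm` are illustrative.

```python
import math

def clip_by_norm(gradients, max_norm):
    """Rescale a list of gradient values so its L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(g * g for g in gradients))
    if norm <= max_norm or norm == 0.0:
        return list(gradients)
    scale = max_norm / norm
    return [g * scale for g in gradients]

# A per-step factor above 1.0 grows exponentially with the number of steps.
factor = 1.5
exploded = [factor ** steps for steps in (10, 50, 100)]
print(exploded)

# Clipping keeps the direction of the update but bounds its size.
clipped = clip_by_norm(exploded, max_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for g in clipped))
print(clipped_norm)
```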

# Steps of Implementation - Product Level Quantity Forecasting

In [1]:
#********************************************************************************

## Step 1: INI File Configuration

#********************************************************************************

import configparser
import os
vAR_Config = configparser.ConfigParser(allow_no_value=True)

vAR_INI_FILE_PATH = os.getenv('TIME_SERIES_INI_DL')
#print(vAR_INI_FILE_PATH)

vAR_Config.read(vAR_INI_FILE_PATH)

vAR_Data = vAR_Config.sections()

vAR_Config.sections()

vAR_Training_Data = vAR_Config['FILE PATH']['TRAINING_DATA']
#print(vAR_Training_Data)

vAR_Training_Data_Excel_Worsheet = vAR_Config['FILE PATH']['TRAINING_DATA_EXCEL_WORKSHEET']
#print(vAR_Training_Data_Excel_Worsheet)

vAR_Test_Data = vAR_Config['FILE PATH']['TEST_DATA']
#print(vAR_Test_Data)

vAR_Test_Data_Excel_Worsheet = vAR_Config['FILE PATH']['TEST_DATA_EXCEL_WORKSHEET']
#print(vAR_Test_Data_Excel_Worsheet)

vAR_FORECAST_PRODUCT_QUANTITY_RNN = vAR_Config['FILE PATH']['FORECAST_PRODUCT_QUANTITY_RNN']
#print(vAR_FORECAST_PRODUCT_QUANTITY_RNN)

vAR_FORECAST_PRODUCT_QUANTITY_ALL_RNN = vAR_Config['FILE PATH']['FORECAST_PRODUCT_QUANTITY_ALL_RNN']
#print(vAR_FORECAST_PRODUCT_QUANTITY_ALL_RNN)

vAR_FORECAST_SKU_QUANTITY_RNN = vAR_Config['FILE PATH']['FORECAST_SKU_QUANTITY_RNN']
#print(vAR_FORECAST_SKU_QUANTITY_RNN)

vAR_FORECAST_SKU_QUANTITY_ALL_RNN = vAR_Config['FILE PATH']['FORECAST_SKU_QUANTITY_ALL_RNN']
#print(vAR_FORECAST_SKU_QUANTITY_ALL_RNN)


vAR_FORECAST_COLUMN = vAR_Config['OUTPUT']['FORECAST_COLUMN']
#print(vAR_FORECAST_COLUMN)

vAR_FORECAST_COLUMN1 = vAR_Config['OUTPUT']['FORECAST_COLUMN1']
#print(vAR_FORECAST_COLUMN1)
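For reference, the INI file read in Step 1 might look like the sketch below. The section and key names are taken from the lookups in the code above, but every value (file paths, worksheet names, column labels) is a hypothetical placeholder. The example also shows one way to check that the expected keys are present before using them.

```python
import configparser

# Hypothetical INI content: section and key names match Step 1,
# but every value is a placeholder.
sample_ini = """
[FILE PATH]
TRAINING_DATA = data/training.xlsx
TRAINING_DATA_EXCEL_WORKSHEET = Training
TEST_DATA = data/test.xlsx
TEST_DATA_EXCEL_WORKSHEET = Test
FORECAST_PRODUCT_QUANTITY_RNN = output/forecast_product.xlsx
FORECAST_PRODUCT_QUANTITY_ALL_RNN = output/forecast_product_all.xlsx
FORECAST_SKU_QUANTITY_RNN = output/forecast_sku.xlsx
FORECAST_SKU_QUANTITY_ALL_RNN = output/forecast_sku_all.xlsx

[OUTPUT]
FORECAST_COLUMN = Forecasted SKU Quantity
FORECAST_COLUMN1 = Forecasted Product Quantity
"""

config = configparser.ConfigParser(allow_no_value=True)
config.read_string(sample_ini)

# Keys the later steps rely on; report any that are absent.
required = {
    "FILE PATH": [
        "TRAINING_DATA",
        "TRAINING_DATA_EXCEL_WORKSHEET",
        "TEST_DATA",
        "TEST_DATA_EXCEL_WORKSHEET",
        "FORECAST_PRODUCT_QUANTITY_ALL_RNN",
    ],
    "OUTPUT": ["FORECAST_COLUMN", "FORECAST_COLUMN1"],
}
missing = [
    (section, key)
    for section, keys in required.items()
    for key in keys
    if not config.has_option(section, key)
]
print(missing)
```

In the notebook itself the file is located through the `TIME_SERIES_INI_DL` environment variable, so that variable must be set before Step 1 runs.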


In [2]:
#********************************************************************************
    
## Step 2: Importing the Required Libraries

#********************************************************************************

import pandas as vAR_pd

import numpy as vAR_np

from sklearn.preprocessing import MinMaxScaler

from sklearn.preprocessing import LabelEncoder

from keras.models import Sequential

from keras.layers import Dense

from keras.layers import LSTM

from keras.layers import Dropout

import datetime
    
import tensorflow as tf


Using TensorFlow backend.


In [3]:
#********************************************************************************
    
## Step 3: Importing the Training Data

#********************************************************************************

vAR_INPUT_DATA = vAR_pd.read_excel(vAR_Training_Data,sheet_name=vAR_Training_Data_Excel_Worsheet)

vAR_INPUT_DATA = vAR_pd.DataFrame(vAR_INPUT_DATA)

vAR_INPUT_DATA.head()

vAR_TRAINING_DATA = vAR_INPUT_DATA

vAR_TRAINING_DATA.head()

# A single LabelEncoder is refit for each column, so only the mapping of the last fitted column is retained.
# Column names are passed as lists (a set is unordered and is rejected by recent pandas versions).
vAR_le = LabelEncoder()


vAR_Region_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,0])

vAR_Region_Conversion_df = vAR_pd.DataFrame(vAR_Region_Conversion,columns=['Region_Converted'])


vAR_County_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,1])

vAR_County_Conversion_df = vAR_pd.DataFrame(vAR_County_Conversion,columns=['County_Converted'])


vAR_State_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,2])

vAR_State_Conversion_df = vAR_pd.DataFrame(vAR_State_Conversion,columns=['State_Converted'])


vAR_City_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,3])

vAR_City_Conversion_df = vAR_pd.DataFrame(vAR_City_Conversion,columns=['City_Converted'])


vAR_Customer_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,4])

vAR_Customer_Conversion_df = vAR_pd.DataFrame(vAR_Customer_Conversion,columns=['Customer_Converted'])


vAR_Product_Family_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,5])

vAR_Product_Family_Conversion_df = vAR_pd.DataFrame(vAR_Product_Family_Conversion,columns=['Product_family_Converted'])


vAR_Product_Group_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,6])

vAR_Product_Group_Conversion_df = vAR_pd.DataFrame(vAR_Product_Group_Conversion,columns=['Product Group_Converted'])


vAR_Product_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,7])

vAR_Product_Conversion_df = vAR_pd.DataFrame(vAR_Product_Conversion,columns=['Product_Converted'])


vAR_SKU_Conversion = vAR_le.fit_transform(vAR_TRAINING_DATA.iloc[:,8])

vAR_SKU_Conversion_df = vAR_pd.DataFrame(vAR_SKU_Conversion,columns=['SKU_Converted'])


# Attach the converted numerical data to the main dataframe

vAR_df1 = vAR_pd.DataFrame(columns=['Region','County','State','City','Customer','Product Family','Product Group','Product','SKU'])

vAR_df1['Region'] = vAR_Region_Conversion

vAR_df1['County'] = vAR_County_Conversion

vAR_df1['State'] = vAR_State_Conversion

vAR_df1['City'] = vAR_City_Conversion

vAR_df1['Customer'] = vAR_Customer_Conversion

vAR_df1['Product Family'] = vAR_Product_Family_Conversion

vAR_df1['Product Group'] = vAR_Product_Group_Conversion

vAR_df1['Product'] = vAR_Product_Conversion

vAR_df1['SKU'] = vAR_SKU_Conversion

vAR_df1['Year'] = vAR_TRAINING_DATA['Year']

vAR_df1['Month'] = vAR_TRAINING_DATA['Month']

vAR_df1['Time'] = vAR_TRAINING_DATA['Time']

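# NOTE: the encoded feature frame selected here is overwritten by the next assignment,
# so the model is trained only on the univariate series in column index 12
# (by position this appears to be 'Item Count' in the output shown in Step 8).
vAR_TRAINING_DATA_PRODUCT_QUANTITY_FEATURES = vAR_df1[['Region', 'County', 'State', 'City', 'Customer', 'Product Family','Product Group', 'Product', 'SKU', 'Year', 'Month']]

vAR_TRAINING_DATA_PRODUCT_QUANTITY_FEATURES = vAR_TRAINING_DATA.iloc[:,12].values

#vAR_TRAINING_DATA_PRODUCT_QUANTITY_FEATURES = vAR_TRAINING_DATA_PRODUCT_QUANTITY_FEATURES.astype('float32')

# Reshape to a single-column 2-D array, as MinMaxScaler expects
vAR_TRAINING_DATA_PRODUCT_QUANTITY_FEATURES = vAR_TRAINING_DATA_PRODUCT_QUANTITY_FEATURES.reshape(-1,1)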

vAR_scaler = MinMaxScaler(feature_range=(0, 1))

vAR_TRAINING_DATA_PRODUCT_QUANTITY_SCALED_FEATURES = vAR_scaler.fit_transform(vAR_TRAINING_DATA_PRODUCT_QUANTITY_FEATURES)

#vAR_TRAINING_DATA_PRODUCT_QUANTITY_LABEL = vAR_TRAINING_DATA['Quantity_Product']

vAR_TRAINING_DATA_PRODUCT_QUANTITY_FEATURES_train = []

vAR_TRAINING_DATA_PRODUCT_QUANTITY_LABEL_train = []

vAR_X_train = []

vAR_y_train = []

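# Build sliding windows: each sample holds the previous 60 scaled values and the label is the value that follows.
# The upper bound was originally hard-coded as 6480; using the series length assumes the training sheet has that many rows.
for i in range(60, len(vAR_TRAINING_DATA_PRODUCT_QUANTITY_SCALED_FEATURES)):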
    
    vAR_X_train.append(vAR_TRAINING_DATA_PRODUCT_QUANTITY_SCALED_FEATURES[i-60:i, 0])
    
    vAR_y_train.append(vAR_TRAINING_DATA_PRODUCT_QUANTITY_SCALED_FEATURES[i, 0])

vAR_X_train, vAR_y_train = vAR_np.array(vAR_X_train), vAR_np.array(vAR_y_train)

vAR_X_train = vAR_np.reshape(vAR_X_train, (vAR_X_train.shape[0], vAR_X_train.shape[1], 1))
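The windowing performed at the end of Step 3 can be seen more easily on a toy series. The helper below is a sketch with a hypothetical name (`make_windows`) and a window of 3 instead of 60; it produces inputs in the `(samples, time steps, features)` shape that the LSTM layers expect.

```python
import numpy as np

def make_windows(series, window):
    """Split a 1-D series into (samples, window, 1) inputs and next-value labels."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i - window:i] for i in range(window, len(series))])
    y = series[window:]
    return X.reshape(X.shape[0], window, 1), y

X, y = make_windows(range(10), window=3)
print(X.shape, y.shape)      # (7, 3, 1) (7,)
print(X[0].ravel(), y[0])    # first window is 0, 1, 2 and its label is 3
```

A series of length n yields n minus the window size samples, which is why the training loop starts at index 60.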



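# Vanishing Gradient and Exploding Gradient Problem

The model built in the next step stacks LSTM layers, a recurrent architecture whose gated cell state is designed to mitigate the vanishing gradient problem.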

In [5]:
#********************************************************************************
    
## Step 4: Building the Recurrent Neural Network (RNN) Model

#********************************************************************************

regressor = Sequential()

regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (vAR_X_train.shape[1], 1)))

regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))

regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50, return_sequences = True))

regressor.add(Dropout(0.2))

regressor.add(LSTM(units = 50))

regressor.add(Dropout(0.2))

regressor.add(Dense(units = 1))

regressor.compile(optimizer = 'adam', loss = 'mean_squared_error')


#********************************************************************************
    
## Step 5: Training the Model

#********************************************************************************

vAR_log_dir = "C:/Users/durga/Tensorboard_RNN/logs"

vAR_tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=vAR_log_dir, histogram_freq=1)

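# NOTE: the TensorBoard callback above is created but not passed to fit(), so no logs are written.
# To enable it, pass callbacks=[vAR_tensorboard_callback] and take the model and the callback from the same Keras package.
regressor.fit(vAR_X_train, vAR_y_train, epochs = 1, batch_size = 32)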


Epoch 1/1


<keras.callbacks.callbacks.History at 0x1582fb15c18>
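To get a feel for the size of the stacked model defined in Step 4, its trainable parameters can be counted by hand. The sketch below assumes the standard LSTM layout with biases enabled: four weight sets (input, forget, and output gates plus the candidate cell state), each with input weights, recurrent weights, and a bias. The helper names are hypothetical, and Dropout layers add no parameters.

```python
def lstm_params(units, input_dim):
    """Trainable parameters of one LSTM layer with biases."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """Trainable parameters of a fully connected layer with biases."""
    return units * input_dim + units

layers = [
    ("LSTM 1", lstm_params(50, 1)),    # input is one feature per time step
    ("LSTM 2", lstm_params(50, 50)),   # input is the 50-unit output of the layer before
    ("LSTM 3", lstm_params(50, 50)),
    ("LSTM 4", lstm_params(50, 50)),
    ("Dense", dense_params(1, 50)),
]
for name, count in layers:
    print(name, count)

total = sum(count for _, count in layers)
print(total)  # 71051
```

With only one training epoch, as in Step 5, a model of this size is unlikely to be fully fitted, so the forecasts that follow should be read as a demonstration of the workflow.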

In [6]:
#********************************************************************************

## Step 6: Import the Test Data

#********************************************************************************

vAR_TEST_DATA = vAR_pd.read_excel(vAR_Test_Data,sheet_name=vAR_Test_Data_Excel_Worsheet)

vAR_TEST_DATA = vAR_TEST_DATA.iloc[:,12]

#vAR_TRAINING_DATA = vAR_pd.read_excel(vAR_Training_Data,sheet_name=vAR_TRAIN)

vAR_dataset_total = vAR_pd.concat((vAR_TRAINING_DATA.iloc[:,12], vAR_TEST_DATA), axis = 0)

vAR_inputs = vAR_dataset_total[len(vAR_dataset_total) - len(vAR_TEST_DATA) - 60:].values

vAR_inputs = vAR_inputs.reshape(-1,1)

vAR_inputs = vAR_scaler.transform(vAR_inputs)

vAR_X_test = []

# One window per test row; the upper bound was originally hard-coded as 6540.
for i in range(60, len(vAR_inputs)):
    
    vAR_X_test.append(vAR_inputs[i-60:i, 0])
    
vAR_X_test = vAR_np.array(vAR_X_test)

vAR_X_test = vAR_np.reshape(vAR_X_test, (vAR_X_test.shape[0], vAR_X_test.shape[1], 1))
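Step 6 prepends the last 60 training values to the test series so that even the first test row has a full window of history. The sketch below reproduces that slicing on stand-in arrays; the array contents and lengths are assumptions chosen so the result is easy to inspect.

```python
import numpy as np

window = 60
train_series = np.arange(200.0)        # stand-in for the training column
test_series = np.arange(200.0, 230.0)  # stand-in for the test column

# Same slicing as Step 6: keep the test rows plus the 60 values that precede them.
total = np.concatenate([train_series, test_series])
inputs = total[len(total) - len(test_series) - window:]

X_test = np.array([inputs[i - window:i] for i in range(window, len(inputs))])
print(X_test.shape)  # (30, 60)
```

The first window consists entirely of training values, and the number of windows equals the number of test rows, so each prediction lines up with one row of the test sheet.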



In [7]:
#********************************************************************************
    
## Step 7: Run Forecasting

#********************************************************************************

vAR_Predicted_Product_Quanity = regressor.predict(vAR_X_test)

vAR_Predicted_Product_Quanity = vAR_scaler.inverse_transform(vAR_Predicted_Product_Quanity)

vAR_Predicted_Product_Quanity


array([[109.0456 ],
       [110.89115],
       [113.76242],
       ...,
       [148.8577 ],
       [149.32448],
       [152.07413]], dtype=float32)
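The scaling and its inverse used around the model are simple enough to write out by hand. This sketch assumes the (0, 1) feature range configured in Step 3 and uses the ten `Item Count` values shown in the Step 8 output as sample data; the function names are hypothetical.

```python
def fit_min_max(values):
    """Return the minimum and maximum that define the scaling."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("cannot scale a constant series")
    return lo, hi

def scale(values, lo, hi):
    """Map values to the 0..1 range: (v - min) / (max - min)."""
    return [(v - lo) / (hi - lo) for v in values]

def unscale(values, lo, hi):
    """Invert the scaling to recover values in the original units."""
    return [v * (hi - lo) + lo for v in values]

item_counts = [247, 106, 237, 222, 237, 187, 125, 243, 185, 141]
lo, hi = fit_min_max(item_counts)
scaled = scale(item_counts, lo, hi)
restored = unscale(scaled, lo, hi)
print(min(scaled), max(scaled))
print(restored)
```

The model predicts values in the scaled range, which is why Step 7 applies `inverse_transform` before the forecasts are reported as quantities.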

In [8]:

#********************************************************************************
    
## Step 8: Write the Model Outcome to a file

#********************************************************************************

vAR_Predicted_Product_Quanity = vAR_pd.DataFrame(vAR_Predicted_Product_Quanity, columns=[vAR_FORECAST_COLUMN1]).astype(int)

vAR_TEST_DATA = vAR_pd.read_excel(vAR_Test_Data,sheet_name=vAR_Test_Data_Excel_Worsheet)

vAR_TEST_DATA['Forecasted Product Quantity'] = vAR_Predicted_Product_Quanity

# to_excel() returns None, so its result is not assigned
vAR_TEST_DATA.to_excel(vAR_FORECAST_PRODUCT_QUANTITY_ALL_RNN,index=False)

vAR_MODEL_OUTCOME_FORECASTED_QUANTITY_PRODUCT = vAR_pd.read_excel(vAR_FORECAST_PRODUCT_QUANTITY_ALL_RNN)

vAR_MODEL_OUTCOME_FORECASTED_QUANTITY_PRODUCT


| | Region | County | State | City | Customer | Product Family | Product Group | Product | SKU | Year | Month | Time | Item Count | Unit Price $ | Forecasted Product Quantity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Jan | 2020-01 | 247 | 54.48 | 109 |
| 1 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Feb | 2020-02 | 106 | 54.48 | 110 |
| 2 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Mar | 2020-03 | 237 | 54.48 | 113 |
| 3 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Apr | 2020-04 | 222 | 54.48 | 118 |
| 4 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | May | 2020-05 | 237 | 54.48 | 126 |
| 5 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Jun | 2020-06 | 187 | 54.48 | 136 |
| 6 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Jul | 2020-07 | 125 | 54.48 | 147 |
| 7 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Aug | 2020-08 | 243 | 54.48 | 158 |
| 8 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Sep | 2020-09 | 185 | 54.48 | 168 |
| 9 | North America | USA | Washington | Seattle | Walmart | Information Technology Broadcasting and Teleco... | Components for IT and telecommunications | System boards or modules | Audio accelerator cards | 2020 | Oct | 2020-10 | 141 | 54.48 | 176 |
