# Deep Learning

## Introduction

- Categories
  - Supervised
  - Semi-supervised
  - Unsupervised

- Use Cases
  - Pattern Recognition
  - Time Series Prediction
  - Signal Processing
  - Anomaly Detection
  - Control

- Roles
  - Neuron
  - Perceptron

- Domains
  - Learning
  - Planning
  - Problem Solving

- NN Model
  - Inputs-Neuron-Outputs
  - Layered
    - Input Layer
    - Hidden Layer
    - Output Layer

- Terms
  - Activation Functions
    - Threshold
    - Sigmoid
    - Rectifier
      - ReLU: Rectified Linear Unit
    - Hyperbolic Tangent: tanh
    - ELU: Exponential Linear Unit
    - Periodic
  - Forward Propagation
  - Back Propagation
  - Gradient Descent
  - Stochastic Gradient Descent
  - Optimization Functions
    - Momentum Optimizer
    - Nesterov Accelerated Gradient
    - RMSProp
    - Adam: Adaptive moment estimation
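
The activation functions listed above can be sketched in a few lines. This is a minimal illustration using only the standard library. The function names, the `alpha=1.0` default for ELU, and the choice of sine as the periodic example are assumptions made for this sketch, not taken from the notes or from any framework.

```python
import math

def threshold(x):
    """Step function: 1 for non-negative input, otherwise 0."""
    return 1.0 if x >= 0 else 0.0

def sigmoid(x):
    """Logistic function, maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return max(0.0, x)

def tanh(x):
    """Hyperbolic tangent, maps any real input into (-1, 1)."""
    return math.tanh(x)

def elu(x, alpha=1.0):
    """Exponential Linear Unit; alpha=1.0 is an assumed default."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def sine(x):
    """One possible periodic activation (assumed example)."""
    return math.sin(x)

# Evaluate each function at a negative, zero, and positive input.
for fn in (threshold, sigmoid, relu, tanh, elu, sine):
    print(fn.__name__, [round(fn(v), 4) for v in (-2.0, 0.0, 2.0)])
```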

## DevOps

### Pipeline

#### Data Acquisition

#### Feature Extraction/Learning

#### Modeling/Learning

#### Testing/Evaluating

#### Deployment

# Math

- Scalar: $x \in R$
- Tensor
  - Order 1 tensor: Vector
    - column vector - default
    - row vector
  - Order 2 tensor: Matrix
  - Order n tensor: $x \in R^{D_1 \times D_2 \times \cdots \times D_n}$
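
As a rough illustration of the tensor orders above, the sketch below builds one array of each kind in NumPy and prints its order (`ndim`) and shape. The variable names and sizes are arbitrary choices for the example. Note that a plain 1-D NumPy array has no row or column orientation, so the column and row vectors are represented here as 2-D arrays.

```python
import numpy as np

scalar = np.float64(3.0)          # order 0: a single real number
vector = np.arange(4.0)           # order 1: shape (4,)
column = vector.reshape(-1, 1)    # column vector stored as a 4x1 array
row = column.T                    # row vector stored as a 1x4 array
matrix = np.zeros((4, 5))         # order 2
tensor3 = np.zeros((2, 3, 4))     # order 3: D_1 x D_2 x D_3

for name, t in [("scalar", scalar), ("vector", vector), ("column", column),
                ("row", row), ("matrix", matrix), ("tensor3", tensor3)]:
    print(name, np.ndim(t), np.shape(t))
```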

- Partial Derivative
  - $z\in R; x\in R^{D_1}; y\in R^{D_2}$
  - $\left[ \frac{\partial z}{\partial y} \right]_i = \frac{\partial z}{\partial y_i}$
  - $\left[ \frac{\partial y}{\partial x^T} \right]_{ij} = \frac{\partial y_i}{\partial x_j}$
  - $\frac{\partial z}{\partial x^T} = \frac{\partial z}{\partial y^T} \frac{\partial y}{\partial x^T}$
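
The chain rule above can be checked numerically. The sketch below assumes a toy setup chosen only for illustration: $y = Ax$ with a random matrix `A`, and $z = \sum_i y_i^2$. It compares the chain-rule product with central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
D1, D2 = 3, 4
A = rng.normal(size=(D2, D1))    # toy linear map, so dy/dx^T = A (shape D2 x D1)
x = rng.normal(size=D1)

def f_y(x):
    return A @ x                 # y in R^{D2}

def f_z(y):
    return np.sum(y ** 2)        # z in R

y = f_y(x)
dz_dyT = 2 * y                   # dz/dy^T, one entry per y_i
dy_dxT = A                       # Jacobian, entry (i, j) = dy_i/dx_j
dz_dxT = dz_dyT @ dy_dxT         # chain rule: dz/dx^T = dz/dy^T . dy/dx^T

# Central finite differences along each coordinate direction of x.
eps = 1e-6
numeric = np.array([
    (f_z(f_y(x + eps * e)) - f_z(f_y(x - eps * e))) / (2 * eps)
    for e in np.eye(D1)
])

print(np.allclose(dz_dxT, numeric, atol=1e-5))
```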

# ANN/Artificial Neural Network

## Demo/Churn_Modelling

In [97]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

import keras
from keras.models import Sequential
from keras.layers import Dense

In [61]:
filename='Churn_Modelling.csv'
rds = pd.read_csv('Data/'+filename, index_col=0)
#plt.figure(figsize=(12,8))
#sns.heatmap(data=rds.isnull(), yticklabels=False, cbar=False, cmap='viridis')

In [62]:
rds.head()

| RowNumber | CustomerId | Surname | CreditScore | Geography | Gender | Age | Tenure | Balance | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 15634602 | Hargrave | 619 | France | Female | 42 | 2 | 0.0 | 1 | 1 | 1 | 101348.88 | 1 |
| 2 | 15647311 | Hill | 608 | Spain | Female | 41 | 1 | 83807.86 | 1 | 0 | 1 | 112542.58 | 0 |
| 3 | 15619304 | Onio | 502 | France | Female | 42 | 8 | 159660.8 | 3 | 1 | 0 | 113931.57 | 1 |
| 4 | 15701354 | Boni | 699 | France | Female | 39 | 1 | 0.0 | 2 | 0 | 0 | 93826.63 | 0 |
| 5 | 15737888 | Mitchell | 850 | Spain | Female | 43 | 2 | 125510.82 | 1 | 1 | 1 | 79084.1 | 0 |


In [77]:
XC = rds.iloc[:,[3,4]].values
XN = rds.iloc[:,[2, 5, 6, 7, 8, 9, 10, 11]].values
y = rds[['Exited']].values
XC.shape, XN.shape,y.shape

((10000, 2), (10000, 8), (10000, 1))

In [83]:
def labelEncoding(X, cols):
    """
    Encode the given columns of X as integer labels, in place.
    """
    for i in cols:
        labelencoder = LabelEncoder()
        X[:, i] = labelencoder.fit_transform(X[:, i])
    return X

def onehotEncoding(XC):
    """
    One-hot encode the categorical columns, dropping the first category of each.
    """
    onehotencode = OneHotEncoder(categories='auto', drop='first')
    return onehotencode.fit_transform(XC).toarray()

XC = onehotEncoding(XC)
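
To make the effect of `drop='first'` concrete, here is a small NumPy-only sketch that mimics one-hot encoding with the first category dropped. The helper `one_hot_drop_first` and the four-row toy column are hypothetical and exist only for this example. It assumes categories are ordered by sorting, which is what `np.unique` does.

```python
import numpy as np

def one_hot_drop_first(column):
    """One-hot encode a 1-D array of labels, dropping the first sorted category."""
    categories, codes = np.unique(column, return_inverse=True)
    full = np.eye(len(categories))[codes]   # one indicator column per category
    return categories, full[:, 1:]          # drop the first column (it is implied by the rest)

geography = np.array(['France', 'Spain', 'France', 'Germany'])  # toy data
categories, encoded = one_hot_drop_first(geography)

print(categories)
print(encoded)
```

With three geography values and two genders, dropping the first category of each leaves 2 + 1 = 3 indicator columns, which together with the 8 numeric columns gives the 11 features seen after concatenation.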

In [102]:
X = np.concatenate((XN, XC), axis=1)
X_train, X_test,  y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=0)

# fit the scaler on the training split only, then reuse it for the test split
standard_featurescaling = StandardScaler()
X_train = standard_featurescaling.fit_transform(X_train)
X_test = standard_featurescaling.transform(X_test)
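
The scaling step above fits on the training split and only transforms the test split. A minimal NumPy sketch of the same idea follows, using randomly generated toy data (not the churn data set). It uses the population standard deviation (`ddof=0`), which is what `StandardScaler` documents.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=50.0, scale=10.0, size=(100, 2))  # toy training features
test = rng.normal(loc=50.0, scale=10.0, size=(20, 2))    # toy test features

# Statistics come from the training split only.
mean = train.mean(axis=0)
std = train.std(axis=0)          # ddof=0

train_scaled = (train - mean) / std
test_scaled = (test - mean) / std    # reuse training statistics, never refit on test data

print(train_scaled.mean(axis=0).round(6), train_scaled.std(axis=0).round(6))
print(test_scaled.mean(axis=0).round(3))
```

The test split will generally not have exactly zero mean and unit variance after this transform; that is expected, since the statistics were estimated from the training data.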

In [103]:
X_test.shape, X_train.shape, y_test.shape, y_train.shape

((3300, 11), (6700, 11), (3300, 1), (6700, 1))

In [104]:
classifier = Sequential()

# input layer and first hidden layer
classifier.add(Dense(6, activation='relu'))

# second hidden layer
classifier.add(Dense(6, activation='relu'))

# output layer
classifier.add(Dense(1, activation='sigmoid'))

# compiling the ANN
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
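
As a sanity check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes the 11 input features produced earlier and counts one weight per input-unit pair plus one bias per unit; `dense_params` is a hypothetical helper for this example.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units

layer_sizes = [11, 6, 6, 1]   # 11 input features, then the three Dense layers
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts, sum(counts))    # [72, 42, 7] 121
```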


In [105]:
classifier.fit(X_train, y_train, batch_size=10, epochs=100)

Epoch 1/100
...
Epoch 100/100


<keras.callbacks.callbacks.History at 0x26fb4081508>

In [110]:
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)

In [111]:
print(confusion_matrix(y_test, y_pred))

[[2501  116]
 [ 355  328]]


In [112]:
(2501+328)/y_pred.shape[0]

0.8572727272727273
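
Accuracy alone hides how the errors are distributed. The sketch below derives accuracy, precision, and recall from the confusion matrix printed above, assuming the scikit-learn convention that rows are actual classes and columns are predicted classes.

```python
cm = [[2501, 116],
      [355, 328]]   # rows: actual class, columns: predicted class

(tn, fp), (fn, tp) = cm
total = tn + fp + fn + tp

accuracy = (tn + tp) / total     # share of all customers classified correctly
precision = tp / (tp + fp)       # share of predicted churners who actually left
recall = tp / (tp + fn)          # share of actual churners the model caught

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
# accuracy=0.8573 precision=0.7387 recall=0.4802
```

The recall below 0.5 suggests the model misses more than half of the customers who actually exited, even though overall accuracy looks high.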

# CNN/Convolutional Neural Network

## Intuition

## Math

### Terms

- Epoch: one full pass over the training set; $t$ indexes the parameter-update iteration
- Learning Rate: $\eta$
- Back Propagation

- SGD: Stochastic Gradient Descent
  - Stochastic gradient descent updates the parameters using a gradient estimated from a (usually small) subset of the training examples:
  > $$
    \left[ w^i \right]^{t+1} = \left[ w^i \right]^{t} - \eta \frac{\partial z}{\partial \left[ w^i \right]^{t}}
  $$
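
The update rule above can be illustrated with a minimal mini-batch SGD loop. This sketch assumes a toy linear-regression problem with a mean-squared-error loss; the data, the learning rate `eta = 0.1`, and the batch size of 10 are all arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # toy inputs
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                           # noiseless toy targets

w = np.zeros(3)                          # parameters to learn
eta = 0.1                                # learning rate
batch_size = 10

for epoch in range(50):
    order = rng.permutation(len(X))      # reshuffle the examples every epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        residual = X[idx] @ w - y[idx]
        # Gradient of the mean squared error, estimated on this mini-batch only.
        grad = 2 * X[idx].T @ residual / len(idx)
        w = w - eta * grad               # w^{t+1} = w^t - eta * dz/dw^t

print(np.round(w, 3))
```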

## Demo/Cat_Classifier

In [1]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

from keras.preprocessing.image import ImageDataGenerator

from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPool2D
from keras.layers import Flatten
from keras.layers import Dense

Using TensorFlow backend.


In [2]:
testset_path = 'Data/dataset/test_set'
trainingset_path = 'Data/dataset/training_set'

training_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

training_set = training_datagen.flow_from_directory(
        trainingset_path,
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')

test_set = test_datagen.flow_from_directory(
        testset_path,
        target_size=(64, 64),
        batch_size=32,
        class_mode='binary')

Found 8000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.


In [3]:
# CNN Init
classifier = Sequential()

# Convolution
classifier.add(Convolution2D(filters=32, 
                             kernel_size=(3, 3), 
                             data_format='channels_last',
                             input_shape=(64, 64, 3),
                             activation='relu'
                            ))

# Max Pooling
classifier.add(MaxPool2D(pool_size=(2, 2)))


# Flattening
classifier.add(Flatten(data_format='channels_last'))


# Full Connection
# fully connected hidden layer
classifier.add(Dense(128, activation='relu'))
# output layer
classifier.add(Dense(1, activation='sigmoid'))

# compiling the CNN
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
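
The layer shapes and parameter counts of this model can be worked out by hand. The sketch below assumes the Keras defaults for the layers as configured above: no padding and stride 1 for the convolution, and a pooling stride equal to the pool size. `conv_output` is a hypothetical helper for this example.

```python
def conv_output(size, kernel, stride=1):
    """Spatial size after a convolution with no padding."""
    return (size - kernel) // stride + 1

side = conv_output(64, 3)            # 62: each 3x3 filter trims one pixel per border
conv_params = 3 * 3 * 3 * 32 + 32    # kernel height * width * input channels * filters + biases
side = side // 2                     # 31 after 2x2 max pooling
flat = side * side * 32              # 30752 values after flattening
hidden_params = flat * 128 + 128     # Dense(128)
output_params = 128 * 1 + 1          # Dense(1)

total = conv_params + hidden_params + output_params
print(side, flat, total)             # 31 30752 3937409
```

Almost all of the parameters sit in the first fully connected layer, which is why pooling before flattening matters for model size.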


In [4]:
classifier.fit_generator(
        training_set,
        steps_per_epoch=8000,
        epochs=25,
        validation_data=test_set,
        validation_steps=2000)

Epoch 1/25
Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
 437/8000 [>.............................] - ETA: 13:00 - loss: 0.2706 - accuracy: 0.8816

KeyboardInterrupt: 
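
In Keras 2, `steps_per_epoch` and `validation_steps` count batches, not images. If the intent is one pass over the data per epoch, the step counts for a batch size of 32 can be computed as in the sketch below; `steps_for` is a hypothetical helper, not a Keras function.

```python
import math

def steps_for(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

train_steps = steps_for(8000, 32)
val_steps = steps_for(2000, 32)
print(train_steps, val_steps)   # 250 63
```

With `steps_per_epoch=8000`, each epoch draws 8000 batches of 32 augmented images, which is 32 passes over the 8000 training images and is consistent with the long ETA shown in the log.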

# RNN/Recurrent Neural Network

# End

## Pipeline