# Deep Learning and Neural Networks with Keras

> Notes from [this YouTube Series](https://www.youtube.com/watch?v=zYnI4iWRmpc). Full course notes can be found on [GitHub](https://github.com/jeffheaton/t81_558_deep_learning)

## Overview of Neural Networks

A Neural Network takes in some kind of data and is able to process data that other ML models are not really able to handle

In a normal model you would pass in a 1D vector such as a list of predictors. With an NN you can pass in more complex data, and the model will place weight on the position as well as the value of each data point, which is something other models can't necessarily handle

Some examples of higher order data can be:

1. 1D Vector - Normal input, like a row in a spreadsheet
2. 2D Matrix - Grayscale image
3. 3D Matrix - Colour image
4. nD Matrix - Any higher order data
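
As a rough illustration of the list above, the different orders of input data can be written as NumPy arrays of increasing dimensionality. The sizes used here (8 columns, 28x28 pixels, 10 frames) are arbitrary values chosen for the sketch, not something taken from the course.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
row = np.zeros(8)                       # 1D vector: one spreadsheet row with 8 columns
gray_image = np.zeros((28, 28))         # 2D matrix: height x width
colour_image = np.zeros((28, 28, 3))    # 3D matrix: height x width x RGB channels
video = np.zeros((10, 28, 28, 3))       # 4D: frames x height x width x channels

for name, data in [
    ("row", row),
    ("gray_image", gray_image),
    ("colour_image", colour_image),
    ("video", video),
]:
    print(f"{name}: ndim={data.ndim}, shape={data.shape}")
```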

With traditional models we speak about regression or classification. 

A regression network could have a single numerical output, while a classification network could have a set of binary outputs, one for each class (like one-hot), or a probability of the result being each of the possible outputs

Neural Networks are also capable of more complex outputs or even combinations of outputs

In general an NN consists of an Input Layer which takes in the input data, a few hidden layers which process the data, and an output layer which is our target outcome. Each layer passes weighted data to the next layer

There are usually these types of neurons:

1. Input - get the input data
2. Hidden - sit between input and output and perform abstract processing
3. Output - the output that's calculated
4. Context - hold state between calls to the network
5. Bias Neurons - similar to a y-intercept, allow us to offset the data going into a neuron
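
To make the roles of input, hidden, output and bias neurons more concrete, here is a minimal NumPy sketch of a single forward pass through a tiny network. The layer sizes and every weight and bias value are made up for illustration; a trained network would learn them. Context neurons are not shown.

```python
import numpy as np

def relu(z):
    # ReLU activation: negative values become 0
    return np.maximum(0.0, z)

# Hypothetical network: 2 input neurons, 3 hidden neurons, 1 output neuron
x = np.array([1.0, 2.0])                       # values held by the input neurons

W_hidden = np.array([[0.5, -1.0, 0.25],
                     [0.5,  1.0, 0.25]])       # weights, shape (2 inputs, 3 hidden)
b_hidden = np.array([0.0, -0.5, 1.0])          # one bias per hidden neuron

W_out = np.array([[1.0], [2.0], [-1.0]])       # weights, shape (3 hidden, 1 output)
b_out = np.array([0.5])                        # bias for the output neuron

hidden = relu(x @ W_hidden + b_hidden)         # weighted sum + bias, then activation
output = hidden @ W_out + b_out                # linear output, as used for regression

print(hidden)   # [1.5  0.5  1.75]
print(output)   # [1.25]
```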


Each neuron applies an activation function to its weighted input before passing the result on. Some common ones are:

- Rectified Linear Unit (ReLU) - used for hidden layers
- Softmax - output for classification
- Linear - for regression
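
The three activation functions listed above are short enough to write out directly. This is a NumPy sketch of the standard definitions rather than the code Keras runs internally. The softmax subtracts the maximum before exponentiating, a common trick to avoid overflow.

```python
import numpy as np

def relu(z):
    # max(0, z) element-wise
    return np.maximum(0.0, z)

def softmax(z):
    # Subtracting the max does not change the result but avoids overflow
    exp = np.exp(z - np.max(z))
    return exp / exp.sum()

def linear(z):
    # Identity: the value is passed through unchanged
    return z

z = np.array([-2.0, 0.0, 3.0])

print(relu(z))      # negative entries are clipped to 0
print(softmax(z))   # non-negative values that sum to 1
print(linear(z))    # unchanged
```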

The Bias Neuron, along with a Weight, allows us to move and scale our activation functions
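
A small sketch of that idea, using a single ReLU neuron with one input: the bias shifts the point at which the neuron starts to respond, and the weight changes how steeply it responds. The function name `neuron` and the values are hypothetical.

```python
import numpy as np

def neuron(x, weight, bias):
    # One-input neuron with a ReLU activation
    return np.maximum(0.0, weight * x + bias)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

base = neuron(x, weight=1.0, bias=0.0)      # switches on at x = 0
shifted = neuron(x, weight=1.0, bias=1.0)   # bias moves the switch-on point to x = -1
scaled = neuron(x, weight=2.0, bias=0.0)    # weight doubles the slope

print(base)     # [0. 0. 0. 1. 2.]
print(shifted)  # [0. 0. 1. 2. 3.]
print(scaled)   # [0. 0. 0. 2. 4.]
```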

## TensorFlow and Keras

TensorFlow is the low-level library for Neural Networks, and Keras is an API that sits on top of TF and allows you to interact with it at a higher level

> At the time of writing, the current version of TF requires Python 3.7, so just align with that

TensorBoard is a way to visualize Neural Networks 

### Using TensorFlow Directly

#### Simple Matrix Multiplication

In [1]:
import tensorflow as tf

In [2]:
matrix1 = tf.constant([[3., 3.]])

matrix2 = tf.constant([[2.], [2.]])

product = tf.matmul(matrix1, matrix2)

print(product)
float(product)

tf.Tensor([[12.]], shape=(1, 1), dtype=float32)


12.0
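
To see where the 12 comes from, the same product can be reproduced with NumPy and checked against the sum written out by hand. This is only a cross-check of the arithmetic, not a replacement for `tf.matmul`.

```python
import numpy as np

matrix1 = np.array([[3.0, 3.0]])      # shape (1, 2)
matrix2 = np.array([[2.0], [2.0]])    # shape (2, 1)

product = matrix1 @ matrix2           # (1, 2) @ (2, 1) -> (1, 1)

# Row of matrix1 times column of matrix2, written out term by term
manual = matrix1[0, 0] * matrix2[0, 0] + matrix1[0, 1] * matrix2[1, 0]

print(product.shape)   # (1, 1)
print(manual)          # 12.0
```

Swapping the operands (`matrix2 @ matrix1`) would instead give a 2x2 matrix, since the shapes become (2, 1) and (1, 2).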

#### Using Variables

Variables can be created, used, reassigned with `assign`, and then used to recalculate a result

In [3]:
x = tf.Variable([1., 2.])
a = tf.constant([3., 3.])

print(tf.subtract(x, a).numpy())

[-2. -1.]


In [4]:
x.assign([4., 6.])

print(tf.subtract(x, a).numpy())

[1. 3.]


### Using Keras with MPG Dataset

Keras enables us to think about the layers in an NN. We'll use the Miles Per Gallon dataset here

In [5]:
import numpy as np
import pandas as pd

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation

from sklearn import metrics

In [6]:
DATA_URL = 'https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data'

COLUMN_NAMES = [
        'mpg', 
        'cylinders', 
        'displacement', 
        'horsepower', 
        'weight', 
        'acceleration', 
        'model year', 
        'origin', 
        'car name'
    ]

In [7]:
df = pd.read_fwf(
    DATA_URL, 
    names=COLUMN_NAMES,
    na_values=['NA', '?']
)

# fill missing
df['horsepower'] = df['horsepower'].fillna(df['horsepower'].median())
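
The `fillna` call above replaces missing horsepower values with the column median. The same idea is sketched below with NumPy only, on a handful of made-up values, so the effect is easy to verify by hand. `np.nanmedian` ignores the missing entries, much as the pandas `median` does by default.

```python
import numpy as np

# Made-up horsepower values with one missing entry
horsepower = np.array([130.0, 165.0, np.nan, 150.0, 140.0])

median = np.nanmedian(horsepower)     # median of the 4 known values
filled = np.where(np.isnan(horsepower), median, horsepower)

print(median)   # 145.0
print(filled)   # [130. 165. 145. 150. 140.]
```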

In [8]:
df.head()

|   | mpg | cylinders | displacement | horsepower | weight | acceleration | model year | origin | car name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 18.0 | 8 | 307.0 | 130.0 | 3504.0 | 12.0 | 70 | 1 | "chevrolet chevelle malibu" |
| 1 | 15.0 | 8 | 350.0 | 165.0 | 3693.0 | 11.5 | 70 | 1 | "buick skylark 320" |
| 2 | 18.0 | 8 | 318.0 | 150.0 | 3436.0 | 11.0 | 70 | 1 | "plymouth satellite" |
| 3 | 16.0 | 8 | 304.0 | 150.0 | 3433.0 | 12.0 | 70 | 1 | "amc rebel sst" |
| 4 | 17.0 | 8 | 302.0 | 140.0 | 3449.0 | 10.5 | 70 | 1 | "ford torino" |


In [9]:
X = df.drop(['mpg', 'car name'], axis=1).values
y = df[['mpg']].values

### Build Regression Model with Keras

When building a Neural Network we take the following steps:

1. Create a Sequential
2. Define the Hidden Layers
3. Define the Output Layer
4. Compile and Train the Model

#### 1. Create Sequential

In [10]:
model = Sequential()

#### 2. Define Hidden Layers

Define the first hidden layer with `input_dim` set to the number of features in our input data set (the number of columns in `X` in this case)

> A dense layer is one where each neuron is connected to every neuron in the next layer

In [11]:
model.add(Dense(25, input_dim=X.shape[1], activation='relu'))
model.add(Dense(10, activation='relu'))

#### 3. Define the Output Layer

This depends on the dimensionality of the output, similar to the input. In this case it is one-dimensional

In [12]:
model.add(Dense(1))
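
Because every neuron in a dense layer has one weight per input plus one bias, the size of the network defined above can be worked out by hand. The sketch below assumes `X` has 7 columns (the 9 columns listed earlier minus `mpg` and `car name`). `dense_param_count` is a hypothetical helper written for these notes, not a Keras function.

```python
def dense_param_count(n_inputs, n_units):
    # One weight per (input, unit) pair, plus one bias per unit
    return n_inputs * n_units + n_units

# Assumed layer widths: 7 input features -> Dense(25) -> Dense(10) -> Dense(1)
layer_sizes = [7, 25, 10, 1]

counts = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]

print(counts)        # [200, 260, 11]
print(sum(counts))   # 471
```

If the assumption about the input width holds, these figures should match the trainable parameter counts reported by `model.summary()`.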

#### 4. Compile and Train the Model

We specify a `loss` function and an `optimizer` for the model, and then give it the `X` and `y` values to train on as well as how many `epochs` we want it to train for

For a Regression NN you usually use MSE as the loss

> We can also make use of methods that increase the model's effectiveness by identifying the optimal number of epochs
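
One common way to pick the number of epochs is early stopping: track the loss on held-out validation data and stop once it has not improved for a set number of epochs (the "patience"). Keras offers this as the `EarlyStopping` callback in `tensorflow.keras.callbacks`. The pure-Python sketch below only illustrates the stopping rule on a made-up list of validation losses. The function `find_stopping_point` is hypothetical and is not how Keras implements it.

```python
def find_stopping_point(val_losses, patience=3):
    """Return (best_epoch, stopped_epoch) for a list of per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    stopped_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        stopped_epoch = epoch
        if loss < best_loss:
            # New best: remember it and reset the patience counter
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # no improvement for `patience` epochs in a row

    return best_epoch, stopped_epoch

# Made-up validation losses: improves until epoch 2, then stalls
val_losses = [10.0, 8.0, 7.0, 7.5, 7.2, 7.1, 6.9]

best_epoch, stopped_epoch = find_stopping_point(val_losses, patience=3)
print(best_epoch, stopped_epoch)   # 2 5
```

With a larger patience the loop would reach the later improvement at epoch 6, which shows the trade-off: low patience saves time but can stop before a late improvement.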

In [13]:
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, verbose=0, epochs=400)

<tensorflow.python.keras.callbacks.History at 0x2041a5ff688>

#### Test the Model

In [14]:
y_pred = model.predict(X)
# The square root of the MSE is the RMSE, so label it accordingly
score = np.sqrt(metrics.mean_squared_error(y, y_pred))
'RMSE: ' + str(score)

'RMSE: 3.4881811444565303'
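
The score above is the square root of the mean squared error, i.e. the root mean squared error (RMSE), which is expressed in the same units as `mpg`. The difference between the two is sketched here with NumPy. The true values are the first five `mpg` entries shown earlier, and the predictions are made up for illustration.

```python
import numpy as np

y_true = np.array([18.0, 15.0, 18.0, 16.0, 17.0])   # first five mpg values
y_hat = np.array([20.0, 15.0, 17.0, 14.0, 17.0])    # made-up predictions

errors = y_hat - y_true          # [ 2.  0. -1. -2.  0.]
mse = np.mean(errors ** 2)       # (4 + 0 + 1 + 4 + 0) / 5
rmse = np.sqrt(mse)              # back in mpg units

print(mse)    # 1.8
print(rmse)   # roughly 1.34
```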

### Build a Classification Model with Keras

Building a Classification Model is much the same, however we need to ensure that we one-hot encode our categorical values. In this case we'll have a categorical output, which means more than one potential result

For this we're making use of the [Iris Dataset](https://archive.ics.uci.edu/ml/datasets/Iris)

For a multi-class classification we use `softmax` as the output activation and `categorical_crossentropy` as the loss

For a binary classification we would instead use an applicable loss and activation, typically `binary_crossentropy` with a `sigmoid` output
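
To give a feel for what these losses measure, here is a NumPy sketch of the standard cross-entropy formulas. It is not the Keras implementation, and the probabilities are made up. The loss is small when the model puts high probability on the correct class and grows as that probability drops.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-12):
    # y_true is one-hot, y_prob is a softmax-style probability vector
    y_prob = np.clip(y_prob, eps, 1.0)
    return -np.sum(y_true * np.log(y_prob), axis=-1)

def binary_crossentropy(y_true, p, eps=1e-12):
    # y_true is 0 or 1, p is a sigmoid-style probability of class 1
    p = np.clip(p, eps, 1.0 - eps)
    return -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))

y_true = np.array([0.0, 1.0, 0.0])          # the correct class is index 1
confident = np.array([0.05, 0.90, 0.05])    # made-up prediction, mostly right
unsure = np.array([0.40, 0.30, 0.30])       # made-up prediction, mostly wrong

print(categorical_crossentropy(y_true, confident))   # about 0.105
print(categorical_crossentropy(y_true, unsure))      # about 1.204
print(binary_crossentropy(1.0, 0.8))                 # about 0.223
```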

In [15]:
DATA_URL = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'

COLUMN_NAMES = [
   'sepal length',
   'sepal width',
   'petal length',
   'petal width',
   'class'
]

In [16]:
df = pd.read_csv(DATA_URL, names=COLUMN_NAMES)

In [17]:
df.head()

|   | sepal length | sepal width | petal length | petal width | class |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


In [18]:
X = df.drop('class', axis=1).values
# dtype=float keeps the one-hot columns numeric (newer pandas defaults to bool)
dummies = pd.get_dummies(df['class'], dtype=float)

species = dummies.columns
y = dummies.values
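
The one-hot encoding produced by `get_dummies` above can be reproduced with NumPy to see exactly what ends up in `y`. This is a sketch on four hand-picked labels, not the full dataset.

```python
import numpy as np

labels = np.array([
    "Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-setosa",
])

# Sorted unique class names, plus the class index of each label
species, class_index = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(len(species))[class_index]

print(species)       # the three class names in sorted order
print(class_index)   # [0 1 2 0]
print(one_hot)
```

Each row has a single 1 in the column of its class, which is why the output layer needs one neuron per class.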

In [19]:
model = Sequential()

model.add(Dense(50, input_dim=X.shape[1], activation='relu'))
model.add(Dense(25, activation='relu'))
model.add(Dense(y.shape[1], activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(X, y, verbose=0, epochs=100)

<tensorflow.python.keras.callbacks.History at 0x2041b8c9c88>

In [20]:
y_pred = model.predict(X)

predict_classes = np.argmax(y_pred,axis=1)
expected_classes = np.argmax(y,axis=1)
print(f"Predictions: {predict_classes}")
print(f"Expected: {expected_classes}")

print(species[predict_classes[1:10]])

score = metrics.accuracy_score(expected_classes,predict_classes)
'Accuracy: ' + str(score)

Predictions: [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 2 1
 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
Expected: [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
Index(['Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa', 'Iris-setosa', 'Iris-setosa', 'Iris-setosa',
       'Iris-setosa'],
      dtype='object')


'Accuracy: 0.9733333333333334'
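
The decoding step used above (take the `argmax` of each row, then compare against the expected classes) is sketched here on four made-up predictions, so the accuracy can be checked by hand.

```python
import numpy as np

# Made-up softmax outputs for 4 samples and 3 classes
y_prob = np.array([
    [0.90, 0.05, 0.05],
    [0.20, 0.50, 0.30],
    [0.10, 0.30, 0.60],
    [0.30, 0.40, 0.30],
])
# One-hot encoded true classes for the same 4 samples
y_onehot = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
])

predicted = np.argmax(y_prob, axis=1)     # index of the most probable class
expected = np.argmax(y_onehot, axis=1)    # index of the 1 in each one-hot row
accuracy = np.mean(predicted == expected)

print(predicted)   # [0 1 2 1]
print(expected)    # [0 1 2 2]
print(accuracy)    # 0.75
```

By the same calculation, the accuracy reported above is consistent with 146 of the 150 samples being classified correctly. Note that it is measured on the same data the model was trained on.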