# Intro to Machine Learning
Parker Erickson

# But What *Is* Machine Learning?

Machine learning is the process of determining a function that gives a plausible result **y** for a set of inputs, given in the form of a vector **x**. In its simplest form, this could be linear regression, where we find the line of best fit through the data.
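As a minimal sketch of the linear regression case, the snippet below fits a line of best fit with NumPy's `np.polyfit`. The data points and the `predict` helper are made up for illustration and are not part of the notebook.

```python
import numpy as np

# Hypothetical toy data: y is roughly 2 * x + 1 with a little noise.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Fit a degree-1 polynomial (a straight line) by least squares.
slope, intercept = np.polyfit(x, y, deg=1)


def predict(x_new):
    """Return the fitted line's estimate of y for a new input (illustrative helper)."""
    return slope * x_new + intercept


print(f"y = {slope:.2f} * x + {intercept:.2f}")
print(predict(5.0))
```

Here the "function" being learned is just two numbers, the slope and the intercept. A neural network learns a function in the same sense, only with far more parameters.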

In more complex forms, this can mean training a neural network to fit highly non-linear data.

![Image of Machine Learning Algorithms](https://www.7wdata.be/wp-content/uploads/2017/04/CheatSheet.png)

# Neural Networks
![Types of Neural Networks](https://miro.medium.com/max/4000/1*cuTSPlTq0a_327iTPJyD-Q.png)

# Handwritten Digit Classification

Since we have image data, we will use a Convolutional Neural Network (CNN). A CNN works by sliding "windows" over the input image (represented as a matrix).
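The sliding-window idea can be sketched in plain NumPy. This is an illustration under simplifying assumptions (a single channel, stride 1, no padding), not how Keras implements `Conv2D` internally; the `slide_window` function, the toy image, and the kernel values are all hypothetical.

```python
import numpy as np


def slide_window(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the response map."""
    k_rows, k_cols = kernel.shape
    out_rows = image.shape[0] - k_rows + 1
    out_cols = image.shape[1] - k_cols + 1
    output = np.zeros((out_rows, out_cols))
    for r in range(out_rows):
        for c in range(out_cols):
            # Take the patch under the window and weight it by the kernel.
            window = image[r:r + k_rows, c:c + k_cols]
            output[r, c] = np.sum(window * kernel)
    return output


# Hypothetical 4x4 "image": dark left half, bright right half.
image = np.array([[0, 0, 1, 1]] * 4, dtype=float)
# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = np.array([[-1, 1], [-1, 1]], dtype=float)

feature_map = slide_window(image, kernel)
print(feature_map)
```

The response is largest where the window sits on the edge between the dark and bright halves. In a CNN the kernel values are not chosen by hand like this; they are learned during training.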

![CNN Windowing](https://miro.medium.com/max/790/1*nYf_cUIHFEWU1JXGwnz-Ig.gif)

## Import Packages

In [None]:
import keras
from keras.datasets import mnist  # Get dataset
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten  # Need these types of layers
from keras.layers import Conv2D, MaxPooling2D  # Need convolutional layers

import matplotlib.pyplot as plt 
import matplotlib.image as mpimg

import numpy as np 

## Load Data 

It is important to load both a training set and a testing set so that we can check the CNN is not simply "memorizing" the images it trains on. A network that only memorizes its training images would have poor accuracy in the "real world".

In [None]:
# input image dimensions
img_rows, img_cols = 28, 28

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
input_shape = (img_rows, img_cols, 1)

## Preprocessing Data

Make sure the data is in the right format, and normalize the grayscale values so that they lie between 0 and 1.

In [None]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

## Example of Data

In [None]:
x_train[0]

In [None]:
pixels = x_train[0].reshape((28, 28))

plt.imshow(pixels, cmap='gray')

In [None]:
print(y_train[0])

## Set Up Desired Output

Convert a single digit n to a vector of length 10 where the element at index n (counting from 0) is 1 and the rest are 0. This is commonly called one-hot encoding.

Example: n = 5

Resulting Vector = {0, 0, 0, 0, 0, 1, 0, 0, 0, 0}
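To make the conversion concrete, here is a small NumPy sketch of the same idea. The `one_hot` function is a hypothetical helper written for illustration; the notebook itself relies on `keras.utils.to_categorical`.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return one row per label, with a 1 at the label's index and 0 elsewhere."""
    labels = np.asarray(labels)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Fancy indexing: in row i, set the column given by labels[i] to 1.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


print(one_hot([5], 10))
```

Taking `np.argmax` of an encoded row recovers the original digit, which is also how the model's prediction is read back at the end of this notebook.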

In [None]:
# How many types of digits there are
num_classes = 10

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

## Define Model

In [None]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))  # Randomly drop 25% of activations during training to reduce overfitting
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

In [None]:
model.summary()
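The shapes and parameter counts reported by `model.summary()` can be checked by hand. The sketch below redoes that arithmetic in plain Python, assuming the Keras defaults used above (stride 1 and no padding for `Conv2D`); the `conv_output_size` helper and the dictionary keys are illustrative names.

```python
def conv_output_size(size, kernel):
    """Spatial size after a stride-1 convolution with no padding."""
    return size - kernel + 1


size = conv_output_size(28, 3)    # after the first Conv2D
size = conv_output_size(size, 3)  # after the second Conv2D
size = size // 2                  # after 2x2 max pooling
flattened = size * size * 64      # 64 feature maps from the second Conv2D

# Each layer has (weights per output unit * output units) + one bias per output unit.
params = {
    "conv_1": 3 * 3 * 1 * 32 + 32,
    "conv_2": 3 * 3 * 32 * 64 + 64,
    "dense_1": flattened * 128 + 128,
    "dense_2": 128 * 10 + 10,
}
total_params = sum(params.values())

print(flattened, params, total_params)
```

Almost all of the parameters sit in the first `Dense` layer, because every one of its 128 units connects to every flattened feature.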

## Set Parameters

In [None]:
batch_size = 128  # How many training examples we look at before each update of the network's weights
epochs = 12  # How many times we run through the complete training set
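As a rough worked example of what these two settings imply, the arithmetic below counts weight updates, assuming the 60,000 images of the MNIST training set and that the final, smaller batch of each epoch is still used.

```python
import math

train_samples = 60000  # size of the MNIST training set
batch_size = 128
epochs = 12

# One weight update per batch; the last batch of an epoch may be smaller.
updates_per_epoch = math.ceil(train_samples / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, total_updates)
```

A larger batch size means fewer, smoother updates per epoch, while a smaller one means more frequent but noisier updates.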

## Compile and Fit Model

In [None]:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

## Test Model

In [None]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

In [None]:
to_predict = np.array([x_test[0]])

output = model.predict(to_predict)

pixels = to_predict[0].reshape((28, 28))

plt.imshow(pixels, cmap='gray')

In [None]:
output

In [None]:
np.argmax(output)
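The same idea extends to predicting several images at once. In the sketch below, `probs` is a made-up stand-in for what `model.predict` might return for two images (one row of 10 class probabilities per image); the values are hypothetical, not real model output.

```python
import numpy as np

# Hypothetical softmax outputs for two images (each row sums to 1).
probs = np.array([
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.90, 0.02, 0.01],
    [0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.05, 0.025, 0.025],
])

# axis=1 picks the most likely class within each row, giving one digit per image.
predicted_digits = np.argmax(probs, axis=1)
# The largest probability in each row is a rough measure of the model's confidence.
confidence = np.max(probs, axis=1)

print(predicted_digits, confidence)
```

Without `axis=1`, `np.argmax` flattens the array first, which only gives the right digit when a single image is predicted, as in the cell above.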