<a href="https://colab.research.google.com/github/AdmiralGallade/AI_test_cshub/blob/master/ML_Workshop.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Step 1: Read data
* Use pandas to read the CSV files into training and test sets, which are then DataFrame objects.

In [0]:
import numpy as np
import pandas as pd
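# Note: these files have no header row, so by default pandas uses the first
# data row as column names (pass header=None to keep that row as data).
train = pd.read_csv('/content/sample_data/mnist_train_small.csv', delimiter=",")
test = pd.read_csv('/content/sample_data/mnist_test.csv', delimiter=",")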

!pip install keras --upgrade


Show the first and last 5 rows of data as a preview.

In [0]:
train

       6  0  0.1  0.2  0.3  0.4  ...  0.585  0.586  0.587  0.588  0.589  0.590
0      5  0    0    0    0    0  ...      0      0      0      0      0      0
1      7  0    0    0    0    0  ...      0      0      0      0      0      0
2      9  0    0    0    0    0  ...      0      0      0      0      0      0
3      5  0    0    0    0    0  ...      0      0      0      0      0      0
4      2  0    0    0    0    0  ...      0      0      0      0      0      0
...   .. ..  ...  ...  ...  ...  ...    ...    ...    ...    ...    ...    ...
19994  0  0    0    0    0    0  ...      0      0      0      0      0      0
19995  1  0    0    0    0    0  ...      0      0      0      0      0      0
19996  2  0    0    0    0    0  ...      0      0      0      0      0      0
19997  9  0    0    0    0    0  ...      0      0      0      0      0      0
19998  5  0    0    0    0    0  ...      0      0      0      0      0      0

[19999 rows x 785 columns]
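
The column names in the preview (6, 0, 0.1, ...) and the row count of 19999 suggest that the first data row was read as a header. Here is a minimal sketch of that behaviour, assuming the CSV files have no header line. The three-row `csv_text` is made up for illustration and is not taken from the real files.

```python
import io
import pandas as pd

# Hypothetical three-row file with no header line: label first, then pixel values.
csv_text = "6,0,0\n5,0,0\n7,0,0\n"

# Default behaviour: the first line is consumed as column names.
with_default = pd.read_csv(io.StringIO(csv_text))

# header=None keeps every line as data and numbers the columns 0, 1, 2, ...
with_header_none = pd.read_csv(io.StringIO(csv_text), header=None)

print(with_default.shape)       # one data row fewer than the file has lines
print(with_header_none.shape)   # every line kept as data
```

With `header=None` the label column is still at position 0, so position-based slicing with `iloc` keeps working unchanged.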

## Step 2: Separate into X (input) and y (label)
* Separate the training set into X_train and y_train using iloc, which slices a DataFrame by integer position: column 0 holds the label and the remaining 784 columns hold the pixel values.
* Separate the test set into X_test and y_test in the same way.
* Use to_categorical to one-hot encode y_train and y_test into matrices with 10 columns, one per digit class.

In [0]:
from keras.utils import to_categorical

X_train = train.iloc[:, 1:]
y_train = train.iloc[:, 0]

y_train = to_categorical(y_train, 10)

X_test = test.iloc[:, 1:]
y_test = test.iloc[:, 0]

y_test = to_categorical(y_test, 10)
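
To make the one-hot encoding concrete, here is a rough NumPy equivalent of what the conversion produces. The helper `one_hot` is a hypothetical name used for illustration only; it is not how Keras implements `to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1.0 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # For row i, set the column given by labels[i] to 1.0.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([5, 7, 9], 10)
print(example[0])  # 1.0 at index 5, zeros elsewhere
```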

## Step 3: Create the model
* Create a sequential model using the Sequential constructor of Keras. It is an easy way to build a model by adding layers one after another.
* We use dense layers throughout. In a dense (fully connected) layer, every node is connected to every node of the previous layer.
* Use the sigmoid function as the activation function, with 64 nodes in each hidden layer.
* Use the softmax function as the activation function for the output layer, which has 10 nodes.

An **activation function** is applied to the weighted sum that each node computes from the previous layer's outputs; the result is what the node passes on to the next layer.
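
As an illustration, the two activation functions used in this step can be sketched in NumPy as follows. These are the textbook definitions, not the Keras source code, and the input vector `z` is an arbitrary example.

```python
import numpy as np

def sigmoid(z):
    """Squash each value independently into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    # Subtracting the maximum avoids overflow in exp without changing the result.
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

z = np.array([-2.0, 0.0, 3.0])
print(sigmoid(z))  # each entry is mapped on its own
print(softmax(z))  # entries compete; the largest score gets the largest share
```

Sigmoid treats each node separately, while softmax normalises across all output nodes, which is why it suits a layer whose outputs are read as class probabilities.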

In [0]:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
a = 'sigmoid'
n = 64

model.add(Dense(n, input_dim=784, activation=a))  # first hidden layer, fed by the 784 input pixels
for _ in range(2):  # two more hidden layers
  model.add(Dense(n, activation=a))
model.add(Dense(10, activation='softmax'))  # output layer

In [0]:
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_9 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_10 (Dense)             (None, 64)                4160      
_________________________________________________________________
dense_11 (Dense)             (None, 64)                4160      
_________________________________________________________________
dense_12 (Dense)             (None, 10)                650       
Total params: 59,210
Trainable params: 59,210
Non-trainable params: 0
_________________________________________________________________
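
The parameter counts in the summary can be checked by hand: a dense layer has one weight per input-node pair plus one bias per node. A small sketch of that arithmetic, where `dense_params` is a helper name introduced here for illustration:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# 784 input pixels, three hidden layers of 64 nodes, 10 output nodes.
layer_sizes = [784, 64, 64, 64, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [50240, 4160, 4160, 650]
print(sum(counts))  # 59210
```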


Accuracy on the training data alone does not tell us how well the model does on data it has not seen. For that, we hold out a validation set.  

## Step 4: Split the validation set
* Split the training set into a training set and a validation set using train_test_split from scikit-learn.

In [0]:
from sklearn.model_selection import train_test_split

X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.1)
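
Conceptually, the split shuffles the row indices and sets a fraction of them aside. The sketch below shows the idea in plain NumPy. It is not the scikit-learn implementation: `holdout_split` and its `seed` argument are hypothetical, and rounding the validation size up is an assumption, chosen because it reproduces the 17999/2000 split reported in the training output.

```python
import math
import numpy as np

def holdout_split(n_samples, val_fraction, seed=0):
    """Return shuffled, non-overlapping (train, validation) index arrays."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_val = math.ceil(val_fraction * n_samples)  # assumed rounding rule
    return shuffled[n_val:], shuffled[:n_val]

train_idx, val_idx = holdout_split(19999, 0.1)
print(len(train_idx), len(val_idx))  # 17999 2000
```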

## Step 5: Train and validate the model
* Compile the model with the loss function and the optimizer (with its learning rate), and select accuracy as the metric.
* Train the model for 20 epochs with a batch size of 10.

A neural network works by passing the input through the nodes of each layer, which use a set of parameters to compute the values passed to the next layer.  
At the final layer, the nodes output the values that classify the input into one of the classes (10 in this case).  

The **loss function** measures how much the output of our model differs from the actual answer.
The **optimizer** searches for the parameters that produce the lowest loss.

Here we use [RMSprop](https://towardsdatascience.com/understanding-rmsprop-faster-neural-network-learning-62e116fcf29a) and [categorical crossentropy](https://ml-cheatsheet.readthedocs.io/en/latest/loss_functions.html).
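
For intuition, categorical crossentropy for a one-hot label reduces to the negative log of the probability the model assigned to the correct class. A minimal NumPy sketch follows; the clipping constant `eps` and the example arrays are assumptions for illustration, and this is not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample loss: -sum(y_true * log(y_pred)) over the classes."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=1)

# Two samples, three classes: the first prediction is confident and correct,
# the second spreads its probability and gives the true class only 0.3.
y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1], [0.3, 0.4, 0.3]])

losses = categorical_crossentropy(y_true, y_pred)
print(losses)  # the first loss is smaller than the second
```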

The number of **epochs** is how many times the entire training set passes through the neural network.  
The **batch size** is the number of samples used for each parameter update. Larger batches give a more accurate estimate of the gradient for each update but require more memory.
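
To see how epochs and batch size interact, the number of parameter updates can be worked out from the values passed to `model.fit` (epochs=20, batch_size=10) and the 17999 training samples reported in the training output. The sketch assumes the final, smaller batch of each epoch is still used for an update.

```python
import math

n_train = 17999   # training samples left after the validation split
batch_size = 10
epochs = 20

# One update per batch; the last batch of an epoch may be smaller than the rest.
updates_per_epoch = math.ceil(n_train / batch_size)
total_updates = updates_per_epoch * epochs

print(updates_per_epoch, total_updates)  # 1800 36000
```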

In [0]:
from keras.optimizers import RMSprop 

model.compile(loss='categorical_crossentropy', optimizer=RMSprop(0.01), metrics=['accuracy'])

model.fit(X_train, y_train, epochs=20, batch_size=10, validation_data=(X_val, y_val))

Train on 17999 samples, validate on 2000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.callbacks.History at 0x7fe76ee5cf98>

### There we have it, a very basic neural network. 
It is about 80% accurate. That is not very high: an accurate model needs to be more than 95% accurate, and additional steps are needed to get there (we will go over them if we have time).

## Predict the test set
* model.predict returns a 10-column matrix. Each column holds the probability of the input being a certain digit.
* We use np.argmax along each row (axis=1) to find the column with the highest probability, which is the predicted digit.

In [0]:
y_prob = model.predict(X_test)
y_prediction = np.argmax(y_prob, axis=1)  # index of the largest probability in each row

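We still have y_test from before, which is the actual 'answer' for the test set.  

We can compare it with our prediction to see how accurate the prediction is. Note that y_test was one-hot encoded in Step 2, so it prints as a matrix of 0s and 1s, while y_prediction holds plain digits.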

In [0]:
print(y_prediction)
print(y_test)

[2 1 0 ... 4 5 6]
[[0. 0. 1. ... 0. 0. 0.]
 [0. 1. 0. ... 0. 0. 0.]
 [1. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]
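
Because the two printed arrays have different shapes, comparing them by eye is hard. One way to get a single accuracy number is to convert the one-hot rows back to digits and count the matches. The arrays below are small made-up stand-ins for `y_test` and `y_prediction`, not real results from the model.

```python
import numpy as np

# Toy stand-ins: four samples, three classes.
y_test_onehot = np.array([[0, 0, 1],
                          [0, 1, 0],
                          [1, 0, 0],
                          [0, 0, 1]], dtype=float)
y_prediction = np.array([2, 1, 0, 1])

# Convert each one-hot row back to its class index, then compare element-wise.
y_true = np.argmax(y_test_onehot, axis=1)
accuracy = np.mean(y_prediction == y_true)

print(accuracy)  # 0.75: three of the four toy predictions match
```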


## Export the model
* We save the configuration of the model into JSON format so that we can download and use it.
* We then save the weights of the model into h5 format.

In [0]:
model_json = model.to_json()
with open('/content/model.json', 'w') as f:
  f.write(model_json)
model.save_weights('/content/model.h5')