# MNIST TensorFlow Deep Neural Network Practice

The dataset is called MNIST and is a standard benchmark for handwritten digit recognition. You can find more about it on Yann LeCun's website.

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image).

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0,1,2,3,4,5,6,7,8,9), this is a classification problem with 10 classes.

Our goal is to build a neural network with 2 hidden layers.

## Import the relevant packages

In [1]:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

In [2]:
MNIST_path = 'train-images-idx3-ubyte.gz'
gzip_file = open(MNIST_path, mode='rb')
# note: this is the built-in open(), not gzip.open(), so read() returns the raw, still-compressed bytes
MNIST_sample = gzip_file.read()
MNIST_sample[0:100]

b'\x1f\x8b\x08\x08z\x82\x902\x00\x03train-images.idx3-ubyte\x00\xec\x9c\x07T\x15G\xdb\xc7\'\x01\xc4\x06"v\x13\x04\x15\xc4\x86B\x8a5*F\xf3&\xc6\xa8\x18[l\x115\x891\xc6\x96\xa8\t\xbe\xf6\x12{4\xc6^\xf0\x8d\xb1\x17\x12\x01[>\xc1\x8a\x15D\x05\x11\x94^\xa4*\x9d\x0b'
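The leading bytes `\x1f\x8b` in the output above are the gzip magic number, which is why the sample looks like noise: the file was read without being decompressed. Below is a minimal sketch of decompressing such a file and reading its header, assuming the usual IDX image layout (four big-endian 32-bit integers: magic number 2051, image count, rows, columns). To stay self-contained it builds a tiny synthetic file in memory instead of reading `train-images-idx3-ubyte.gz`; `fake_idx` and `read_idx_header` are hypothetical names used only for illustration.

```python
import gzip
import struct

def read_idx_header(compressed: bytes):
    """Decompress a gzip'd IDX image file and return (magic, count, rows, cols)."""
    raw = gzip.decompress(compressed)
    # The header is four big-endian unsigned 32-bit integers (16 bytes).
    return struct.unpack(">IIII", raw[:16])

# Synthetic stand-in for the real file: a header plus 2 blank 28x28 images.
fake_idx = struct.pack(">IIII", 2051, 2, 28, 28) + bytes(2 * 28 * 28)
compressed = gzip.compress(fake_idx)

print(compressed[:2])               # the gzip magic number
print(read_idx_header(compressed))  # header fields after decompression
```

With the real file, `gzip.open(MNIST_path, 'rb')` would give the decompressed stream directly.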

In [3]:
mnist_dataset, mnist_info = tfds.load(name = 'mnist', with_info=True, as_supervised = True)


## Data

In [4]:
mnist_train, mnist_test = mnist_dataset['train'],mnist_dataset['test']
# from here we derive three subsets: training, validation, and test
# MNIST has no built-in validation split, so we set aside 10% of the training split
# mnist_info.splits is used here only to look up the split sizes
num_validation_sample = mnist_info.splits['train'].num_examples*0.1
# the result is a float, so tf.cast converts it to an integer tensor (int64)
num_validation_sample = tf.cast(num_validation_sample,tf.int64)

In [15]:
# now we can store the number of test samples in the same way
num_test_samples = tf.cast(mnist_info.splits['test'].num_examples,tf.int64)

In [6]:
# we would now like to scale the inputs so that training is more numerically stable
# we can write a scaling function and pass it to the .map method to transform the data

def scale (image, label):
    image = tf.cast(image, tf.float32)
    image /= 255.
    return image,label
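As a framework-free illustration of what `scale` does to each pixel, the sketch below maps 8-bit intensities (0 to 255) to floats between 0 and 1. The helper name `scale_pixels` and the sample values are hypothetical.

```python
def scale_pixels(pixels):
    """Map 8-bit intensities (0..255) to floats in [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]

row = [0, 51, 128, 255]  # hypothetical pixel values from one image row
scaled = scale_pixels(row)
print(scaled)
```

Black (0) stays 0.0 and white (255) becomes 1.0, so every input feature shares the same small range.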


In [7]:
# we now run the scale function for both the train and test data
scaled_train_and_validation_data = mnist_train.map(scale)
test_data = mnist_test.map(scale)

In [24]:
# now we need to shuffle the data so that the validation split and the batches
# do not depend on the order in which the samples are stored

# BUFFER_SIZE matters for large datasets that cannot be shuffled in one go
# because they do not fit in system memory; shuffling is done BUFFER_SIZE samples at a time
BUFFER_SIZE = 10000

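# now we need to shuffle
# reshuffle_each_iteration=False keeps the shuffled order fixed across epochs, so the
# take/skip split below stays the same and validation samples do not leak into training
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(BUFFER_SIZE, reshuffle_each_iteration=False)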
# after everything is shuffled, we can now take what we need for validation data
validation_data = shuffled_train_and_validation_data.take(num_validation_sample)
train_data = shuffled_train_and_validation_data.skip(num_validation_sample)

# now we have to batch up the data
# validation and test only need forward passes, so each of them goes in a single batch

BATCH_SIZE = 100
train_data = train_data.batch(BATCH_SIZE)
validation_data = validation_data.batch(num_validation_sample)
test_data = test_data.batch(num_test_samples)

# now we can take the next (and only) batch of the validation data
# we get a 2-tuple because as_supervised=True yields (input, target) pairs

validation_inputs,validation_targets = next(iter(validation_data))
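The shuffle, take/skip, and batch steps above can be sketched in plain Python. This is only a rough model of how a buffered shuffle behaves, not the `tf.data` implementation; `buffered_shuffle` and the toy sizes are hypothetical.

```python
import math
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in a pseudo-random order using a bounded buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) > buffer_size:
            # Buffer is over capacity: emit one randomly chosen element.
            idx = rng.randrange(len(buffer))
            buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
            yield buffer.pop()
    # Input exhausted: drain the remaining elements in random order.
    rng.shuffle(buffer)
    yield from buffer

data = list(range(20))
shuffled = list(buffered_shuffle(data, buffer_size=5))

# Analogous to .take(n) and .skip(n): two disjoint slices of one fixed order.
validation, train = shuffled[:4], shuffled[4:]

# Analogous to .batch(size): consecutive chunks of the training slice.
batch_size = 4
batches = [train[i:i + batch_size] for i in range(0, len(train), batch_size)]

# The notebook's sizes: 54,000 training samples in batches of 100.
steps = math.ceil(54_000 / 100)
print(validation, len(batches), steps)
```

The split is only clean if both slices come from the same shuffled order. The last line gives 540, the number of steps per epoch to expect when 6,000 of the 60,000 training images are held out for validation.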

In [25]:
print(train_data, test_data,validation_data,validation_inputs.shape, validation_targets.shape)

<BatchDataset shapes: ((None, 28, 28, 1), (None,)), types: (tf.float32, tf.int64)> <BatchDataset shapes: ((None, None, None, 28, 28, 1), (None, None, None)), types: (tf.float32, tf.int64)> <BatchDataset shapes: ((None, 28, 28, 1), (None,)), types: (tf.float32, tf.int64)> (6000, 28, 28, 1) (6000,)


## Outline the model

In [26]:
def keras_sequence (input_shape,output_size,hidden_layer_size,depth,layer_activation,output_activation):
    # flatten each 28x28x1 image into a vector so it can feed the dense layers
    sequential = [tf.keras.layers.Flatten(input_shape=input_shape)]
    # add `depth` hidden layers of equal width
    for _ in range(depth):
        sequential.append(tf.keras.layers.Dense(hidden_layer_size, activation=layer_activation))
    # output layer: one unit per class
    sequential.append(tf.keras.layers.Dense(output_size, activation=output_activation))
    return sequential

In [27]:
input_shape = (28,28,1)
output_size = 10
hidden_layer_size = 50
depth = 2
layer_activation = 'relu'
# 'softmax' turns the output layer's values into class probabilities
output_activation = 'softmax'

sequential = keras_sequence(input_shape,output_size,hidden_layer_size,depth,layer_activation,output_activation)
model = tf.keras.Sequential(sequential)
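As a rough check on the size of this model, the layer widths configured above imply the parameter counts below. The sketch assumes each Dense layer holds one weight per input-output pair plus one bias per output unit (the Keras default); `dense_params` is a hypothetical helper.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of a fully connected layer."""
    return n_in * n_out + n_out

# Flatten turns a 28x28x1 image into 784 inputs; then 2 hidden layers and the output layer.
layer_sizes = [28 * 28 * 1, 50, 50, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)  # [39250, 2550, 510] 42310
```

Almost all of the parameters sit in the first hidden layer, because it is connected to every pixel.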

In [28]:
sequential

[<tensorflow.python.keras.layers.core.Flatten at 0x1ec051699a0>,
 <tensorflow.python.keras.layers.core.Dense at 0x1ec0514e1f0>,
 <tensorflow.python.keras.layers.core.Dense at 0x1ec05178460>,
 <tensorflow.python.keras.layers.core.Dense at 0x1ec05178880>]

## Choosing the optimizer and the loss function

In [29]:
# using Adam (adaptive moment estimation)
optimizer = 'adam'
# cross entropy is a good choice for classifiers; there are 3 common variants:
# binary cross entropy, categorical cross entropy, sparse categorical cross entropy
# categorical cross entropy expects one-hot encoded targets (a vector with a 1 at the true class),
# while the sparse variant takes integer labels directly, which is what this dataset provides
loss = 'sparse_categorical_crossentropy'
# metrics we want reported throughout the training process, such as accuracy
metrics = ['accuracy']
model.compile(optimizer = optimizer, loss = loss, metrics = metrics)
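To make the link between softmax, one-hot encoding, and the two categorical losses concrete, here is a small pure-Python sketch. It is an illustration under simple assumptions (a single sample, natural logarithm), not the Keras implementation, and all function names are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def one_hot(label, num_classes):
    """Encode an integer label as a vector with a single 1."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(target_one_hot, probs):
    """Loss for one-hot targets: -sum(t * log(p))."""
    return -sum(t * math.log(p) for t, p in zip(target_one_hot, probs))

def sparse_categorical_crossentropy(label, probs):
    """Loss for integer targets: -log(p[label])."""
    return -math.log(probs[label])

probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for 3 classes
label = 0                         # integer target, as in this dataset
cce = categorical_crossentropy(one_hot(label, 3), probs)
sce = sparse_categorical_crossentropy(label, probs)
print(probs, cce, sce)
```

Both losses give the same value; the sparse form simply skips building the one-hot vector, which is why it fits integer targets.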

## Training

In [30]:
NUM_EPOCHS = 5

model.fit(train_data, epochs = NUM_EPOCHS, validation_data = (validation_inputs,validation_targets), verbose = 2)
# here is what will happen
# at the beginning of each epoch, the training loss is reset to 0
# the algorithm iterates over all the batches in train_data
# the weights and biases are updated after each batch
# we get a value for the loss function, indicating how the training is progressing
# we also see the training accuracy
# at the end of each epoch, it forward propagates the validation dataset
# and this repeats until NUM_EPOCHS epochs have run



Epoch 1/5
540/540 - 4s - loss: 0.4101 - accuracy: 0.8831 - val_loss: 0.2029 - val_accuracy: 0.9375
Epoch 2/5
540/540 - 2s - loss: 0.1812 - accuracy: 0.9464 - val_loss: 0.1465 - val_accuracy: 0.9553
Epoch 3/5
540/540 - 2s - loss: 0.1399 - accuracy: 0.9592 - val_loss: 0.1398 - val_accuracy: 0.9540
Epoch 4/5
540/540 - 2s - loss: 0.1143 - accuracy: 0.9664 - val_loss: 0.1038 - val_accuracy: 0.9693
Epoch 5/5
540/540 - 2s - loss: 0.0971 - accuracy: 0.9712 - val_loss: 0.0934 - val_accuracy: 0.9708
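The epoch structure described in the comments of this cell can be sketched without any framework. The example below fits a single weight `w` to toy data `y = 2x` with mini-batch gradient descent; it is a hypothetical stand-in for what `model.fit` does, and the data, learning rate, and function name are all assumptions chosen for illustration.

```python
import random

def train(train_set, val_set, epochs=5, batch_size=10, lr=0.5, seed=0):
    """Fit y = w * x with mini-batch gradient descent; return (w, history)."""
    rng = random.Random(seed)
    train_set = list(train_set)  # copy so the caller's list is not reordered
    w = 0.0
    history = []
    for epoch in range(epochs):
        rng.shuffle(train_set)
        epoch_loss, num_batches = 0.0, 0  # training loss starts at 0 each epoch
        for start in range(0, len(train_set), batch_size):
            batch = train_set[start:start + batch_size]
            # Forward pass: mean squared error on this batch.
            errors = [w * x - y for x, y in batch]
            epoch_loss += sum(e * e for e in errors) / len(batch)
            num_batches += 1
            # Backward pass: gradient of the batch loss with respect to w.
            grad = sum(2 * e * x for e, (x, _) in zip(errors, batch)) / len(batch)
            w -= lr * grad  # one update per batch
        # End of epoch: forward pass only on the validation set, no update.
        val_loss = sum((w * x - y) ** 2 for x, y in val_set) / len(val_set)
        history.append((epoch_loss / num_batches, val_loss))
    return w, history

points = [(i / 100, 2 * i / 100) for i in range(100)]  # toy data, y = 2x
random.Random(1).shuffle(points)
val_set, train_set = points[:10], points[10:]
w, history = train(train_set, val_set)
for epoch, (loss, val_loss) in enumerate(history, start=1):
    print(f"Epoch {epoch}: loss={loss:.6f} val_loss={val_loss:.6f}")
```

As in the log above, each epoch reports one averaged training loss and one validation loss, and the validation pass never changes the weights.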


<tensorflow.python.keras.callbacks.History at 0x1ec05199430>