<a href="https://colab.research.google.com/github/Yumna-Salaas/DataScience_Udemy/blob/main/MNIST.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Importing the required libraries. `tensorflow_datasets` (tfds) provides many ready-to-load datasets, including MNIST.


In [None]:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

Pre-processing the data

*   extracting the training, validation, and test sets
*   scaling the inputs from (0-255) to (0.0-1.0)
*   shuffling the data using a buffer
*   using mini-batch GD with an assigned batch size
*   iterating through batches








In [None]:
mnist_dataset, mnist_info = tfds.load(name='mnist', with_info=True, as_supervised=True)

Downloading and preparing dataset 11.06 MiB (download: 11.06 MiB, generated: 21.00 MiB, total: 32.06 MiB) to /root/tensorflow_datasets/mnist/3.0.1...



Dataset mnist downloaded and prepared to /root/tensorflow_datasets/mnist/3.0.1. Subsequent calls will reuse this data.


In [None]:
#extract train and test data
mnist_train, mnist_test = mnist_dataset['train'], mnist_dataset['test']

#get the number of samples from mnist_info
n_validation_samples = 0.1*mnist_info.splits['train'].num_examples

#cast the number variable into an integer
n_validation_samples = tf.cast(n_validation_samples, tf.int64)


n_test_samples = mnist_info.splits['test'].num_examples
n_test_samples = tf.cast(n_test_samples, tf.int64)

#scale the images to have inputs between 0-1, so we divide by 255
def scale(image, label):
  image = tf.cast(image, tf.float32) #make sure all inputs are floats
  image /=255. # number is a float
  return image, label

scaled_train_and_validation_data = mnist_train.map(scale) #map applies a custom transformation to each element of a dataset; the function must take and return (input, label)
scaled_test_data = mnist_test.map(scale) #applying the same scaling transformation to the test data



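#SHUFFLING THE DATA
Buffer_size = 10000 #tf fills a buffer with 10k samples, draws from it at random, and refills it from the rest of the data
#reshuffle_each_iteration=False keeps the same order on every pass, so the take/skip split below stays fixed
#and validation samples do not leak into the training data on later epochs
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(Buffer_size, reshuffle_each_iteration=False)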

validation_data = shuffled_train_and_validation_data.take(n_validation_samples)

#extracting training data without validation data
train_data = shuffled_train_and_validation_data.skip(n_validation_samples)


#Using mini batch GD
Batch_Size = 100
train_data = train_data.batch(Batch_Size) #tells the model how many samples it should take in each batch to train

#set validation and test data in batch form too, because the model expects everything to be in batches
validation_data = validation_data.batch(n_validation_samples)
test_data = scaled_test_data.batch(n_test_samples)

#the validation data must have the same format as the training and test data
validation_inputs, validation_targets = next(iter(validation_data)) #iter creates an iterator over the dataset, one batch at a time
                                                                    #next loads the next batch from that iterator (here, the single validation batch)
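
The buffer-based shuffle used above can be illustrated without TensorFlow. The following is a minimal pure-Python simulation, assuming the behaviour described in the `tf.data` documentation (fill a buffer, emit a random element, replace it with the next input element). The function name `buffered_shuffle` is hypothetical and this is not TensorFlow's actual implementation.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a buffer-shuffled order (illustrative simulation only)."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Buffer is full: emit a random slot and refill it with the new item.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

buffer_size = 3
shuffled = list(buffered_shuffle(range(10), buffer_size=buffer_size, seed=0))
print(shuffled)
```

Because an element can only be emitted once it has entered the buffer, the element at output position `i` always comes from the first `i + buffer_size` inputs. A buffer smaller than the dataset therefore gives only an approximate shuffle; a buffer at least as large as the dataset gives a full one.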

Outline the model

*   input layer: 784 (28x28 pixels)
*   output layer: 10 (10 classes/digits)
*   2 hidden layers, 100 nodes each





In [None]:
input_size = 784
output_size = 10
hiddenlayer_size = 100

#Sequential stacks layers in the order they are listed
model = tf.keras.Sequential([
                              tf.keras.layers.Flatten(input_shape=(28,28,1)), #flattens each 28x28x1 image tensor into a vector of 784 values
                              tf.keras.layers.Dense(hiddenlayer_size, activation='relu'), #first hidden layer: computes the dot product of inputs and weights, adds the bias,
                                                                                          #then applies the ReLU activation function
                              tf.keras.layers.Dense(hiddenlayer_size, activation='relu'), #second hidden layer, also with ReLU activation
                              tf.keras.layers.Dense(output_size, activation='softmax') #output layer: softmax transforms the values into class probabilities
                            ])
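
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand: each dense layer has one weight per input-output pair plus one bias per output. The sketch below assumes the layer sizes used in the code above (784, 100, 100, 10). The helper name `dense_param_count` is made up for illustration.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        # n_in * n_out weights plus n_out biases per dense layer.
        total += n_in * n_out + n_out
    return total

layer_sizes = [784, 100, 100, 10]  # input, two hidden layers, output
total_params = dense_param_count(layer_sizes)
print(total_params)  # 78500 + 10100 + 1010 = 89610
```

The Flatten layer only reshapes the input, so it adds no parameters.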

Choosing the optimizer and loss function

In [None]:
#configuring the model for training
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

#optimizer is the optimization algorithm used to update the weights
  #examples: SGD (Stochastic Gradient Descent), Adam (Adaptive Moment Estimation), GD (Gradient Descent), GD with momentum

#3 types of cross-entropy loss functions
  #binary_crossentropy: used when the targets are binary (two classes)
  #categorical_crossentropy: expects the targets to be one-hot encoded
  #sparse_categorical_crossentropy: expects integer class labels, so no manual one-hot encoding is needed
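
To make the difference between the categorical and sparse categorical losses concrete, here is a small pure-Python sketch for a single sample. It is an illustration of the formulas, not the Keras implementation, and the logits and label are made-up values.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probs, label):
    """Loss for an integer class label."""
    return -math.log(probs[label])

def categorical_crossentropy(probs, one_hot):
    """Loss for a one-hot encoded target."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

logits = [2.0, 1.0, 0.1]          # hypothetical outputs for 3 classes
probs = softmax(logits)
sparse_loss = sparse_categorical_crossentropy(probs, label=0)
dense_loss = categorical_crossentropy(probs, one_hot=[1.0, 0.0, 0.0])
print(probs, sparse_loss, dense_loss)
```

Both losses give the same value; the sparse variant just takes the label as an integer index instead of a one-hot vector, which matches the integer labels MNIST provides.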


Training/Fitting the model

In [None]:
n_epochs = 5

#verbose=2 prints one summary line per epoch (loss, accuracy, and the validation metrics)
model.fit(train_data, epochs=n_epochs, validation_data=(validation_inputs, validation_targets), verbose=2)

#at the end of each epoch, the algorithm forward propagates the whole validation set in a single batch
#training is over after 5 epochs, i.e. 5 passes through the training data, each followed by a validation pass

#OUTPUT

  #540/540 is the number of batches per epoch; the weights & biases get updated after every batch
  #MNIST has 70k images: 10k for testing and 60k for training + validation; 10% of the 60k (6k) go to validation, leaving 54k for training --> 54k/100 = 540 batches
  #*s is the number of seconds it took to finish the epoch
  #loss should gradually decrease from one epoch to the next
  #accuracy is the fraction of outputs equal to the targets
  #val_loss & val_accuracy are the same metrics computed on the validation data
    #val_loss helps us understand if the model is overfitting
    #val_accuracy is the accuracy on the WHOLE validation set at the end of the epoch, while accuracy is the average across the training batches


#OVERFITTING
  #if accuracy is high, but val_accuracy is low --> the model is overfitted

Epoch 1/5
540/540 - 8s - loss: 0.3321 - accuracy: 0.9058 - val_loss: 0.1791 - val_accuracy: 0.9470 - 8s/epoch - 16ms/step
Epoch 2/5
540/540 - 5s - loss: 0.1398 - accuracy: 0.9582 - val_loss: 0.1085 - val_accuracy: 0.9690 - 5s/epoch - 8ms/step
Epoch 3/5
540/540 - 4s - loss: 0.0958 - accuracy: 0.9700 - val_loss: 0.0890 - val_accuracy: 0.9727 - 4s/epoch - 8ms/step
Epoch 4/5
540/540 - 3s - loss: 0.0742 - accuracy: 0.9777 - val_loss: 0.0707 - val_accuracy: 0.9787 - 3s/epoch - 6ms/step
Epoch 5/5
540/540 - 3s - loss: 0.0590 - accuracy: 0.9820 - val_loss: 0.0618 - val_accuracy: 0.9823 - 3s/epoch - 6ms/step


<keras.src.callbacks.History at 0x7e9e795bc3a0>
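
The `540/540` step count in the log follows from the split sizes and the batch size. A short arithmetic sketch, assuming the standard 60,000-image MNIST training split and the 10% validation share and batch size of 100 used in this notebook:

```python
import math

n_total_train = 60_000        # size of the MNIST 'train' split
validation_fraction = 0.1     # share held out for validation
batch_size = 100

n_validation = round(validation_fraction * n_total_train)
n_train = n_total_train - n_validation
# One step per batch; ceil covers a final partial batch if there is one.
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_validation, n_train, steps_per_epoch)  # 6000 54000 540
```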

Testing the model

In [None]:
#Validation: checks that the parameters (weights and biases) don't overfit the training data
#Testing: checks that the hyperparameters (width, depth, batch size, n epochs, etc.) don't overfit the validation data

#returns the loss value and metrics for the model on the test data
test_loss, test_accuracy = model.evaluate(test_data)



In [None]:
#to print the results
print('Test loss: {0:.2f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.09. Test accuracy: 97.33%


Important notes:

*   After testing the model, conceptually, we are no longer allowed to change the model, because any further tuning would be influenced by the test results and the test data would no longer be data that the model has never seen before

*   The test dataset is used to simulate model deployment and how the model will behave in the real world

*   A training accuracy very close to val_accuracy suggests that there is no significant overfitting

*   The test accuracy is the best available estimate of the model's real-world accuracy
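
The comparison described in these notes can be written down as a small calculation. The sketch below uses the final-epoch and test accuracies from the outputs shown earlier; the helper name `accuracy_gap` is made up, and what counts as a "small" gap is a judgment call rather than a fixed rule.

```python
# Accuracies copied from the training log (epoch 5) and the test output.
train_accuracy = 0.9820
val_accuracy = 0.9823
test_accuracy = 0.9733

def accuracy_gap(reference, held_out):
    """Difference between two accuracies, rounded to 4 decimals."""
    return round(reference - held_out, 4)

train_val_gap = accuracy_gap(train_accuracy, val_accuracy)
val_test_gap = accuracy_gap(val_accuracy, test_accuracy)

print(train_val_gap)  # -0.0003
print(val_test_gap)   # 0.009
```

Here the training and validation accuracies are nearly identical, and the test accuracy is under one percentage point lower than the validation accuracy.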



