<a href="https://colab.research.google.com/github/maunzeb/MLTest/blob/main/HelloMachineLearning0.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [4]:
# import the necessary packages
import tensorflow as tf
import numpy as np
import keras as ks
import pandas as pd

# import image dataset
# train_images and train_labels from the training set (data that the model will learn from)
# the model will then be tested in the test set test_images and test_labels
from tensorflow.keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [12]:
# let's look at the training data
train_images.shape
len(train_labels)
train_labels 



array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

In [13]:
# let's look at the test data
test_images.shape
len(test_labels)
test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

In [None]:
# the workflow: feed the neural network with training data, train_images and train_labels
# the network will then learn to associate images and labels
# finally, we'll ask the network to produce predictions for test_images
# we'll then verify whether these predictions match the labels from test_labels

In [14]:
# Building the network again from the beginning

from tensorflow import keras
from tensorflow.keras import layers
model = keras.Sequential([
      layers.Dense(512, activation="relu"),
      layers.Dense(10, activation="softmax")
])

# the core building block of neural networks is the layer (a filter for data)
# some data goes in and it comes out in a more useful form
# specifically, layers extract representations out of the data fed into them
# hopefully, the representations are more meaningful for the problem at hand
# most of deep learning consists of chaining together simple layers that will
# implement a form of progressive data distillation
# in this example, our model consists of a sequence of two Dense layers, which are
# densely connected (also called fully connected) neural layers
# the second (and last) layer is a 10-way softmax classification layer
# this means it will return an array of 10 probability scores summing to 1
# each score will be the probability that the current digit image belongs to
# one of our 10 digit classes

# to make the model ready for training, we need to pick three more things
# as part of the compilation step

# an optimizer -- the mechanism through which the model will update itself
# based on the training data it sees, so as to improve its performance

# a loss function -- how the model will measure its performance on training data,
# allowing it to steer itself in the right direction

# Metrics to monitor during training and testing
# here we'll only care about accuracy (fraction of images correctly classified)
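
To make the softmax output and the loss more concrete, here is a minimal NumPy sketch. It is not how Keras implements them internally. The helper names `softmax` and `sparse_categorical_crossentropy` are defined here for illustration, and the raw scores and labels are made up rather than taken from the model.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 along the last axis."""
    # subtracting the row maximum avoids overflow and leaves the result unchanged
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true integer labels."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# made-up raw scores for two images over the 10 digit classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 10))
labels = np.array([3, 7])  # illustrative integer labels, like those in train_labels

probs = softmax(logits)
loss = sparse_categorical_crossentropy(probs, labels)
print(probs.shape, probs.sum(axis=1), loss)
```

The "sparse" in the loss name refers to the labels being plain integers (0 to 9) rather than one-hot vectors, which matches how `train_labels` is stored.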




In [15]:
# The compilation step
model.compile(optimizer="rmsprop", 
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
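
As a rough illustration of what an optimizer such as RMSprop does with each gradient, the sketch below applies the textbook RMSprop update to a toy one-parameter problem. The function `rmsprop_step` and the toy loss are assumptions made for illustration, and the exact formula inside Keras may differ in small details (for example, where epsilon is added).

```python
import numpy as np

def rmsprop_step(w, grad, avg_sq, lr=0.001, rho=0.9, eps=1e-7):
    """One textbook RMSprop update; returns the new weights and running average."""
    # keep an exponential moving average of the squared gradients
    avg_sq = rho * avg_sq + (1.0 - rho) * grad ** 2
    # divide the step by the root of that average so step sizes stay comparable
    w = w - lr * grad / (np.sqrt(avg_sq) + eps)
    return w, avg_sq

# toy problem: minimize f(w) = w**2, whose gradient is 2*w
w = np.array([5.0])
avg_sq = np.zeros_like(w)
start_loss = float((w ** 2).sum())
for _ in range(200):
    grad = 2.0 * w
    w, avg_sq = rmsprop_step(w, grad, avg_sq, lr=0.1)
end_loss = float((w ** 2).sum())
print(start_loss, end_loss)
```

In the real model the same kind of update is applied to every weight of both Dense layers, using gradients of the loss computed on each batch.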

Before training, we preprocess the data by reshaping each 28x28 image into a flat vector of 784 values, the shape the model expects, and scaling it so that all values are in the [0, 1] interval.
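
As a small-scale illustration of this reshaping and scaling, the sketch below runs the same two operations on a made-up batch of four random images. The `preprocess` helper is hypothetical, and it uses `-1` in the reshape so that the number of images does not have to be hard-coded.

```python
import numpy as np

def preprocess(images):
    """Flatten each 28x28 image to 784 values and scale uint8 pixels to [0, 1]."""
    flat = images.reshape((-1, 28 * 28))  # -1 lets NumPy infer the number of images
    return flat.astype("float32") / 255

# made-up stand-in for a batch of MNIST images: 4 images of 28x28 uint8 pixels
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

prepared = preprocess(fake_images)
print(prepared.shape, prepared.dtype, prepared.min(), prepared.max())
```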

In [16]:
# Preparing the image data
train_images = train_images.reshape((60000, 28*28))
train_images = train_images.astype("float32")/255
test_images = test_images.reshape((10000, 28*28))
test_images = test_images.astype("float32")/255

# We are now ready to train the model, which in Keras is done via a call to the
# model's fit() method -- we fit the model to its training data

In [17]:
# Fitting the model
model.fit(train_images, train_labels, epochs=5, batch_size=128)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f5327047510>

Two quantities are displayed during training: the loss of the model over the training data, and the accuracy of the model over the training data. We reached 98.9% accuracy on the training data.

Now that we have trained the model, we can use it to predict class probabilities for new digits -- images that weren't part of the training data, like those from the test set.

In [21]:
# Using the Model to make predictions

test_digits = test_images[0:10]
predictions = model.predict(test_digits)
predictions[0]

# in the result, the number at index i is the probability that the digit image
# test_digits[0] belongs to class i. The first test digit has its highest
# probability score (0.99999106, almost 1) at index 7,
# so according to our model, it must be a 7:

predictions[0].argmax()
predictions[0][7]

# we can check that the test label agrees:

test_labels[0]


7

On average, how good is our model at classifying such never-before-seen digits? Let's check by computing the average accuracy over the entire test set.

In [22]:
# Evaluating the model on new data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"test_acc: {test_acc}")


test_acc: 0.9811999797821045
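
The accuracy figure is the fraction of images whose highest-probability class matches the true label. The sketch below computes it by hand with NumPy on a few made-up probability rows (not real model output). The `accuracy` helper is hypothetical and only mirrors the idea behind the metric.

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of rows whose highest-probability class equals the true label."""
    predicted = probs.argmax(axis=1)
    return float((predicted == labels).mean())

# made-up probabilities for 4 images over 3 classes (not real model output)
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = np.array([0, 1, 2, 1])  # the last image is misclassified

print(accuracy(probs, labels))  # 0.75
```

Under the assumption that `model.predict(test_images)` returns one row of 10 probabilities per image, the same helper applied to those rows and `test_labels` should give a value close to `test_acc`.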


The test accuracy is about 98.1%, lower than the training accuracy of 98.9%. This gap between training and test accuracy is an example of overfitting: the fact that machine learning models tend to perform worse on new data than on their training data.