
The dataset is called MNIST and is a standard benchmark for handwritten digit recognition. You can find more about it on the website of Yann LeCun (Director of AI Research, Facebook). He is one of the pioneers of what we've been talking about and of more complex approaches that are widely used today, such as convolutional neural networks (CNNs).

The dataset provides 70,000 images (28x28 pixels) of handwritten digits (1 digit per image). 

The goal is to write an algorithm that detects which digit is written. Since there are only 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), this is a classification problem with 10 classes. 

This problem is often called the "Hello World" of deep learning because, for most students, it is the first deep learning algorithm they see.

## Import the relevant packages

In [2]:
import tensorflow as tf
import numpy as np
import tensorflow_datasets as tfds
tf.__version__

'2.4.0'

## Data

In [3]:
dataset, info = tfds.load(name='mnist', as_supervised=True, with_info=True)

# as_supervised=True loads the dataset in a 2-tuple structure (input, target)
# as_supervised=False would return a dictionary
# we prefer to have our inputs and targets separated
# with_info=True also returns an info object with metadata such as the split sizes


Downloading and preparing dataset mnist/3.0.1 (download: 11.06 MiB, generated: 21.00 MiB, total: 32.06 MiB) to /root/tensorflow_datasets/mnist/3.0.1...


local data directory. If you'd instead prefer to read directly from our public
GCS bucket (recommended if you're running on GCP), you can instead pass
`try_gcs=True` to `tfds.load` or set `data_dir=gs://tfds-data/datasets`.






Dataset mnist downloaded and prepared to /root/tensorflow_datasets/mnist/3.0.1. Subsequent calls will reuse this data.


In [6]:
dataset

{'test': <PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>,
 'train': <PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>}

In [7]:
dataset['train']


<PrefetchDataset shapes: ((28, 28, 1), ()), types: (tf.uint8, tf.int64)>

In [8]:
mnist_train, mnist_test = dataset['train'], dataset['test']




In [10]:
info.splits['train'].num_examples  # number of training samples

60000

In [14]:
# create the validation data manually (method 1): hold out 10% of the training samples

num_validation_samples = 0.1 * info.splits['train'].num_examples

num_validation_samples = tf.cast(num_validation_samples, tf.int64)  # cast the value to an integer type



num_test_samples = info.splits['test'].num_examples

num_test_samples = tf.cast(num_test_samples, tf.int64)  # cast the test count itself, not the validation count


def scale(image, label):
    # we make sure the value is a float
    image = tf.cast(image, tf.float32)
    # since the possible values for the inputs are 0 to 255 (256 different shades of grey)
    # if we divide each element by 255, we would get the desired result -> all elements will be between 0 and 1 
    image /= 255.

    return image, label


# the method .map() allows us to apply a custom transformation to a given dataset
# we have already decided that we will get the validation data from mnist_train, so we scale both together
scaled_train_and_validation_data = mnist_train.map(scale)


test_data = mnist_test.map(scale)
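
To see what `scale` does to the pixel values, here is a minimal NumPy sketch of the same cast-then-divide step on a tiny made-up array. The 2x2 `image` values are hypothetical and not taken from MNIST; the sketch only mirrors the arithmetic, while the notebook itself applies the transformation through `.map(scale)`.

```python
import numpy as np

# A toy 2x2 "image" with uint8 pixel intensities (hypothetical values)
image = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# Cast to float first, then divide by 255 so every value falls in [0, 1]
scaled = image.astype(np.float32) / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```

Casting before dividing matters: the raw pixels are unsigned 8-bit integers, and the network expects floating-point inputs.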

In [15]:
# shuffling the data 

BUFFER_SIZE = 10000
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(BUFFER_SIZE)



# separate the train and validation data with the take() and skip() methods

validation_data = shuffled_train_and_validation_data.take(num_validation_samples)

train_data = shuffled_train_and_validation_data.skip(num_validation_samples)

# we split only the train data into multiple batches
BATCH_SIZE = 100

# batching lets us iterate over the train data one batch at a time during training
train_data = train_data.batch(BATCH_SIZE)


# the validation data does not need mini-batches, so we put it all into a single batch
validation_data = validation_data.batch(num_validation_samples)

# batch the test data
test_data = test_data.batch(num_test_samples)



# take the next batch (it is the only batch)
# because as_supervised=True, we get a 2-tuple structure
validation_inputs, validation_targets = next(iter(validation_data))
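
The split and batch sizes above can be sanity-checked with plain NumPy. The sketch below is an illustration under assumptions, not how `tf.data` works internally: it uses a full random permutation of indices as a stand-in for `shuffle`, and slicing as a stand-in for `take()` and `skip()`.

```python
import numpy as np

num_train_examples = 60000                       # value reported by info.splits['train'].num_examples
num_validation_samples = num_train_examples // 10  # 10% held out for validation
BATCH_SIZE = 100

# Stand-in for shuffling: a random permutation of the sample indices
indices = np.random.default_rng(0).permutation(num_train_examples)

validation_idx = indices[:num_validation_samples]  # plays the role of .take(n)
train_idx = indices[num_validation_samples:]       # plays the role of .skip(n)

# Number of batches the model sees per epoch
steps_per_epoch = int(np.ceil(len(train_idx) / BATCH_SIZE))

print(len(validation_idx), len(train_idx), steps_per_epoch)
```

With 54,000 training samples and `BATCH_SIZE = 100`, one epoch consists of 540 batches, which should match the step counter Keras prints per epoch. Unlike this sketch, `shuffle(BUFFER_SIZE)` with a buffer of 10,000 shuffles only within a sliding buffer and, by default, reshuffles each time the dataset is iterated.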

## Building the model

In [16]:
input_size = 784
output_size = 10

hidden_layer_size = 50
    

model = tf.keras.models.Sequential([
    
   
    # the convenient 'Flatten' layer takes each 28x28x1 tensor and reorders it
    # into a vector of 28*28*1 = 784 elements, i.e. shape (784,)
    
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)), # input layer
    
    # tf.keras.layers.Dense is basically implementing: output = activation(dot(input, weight) + bias)
    # it takes several arguments, but the most important ones for us are the hidden_layer_size and the activation function
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 1st hidden layer
    tf.keras.layers.Dense(hidden_layer_size, activation='relu'), # 2nd hidden layer

    # the output layer is activated with softmax, so the 10 outputs form a probability distribution
    tf.keras.layers.Dense(output_size, activation='softmax') # output layer
])
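
As a rough illustration of what this stack of layers computes, the following NumPy sketch runs one forward pass with random weights. The helper names (`dense`, `relu`, `softmax`) and the weight initialisation are assumptions made for this example; Keras uses its own initialisers and trains the weights, which this sketch does not do.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_layer_size, output_size = 784, 50, 10

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # subtract the row-wise max for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def dense(x, w, b, activation):
    # output = activation(dot(input, weight) + bias)
    return activation(x @ w + b)

# One (weights, bias) pair per Dense layer: 784 -> 50 -> 50 -> 10
sizes = [input_size, hidden_layer_size, hidden_layer_size, output_size]
params = [(rng.normal(0.0, 0.05, (m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

batch = rng.random((3, 28, 28, 1))   # three fake "images" with values in [0, 1)
x = batch.reshape(len(batch), -1)    # Flatten: (3, 28, 28, 1) -> (3, 784)
h1 = dense(x, *params[0], relu)      # 1st hidden layer
h2 = dense(h1, *params[1], relu)     # 2nd hidden layer
probs = dense(h2, *params[2], softmax)  # output layer

num_params = sum(w.size + b.size for w, b in params)
print(x.shape, probs.shape, num_params)
```

Each row of `probs` sums to 1, and the parameter count of 42,310 (39,250 + 2,550 + 510) should be the total that `model.summary()` reports for this architecture.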

In [17]:
# choose the optimizer, the loss function, and the metrics to track
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
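
The loss name `'sparse_categorical_crossentropy'` means the targets are plain integer class labels rather than one-hot vectors. Here is a minimal NumPy sketch of the idea, using three classes and made-up probabilities for brevity. The Keras implementation also clips probabilities for numerical stability, which is omitted here.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, targets):
    # pick the predicted probability of the correct class for each sample
    picked = probs[np.arange(len(targets)), targets]
    # the loss is the negative log of that probability
    return -np.log(picked)

# hypothetical softmax outputs for two samples and three classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
targets = np.array([0, 1])  # integer labels, not one-hot vectors

losses = sparse_categorical_crossentropy(probs, targets)
print(losses)         # per-sample losses
print(losses.mean())  # the average is what gets reported per batch
```

The first sample is classified confidently and correctly, so its loss is small; the second assigns only 0.1 to the true class, so its loss is much larger.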

In [20]:
# determine the maximum number of epochs
NUM_EPOCHS = 5

# fit the model on the train data and pass the validation data we created ourselves in the format (inputs, targets)
r = model.fit(train_data, epochs=NUM_EPOCHS, validation_data=(validation_inputs, validation_targets), verbose=2)

Epoch 1/5
540/540 - 6s - loss: 0.4236 - accuracy: 0.8796 - val_loss: 0.2274 - val_accuracy: 0.9333
Epoch 2/5
540/540 - 3s - loss: 0.1837 - accuracy: 0.9468 - val_loss: 0.1559 - val_accuracy: 0.9552
Epoch 3/5
540/540 - 3s - loss: 0.1383 - accuracy: 0.9591 - val_loss: 0.1260 - val_accuracy: 0.9632
Epoch 4/5
540/540 - 3s - loss: 0.1121 - accuracy: 0.9665 - val_loss: 0.1105 - val_accuracy: 0.9685
Epoch 5/5
540/540 - 3s - loss: 0.0932 - accuracy: 0.9722 - val_loss: 0.0930 - val_accuracy: 0.9705


In [21]:
test_loss, test_accuracy = model.evaluate(test_data)
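
The accuracy returned by `model.evaluate` is the fraction of samples whose most probable class matches the target. A minimal sketch of that calculation, with hypothetical probabilities and targets for four samples and three classes:

```python
import numpy as np

# hypothetical softmax outputs: one row per sample, one column per class
probs = np.array([[0.1, 0.8, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.2, 0.6],
                  [0.5, 0.4, 0.1]])
targets = np.array([1, 0, 2, 1])

# the predicted class is the index of the largest probability in each row
predictions = probs.argmax(axis=1)

# accuracy is the share of predictions that equal the targets
accuracy = (predictions == targets).mean()
print(accuracy)  # 0.75
```

Here three of the four predictions are correct; the last sample is predicted as class 0 while its target is class 1.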

