# MNIST

## Install relevant packages

In [4]:
!pip install tensorflow-datasets

Collecting tensorflow-datasets
  Downloading tensorflow_datasets-4.9.6-py3-none-any.whl.metadata (9.5 kB)
Collecting click (from tensorflow-datasets)
  Downloading click-8.1.7-py3-none-any.whl.metadata (3.0 kB)
Collecting dm-tree (from tensorflow-datasets)
  Downloading dm_tree-0.1.8-cp312-cp312-win_amd64.whl.metadata (2.0 kB)
Collecting immutabledict (from tensorflow-datasets)
  Downloading immutabledict-4.2.0-py3-none-any.whl.metadata (3.4 kB)
Collecting promise (from tensorflow-datasets)
  Downloading promise-2.3.tar.gz (19 kB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Collecting pyarrow (from tensorflow-datasets)
  Downloading pyarrow-17.0.0-cp312-cp312-win_amd64.whl.metadata (3.4 kB)
Collecting simple-parsing (from tensorflow-datasets)
  Downloading simple_parsing-0.1.5-py3-none-any.whl.metadata (7.7 kB)
Collecting tensorflow-metadata (from tensorflow-datasets)
  Downloading tensorflow_metadata-1.15.0-py3-none-any.whl.met

## 1- Import the relevant packages

In [5]:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

## 2- Data

We will use ``` tfds.load(name) ``` to load the dataset from TensorFlow Datasets. <br>
``` as_supervised=True ``` loads the data as 2-tuples of **(input, target)**.<br>
``` with_info=True ``` makes ``` tfds.load ``` also return an object with information about the dataset (version, features, number of examples), which we store in ``` mnist_info ```.

In [11]:
mnist_dataset , mnist_info = tfds.load(name="mnist",with_info=True,as_supervised=True)

Extract the **train set** and the **test set** (the **validation set** is split off from the train set later).

In [15]:
mnist_train , mnist_test = mnist_dataset['train'] , mnist_dataset['test'] 

This dataset contains only train and test splits, so we'll split the **train** set into two parts: **10%** for validation and **90%** for training.<br>
In this piece of code we look up the **train** split in ```mnist_info.splits``` to read its number of examples, and take 10% of that number.<br>
Then we use the function ```tf.cast(x, dtype)```, which casts (converts) a value to a given data type. Here the cast ensures that ```num_validation_samples``` is an integer rather than a float: a 64-bit integer, which can hold any whole number in the range [ $-2^{63}$ , $2^{63}-1$ ].

In [19]:
num_validation_samples = 0.1 * mnist_info.splits['train'].num_examples
num_validation_samples = tf.cast(num_validation_samples,tf.int64)

num_test_samples = mnist_info.splits['test'].num_examples
num_test_samples = tf.cast(num_test_samples,tf.int64)
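
As a quick sanity check of the arithmetic above, here is a plain-Python sketch. It assumes the MNIST train split holds 60,000 examples (the value `mnist_info.splits['train'].num_examples` is expected to report), and it uses `int()` as a stand-in for `tf.cast(..., tf.int64)`; it does not call TensorFlow.

```python
# Assumption: the MNIST train split has 60,000 examples.
num_train_examples = 60_000

# Multiplying by 0.1 produces a float, which is why a cast is needed
# before the value can be used as an element count.
num_validation_float = 0.1 * num_train_examples
print(type(num_validation_float).__name__)  # float

# int() stands in for tf.cast(..., tf.int64) in this sketch.
num_validation = int(num_validation_float)
num_training = num_train_examples - num_validation
print(num_validation, num_training)  # 6000 54000

# Bounds of a signed 64-bit integer.
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
print(INT64_MIN <= num_validation <= INT64_MAX)  # True
```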

Normally, we want to scale our data to make the results more numerically stable; here we bring the inputs into the range 0 to 1.<br>
We define a ```scale()``` function for this.

In [20]:
def scale(image,label):
    image = tf.cast(image,tf.float32)
    image /= 255. ## the . here means that we want the result to be float
    return image , label
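
To see what the division does to individual pixels, the following pure-Python sketch applies the same arithmetic to a short list. The helper `scale_pixels` and the sample values are hypothetical and only illustrate the mapping from 8-bit intensities to the range 0 to 1.

```python
def scale_pixels(pixels):
    """Map 8-bit pixel intensities (0..255) to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]

sample_row = [0, 51, 255]  # hypothetical pixel values
scaled_row = scale_pixels(sample_row)
print(scaled_row)  # [0.0, 0.2, 1.0]
```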

We want to apply this transformation to our dataset. For this, TensorFlow has ```dataset.map(function)```, which applies a custom transformation to every element of a given dataset. It takes as input the function that defines the transformation.

In [22]:
scaled_train_and_validation_data = mnist_train.map(scale)

In [23]:
scaled_test_data = mnist_test.map(scale)

In many cases the data is stored in ascending or descending order (for example, sorted by target), which causes a problem.<br>
Ordered data confuses ***gradient descent***, because each batch can then contain only a narrow slice of the targets. The solution is ```shuffling``` the data.<br>
There is a second problem we will almost always face: with an enormous dataset we cannot shuffle everything at once, because it does not fit in the computer's memory. Instead, we shuffle part of the data at a time, and ```buffer_size``` sets how many samples are held for shuffling at once.<br>
The function ```dataset.shuffle(buffer_size)``` takes the buffer size as an argument and shuffles the given dataset.

In [40]:
BUFFER_SIZE = 10000
shuffled_train_and_validation_data = scaled_train_and_validation_data.shuffle(BUFFER_SIZE)
shuffled_test_data = scaled_test_data.shuffle(BUFFER_SIZE)
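
The idea of shuffling through a fixed-size buffer can be sketched in pure Python as below. This is an illustration of the general technique under simplifying assumptions, not TensorFlow's actual implementation; `buffered_shuffle` is a hypothetical helper and expects `buffer_size >= 1`.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in a partially random order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # The input is exhausted: emit what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

shuffled = list(buffered_shuffle(range(20), buffer_size=5, seed=0))
unshuffled = list(buffered_shuffle(range(10), buffer_size=1))
print(sorted(shuffled) == list(range(20)))  # True: nothing lost or duplicated
print(unshuffled)  # buffer_size=1 leaves the order unchanged
```

In this sketch, a buffer of size 1 does not shuffle at all, while a buffer at least as large as the dataset gives a uniform shuffle; values in between only mix elements that are close to each other.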

We get the validation data by using the function ```dataset.take(x)```, which takes the first x elements of the given dataset.<br>
Then we get the train data using the function ```dataset.skip(x)```, which returns all the elements except the first x.

In [39]:
validation_data = shuffled_train_and_validation_data.take(num_validation_samples)
train_data = shuffled_train_and_validation_data.skip(num_validation_samples)
test_data = shuffled_test_data
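
The behaviour of `take` and `skip` can be mimicked on an ordinary Python sequence with `itertools.islice`. The helpers below are hypothetical stand-ins that return lists; they only illustrate how the two calls partition a sequence without overlap.

```python
from itertools import islice

def take(items, n):
    """Return the first n elements."""
    return list(islice(items, n))

def skip(items, n):
    """Return everything after the first n elements."""
    return list(islice(items, n, None))

samples = list(range(10))  # hypothetical, already shuffled data
validation_part = take(samples, 3)
train_part = skip(samples, 3)
print(validation_part)  # [0, 1, 2]
print(train_part)       # [3, 4, 5, 6, 7, 8, 9]
```

One caveat worth checking in your own setup: `dataset.shuffle` reshuffles on every pass over the data unless `reshuffle_each_iteration=False` is given, so the elements that fall on either side of the `take`/`skip` boundary may differ from one epoch to the next.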

We'll be using ```mini-batch gradient descent``` to train our model, because it is an efficient way to perform deep learning and offers a good trade-off between accuracy and speed.<br>
To do that we have to set the batch size and prepare our data for batching.<br>
```dataset.batch(batch_size)``` combines consecutive elements of the dataset into batches. The validation and test sets are only used for forward passes, so each of them is placed in a single batch that contains all of its samples.

In [41]:
BATCH_SIZE = 100
train_data = train_data.batch(BATCH_SIZE)

validation_data = validation_data.batch(num_validation_samples)

test_data = test_data.batch(num_test_samples)

validation_input , validation_targets = next(iter(validation_data))
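
A pure-Python sketch of what batching does, together with the resulting number of training steps per epoch. It assumes 54,000 training samples (90% of MNIST's 60,000 training examples); the `batch` generator is a hypothetical illustration, not the TensorFlow implementation.

```python
import math

def batch(items, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    current = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:  # final, possibly smaller, batch
        yield current

print(list(batch(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]

num_training = 54_000  # assumption: 90% of 60,000 examples
BATCH_SIZE = 100
steps_per_epoch = math.ceil(num_training / BATCH_SIZE)
print(steps_per_epoch)  # 540
```

The value 540 agrees with the `540/540` step counter that Keras prints for each epoch in the training output.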

## 3- Model

### Outline the model

In [49]:
input_size = 784
output_size = 10
hidden_layer_size = 100

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28,28,1)),
    tf.keras.layers.Dense(hidden_layer_size,activation='relu'),
    tf.keras.layers.Dense(hidden_layer_size,activation='relu'),
    tf.keras.layers.Dense(output_size,activation='softmax'),
])
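
To get a feeling for the size of this network, the sketch below counts its trainable parameters by hand. It relies on the fact that a dense layer has one weight per input-unit pair plus one bias per unit, and that `Flatten` turns a 28x28x1 image into 784 values; `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

input_size, hidden_layer_size, output_size = 28 * 28 * 1, 100, 10
layer_sizes = [input_size, hidden_layer_size, hidden_layer_size, output_size]

# Pair each layer's input width with its output width.
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer)     # [78500, 10100, 1010]
print(total_params)  # 89610
```

If the count is right, `model.summary()` should report the same total.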

### Choose the optimizer and loss function

In [52]:
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
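
The loss chosen above can be illustrated for a single sample in pure Python. This is a minimal sketch of the usual definitions (softmax followed by the negative log-probability of the true class), not Keras's implementation; the logits are made-up values. "Sparse" means the target is given as an integer class index rather than a one-hot vector.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probabilities, label):
    """Negative log-probability assigned to the integer class label."""
    return -math.log(probabilities[label])

probs = softmax([2.0, 1.0, 0.1])  # hypothetical output scores for 3 classes
loss_correct = sparse_categorical_crossentropy(probs, 0)  # most likely class
loss_wrong = sparse_categorical_crossentropy(probs, 2)    # least likely class
print(loss_correct < loss_wrong)  # True: confident correct answers cost less
```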

### Training

In [53]:
NUM_EPOCHS = 5
model.fit(
    train_data,
    epochs=NUM_EPOCHS,
    validation_data=(validation_input,validation_targets),
    verbose=2
)

Epoch 1/5
540/540 - 3s - 6ms/step - accuracy: 0.9014 - loss: 0.3453 - val_accuracy: 0.9552 - val_loss: 0.1601
Epoch 2/5
540/540 - 2s - 4ms/step - accuracy: 0.9581 - loss: 0.1429 - val_accuracy: 0.9723 - val_loss: 0.1069
Epoch 3/5
540/540 - 2s - 4ms/step - accuracy: 0.9693 - loss: 0.1003 - val_accuracy: 0.9728 - val_loss: 0.0916
Epoch 4/5
540/540 - 3s - 5ms/step - accuracy: 0.9758 - loss: 0.0779 - val_accuracy: 0.9793 - val_loss: 0.0706
Epoch 5/5
540/540 - 2s - 4ms/step - accuracy: 0.9809 - loss: 0.0616 - val_accuracy: 0.9790 - val_loss: 0.0619


<keras.src.callbacks.history.History at 0x1c873509c70>

### Test the model

In [54]:
test_loss , test_accuracy = model.evaluate(test_data)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 488ms/step - accuracy: 0.9711 - loss: 0.0966


In [59]:
print('Test loss: {0:2f}. Test accuracy: {1:.2f}% '.format(test_loss,test_accuracy*100))

Test loss: 0.096583. Test accuracy: 97.11% 
