<a href="https://colab.research.google.com/github/Elma-dev/hands_in_keras/blob/main/MNIST_Keras.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [20]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import pandas as pd



---
# **<center>Fashion MNIST Data Set</center>**

---
<center>
Fashion MNIST: 70,000 grayscale images of 28×28 pixels each, in 10 classes.
</center>

## **Using Keras to Load the Dataset**

In [5]:
fashion_mnist=keras.datasets.fashion_mnist

In [16]:
#load data
(x_train_full,y_train_full),(x_test,y_test)=fashion_mnist.load_data()

In [17]:
print(f'shape(x_train,y_train)={x_train_full.shape,y_train_full.shape}')
print(f'shape(x_test,y_test)={x_test.shape,y_test.shape}')

shape(x_train,y_train)=((60000, 28, 28), (60000,))
shape(x_test,y_test)=((10000, 28, 28), (10000,))


Note that the dataset is already split into a training set and a test set, but there is no
validation set, so let’s create one.

In [18]:
# Hold out the first 5,000 images as a validation set and scale pixel values to [0, 1]
x_train_valid,x_train=x_train_full[:5000]/255,x_train_full[5000:]/255
y_train_valid,y_train=y_train_full[:5000],y_train_full[5000:]
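
The split and scaling step can be checked on a small stand-in array. This is a minimal sketch, assuming random `uint8` pixels in place of the real images; `x_full`, `y_full`, and `n_valid` are illustrative names, not part of the notebook.

```python
import numpy as np

# Synthetic stand-in for x_train_full / y_train_full (assumption: random
# uint8 pixels and labels with the same dtype and image size as the real data).
rng = np.random.default_rng(0)
x_full = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
y_full = rng.integers(0, 10, size=100, dtype=np.uint8)

n_valid = 10  # the notebook holds out the first 5000 of 60000 images

# Dividing a uint8 array by 255 produces floats in the range [0, 1].
x_valid, x_train = x_full[:n_valid] / 255, x_full[n_valid:] / 255
y_valid, y_train = y_full[:n_valid], y_full[n_valid:]

print(x_valid.shape, x_train.shape)
print(x_train.dtype, x_train.min(), x_train.max())
```

Scaling inputs to a small range generally helps Gradient Descent, which is why the division happens before training rather than inside the model.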

Class labels in the **Fashion MNIST** data:

In [24]:
print(f'MnistClasses:{np.sort(pd.unique(y_train))}')

MnistClasses:[0 1 2 3 4 5 6 7 8 9]
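
Because the data is loaded from `keras.datasets.fashion_mnist`, each integer label stands for a clothing category. The sketch below maps labels to names; the `class_names` list follows the Fashion MNIST documentation rather than anything defined in this notebook, and `sample_labels` is a made-up example.

```python
import numpy as np

# Category names in label order (0 to 9), as listed in the Fashion MNIST docs.
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical labels, with the same dtype as y_train.
sample_labels = np.array([9, 0, 3], dtype=np.uint8)

sample_names = [class_names[label] for label in sample_labels]
print(sample_names)  # ['Ankle boot', 'T-shirt/top', 'Dress']
```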


# **Creating the Model Using the Sequential API**

**<center>Code Lines Meaning:</center>**


---
- The first line creates a **Sequential model**. This is the simplest kind of Keras model, for neural networks that are just composed of a single stack of layers, connected sequentially. This is called the sequential API.

- Next, we build the first layer and add it to the model. It is a **Flatten layer** whose role is simply to **convert each input image into a 1D array**: if it receives input data X, it computes X.reshape(-1, 784). This layer does not have any parameters; it is just there to do some simple preprocessing. Since it is the first layer in the model, you should specify the input_shape: this does not include the batch size, only the shape of the instances. Alternatively, you could add a keras.layers.InputLayer as the first layer, setting input_shape=[28,28] (the argument is named shape in Keras 3).

- Next we add a **Dense hidden layer with 300 neurons**. It will use the **ReLU** activation function. Each Dense layer manages its own weight matrix, containing all the connection weights between the neurons and their inputs.

- Next we add a second **Dense hidden layer with 100 neurons**, also using the **ReLU** activation function.

- Finally, we add a **Dense output layer with 10 neurons** (one per class), using the **softmax** activation function (because the classes are exclusive).
---
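
The layer stack described above can be sketched in plain NumPy to show what each layer computes. This is an illustration only, not how Keras implements it: the weights are random (assumed normal with a small standard deviation), and `dense`, `relu`, and `softmax` are helper names introduced here.

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(z):
    # ReLU keeps positive values and zeroes out the rest.
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row maximum for numerical stability, then normalize.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def dense(x, w, b, activation):
    # A Dense layer computes activation(x @ W + b).
    return activation(x @ w + b)

# Two fake 28x28 "images" with values in [0, 1).
X = rng.random((2, 28, 28))

# Flatten: one row of 784 values per image.
flat = X.reshape(len(X), -1)

# Random parameters for the 784 -> 300 -> 100 -> 10 stack.
sizes = [784, 300, 100, 10]
params = [(rng.normal(0.0, 0.05, (n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

h = flat
for i, (w, b) in enumerate(params):
    is_output = i == len(params) - 1
    h = dense(h, w, b, softmax if is_output else relu)
probs = h

print(flat.shape, probs.shape)
print(probs.sum(axis=1))  # each row sums to 1 (up to rounding)
```

Because softmax normalizes each row, the ten outputs can be read as class probabilities, which is why it suits exclusive classes.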



In [27]:
# Stack the layers in order: flatten, two ReLU hidden layers, softmax output
model=keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28,28]))    # 28x28 image -> 784 values
model.add(keras.layers.Dense(300,activation="relu"))
model.add(keras.layers.Dense(100,activation="relu"))
model.add(keras.layers.Dense(10,activation="softmax"))  # one output per class


The model’s **summary()** method displays all the model’s layers, including each layer’s name, output shape, and number of parameters.

In [28]:
model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten_1 (Flatten)         (None, 784)               0         
                                                                 
 dense (Dense)               (None, 300)               235500    
                                                                 
 dense_1 (Dense)             (None, 100)               30100     
                                                                 
 dense_2 (Dense)             (None, 10)                1010      
                                                                 
Total params: 266,610
Trainable params: 266,610
Non-trainable params: 0
_________________________________________________________________
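
The parameter counts in the summary can be reproduced by hand: a Dense layer has one weight per input-output pair plus one bias per output. A short check, with `layer_sizes` as an illustrative name:

```python
# Number of values flowing between layers: 784 inputs, then 300, 100, 10 neurons.
layer_sizes = [784, 300, 100, 10]

# Each Dense layer has n_in * n_out weights plus n_out biases.
param_counts = [n_in * n_out + n_out
                for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total_params = sum(param_counts)

print(param_counts)  # [235500, 30100, 1010]
print(total_params)  # 266610
```

The first hidden layer alone accounts for most of the parameters, which gives the model a lot of flexibility but also raises the risk of overfitting.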


**layers**: You can easily get a model’s list of layers:

In [30]:
model.layers

[<keras.layers.reshaping.flatten.Flatten at 0x7b0668e42980>,
 <keras.layers.core.dense.Dense at 0x7b0666ae7dc0>,
 <keras.layers.core.dense.Dense at 0x7b0666ae5ed0>,
 <keras.layers.core.dense.Dense at 0x7b0666ae54b0>]

In [31]:
model.get_layer("dense_1")

<keras.layers.core.dense.Dense at 0x7b0666ae5ed0>

In [32]:
model.get_layer("dense_1").get_weights() # returns [kernel, bias]; set_weights() takes the same list

[array([[-0.00252065,  0.08362182,  0.11213166, ...,  0.075518  ,
         -0.11015425,  0.11486167],
        [ 0.05613735,  0.07773171, -0.08911283, ..., -0.03343691,
          0.05834035,  0.10709088],
        [-0.02595766,  0.0783299 ,  0.12045848, ..., -0.1185402 ,
         -0.00577363,  0.09846737],
        ...,
        [ 0.01165228, -0.00874073,  0.03118043, ..., -0.11113798,
          0.04908874, -0.01384506],
        [-0.03260905, -0.06423447, -0.10920078, ...,  0.00410806,
          0.08141153,  0.06624601],
        [ 0.11180135,  0.07098856, -0.08084394, ...,  0.11351723,
         -0.1137554 ,  0.11726061]], dtype=float32),
 array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.
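
The values printed above fit the Dense layer defaults in Keras: the kernel uses Glorot uniform initialization and the biases start at zero. The sketch below reproduces that scheme in NumPy for the 300 → 100 layer; it is an approximation for illustration (Keras uses its own random generator), and `fan_in`, `fan_out`, and `limit` are names chosen here.

```python
import numpy as np

fan_in, fan_out = 300, 100  # inputs and outputs of dense_1

# Glorot uniform draws from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)).
limit = np.sqrt(6.0 / (fan_in + fan_out))

rng = np.random.default_rng(0)
kernel = rng.uniform(-limit, limit, size=(fan_in, fan_out)).astype(np.float32)
bias = np.zeros(fan_out, dtype=np.float32)

print(round(float(limit), 4))  # 0.1225
print(kernel.shape, bias.shape)
```

Every kernel value shown in the output above lies within about ±0.1225, as this bound predicts. Random kernels break the symmetry between neurons, while zero biases are harmless.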

## **Compiling the Model**


After a model is created, you must call its compile() method to specify the loss function and the optimizer to use. Optionally, you can also specify a list of extra metrics to compute during training and evaluation.

---
<center>Code Explanation</center>

- First, we use the "sparse_categorical_crossentropy" loss because we have sparse labels (i.e., for each instance there is just a target class index, from 0 to 9 in this case), and the classes are exclusive. If instead we had one target probability per class for each instance (such as one-hot vectors, e.g., [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.] to represent class 3), then we would need to use the "categorical_crossentropy" loss instead. If we were doing binary classification (with one or more binary labels), then we would use the "sigmoid" (i.e., logistic) activation function in the output layer instead of the "softmax" activation function, and we would use the "binary_crossentropy" loss.

- Second, regarding the optimizer, "sgd" simply means that we will train the model using simple Stochastic Gradient Descent. In other words, Keras will perform the backpropagation algorithm (i.e., reverse-mode autodiff + Gradient Descent). More efficient optimizers exist; they improve the Gradient Descent part, not the autodiff.

- Finally, since this is a classifier, it’s useful to measure its "accuracy" during training and evaluation.

---
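
The two cross-entropy variants differ only in how the targets are encoded, not in the value they compute. A minimal NumPy sketch, assuming made-up softmax outputs for three classes (`probs` and `sparse_labels` are illustrative):

```python
import numpy as np

# Hypothetical softmax outputs for two instances and three classes.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1]])

# Sparse labels: one class index per instance.
sparse_labels = np.array([1, 0])

# The same targets as one-hot vectors.
one_hot = np.eye(3)[sparse_labels]

# Sparse form: pick the predicted probability of the true class.
sparse_ce = -np.log(probs[np.arange(len(probs)), sparse_labels])

# One-hot form: sum over classes; only the true class contributes.
categorical_ce = -(one_hot * np.log(probs)).sum(axis=1)

print(sparse_ce)
print(categorical_ce)
```

Both arrays hold the same per-instance losses, so the choice between the two losses comes down to the label format you already have.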

In [33]:
model.compile(optimizer="sgd",loss="sparse_categorical_crossentropy",metrics=["accuracy"])
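
To make the "sgd" optimizer concrete, here is a minimal sketch of mini-batch Stochastic Gradient Descent on a toy linear regression problem. It assumes a hand-derived gradient in place of autodiff and synthetic noise-free data; the learning rate of 0.01 matches the Keras default for "sgd", and all other names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data: y is an exact linear function of X.
X = rng.normal(size=(256, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w

w = np.zeros(3)
learning_rate = 0.01  # default learning rate of the Keras "sgd" optimizer
batch_size = 32

for epoch in range(200):
    order = rng.permutation(len(X))  # reshuffle the instances every epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        error = X[idx] @ w - y[idx]
        # Gradient of the mean squared error with respect to w (derived by hand).
        grad = 2.0 * X[idx].T @ error / len(idx)
        # The SGD update: step against the gradient.
        w -= learning_rate * grad

print(w)
```

Keras follows the same loop, except that the gradients come from reverse-mode autodiff through the whole network.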

## **Training and Evaluating the Model**

---
Now the model is ready to be trained. For this we simply need to call its **fit()** method. We pass it the input features (x_train) and the target classes (y_train), as well as the ***number of epochs*** to train (or else it would default to just 1, which would definitely not be enough to converge to a good solution). ***We also pass a validation set*** (x_train_valid, y_train_valid; this is optional): ***Keras will measure the loss and the extra metrics on this set at the end of each epoch,*** which is very useful to see how well the model really performs: if the performance on the training set is much better than on the validation set, your model is probably overfitting the training set (or there is a bug, such as a data mismatch between the training set and the validation set).

---
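
It helps to know how many weight updates a call to fit() performs. The sketch below works this out, assuming the default batch size of 32 (the notebook does not pass batch_size); the variable names are illustrative.

```python
import math

n_train = 60000 - 5000  # training instances left after the validation split
batch_size = 32         # fit() default when batch_size is not given
epochs = 30

# The last batch of an epoch is smaller, so round up.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 1719
print(total_updates)    # 51570
```

The validation set is evaluated once at the end of each epoch and is never used for a weight update.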

In [34]:
model.fit(x_train,y_train,epochs=30,validation_data=(x_train_valid,y_train_valid))

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


<keras.callbacks.History at 0x7b0648509840>