# **Step 1: Import Libraries**

> tensorflow is a popular library for building and training deep learning models.
* layers and models are modules from tf.keras, TensorFlow's high-level API for building and training models easily.
* numpy is a library for numerical operations, commonly used in machine learning.

In [None]:
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np

# **Step 2: Load the Dataset**

> The MNIST dataset is divided into a training set (60,000 images) and a test set (10,000 images).
* x_train and x_test contain the images, while y_train and y_test contain the corresponding labels (0–9).

In [None]:
(x_train,y_train), (x_test,y_test) = tf.keras.datasets.mnist.load_data()

# **Step 3: Preprocess the Data**

> Neural networks perform better when the input data is scaled.
* Here, we normalize pixel values (from 0–255) to the range 0–1 by dividing by 255.0.
* Keeping every input on the same small scale helps gradient-based training converge faster and more reliably.

In [None]:
x_train, x_test = x_train/255.0 , x_test/255.0
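
To see what this division does to the data, here is a minimal sketch that uses a small random array in place of the real images. The `fake_images` array is a stand-in made up for illustration; it only mimics the shape and integer type of MNIST images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a few MNIST images: 8-bit integers in 0..255.
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Dividing by the float 255.0 converts to floating point and rescales to [0, 1].
scaled = fake_images / 255.0

print(fake_images.dtype, "->", scaled.dtype)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```

Note that the division also changes the array from an integer type to a floating-point type, which is what the dense layers work with.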

# **Step 4: Define the Model**

> We create a Sequential model, which allows us to stack layers linearly (one after another).
* Flatten layer: This layer reshapes each 28x28 image into a 784-element vector (28 × 28 = 784), so it can be fed into the dense layers.
* First Dense layer: This fully connected (dense) layer has 128 neurons and uses the ReLU activation function. ReLU (Rectified Linear Unit) introduces non-linearity, allowing the network to learn more complex patterns.
* Output Dense layer: The output layer has 10 neurons (one for each digit, 0–9) and uses the softmax activation function. Softmax converts the outputs into probabilities that sum to 1, indicating the model’s confidence in each class.

In [None]:
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),  # Flatten 2D images to 1D
    layers.Dense(128, activation='relu'),  # Hidden layer with ReLU activation
    layers.Dense(10, activation='softmax')  # Output layer with softmax for classification
])
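
The layer descriptions above can be sketched as a plain NumPy forward pass. This is only an illustration of the arithmetic, not how Keras implements it: the weights are random and untrained, and the names `w1`, `b1`, `w2`, `b2`, and `forward` are assumptions made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in batch of 2 "images" with values already scaled to [0, 1].
batch = rng.random((2, 28, 28))

# Random, untrained parameters with the same shapes as the two Dense layers.
w1 = rng.normal(scale=0.05, size=(784, 128))
b1 = np.zeros(128)
w2 = rng.normal(scale=0.05, size=(128, 10))
b2 = np.zeros(10)

def forward(images):
    flat = images.reshape(images.shape[0], -1)   # Flatten: (n, 28, 28) -> (n, 784)
    hidden = np.maximum(0.0, flat @ w1 + b1)     # Dense(128) followed by ReLU
    logits = hidden @ w2 + b2                    # Dense(10) before the activation
    logits = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)  # softmax: each row sums to 1

probs = forward(batch)
n_params = w1.size + b1.size + w2.size + b2.size

print(probs.shape)        # (2, 10)
print(probs.sum(axis=1))  # each entry is 1 up to rounding
print(n_params)           # 101770
```

The parameter count follows from the shapes: 784 × 128 + 128 = 100,480 for the hidden layer and 128 × 10 + 10 = 1,290 for the output layer.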

# **Step 5: Compile the model**

> **Compilation:** Before training, we compile the model. This configures it for training by specifying the optimizer, the loss function, and the evaluation metrics.

* Optimizer: adam (Adaptive Moment Estimation) is an optimization algorithm that adapts the step size of each weight individually, using running averages of the gradients and of their squares. In practice this often converges faster than gradient descent with a single fixed learning rate.

* Loss Function: sparse_categorical_crossentropy is used for multi-class classification when labels are provided as integers rather than one-hot encoded arrays.

* Metric: accuracy reports the fraction of images whose predicted digit matches the true label, during both training and evaluation.
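
To make the role of integer labels concrete, the sketch below computes the cross-entropy loss by hand in NumPy, once from integer labels and once from one-hot labels, and shows that the two agree. The function name `sparse_cce_sketch` and the toy probabilities are assumptions for illustration; Keras's own implementation includes extra numerical safeguards.

```python
import numpy as np

def sparse_cce_sketch(probs, labels):
    """Mean of -log(probability the model assigned to the true class)."""
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(true_class_probs)))

# Toy predictions for 2 samples over 3 classes (each row sums to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])

int_labels = np.array([0, 1])           # labels as integers
one_hot_labels = np.eye(3)[int_labels]  # the same labels, one-hot encoded

sparse_loss = sparse_cce_sketch(probs, int_labels)
dense_loss = float(np.mean(-np.sum(one_hot_labels * np.log(probs), axis=1)))

print(sparse_loss, dense_loss)  # the two values match
```

The integer form simply picks out the predicted probability of the true class, so there is no need to build one-hot arrays for MNIST's 0–9 labels.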

In [None]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
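
For intuition about what the adam optimizer does with each gradient, here is a minimal sketch of one textbook Adam update in NumPy. The function `adam_step`, the toy weights, and the hyperparameter values are assumptions for illustration; the exact arrangement inside Keras may differ in small details.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One textbook Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad       # running average of gradients
    v = beta2 * v + (1 - beta2) * grad**2    # running average of squared gradients
    m_hat = m / (1 - beta1**t)               # bias correction for the zero start
    v_hat = v / (1 - beta2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w = np.array([1.0, -2.0])
grad = np.array([0.5, -100.0])  # very different gradient magnitudes

w_new, m, v = adam_step(w, grad, m=np.zeros(2), v=np.zeros(2), t=1)
print(w_new - w)  # both weights move by roughly lr, despite the gradient sizes
```

Because each weight's step is divided by the square root of its own squared-gradient average, a weight with a huge gradient and one with a small gradient take steps of similar size.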

# **Step 6: Train the model**

> This command initiates the training process, where the ***model learns from the data by adjusting weights to minimize the loss.***

* x_train and y_train are the input data and labels.

* epochs=5: This parameter specifies that the entire training set will be passed through the model 5 times.

In [None]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 8s 4ms/step - accuracy: 0.8810 - loss: 0.4271
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.9658 - loss: 0.1172
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 7s 4ms/step - accuracy: 0.9781 - loss: 0.0748
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 8s 3ms/step - accuracy: 0.9835 - loss: 0.0568
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 7s 4ms/step - accuracy: 0.9872 - loss: 0.0434


<keras.src.callbacks.history.History at 0x7cbbab532ef0>
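
The `1875` in the training output is the number of batches per epoch, not the number of images. A short sketch of the arithmetic, assuming the default batch size of 32 that `fit` uses when `batch_size` is not passed:

```python
import math

num_train_images = 60_000
batch_size = 32  # assumed: the Keras default when fit() gets no batch_size

# Each step processes one batch, so an epoch needs this many steps.
steps_per_epoch = math.ceil(num_train_images / batch_size)
print(steps_per_epoch)  # 1875
```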

# **Step 7: Evaluate the model**

This step tests the model’s performance on data it has never seen before (the test data) to get an unbiased estimate of its accuracy.

* x_test and y_test are the input data and labels for the test set.
* verbose=2: Controls the output display level. For evaluate, setting it to 2 prints a single summary line with the test results instead of a progress bar.
* test_loss and test_acc store the model's performance metrics on the test set.

In [None]:
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)

313/313 - 0s - 1ms/step - accuracy: 0.9770 - loss: 0.0781
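
As a sketch of how an accuracy figure like this is obtained, the example below takes the most probable class for each sample and compares it with the true label. The toy `probs` and `labels` arrays are made up for illustration. The last lines show where `313` comes from, again assuming the default batch size of 32.

```python
import math
import numpy as np

# Toy predicted probabilities for 4 samples over 2 classes, plus true labels.
probs = np.array([[0.1, 0.9],
                  [0.8, 0.2],
                  [0.3, 0.7],
                  [0.6, 0.4]])
labels = np.array([1, 0, 0, 0])

predicted = probs.argmax(axis=1)               # most probable class per sample
accuracy = float((predicted == labels).mean()) # fraction of correct predictions
print(predicted, accuracy)                     # [1 0 1 0] 0.75

# 10,000 test images in batches of 32 (assumed default) need 313 steps.
eval_steps = math.ceil(10_000 / 32)
print(eval_steps)  # 313
```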


In [None]:
print('\nTest accuracy:', test_acc)


Test accuracy: 0.9769999980926514
