# Neural Network

In [1]:
from keras.datasets import mnist

In [17]:
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
print(f"Shape of training images is {train_images.shape} \nShape of testing Images is {test_images.shape}")

Shape of training images is (60000, 28, 28) 
Shape of testing Images is (10000, 28, 28)


There are 60,000 training images, one per row along the first axis. Each one is a 28x28 2D matrix of pixel values with no colour channel, so it is a grayscale image of shape 28x28.
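To make the shape reading concrete, here is a minimal sketch using a small stand-in array. The values are random and the name `fake_images` is hypothetical; it is not the real MNIST data, only an array with the same layout and (as far as I know) the same `uint8` dtype.

```python
import numpy as np

# Stand-in for train_images: 5 fake grayscale images instead of 60000.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

print(fake_images.ndim)      # 3 axes: (image index, row, column)
print(fake_images.shape)     # (5, 28, 28)
print(fake_images[0].shape)  # (28, 28): one image is a 2D grid of pixels
print(fake_images.dtype)     # uint8: each pixel is an integer from 0 to 255
```

A colour image would need an extra axis for the channels, which is why a plain 28x28 grid reads as grayscale.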

In [None]:
print(train_labels[0:5])

[5 0 4 1 9]


In MNIST, each label is simply the digit shown in the image, stored as an integer from 0 to 9.

In [12]:
from keras import models 
from keras import layers 

network = models.Sequential()
network.add(layers.Dense(512, activation="relu", input_shape = (28*28,)))
network.add(layers.Dense(10, activation="softmax"))

# Small note: input_shape expects a tuple. (28*28) is just the integer 784 wrapped in parentheses and can throw an error,
# while the trailing comma in (28*28,) makes it a one-element tuple, (784,).
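The difference between the two spellings is plain Python and can be checked without Keras. A quick sketch (the variable names are only for illustration):

```python
not_a_tuple = (28 * 28)   # parentheses only group the expression
a_tuple = (28 * 28,)      # the trailing comma makes a one-element tuple

print(type(not_a_tuple).__name__, not_a_tuple)  # int 784
print(type(a_tuple).__name__, a_tuple)          # tuple (784,)
```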

* Layers are the fundamental building blocks of data processing. Think of them as filters for the data.
* Data goes in one form and comes out in another, hopefully as a more meaningful and useful representation for the problem at hand.
* Simple layers are chained one after another, implementing a form of progressive **distillation**.
* A model is like a sieve for data processing, made up of a succession of increasingly refined data filters: the layers.
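To see what "data goes in one form and comes out in another" means for the two layers above, here is a rough NumPy sketch of the forward pass. It assumes a dense layer computes `activation(x @ W + b)`; the weights are random stand-ins, and `relu`, `softmax`, `w1`, `w2` are hypothetical names, not how Keras stores them.

```python
import numpy as np

def relu(x):
    # Keep positive values, replace negatives with 0.
    return np.maximum(x, 0.0)

def softmax(x):
    # Subtract the row maximum first for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
batch = rng.random((4, 784))  # 4 fake flattened images

# Randomly initialised weights and zero biases (untrained, illustration only).
w1, b1 = rng.normal(0, 0.05, size=(784, 512)), np.zeros(512)
w2, b2 = rng.normal(0, 0.05, size=(512, 10)), np.zeros(10)

hidden = relu(batch @ w1 + b1)     # first layer: 784 values -> 512 values
probs = softmax(hidden @ w2 + b2)  # second layer: 512 values -> 10 scores

print(hidden.shape)  # (4, 512)
print(probs.shape)   # (4, 10)
```

Each row of `probs` sums to 1, so it can be read as one probability per digit class.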

* **Loss function** - Measures the network's performance on the training data so that it can steer itself in the right direction.
* **Optimizer** - Updates the weights according to the loss.
* **Metrics to monitor** - Here only accuracy, the fraction of images classified correctly.
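As a rough illustration of the loss and the metric (not the optimizer), the sketch below computes categorical cross-entropy and accuracy by hand on three made-up predictions. The helper functions and the toy numbers are assumptions for illustration, not the Keras implementations.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over samples of -sum(true * log(predicted)); clip to avoid log(0).
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

def accuracy(y_true, y_pred):
    # Fraction of samples where the most probable class is the true class.
    return float(np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1)))

# Three samples, three classes, one-hot labels (toy data).
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
y_pred = np.array([[0.7, 0.2, 0.1],   # correct and fairly confident
                   [0.1, 0.8, 0.1],   # correct and confident
                   [0.5, 0.3, 0.2]])  # wrong: true class only gets 0.2

loss = categorical_crossentropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)
print(round(loss, 2))  # about 0.73
print(round(acc, 2))   # 0.67, two of three are right
```

The wrong third prediction contributes most of the loss, which is the signal the optimizer uses to adjust the weights.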

In [13]:
#Compilation

network.compile(optimizer="rmsprop",
                loss="categorical_crossentropy",
                metrics=["accuracy"])

In [18]:
#Reshape and change data type

train_images = train_images.reshape(60000, 28*28)
train_images = train_images.astype("float32")/255.0 #Normalize it too

test_images = test_images.reshape(10000, 28*28)
test_images = test_images.astype("float32")/255.0 

print(f"Shape of training images is {train_images.shape} \nShape of testing Images is {test_images.shape}")

Shape of training images is (60000, 784) 
Shape of testing Images is (10000, 784)


The reshape does nothing fancy: it just unpacks each 28x28 pixel image into a flat row of 784 pixels. The result is 60000 rows, each with 784 columns, one per pixel.
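A small sketch of the same unpacking on two random stand-in images (not MNIST data; `images`, `flat` and `scaled` are hypothetical names). It also shows where a given pixel ends up, assuming NumPy's default row-major order.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 28, 28), dtype=np.uint8)

flat = images.reshape(len(images), 28 * 28)  # each image becomes one row
scaled = flat.astype("float32") / 255.0      # pixel values now lie in [0, 1]

# Rows are laid end to end, so pixel (r, c) lands at index r * 28 + c.
r, c = 3, 5
print(flat[0, r * 28 + c] == images[0, r, c])    # True

print(flat.shape)                                # (2, 784)
print(scaled.min() >= 0.0, scaled.max() <= 1.0)  # True True
```

No pixel values are changed by the reshape itself; only the division by 255 rescales them.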

In [19]:
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

# Encode the labels with Keras' built-in categorical (one-hot) encoder
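For intuition, one-hot encoding can be imitated in NumPy by picking rows of an identity matrix. This is only a sketch of the idea applied to the five labels printed earlier, not Keras' own implementation.

```python
import numpy as np

labels = np.array([5, 0, 4, 1, 9])  # the first five training labels

# Row i of the 10x10 identity matrix is the one-hot vector for digit i.
one_hot = np.eye(10, dtype="float32")[labels]

print(one_hot.shape)           # (5, 10)
print(one_hot[0])              # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
print(one_hot.argmax(axis=1))  # [5 0 4 1 9]
```

The 10 columns line up with the 10 softmax outputs of the network, which is what `categorical_crossentropy` compares against.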

In [None]:
network.fit(train_images,
            train_labels,
            epochs=3,
            batch_size=8)

# fit is Keras' built-in method for training a model

Epoch 1/3


7500/7500 ━━━━━━━━━━━━━━━━━━━━ 12s 1ms/step - accuracy: 0.9093 - loss: 0.3083
Epoch 2/3
7500/7500 ━━━━━━━━━━━━━━━━━━━━ 12s 2ms/step - accuracy: 0.9726 - loss: 0.1037
Epoch 3/3
7500/7500 ━━━━━━━━━━━━━━━━━━━━ 10s 1ms/step - accuracy: 0.9800 - loss: 0.0810


<keras.src.callbacks.history.History at 0x778a8c8d0c10>
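The 7500 in the progress bar is the number of batches per epoch. It follows from the dataset size and the `batch_size` passed to `fit`, as this small calculation shows (assuming one weight update per batch):

```python
import math

num_samples = 60000  # training images
batch_size = 8       # as passed to fit
epochs = 3

steps_per_epoch = math.ceil(num_samples / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 7500
print(total_updates)    # 22500
```

A larger batch size would mean fewer, larger steps per epoch.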

In [23]:
test_loss, test_accuracy = network.evaluate(test_images, test_labels)
print(f"Test accuracy = {test_accuracy} \nTest Loss {test_loss}")

313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.9730 - loss: 0.1361
Test accuracy = 0.9763000011444092 
Test Loss 0.11742827296257019


Test accuracy (about 97.6%) is slightly lower than the training accuracy (98.0% in the last epoch). A small gap like this is normal. A large gap would point to overfitting, where the model does worse on new data than on the data it was trained on.
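To put the numbers in perspective, the sketch below turns the printed test accuracy into a count of images and computes the gap to the training accuracy. The values are copied from the outputs above. The training figure is the one Keras reports for the last epoch, which as far as I understand is averaged over that epoch, so the gap is only approximate.

```python
test_accuracy = 0.9763000011444092  # printed by the evaluate cell
final_train_accuracy = 0.9800       # reported for epoch 3
num_test = 10000                    # test images

correct = round(test_accuracy * num_test)
wrong = num_test - correct
gap = final_train_accuracy - test_accuracy

print(correct, wrong)  # 9763 237
print(f"{gap:.4f}")    # 0.0037
```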

# Data Representation