Using TensorFlow to Train a Simple Model

Use Case: Handwritten Digit Recognition (Mini MNIST Classifier)
📌 Objective: Build and test a basic neural network that can recognize handwritten digits (0–9) from images using a small subset of the MNIST dataset.

A classifier this small is useful for:

Learning how neural networks work

Practicing model design and training

Quick prototyping with small datasets

📝 1. Disable GPU
Setting CUDA_VISIBLE_DEVICES to "-1" hides all CUDA GPUs, so TensorFlow falls back to the CPU.
Why? For environments where a GPU isn't supported, or to test CPU-only performance.

In [26]:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
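
It is generally recommended to set this variable before TensorFlow is imported, because the setting may not take effect once the GPU runtime has been initialized. A minimal sketch of a guard for that ordering follows; `hide_gpus` is a hypothetical helper name used for illustration, not part of TensorFlow.

```python
import os
import sys

def hide_gpus():
    """Hide all CUDA devices and report whether the call came early enough.

    Hypothetical helper for illustration only. Returns True if TensorFlow
    had not been imported yet when the variable was set.
    """
    set_before_import = "tensorflow" not in sys.modules
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    return set_before_import

if hide_gpus():
    print("GPUs hidden before TensorFlow was imported")
else:
    print("TensorFlow was already imported; restarting the kernel is the safer option")
```

In a notebook, this means the cell that sets the variable should run first after a kernel restart.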



2. Import Libraries
We need TensorFlow and Keras to build and train neural networks.

In [27]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
print(tf.__version__)



2.18.1


3. Load and Preprocess Data

MNIST contains 70,000 grayscale digit images (28x28 pixels): 60,000 for training and 10,000 for testing.

Why scale images? Pixel values range from 0 to 255. Dividing by 255.0 rescales them to the range 0–1, which generally helps gradient-based training converge.

Why a small subset? Using only the first 1,000 training and 200 test samples keeps training fast and makes quick experimentation easy.

In [28]:

(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
X_train = X_train[:1000] / 255.0
y_train = y_train[:1000]
X_test = X_test[:200] / 255.0
y_test = y_test[:200]
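
To see what the division by 255.0 does, here is a plain-Python sketch on a tiny stand-in image. The pixel values are made up for illustration and are not taken from MNIST, and `scale_pixels` is a hypothetical helper. In the cell above, NumPy applies the same division to every element of the array at once.

```python
def scale_pixels(image):
    """Rescale 8-bit pixel intensities (0-255) to floats in [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

# A tiny 2x3 stand-in "image" (illustrative values, not MNIST data).
toy_image = [
    [0, 64, 128],
    [191, 254, 255],
]

scaled = scale_pixels(toy_image)
for row in scaled:
    print([round(value, 3) for value in row])
```

Black (0) maps to 0.0 and white (255) maps to 1.0, with everything else in between.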

4. Define the Model
🧱 This builds a basic neural network:

Flatten: Converts 2D image to 1D vector (28x28 → 784)

Dense(64, relu): Hidden layer with 64 neurons using ReLU

Dense(10, softmax): Output layer with 10 classes (digits 0–9)
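
To make the two activations concrete, here is a plain-Python sketch of what ReLU and softmax compute on a single vector. It illustrates the math only and is not how Keras implements the layers; the input values are made up.

```python
import math

def relu(values):
    """Replace negative values with 0 and keep the rest unchanged."""
    return [max(0.0, v) for v in values]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

hidden = relu([-1.0, 0.0, 2.5])
probabilities = softmax([1.0, 2.0, 3.0])

print(hidden)
print([round(p, 3) for p in probabilities])
print("predicted class:", probabilities.index(max(probabilities)))
```

The class with the largest softmax probability is the model's prediction, which is why the output layer has one unit per digit.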

In [29]:
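model = keras.Sequential([
    keras.Input(shape=(28, 28)),              # explicit input; Keras 3 warns about input_shape= on layers
    layers.Flatten(),                         # 28x28 image -> 784-element vector
    layers.Dense(64, activation='relu'),      # hidden layer
    layers.Dense(10, activation='softmax')    # one probability per digit class
])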
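
As a sanity check on the architecture, the number of trainable parameters can be worked out by hand: a Dense layer has one weight per input-output pair plus one bias per unit, and Flatten has none. The helper name `dense_params` below is hypothetical. Calling `model.summary()` should report the same total.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

flattened_size = 28 * 28                          # Flatten: 28x28 -> 784, no parameters
hidden_params = dense_params(flattened_size, 64)  # Dense(64, relu)
output_params = dense_params(64, 10)              # Dense(10, softmax)
total_params = hidden_params + output_params

print(f"hidden layer: {hidden_params:,}")   # 50,240
print(f"output layer: {output_params:,}")   # 650
print(f"total:        {total_params:,}")    # 50,890
```

Almost all of the parameters sit in the first Dense layer, because it connects every one of the 784 pixels to every hidden neuron.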


5. Compile the Model
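🛠️ Prepares the model for training.

Adam: An adaptive-learning-rate optimizer that usually works well with its default settings

Loss: sparse_categorical_crossentropy measures prediction error for multi-class classification when the labels are integers (0–9) rather than one-hot vectors

Accuracy: The fraction of images classified correctly, reported during training and evaluation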



In [30]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
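
To give the loss values some meaning, here is a plain-Python sketch of the quantity sparse categorical cross-entropy measures: for each sample, the negative log of the probability the model assigned to the true class, averaged over the batch. The predictions are made up, `sparse_cce` and `mean_loss` are hypothetical helper names, and the sketch leaves out the guard against log(0) that a production implementation needs.

```python
import math

def sparse_cce(true_label, probabilities):
    """Loss for one sample: negative log of the probability of the true class."""
    return -math.log(probabilities[true_label])

def mean_loss(labels, batch_probabilities):
    """Average the per-sample losses over a batch."""
    losses = [sparse_cce(y, p) for y, p in zip(labels, batch_probabilities)]
    return sum(losses) / len(losses)

# Made-up predictions over the 10 digit classes.
uniform = [0.1] * 10          # the model has no idea
confident = [0.01] * 10
confident[3] = 0.91           # the model strongly favours digit 3

print(round(sparse_cce(3, uniform), 4))     # ln(10), about 2.3026
print(round(sparse_cce(3, confident), 4))   # right and confident, about 0.0943
print(round(sparse_cce(7, confident), 4))   # wrong and confident, about 4.6052
print(round(mean_loss([3, 3], [uniform, confident]), 4))
```

A uniform guess over 10 classes gives a loss of about 2.30, which is a handy baseline: any loss below that means the model has learned something.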


6. Train the Model
🏋️ Feeds the training data to the model once (1 epoch).

One epoch = The model goes through the entire training dataset once.

More epochs = More passes over the data, which usually improves accuracy until the model starts to overfit.

In [31]:
model.fit(X_train, y_train, epochs=1)

32/32 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.3588 - loss: 1.9980


<keras.src.callbacks.history.History at 0x219016d1e20>
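
The 32/32 in the training output counts batches, not samples. `fit` was called without a `batch_size`, so Keras uses its default of 32 for array inputs. The arithmetic can be sketched as follows; `steps_per_epoch` is a hypothetical helper name.

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    """Number of batches needed to see every sample once (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)

train_steps = steps_per_epoch(1000)   # 1,000 training samples
test_steps = steps_per_epoch(200)     # 200 test samples

print(train_steps)                    # 32
print(test_steps)                     # 7

# The number of weight updates grows linearly with the number of epochs.
for epochs in (1, 5, 10):
    print(epochs, "epoch(s) ->", epochs * train_steps, "updates")
```

With a single epoch the model gets only 32 weight updates, which goes a long way toward explaining the modest training accuracy. The same arithmetic gives the 7/7 shown during evaluation.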

7. Evaluate the Model
📊 Tests how well the model performs on unseen data.

In [32]:
model.evaluate(X_test, y_test)


7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - accuracy: 0.6917 - loss: 1.1680


[1.1907941102981567, 0.6800000071525574]
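
`evaluate` returns the loss first, followed by the metrics in the order they were passed to `compile`, so the list above is [loss, accuracy]. A small sketch of reading those two numbers, using the values copied from the output above:

```python
# Values copied from the evaluate() output above: [loss, accuracy].
results = [1.1907941102981567, 0.6800000071525574]
test_loss, test_accuracy = results

n_test = 200                                  # size of the test subset used here
n_correct = round(test_accuracy * n_test)     # accuracy is correct / total

print(f"loss: {test_loss:.2f}")               # 1.19
print(f"accuracy: {test_accuracy:.0%}")       # 68%
print(f"correct: {n_correct} of {n_test}")    # 136 of 200
```

In the notebook itself, the same unpacking can be applied directly to the return value of `model.evaluate(X_test, y_test)`.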

My simple model reaches about 68% accuracy on the test data after just 1 training epoch on a small dataset, which is a decent start!

The loss value (about 1.19) suggests there's room for improvement. You can try:

Training for more epochs (e.g., 5 or 10)

Using more training data (not just 1,000 samples)

Adding more layers or neurons