What is TensorFlow?
---
TensorFlow is an open-source machine learning framework developed by Google. It provides a comprehensive ecosystem of tools, libraries, and community resources to develop, train, and deploy machine learning and deep learning models. TensorFlow supports a wide range of tasks, including image recognition, natural language processing (NLP), and time-series analysis.

What is TensorFlow used for?
---
Deep Learning: Build and train neural networks for tasks like image classification, object detection, or speech recognition.

Machine Learning: Implement traditional algorithms such as linear regression, clustering, and decision trees.

Production Deployment: Optimize and serve models efficiently on mobile, edge devices, or cloud platforms.

Research: TensorFlow is widely used in AI research due to its flexibility and scalability.

Key Features:
---
TensorFlow.js: For running machine learning in the browser.

TensorFlow Lite: For deploying on mobile and embedded devices.

Keras Integration: Offers a user-friendly API for building and training models.

TensorBoard: Visualize training progress and performance metrics.

Now I am using it to train a handwritten digit recognition model on the MNIST database
---

In [1]:
import tensorflow as tf
from keras import layers, models  # type: ignore  # the type checker flags this import, but the code still runs
from keras.datasets import mnist # type: ignore
from keras.utils import to_categorical # type: ignore
from keras.models import load_model  # type: ignore
from PIL import Image
import numpy as np
import os
from PIL import ImageEnhance


Load the MNIST dataset and split it into a training set and a validation set
---

In [2]:
# Load the MNIST dataset and preprocess the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Reshape to (28, 28, 1) because Conv2D expects each sample as a 3D tensor: (height, width, channels)
x_train = x_train.reshape((x_train.shape[0], 28, 28, 1)).astype('float32') / 255
x_test = x_test.reshape((x_test.shape[0], 28, 28, 1)).astype('float32') / 255

# Convert labels to one-hot encoding
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
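
To make the preprocessing concrete, here is a minimal NumPy-only sketch of the same two steps: scaling pixels to [0, 1] and one-hot encoding the labels. The helper `one_hot` and the sample values are illustrative assumptions, not part of the notebook; it is meant to mirror what `to_categorical` does for integer labels, not to replace it.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Hypothetical helper: turn integer labels into one-hot rows."""
    return np.eye(num_classes, dtype="float32")[labels]

# Illustrative inputs (not taken from the dataset)
pixels = np.array([[0, 128, 255]], dtype="uint8")
labels = np.array([5, 0, 4])

scaled = pixels.astype("float32") / 255   # values now lie in [0, 1]
encoded = one_hot(labels)                 # shape (3, 10), a single 1 per row

print(scaled)
print(encoded)
```

Taking `np.argmax(encoded, axis=1)` recovers the original integer labels, which is the same trick the prediction loop later uses on the softmax output.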


Build the CNN model with 3 convolutional layers and max pooling layers in between
---

CNNs work well for image processing because convolutions exploit the spatial structure of an image, stacked layers learn hierarchical features (edges first, then strokes, then whole shapes), and weight sharing keeps the parameter count far below that of a fully connected network on the same input. This makes them the backbone of modern applications like object detection, facial recognition, and image segmentation.
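
As a rough illustration of what a single convolutional filter computes, the sketch below slides one 3x3 kernel over a 28x28 image without padding. The function name `conv2d_valid` and the all-ones inputs are assumptions for illustration only; Keras' `Conv2D` additionally handles many filters, channels, a bias term, and an activation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive single-channel, single-filter convolution without padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype="float32")
    for i in range(out_h):
        for j in range(out_w):
            # The same 9 weights are reused at every position (weight sharing)
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.ones((28, 28), dtype="float32")   # placeholder image
kernel = np.ones((3, 3), dtype="float32")    # placeholder filter
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)
```

The output is 26x26 because a 3x3 window fits into 28 positions only 26 times per axis, which matches the first row of the model summary.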

In [3]:
# Initialize the CNN model
model = models.Sequential()

# Add the first convolutional layer with 32 filters and ReLU activation
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))

# Add the first max pooling layer
model.add(layers.MaxPooling2D((2, 2)))

# Add the second convolutional layer with 64 filters
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Add the second max pooling layer
model.add(layers.MaxPooling2D((2, 2)))

# Add the third convolutional layer with 64 filters
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Flatten the feature maps to prepare for the fully connected layers
model.add(layers.Flatten())

# Add a fully connected layer with 64 units and ReLU activation
model.add(layers.Dense(64, activation='relu'))

# Add the output layer with 10 units (one for each digit) and softmax activation
model.add(layers.Dense(10, activation='softmax'))

# Display the model architecture
model.summary()


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 26, 26, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 13, 13, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 11, 11, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 5, 5, 64)         0         
 2D)                                                             
                                                                 
 conv2d_2 (Conv2D)           (None, 3, 3, 64)          36928     
                                                                 
 flatten (Flatten)           (None, 576)               0
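
The shapes and parameter counts in the summary can be checked by hand. The sketch below reproduces the arithmetic for the layers defined in the previous cell; the helper functions are hypothetical, and the names `dense` and `dense_1` are assumed Keras default names for the two `Dense` layers, which the summary output above does not show.

```python
def conv_output(size, kernel=3):
    """Spatial size after a convolution without padding, stride 1."""
    return size - kernel + 1

def pool_output(size, pool=2):
    """Spatial size after max pooling (leftover rows/columns are dropped)."""
    return size // pool

def conv_params(kernel, in_channels, filters):
    """Weights per filter times number of filters, plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair, plus one bias per unit."""
    return inputs * units + units

size = 28
size = conv_output(size)   # 26
size = pool_output(size)   # 13
size = conv_output(size)   # 11
size = pool_output(size)   # 5
size = conv_output(size)   # 3
flat = size * size * 64    # 576 values after Flatten

params = {
    "conv2d": conv_params(3, 1, 32),
    "conv2d_1": conv_params(3, 32, 64),
    "conv2d_2": conv_params(3, 64, 64),
    "dense": dense_params(flat, 64),
    "dense_1": dense_params(64, 10),
}
total_params = sum(params.values())
print(params, total_params)
```

The three convolutional entries agree with the `Param #` column above (320, 18496, 36928), and the flattened size agrees with the `(None, 576)` output of `flatten`.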

Compile the model
---
The model is compiled with the Adam optimizer (learning rate 0.0001) and categorical cross-entropy as the loss function.

In [4]:
# Compile the model with Adam optimizer, categorical crossentropy loss, and accuracy metric
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
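
To show what the loss measures, here is a small NumPy sketch of categorical cross-entropy for a single sample with three classes. The function and the probability values are illustrative assumptions; Keras' own implementation also averages over the batch and differs in its numerical details.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])           # the true class is index 1
good_pred = np.array([[0.1, 0.8, 0.1]])        # most probability on the true class
bad_pred = np.array([[0.8, 0.1, 0.1]])         # most probability on a wrong class

loss_good = categorical_crossentropy(y_true, good_pred)[0]
loss_bad = categorical_crossentropy(y_true, bad_pred)[0]
print(loss_good, loss_bad)
```

Because the target is one-hot, only the probability assigned to the true class contributes, so the loss is `-log(0.8)` in the first case and `-log(0.1)` in the second.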


Start training
---

In [5]:
# Train the model on the training data; the MNIST test set doubles as the validation set
model.fit(x_train, y_train, epochs=10, batch_size=64, validation_data=(x_test, y_test))


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1e3d6a39bb0>

Evaluation of the model's performance
---

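This model performs well on the training and validation sets, but poorly on my own handwriting.

This could be due to overfitting, although the high accuracy on images the model never trained on points at least as much toward a difference between my pictures and the MNIST images.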

In [6]:
# Evaluate the model's performance on the test dataset
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc}')


Test accuracy: 0.9868999719619751
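
One way to check the overfitting suspicion is to compare the final training accuracy with the final validation accuracy. The sketch below assumes the return value of `model.fit` had been stored in a variable (the notebook does not do this) and that its `history` dictionary contains the keys `accuracy` and `val_accuracy`. The numbers are placeholders for illustration and are not results from the run above.

```python
def generalization_gap(history_dict):
    """Final training accuracy minus final validation accuracy."""
    train_acc = history_dict["accuracy"][-1]
    val_acc = history_dict["val_accuracy"][-1]
    return train_acc - val_acc

# Placeholder values, NOT taken from the training run in this notebook
example_history = {
    "accuracy": [0.90, 0.97, 0.99],
    "val_accuracy": [0.92, 0.97, 0.985],
}

gap = generalization_gap(example_history)
print(f"Generalization gap: {gap:.3f}")
```

A gap near zero would suggest the model is not overfitting within MNIST, in which case the poor results on new handwriting would more likely come from how those pictures differ from the MNIST images.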


Save the model
---

In [7]:
# Save the trained model to a file
model.save(r"Models_Trained/cnn_mnist_CLR.keras")


Now we are testing with my own handwriting
---

In [8]:
# Load the pre-trained model
model = load_model(r"Models_Trained/cnn_mnist_CLR.keras")

In [9]:
# Set the directory where the images are stored
image_dir = r"Pics"

In [10]:
# Get a list of all image file paths in the directory (assuming .png format) and sort them
image_files = sorted([f for f in os.listdir(image_dir) if f.endswith('.png')])

It is nothing but failure
---
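Probably because of the overfitting issue, or because the test pictures are not normalized the way MNIST images are. MNIST digits are light strokes on a dark background, size-normalized to fit a 20x20 pixel box and centered in the 28x28 image, while the loop below only converts to grayscale, resizes, and scales to [0, 1].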


In [11]:
# Iterate through each image file in sorted order
for image_file in image_files:
    # Load the image and convert to grayscale
    image = Image.open(os.path.join(image_dir, image_file)).convert('L')

    # Resize image (28x28) and choose resampling filter (NEAREST)
    image = image.resize((28, 28), Image.NEAREST)

    # Convert to Numpy Array and Normalize to [0, 1]
    image = np.array(image).astype('float32') / 255

    # Reshape to (1, 28, 28, 1) to match the input shape expected by the CNN model
    image = image.reshape(1, 28, 28, 1)

    # Make prediction
    prediction = model.predict(image)
    predicted_digit = np.argmax(prediction)

    # Output the filename and prediction
    print(f'{image_file}: The model predicts this digit is {predicted_digit}.')

0.png: The model predicts this digit is 8.
1.png: The model predicts this digit is 0.
2.png: The model predicts this digit is 2.
3.png: The model predicts this digit is 8.
4.png: The model predicts this digit is 8.
5.png: The model predicts this digit is 8.
6.png: The model predicts this digit is 8.
7.png: The model predicts this digit is 8.
8.png: The model predicts this digit is 2.
9.png: The model predicts this digit is 8.
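
If the pictures are dark ink on a light background, one thing worth trying is to bring them closer to the MNIST layout before predicting. The sketch below is a NumPy-only approximation under several assumptions: the function name `to_mnist_style`, the brightness test for inversion, and the 0.2 stroke threshold are all hypothetical choices, and it centers the digit by its bounding box, whereas MNIST centers by center of mass.

```python
import numpy as np

def to_mnist_style(gray, box=20, canvas=28, threshold=0.2):
    """gray: 2D uint8 array (0 = black, 255 = white). Returns shape (1, 28, 28, 1)."""
    img = gray.astype("float32") / 255.0

    # MNIST has light strokes on a dark background; invert mostly-light images
    if img.mean() > 0.5:
        img = 1.0 - img

    # Crop to the bounding box of the strokes
    mask = img > threshold
    if not mask.any():
        return np.zeros((1, canvas, canvas, 1), dtype="float32")
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    img = img[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

    # Pad to a square so the aspect ratio is preserved
    h, w = img.shape
    side = max(h, w)
    square = np.zeros((side, side), dtype="float32")
    top, left = (side - h) // 2, (side - w) // 2
    square[top:top + h, left:left + w] = img

    # Nearest-neighbour resize to box x box, then center on the canvas
    idx = (np.arange(box) * side / box).astype(int)
    small = square[idx][:, idx]
    out = np.zeros((canvas, canvas), dtype="float32")
    off = (canvas - box) // 2
    out[off:off + box, off:off + box] = small
    return out.reshape(1, canvas, canvas, 1)

# Synthetic check: a black 20x20 square drawn off-center on a white 100x100 page
page = np.full((100, 100), 255, dtype="uint8")
page[10:30, 60:80] = 0
result = to_mnist_style(page)
print(result.shape)
```

In the loop above, the array from `np.array(Image.open(...).convert('L'))` could be passed through such a function in place of the plain resize and division by 255. Whether this actually fixes the predictions is not something this sketch can confirm.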
