In [1]:
!pip install numpy==2.0.2
!pip install pandas==2.2.2
!pip install tensorflow_cpu==2.18.0
!pip install matplotlib==3.9.2

Collecting tensorflow_cpu==2.18.0
  Downloading tensorflow_cpu-2.18.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.1 kB)
Downloading tensorflow_cpu-2.18.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (230.2 MB)
   230.2/230.2 MB 7.1 MB/s eta 0:00:00
Installing collected packages: tensorflow_cpu
Successfully installed tensorflow_cpu-2.18.0
Collecting matplotlib==3.9.2
  Downloading matplotlib-3.9.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Downloading matplotlib-3.9.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.3 MB)
   8.3/8.3 MB 51.9 MB/s eta 0:00:00
Installing collected packages: matplotlib
  Attempting uninstall: matplotlib
    Found existing installation: matplotlib 3.10.0
    Uninstalling matplotlib-3.10.0:
      Succes

Suppress the warning messages that TensorFlow emits when it runs on a CPU architecture.



In [2]:
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
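These variables are generally only picked up if they are set before TensorFlow is first imported, which is why this cell comes before the Keras imports. The sketch below wraps the same two assignments in a small helper that also reports whether TensorFlow had already been loaded. The helper name `configure_tf_env` is hypothetical and not part of TensorFlow; `TF_CPP_MIN_LOG_LEVEL='2'` filters out INFO and WARNING messages, and `TF_ENABLE_ONEDNN_OPTS='0'` turns off the oneDNN custom operations.

```python
import os
import sys


def configure_tf_env(min_log_level="2", onednn_opts="0"):
    """Set the TensorFlow-related environment variables used in this notebook.

    Returns True if TensorFlow had not been imported yet when this was called
    (so the settings should be picked up on import), and False otherwise.
    This helper is illustrative only; it is not a TensorFlow API.
    """
    tf_not_yet_imported = "tensorflow" not in sys.modules
    os.environ["TF_ENABLE_ONEDNN_OPTS"] = onednn_opts
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = min_log_level
    return tf_not_yet_imported


applied_in_time = configure_tf_env()
print("Set before TensorFlow import:", applied_in_time)
```

If the helper reports `False`, restarting the kernel and running this cell first is the simplest fix.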

In [3]:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Input
from keras.utils import to_categorical

When working with convolutional neural networks in particular, we will need a few additional layer classes.


In [4]:
from keras.layers import Conv2D # to add convolutional layers
from keras.layers import MaxPooling2D # to add pooling layers
from keras.layers import Flatten # to flatten data for fully connected layers

## Convolutional Neural Network with One Set of Convolutional and Pooling Layers


In [5]:
# import data
from keras.datasets import mnist

# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# reshape to be [samples][height][width][channels]
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32')

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 0s 0us/step


Let's normalize the pixel values so that they lie between 0 and 1.


In [6]:
X_train = X_train / 255 # normalize training data
X_test = X_test / 255 # normalize test data
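To see what the reshape and scaling steps do without downloading MNIST, here is a minimal sketch on a small synthetic array. The random images and the variable names are assumptions made for illustration; only the shapes and value ranges mirror the real data.

```python
import numpy as np

# Synthetic stand-in for MNIST: 4 grayscale images of 28x28 uint8 pixels.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Add a trailing channel axis:
# (samples, height, width) -> (samples, height, width, channels=1).
images_4d = images.reshape(images.shape[0], 28, 28, 1).astype("float32")

# Scale pixel intensities from [0, 255] down to [0, 1].
images_scaled = images_4d / 255

print(images_4d.shape)
print(images_scaled.dtype, images_scaled.min(), images_scaled.max())
```

The trailing axis of size 1 is the single grayscale channel that `Conv2D` expects with the `Input(shape=(28, 28, 1))` layer used later.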

Next, let's one-hot encode the target variable, so that each digit label becomes a binary vector with one entry per class.


In [7]:
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

num_classes = y_test.shape[1] # number of categories
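As a rough picture of what the encoding above produces, the sketch below builds one-hot vectors with plain NumPy. The `one_hot` helper is hypothetical and only approximates `to_categorical`, which infers the number of classes from the largest label when it is not given explicitly.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]


digits = [5, 0, 4, 1, 9]
encoded = one_hot(digits, num_classes=10)

print(encoded.shape)
print(encoded[0])
print(encoded.argmax(axis=1))  # argmax recovers the original labels
```

This is also why `num_classes` can be read off as the second dimension of the encoded array.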

Next, let's define a function that creates our model. Let's start with one set of convolutional and pooling layers.


In [8]:
def convolutional_model():

    # create model
    model = Sequential()
    model.add(Input(shape=(28, 28, 1)))
    model.add(Conv2D(16, (5, 5), strides=(1, 1), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))

    # compile model
    model.compile(optimizer='adam', loss='categorical_crossentropy',  metrics=['accuracy'])
    return model
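It can help to work out by hand how large this model is. The sketch below applies the usual output-size formula for unpadded convolution and pooling, floor((size - window) / stride) + 1, and counts the trainable parameters. It assumes the Keras default of `padding='valid'` for both `Conv2D` and `MaxPooling2D`; the helper names are made up for this example.

```python
def valid_output_size(size, window, stride=1):
    """Spatial output size of an unpadded convolution or pooling layer."""
    return (size - window) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Weights plus biases of a square-kernel convolutional layer."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


side = valid_output_size(28, window=5, stride=1)    # after Conv2D(16, (5, 5))
side = valid_output_size(side, window=2, stride=2)  # after MaxPooling2D((2, 2))
flat = side * side * 16                             # after Flatten()

total = conv_params(5, 1, 16) + dense_params(flat, 100) + dense_params(100, 10)
print("flattened features:", flat)
print("trainable parameters:", total)
```

Calling `convolutional_model().summary()` should report the same totals and is the more reliable check.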

Finally, let's call the function to create our convolutional neural network, train it, and evaluate it.


In [9]:
# build the model
model = convolutional_model()

# fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)

# evaluate the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: {} \n Error: {}".format(scores[1], 100-scores[1]*100))

Epoch 1/10
300/300 - 25s - 83ms/step - accuracy: 0.9207 - loss: 0.2899 - val_accuracy: 0.9738 - val_loss: 0.0869
Epoch 2/10
300/300 - 40s - 133ms/step - accuracy: 0.9768 - loss: 0.0786 - val_accuracy: 0.9825 - val_loss: 0.0548
Epoch 3/10
300/300 - 23s - 76ms/step - accuracy: 0.9832 - loss: 0.0560 - val_accuracy: 0.9851 - val_loss: 0.0470
Epoch 4/10
300/300 - 40s - 135ms/step - accuracy: 0.9871 - loss: 0.0429 - val_accuracy: 0.9853 - val_loss: 0.0421
Epoch 5/10
300/300 - 41s - 137ms/step - accuracy: 0.9887 - loss: 0.0364 - val_accuracy: 0.9852 - val_loss: 0.0430
Epoch 6/10
300/300 - 41s - 138ms/step - accuracy: 0.9910 - loss: 0.0298 - val_accuracy: 0.9863 - val_loss: 0.0435
Epoch 7/10
300/300 - 42s - 139ms/step - accuracy: 0.9931 - loss: 0.0233 - val_accuracy: 0.9870 - val_loss: 0.0384
Epoch 8/10
300/300 - 40s - 134ms/step - accuracy: 0.9936 - loss: 0.0213 - val_accuracy: 0.9875 - val_loss: 0.0416
Epoch 9/10
300/300 - 41s - 137ms/step - accuracy: 0.9954 - loss: 0.0165 - val_accuracy: 0.

### Results
- The test accuracy of 98.92% shows that the model classifies the MNIST test images effectively.
- The test error of 1.08% is low, and the gap to the training accuracy (about 99.5% by epoch 9) is under one percentage point, which suggests that the model is not significantly overfitting.
- These results suggest the model is reliable for practical applications, provided the test dataset is representative of real-world scenarios.
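The error figure is simply 100 minus the accuracy expressed in percent, as in the `print` call of the evaluation cell. A minimal sketch of that conversion with rounded output follows; `error_percent` is a hypothetical helper, and 0.9892 is the accuracy that corresponds to the 1.08% error quoted here.

```python
def error_percent(accuracy):
    """Convert an accuracy given as a fraction in [0, 1] to an error rate in percent."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    return 100.0 - accuracy * 100.0


accuracy = 0.9892  # fraction of test images classified correctly
print(f"Accuracy: {accuracy:.2%}")
print(f"Error: {error_percent(accuracy):.2f}%")
```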

------------------------------------------


## Convolutional Neural Network with Two Sets of Convolutional and Pooling Layers


Let's redefine the convolutional model so that it has two sets of convolutional and pooling layers instead of just one.


In [None]:
def convolutional_model():

    # create model
    model = Sequential()
    model.add(Input(shape=(28, 28, 1)))
    model.add(Conv2D(16, (5, 5), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

    model.add(Conv2D(8, (2, 2), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))

    model.add(Flatten())
    model.add(Dense(100, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))

    # Compile model
    model.compile(optimizer='adam', loss='categorical_crossentropy',  metrics=['accuracy'])
    return model
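To follow how the feature maps shrink through the two sets of layers, the sketch below traces the spatial size and channel count layer by layer and then counts the trainable parameters. It assumes the Keras default of `padding='valid'`; the layer table and helper name are written for this example only.

```python
def valid_output_size(size, window, stride):
    """Spatial output size of an unpadded convolution or pooling layer."""
    return (size - window) // stride + 1


# (description, window, stride, filters); filters is None for pooling layers.
layers = [
    ("Conv2D(16, (5, 5))", 5, 1, 16),
    ("MaxPooling2D((2, 2))", 2, 2, None),
    ("Conv2D(8, (2, 2))", 2, 1, 8),
    ("MaxPooling2D((2, 2))", 2, 2, None),
]

side, channels = 28, 1  # input images are 28x28 with one channel
conv_weights = 0
shapes = []
for name, window, stride, filters in layers:
    if filters is not None:
        # kernel weights plus one bias per filter
        conv_weights += window * window * channels * filters + filters
        channels = filters
    side = valid_output_size(side, window, stride)
    shapes.append((side, side, channels))
    print(f"{name:22s} -> {(side, side, channels)}")

flat = side * side * channels
total = conv_weights + (flat * 100 + 100) + (100 * 10 + 10)
print("flattened features:", flat)
print("trainable parameters:", total)
```

Most of the parameters sit in the first `Dense` layer, so the flattened feature count largely determines the size of the model.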

Now, let's call the function to create our new convolutional neural network, train it, and evaluate it.

In [10]:
# build the model
model = convolutional_model()

# fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=200, verbose=2)

# evaluate the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: {} \n Error: {}".format(scores[1], 100-scores[1]*100))

Epoch 1/10
300/300 - 24s - 80ms/step - accuracy: 0.9198 - loss: 0.2888 - val_accuracy: 0.9681 - val_loss: 0.1067
Epoch 2/10
300/300 - 23s - 76ms/step - accuracy: 0.9753 - loss: 0.0856 - val_accuracy: 0.9820 - val_loss: 0.0586
Epoch 3/10
300/300 - 41s - 136ms/step - accuracy: 0.9829 - loss: 0.0581 - val_accuracy: 0.9862 - val_loss: 0.0464
Epoch 4/10
300/300 - 41s - 137ms/step - accuracy: 0.9863 - loss: 0.0462 - val_accuracy: 0.9850 - val_loss: 0.0446
Epoch 5/10
300/300 - 42s - 139ms/step - accuracy: 0.9888 - loss: 0.0361 - val_accuracy: 0.9869 - val_loss: 0.0403
Epoch 6/10
300/300 - 39s - 129ms/step - accuracy: 0.9907 - loss: 0.0301 - val_accuracy: 0.9862 - val_loss: 0.0402
Epoch 7/10
300/300 - 42s - 140ms/step - accuracy: 0.9926 - loss: 0.0245 - val_accuracy: 0.9858 - val_loss: 0.0452
Epoch 8/10
300/300 - 41s - 135ms/step - accuracy: 0.9936 - loss: 0.0205 - val_accuracy: 0.9862 - val_loss: 0.0422
Epoch 9/10
300/300 - 40s - 132ms/step - accuracy: 0.9941 - loss: 0.0179 - val_accuracy: 0.

How does the batch size affect the training time and the accuracy of the model?


In [11]:
model = convolutional_model()

# fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, batch_size=1024, verbose=2)

# evaluate the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: {} \n Error: {}".format(scores[1], 100 - scores[1] *100))

Epoch 1/10
59/59 - 25s - 421ms/step - accuracy: 0.8338 - loss: 0.6532 - val_accuracy: 0.9299 - val_loss: 0.2407
Epoch 2/10
59/59 - 40s - 683ms/step - accuracy: 0.9402 - loss: 0.2071 - val_accuracy: 0.9572 - val_loss: 0.1543
Epoch 3/10
59/59 - 41s - 687ms/step - accuracy: 0.9617 - loss: 0.1352 - val_accuracy: 0.9694 - val_loss: 0.1052
Epoch 4/10
59/59 - 41s - 702ms/step - accuracy: 0.9722 - loss: 0.0991 - val_accuracy: 0.9761 - val_loss: 0.0819
Epoch 5/10
59/59 - 40s - 685ms/step - accuracy: 0.9780 - loss: 0.0789 - val_accuracy: 0.9789 - val_loss: 0.0674
Epoch 6/10
59/59 - 41s - 696ms/step - accuracy: 0.9816 - loss: 0.0657 - val_accuracy: 0.9820 - val_loss: 0.0585
Epoch 7/10
59/59 - 39s - 665ms/step - accuracy: 0.9837 - loss: 0.0581 - val_accuracy: 0.9815 - val_loss: 0.0585
Epoch 8/10
59/59 - 23s - 383ms/step - accuracy: 0.9852 - loss: 0.0510 - val_accuracy: 0.9817 - val_loss: 0.0546
Epoch 9/10
59/59 - 40s - 681ms/step - accuracy: 0.9873 - loss: 0.0449 - val_accuracy: 0.9834 - val_loss:

In [12]:
print("Accuracy: {} \n Error: {}".format(scores[1], 100 - scores[1] *100))

Accuracy: 0.984000027179718 
 Error: 1.5999972820281982
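The `300/300` and `59/59` prefixes in the training logs are the number of batches processed per epoch. A short sketch of where those numbers come from, assuming the standard 60,000-image MNIST training set:

```python
import math

TRAIN_SAMPLES = 60_000  # size of the MNIST training set


def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


def last_batch_size(num_samples, batch_size):
    """Size of the final, possibly partial, batch in an epoch."""
    remainder = num_samples % batch_size
    return remainder if remainder else batch_size


for batch_size in (200, 1024):
    steps = steps_per_epoch(TRAIN_SAMPLES, batch_size)
    tail = last_batch_size(TRAIN_SAMPLES, batch_size)
    print(f"batch_size={batch_size}: {steps} steps, last batch has {tail} samples")
```

With a batch size of 1024 the weights are updated far fewer times per epoch, which is one reason the accuracy climbs more slowly in the early epochs.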


How does the number of epochs affect the training time and the accuracy of the model?

In [13]:
model = convolutional_model()

# fit the model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=25, batch_size=1024, verbose=2)

# evaluate the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: {} \n Error: {}".format(scores[1], 100 - scores[1] *100))

Epoch 1/25
59/59 - 24s - 415ms/step - accuracy: 0.8290 - loss: 0.6645 - val_accuracy: 0.9321 - val_loss: 0.2425
Epoch 2/25
59/59 - 22s - 370ms/step - accuracy: 0.9456 - loss: 0.1923 - val_accuracy: 0.9625 - val_loss: 0.1348
Epoch 3/25
59/59 - 42s - 712ms/step - accuracy: 0.9663 - loss: 0.1196 - val_accuracy: 0.9733 - val_loss: 0.0943
Epoch 4/25
59/59 - 41s - 689ms/step - accuracy: 0.9758 - loss: 0.0865 - val_accuracy: 0.9787 - val_loss: 0.0713
Epoch 5/25
59/59 - 39s - 668ms/step - accuracy: 0.9805 - loss: 0.0697 - val_accuracy: 0.9822 - val_loss: 0.0592
Epoch 6/25
59/59 - 44s - 738ms/step - accuracy: 0.9832 - loss: 0.0586 - val_accuracy: 0.9837 - val_loss: 0.0534
Epoch 7/25
59/59 - 22s - 370ms/step - accuracy: 0.9856 - loss: 0.0505 - val_accuracy: 0.9844 - val_loss: 0.0474
Epoch 8/25
59/59 - 41s - 700ms/step - accuracy: 0.9873 - loss: 0.0443 - val_accuracy: 0.9860 - val_loss: 0.0459
Epoch 9/25
59/59 - 21s - 352ms/step - accuracy: 0.9886 - loss: 0.0399 - val_accuracy: 0.9866 - val_loss:
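To compare time and accuracy across runs, the per-epoch log lines can be parsed rather than read by eye. The sketch below does this for the first three epochs of the log above. The regular expression is an assumption about the `verbose=2` line format shown in this notebook and may need adjusting for other Keras versions.

```python
import re

# Matches lines such as:
# 59/59 - 24s - 415ms/step - accuracy: 0.8290 - loss: 0.6645 - val_accuracy: 0.9321 - ...
LOG_LINE = re.compile(
    r"(?P<steps>\d+)/\d+ - (?P<seconds>\d+)s - \d+ms/step"
    r" - accuracy: (?P<accuracy>[\d.]+) - loss: [\d.]+"
    r" - val_accuracy: (?P<val_accuracy>[\d.]+)"
)


def parse_epochs(log_text):
    """Extract per-epoch duration and accuracies from a Keras verbose=2 log."""
    records = []
    for match in LOG_LINE.finditer(log_text):
        records.append({
            "seconds": int(match["seconds"]),
            "accuracy": float(match["accuracy"]),
            "val_accuracy": float(match["val_accuracy"]),
        })
    return records


sample_log = """\
Epoch 1/25
59/59 - 24s - 415ms/step - accuracy: 0.8290 - loss: 0.6645 - val_accuracy: 0.9321 - val_loss: 0.2425
Epoch 2/25
59/59 - 22s - 370ms/step - accuracy: 0.9456 - loss: 0.1923 - val_accuracy: 0.9625 - val_loss: 0.1348
Epoch 3/25
59/59 - 42s - 712ms/step - accuracy: 0.9663 - loss: 0.1196 - val_accuracy: 0.9733 - val_loss: 0.0943
"""

records = parse_epochs(sample_log)
total_seconds = sum(r["seconds"] for r in records)
best = max(records, key=lambda r: r["val_accuracy"])
print(f"{len(records)} epochs, {total_seconds}s in total")
print("best val_accuracy so far:", best["val_accuracy"])
```

An alternative that avoids parsing text is to keep the `History` object returned by `model.fit` and read the metrics from its `history` dictionary.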

# Comparison of Model Performance Across Different Configurations

### **1. Model with 1 Layer | Batch Size = 200 | Epochs = 10**
- **Accuracy**: `0.9892` (~98.92%)
- **Error**: `1.08%`

### **2. Model with 2 Layers | Batch Size = 200 | Epochs = 10**
- **Accuracy**: `0.9875` (~98.75%)
- **Error**: `1.25%`

### **3. Model with 2 Layers | Batch Size = 1024 | Epochs = 10**
- **Accuracy**: `0.9840` (~98.40%)
- **Error**: `1.60%`

### **4. Model with 2 Layers | Batch Size = 1024 | Epochs = 25**
- **Accuracy**: `0.9882` (~98.82%)
- **Error**: `1.18%`

---

### **Key Observations**

#### **1. Accuracy Trends**:
- The **highest accuracy (98.92%)** was achieved with **1 layer**, a **batch size of 200**, and **10 epochs**.
- The **lowest accuracy (98.40%)** occurred with **2 layers**, a **batch size of 1024**, and **10 epochs**.
- Increasing the number of layers slightly reduced accuracy, but increasing epochs (from 10 to 25) improved accuracy for the larger batch size configuration.

#### **2. Error Trends**:
- The **lowest error (1.08%)** was produced by the **1-layer model** with a **batch size of 200** and **10 epochs**.
- The error rate increased with larger batch sizes but decreased with more epochs for the 2-layer model.

#### **3. Layers and Complexity**:
- A second set of convolutional and pooling layers can, in principle, capture more complex features, but it did not improve performance on this dataset: accuracy fell slightly, from 98.92% to 98.75%, at the same batch size and number of epochs.

#### **4. Batch Size and Epochs**:
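- **Smaller Batch Sizes (200)**:
  - With 10 epochs, the smaller batch size gave higher accuracy and lower error for the 2-layer model (98.75% vs. 98.40%), consistent with the larger number of weight updates per epoch (300 steps vs. 59).
- **Larger Batch Sizes (1024)**:
  - The larger batch size cut the number of steps per epoch from 300 to 59, but the logged time per epoch stayed in a similar range (roughly 20 to 45 seconds in every run), and accuracy was slightly lower.
  - Increasing epochs for larger batch sizes helped recover some performance.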
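One way to read these observations is in terms of the total number of optimizer updates each configuration performed. The sketch below computes that count for the four runs, assuming the standard 60,000-image MNIST training set; the configuration labels are shorthand for this example.

```python
import math

TRAIN_SAMPLES = 60_000  # size of the MNIST training set

# (label, batch_size, epochs) for the four runs compared above
configs = [
    ("1 layer,  batch 200,  10 epochs", 200, 10),
    ("2 layers, batch 200,  10 epochs", 200, 10),
    ("2 layers, batch 1024, 10 epochs", 1024, 10),
    ("2 layers, batch 1024, 25 epochs", 1024, 25),
]


def total_updates(num_samples, batch_size, epochs):
    """Total number of weight updates: batches per epoch times epochs."""
    return math.ceil(num_samples / batch_size) * epochs


updates = [total_updates(TRAIN_SAMPLES, b, e) for _, b, e in configs]
for (label, _, _), count in zip(configs, updates):
    print(f"{label}: {count} updates")
```

Even at 25 epochs, the batch-size-1024 run makes fewer updates than either batch-size-200 run, which fits with its accuracy recovering only part of the way.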

---

### **Conclusion**
- The **1-layer model** with **batch size = 200** and **10 epochs** achieved the best overall performance, with the highest accuracy (98.92%) and the lowest error (1.08%), although all four configurations fall within about half a percentage point of one another.
- For **2-layer models**, increasing epochs slightly improved performance for larger batch sizes but did not surpass the 1-layer configuration with smaller batch sizes.

## <h3 align="center"> &#169; IBM Corporation. All rights reserved. </h3>

