In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import InputLayer, Conv2D, MaxPool2D, Dense, Flatten

# Load data
data = np.load(r'C:\Users\felin\Downloads\Image classification\mnist_compressed.zip')
X_test, y_test, X_train, y_train =  data['test_images'], data['test_labels'], data['train_images'], data['train_labels']
print(X_train.shape)  # (60000, 28, 56)

# Convert to TensorFlow Dataset
test_full_dataset = tf.data.Dataset.from_tensor_slices((X_test, y_test))
train = tf.data.Dataset.from_tensor_slices((X_train, y_train)).shuffle(10)

# Function to display an image
def show_img(x, y):
    plt.gray()
    plt.title(str(y))
    plt.imshow(x)

# Function to divide train dataset into train and validation
def divide_into_train_and_val(train_dataset_original, train_ratio=0.8, val_ratio=0.2):
    DATASET_SIZE = len(train_dataset_original)

    train_dataset = train_dataset_original.take(int(train_ratio * DATASET_SIZE)).map(lambda x, y: 
        (
            tf.reshape(x , (28 , 56 , 1)) 
            , y
        )
    ).batch(32)

    val_dataset = train_dataset_original.skip(int(train_ratio * DATASET_SIZE)).map(lambda x, y: 
        (
            tf.reshape(x , (28 , 56 , 1)) 
            , y
        )
    ).batch(32)

    return train_dataset, val_dataset

# Split dataset into train and validation
train_dataset , val_dataset = divide_into_train_and_val(train)

# Build the model
model = Sequential([
    Conv2D(filters=8, kernel_size=(3, 3), strides=1, padding='same', activation='relu', input_shape=(28, 56, 1)),
    MaxPool2D(pool_size=(2, 2), strides=2),
    
    Conv2D(filters=16, kernel_size=(3, 3), strides=1, padding='same', activation='relu'),
    MaxPool2D(pool_size=(2, 2), strides=2),
    
    Flatten(),
    Dense(100, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(train_dataset, validation_data=val_dataset, epochs=20)

# Test on a specific image
x = X_test[69]
y = y_test[69]
show_img(x, y)
x = x.reshape(1, 28, 56, 1)
predictions = model.predict(x, verbose=0)

# Get predicted class and confidence
predicted_class = np.argmax(predictions[0])
confidence = predictions[0][predicted_class]
print('class =', predicted_class, 'conf=' , confidence)

# Save the trained model
model.save('./mnist_trained_weights.keras')


KeyError: 'test_images is not a file in the archive'

In [None]:
Let's go through the code block by block to explain how it works:

### 1. **Importing Libraries**
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import InputLayer, Conv2D, MaxPool2D, Dense, Flatten
```
- **NumPy**: A library for handling large, multi-dimensional arrays and matrices, which is essential for numerical computation.
- **Pandas**: A library for data manipulation and analysis. It is imported here but never used, so the import can be dropped.
- **Matplotlib**: A plotting library to visualize images and graphs.
- **TensorFlow**: A deep learning framework, which is used here to build and train a neural network.
- **Keras**: A high-level API for building neural networks within TensorFlow, used here to define the layers of the CNN. `InputLayer` is imported but never used; the input shape is passed to the first `Conv2D` layer instead.

### 2. **Loading Data**
```python
data = np.load(r'C:\Users\felin\Downloads\Image classification\mnist_compressed.zip')
X_test, y_test, X_train, y_train =  data['test_images'], data['test_labels'], data['train_images'], data['train_labels']
print(X_train.shape)  # (60000, 28, 56)
```
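- `np.load()` opens the archive and returns a dictionary-like object keyed by array name. An `.npz` file is a zip archive of `.npy` arrays, which is why a file with a `.zip` extension can be opened this way. The dataset is expected to be pre-split into training and test data.
- As run above, the cell stops here with `KeyError: 'test_images is not a file in the archive'`. The archive opened, but it holds no array under the name `test_images`, so the key names in the code must be checked against the archive's actual contents before anything downstream can run.
- **X_train** and **X_test** are meant to hold the image data.
- **y_train** and **y_test** are meant to hold the integer labels.
- According to the comment in the code, `X_train` has shape `(60000, 28, 56)`: 60,000 images, each 28 rows by 56 columns. That is twice the width of standard MNIST (`28x28`). Together with the 100-unit output layer defined later, this suggests each image may show two digits side by side, with labels from 0 to 99. This is an inference from the code, not something the source confirms.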
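
The `KeyError` in the cell output means the array names used in the code are not present in the archive. A minimal sketch for checking which names an archive actually contains follows. `list_archive_keys` is a hypothetical helper, and the demo builds a small stand-in archive in a temporary directory because the real file's contents are not known here; its key names are made up.

```python
import os
import tempfile

import numpy as np


def list_archive_keys(path):
    """Return the array names stored in an .npz-style archive."""
    with np.load(path) as archive:
        # NpzFile.files lists the keys that can be used for indexing
        return list(archive.files)


# Stand-in archive with made-up key names (assumption: the real keys may differ)
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo_archive.npz")
np.savez_compressed(
    demo_path,
    train_images=np.zeros((4, 28, 56), dtype=np.uint8),
    train_labels=np.arange(4),
)

keys = list_archive_keys(demo_path)
print(keys)
```

Running the same helper on the real archive path would show which names to use in place of `'test_images'` and the other keys. If the listing shows a single nested file rather than four arrays, the download may be a zip that wraps the `.npz` file instead of being one.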

### 3. **Converting Data to TensorFlow Dataset**
```python
test_full_dataset = tf.data.Dataset.from_tensor_slices((X_test, y_test))
train = tf.data.Dataset.from_tensor_slices((X_train, y_train)).shuffle(10)
```
- `tf.data.Dataset.from_tensor_slices` creates a TensorFlow Dataset from the NumPy arrays.
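  - **train**: This dataset is shuffled with a buffer size of `10`. `shuffle()` draws each element at random from a sliding buffer of 10 elements, so it only mixes an element with its near neighbours; a buffer as large as the dataset would be needed for a uniform shuffle. Shuffling reduces the effect of sample order on training; it does not by itself prevent overfitting. Because `shuffle()` reshuffles on every pass by default and the train/validation split is taken afterwards with `take()`/`skip()`, elements near the 80% boundary can land on either side from one epoch to the next.
  - **test_full_dataset**: This wraps the test data (`X_test` and `y_test`). It is created here but not used again in this code; the later test reads directly from `X_test`.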
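
To see why a buffer of 10 gives only a local shuffle, here is a simplified pure-Python simulation of a buffered shuffle. It follows the documented behaviour (fill a buffer, emit a random element, refill the slot) but it is a sketch, not TensorFlow's implementation; `buffered_shuffle` and `max_early` are names chosen for illustration.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Simplified simulation of a buffered shuffle (not TensorFlow's own code)."""
    rng = random.Random(seed)
    source = iter(items)
    buffer = []
    output = []
    # Fill the buffer with the first `buffer_size` elements
    for item in source:
        buffer.append(item)
        if len(buffer) == buffer_size:
            break
    # Emit a random buffer element, then refill that slot from the source
    for item in source:
        idx = rng.randrange(len(buffer))
        output.append(buffer[idx])
        buffer[idx] = item
    # Drain what is left once the source is exhausted
    rng.shuffle(buffer)
    output.extend(buffer)
    return output


shuffled = buffered_shuffle(range(1000), buffer_size=10)
# How far ahead of its original position any element was emitted
max_early = max(value - position for position, value in enumerate(shuffled))
print(shuffled[:12], max_early)
```

No element can be emitted more than `buffer_size - 1` positions ahead of where it started, so data that arrives ordered (for example, sorted by label) stays largely ordered with a small buffer.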

### 4. **Function to Display Images**
```python
def show_img(x, y):
    plt.gray()
    plt.title(str(y))
    plt.imshow(x)
```
- This function displays an image (`x`) using `matplotlib` with the title set to the label (`y`).
- `plt.gray()` ensures the image is displayed in grayscale.
- `plt.imshow(x)` displays the image.

### 5. **Splitting Train Dataset into Train and Validation**
```python
def divide_into_train_and_val(train_dataset_original, train_ratio=0.8, val_ratio=0.2):
    DATASET_SIZE = len(train_dataset_original)

    train_dataset = train_dataset_original.take(int(train_ratio * DATASET_SIZE)).map(lambda x, y: 
        (
            tf.reshape(x , (28 , 56 , 1)) 
            , y
        )
    ).batch(32)

    val_dataset = train_dataset_original.skip(int(train_ratio * DATASET_SIZE)).map(lambda x, y: 
        (
            tf.reshape(x , (28 , 56 , 1)) 
            , y
        )
    ).batch(32)

    return train_dataset, val_dataset
```
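- This function splits the original training dataset into a training subset (the first 80%) and a validation subset (the remaining 20%). The `val_ratio` parameter is accepted but never used; the validation size is whatever `train_ratio` leaves over.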
- `train_dataset_original.take()` is used to take the first 80% of the data for training.
- `train_dataset_original.skip()` is used to skip the first 80% of the data and use the remaining 20% for validation.
- `tf.reshape(x, (28, 56, 1))` reshapes the image data to have one channel (grayscale), which is needed for input into the CNN.
- `.batch(32)` batches the data into mini-batches of 32 samples each.
- The function returns the `train_dataset` and `val_dataset` for training and validation.
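
One way to guarantee that the two subsets never overlap is to fix the split in NumPy before any `tf.data` pipeline is built. The sketch below is an alternative to the function above, not the author's method; `split_train_val` is a hypothetical helper, and synthetic arrays stand in for the real images.

```python
import numpy as np


def split_train_val(images, labels, train_ratio=0.8, seed=0):
    """Shuffle once, then cut into disjoint training and validation arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    cut = int(train_ratio * len(images))
    train_idx, val_idx = order[:cut], order[cut:]
    # Add the trailing channel axis the CNN expects: (N, 28, 56) -> (N, 28, 56, 1)
    images = images[..., np.newaxis]
    return (images[train_idx], labels[train_idx]), (images[val_idx], labels[val_idx])


# Synthetic stand-in data with the per-image shape reported in the source
fake_images = np.zeros((100, 28, 56), dtype=np.uint8)
fake_labels = np.arange(100)

(train_x, train_y), (val_x, val_y) = split_train_val(fake_images, fake_labels)
print(train_x.shape, val_x.shape)
```

The resulting arrays can then be passed to `tf.data.Dataset.from_tensor_slices` and batched, with any per-epoch shuffling applied to the training subset only.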

### 6. **Building the Model**
```python
model = Sequential([
    Conv2D(filters=8, kernel_size=(3, 3), strides=1, padding='same', activation='relu', input_shape=(28, 56, 1)),
    MaxPool2D(pool_size=(2, 2), strides=2),
    
    Conv2D(filters=16, kernel_size=(3, 3), strides=1, padding='same', activation='relu'),
    MaxPool2D(pool_size=(2, 2), strides=2),
    
    Flatten(),
    Dense(100, activation='softmax')
])
```
- **Sequential**: A linear stack of layers.
- **Conv2D**: A convolutional layer that applies 2D convolution to the input image. It has filters (kernels) of size `3x3` that slide across the image to extract features.
  - The first convolution layer uses 8 filters and a `relu` activation function.
  - The second convolution layer uses 16 filters and a `relu` activation function.
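  - `padding='same'` with `strides=1` keeps the height and width of each convolutional layer's output equal to those of its input; only the channel count changes.
- **MaxPool2D**: A max pooling layer that downsamples the feature maps from the convolution layers, reducing the computation in later layers.
  - The pool size is `2x2` and the stride is 2, so each pooling layer halves the height and width: `28x56` becomes `14x28`, then `7x14`.
- **Flatten**: Flattens the final `7x14x16` feature map into a 1D vector of 1,568 values to feed into the dense layer.
- **Dense**: A fully connected layer with 100 units and a `softmax` activation, so the model outputs a probability for each of 100 classes. That is more than the 10 classes of standard MNIST; combined with the `28x56` image size, it suggests labels from 0 to 99 (two digits per image), though the code does not confirm this.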
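
The layer shapes and parameter counts can be checked by hand. The following sketch reproduces that arithmetic in plain Python, assuming the layer settings shown above (stride-1 `'same'` convolutions and `2x2` pooling with stride 2); the helper names are illustrative and not part of Keras.

```python
def conv_same(shape, filters):
    """Output shape of a stride-1 convolution with 'same' padding."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool_2x2(shape):
    """Output shape of 2x2 max pooling with stride 2 (default 'valid' padding)."""
    height, width, channels = shape
    return (height // 2, width // 2, channels)


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


shape = (28, 56, 1)
shape = conv_same(shape, 8)      # (28, 56, 8)
shape = max_pool_2x2(shape)      # (14, 28, 8)
shape = conv_same(shape, 16)     # (14, 28, 16)
shape = max_pool_2x2(shape)      # (7, 14, 16)

flat_units = shape[0] * shape[1] * shape[2]   # 7 * 14 * 16 = 1568
dense_params = flat_units * 100 + 100         # weights + biases = 156900
total_params = conv_params(3, 1, 8) + conv_params(3, 8, 16) + dense_params
print(shape, flat_units, total_params)
```

Almost all of the parameters sit in the final dense layer, which is typical when a flattened feature map feeds directly into the output layer.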

### 7. **Compiling the Model**
```python
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
```
- **optimizer='adam'**: Adam is an adaptive learning rate optimization algorithm.
- **loss='sparse_categorical_crossentropy'**: This is the loss function used for multi-class classification problems where labels are provided as integers (not one-hot encoded).
- **metrics=['accuracy']**: The model will track the accuracy metric during training.
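
To make the loss concrete: for integer labels, sparse categorical cross-entropy is the mean negative log of the probability the model assigns to the true class. The NumPy sketch below illustrates that definition on toy numbers; it is not Keras's implementation, and the `eps` clip is an assumption added for numerical safety.

```python
import numpy as np


def sparse_categorical_crossentropy(probs, labels, eps=1e-7):
    """Mean negative log-probability assigned to the true integer labels."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    labels = np.asarray(labels)
    # Pick, for each row, the probability of that row's true class
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked)))


# Two samples, three classes (toy numbers, not model output)
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.1, 0.8]]
labels = [0, 2]
loss = sparse_categorical_crossentropy(probs, labels)
print(round(loss, 4))
```

Because the labels index directly into each probability row, no one-hot encoding step is needed.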

### 8. **Training the Model**
```python
model.fit(train_dataset, validation_data=val_dataset, epochs=20)
```
- The model is trained for 20 epochs using the `train_dataset` and validated using the `val_dataset`.
- `epochs=20` means the model will go through the entire training data 20 times.
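
As a rough guide to how much work one epoch involves, the number of batches can be worked out from the figures in the code. This sketch assumes the 60,000-image training set from the shape comment, the 80/20 split, and the batch size of 32.

```python
import math

DATASET_SIZE = 60000   # from the shape comment in the source
TRAIN_RATIO = 0.8
BATCH_SIZE = 32
EPOCHS = 20

train_size = int(TRAIN_RATIO * DATASET_SIZE)       # 48000 training images
val_size = DATASET_SIZE - train_size               # 12000 validation images
train_steps = math.ceil(train_size / BATCH_SIZE)   # batches per training epoch
val_steps = math.ceil(val_size / BATCH_SIZE)       # batches per validation pass
total_updates = train_steps * EPOCHS               # gradient updates overall
print(train_steps, val_steps, total_updates)
```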

### 9. **Testing the Model with a Specific Image**
```python
x = X_test[69]
y = y_test[69]
show_img(x, y)
x = x.reshape(1, 28, 56, 1)
predictions = model.predict(x, verbose=0)

# Get predicted class and confidence
predicted_class = np.argmax(predictions[0])
confidence = predictions[0][predicted_class]
print('class =', predicted_class, 'conf=' , confidence)
```
- This tests the trained model on a specific test image (index 69).
- The image `x` is reshaped to `(1, 28, 56, 1)`: a batch of one image with the model's input shape `(28, 56, 1)`.
- `model.predict()` is used to get the predicted class probabilities.
- `np.argmax(predictions[0])` gets the class with the highest probability.
- `predictions[0][predicted_class]` gives the softmax probability assigned to that class, which is used here as a confidence score.
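
The same post-processing extends naturally from one image to a batch. The sketch below shows a top-k readout for one probability vector and an accuracy calculation over several; `top_k` and `accuracy` are hypothetical helpers, and the toy probabilities stand in for real `model.predict()` output.

```python
import numpy as np


def top_k(probabilities, k=3):
    """Return the k most likely (class, probability) pairs, best first."""
    probabilities = np.asarray(probabilities)
    best = np.argsort(probabilities)[::-1][:k]
    return [(int(c), float(probabilities[c])) for c in best]


def accuracy(prob_matrix, labels):
    """Fraction of rows whose argmax matches the integer label."""
    predicted = np.argmax(prob_matrix, axis=1)
    return float(np.mean(predicted == np.asarray(labels)))


# Toy probabilities over 4 classes (not real model output)
toy = np.array([[0.1, 0.6, 0.2, 0.1],
                [0.5, 0.1, 0.3, 0.1],
                [0.2, 0.2, 0.1, 0.5]])
print(top_k(toy[0], k=2))
print(accuracy(toy, [1, 2, 3]))
```

Applied to predictions for the whole test set, `accuracy` would give the test accuracy that a single-image check cannot.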

### 10. **Saving the Model**
```python
model.save('./mnist_trained_weights.keras')
```
- The trained model is saved to `mnist_trained_weights.keras`. Despite the file name, the `.keras` format stores the whole model (architecture and weights), not only the weights, so it can be loaded and used later without retraining.

### Summary
Once the data loads correctly, this code performs the following steps:
1. Loads an MNIST-style dataset of `28x56` images and wraps it in TensorFlow datasets.
2. Defines a Convolutional Neural Network (CNN) with 2 convolution layers, each followed by max pooling, and a fully connected layer at the end.
3. Compiles and trains the model using the Adam optimizer and sparse categorical cross-entropy loss.
4. Runs the trained model on a single test image and saves the model for future use. It does not evaluate accuracy on the full test set.
