# Convolutional Neural Networks — Annotated Notes
This notebook expands upon the original `conv_notes.html` file with full educational commentary.
We’ll explore convolutional operations, pooling, and visualization techniques using Keras.

## 1. Understanding Convolution
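A convolution slides a small window of weights (the filter, or kernel) across the input image and, at each position, sums the elementwise products of the window and the pixels beneath it. Strictly speaking, the formula below is cross-correlation (the kernel is not flipped), which is what deep-learning libraries implement under the name convolution.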

$$Y(i,j) = \sum_m \sum_n X(i+m, j+n)K(m,n)$$

**Parameters:**
- **Stride:** Step size of the filter movement.
- **Padding:** `same` zero-pads the input so the output keeps the input's spatial size (at stride 1); `valid` uses no padding, so the output shrinks.
- **Filters:** Number of kernels applied (output depth).
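
The formula and parameters above can be sketched in plain NumPy. This is a minimal, unoptimized illustration for a single channel and a single kernel, not how Keras implements `Conv2D`; the function name `conv2d` and the toy input are assumptions made for the example.

```python
import numpy as np

def conv2d(X, K, stride=1, padding="valid"):
    """Naive 2-D convolution (cross-correlation) of one channel with one kernel."""
    kh, kw = K.shape
    if padding == "same":
        # Zero-pad so that, at stride 1, the output keeps the input's size.
        ph, pw = kh - 1, kw - 1
        X = np.pad(X, ((ph // 2, ph - ph // 2), (pw // 2, pw - pw // 2)))
    out_h = (X.shape[0] - kh) // stride + 1
    out_w = (X.shape[1] - kw) // stride + 1
    Y = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Window whose top-left corner is at (i * stride, j * stride).
            window = X[i * stride:i * stride + kh, j * stride:j * stride + kw]
            # At stride 1 this is sum_m sum_n X(i+m, j+n) K(m, n).
            Y[i, j] = np.sum(window * K)
    return Y

X = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
K = np.ones((2, 2))                           # toy 2x2 kernel

Y_valid = conv2d(X, K)                    # no padding: output shrinks
Y_same = conv2d(X, K, padding="same")     # padded: output keeps 4x4
Y_strided = conv2d(X, K, stride=2)        # larger step: fewer positions
print(Y_valid.shape, Y_same.shape, Y_strided.shape)  # (3, 3) (4, 4) (2, 2)
```

Applying several kernels and stacking the resulting maps along a new axis gives the output depth described under **Filters**.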

### Code Example

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import time
import warnings
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import ParameterGrid

_Explanation:_ General-purpose imports: pandas and NumPy for data handling, Matplotlib for plotting, `time` for timing training runs, and `warnings` together with scikit-learn's `ConvergenceWarning` and `ParameterGrid`.

### Code Example

In [None]:
import tensorflow as tf
from tensorflow import keras
from keras import layers
from keras.losses import SparseCategoricalCrossentropy, CategoricalCrossentropy, BinaryCrossentropy
from keras.layers import Dense, Flatten, Conv2D, MaxPool2D, AveragePooling2D, Dropout, Reshape, GlobalAvgPool2D, Conv2DTranspose
from keras.datasets import mnist
from keras.models import Model, Sequential

_Explanation:_ TensorFlow/Keras imports: the loss classes, the dense, convolution, pooling, and dropout layers used below, the built-in MNIST loader, and the `Sequential` and `Model` containers.

### Code Example

In [None]:
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV, RandomizedSearchCV, StratifiedKFold, KFold
from sklearn.metrics import accuracy_score, classification_report

_Explanation:_ scikit-learn imports: `MLPClassifier` and `StandardScaler`, the splitting and search utilities, and two metrics. Only `train_test_split` is used in the cells that follow.

### Code Example

In [None]:
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version = 1)

_Explanation:_ Downloads MNIST from OpenML as a table of 70,000 rows, each a flattened 28×28 image (784 pixel columns) with its digit label in `target`. Note that this rebinds the name `mnist`, which previously referred to the Keras dataset module.

### Code Example

In [None]:
X_train_full, X_test, y_train_full, y_test = train_test_split(mnist['data']/255, mnist['target'], stratify= mnist['target'])
X_train, X_val, y_train, y_val = train_test_split(X_train_full, y_train_full, stratify=y_train_full)

_Explanation:_ Scales pixel values from [0, 255] to [0, 1], then makes two stratified splits. `train_test_split` holds out 25% by default, so the result is 17,500 test, 13,125 validation, and 39,375 training images.

### Code Example

In [None]:
plt.figure()
plt.imshow(X_train.to_numpy().reshape((-1, 28, 28))[2])
plt.colorbar()
plt.grid(False)
plt.show()

_Explanation:_ Reshapes the flat 784-pixel rows back into 28×28 images and displays the third training image with a color bar. Passing `-1` as the first dimension lets NumPy infer the number of images instead of hard-coding it.

### Code Example

In [None]:
(X_train_full, y_train_full), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

X_train_full = X_train_full/255.0
X_test = X_test/255.0

X_train, X_val, y_train, y_val = train_test_split(X_train_full, y_train_full, stratify=y_train_full)

_Explanation:_ Reloads MNIST through Keras, which returns 60,000 training and 10,000 test images as arrays of shape `(N, 28, 28)` with integer labels. Pixels are rescaled to [0, 1], and a stratified 75/25 split produces 45,000 training and 15,000 validation images. These arrays replace the OpenML versions above.
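
The Keras loader returns images without a channel axis, whereas the `Conv2D` model later in these notes declares `input_shape=(28, 28, 1)`. One common way to make the channel axis explicit is sketched below on a small random stand-in array; the names `images` and `with_channel` are hypothetical and the data is not MNIST.

```python
import numpy as np

# Stand-in for a batch of grayscale images: 5 images of 28x28 pixels in [0, 255].
images = np.random.randint(0, 256, size=(5, 28, 28)).astype("uint8")

scaled = images / 255.0                 # rescale to [0, 1], as in the cell above
with_channel = scaled[..., np.newaxis]  # (5, 28, 28) -> (5, 28, 28, 1)

print(scaled.shape, with_channel.shape)  # (5, 28, 28) (5, 28, 28, 1)
```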

### Code Example

In [None]:
normalize = layers.Normalization()
normalize.adapt(X_train)

_Explanation:_ Creates a Keras `Normalization` layer and calls `adapt` so that it learns the mean and variance of the training data. The layer is not added to either model below.

### Code Example

In [None]:
kmlp = keras.models.Sequential(
    [
        layers.Flatten(input_shape=(28, 28)),
        layers.Dense(50, activation="relu"),
        layers.Dense(100, activation="relu"),
        layers.Dense(100, activation="relu"),
        layers.Dense(100, activation="relu"),
        layers.Dense(10),
        layers.Softmax()
    ]
)

kmlp.compile(loss = keras.losses.SparseCategoricalCrossentropy(),
    optimizer = keras.optimizers.legacy.SGD(
                learning_rate=0.05,
                momentum=0.0,
                nesterov=False,                
                name='SGD'),
             metrics = ["accuracy"])

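_Explanation:_ Defines a fully connected baseline (`kmlp`): `Flatten` turns each 28×28 image into a 784-vector, four ReLU hidden layers follow (50, 100, 100, and 100 units), and a 10-unit layer with a separate `Softmax` produces class probabilities. The model is compiled with sparse categorical cross-entropy, which takes integer labels, and plain SGD (learning rate 0.05, no momentum). It contains no convolution or pooling layers.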
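
The sparse and one-hot forms of cross-entropy compute the same quantity and differ only in how labels are encoded. The NumPy sketch below illustrates this on made-up logits, and also shows what computing the loss directly from logits amounts to (the situation `from_logits=True` is meant for in the Keras loss classes). All function names and values here are assumptions for illustration, not Keras code.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sparse_ce(probs, labels):
    # Integer labels: pick the probability of the true class in each row.
    return -np.log(probs[np.arange(len(labels)), labels])

def categorical_ce(probs, one_hot):
    # One-hot labels: the sum keeps only the true-class term.
    return -np.sum(one_hot * np.log(probs), axis=1)

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])  # made-up raw scores for 3 classes
labels = np.array([0, 2])             # integer class labels
one_hot = np.eye(3)[labels]           # the same labels, one-hot encoded

probs = softmax(logits)
loss_sparse = sparse_ce(probs, labels)
loss_onehot = categorical_ce(probs, one_hot)

# Loss from raw logits: apply log-softmax, then pick the true class.
log_probs = logits - np.log(np.sum(np.exp(logits), axis=1, keepdims=True))
loss_from_logits = -log_probs[np.arange(len(labels)), labels]

print(np.allclose(loss_sparse, loss_onehot))       # True
print(np.allclose(loss_sparse, loss_from_logits))  # True
```

Because `kmlp` ends in a `Softmax` layer, its outputs are already probabilities, which matches the loss's default of `from_logits=False`.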

### Code Example

In [None]:
kmlp.summary()

_Explanation:_ Prints each layer's output shape and parameter count.

### Code Example

In [None]:
Model: "sequential_21"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten_14 (Flatten)        (None, 784)               0         
                                                                 
 dense_59 (Dense)            (None, 50)                39250     
                                                                 
 dense_60 (Dense)            (None, 100)               5100      
                                                                 
 dense_61 (Dense)            (None, 100)               10100     
                                                                 
 dense_62 (Dense)            (None, 100)               10100     
                                                                 
 dense_63 (Dense)            (None, 50)                5050      
                                                                 
 softmax_7 (Softmax)         (None, 50)                0         
                                                                 
=================================================================
Total params: 69,600
Trainable params: 69,600
Non-trainable params: 0
_________________________________________________________________

_Explanation:_ Output of `kmlp.summary()`. The last `Dense` layer and the `Softmax` report 50 outputs, not the 10 in the definition above, so this summary was recorded from a run with a 50-unit output layer. More than half of the parameters sit in the first `Dense` layer (39,250 of 69,600).
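
The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The short sketch below reproduces the recorded numbers; the helper name `dense_params` is made up for this example, and the layer sizes are read off the summary (including its 50-unit output layer).

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return (n_in + 1) * n_out

# Layer widths as recorded in the summary: 784 inputs, then each Dense layer.
layer_sizes = [784, 50, 100, 100, 100, 50]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

print(counts)       # [39250, 5100, 10100, 10100, 5050]
print(sum(counts))  # 69600
```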

### Code Example

In [None]:
start = time.time()
kmlp.fit(X_train, y_train, epochs = 10, validation_data=(X_val, y_val), verbose = True)
fit_time = time.time() - start

loss_and_metrics = kmlp.evaluate(X_test, y_test, batch_size=128)
print("Accuracy: {:.4f}".format(loss_and_metrics[1]))
print("Cross entropy: {:.4f}".format(loss_and_metrics[0]))
print("Time: {:.2f} s".format(fit_time))

_Explanation:_ Trains the MLP for 10 epochs while tracking validation metrics, times the call to `fit`, and then reports loss and accuracy on the test set. `evaluate` returns `[loss, accuracy]`, matching the loss and the metric passed to `compile`.

### Code Example

In [None]:
Epoch 1/10
1263/1263 [==============================] - 1s 939us/step - loss: 0.4638 - accuracy: 0.8555 - val_loss: 0.2037 - val_accuracy: 0.9393
Epoch 2/10
1263/1263 [==============================] - 1s 824us/step - loss: 0.1703 - accuracy: 0.9481 - val_loss: 0.2064 - val_accuracy: 0.9328
Epoch 3/10
1263/1263 [==============================] - 1s 815us/step - loss: 0.1203 - accuracy: 0.9635 - val_loss: 0.1892 - val_accuracy: 0.9428
Epoch 4/10
1263/1263 [==============================] - 1s 943us/step - loss: 0.0954 - accuracy: 0.9710 - val_loss: 0.1058 - val_accuracy: 0.9678
Epoch 5/10
1263/1263 [==============================] - 1s 896us/step - loss: 0.0789 - accuracy: 0.9748 - val_loss: 0.0962 - val_accuracy: 0.9714
Epoch 6/10
1263/1263 [==============================] - 1s 840us/step - loss: 0.0660 - accuracy: 0.9794 - val_loss: 0.1040 - val_accuracy: 0.9688
Epoch 7/10
1263/1263 [==============================] - 1s 862us/step - loss: 0.0575 - accuracy: 0.9820 - val_loss: 0.0958 - val_accuracy: 0.9719
Epoch 8/10
1263/1263 [==============================] - 1s 830us/step - loss: 0.0482 - accuracy: 0.9848 - val_loss: 0.0872 - val_accuracy: 0.9755
Epoch 9/10
1263/1263 [==============================] - 1s 913us/step - loss: 0.0432 - accuracy: 0.9861 - val_loss: 0.0914 - val_accuracy: 0.9736
Epoch 10/10
1263/1263 [==============================] - 1s 830us/step - loss: 0.0391 - accuracy: 0.9875 - val_loss: 0.1069 - val_accuracy: 0.9705
79/79 [==============================] - 0s 681us/step - loss: 1.9710 - accuracy: 0.8723
Accuracy: 0.8723
Cross entropy: 1.9710
Time: {} 11.293860912322998

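_Explanation:_ Training log for the MLP. Validation accuracy reaches about 0.97, but test accuracy is only 0.8723 with a cross-entropy of 1.9710. A gap this large is unusual for MNIST and may indicate that the test images were preprocessed differently from the training images in this run. The literal `{}` in the last line appears because the original run passed `fit_time` to `print` as a second argument instead of formatting it into the string.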

### Code Example

In [None]:
classes = kmlp.predict(X_test, batch_size = 128)
np.set_printoptions(precision=2)
print(classes[0:3])

_Explanation:_ `predict` returns one row of class probabilities per test image; the first three rows are printed with a precision of two decimal places.

### Code Example

In [None]:
79/79 [==============================] - 0s 743us/step

_Explanation:_ Progress output from `predict`: 79 batches of 128 cover the 10,000 test images.

### Code Example

In [None]:
[[3.35e-11 1.21e-07 2.33e-07 1.78e-07 9.16e-11 2.17e-09 5.04e-17 1.00e+00
  2.34e-09 5.00e-06 1.41e-12 5.35e-13 4.07e-12 1.91e-13 3.32e-12 5.24e-12
  3.87e-12 3.48e-12 8.44e-11 6.32e-12 8.36e-12 8.60e-12 7.89e-12 5.42e-13
  3.92e-11 8.03e-12 1.48e-12 2.35e-12 2.57e-13 5.38e-12 8.06e-11 1.28e-11
  4.58e-13 6.71e-11 1.30e-12 5.23e-11 1.55e-12 2.08e-12 5.68e-12 2.64e-12
  1.14e-12 3.59e-11 3.65e-11 2.29e-12 6.20e-12 2.53e-11 1.42e-11 1.94e-10
  2.68e-12 6.23e-11]
 [1.89e-11 5.07e-06 1.00e+00 1.60e-07 1.21e-12 3.04e-09 2.51e-10 1.33e-08
  2.91e-07 1.25e-14 1.25e-13 2.79e-12 2.83e-13 8.92e-14 3.38e-13 6.97e-14
  3.31e-13 2.53e-14 2.19e-12 1.34e-12 9.40e-13 3.58e-15 1.94e-12 3.74e-14
  1.78e-13 1.69e-12 3.19e-13 4.80e-13 1.33e-13 3.02e-13 5.00e-13 8.33e-14
  4.48e-13 1.64e-13 7.23e-14 1.02e-12 1.39e-13 2.58e-14 1.43e-12 3.29e-13
  5.61e-13 9.57e-12 1.49e-13 6.65e-14 3.85e-15 4.60e-14 1.85e-13 1.52e-13
  8.35e-13 1.86e-11]
 [7.76e-08 1.00e+00 2.06e-05 7.68e-07 1.41e-05 4.21e-06 2.34e-05 2.16e-04
  3.83e-05 1.88e-06 6.69e-08 1.01e-07 3.23e-07 1.56e-07 5.13e-08 3.25e-08
  1.86e-07 5.48e-08 5.13e-07 7.27e-07 4.90e-07 4.75e-08 5.67e-07 3.69e-08
  1.16e-06 3.40e-07 1.31e-07 1.52e-07 4.46e-07 3.72e-08 7.24e-08 1.10e-07
  1.03e-07 2.84e-07 3.04e-07 6.26e-07 6.77e-08 2.21e-08 1.03e-07 6.90e-08
  1.46e-07 2.69e-07 1.98e-07 1.59e-07 2.49e-08 1.15e-07 7.44e-08 3.58e-08
  9.07e-08 1.86e-06]]

_Explanation:_ The first three rows of predicted probabilities. Each row has 50 entries because this output came from a run with a 50-unit output layer; the probability mass is concentrated on indices 7, 2, and 1 respectively.

### Code Example

In [None]:
hard_preds = np.argmax(classes, axis= 1)
#hard_preds = kmlp.predict_classes(X_test)
print(hard_preds[0:3])

_Explanation:_ `np.argmax` along axis 1 converts each probability row into a hard class label. The commented-out `predict_classes` is an older Keras shortcut that recent releases no longer provide.

### Code Example

In [None]:
[7 2 1]

_Explanation:_ Predicted digits for the first three test images: 7, 2, and 1.

### Code Example

In [None]:
kconv = keras.models.Sequential(
    [
        Conv2D(filters = 32, kernel_size = (2, 2), padding= 'same', activation= 'relu', input_shape= (28, 28, 1)),
        MaxPool2D(pool_size=(2,2)),
        Conv2D(filters=64, kernel_size = (2, 2), padding= 'same', activation= 'relu'),
        Conv2D(filters=64, kernel_size = (2, 2), padding= 'same', activation= 'relu'),
        MaxPool2D(2),
        Conv2D(filters=128, kernel_size = (2, 2), padding= 'same', activation= 'relu'),
        Conv2D(filters=128, kernel_size = (2, 2), padding= 'same', activation= 'relu'),
        MaxPool2D(),
        Flatten(),
        Dense(units=128, activation="relu",
                          kernel_initializer="he_normal"),
        Dropout(0.5),
        Dense(units=64, activation="relu",
                          kernel_initializer="he_normal"),
        Dropout(0.5),
        Dense(units=10, activation="softmax")
])

_Explanation:_ Defines the convolutional model (`kconv`): three stages of 2×2 convolutions with `same` padding and ReLU (32, then 64, then 128 filters), each stage ending in 2×2 max pooling. The flattened features pass through two He-initialized dense layers with 50% dropout and a 10-way softmax output.
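
Max pooling itself is a simple operation: split the feature map into non-overlapping windows and keep the largest value in each. The NumPy sketch below assumes a single 2-D map, a square window, and a stride equal to the window size with no padding (the defaults of `MaxPool2D`); the helper name `max_pool2d` is made up for the example.

```python
import numpy as np

def max_pool2d(X, pool=2):
    """Non-overlapping max pooling over a single 2-D feature map."""
    h, w = X.shape
    out_h, out_w = h // pool, w // pool
    # Drop trailing rows/columns that do not fill a complete window.
    X = X[:out_h * pool, :out_w * pool]
    # Expose each window on its own pair of axes, then reduce over them.
    return X.reshape(out_h, pool, out_w, pool).max(axis=(1, 3))

X = np.arange(49, dtype=float).reshape(7, 7)  # toy 7x7 feature map
P = max_pool2d(X)

print(P.shape)  # (3, 3)
print(P[0, 0])  # 8.0, the maximum of the top-left window [[0, 1], [7, 8]]
```

The odd input size shows why a 7×7 map becomes 3×3 after pooling: the last row and column are discarded.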

### Code Example

In [None]:
kconv.summary()

_Explanation:_ Prints the layer-by-layer output shapes and parameter counts of `kconv`.

### Code Example

In [None]:
Model: "sequential_9"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d_20 (Conv2D)          (None, 28, 28, 32)        160       
                                                                 
 max_pooling2d_12 (MaxPoolin  (None, 14, 14, 32)       0         
 g2D)                                                            
                                                                 
 conv2d_21 (Conv2D)          (None, 14, 14, 64)        8256      
                                                                 
 conv2d_22 (Conv2D)          (None, 14, 14, 64)        16448     
                                                                 
 max_pooling2d_13 (MaxPoolin  (None, 7, 7, 64)         0         
 g2D)                                                            
                                                                 
 conv2d_23 (Conv2D)          (None, 7, 7, 128)         32896     
                                                                 
 conv2d_24 (Conv2D)          (None, 7, 7, 128)         65664     
                                                                 
 max_pooling2d_14 (MaxPoolin  (None, 3, 3, 128)        0         
 g2D)                                                            
                                                                 
 flatten_9 (Flatten)         (None, 1152)              0         
                                                                 
 dense_37 (Dense)            (None, 128)               147584    
                                                                 
 dropout_8 (Dropout)         (None, 128)               0         
                                                                 
 dense_38 (Dense)            (None, 64)                8256      
                                                                 
 dropout_9 (Dropout)         (None, 64)                0         
                                                                 
 dense_39 (Dense)            (None, 10)                650       
                                                                 
=================================================================
Total params: 279,914
Trainable params: 279,914
Non-trainable params: 0
_________________________________________________________________

_Explanation:_ Output of `kconv.summary()`. With `same` padding the convolutions keep the spatial size, while each pooling layer halves it (28 → 14 → 7 → 3, rounding down). The flattened vector has 3 × 3 × 128 = 1,152 features, and the first dense layer holds over half of the 279,914 parameters (147,584).
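
The convolution parameter counts and the shape progression in the summary follow from two small formulas: a `Conv2D` layer has `kh * kw * c_in` weights plus one bias per filter, and each 2×2 pooling step floor-divides the spatial size by 2. The sketch below reproduces the recorded numbers; the helper name `conv2d_params` is made up for the example.

```python
def conv2d_params(kh, kw, c_in, c_out):
    """Weights plus biases of a 2-D convolution layer."""
    return (kh * kw * c_in + 1) * c_out

# Channel counts through the five convolution layers (input has 1 channel).
channels = [1, 32, 64, 64, 128, 128]
counts = [conv2d_params(2, 2, a, b) for a, b in zip(channels, channels[1:])]
print(counts)  # [160, 8256, 16448, 32896, 65664]

# Spatial size after each of the three pooling layers.
size, sizes = 28, []
for _ in range(3):
    size = size // 2
    sizes.append(size)
print(sizes)  # [14, 7, 3]

# Length of the flattened vector fed to the first dense layer.
flat = sizes[-1] ** 2 * channels[-1]
print(flat)  # 1152
```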

### Code Example

In [None]:
kconv.compile(loss = keras.losses.SparseCategoricalCrossentropy(),
    optimizer = keras.optimizers.legacy.Adam(),
             metrics = ["accuracy"])

_Explanation:_ Compiles `kconv` with the same sparse categorical cross-entropy loss as the MLP, but with the Adam optimizer at its default settings.

### Code Example

In [None]:
start = time.time()
history = kconv.fit(X_train, y_train, epochs = 5, validation_data=(X_val, y_val), verbose = True)
fit_time = time.time() - start

loss_and_metrics = kconv.evaluate(X_test, y_test, batch_size=128)
print("Accuracy: {:.4f}".format(loss_and_metrics[1]))
print("Cross entropy: {:.4f}".format(loss_and_metrics[0]))
print("Time: {:.2f} s".format(fit_time))

_Explanation:_ Trains the CNN for 5 epochs, keeping the returned `History` object in `history`, and evaluates on the test set as before.

### Code Example

In [None]:
Epoch 1/5
1407/1407 [==============================] - 21s 15ms/step - loss: 0.4699 - accuracy: 0.8511 - val_loss: 0.0767 - val_accuracy: 0.9781
Epoch 2/5
1407/1407 [==============================] - 20s 14ms/step - loss: 0.1411 - accuracy: 0.9623 - val_loss: 0.0669 - val_accuracy: 0.9821
Epoch 3/5
1407/1407 [==============================] - 20s 14ms/step - loss: 0.1053 - accuracy: 0.9731 - val_loss: 0.0461 - val_accuracy: 0.9877
Epoch 4/5
1407/1407 [==============================] - 20s 14ms/step - loss: 0.0814 - accuracy: 0.9795 - val_loss: 0.0497 - val_accuracy: 0.9859
Epoch 5/5
1407/1407 [==============================] - 21s 15ms/step - loss: 0.0708 - accuracy: 0.9824 - val_loss: 0.0624 - val_accuracy: 0.9856
79/79 [==============================] - 1s 12ms/step - loss: 0.0398 - accuracy: 0.9889
Accuracy: 0.9889
Cross entropy: 0.0398
Time: {} 101.89986109733582

_Explanation:_ Training log for the CNN. Test accuracy is 0.9889 after 5 epochs, at a cost of roughly 102 s of training versus roughly 11 s for the 10-epoch MLP run. The literal `{}` in the last line appears because the original run passed `fit_time` to `print` as a second argument instead of formatting it into the string.

### Code Example

In [None]:
preds = np.argmax(kconv.predict(X_test), axis= 1)
print(np.mean(preds != y_test))
badX = X_test[preds != y_test,:,:]
preds_badX = np.argmax(kconv.predict(badX), axis= 1)
y_badX = y_test[preds != y_test]

_Explanation:_ Computes the test error rate as the fraction of images whose predicted class differs from the label, then collects the misclassified images (`badX`) together with their predicted and true labels for inspection.
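
Beyond the overall error rate, a confusion matrix shows which classes are mistaken for which. The sketch below builds one with NumPy on a handful of made-up labels; the helper name `confusion_matrix` and the toy arrays are assumptions for illustration and are not results from `kconv`.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # Add 1 at (true, predicted) for every sample, handling repeated pairs.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

# Made-up labels for a 3-class problem.
y_true = np.array([0, 1, 2, 2, 1, 0, 2])
y_pred = np.array([0, 2, 2, 2, 1, 0, 1])

cm = confusion_matrix(y_true, y_pred, 3)
error_rate = np.mean(y_pred != y_true)

print(cm)
print(error_rate)
```

Off-diagonal entries count the errors; with real predictions, the same call would take `y_test` and `preds` with `n_classes=10`.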

### Code Example

In [None]:
313/313 [==============================] - 2s 5ms/step
0.0111
4/4 [==============================] - 0s 5ms/step

_Explanation:_ Output of the misclassification cell: the error rate is 0.0111, i.e. 111 of the 10,000 test images are misclassified. The two progress bars are the prediction over the full test set (313 batches) and the prediction over the misclassified images (4 batches).

### Code Example

In [None]:
fig, axes =  plt.subplots(4, 6, figsize =(16, 10), subplot_kw={'xticks': (), 'yticks':()})

for ax, item, bp, y in zip(axes.ravel(), badX, preds_badX, y_badX):
    ax.imshow(item.reshape(28,28))
    #plt.gray()
    ax.set_title(f'pred: {bp}, true: {y}')

_Explanation:_ Plots the first 24 misclassified digits in a 4×6 grid, each titled with its predicted and true label; `zip` stops once the axes run out.

## 2. Visualization Example

In [None]:
# Example: visualize feature maps from a convolution layer
import tensorflow as tf
from tensorflow import keras
from keras import layers
import matplotlib.pyplot as plt
import numpy as np

# Random grayscale image
sample = np.random.rand(1, 28, 28, 1)

conv_layer = layers.Conv2D(4, (3,3), activation='relu', input_shape=(28,28,1))
output = conv_layer(sample)

fig, axes = plt.subplots(1, 4, figsize=(10,3))
for i in range(4):
    axes[i].imshow(output[0, :, :, i], cmap='viridis')
    axes[i].set_title(f'Feature Map {i+1}')
plt.show()

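Each feature map shows how strongly one filter responds at each spatial position of the input. Here the filters are randomly initialized and the input is random noise, so the maps illustrate the mechanics rather than learned features. In a trained network:
- Early layers tend to capture edges or color gradients.
- Deeper layers tend to detect more complex shapes and object parts.

Visualizing feature maps helps us interpret model behavior and debug feature learning.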