# **Chapter 14: Convolutional Neural Networks Implementation Guide**

## 1. Introduction to CNNs

**Convolutional Neural Networks (CNNs)** are a type of artificial neural network designed specifically for processing grid-like data such as images and videos. Key advantages:
- **Local connectivity**: Neurons connect only to local regions (receptive fields)
- **Parameter sharing**: Same weights used across spatial locations
- **Hierarchical features**: Learn from simple to complex patterns
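
To make the parameter-sharing advantage concrete, the sketch below compares the weight count of a 3×3 convolution with that of a dense layer producing the same number of outputs. The 28×28 grayscale input and the 32 filters are illustrative assumptions, chosen to match the basic model shown later in this chapter.

```python
# Assumed sizes, for illustration only: a 28x28 grayscale image and 32 filters of 3x3.
height, width, in_channels = 28, 28, 1
kernel, filters = 3, 32

# Convolution: one 3x3 kernel per (input channel, filter) pair, plus one bias per filter.
conv_params = kernel * kernel * in_channels * filters + filters

# Dense layer producing the same 26x26x32 outputs: every output unit sees every input pixel.
out_units = (height - kernel + 1) * (width - kernel + 1) * filters
dense_params = height * width * in_channels * out_units + out_units

print(f"Conv2D parameters: {conv_params:,}")
print(f"Dense parameters:  {dense_params:,}")
```

Because the same 3×3 weights are reused at every spatial position, the convolution's parameter count does not depend on the image size at all.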

### Biological Inspiration
CNNs are loosely based on the organization of the visual cortex, as described by Hubel & Wiesel (1959-1968).

## 2. Core CNN Components

### 2.1 Convolutional Layers
- Apply filters/kernels that detect patterns
- Output feature maps highlight where patterns occur
- Key parameters:
  - `filters`: Number of output channels
  - `kernel_size`: Spatial dimensions of filters (e.g., 3×3)
  - `strides`: Step size for sliding window
  - `padding`: 'valid' (no padding) or 'same' (zero-pad so the output keeps the input dimensions when `strides` is 1)
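
How these parameters interact is easiest to see from the output size they produce. The helper below is a minimal sketch (the function name `conv_output_size` is hypothetical, not a Keras API) that computes the output size along one spatial dimension using the usual 'valid' and 'same' conventions.

```python
import math

def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output size along one spatial dimension of a convolution."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # Zero padding is added so that only the stride shrinks the output.
        return math.ceil(input_size / stride)
    raise ValueError(f"unknown padding: {padding!r}")

print(conv_output_size(28, 3, stride=1, padding="valid"))  # 26
print(conv_output_size(28, 3, stride=1, padding="same"))   # 28
print(conv_output_size(32, 3, stride=2, padding="same"))   # 16
print(conv_output_size(32, 3, stride=2, padding="valid"))  # 15
```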

### 2.2 Pooling Layers
- Reduce spatial dimensions (downsampling)
- Types:
  - Max pooling: Takes maximum value in window
  - Average pooling: Takes average value in window
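
The two pooling types can be compared on a small array. This is a NumPy sketch rather than the Keras implementation; the helper `pool2d` is a hypothetical name and assumes non-overlapping windows on a single-channel input whose sides are divisible by the window size.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling on a 2D array whose sides are divisible by `size`."""
    h, w = x.shape
    # Split rows and columns into (block index, offset within block).
    windows = x.reshape(h // size, size, w // size, size)
    if mode == "max":
        return windows.max(axis=(1, 3))
    return windows.mean(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
max_pooled = pool2d(x, mode="max")
avg_pooled = pool2d(x, mode="average")

print(max_pooled)  # [[ 5.  7.] [13. 15.]]
print(avg_pooled)  # [[ 2.5  4.5] [10.5 12.5]]
```

Both variants halve each spatial dimension here; they differ only in how the four values in each window are summarized.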

In [1]:
# Import the libraries needed for computer vision and CNNs
import tensorflow as tf
from tensorflow.keras import layers

# Basic CNN architecture example
model = tf.keras.Sequential([
    # Explicit input layer (avoids the Keras warning about passing input_shape to a layer)
    tf.keras.Input(shape=(28, 28, 1)),

    # Convolutional block 1
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # Convolutional block 2
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),

    # Classifier head
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.summary()
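
The summary output is not reproduced here, so the following sketch works out the shapes and parameter counts of the model above by hand. It assumes the defaults used in that model ('valid' padding, stride 1, 2×2 pooling); the helper names are hypothetical.

```python
def conv_params(kernel, in_ch, out_ch):
    # kernel x kernel weights per (input channel, filter) pair, plus one bias per filter
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    # one weight per (input, output) pair, plus one bias per output
    return n_in * n_out + n_out

size = 28
size = size - 3 + 1              # Conv2D(32, 3x3, 'valid')  -> 26x26x32
conv1 = conv_params(3, 1, 32)
size = size // 2                 # MaxPooling2D(2x2)         -> 13x13x32
size = size - 3 + 1              # Conv2D(64, 3x3, 'valid')  -> 11x11x64
conv2 = conv_params(3, 32, 64)
size = size // 2                 # MaxPooling2D(2x2)         -> 5x5x64
flat = size * size * 64          # Flatten                   -> 1600
dense1 = dense_params(flat, 64)
dense2 = dense_params(64, 10)
total = conv1 + conv2 + dense1 + dense2

print(conv1, conv2, dense1, dense2, total)
```

Most of the parameters sit in the first dense layer, not in the convolutions.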


## 3. CNN Architectures

### 3.1 LeNet-5 (1998)
- First successful CNN architecture for digit recognition
- Key features:
  - Alternating convolutions and pooling
  - Tanh activation functions
  - Fully connected final layers
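
The features listed above can be sketched in Keras as follows. This is a simplified approximation, not the original network: it assumes a 32×32 grayscale input, plain average pooling, full connectivity in the second convolution, and a softmax output, whereas the 1998 design used trainable subsampling layers, a sparse connection table, and an RBF output layer.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Simplified LeNet-5-style model (illustrative sketch)
lenet5 = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 1)),
    layers.Conv2D(6, 5, activation="tanh"),    # -> 28x28x6
    layers.AveragePooling2D(2),                # -> 14x14x6
    layers.Conv2D(16, 5, activation="tanh"),   # -> 10x10x16
    layers.AveragePooling2D(2),                # -> 5x5x16
    layers.Flatten(),                          # -> 400
    layers.Dense(120, activation="tanh"),
    layers.Dense(84, activation="tanh"),
    layers.Dense(10, activation="softmax"),    # one output per digit
])
lenet5.summary()
```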

### 3.2 AlexNet (2012)
- Breakthrough ImageNet performance
- Innovations:
  - ReLU activation
  - Dropout regularization
  - Data augmentation
  - GPU acceleration
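
As an illustration of the data augmentation idea, the sketch below applies a random crop followed by a random horizontal flip, the two kinds of transformation AlexNet used on its training images. It is a NumPy approximation with assumed sizes (32×32 input, 28×28 crop); the function name is hypothetical.

```python
import numpy as np

def random_flip_and_crop(image, crop_size, rng):
    """Return a random square crop of `image` (H, W, C), mirrored half of the time."""
    h, w, _ = image.shape
    top = rng.integers(0, h - crop_size + 1)    # upper bound is exclusive
    left = rng.integers(0, w - crop_size + 1)
    patch = image[top:top + crop_size, left:left + crop_size]
    if rng.random() < 0.5:
        patch = patch[:, ::-1]                  # horizontal flip
    return patch

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # stand-in for a real photo
patch = random_flip_and_crop(image, crop_size=28, rng=rng)
print(patch.shape)  # (28, 28, 3)
```

Each call yields a slightly different view of the same image, which enlarges the effective training set without collecting new data.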

### 3.3 ResNet (2015)
- Introduced residual connections
- Enabled training of very deep networks (100+ layers)
- Key concept: Skip connections help gradient flow

In [4]:
import tensorflow as tf

class ResidualBlock(tf.keras.layers.Layer):
    def __init__(self, filters, strides=1, activation="relu", **kwargs):
        super().__init__(**kwargs)
        self.activation_fn = tf.keras.activations.get(activation)
        self.filters = filters
        self.strides = strides

        self.conv1 = tf.keras.layers.Conv2D(filters, 3, strides=strides, padding="same", use_bias=False)
        self.bn1 = tf.keras.layers.BatchNormalization()
        self.act1 = tf.keras.layers.Activation(activation)

        self.conv2 = tf.keras.layers.Conv2D(filters, 3, strides=1, padding="same", use_bias=False)
        self.bn2 = tf.keras.layers.BatchNormalization()

        self.skip_conv = None
        self.skip_bn = None

    def build(self, input_shape):
        # Add a 1x1 projection on the skip path when the channel count or spatial size changes
        if input_shape[-1] != self.filters or self.strides > 1:
            self.skip_conv = tf.keras.layers.Conv2D(self.filters, 1, strides=self.strides, padding="same", use_bias=False)
            self.skip_bn = tf.keras.layers.BatchNormalization()

    def call(self, inputs):
        Z = self.conv1(inputs)
        Z = self.bn1(Z)
        Z = self.act1(Z)

        Z = self.conv2(Z)
        Z = self.bn2(Z)

        skip_Z = inputs
        if self.skip_conv is not None:
            skip_Z = self.skip_conv(skip_Z)
            skip_Z = self.skip_bn(skip_Z)

        # Apply the configured activation (not a hard-coded ReLU) to the sum
        return self.activation_fn(Z + skip_Z)

# Example usage
inputs = tf.keras.Input(shape=(32, 32, 3))
x = ResidualBlock(64)(inputs)
x = ResidualBlock(128, strides=2)(x)
outputs = tf.keras.layers.GlobalAveragePooling2D()(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.summary()


## 4. Transfer Learning with Pretrained Models

### 4.1 Using Keras Applications
Leverage models pretrained on ImageNet:

In [6]:
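import numpy as np
import requests
import tensorflow as tf
from io import BytesIO
from PIL import Image
from tensorflow.keras.applications.resnet50 import (
    ResNet50, preprocess_input, decode_predictions)

# Assumption: ResNet50 pretrained on ImageNet, matching the model used in section 4.2
model = ResNet50(weights="imagenet")

# Download the image and resize it to the 224x224 input size ResNet50 expects
url = "https://upload.wikimedia.org/wikipedia/commons/9/99/Black_square.jpg"
response = requests.get(url)
img = Image.open(BytesIO(response.content)).convert("RGB").resize((224, 224))

# Convert to a float array, add a batch dimension, and apply ResNet50 preprocessing
x = tf.keras.utils.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print("Top predictions:", decode_predictions(preds, top=3)[0])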


Top predictions: [('n04152593', 'screen', np.float32(0.24428855)), ('n04404412', 'television', np.float32(0.23151615)), ('n03761084', 'microwave', np.float32(0.116227604))]


### 4.2 Fine-tuning for Custom Tasks
Adapt pretrained models to new datasets:

In [7]:
# Flower classification example
import tensorflow_datasets as tfds
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

# Load dataset
dataset, info = tfds.load("tf_flowers", as_supervised=True, with_info=True)
n_classes = info.features["label"].num_classes

# Preprocess function
def preprocess(image, label):
    image = tf.image.resize(image, [224, 224])
    image = preprocess_input(image)
    return image, label

# Prepare datasets
batch_size = 32
train_set = dataset["train"].map(preprocess).batch(batch_size).prefetch(1)

# Create model
base_model = ResNet50(weights="imagenet", include_top=False)
avg = tf.keras.layers.GlobalAveragePooling2D()(base_model.output)
output = tf.keras.layers.Dense(n_classes, activation="softmax")(avg)
model = tf.keras.Model(inputs=base_model.input, outputs=output)

# Freeze base model
for layer in base_model.layers:
    layer.trainable = False

# Compile the model with a loss function and optimizer
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Train the new classifier head on the flower images
model.fit(train_set, epochs=5)



Epoch 1/5
115/115 ━━━━━━━━━━━━━━━━━━━━ 29s 145ms/step - accuracy: 0.6644 - loss: 0.8919
Epoch 2/5
115/115 ━━━━━━━━━━━━━━━━━━━━ 10s 84ms/step - accuracy: 0.9161 - loss: 0.2581
Epoch 3/5
115/115 ━━━━━━━━━━━━━━━━━━━━ 10s 87ms/step - accuracy: 0.9488 - loss: 0.1809
Epoch 4/5
115/115 ━━━━━━━━━━━━━━━━━━━━ 10s 86ms/step - accuracy: 0.9642 - loss: 0.1397
Epoch 5/5
115/115 ━━━━━━━━━━━━━━━━━━━━ 10s 86ms/step - accuracy: 0.9734 - loss: 0.1129


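
The run above trains only the new classifier head while the base model stays frozen. A common follow-up, sketched here under stated assumptions, is to unfreeze the top of the base model and continue training with a much smaller learning rate. The number of unfrozen layers (30) and the learning rate (1e-5) are hypothetical choices, and `weights=None` is used only so the sketch runs without a download; in practice the ImageNet weights and the already-trained head would be kept.

```python
import tensorflow as tf

n_classes = 5  # tf_flowers has five flower classes

# Rebuild the same architecture as above (weights=None keeps this sketch offline).
base_model = tf.keras.applications.ResNet50(
    weights=None, include_top=False, input_shape=(224, 224, 3))
avg = tf.keras.layers.GlobalAveragePooling2D()(base_model.output)
output = tf.keras.layers.Dense(n_classes, activation="softmax")(avg)
model = tf.keras.Model(inputs=base_model.input, outputs=output)

# Phase 1 (as in the cell above): freeze every base layer and train the head.
for layer in base_model.layers:
    layer.trainable = False

# Phase 2: unfreeze only the last few base layers (hypothetical cut-off).
n_unfrozen = 30
for layer in base_model.layers[-n_unfrozen:]:
    layer.trainable = True

# Recompile so the change takes effect, with a low learning rate
# to avoid wrecking the pretrained weights.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

n_trainable = sum(layer.trainable for layer in base_model.layers)
print(f"{n_trainable} of {len(base_model.layers)} base layers are trainable")
# model.fit(train_set, epochs=5)  # continue training on the prepared dataset
```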

## 5. Advanced CNN Architectures

### 5.1 Inception Modules
- Use multiple filter sizes in parallel
- Efficient "network within network" design
- Reduces parameters while capturing multi-scale features
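
The parallel-branch idea can be sketched with the Keras functional API. This is an illustrative module, not a full Inception network: the helper name `inception_module`, its argument names, and the filter counts in the example are assumptions. The 1×1 convolutions placed before the 3×3 and 5×5 branches are what keep the parameter count down.

```python
import tensorflow as tf
from tensorflow.keras import layers

def inception_module(x, f1, f3_reduce, f3, f5_reduce, f5, pool_proj):
    """Four parallel branches whose outputs are concatenated along the channel axis."""
    conv = dict(padding="same", activation="relu")
    b1 = layers.Conv2D(f1, 1, **conv)(x)                # 1x1 branch
    b3 = layers.Conv2D(f3_reduce, 1, **conv)(x)         # 1x1 reduction, then 3x3
    b3 = layers.Conv2D(f3, 3, **conv)(b3)
    b5 = layers.Conv2D(f5_reduce, 1, **conv)(x)         # 1x1 reduction, then 5x5
    b5 = layers.Conv2D(f5, 5, **conv)(b5)
    bp = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    bp = layers.Conv2D(pool_proj, 1, **conv)(bp)        # pooling, then 1x1 projection
    return layers.Concatenate()([b1, b3, b5, bp])

inputs = tf.keras.Input(shape=(28, 28, 192))
outputs = inception_module(inputs, 64, 96, 128, 16, 32, 32)
inception_demo = tf.keras.Model(inputs, outputs)
print(inception_demo.output_shape)  # (None, 28, 28, 256)
```

All branches use 'same' padding and stride 1, so their outputs share the same spatial size and can be concatenated; the output has 64 + 128 + 32 + 32 = 256 channels.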

### 5.2 Xception Architecture
- Extreme version of Inception
- Depthwise separable convolutions
- More efficient computation
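
The efficiency gain of depthwise separable convolutions follows from simple arithmetic. The sketch below compares weight counts (biases omitted) for an assumed layer with 64 input channels, 128 output channels, and 3×3 kernels.

```python
# Assumed layer sizes, for illustration only
kernel, in_ch, out_ch = 3, 64, 128

# Standard convolution: every filter spans all input channels.
standard = kernel * kernel * in_ch * out_ch

# Depthwise separable convolution:
depthwise = kernel * kernel * in_ch   # one 3x3 spatial filter per input channel
pointwise = in_ch * out_ch            # 1x1 convolution that mixes channels
separable = depthwise + pointwise

print(standard, separable, round(standard / separable, 2))
```

For these sizes the separable version needs roughly 8 times fewer weights than the standard convolution.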

### 5.3 Attention Mechanisms
- Squeeze-and-Excitation Networks (SENet)
- Channel-wise attention for feature recalibration
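
Channel-wise recalibration can be sketched in a few lines of NumPy. This is a forward pass only, with random weights standing in for learned ones; the function name `se_block`, the reduction ratio of 4, and the tensor sizes are assumptions for illustration.

```python
import numpy as np

def se_block(feature_map, w1, b1, w2, b2):
    """Squeeze-and-excitation on a single (H, W, C) feature map."""
    squeezed = feature_map.mean(axis=(0, 1))              # squeeze: global average pool -> (C,)
    hidden = np.maximum(0.0, squeezed @ w1 + b1)          # excitation: bottleneck + ReLU
    scale = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))     # sigmoid gate per channel -> (C,)
    return feature_map * scale, scale                     # rescale each channel

rng = np.random.default_rng(0)
channels, ratio = 8, 4
feature_map = rng.normal(size=(5, 5, channels))
w1 = rng.normal(size=(channels, channels // ratio))
b1 = np.zeros(channels // ratio)
w2 = rng.normal(size=(channels // ratio, channels))
b2 = np.zeros(channels)

recalibrated, scale = se_block(feature_map, w1, b1, w2, b2)
print(recalibrated.shape, scale.round(2))
```

Each channel is multiplied by a single value between 0 and 1, so the block can emphasize informative channels and suppress others without changing the spatial layout.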

## 6. Computer Vision Tasks

### 6.1 Object Detection
- **YOLO (You Only Look Once)**: Fast real-time detection
- **Faster R-CNN**: High accuracy with region proposals
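
Detectors such as these are commonly evaluated, and their overlapping predictions filtered, using intersection over union (IoU) between boxes. The sketch below assumes boxes given as `(x1, y1, x2, y2)` corner coordinates; the function name is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Width and height are clamped at zero for boxes that do not overlap.
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap of 1 over union of 7
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes -> 1.0
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # disjoint boxes -> 0.0
```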

### 6.2 Semantic Segmentation
- Fully Convolutional Networks (FCNs)
- U-Net architecture with skip connections
- Transposed convolutions for upsampling
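
To show how a transposed convolution upsamples, here is a single-channel NumPy sketch without padding (the function name is hypothetical). Each input value stamps a scaled copy of the kernel onto the output, and the stamps are placed `stride` pixels apart.

```python
import numpy as np

def conv2d_transpose(x, kernel, stride):
    """Transposed convolution of a 2D input with a square kernel, no padding."""
    h, w = x.shape
    k = kernel.shape[0]
    out = np.zeros(((h - 1) * stride + k, (w - 1) * stride + k))
    for i in range(h):
        for j in range(w):
            # Overlapping stamps are summed.
            out[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * kernel
    return out

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])
upsampled = conv2d_transpose(x, np.ones((2, 2)), stride=2)
print(upsampled)
# [[1. 1. 2. 2.]
#  [1. 1. 2. 2.]
#  [3. 3. 4. 4.]
#  [3. 3. 4. 4.]]
```

With a 2×2 kernel of ones and stride 2 the stamps do not overlap, so the result is plain 2× nearest-neighbor upsampling; a learned kernel lets the network choose its own interpolation.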

## 7. Exercises

1. Implement a CNN from scratch for CIFAR-10 classification
2. Fine-tune a pretrained model on a custom dataset
3. Visualize CNN feature maps to understand what layers learn
4. Compare performance of different CNN architectures
5. Implement data augmentation for improved generalization

## 8. Key Takeaways

- CNNs excel at processing grid-like data through local connectivity and parameter sharing
- Modern architectures use techniques like residual connections and attention mechanisms
- Transfer learning is powerful for custom tasks with limited data
- Different architectures suit different tasks (classification, detection, segmentation)
- Proper preprocessing and augmentation are crucial for performance