Backpropagation in Modern Neural Networks

Techniques to Improve Learning

Transfer Learning (Using pre-trained models)

Understanding Backpropagation

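Backpropagation is the algorithm that computes the gradient of the loss with respect to every weight in a network. An optimizer such as gradient descent then uses those gradients to adjust the weights, which is how the network learns.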


Key Steps in Backpropagation

Forward Pass: Compute predictions.

Loss Calculation: Measure error (e.g., categorical crossentropy).

Backward Pass: Compute gradients using the chain rule.

Weight Update: Use gradients to adjust weights (via SGD, Adam, etc.).
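
The four steps above can be sketched on the smallest possible network, a single sigmoid neuron trained on one example. This is a minimal illustration in plain Python rather than how a framework implements it; the function names, the input values, and the learning rate are all assumptions chosen for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, y, lr):
    """One backpropagation step for y_hat = sigmoid(w * x + b)."""
    # 1. Forward pass: compute the prediction.
    z = w * x + b
    y_hat = sigmoid(z)
    # 2. Loss calculation: binary cross-entropy for a single example.
    loss = -(y * math.log(y_hat) + (1 - y) * math.log(1 - y_hat))
    # 3. Backward pass: chain rule. For sigmoid + cross-entropy,
    #    dL/dz simplifies to (y_hat - y).
    dz = y_hat - y
    dw = dz * x  # dL/dw = dL/dz * dz/dw
    db = dz      # dL/db = dL/dz * dz/db
    # 4. Weight update: plain gradient descent.
    return w - lr * dw, b - lr * db, loss

# Illustrative values (assumptions): one input x=2.0 with target y=1.0.
w, b = 0.0, 0.0
losses = []
for _ in range(50):
    w, b, loss = train_step(w, b, x=2.0, y=1.0, lr=0.5)
    losses.append(loss)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The loss recorded at each step is computed before the update, so the first value reflects the untrained weights and later values should shrink as the weights move toward the target.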

Modern Backpropagation Techniques

Adaptive Learning Rate Optimization
 
Instead of applying one fixed learning rate to every weight, optimizers like Adam scale each parameter's step size using running averages of its past gradients.

In [None]:
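import tensorflow as tf
from tensorflow.keras.optimizers import Adam

# Placeholder model (an assumption) so this cell runs on its own
model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation='softmax')])

# Adam adapts the step size for each parameter during training
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])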
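
To see what "adjusting learning rates dynamically" means, here is a sketch of the textbook Adam update for a single scalar parameter. It is not the Keras implementation, which may differ in details; `adam_step` is a hypothetical helper, and the default values simply mirror commonly used settings.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Textbook Adam update for one scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Two very different gradient magnitudes lead to nearly the same first step,
# because the update is normalized by the gradient's own scale.
steps = []
for grad in (100.0, 0.01):
    new_param, _, _ = adam_step(param=1.0, grad=grad, m=0.0, v=0.0, t=1)
    steps.append(1.0 - new_param)

print(steps)
```

Both steps come out close to the base learning rate, which is why Adam is less sensitive to gradient scale than plain SGD.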


Gradient Clipping

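Prevents exploding gradients by limiting gradient magnitude. With clipnorm, any gradient whose L2 norm exceeds the threshold is scaled down to that norm; clipvalue instead caps each individual gradient value.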

In [None]:
# clipnorm=1.0 rescales any gradient whose L2 norm exceeds 1.0
optimizer = Adam(learning_rate=0.001, clipnorm=1.0)
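
The idea behind norm-based clipping can be shown in a few lines of plain Python. This is a sketch of the concept only; `clip_by_norm` is a hypothetical helper written for this example, not a Keras function.

```python
import math

def clip_by_norm(grads, max_norm):
    """Rescale a gradient vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)  # small gradients pass through unchanged
    scale = max_norm / norm
    return [g * scale for g in grads]

# A gradient with norm 5 is scaled down to norm 1; its direction is preserved.
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)
unchanged = clip_by_norm([0.3, 0.4], max_norm=1.0)
print(clipped, unchanged)
```

Because the whole vector is scaled by one factor, clipping by norm keeps the update direction and only shortens the step.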


Batch Normalization

Normalizes each layer's activations over the current mini-batch, which stabilizes and often speeds up training.

In [None]:
from tensorflow.keras.layers import BatchNormalization

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu'),
    BatchNormalization(),
    tf.keras.layers.Dense(10, activation='softmax')
])
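
As a rough illustration of what the layer computes during training, the sketch below normalizes one feature across a mini-batch and then applies a scale and shift. It is a simplification: `batch_norm` is a hypothetical helper, `gamma` and `beta` stand in for the learned parameters, and the moving averages a real layer uses at inference time are omitted.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize one feature over a mini-batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

# Activations on very different scales end up centered near 0 with variance near 1.
out = batch_norm([10.0, 20.0, 30.0, 40.0])
out_mean = sum(out) / len(out)
out_var = sum((o - out_mean) ** 2 for o in out) / len(out)
print(out, out_mean, out_var)
```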

Learning Rate Scheduling

Reduces the learning rate over time so that later updates make smaller, finer adjustments.

In [None]:
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.01, decay_steps=1000, decay_rate=0.96)

optimizer = Adam(learning_rate=lr_schedule)
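
With the default non-staircase setting, exponential decay follows the formula `initial_learning_rate * decay_rate ** (step / decay_steps)`. The plain-Python sketch below reproduces that formula with the same values as the cell above; `exponential_decay` is a hypothetical helper written for this example.

```python
def exponential_decay(step, initial_lr=0.01, decay_steps=1000, decay_rate=0.96):
    """Learning rate after `step` optimizer updates (continuous decay)."""
    return initial_lr * decay_rate ** (step / decay_steps)

# The rate is multiplied by 0.96 for every 1000 steps taken.
schedule = {step: exponential_decay(step) for step in (0, 1000, 5000, 10000)}
for step, lr in schedule.items():
    print(f"step {step:>5}: lr = {lr:.6f}")
```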


Transfer Learning (Using Pre-trained Models)

Transfer learning lets us reuse a model pre-trained on a large dataset (such as VGG16 or ResNet) and adapt it to a new task.

Example: Using VGG16 for Cats vs. Dogs Classification

Instead of training from scratch, we use VGG16 trained on ImageNet.

Step 1: Load a Pre-trained Model (Without Top Layer)


In [None]:
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten

# Load pre-trained VGG16 model (without top layers)
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze base model layers (don't train them)
base_model.trainable = False


Step 2: Add Custom Layers

In [None]:
# Create a new model on top of VGG16
x = Flatten()(base_model.output)
x = Dense(128, activation='relu')(x)
x = Dense(1, activation='sigmoid')(x)  # Binary classification (Cat vs. Dog)

# Define final model
model = Model(inputs=base_model.input, outputs=x)
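
A quick parameter count shows why freezing the base model saves so much work. The arithmetic below assumes the shapes used in this example: a 224x224x3 input, for which the VGG16 convolutional base outputs a 7x7x512 feature map, and a base with 14,714,688 parameters.

```python
# Output of the VGG16 base (include_top=False) for a 224x224x3 input: 7 x 7 x 512.
flat_features = 7 * 7 * 512            # size of the Flatten() output

dense_128 = flat_features * 128 + 128  # weights + biases of Dense(128)
dense_1 = 128 * 1 + 1                  # weights + bias of Dense(1)

trainable_params = dense_128 + dense_1  # only the new head is trained
frozen_params = 14_714_688              # VGG16 convolutional base (frozen)

print(f"trainable: {trainable_params:,}")
print(f"frozen:    {frozen_params:,}")
```

Only the new head is updated during training, while the convolutional base acts as a fixed feature extractor.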


Step 3: Compile & Train

In [None]:
model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

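# Train on the new dataset. train_data and val_data are assumed to be
# prepared earlier as batches of 224x224 RGB images with 0/1 labels.
model.fit(train_data, epochs=5, validation_data=val_data)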


Summary

| **Technique** | **Purpose** | **Benefit** |
|---|---|---|
| **Gradient Clipping** | Prevents exploding gradients | Stabilizes deep networks |
| **Batch Normalization** | Normalizes activations | Faster training |
| **Learning Rate Scheduling** | Adjusts learning rate dynamically | Higher accuracy |
| **Transfer Learning** | Uses pre-trained models | Saves time, improves performance |