# ResNet and Big Data Computer Vision Simulations

**Course**: BCS-05-0839/2022  
**Presenter**: James Gitari  
**Support Groups**: 7, 8, 9

This interactive notebook allows you to run and explore the ResNet simulations step by step.

## Learning Objectives
- Understand skip connections and identity shortcuts in ResNet
- Compare standard vs residual blocks
- Explore transfer learning with pre-trained models
- Analyze the impact of big data on computer vision tasks

In [3]:
# Import required libraries
import os
import warnings

# Suppress warnings for cleaner output.
# TF_CPP_MIN_LOG_LEVEL must be set before TensorFlow is imported to take effect.
warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

from datetime import datetime

import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import tensorflow as tf
from tensorflow import keras

# Set random seeds for reproducibility
tf.random.set_seed(42)
np.random.seed(42)

print(f"TensorFlow version: {tf.__version__}")
print(f"GPU Available: {len(tf.config.list_physical_devices('GPU')) > 0}")
print(f"Execution started: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")


## Part 1: Understanding Residual Blocks

Let's start by implementing and comparing standard convolutional blocks with residual blocks.
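
As a minimal sketch of the difference between the two block types, the NumPy example below compares a plain block with a residual block. It uses dense layers instead of convolutions to stay short, and the function names `plain_block` and `residual_block` are illustrative only; they are not taken from `simulation4_residual_blocks`. With all weights set to zero, the plain block loses its input while the residual block passes it through the identity shortcut.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def plain_block(x, w1, w2):
    """Two stacked layers: y = relu(relu(x @ w1) @ w2)."""
    return relu(relu(x @ w1) @ w2)


def residual_block(x, w1, w2):
    """Same two layers plus an identity shortcut: y = relu(F(x) + x)."""
    fx = relu(x @ w1) @ w2
    return relu(fx + x)


rng = np.random.default_rng(42)
x = rng.random((4, 8))       # batch of 4 non-negative inputs, 8 features each
w_zero = np.zeros((8, 8))    # a block that has learned nothing yet

plain_out = plain_block(x, w_zero, w_zero)
residual_out = residual_block(x, w_zero, w_zero)

print("Plain block preserves input:   ", np.allclose(plain_out, x))
print("Residual block preserves input:", np.allclose(residual_out, x))
```

The residual block only has to learn a correction `F(x)` on top of the identity, which is one common explanation for why deeper residual stacks are easier to optimize.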

In [1]:
# Import our simulation code
import sys
sys.path.append('simulations')
sys.path.append('utils')

from simulation4_residual_blocks import ResidualBlockExperiment
from visualization_tools import VisualizationTools

# Initialize the experiment
residual_experiment = ResidualBlockExperiment(
    input_shape=(32, 32, 3), 
    num_blocks=6  # Reduced for faster execution in notebook
)

print("✅ Residual block experiment initialized")


In [None]:
# Create and compare the two types of networks
print("Building standard network...")
standard_model = residual_experiment.build_standard_network()

print("\nBuilding residual network...")
residual_model = residual_experiment.build_residual_network()

# Compare model summaries
print(f"\nStandard Network Parameters: {standard_model.count_params():,}")
print(f"Residual Network Parameters: {residual_model.count_params():,}")
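
When reading the two parameter counts, it may help to work out by hand what a shortcut costs. The sketch below assumes a hypothetical block made of two 3x3 convolutions with 64 channels; the real layer sizes inside `build_standard_network` and `build_residual_network` are not shown in this notebook, so the numbers are illustrative only.

```python
def conv2d_params(kernel_size, in_channels, out_channels, use_bias=True):
    """Number of weights in a square 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)


# Hypothetical block: two 3x3 convolutions, 64 channels in and out.
block_params = 2 * conv2d_params(3, 64, 64)

# An identity shortcut is an element-wise addition, so it adds no weights.
identity_shortcut_params = 0

# A projection shortcut (1x1 convolution) is only needed when the shape
# changes, for example when going from 64 to 128 channels.
projection_shortcut_params = conv2d_params(1, 64, 128)

print(f"Two 3x3 conv layers:     {block_params:,}")
print(f"Identity shortcut:       {identity_shortcut_params:,}")
print(f"1x1 projection shortcut: {projection_shortcut_params:,}")
```

If the two networks use the same layers and only identity shortcuts, their parameter counts should match; any difference would come from projection shortcuts or other architectural changes.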

In [None]:
# Train both models and compare (reduced epochs for notebook)
print("Training both models for comparison...")
results = residual_experiment.train_and_compare_models(
    epochs=10,  # Reduced for notebook
    batch_size=32
)

print("\n✅ Training completed!")

In [None]:
# Visualize the training comparison
residual_experiment.plot_training_comparison(save_path="results")

In [None]:
# Analyze gradient flow
residual_experiment.analyze_gradient_flow(save_path="results")
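
To build intuition for the gradient magnitude plots, here is a simplified NumPy sketch. It is not the method used by `analyze_gradient_flow`; it assumes purely linear layers with small random weights and compares the input-output Jacobian of a plain stack (a product of weight matrices) with that of a residual stack (a product of identity-plus-weight matrices).

```python
import numpy as np

rng = np.random.default_rng(0)
depth, width = 20, 16

# Small random weights, one matrix per layer (illustrative values only).
weights = [rng.normal(0.0, 0.05, size=(width, width)) for _ in range(depth)]

identity = np.eye(width)
plain_jacobian = identity.copy()
residual_jacobian = identity.copy()
for w in weights:
    # Plain layer:    x_next = W x        -> Jacobian factor W
    plain_jacobian = w @ plain_jacobian
    # Residual layer: x_next = x + W x    -> Jacobian factor (I + W)
    residual_jacobian = (identity + w) @ residual_jacobian

plain_norm = np.linalg.norm(plain_jacobian)
residual_norm = np.linalg.norm(residual_jacobian)

print(f"Plain stack gradient norm:    {plain_norm:.3e}")
print(f"Residual stack gradient norm: {residual_norm:.3e}")
```

In this toy setting the plain product shrinks toward zero as depth grows, while the identity term keeps the residual product at a usable scale. Real networks add nonlinearities and normalization, so treat this as an illustration rather than a prediction of the plots.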

### Analysis Questions for Simulation 4

**Answer these questions based on the results above:**

1. **Training Curves**: How do the training and validation curves differ between standard and residual networks?

2. **Gradient Flow**: What do the gradient magnitude plots tell us about how gradients flow through the networks?

3. **Final Performance**: Which network achieved better final accuracy and why?

4. **Convergence**: Which network converged more smoothly and quickly?

*Write your analysis in the markdown cell below:*

### Your Analysis Here:

**Training Curves Analysis:**
- [Write your observations about the training curves]

**Gradient Flow Analysis:**
- [Explain what the gradient magnitude plots show]

**Performance Comparison:**
- [Compare the final accuracy and explain the reasons]

**Key Insights:**
- [Summarize the main insights about residual connections]

## Part 2: Transfer Learning with Pre-trained ResNet

Now let's explore how a model pre-trained on a large dataset (ImageNet) can help with small-dataset problems through transfer learning.
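
The mechanics of feature extraction can be sketched without TensorFlow. In the NumPy example below, a fixed random projection stands in for the pre-trained backbone (the role ResNet50 plays in this notebook), and only a small logistic-regression head is trained. All names and sizes are hypothetical, and because the stand-in backbone is random, the sketch shows only the frozen-backbone, trainable-head workflow, not the benefit of ImageNet features.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a pre-trained backbone: fixed weights that are never updated.
backbone_w = rng.normal(size=(20, 64)) / np.sqrt(20)
backbone_snapshot = backbone_w.copy()


def extract_features(x):
    """Frozen backbone: forward pass only."""
    return np.maximum(x @ backbone_w, 0.0)


# Small toy dataset (200 samples, 2 classes).
x = rng.normal(size=(200, 20))
y = (x[:, 0] + x[:, 1] > 0).astype(float)

# Because the backbone is frozen, features can be computed once up front.
features = extract_features(x)

# Trainable head: logistic regression on top of the frozen features.
head_w = np.zeros(features.shape[1])
head_b = 0.0
learning_rate = 0.1
for _ in range(500):
    probs = 1.0 / (1.0 + np.exp(-(features @ head_w + head_b)))
    error = probs - y
    head_w -= learning_rate * features.T @ error / len(y)
    head_b -= learning_rate * error.mean()

predictions = (features @ head_w + head_b) > 0
accuracy = (predictions == (y == 1)).mean()

print(f"Training accuracy with frozen backbone: {accuracy:.2f}")
print(f"Backbone unchanged: {np.array_equal(backbone_w, backbone_snapshot)}")
print(f"Trainable parameters (head only): {head_w.size + 1}")
```

Fine-tuning would differ only in that the backbone weights are also updated, usually with a smaller learning rate.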

In [None]:
# Import transfer learning experiment
from simulation5_transfer_learning import TransferLearningExperiment

# Initialize the transfer learning experiment
transfer_experiment = TransferLearningExperiment(
    img_size=(224, 224),  # ResNet50 input size
    num_classes=2  # Cats vs Dogs
)

print("✅ Transfer learning experiment initialized")

In [None]:
# Run the complete transfer learning experiment
print("Running transfer learning experiment...")
print("This will compare three approaches:")
print("1. Baseline CNN (trained from scratch)")
print("2. Feature Extractor (frozen ResNet50)")
print("3. Fine-tuning (unfrozen ResNet50)")

transfer_results = transfer_experiment.train_all_models(
    epochs_stage1=6,  # Reduced for notebook
    epochs_stage2=3,  # Reduced for notebook
    batch_size=16  # Smaller batch size to reduce memory use
)

print("\n✅ Transfer learning experiment completed!")

In [None]:
# Generate comprehensive comparison plots
transfer_experiment.plot_comparison_results(save_path="results")

### Analysis Questions for Simulation 5

**Answer these questions based on the transfer learning results:**

1. **Performance Comparison**: How did the three approaches (baseline, feature extractor, fine-tuning) compare in terms of final accuracy?

2. **Parameter Efficiency**: How many trainable parameters did each approach have?

3. **Training Efficiency**: Which approach converged fastest and why?

4. **Big Data Impact**: How did ImageNet pre-training help with the cats vs dogs problem?

5. **When to Use Each**: Based on these results, when would you use feature extraction vs fine-tuning?

*Write your analysis in the markdown cell below:*

### Your Analysis Here:

**Performance Comparison:**
- [Compare the final test accuracies of all three approaches]

**Parameter Efficiency:**
- [Discuss the number of trainable parameters in each approach]

**Training Efficiency:**
- [Analyze convergence speed and training time]

**Big Data Impact:**
- [Explain how ImageNet pre-training benefited the small dataset]

**Strategic Insights:**
- [Provide guidelines on when to use each transfer learning strategy]

## Part 3: Visualization and Feature Analysis

Let's explore what the networks learned by visualizing their features.

In [None]:
# Initialize visualization tools
viz_tools = VisualizationTools(save_dir="results")

# Create a random placeholder image (noise, not a real photo) for visualization
sample_image = np.random.rand(224, 224, 3)

print("🎨 Visualization tools ready")
print("Available models for analysis:")
print("- Baseline CNN")
print("- Feature Extractor (ResNet50)")
print("- Fine-tuned ResNet50")

In [None]:
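# Let's examine the ResNet50 architecture.
# weights='imagenet' downloads the pre-trained weights on first use (requires internet access).
# include_top=False drops the 1000-class ImageNet classifier and keeps the convolutional base.
base_resnet = keras.applications.ResNet50(
    weights='imagenet',
    include_top=False,
    input_shape=(224, 224, 3)
)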

print("ResNet50 Architecture Summary:")
print(f"Total layers: {len(base_resnet.layers)}")
print(f"Total parameters: {base_resnet.count_params():,}")
print(f"Input shape: {base_resnet.input_shape}")
print(f"Output shape: {base_resnet.output_shape}")

# Show some layer information
print("\nSample layers:")
for i, layer in enumerate(base_resnet.layers[:10]):  # First 10 layers
    print(f"  {i}: {layer.name} ({type(layer).__name__})")
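
The output shape reported for the convolutional base can be checked by hand. ResNet50 halves the spatial resolution five times (the initial 7x7 convolution, the max-pooling layer, and the first block of the `conv3`, `conv4`, and `conv5` stages) and ends with 2048 channels. The short sketch below walks through that arithmetic; the dictionary keys are descriptive labels, not Keras layer names.

```python
input_size = 224

# Stages of ResNet50 that downsample by a factor of 2.
stride_2_stages = ["conv1 (7x7)", "max pool", "conv3 stage", "conv4 stage", "conv5 stage"]

size = input_size
for stage in stride_2_stages:
    size //= 2
    print(f"After {stage:<12}: {size} x {size}")

final_channels = 2048
expected_output_shape = (None, size, size, final_channels)
print(f"Expected output shape: {expected_output_shape}")
```

The total downsampling factor is 32, so a 224x224 input yields a 7x7 grid of 2048-dimensional feature vectors.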

## Part 4: Extension Activities

Try these additional experiments to deepen your understanding:

In [None]:
# Extension 1: Experiment with different network depths
print("Extension 1: Comparing different network depths")
print("Try modifying num_blocks in ResidualBlockExperiment to see how depth affects training")

# Extension 2: Try different pre-trained models
print("\nExtension 2: Other pre-trained models available:")
print("- VGG16: keras.applications.VGG16")
print("- InceptionV3: keras.applications.InceptionV3")
print("- DenseNet121: keras.applications.DenseNet121")

# Extension 3: Real dataset suggestions
print("\nExtension 3: Try with real datasets:")
print("- CIFAR-10: keras.datasets.cifar10")
print("- Fashion-MNIST: keras.datasets.fashion_mnist")
print("- Custom datasets from your own images")

## Summary and Key Takeaways

### Simulation 4: Residual Blocks
**Key Concepts Demonstrated:**
- Skip connections provide direct gradient paths
- Residual blocks mitigate vanishing gradient problems
- Identity mapping allows deeper networks to train effectively
- Residual networks typically show smoother training curves than plain networks of the same depth
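
The first two points can be made precise with a short derivation. Assume, as a simplification, that each block has the form $x_{l+1} = x_l + F(x_l, W_l)$ with an identity shortcut and no activation after the addition. Unrolling from block $l$ to a deeper block $L$ gives

$$x_L = x_l + \sum_{i=l}^{L-1} F(x_i, W_i)$$

and applying the chain rule to a loss $\mathcal{L}$ gives

$$\frac{\partial \mathcal{L}}{\partial x_l} = \frac{\partial \mathcal{L}}{\partial x_L}\left(1 + \frac{\partial}{\partial x_l}\sum_{i=l}^{L-1} F(x_i, W_i)\right)$$

The constant term $1$ carries the gradient from $x_L$ straight back to $x_l$, so the gradient does not vanish unless the second term happens to cancel it exactly.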

### Simulation 5: Transfer Learning
**Key Concepts Demonstrated:**
- Big data (ImageNet) provides powerful general features
- Transfer learning typically outperforms training from scratch when the target dataset is small
- Feature extraction is parameter-efficient: only the new classification head is trained
- Fine-tuning can further improve task-specific performance
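
To put a number on parameter efficiency, the sketch below compares trainable parameter counts for the two transfer learning strategies. The ResNet50 figures are the counts Keras reports for `include_top=False`. The head (global average pooling followed by a 2-unit dense layer) and full unfreezing during fine-tuning are assumptions; `TransferLearningExperiment` may use a different head or unfreeze only some layers.

```python
# Parameter counts reported by Keras for ResNet50 with include_top=False.
RESNET50_TOTAL = 23_587_712
RESNET50_NON_TRAINABLE = 53_120  # batch-norm moving means and variances

# Hypothetical head: global average pooling (no weights) + Dense(2) on 2048 features.
head_params = 2048 * 2 + 2

# Feature extraction: backbone frozen, only the head is trained.
feature_extraction_trainable = head_params

# Fine-tuning (assuming the whole backbone is unfrozen): backbone weights plus head.
fine_tuning_trainable = (RESNET50_TOTAL - RESNET50_NON_TRAINABLE) + head_params

print(f"Feature extraction trainable params: {feature_extraction_trainable:,}")
print(f"Fine-tuning trainable params:        {fine_tuning_trainable:,}")
print(f"Ratio: {fine_tuning_trainable / feature_extraction_trainable:,.0f}x")
```

Compare these estimates with the trainable parameter counts from your own `model.summary()` output before citing them in a report.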

### Assignment Deliverables Checklist
- [ ] **Objective**: Clear statement of goals for each simulation
- [ ] **Code**: Well-commented implementation (provided here)
- [ ] **Results**: Generated plots and quantitative metrics
- [ ] **Analysis**: Detailed discussion of findings and implications

### Report Writing Tips
1. **Explain the 'Why'**: Don't just describe what happened; explain why it happened
2. **Use Evidence**: Reference specific results from your plots and metrics
3. **Connect Theory**: Link your observations to the underlying theory
4. **Real-World Relevance**: Discuss implications for practical applications

### Questions for Further Investigation
1. How would results change with different dataset sizes?
2. What happens with very deep networks (100+ layers)?
3. How do different pre-trained models compare?
4. What's the minimum amount of data needed for effective transfer learning?

**Good luck with your analysis and reports!** 🎓