# Faster R-CNN and YOLOv3 Object Detection Assignment

This notebook contains a code analysis and implementation of the Faster R-CNN and YOLOv3 object detection models.

## 1. Faster R-CNN - Problem 2: Code Analysis

### Important Components of Faster R-CNN Implementation:

#### **Region Proposal Network (RPN)**
- **Location**: `data/faster_rcnn/model/resnet.py` - function `rpn(base_layers, num_anchors)`
- **Implementation**: The RPN slides a 3×3 convolution over the backbone feature map; 1×1 convolutions on top of it produce per-anchor outputs (the excerpt below shows the objectness head), from which region proposals are generated
- **Key Lines**:
  ```python
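  def rpn(base_layers, num_anchors):
      # Shared 3x3 convolution over the backbone feature map
      x = Convolution2D(512, (3, 3), padding='same', activation='relu', kernel_initializer='normal', name='rpn_conv1')(base_layers)
      # 1x1 convolution: one sigmoid objectness score per anchor at each location
      x_class = Convolution2D(num_anchors, (1, 1), activation='sigmoid', kernel_initializer='uniform', name='rpn_out_class')(x)
      # ... (remainder of the function is not shown in this excerpt)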
  ```
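
The `num_anchors` argument is the number of anchors placed at each feature-map location. As a minimal sketch of where that count comes from, the snippet below enumerates anchor shapes from a set of scales and aspect ratios. The scales and ratios are assumptions taken from the original Faster R-CNN paper, not values read from this repository's configuration, and `make_anchors` is a hypothetical helper.

```python
import math

def make_anchors(scales, ratios):
    """Return (width, height) for every scale/ratio combination.

    Each anchor keeps the area scale**2 while its width:height ratio varies.
    """
    anchors = []
    for scale in scales:
        for ratio_w, ratio_h in ratios:
            norm = math.sqrt(ratio_w * ratio_h)
            anchors.append((scale * ratio_w / norm, scale * ratio_h / norm))
    return anchors

# Assumed values (original Faster R-CNN paper); the repo's config may differ.
scales = [128, 256, 512]
ratios = [(1, 1), (1, 2), (2, 1)]

anchors = make_anchors(scales, ratios)
num_anchors = len(anchors)       # anchors per feature-map location
cls_channels = num_anchors       # one objectness score per anchor, as in rpn_out_class
# Standard Faster R-CNN regresses 4 offsets per anchor; that head is not in the excerpt.
regr_channels = num_anchors * 4
```

With three scales and three ratios this gives nine anchors per location, so the objectness head would emit nine channels.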

#### **ROI Pooling Implementation**
- **Location**: `data/faster_rcnn/model/RoiPoolingConv.py`
- **Implementation**: Custom Keras layer that implements ROI Pooling
- **Key Features**: 
  - Handles both channels_first and channels_last image data formats
  - Uses TensorFlow resize operations for efficient pooling
  - Converts ROI coordinates to cropped regions and applies max pooling
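
To make the crop-then-pool idea concrete, here is a dependency-free sketch of ROI max pooling on a single-channel feature map. It is not the code in `RoiPoolingConv.py`; `roi_max_pool` is a hypothetical helper, and it assumes the ROI lies inside the map and spans at least `pool_size` cells in each dimension.

```python
def roi_max_pool(feature_map, roi, pool_size):
    """Max-pool one ROI of a 2D feature map into a pool_size x pool_size grid.

    feature_map: list of rows (H x W), single channel for simplicity.
    roi: (x, y, w, h) in feature-map coordinates, with w, h >= pool_size.
    """
    x, y, w, h = roi
    pooled = []
    for i in range(pool_size):
        # Row range covered by output bin i
        y0 = y + (i * h) // pool_size
        y1 = y + ((i + 1) * h) // pool_size
        row = []
        for j in range(pool_size):
            # Column range covered by output bin j
            x0 = x + (j * w) // pool_size
            x1 = x + ((j + 1) * w) // pool_size
            row.append(max(feature_map[r][c]
                           for r in range(y0, y1)
                           for c in range(x0, x1)))
        pooled.append(row)
    return pooled

# Toy 6x6 feature map whose value at (row, col) is 6 * row + col
feature_map = [[r * 6 + c for c in range(6)] for r in range(6)]
pooled = roi_max_pool(feature_map, roi=(1, 1, 4, 4), pool_size=2)
```

Whatever the ROI's size, the output is always `pool_size × pool_size`, which is what lets the classifier head use fixed-size dense layers.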

#### **Backbone Network (ResNet)**
- **Location**: `data/faster_rcnn/model/resnet.py`
- **Implementation**: Uses ResNet-50 as the backbone feature extractor
- **Key Functions**:
  - `nn_base()`: Base ResNet-50 feature extraction
  - `classifier()`: Classification head that processes ROI features

#### **Loss Functions**
- **Location**: `data/faster_rcnn/model/losses.py`
- **Implementation**: Implements specialized loss functions for:
  - RPN classification loss (`rpn_loss_cls`)
  - RPN regression loss (`rpn_loss_regr`)
  - Final classifier loss (`class_loss_cls`, `class_loss_regr`)
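
As a rough illustration of the two loss types involved, the sketch below implements binary cross-entropy (for objectness) and smooth L1 (the regression loss used in the Faster R-CNN paper) in plain Python. The function names are hypothetical, and the versions in `losses.py` may additionally mask invalid anchors or apply weighting factors.

```python
import math

def smooth_l1(pred, target):
    """Smooth L1 loss averaged over elements: quadratic near 0, linear beyond 1."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        total += 0.5 * diff * diff if diff < 1.0 else diff - 0.5
    return total / len(pred)

def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)

regr_loss = smooth_l1([0.0, 2.0], [0.5, 0.0])
cls_loss = binary_cross_entropy([0.5], [1])
```

Smooth L1 is less sensitive to outlier box offsets than a plain squared error, which is the usual motivation for choosing it for box regression.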

#### **Training Pipeline**
- **Location**: `data/faster_rcnn/train.py`
- **Key Features**:
  - Both stages are trained together: each iteration updates the RPN, then the classifier on the resulting proposals
  - Data augmentation support (horizontal flips, vertical flips, 90° rotations)
- **Training Strategy**:
  1. Feed forward through network
  2. Generate ROI proposals using RPN
  3. Calculate IoU between proposals and ground truth
  4. Sample positive and negative examples for classifier
  5. Train both RPN and classifier in same iteration
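
Steps 3 and 4 can be sketched as follows: compute the IoU between each proposal and the ground-truth boxes, then label proposals by their best overlap. The thresholds (0.1 and 0.5) and the helper names are assumptions for illustration; the values used in `train.py` may differ.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def label_proposals(proposals, gt_boxes, min_overlap=0.1, max_overlap=0.5):
    """Label each proposal 'pos', 'neg' (background), or 'ignore' by best IoU."""
    labels = []
    for prop in proposals:
        best = max((iou(prop, gt) for gt in gt_boxes), default=0.0)
        if best >= max_overlap:
            labels.append("pos")
        elif best >= min_overlap:
            labels.append("neg")
        else:
            labels.append("ignore")
    return labels

gt_boxes = [(0, 0, 10, 10)]
proposals = [(0, 0, 10, 10), (5, 5, 15, 15), (50, 50, 60, 60)]
labels = label_proposals(proposals, gt_boxes)
```

Treating low-overlap proposals as background rather than discarding all of them gives the classifier hard negatives to learn from.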

#### **Prediction Pipeline**
- **Location**: `data/faster_rcnn/predict.py`
- **Process**: RPN → ROI extraction → Classification → NMS → Final detections

## 2. YOLOv3 - Problem 6: Code Analysis

### Important Components of YOLOv3 Implementation:

#### **Darknet Backbone**
- **Location**: `data/yolov3/yolo3/model.py` - function `darknet_body(x)`
- **Implementation**: Custom implementation of Darknet-53 backbone
- **Key Features**:
  ```python
  def darknet_body(x):
      x = DarknetConv2D_BN_Leaky(32, (3,3))(x)
      x = resblock_body(x, 64, 1)
      x = resblock_body(x, 128, 2)
      # ... further residual stages follow (52 convolutional layers in total)
  ```

#### **Multi-Scale Prediction**
- **Location**: `data/yolov3/yolo3/model.py` - function `yolo_body()`
- **Implementation**: Uses feature maps at 3 different scales (13×13, 26×26, 52×52 for a 416×416 input)
- **Key Innovation**: Concatenates features from earlier layers with upsampled features
  ```python
  # Multi-scale detection heads
  route1 = concatenate([ip, x], axis=-1)
  route2 = concatenate([ip2, x_small], axis=-1)
  ```
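
The feature-fusion step can be illustrated without any framework: upsample the coarse map by 2× with nearest-neighbour repetition, then concatenate it with the finer map along the channel axis. This is a toy sketch on nested lists with made-up sizes; the helper names are hypothetical, and the real model does this with Keras layers on much larger tensors.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of an H x W x C nested list."""
    out = []
    for row in fmap:
        wide = [list(px) for px in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append([list(px) for px in wide])              # repeat the row vertically
    return out

def concat_channels(a, b):
    """Concatenate two feature maps of equal H x W along the channel axis."""
    return [[pa + pb for pa, pb in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def shape(fmap):
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

coarse = [[[1.0, 2.0] for _ in range(2)] for _ in range(2)]       # 2 x 2 x 2 (deep layer)
fine = [[[0.0, 0.0, 0.0] for _ in range(4)] for _ in range(4)]    # 4 x 4 x 3 (earlier layer)
fused = concat_channels(upsample2x(coarse), fine)                 # 4 x 4 x 5
```

The fused map keeps the spatial resolution of the earlier layer while carrying the semantic features of the deeper one, which is what helps the finer detection scales.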

#### **YOLO Detection Head**
- **Location**: `data/yolov3/yolo3/model.py` - function `make_last_layers()`
- **Implementation**: Each detection head predicts:
  - Bounding box coordinates (4 values)
  - Objectness score (1 value)
  - Class probabilities (N classes)
- **Output channels per grid cell**: 3 anchors × (5 + N_classes), produced at each of the 3 scales
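
As a worked example of that arithmetic, the sketch below computes the output tensor shape at each scale and the total number of predicted boxes. It assumes a 416×416 input, 80 classes (COCO), and the usual YOLOv3 strides of 32, 16, and 8; `yolo_output_shapes` is a hypothetical helper.

```python
def yolo_output_shapes(input_size, num_classes, anchors_per_scale=3, strides=(32, 16, 8)):
    """Return (grid_h, grid_w, channels) for each detection scale."""
    shapes = []
    for stride in strides:
        grid = input_size // stride
        # Each anchor predicts 4 box values + 1 objectness score + class probabilities
        shapes.append((grid, grid, anchors_per_scale * (5 + num_classes)))
    return shapes

shapes = yolo_output_shapes(416, 80)
# One box per anchor per grid cell, summed over the three scales
total_boxes = sum(grid_h * grid_w * 3 for grid_h, grid_w, _ in shapes)
```

For these assumed settings each head emits 255 channels and the network proposes 10,647 candidate boxes per image before filtering.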

#### **Anchor Box System**
- **Location**: `data/yolov3/model_data/yolo_anchors.txt`
- **Implementation**: Uses pre-defined anchor boxes (9 total: 3 per scale)
- **Anchor dimensions**: Optimized for COCO dataset objects
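
A minimal sketch of loading the anchors and grouping them by scale is shown below. It assumes the file holds a single comma-separated line of width,height pairs and uses the anchor values published with YOLOv3 for COCO; the actual contents of `yolo_anchors.txt` in this repository are not shown here. The mask assigning the largest anchors to the coarsest grid is also an assumption.

```python
# Assumed file contents: the published YOLOv3 COCO anchors (width,height in pixels)
ANCHOR_TEXT = "10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326"

def parse_anchors(text):
    """Parse 'w,h, w,h, ...' into a list of (width, height) tuples."""
    values = [float(v) for v in text.split(",")]
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]

anchors = parse_anchors(ANCHOR_TEXT)

# Assumed assignment: largest anchors go to the coarsest (13x13) grid
ANCHOR_MASK = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
anchors_by_scale = [[anchors[i] for i in mask] for mask in ANCHOR_MASK]
```

Large anchors pair naturally with the coarse grid because each of its cells covers a larger region of the input image.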

#### **Non-Maximum Suppression**
- **Location**: `data/yolov3/yolo3/model.py` - function `yolo_eval()`
- **Implementation**: Discards low-scoring boxes, then applies IoU-based NMS separately for each class
- **Process**: Score threshold → per-class NMS (IoU-based) → Final detections
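
The filtering process can be sketched in plain Python as a score threshold followed by greedy NMS within each class. This is an illustration, not the code in `yolo_eval()`; the helper names and the threshold defaults (0.3 score, 0.45 IoU) are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh):
    """Greedy NMS: keep the best box, drop others that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep

def filter_detections(boxes, scores, classes, score_thresh=0.3, iou_thresh=0.45):
    """Return indices of detections that survive thresholding and per-class NMS."""
    kept = []
    for cls in sorted(set(classes)):
        idx = [i for i in range(len(boxes))
               if classes[i] == cls and scores[i] >= score_thresh]
        for k in nms([boxes[i] for i in idx], [scores[i] for i in idx], iou_thresh):
            kept.append(idx[k])
    return kept

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (0, 0, 10, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.2]
classes = [0, 0, 1, 0]
kept = filter_detections(boxes, scores, classes)
```

Running NMS per class means two overlapping boxes with different labels can both survive, as boxes 0 and 2 do here.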

#### **Prediction Pipeline**
- **Location**: `data/yolov3/yolo.py` - class `YOLO`
- **Key Methods**:
  - `detect_image()`: Processes individual images
  - `detect_video()`: Processes video files
  - `letterbox_image()`: Maintains aspect ratio during resizing
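
The aspect-ratio-preserving resize can be sketched as the arithmetic below: scale the image so it fits inside the target, then center it with padding. `letterbox_params` is a hypothetical helper that only computes the sizes; the repository's `letterbox_image()` also performs the actual resize and paste.

```python
def letterbox_params(image_size, target_size):
    """Compute the resized (w, h) and the top-left paste offset for letterboxing.

    image_size, target_size: (width, height) tuples.
    """
    iw, ih = image_size
    tw, th = target_size
    scale = min(tw / iw, th / ih)            # largest scale that still fits
    nw, nh = int(iw * scale), int(ih * scale)
    offset = ((tw - nw) // 2, (th - nh) // 2)  # center the image in the target canvas
    return (nw, nh), offset

new_size, offset = letterbox_params((832, 624), (416, 416))
```

For this example the image is halved to 416×312 and padded by 52 pixels above and below. Predicted boxes then need the same offset and scale undone to map them back to the original image.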

#### **Training Infrastructure**
- **Location**: `data/yolov3/train.py`
- **Features**:
  - Custom data generators for YOLO format
  - Data augmentation (random crops, colors, flips)
  - Learning rate scheduling
  - Transfer learning support from Darknet weights

## 3. Model Comparison

| Aspect | Faster R-CNN | YOLOv3 |
|--------|-------------|---------|
| **Architecture** | Two-stage (RPN + Classifier) | Single-stage |
| **Backbone** | ResNet-50 | Darknet-53 |
| **Prediction Time** | Slower (multiple steps) | Faster (single pass) |
| **Accuracy** | Generally higher | Slightly lower but very competitive |
| **Memory Usage** | Higher (ROI storage) | Lower |
| **Implementation Complexity** | More complex | Simpler |
| **Training** | Alternating RPN and classifier updates | End-to-end training |
| **Region Proposals** | Explicit RPN | Implicit anchor-based |

**Key Differences:**
- **Faster R-CNN**: Region proposal → feature extraction → classification cascade; typically more precise localization, which tends to help on small objects
- **YOLOv3**: Direct dense prediction with multi-scale feature fusion; faster inference, better suited to real-time applications

```python
# Import necessary libraries for testing
import os
import sys
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Add project paths
sys.path.append('data/faster_rcnn')
sys.path.append('data/yolov3')

print('Dependencies imported successfully!')
```

## 4. Setup and Configuration

### Project Structure:
```
r-cnn-yolo/
├── data/
│   ├── faster_rcnn/     # Faster R-CNN implementation
│   └── yolov3/          # YOLOv3 implementation
├── plots/                # Output results
├── src/                  # Source code
├── reports/              # Logs and analysis
├── main.py               # Main execution script
├── object-detection.ipynb # This notebook
└── requirements.txt      # Dependencies
```