# Computer Vision - Intermediate Level

Welcome to the Computer Vision intermediate tasks! This notebook contains three comprehensive tasks to test your understanding of CNNs, object detection, and image segmentation.

## Tasks Overview:
1. **Task 1: Image Classification with CNNs** - Build a CNN for image classification
2. **Task 2: Object Detection** - Implement object detection using pre-trained models
3. **Task 3: Image Segmentation** - Perform semantic segmentation

Please refer to `tasks.md` for detailed requirements for each task.

In [None]:
# TODO: Import necessary libraries
# You will need: numpy, matplotlib, tensorflow/keras or pytorch, opencv-python (imported as cv2), Pillow (imported as PIL), etc.
# Example:
# import numpy as np
# import matplotlib.pyplot as plt
# import tensorflow as tf
# from tensorflow import keras
# import cv2

---
## Task 1: Image Classification with CNNs

Build a Convolutional Neural Network to classify images from the CIFAR-10 or Fashion MNIST dataset.

**Requirements:**
- Load and preprocess the dataset
- Design a CNN architecture with at least 3 convolutional layers
- Train the model and achieve at least 70% accuracy
- Visualize training/validation loss and accuracy curves
- Display sample predictions with confidence scores

**Hints:**
- Use `keras.datasets.cifar10.load_data()` or `keras.datasets.fashion_mnist.load_data()`
- Normalize pixel values to the [0, 1] range (for 8-bit images, divide by 255)
- Use Conv2D, MaxPooling2D, Dropout, and Dense layers
- Consider using data augmentation for better results
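The normalization hint can be sketched with NumPy alone. The helper name `preprocess_images` and the random arrays are illustrative assumptions; they only stand in for the arrays that `load_data()` returns. The sketch also adds a channel axis to grayscale batches, since `Conv2D` layers expect inputs shaped `(height, width, channels)`.

```python
import numpy as np

def preprocess_images(images: np.ndarray) -> np.ndarray:
    """Scale uint8 pixels to float32 in [0, 1] and ensure a channel axis."""
    scaled = images.astype("float32") / 255.0
    if scaled.ndim == 3:  # grayscale batch (N, H, W) -> (N, H, W, 1)
        scaled = scaled[..., np.newaxis]
    return scaled

# Synthetic stand-ins with the same shapes as Fashion MNIST and CIFAR-10 batches.
rng = np.random.default_rng(0)
fake_fashion = rng.integers(0, 256, size=(8, 28, 28), dtype=np.uint8)
fake_cifar = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

x_fashion = preprocess_images(fake_fashion)
x_cifar = preprocess_images(fake_cifar)
print(x_fashion.shape, x_cifar.shape)  # (8, 28, 28, 1) (8, 32, 32, 3)
```

With the real data, the same function would be applied to both the training and the test images.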

In [None]:
# TODO: Step 1 - Load and preprocess the dataset
# Load your chosen dataset (CIFAR-10 or Fashion MNIST)
# Normalize the pixel values
# Use the train/test split that load_data() returns (optionally hold out part of the training set for validation)
# Visualize some sample images with their labels

In [None]:
# TODO: Step 2 - Build the CNN model
# Create a Sequential model with:
# - At least 3 Conv2D layers
# - MaxPooling2D layers after convolutions
# - Dropout layers to prevent overfitting
# - Flatten layer
# - Dense layers for classification
# Compile the model with appropriate optimizer, loss, and metrics
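
One possible architecture that meets the requirements above is sketched below. The filter counts, dropout rates, and the `build_cnn` helper name are assumptions chosen for illustration, not tuned values. The loss is `sparse_categorical_crossentropy` on the assumption that labels stay as integers, which is how `load_data()` returns them.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_cnn(input_shape=(32, 32, 3), num_classes=10):
    """Small CNN: three Conv2D/MaxPooling2D stages followed by a dense classifier."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer labels, not one-hot
        metrics=["accuracy"],
    )
    return model

model = build_cnn()  # use input_shape=(28, 28, 1) for Fashion MNIST
model.summary()
```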

In [None]:
# TODO: Step 3 - Train the model
# Train your model using model.fit()
# Use appropriate epochs, batch_size, and validation_split
# Store the training history for visualization

In [None]:
# TODO: Step 4 - Visualize training curves
# Plot training and validation loss
# Plot training and validation accuracy
# Use matplotlib to create side-by-side plots
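
A minimal sketch of the side-by-side plots follows. It assumes the dictionary has the keys `loss`, `val_loss`, `accuracy`, and `val_accuracy`, which is what `history.history` typically contains when a Keras model is compiled with `metrics=["accuracy"]` and trained with validation data. The numbers in `example_history` are made up for the demonstration.

```python
import matplotlib.pyplot as plt

def plot_history(history: dict):
    """Plot loss and accuracy curves side by side and return the figure."""
    epochs = range(1, len(history["loss"]) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

    ax_loss.plot(epochs, history["loss"], label="train")
    ax_loss.plot(epochs, history["val_loss"], label="validation")
    ax_loss.set_title("Loss")
    ax_loss.set_xlabel("Epoch")
    ax_loss.legend()

    ax_acc.plot(epochs, history["accuracy"], label="train")
    ax_acc.plot(epochs, history["val_accuracy"], label="validation")
    ax_acc.set_title("Accuracy")
    ax_acc.set_xlabel("Epoch")
    ax_acc.legend()

    fig.tight_layout()
    return fig

# Made-up values; with a real run, pass history.history from model.fit().
example_history = {
    "loss": [1.6, 1.2, 1.0],
    "val_loss": [1.4, 1.2, 1.1],
    "accuracy": [0.42, 0.57, 0.65],
    "val_accuracy": [0.50, 0.58, 0.61],
}
fig = plot_history(example_history)
```

A validation loss that rises while the training loss keeps falling is the usual sign of overfitting to look for in these plots.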

In [None]:
# TODO: Step 5 - Evaluate and display predictions
# Evaluate the model on test data
# Select random test images
# Display predictions with confidence scores
# Show original images with predicted and actual labels
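
For the confidence scores, one common approach is to take the largest softmax probability as the confidence of the predicted class. The sketch below assumes the model output is an array of shape `(num_images, num_classes)` with rows that sum to 1; the probabilities and the `describe_predictions` helper are illustrative, not output from a trained model.

```python
import numpy as np

# CIFAR-10 class names in label order (index 0 to 9).
CIFAR10_CLASSES = ["airplane", "automobile", "bird", "cat", "deer",
                   "dog", "frog", "horse", "ship", "truck"]

def describe_predictions(probs, class_names):
    """Return (predicted class name, confidence) for each row of probabilities."""
    probs = np.asarray(probs)
    predicted = probs.argmax(axis=1)   # index of the most likely class
    confidence = probs.max(axis=1)     # probability assigned to that class
    return [(class_names[i], float(c)) for i, c in zip(predicted, confidence)]

# Hand-written probabilities standing in for model.predict(x_test[:2]).
example_probs = [
    [0.02, 0.03, 0.05, 0.70, 0.05, 0.05, 0.04, 0.03, 0.02, 0.01],
    [0.10, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.02, 0.55, 0.03],
]
results = describe_predictions(example_probs, CIFAR10_CLASSES)
for name, conf in results:
    print(f"{name}: {conf:.0%}")
```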

---
## Task 2: Object Detection

Implement an object detection system using a pre-trained model (e.g., YOLO or Faster R-CNN).

**Requirements:**
- Load a pre-trained object detection model
- Process at least 5 different test images
- Draw bounding boxes around detected objects with labels
- Calculate and display confidence scores for each detection
- Handle multiple objects in a single image

**Hints:**
- Consider using TensorFlow Hub or PyTorch models
- COCO-pretrained models work well for common objects
- Use OpenCV for drawing bounding boxes
- Filter detections by confidence threshold (e.g., > 0.5)
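
Confidence filtering can be sketched independently of the chosen detector. The example assumes the model output has already been converted to NumPy arrays of boxes `(x1, y1, x2, y2)`, scores, and class ids; the exact output format depends on the model, and the values and the `filter_detections` helper here are hypothetical.

```python
import numpy as np

def filter_detections(boxes, scores, class_ids, threshold=0.5):
    """Keep only detections whose confidence score exceeds the threshold."""
    boxes = np.asarray(boxes)
    scores = np.asarray(scores)
    class_ids = np.asarray(class_ids)
    keep = scores > threshold  # boolean mask, one entry per detection
    return boxes[keep], scores[keep], class_ids[keep]

# Hypothetical raw detections for one image.
raw_boxes = [[40, 50, 160, 150], [10, 10, 30, 30], [200, 80, 280, 190]]
raw_scores = [0.92, 0.30, 0.61]
raw_class_ids = [1, 18, 1]

boxes, scores, class_ids = filter_detections(raw_boxes, raw_scores, raw_class_ids)
print(len(boxes), "detections kept")  # 2 detections kept
```

Raising the threshold removes more false positives at the cost of missing low-confidence objects, so it is worth trying a few values on the test images.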

In [None]:
# TODO: Step 1 - Load pre-trained object detection model
# Choose a model (YOLO, Faster R-CNN, SSD, etc.)
# Load the model from TensorFlow Hub, PyTorch, or other source
# Load COCO class labels if needed

In [None]:
# TODO: Step 2 - Load and preprocess test images
# Load at least 5 different test images
# Preprocess images according to model requirements
# Resize if needed

In [None]:
# TODO: Step 3 - Run object detection
# Run the model on each image
# Extract bounding boxes, class labels, and confidence scores
# Filter detections by confidence threshold

In [None]:
# TODO: Step 4 - Visualize results
# Draw bounding boxes on images using OpenCV or matplotlib
# Add labels and confidence scores to each detection
# Display all images with detections

---
## Task 3: Image Segmentation

Perform semantic segmentation on images using a pre-trained model or train a simple U-Net.

**Requirements:**
- Choose an appropriate dataset (e.g., Pascal VOC or a subset of Cityscapes)
- Implement or load a segmentation model
- Visualize the segmentation masks overlaid on original images
- Calculate IoU (Intersection over Union) scores
- Compare results on at least 3 different images

**Hints:**
- U-Net is a good architecture for segmentation
- Consider using pre-trained DeepLab models
- Use different colors for different classes in visualization
- IoU = (Area of Overlap) / (Area of Union)
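
For segmentation masks, the IoU formula above is usually applied per class by counting pixels. The sketch below assumes both masks are integer arrays of class ids with the same shape; the tiny 3x4 masks and the `per_class_iou` helper are made up for illustration. Classes that appear in neither mask get `nan` so that they do not distort the mean.

```python
import numpy as np

def per_class_iou(pred, target, num_classes):
    """IoU for each class id; nan where a class is absent from both masks."""
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union > 0:
            ious[c] = np.logical_and(pred_c, target_c).sum() / union
    return ious

# Toy ground truth and prediction with three classes (0, 1, 2).
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 2, 2]])
pred = np.array([[0, 0, 0, 1],
                 [0, 0, 1, 1],
                 [2, 2, 2, 1]])

ious = per_class_iou(pred, target, num_classes=3)
mean_iou = float(np.nanmean(ious))  # average over the classes that are present
print(ious, mean_iou)
```

In this toy case class 0 has an overlap of 4 pixels and a union of 5, giving an IoU of 0.8; classes 1 and 2 work out to 0.6 and 0.75.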

In [None]:
# TODO: Step 1 - Load dataset and model
# Choose a segmentation dataset (Pascal VOC, Cityscapes, etc.)
# Load or build a segmentation model (U-Net, DeepLab, etc.)
# Load pre-trained weights if available

In [None]:
# TODO: Step 2 - Preprocess images
# Load at least 3 test images
# Preprocess according to model requirements
# Prepare ground truth masks (needed for the IoU calculation in Step 4)

In [None]:
# TODO: Step 3 - Run segmentation
# Run the model on each image
# Get segmentation masks (pixel-wise class predictions)

In [None]:
# TODO: Step 4 - Calculate IoU scores
# Implement IoU calculation
# Compare predictions with ground truth masks
# Display the IoU score for each class and the overall mean IoU

In [None]:
# TODO: Step 5 - Visualize segmentation results
# Overlay segmentation masks on original images
# Use different colors for different classes
# Display side-by-side: original image, ground truth, prediction
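
One simple way to build the overlay is to map each class id to a color and alpha-blend the result with the image. The sketch assumes an RGB `uint8` image and an integer mask of the same height and width; the three-color `palette`, the `overlay_mask` helper, and the tiny arrays are illustrative assumptions.

```python
import numpy as np

def overlay_mask(image, mask, palette, alpha=0.5):
    """Blend a color-coded class mask onto an RGB image."""
    color_mask = palette[mask]  # (H, W) class ids -> (H, W, 3) colors
    blended = ((1 - alpha) * image.astype(np.float32)
               + alpha * color_mask.astype(np.float32))
    return blended.round().astype(np.uint8)

# One RGB color per class id: 0 = black, 1 = red, 2 = green.
palette = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0]], dtype=np.uint8)

# Uniform gray image and a mask with one column per class.
image = np.full((2, 3, 3), 100, dtype=np.uint8)
mask = np.array([[0, 1, 2],
                 [0, 1, 2]])

overlay = overlay_mask(image, mask, palette)
```

The original image, the ground truth overlay, and the prediction overlay can then each be passed to `imshow` on one of three matplotlib subplots for the side-by-side comparison.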