<a href="https://colab.research.google.com/github/amanzoni1/prove_varie/blob/main/back_proj.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project Overview

Instance segmentation is a core computer vision task with applications ranging from autonomous driving to augmented reality. This project builds an image processing pipeline that segments people from images and replaces the background with images of cities or tourist spots. Possible uses include photography, virtual backgrounds for video conferencing, and creative media.

We use the COCO (Common Objects in Context) dataset, restricted to the ‘person’ category, and a Mask R-CNN (Mask Region-based Convolutional Neural Network) model to segment individuals in images. The goal is output quality suitable for practical use.

## Key Objectives

1. **Develop and Train a Neural Network Model:**
   - Utilize a pre-trained Mask R-CNN model and fine-tune it for our specific task.

2. **Implement Instance Segmentation:**
   - Accurately segment people from images using advanced deep learning techniques.

3. **Background Replacement:**
   - Replace the original background with selected images of cities or tourist spots while maintaining the integrity of the foreground subject.

4. **Utilize the COCO Dataset:**
   - Work with a substantial subset of the COCO dataset containing images of people to train and validate our model effectively.

5. **Create an Interactive Pipeline:**
   - Develop a user-friendly interface within Colab for testing the model with custom images and backgrounds.

# Project Details

## Dataset

**COCO (Common Objects in Context) Dataset:**

- **Description:** A large-scale object detection, segmentation, and captioning dataset with more than 200,000 labeled images and 80 object categories.
- **Usage in Project:** We’ll focus on images containing the ‘person’ category. The 64,115 images in the `train2017` split that carry at least one ‘person’ annotation have been downloaded and stored in Google Drive for this project.
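
For orientation, the sketch below mimics the layout of a COCO instances file with a tiny hand-written dictionary and selects the images that contain a ‘person’ annotation. The sample records and the helper name `image_ids_with_category` are made up for illustration; the notebook itself relies on `pycocotools` for this step.

```python
# A tiny, hand-written stand-in for a COCO "instances" file (illustrative data only).
toy_coco = {
    "images": [
        {"id": 1, "file_name": "a.jpg", "width": 640, "height": 480},
        {"id": 2, "file_name": "b.jpg", "width": 640, "height": 480},
        {"id": 3, "file_name": "c.jpg", "width": 640, "height": 480},
    ],
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 18, "name": "dog"},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 100, 200]},
        {"id": 11, "image_id": 1, "category_id": 18, "bbox": [5, 5, 50, 40]},
        {"id": 12, "image_id": 2, "category_id": 18, "bbox": [0, 0, 30, 30]},
        {"id": 13, "image_id": 3, "category_id": 1, "bbox": [40, 40, 80, 160]},
    ],
}


def image_ids_with_category(dataset, category_name):
    """Return sorted IDs of images with at least one annotation of the named category."""
    # Resolve the category name to its numeric ID(s).
    cat_ids = {c["id"] for c in dataset["categories"] if c["name"] == category_name}
    # Collect every image referenced by an annotation of that category.
    return sorted(
        {a["image_id"] for a in dataset["annotations"] if a["category_id"] in cat_ids}
    )


person_image_ids = image_ids_with_category(toy_coco, "person")
print(person_image_ids)
```

Annotations reference images and categories by ID, so one image can appear under several categories. Filtering by ‘person’ therefore keeps images that also contain other objects.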

## Task

Develop a pipeline that can:

- **Segment individuals** in images with high accuracy.
- **Replace the background** while preserving the foreground subject’s details.
- **Maintain realistic blending** between the foreground and new background.
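
One common way to get a smooth transition between subject and new background is alpha compositing with a soft mask. The following is a minimal sketch, not the notebook's final implementation: the function name `composite` and the toy arrays are assumptions, and the mask is assumed to hold per-pixel values in [0, 1].

```python
import numpy as np


def composite(foreground, background, mask):
    """Blend foreground over background using a soft mask with values in [0, 1].

    foreground, background: uint8 arrays of shape (H, W, 3)
    mask: float array of shape (H, W); 1.0 means "person", 0.0 means "background"
    """
    # Add a trailing channel axis so the mask broadcasts over RGB.
    alpha = np.clip(mask.astype(np.float32), 0.0, 1.0)[..., None]
    blended = (
        alpha * foreground.astype(np.float32)
        + (1.0 - alpha) * background.astype(np.float32)
    )
    return np.round(blended).astype(np.uint8)


# Toy 2x2 example: white foreground, black background, graded mask.
fg = np.full((2, 2, 3), 255, dtype=np.uint8)
bg = np.zeros((2, 2, 3), dtype=np.uint8)
mask = np.array([[1.0, 0.5], [0.0, 1.0]])
out = composite(fg, bg, mask)
print(out[..., 0])
```

Pixels where the mask is 1 keep the foreground, pixels where it is 0 take the background, and intermediate values mix the two. Thresholding the mask to 0 or 1 first gives a hard cutout instead, which tends to leave visible edges.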

## Approach

1. **Exploratory Data Analysis (EDA):**
   - Understand the dataset’s structure and contents.
   - Visualize sample images and annotations to gain insights.

2. **Data Preparation:**
   - Implement a custom dataset class to load images and annotations.
   - Apply data transformations and augmentations to enhance model robustness.

3. **Model Setup:**
   - Initialize a pre-trained Mask R-CNN model.
   - Modify the model to suit our specific segmentation task.

4. **Model Training:**
   - Fine-tune the model using the prepared dataset.
   - Monitor training progress and optimize performance.

5. **Evaluation:**
   - Assess the model’s performance using appropriate metrics.
   - Visualize predictions to qualitatively evaluate segmentation quality.

6. **Background Replacement Pipeline:**
   - Develop functions to replace the background of segmented images.
   - Ensure seamless integration between the foreground and new background.

7. **Interactive Testing:**
   - Create an interface in Colab for users to upload images and select backgrounds.
   - Allow interactive testing of the segmentation and background replacement.
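
A detail worth handling in the data preparation step: COCO stores bounding boxes as `[x, y, width, height]`, while torchvision's detection models expect `[x_min, y_min, x_max, y_max]` with strictly positive width and height. The sketch below shows one way to convert and filter boxes. The helper names and the `min_size` threshold are illustrative assumptions, not part of the notebook.

```python
def coco_bbox_to_xyxy(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def valid_boxes(annotations, min_size=1.0):
    """Convert boxes to corner format and drop degenerate ones.

    A box is kept only if both its width and height are at least min_size
    (an assumed threshold; tune it for the dataset at hand).
    """
    boxes = []
    for ann in annotations:
        x_min, y_min, x_max, y_max = coco_bbox_to_xyxy(ann["bbox"])
        if x_max - x_min >= min_size and y_max - y_min >= min_size:
            boxes.append([x_min, y_min, x_max, y_max])
    return boxes


# Illustrative annotations: the second box has zero width and is dropped.
sample_annotations = [
    {"bbox": [10, 20, 100, 200]},
    {"bbox": [5, 5, 0, 40]},
]
print(valid_boxes(sample_annotations))
```

If boxes are filtered this way inside a dataset class, the matching masks and labels need to be filtered with the same condition so that the targets stay aligned.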

## Implementation

In [1]:
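# Install the COCO API before importing it
!pip install pycocotools

# Standard library
import os
import random
import time

# Numerics, plotting, image I/O
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# PyTorch and torchvision
import torch
import torchvision
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

# COCO annotations API and Colab Drive access
from pycocotools.coco import COCO
from google.colab import drive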



In [2]:
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
dataset_dir = '/content/drive/MyDrive/COCO_person_dataset'
annotations_dir = '/content/drive/MyDrive/COCO_annotations'

os.makedirs(dataset_dir, exist_ok=True)
os.makedirs(annotations_dir, exist_ok=True)

In [4]:
annotations_zip_path = '/content/annotations_trainval2017.zip'
annotations_file = os.path.join(annotations_dir, 'instances_train2017.json')

# Check if annotations already exist in Drive
if not os.path.exists(annotations_file):
    # Download annotations ZIP to Colab's local storage
    !wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip -O {annotations_zip_path}

    # Unzip the specific annotation file to Google Drive
    !unzip -j {annotations_zip_path} annotations/instances_train2017.json -d {annotations_dir}

    # Remove the ZIP file from Colab's local storage
    os.remove(annotations_zip_path)
else:
    print("Annotations already exist in Google Drive.")

--2024-09-17 05:14:07--  http://images.cocodataset.org/annotations/annotations_trainval2017.zip
Resolving images.cocodataset.org (images.cocodataset.org)... 16.182.64.185, 3.5.28.143, 52.217.133.209, ...
Connecting to images.cocodataset.org (images.cocodataset.org)|16.182.64.185|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 252907541 (241M) [application/zip]
Saving to: ‘/content/annotations_trainval2017.zip’


2024-09-17 05:14:10 (92.2 MB/s) - ‘/content/annotations_trainval2017.zip’ saved [252907541/252907541]

Archive:  /content/annotations_trainval2017.zip
  inflating: /content/drive/MyDrive/COCO_annotations/instances_train2017.json  


In [6]:
# Path to annotations file in Google Drive
annotations_file = os.path.join(annotations_dir, 'instances_train2017.json')

# Initialize COCO API
coco = COCO(annotations_file)

loading annotations into memory...
Done (t=24.94s)
creating index...
index created!


In [7]:
# Get all category IDs
cat_ids = coco.getCatIds()

# Get all image IDs
img_ids = coco.getImgIds()

print(f"Number of categories: {len(cat_ids)}")
print(f"Number of images: {len(img_ids)}")

Number of categories: 80
Number of images: 118287


In [8]:
# Get category ID for 'person'
person_cat_id = coco.getCatIds(catNms=['person'])[0]

# Get all image IDs containing the 'person' category
img_ids = coco.getImgIds(catIds=[person_cat_id])
print(f"Total images with 'person' annotations: {len(img_ids)}")

Total images with 'person' annotations: 64115


In [9]:
# Filenames already downloaded to the dataset directory on Drive
downloaded_filenames = set(os.listdir(dataset_dir))

# Map each downloaded filename to its COCO image ID
filename_to_img_id = {
    img_info['file_name']: img_info['id']
    for img_info in coco.loadImgs(img_ids)
    if img_info['file_name'] in downloaded_filenames
}

# Keep only the image IDs whose files are present
img_ids = list(filename_to_img_id.values())
print(f"Number of images available for training: {len(img_ids)}")

Number of images available for training: 64115
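
Since the model is to be both trained and validated on these images, the ID list has to be split at some point. The following is one possible sketch of a reproducible split. The function name `split_ids`, the 10% validation fraction, and the seed are assumptions chosen for illustration, and a stand-in list replaces the real `img_ids`.

```python
import random


def split_ids(ids, val_fraction=0.1, seed=42):
    """Shuffle a copy of ids with a fixed seed and split it into (train, val)."""
    shuffled = list(ids)
    # A dedicated Random instance keeps the split independent of global RNG state.
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Stand-in IDs; in the notebook this would be the filtered img_ids list.
example_ids = list(range(100))
train_ids, val_ids = split_ids(example_ids)
print(len(train_ids), len(val_ids))
```

Fixing the seed means the same images land in the validation set on every run, which keeps metrics comparable across training sessions.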
