# P0: Alohomora!

# Part2: Color Classification using a Single Gaussian

## Table of Contents

1. Introduction
2. Preliminaries
3. Software Setup
4. Implementation
5. Grading
6. Report Guidelines

## 1. Introduction

In this assignment, you will learn how to segment objects or classes based on their color properties.

You will perform color segmentation to identify and classify the following objects:

- **Class 0**: Green cap (Smallest one)
- **Class 1**: Yellow cap
- **Class 2**: Blue cap
- **Class 3**: Red cap (Largest one)

A sample output is provided below. 

![Blue Cap](./artefacts/sample.png)

You are provided with 7 RGB images containing 4 distinct classes of objects. Your task is to perform color segmentation using Gaussian models that represent the probability distribution of each class.

**Output Requirements**

Include a 7x5 image grid in your report. Each row should correspond to one of the original images and its respective segmentation results, where a mask is applied to highlight the identified objects. Refer to the sample output for the expected format.

## 2. Preliminaries

### RGB Color Space

RGB stands for Red, Green, and Blue, the primary colors of light. In the RGB color space, a color is represented as a combination of these three components, each with its own intensity. In 8-bit digital images, each component (R, G, B) ranges from 0 to 255.

- **Red**: Controls the intensity of the red color in the pixel.
- **Green**: Controls the intensity of the green color.
- **Blue**: Controls the intensity of the blue color.

By mixing different levels of these three colors, a wide range of colors can be created. For example, pure red is represented as (255, 0, 0), while white is (255, 255, 255), and black is (0, 0, 0).

RGB is the most common color space used in digital imaging, displays, and cameras. It's the basis for how colors are displayed on screens, where each pixel is made up of tiny red, green, and blue subpixels.
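As a small illustration of how these values are stored in code, the sketch below builds a three-pixel image with NumPy (an assumption; the assignment only names OpenCV). Note that OpenCV's `cv2.imread` returns channels in B, G, R order, so the channel axis usually has to be reversed to get RGB.

```python
import numpy as np

# A tiny 1x3 RGB image: one pure-red, one white, and one black pixel.
image_rgb = np.array([[[255, 0, 0], [255, 255, 255], [0, 0, 0]]], dtype=np.uint8)

# Shape is (height, width, channels); the last axis holds R, G, B.
height, width, channels = image_rgb.shape

# Reversing the channel axis converts between RGB and the B, G, R order
# that cv2.imread produces (the same operation works in both directions).
image_bgr = image_rgb[..., ::-1]

red_pixel = image_rgb[0, 0]
print(red_pixel)         # R, G, B components of the first pixel
print(image_bgr[0, 0])   # the same pixel in B, G, R order
```

Whichever order you pick, use it consistently for both training and segmentation.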

### Color Segmentation using a Single Gaussian

In this assignment, you'll explore how to segment objects in images based on their color using a probabilistic approach. Instead of simply stating that a pixel is a certain color (like red or green), you'll learn how to assign probabilities to each color. This method is more accurate in real-world scenarios where sensor noise and lighting conditions can cause color variations.

We approach color classification as a machine learning problem, where each pixel is assigned a probability of belonging to a specific color class (e.g., the likelihood that a pixel is part of a green cap). Essentially, given a pixel's color (R, G, B), we estimate the probability that it belongs to a particular class.

Given a class, we can estimate the distribution of its RGB values, $p(x | C_l)$, directly from the training data. What we actually need is the reverse: the probability that a pixel with a particular RGB value belongs to that class, $p(C_l | x)$. Bayes' rule connects the two. Mathematically,

![Bayes](./artefacts/bayes.png)



Where,
- $x$ - vector containing the R, G, B values of a single pixel
- $p(C_l | x)$ - probability that a given pixel $x$ belongs to class $C_l$
- $p(x | C_l)$ - probability that class $C_l$ has $x$ as its RGB value
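
For reference, Bayes' rule in this notation takes the standard textbook form below. The prior $p(C_l)$ and the sum over all classes $C_k$ in the denominator are not defined in the list above; they are written here on the assumption that the figure uses the usual form.

$$
p(C_l \mid x) = \frac{p(x \mid C_l)\, p(C_l)}{\sum_{k} p(x \mid C_k)\, p(C_k)}
$$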

If we're trying to segment out an orange fruit, we start by understanding the typical range of RGB values for the color orange. We also take into account any prior information. For example, if we know the image was taken on Mars, the chance of finding an orange fruit is practically zero. Thus, even if the color suggests otherwise, we can conclude that the object is not an orange fruit.

1. In the provided training images (found in the data folder), the probability of finding each of the four classes is assumed to be equal.

2. We also apply a confidence threshold to classify a pixel. If the probability of a pixel being part of the orange fruit, given its RGB value, is higher than a set threshold (e.g., 0.01), we can confidently classify that pixel as belonging to the orange fruit.

Because of 1 and 2, given that the threshold we set is tunable, we can write,

![Bayes Approximated](./artefacts/bayes_approx.png)

![Final Equation](./artefacts/final_eqn.png)

Here, τ is a user-chosen threshold that sets how confident the model must be before a pixel is assigned to the class.
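
Written out, one reading of this simplification is the following. With equal priors (point 1), the posterior is treated as proportional to the likelihood, and any remaining scale factor is assumed to be absorbed into the tunable threshold τ (point 2). This is a sketch of the rule under those assumptions, not a transcription of the figures.

$$
p(C_l \mid x) \propto p(x \mid C_l), \qquad \text{assign pixel } x \text{ to } C_l \ \text{ if } \ p(x \mid C_l) > \tau
$$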

To complete this task, we need to choose a probability distribution for $p(x | C_l)$. The simplest choice is a Gaussian distribution, which is described by just two parameters: the mean and the covariance.

For scalar data, the mean and the variance are both scalars. For our RGB data, however, the mean is a 3-element vector holding the average R, G, and B values, and the covariance is a 3x3 matrix that captures the spread of each channel and the correlation between channels.

Mathematically, the mean and covariance for each class are calculated by evaluating the following expressions over the training-image pixels that belong to that class.

![Mean](./artefacts/mean.png)

![Covariance](./artefacts/covariance.png)
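
A minimal sketch of this estimation step, assuming the class pixels have already been collected into an `(N, 3)` NumPy array. The function name `fit_gaussian` and the synthetic samples are illustrative only. Whether the covariance divides by N or N-1 depends on the formula in the figure; the sketch uses N and notes how to switch.

```python
import numpy as np

def fit_gaussian(pixels):
    """Return the mean (3,) and covariance (3, 3) of an (N, 3) array of RGB samples."""
    pixels = np.asarray(pixels, dtype=np.float64)
    mean = pixels.mean(axis=0)
    # rowvar=False: each row is one observation, each column one channel.
    # bias=True divides by N; drop it to get the N-1 (unbiased) estimate.
    covariance = np.cov(pixels, rowvar=False, bias=True)
    return mean, covariance

# Synthetic stand-in for the pixels selected for one class.
rng = np.random.default_rng(0)
samples = rng.normal(loc=[200.0, 180.0, 40.0], scale=[10.0, 12.0, 8.0], size=(5000, 3))
mean, covariance = fit_gaussian(samples)
```

With a few thousand training pixels per class, the difference between the two normalizations is negligible.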

The probability density function for a particular class (say orange) is given by,

![Pdf](./artefacts/pdf.png)
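
For reference, the standard multivariate Gaussian density for 3-dimensional data is shown below. The symbols $\mu_l$ and $\Sigma_l$ for the class mean and covariance are assumed here and may differ from the notation in the figure.

$$
p(x \mid C_l) = \frac{1}{\sqrt{(2\pi)^3 \, |\Sigma_l|}} \exp\!\left( -\tfrac{1}{2} (x - \mu_l)^\top \Sigma_l^{-1} (x - \mu_l) \right)
$$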

**TLDR**
1. Estimate the mean and covariance matrix for each class after extracting the RGB values from each image corresponding to that particular class.
2. Compute the probability density function for every pixel in the provided images. Then make a binary mask using a threshold (tunable).
3. Apply the binary mask on the provided images to segment objects of each class separately.
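
Steps 2 and 3 can be sketched as follows, assuming a per-pixel likelihood map has already been computed. The function name `segment`, the toy arrays, and the threshold value are illustrative, not prescribed by the assignment.

```python
import numpy as np

def segment(image, likelihood, threshold):
    """Zero out every pixel whose likelihood does not exceed the threshold."""
    mask = likelihood > threshold                      # boolean (H, W) mask
    # Broadcast the mask over the channel axis to blank rejected pixels.
    segmented = image * mask[..., np.newaxis].astype(image.dtype)
    return mask, segmented

# Toy 2x2 image and a hand-written likelihood map for one class.
image = np.full((2, 2, 3), 200, dtype=np.uint8)
likelihood = np.array([[1e-3, 1e-9], [1e-8, 5e-4]])
mask, segmented = segment(image, likelihood, threshold=1e-6)
```

Repeating this once per class, with that class's own likelihood map and threshold, gives one masked image per class.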

## 3. Software Setup

There is no starter code provided for this project. Implement your solution in Python using OpenCV.






## 4. Implementation (Pseudocode for Python)

# Training Code

Rerun this code for each class separately

Initialize an empty list/matrix to store RGB values from all images

For each image index from 1 to 7:
    - Read the image from the file path
    - Display the image (imshow)

    - Draw a freehand shape on the image and create a mask for this shape (`roipoly` in MATLAB, find or use any function for this in Python)

    - Extract the RGB channels from the image
    - Apply the mask to get RGB values from the selected area

    - Combine the RGB values into a single list
    - Append this list to the main list of RGB values

Estimate the mean of the RGB values
Estimate the covariance of the RGB values
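
One possible stand-in for `roipoly` is sketched below: a NumPy-only polygon mask built with the even-odd rule. The function name `polygon_mask` and the hard-coded vertices are hypothetical; in practice the vertices would come from mouse clicks, and OpenCV's `cv2.fillPoly` can rasterize the same polygon.

```python
import numpy as np

def polygon_mask(shape, vertices):
    """Boolean (H, W) mask of pixels inside a polygon, using the even-odd rule.

    `vertices` is a list of (x, y) corner points, e.g. collected from mouse clicks.
    """
    height, width = shape
    ys, xs = np.mgrid[0:height, 0:width]
    inside = np.zeros((height, width), dtype=bool)
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through each pixel?
        crosses = (y1 > ys) != (y2 > ys)
        # Horizontal edges divide by zero; they never cross, so the
        # resulting inf/nan values are masked out by `crosses`.
        with np.errstate(divide="ignore", invalid="ignore"):
            x_at_y = x1 + (ys - y1) * (x2 - x1) / (y2 - y1)
            left_of_edge = xs < x_at_y
        # Toggle pixels that lie to the left of the crossing point.
        inside ^= crosses & left_of_edge
    return inside

image = np.zeros((10, 10, 3), dtype=np.uint8)
image[2:6, 2:6] = (30, 200, 40)            # a small green patch
mask = polygon_mask(image.shape[:2], [(2, 2), (6, 2), (6, 6), (2, 6)])
class_pixels = image[mask]                 # (N, 3) array of selected values
```

Boolean indexing with the mask returns the selected pixels as rows, which is the layout needed for the mean and covariance estimates.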

# Color Segmentation for a single class

For each image index from 1 to 7:
    - Load the image from the file
    - Apply Gaussian distribution to the image using `applyGaussianToImage` function
    - Create a mask where the probability is greater than 1e-6
    - Apply the mask

Function `applyGaussianToImage`:
    - Convert image to double precision
    - Reshape the image into a matrix where each row is a pixel vector (R, G, B)
    - Implement and compute the probability density function (PDF) for each pixel using multivariate Gaussian 
    - Reshape the PDF result back to image dimensions
    - Return the resulting image with PDF values
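
The `applyGaussianToImage` steps above could look roughly like this in NumPy. This is a sketch under assumptions: the snake_case name, the toy mean and covariance, and the two-pixel test image are all illustrative, and the image must use the same channel order as the training pixels.

```python
import numpy as np

def apply_gaussian_to_image(image, mean, covariance):
    """Evaluate the multivariate Gaussian density at every pixel of an (H, W, 3) image."""
    height, width, channels = image.shape
    # Each row becomes one pixel vector, in double precision.
    pixels = image.reshape(-1, channels).astype(np.float64)
    centered = pixels - mean
    inv_cov = np.linalg.inv(covariance)
    # Squared Mahalanobis distance for each pixel (row-wise quadratic form).
    mahalanobis_sq = np.einsum("ni,ij,nj->n", centered, inv_cov, centered)
    norm_const = 1.0 / np.sqrt((2.0 * np.pi) ** channels * np.linalg.det(covariance))
    pdf = norm_const * np.exp(-0.5 * mahalanobis_sq)
    return pdf.reshape(height, width)

# Toy parameters: one pixel sits exactly at the mean, the other is far away.
mean = np.array([100.0, 150.0, 200.0])
covariance = np.diag([25.0, 25.0, 25.0])
image = np.array([[[100, 150, 200], [0, 0, 0]]], dtype=np.uint8)
pdf_map = apply_gaussian_to_image(image, mean, covariance)
mask = pdf_map > 1e-6
```

If the estimated covariance is close to singular, `np.linalg.inv` becomes unreliable; adding a small value to the diagonal is a common workaround.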

# Training Code

# Write Your Code Here or in a separate py file

# Segmentation Code 

# Write Your Code Here or in a separate py file

## 5. Grading

- Results: 70% of Part2 score (segmentation performance on the provided dataset for all classes)
- Report: 30% of Part2 score

- For RBE474X: Part1 is 100% of the grade and Part2 is 20% extra credit.
- For RBE595-A01-SP: Part1 is 67% of the grade and Part2 is 33% of the grade.

## 6. Report Guidelines

The report must be written in LaTeX.

Include a 7x5 image grid in your report. Each row should correspond to one of the original images and its respective segmentation results, where a mask is applied to highlight the identified objects. Refer to the sample output for the expected format.

Explain your implementation. Report your threshold values. Explain how you tuned the model.

Explain the drawbacks of, and the assumptions behind, using a single Gaussian for color classification/segmentation.