<a href="https://colab.research.google.com/github/fjadidi2001/Image_Inpaint/blob/main/CM_GAN_Jan5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

CM-GAN is an image inpainting architecture designed to fill in missing or corrupted regions of an image with realistic content.
### **1. Generator Architecture**
The generator produces the inpainted image. CM-GAN uses **cascaded modulation** to process its inputs. A typical flow looks like this:

#### Input:
- An **image** with missing regions (masked image).
- A **binary mask** representing the missing areas (1 for missing, 0 for existing pixels).
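
To make the two inputs concrete, here is a minimal NumPy sketch that builds a masked image and a binary mask using the convention above (1 for missing, 0 for existing). The toy image size, the rectangular hole, and all variable names are assumptions for illustration. Stacking the two along the channel axis is one common way to pass both to a convolutional network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy image: height 8, width 8, 3 channels, values in [0, 1].
image = rng.random((8, 8, 3))

# Binary mask following the convention above: 1 = missing, 0 = existing.
mask = np.zeros((8, 8, 1))
mask[2:6, 3:7, :] = 1.0  # a 4x4 rectangular hole (arbitrary choice)

# Zero out the missing pixels to obtain the masked image.
masked_image = image * (1.0 - mask)

# Stack masked image and mask along the channel axis: 3 + 1 = 4 channels.
generator_input = np.concatenate([masked_image, mask], axis=-1)

print(generator_input.shape)  # (8, 8, 4)
```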

#### Layers:
1. **Convolutional Layers with Mask Concatenation**:
   - Initial layers concatenate the image with the binary mask.
   - Convolutions extract features from the masked image, that is, from the visible pixels around the holes.

   **Purpose**: Learn the structure and surrounding context of the image.

2. **Cascaded Modulation Block**:
   - Combines **global modulation** (to understand overall image semantics) with **spatially adaptive modulation** (to handle local details).
   - Global modulation uses a feature map that spans the entire image.
   - Adaptive modulation applies location-specific adjustments.

   **Purpose**: Balance global coherence and local realism.

3. **Feature Propagation via Attention Mechanisms**:
   - Uses **enhanced attention** to propagate contextual information from known to unknown areas.

   **Purpose**: Fill missing regions accurately based on the surrounding context.

4. **Output Layers**:
   - A final set of convolutions or deconvolutions reconstructs the inpainted image.

   **Purpose**: Generate the final high-quality inpainted output.
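
The idea behind the cascaded modulation block can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the CM-GAN implementation: the function names, the `(1 + scale)` form of the modulation, the use of 1x1 channel mixing for the spatial maps, and all shapes are hypothetical. It only shows the ordering described above: one image-wide code modulates every location identically, and the result then drives a location-specific scale and shift.

```python
import numpy as np

def global_modulation(features, global_code, weight):
    """Scale each channel by a factor derived from one image-wide code vector."""
    scale = weight @ global_code                    # shape (C,)
    return features * (1.0 + scale[:, None, None])  # same scale at every location

def spatial_modulation(features, spatial_code, w_gamma, w_beta):
    """Scale and shift each location using maps computed from a spatial code."""
    # 1x1 convolutions written as einsum: channels are mixed per pixel.
    gamma = np.einsum("oc,chw->ohw", w_gamma, spatial_code)
    beta = np.einsum("oc,chw->ohw", w_beta, spatial_code)
    return features * (1.0 + gamma) + beta

def cascaded_modulation(features, global_code, weight, w_gamma, w_beta):
    """Apply global modulation first, then use its output as the spatial code."""
    globally_modulated = global_modulation(features, global_code, weight)
    return spatial_modulation(features, globally_modulated, w_gamma, w_beta)

rng = np.random.default_rng(0)
C, H, W, D = 4, 8, 8, 16  # channels, height, width, code size (all arbitrary)
features = rng.standard_normal((C, H, W))
global_code = rng.standard_normal(D)
weight = rng.standard_normal((C, D)) * 0.1
w_gamma = rng.standard_normal((C, C)) * 0.1
w_beta = rng.standard_normal((C, C)) * 0.1

out = cascaded_modulation(features, global_code, weight, w_gamma, w_beta)
print(out.shape)  # (4, 8, 8)
```

In a trained network the weights would be learned. Here they are random and serve only to exercise the shapes.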

---

### **2. Discriminator Architecture**
The discriminator evaluates the inpainted images for realism.

1. **Input**:
   - The inpainted image (from the generator).
   - The corresponding ground truth image (actual image without missing areas).

2. **Layers**:
   - Convolutional layers extract features.
   - The final layer outputs a **realism score**, indicating how realistic the inpainted image is.

3. **Loss Function**:
   - Often uses an **adversarial loss** (e.g., Wasserstein or hinge loss) to train the generator and discriminator in a competitive manner.
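
As an illustration of the hinge variant mentioned above, the following NumPy sketch computes the standard hinge losses from raw discriminator scores. The function names and the example scores are assumptions. Whether a given CM-GAN implementation uses this loss is not stated here.

```python
import numpy as np

def discriminator_hinge_loss(real_scores, fake_scores):
    """Penalize real scores below +1 and fake scores above -1."""
    real_term = np.mean(np.maximum(0.0, 1.0 - real_scores))
    fake_term = np.mean(np.maximum(0.0, 1.0 + fake_scores))
    return real_term + fake_term

def generator_hinge_loss(fake_scores):
    """The generator tries to raise the discriminator's score on its outputs."""
    return -np.mean(fake_scores)

# Hypothetical scores for two real and two generated images.
real_scores = np.array([2.0, 0.5])
fake_scores = np.array([-2.0, 0.0])

print(discriminator_hinge_loss(real_scores, fake_scores))  # 0.75
print(generator_hinge_loss(fake_scores))                   # 1.0
```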

---

### **Key Components of CM-GAN**
1. **Object-Aware Training**:
   - Focuses on challenging regions, like objects, using annotations (e.g., panoptic segmentation).
   - Ensures that the generator fills object regions more realistically.

2. **Mask-Aware Encoding**:
   - Explicitly considers the mask during feature extraction.
   - Helps the generator learn to handle varied mask sizes and shapes.

3. **Enhanced Attention**:
   - Propagates information from visible areas to missing areas.
   - Improves inpainting quality for complex patterns.
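
The propagation of information from visible to missing areas can be illustrated with a plain dot-product attention in which only visible positions may act as sources. This is a generic sketch, not the specific "enhanced attention" design. The function name `masked_attention`, the flattened `(N, C)` feature layout, and the scaling by `sqrt(C)` are assumptions.

```python
import numpy as np

def masked_attention(features, mask):
    """Fill missing positions with a weighted average of visible features.

    features: (N, C) array of flattened pixel features.
    mask:     (N,) array with 1 = missing, 0 = visible.
    Assumes at least one position is visible.
    """
    scores = features @ features.T / np.sqrt(features.shape[1])  # (N, N)
    # Block missing positions as sources: only visible keys can be attended to.
    scores = np.where(mask[None, :] == 1, -np.inf, scores)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    attended = weights @ features
    # Visible features stay unchanged; only missing ones are replaced.
    return np.where(mask[:, None] == 1, attended, features), weights

rng = np.random.default_rng(0)
features = rng.standard_normal((6, 4))  # 6 positions, 4 channels (arbitrary)
mask = np.array([0, 0, 1, 1, 0, 0])     # positions 2 and 3 are missing
filled, weights = masked_attention(features, mask)
```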

---

### **How the Architecture Works**
1. **Training**:
   - The generator creates inpainted images.
   - The discriminator evaluates their realism.
   - Both networks are updated iteratively to improve their performance.

2. **Inference**:
   - Given an input image and a mask, the generator fills the missing regions.
   - No discriminator is needed during inference.
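
The inference step can be sketched as a single function call with no discriminator involved. The sketch below assumes a hypothetical `generator` callable that maps a 4-channel input to a 3-channel image, and it assumes the common practice of keeping known pixels from the input and using generated content only inside the hole. Neither detail is confirmed by the text above.

```python
import numpy as np

def inpaint(generator, image, mask):
    """Fill the masked region of `image` using `generator` (1 = missing in mask)."""
    masked_image = image * (1.0 - mask)
    generator_input = np.concatenate([masked_image, mask], axis=-1)
    generated = generator(generator_input)
    # Assumption: composite so that known pixels are copied from the input.
    return mask * generated + (1.0 - mask) * image

def dummy_generator(x):
    """Stand-in for a trained generator: predicts mid-gray everywhere."""
    return np.full(x.shape[:2] + (3,), 0.5)

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))
mask = np.zeros((8, 8, 1))
mask[2:6, 3:7, :] = 1.0

result = inpaint(dummy_generator, image, mask)
```

With a real model, `dummy_generator` would be replaced by the trained generator's forward pass.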


# Step 1: Gather the Dataset


In [None]:
!wget http://data.csail.mit.edu/places/places365/train_large_places365standard.tar


--2025-01-05 18:08:34--  http://data.csail.mit.edu/places/places365/train_large_places365standard.tar
Resolving data.csail.mit.edu (data.csail.mit.edu)... 128.52.131.233
Connecting to data.csail.mit.edu (data.csail.mit.edu)|128.52.131.233|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://data.csail.mit.edu/places/places365/train_large_places365standard.tar [following]
--2025-01-05 18:08:35--  https://data.csail.mit.edu/places/places365/train_large_places365standard.tar
Connecting to data.csail.mit.edu (data.csail.mit.edu)|128.52.131.233|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 112598435840 (105G) [application/x-tar]
Saving to: ‘train_large_places365standard.tar’

   train_large_plac   1%[                    ]   1.27G  15.4MB/s    eta 1h 57m 
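
Since the log above reports an archive of about 105 GB, it may be convenient to work with a small subset first. The following is a minimal sketch using Python's standard `tarfile` module that extracts only the first few image files. The function name `extract_first_images`, the suffix list, and the output directory are hypothetical, and the sketch makes no assumption about the archive's internal folder layout.

```python
import tarfile

IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")

def extract_first_images(tar_path, out_dir, limit):
    """Extract at most `limit` image files from a tar archive, in archive order."""
    extracted = []
    with tarfile.open(tar_path, "r") as archive:
        for member in archive:  # reads members sequentially
            if len(extracted) >= limit:
                break
            if member.isfile() and member.name.lower().endswith(IMAGE_SUFFIXES):
                archive.extract(member, path=out_dir)
                extracted.append(member.name)
    return extracted
```

A call such as `extract_first_images("train_large_places365standard.tar", "places_subset", limit=1000)` would then stop after 1000 images. The directory name and the limit are placeholders.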