# ID-CodeFormer: Training Notebook

This notebook provides a complete workflow to train **ID-CodeFormer**, an enhanced version of CodeFormer with improved identity preservation. This is achieved by integrating an ArcFace-based identity loss into the training pipeline, as described in the project's research papers.

### Workflow Overview:
1.  **Setup Environment**: Clones the repository and installs all necessary dependencies.
2.  **Mount Google Drive**: Connects to your Google Drive to access the training dataset.
3.  **Download Pre-trained Models**: Fetches the required weights for CodeFormer, VQGAN, facelib, and the ArcFace model.
4.  **Apply Code Modifications**: Programmatically modifies the codebase to add the identity loss functionality.
5.  **Configure Training**: Creates the YAML configuration file for the training run, pointing to the dataset in your Google Drive.
6.  **Start Training**: Launches the training process.

**Before you begin**: Make sure your Colab runtime is set to **GPU** (`Runtime > Change runtime type`).
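
To confirm that the runtime setting took effect, you can ask the NVIDIA driver which GPUs it sees. The sketch below assumes `nvidia-smi` is on the `PATH`, as it usually is on Colab GPU runtimes. `gpu_visible` is a helper name introduced here for illustration, not part of the repository.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi -L` runs successfully and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA tooling on PATH: most likely a CPU-only runtime.
        return False
    try:
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU".
    return result.returncode == 0 and "GPU" in result.stdout


if gpu_visible():
    print("GPU detected.")
else:
    print("No GPU detected; check Runtime > Change runtime type.")
```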

## 1. Setup Environment

First, we clone the CodeFormer repository and install the required Python packages. This step also builds the custom CUDA extensions needed by the project.

In [None]:
# Clone the CodeFormer repository from GitHub
!git clone https://github.com/SanjanaChamindu/CodeFormer.git
%cd CodeFormer

# Install the dependencies listed in requirements.txt
# Note: This might take a few minutes.
!pip install -r requirements.txt

# Set up the basicsr library, which includes custom CUDA ops
!python basicsr/setup.py develop

## 2. Mount Google Drive

Now, we'll mount your Google Drive. This allows the notebook to access the FFHQ dataset that you have stored there. You will be prompted to authorize this access.

In [None]:
from google.colab import drive
drive.mount('/content/drive')
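
Once the drive is mounted, it may be worth confirming that the dataset folder is actually visible before moving on. The sketch below counts image files under a folder. The `DATASET_DIR` path and the `count_images` helper are assumptions made for illustration, so replace the path with wherever your FFHQ images live.

```python
from pathlib import Path

# Hypothetical location; replace with the real path to your FFHQ images.
DATASET_DIR = Path("/content/drive/MyDrive/datasets/ffhq")
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images(folder: Path) -> int:
    """Count image files under `folder`, searching subfolders recursively."""
    if not folder.is_dir():
        return 0
    return sum(
        1
        for p in folder.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


n_images = count_images(DATASET_DIR)
if n_images == 0:
    print(f"No images found under {DATASET_DIR}; check the path and the Drive mount.")
else:
    print(f"Found {n_images} images under {DATASET_DIR}.")
```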

## 3. Download Pre-trained Models

We need several pre-trained models to start training. These include the base CodeFormer and VQGAN weights, models for face detection/parsing (facelib), and the ArcFace model for our new identity loss.

In [None]:
# Download the official CodeFormer pre-trained models for training
!python scripts/download_pretrained_models.py CodeFormer_train

# Download the facelib helper models
!python scripts/download_pretrained_models.py facelib

# Download the pre-trained ArcFace model for the identity loss
# It will be saved into the 'weights/facelib' directory
!wget -P ./weights/facelib https://github.com/deepinsight/insightface/raw/master/model_zoo/arcface_torch/ms1mv3_arcface_r100_fp16.zip
!unzip -o -d ./weights/facelib ./weights/facelib/ms1mv3_arcface_r100_fp16.zip
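
The ArcFace model downloaded above supplies the face embeddings that the identity loss compares. This notebook does not show how the loss is defined. One common formulation, assumed here, is one minus the cosine similarity between the embedding of the restored face and that of the ground-truth face. In training this would run on tensors produced by the ArcFace network. The plain-Python sketch below only illustrates the arithmetic, and both function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("Embeddings must have the same length.")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Embeddings must be non-zero.")
    return dot / (norm_a * norm_b)


def identity_loss(restored_emb: Sequence[float], target_emb: Sequence[float]) -> float:
    """Assumed formulation: 0 when embeddings align, growing as they diverge."""
    return 1.0 - cosine_similarity(restored_emb, target_emb)


# Toy 2-D "embeddings"; real ArcFace embeddings are much higher-dimensional.
print(identity_loss([3.0, 4.0], [3.0, 4.0]))  # same direction
print(identity_loss([1.0, 0.0], [0.0, 1.0]))  # orthogonal directions
```

Under this formulation the loss is near 0 when the two faces map to the same direction in embedding space and approaches 1 as the embeddings become orthogonal.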

## 4. Start Training

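Now we can start training the ID-CodeFormer model. The run reads its settings from `options/train_id_codeformer.yml`, the configuration file from the Configure Training step. Training progress is printed below, and checkpoints are saved periodically to the `experiments/ID_CodeFormer_Exp1` directory.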
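
Because the training command depends on files produced by earlier steps, a short pre-flight check can catch a missing file before a long run is launched. The sketch below is one way to do that. `missing_paths` is a hypothetical helper, and the list of required paths is an assumption drawn from the commands in this notebook.

```python
from pathlib import Path
from typing import Iterable, List


def missing_paths(paths: Iterable[str]) -> List[str]:
    """Return the subset of `paths` that does not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


# Paths taken from this notebook's commands; extend the list as needed.
REQUIRED = [
    "options/train_id_codeformer.yml",  # config passed to basicsr/train.py
    "weights/facelib",                  # facelib and ArcFace download target
]

missing = missing_paths(REQUIRED)
if missing:
    print("Missing before training:", ", ".join(missing))
else:
    print("All required paths are present.")
```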

In [None]:
!python basicsr/train.py -opt options/train_id_codeformer.yml

## 5. Inference with the Trained Model (Optional)

Once training is complete, you can use this section to test your new model. Point `--input_path` at your test images. The inference script has no `--model_path` argument by default, so either modify the script or place your saved checkpoint (e.g., `experiments/ID_CodeFormer_Exp1/models/net_g_latest.pth`) at the path it expects, `weights/CodeFormer/codeformer.pth`.
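
If you would rather test a specific iteration than the latest checkpoint, the helper below picks the highest-numbered one from the models folder. It assumes checkpoints are named `net_g_<iteration>.pth` alongside `net_g_latest.pth`, so check the folder contents first. `newest_checkpoint` is a name introduced here for illustration.

```python
import re
from pathlib import Path
from typing import Optional

MODELS_DIR = Path("experiments/ID_CodeFormer_Exp1/models")
# Assumed naming scheme: net_g_<iteration>.pth (e.g. net_g_10000.pth).
CHECKPOINT_PATTERN = re.compile(r"net_g_(\d+)\.pth")


def newest_checkpoint(models_dir: Path) -> Optional[Path]:
    """Return the numbered checkpoint with the highest iteration, or None."""
    if not models_dir.is_dir():
        return None
    best_iter = -1
    best_path = None
    for path in models_dir.iterdir():
        match = CHECKPOINT_PATTERN.fullmatch(path.name)
        if match and int(match.group(1)) > best_iter:
            best_iter = int(match.group(1))
            best_path = path
    return best_path


checkpoint = newest_checkpoint(MODELS_DIR)
print(checkpoint if checkpoint else f"No numbered checkpoints in {MODELS_DIR}.")
```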

In [None]:
# Create a folder for test images
!mkdir -p inputs/my_test_images

# Note: You should upload your own test images to the 'inputs/my_test_images' folder

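# The inference script has no --model_path argument by default; it loads weights
# from 'weights/CodeFormer/codeformer.pth'. To test a trained checkpoint, copy it to
# that path (this overwrites the downloaded pre-trained weights) or modify the script.
# Replace 'net_g_latest.pth' below with the checkpoint you want to test.
!mkdir -p weights/CodeFormer
!cp experiments/ID_CodeFormer_Exp1/models/net_g_latest.pth weights/CodeFormer/codeformer.pth

# Run inference on the test images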
!python inference_codeformer.py \
    --input_path inputs/my_test_images \
    -w 0.7 \
    --bg_upsampler realesrgan \
    --face_upsample