<a href="https://colab.research.google.com/github/DCReddy23/DFLLIV/blob/main/notebooks/DFLLIV_Colab_Training.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# DFLLIV - Low-Light Image Enhancement with Diffusion Fields

This notebook provides a complete tutorial for training the DFLLIV model on Google Colab using the LOL dataset.

## 🎯 What You'll Learn

- Set up the environment in Google Colab
- Download and prepare the LOL dataset
- Train a diffusion-based low-light enhancement model
- Perform inference on test images
- Evaluate model performance

## 📋 Requirements

- Google Colab account (free tier works!)
- Google Drive for dataset storage (~500MB) and checkpoints
- GPU runtime (go to Runtime → Change runtime type → GPU)

## ⏱️ Expected Training Time

- With Colab T4 GPU: ~4-6 hours for 100 epochs
- With Colab V100/A100: ~2-3 hours for 100 epochs
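
To budget a shorter run, the figures above can be scaled to other epoch counts. The sketch below assumes wall-clock time grows linearly with the number of epochs and ignores validation and startup overhead, so treat the results as rough estimates. `estimate_minutes` is a hypothetical helper, not part of the repository.

```python
def estimate_minutes(hours_per_100_epochs: float, num_epochs: int) -> float:
    """Scale a 'hours per 100 epochs' figure linearly to another epoch count."""
    return hours_per_100_epochs * 60 * num_epochs / 100


# T4 range quoted above: 4-6 hours per 100 epochs
for epochs in (5, 20, 100):
    low = estimate_minutes(4, epochs)
    high = estimate_minutes(6, epochs)
    print(f"{epochs:>3} epochs on T4: about {low:.0f}-{high:.0f} minutes")
```

Swap in 2 and 3 for the V100/A100 range.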

---

## 1️⃣ Setup: Check GPU and Mount Google Drive

First, let's verify that we have a GPU available and mount Google Drive for persistent storage.

In [None]:
# Check GPU availability
!nvidia-smi

import torch
print(f"\nPyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
else:
    print("⚠️ WARNING: No GPU detected! Training will be very slow.")
    print("Please enable GPU: Runtime → Change runtime type → GPU")

In [None]:
# Mount Google Drive for persistent storage
from google.colab import drive
drive.mount('/content/drive')

# Create necessary directories in Google Drive
!mkdir -p /content/drive/MyDrive/DFLLIV/checkpoints
!mkdir -p /content/drive/MyDrive/DFLLIV/runs
!mkdir -p /content/drive/MyDrive/DFLLIV/outputs

print("✓ Google Drive mounted successfully")
print("✓ Directories created in Google Drive")

## 2️⃣ Clone Repository and Install Dependencies

In [None]:
# Clone the DFLLIV repository
import os
if not os.path.exists('/content/DFLLIV'):
    !git clone https://github.com/DCReddy23/DFLLIV.git /content/DFLLIV
    print("✓ Repository cloned")
else:
    print("✓ Repository already exists")

# Change to repository directory
%cd /content/DFLLIV

In [None]:
# Install required packages
print("Installing dependencies... This may take a few minutes.")
!pip install -q -r requirements.txt

print("\n✓ All dependencies installed successfully!")

# Verify installation
import torch
import torchvision
import yaml
import lpips
from tqdm import tqdm

print("\n📦 Package versions:")
print(f"  PyTorch: {torch.__version__}")
print(f"  TorchVision: {torchvision.__version__}")
print(f"  CUDA: {torch.version.cuda}")

## 3️⃣ Download LOL Dataset

The LOL (Low-Light) dataset contains 500 paired low-light and normal-light images: 485 pairs for training (`our485`) and 15 pairs for evaluation (`eval15`).

### Option A: Download from Google Drive (Recommended)

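Download `LOLdataset.zip` from the official Google Drive link and upload it to your own Google Drive at `MyDrive/DFLLIV/LOLdataset.zip`. The cell below copies it into the Colab workspace and extracts it.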

In [None]:
# Download LOL dataset using gdown
!pip install -q gdown

import os
import zipfile

# Create data directory
!mkdir -p /content/DFLLIV/data

# Check if dataset already exists in Google Drive
gdrive_dataset_path = "/content/drive/MyDrive/DFLLIV/LOLdataset.zip"
local_dataset_path = "/content/DFLLIV/data"

if os.path.exists("/content/DFLLIV/data/LOL"):
    print("✓ LOL dataset already extracted!")
elif os.path.exists(gdrive_dataset_path):
    print("Found dataset in Google Drive. Copying...")
    !cp "{gdrive_dataset_path}" /content/DFLLIV/data/
    print("Extracting...")
    with zipfile.ZipFile('/content/DFLLIV/data/LOLdataset.zip', 'r') as zip_ref:
        zip_ref.extractall('/content/DFLLIV/data/')
    
    # Rename directory if needed
    if os.path.exists('/content/DFLLIV/data/lol_dataset'):
        !mv /content/DFLLIV/data/lol_dataset /content/DFLLIV/data/LOL
    elif os.path.exists('/content/DFLLIV/data/LOLdataset'):
        !mv /content/DFLLIV/data/LOLdataset /content/DFLLIV/data/LOL
    
    print("✓ Dataset extracted successfully!")
else:
    print("Dataset not found in Google Drive. Please download it manually:")
    print("\n📥 Manual Download Instructions:")
    print("1. Visit: https://drive.google.com/file/d/157bjO1_cFuSd0HWDUuAmcHRJDVyWpOxB/view")
    print("2. Download LOLdataset.zip")
    print("3. Upload to your Google Drive at: MyDrive/DFLLIV/LOLdataset.zip")
    print("4. Re-run this cell")
    print("\nAlternatively, try the direct download in Option B (may require manual intervention).")

### Option B: Direct Download (if Option A doesn't work)

Try downloading directly using gdown (may require authentication):

In [None]:
# Uncomment and run if Option A doesn't work
# Note: This may require you to authenticate with Google Drive

# import gdown
# import zipfile

# if not os.path.exists('/content/DFLLIV/data/LOL'):
#     print("Downloading LOL dataset...")
#     url = 'https://drive.google.com/uc?id=157bjO1_cFuSd0HWDUuAmcHRJDVyWpOxB'
#     output = '/content/DFLLIV/data/LOLdataset.zip'
#     gdown.download(url, output, quiet=False)
    
#     print("Extracting...")
#     with zipfile.ZipFile(output, 'r') as zip_ref:
#         zip_ref.extractall('/content/DFLLIV/data/')
    
#     # Rename if needed
#     if os.path.exists('/content/DFLLIV/data/lol_dataset'):
#         !mv /content/DFLLIV/data/lol_dataset /content/DFLLIV/data/LOL
#     elif os.path.exists('/content/DFLLIV/data/LOLdataset'):
#         !mv /content/DFLLIV/data/LOLdataset /content/DFLLIV/data/LOL
    
#     print("✓ Done!")
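
Options A and B repeat the same extract-then-rename step. As a rough sketch, that step could be factored into one pure-Python helper that uses `shutil.move` instead of `!mv`. `extract_and_normalize` and its parameters are hypothetical names introduced here for illustration. The candidate folder names are the two already checked in the cells above.

```python
import os
import shutil
import zipfile


def extract_and_normalize(zip_path, data_dir, target="LOL",
                          candidates=("lol_dataset", "LOLdataset")):
    """Extract zip_path into data_dir and rename the extracted folder to target."""
    target_path = os.path.join(data_dir, target)
    if os.path.isdir(target_path):
        return target_path  # already extracted, nothing to do

    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(data_dir)

    # Rename whichever known top-level folder the archive produced
    for name in candidates:
        candidate = os.path.join(data_dir, name)
        if os.path.isdir(candidate):
            shutil.move(candidate, target_path)
            break

    if not os.path.isdir(target_path):
        raise FileNotFoundError(f"Expected {target_path} after extraction")
    return target_path
```

In this notebook it would be called as `extract_and_normalize('/content/DFLLIV/data/LOLdataset.zip', '/content/DFLLIV/data')`.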

In [None]:
# Verify dataset structure
import os

def verify_dataset():
    base_path = '/content/DFLLIV/data/LOL'
    
    if not os.path.exists(base_path):
        print("❌ Dataset not found! Please download it using the instructions above.")
        return False
    
    train_low = os.path.join(base_path, 'our485', 'low')
    train_high = os.path.join(base_path, 'our485', 'high')
    test_low = os.path.join(base_path, 'eval15', 'low')
    test_high = os.path.join(base_path, 'eval15', 'high')
    
    dirs = [train_low, train_high, test_low, test_high]
    
    print("📁 Dataset Structure:")
    for d in dirs:
        if os.path.exists(d):
            # Case-insensitive match so files such as IMG.PNG are counted too
            count = len([f for f in os.listdir(d) if f.lower().endswith(('.png', '.jpg', '.jpeg'))])
            print(f"  ✓ {d}: {count} images")
        else:
            print(f"  ❌ {d}: NOT FOUND")
            return False
    
    print("\n✓ Dataset verification successful!")
    return True

verify_dataset()
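
The check above only counts files per folder. A stricter check could confirm that every low-light image has a counterpart in the matching `high` folder. The sketch below assumes paired images share the same filename in `low` and `high`, which is worth confirming against your copy of the dataset. `unpaired_files` is a hypothetical helper, not part of the repository.

```python
import os


def unpaired_files(low_dir: str, high_dir: str) -> dict:
    """Return image filenames that appear in only one of the two directories."""
    exts = ('.png', '.jpg', '.jpeg')
    low = {f for f in os.listdir(low_dir) if f.lower().endswith(exts)}
    high = {f for f in os.listdir(high_dir) if f.lower().endswith(exts)}
    return {"only_low": sorted(low - high), "only_high": sorted(high - low)}


# Example usage against the training split, if it has been extracted
train_dir = '/content/DFLLIV/data/LOL/our485'
if os.path.isdir(train_dir):
    print(unpaired_files(os.path.join(train_dir, 'low'),
                         os.path.join(train_dir, 'high')))
```

Two empty lists mean every image found has a same-named partner.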

## 4️⃣ Load and Review Configuration

We'll use a Colab-optimized configuration with:
- Reduced batch size (4 instead of 8) for memory efficiency
- Fewer epochs (100 instead of 500) for a faster demo
- Checkpoints saved to Google Drive for persistence

In [None]:
import yaml
from pprint import pprint

# Load Colab configuration
config_path = '/content/DFLLIV/configs/colab.yaml'

with open(config_path, 'r') as f:
    config = yaml.safe_load(f)

print("📝 Colab Configuration:")
print("="*50)
pprint(config)
print("="*50)

print("\n🔧 Key Settings:")
print(f"  Batch Size: {config['training']['batch_size']}")
print(f"  Epochs: {config['training']['num_epochs']}")
print(f"  Learning Rate: {config['training']['learning_rate']}")
print(f"  Validation Every: {config['training']['val_every']} epochs")
print(f"  Save Every: {config['training']['save_every']} epochs")

### Optional: Adjust Configuration

You can modify the configuration if needed:

In [None]:
# Optional: Reduce epochs further for quick testing
# config['training']['num_epochs'] = 20
# config['training']['val_every'] = 5

# Optional: Change batch size based on your GPU
# config['training']['batch_size'] = 2  # For lower memory
# config['training']['batch_size'] = 8  # For higher memory GPUs

# Save modified config
# with open(config_path, 'w') as f:
#     yaml.dump(config, f)
# print("✓ Configuration updated")
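
The commented-out lines above edit `config` in place and write it back to `colab.yaml`. If you would rather leave the base file untouched, one possible approach is to apply overrides to a deep copy. This is a minimal sketch: `with_overrides` is a hypothetical helper, and the `base` dict is a stand-in for the loaded config that uses only the keys and values mentioned in this notebook.

```python
import copy


def with_overrides(config: dict, overrides: dict) -> dict:
    """Return a deep copy of config with dotted-key overrides applied."""
    result = copy.deepcopy(config)
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split('.')
        node = result
        for key in parents:
            node = node.setdefault(key, {})  # create missing sections on the way down
        node[leaf] = value
    return result


# Stand-in for the dict loaded from colab.yaml
base = {"training": {"batch_size": 4, "num_epochs": 100, "val_every": 5}}
quick = with_overrides(base, {"training.num_epochs": 20, "training.batch_size": 2})

print(quick["training"])
print(base["training"]["num_epochs"])  # the base config is unchanged
```

The resulting dict could then be dumped to a separate YAML file and passed to `train.py` through `--config`.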

## 5️⃣ Start Training

Now we'll start the training process. This will:
- Train the diffusion field model on the LOL dataset
- Save checkpoints to Google Drive every 25 epochs
- Validate every 5 epochs
- Display training progress

**Note:** Training 100 epochs on Colab T4 GPU takes approximately 4-6 hours.

In [None]:
# Start training
!python train.py --config configs/colab.yaml

# Note: If training is interrupted, you can resume with:
# !python train.py --config configs/colab.yaml --resume /content/drive/MyDrive/DFLLIV/checkpoints/latest.pth

## 6️⃣ Monitor Training Progress

While training is running, you can monitor progress using TensorBoard:

In [None]:
# Load TensorBoard extension
%load_ext tensorboard

# Start TensorBoard
%tensorboard --logdir /content/drive/MyDrive/DFLLIV/runs

### Visualize Training Samples

Check the training visualizations saved during validation:

In [None]:
import matplotlib.pyplot as plt
from PIL import Image
import glob

# Find training visualization images
vis_files = sorted(glob.glob('/content/DFLLIV/outputs/train_vis/*.png'))

if vis_files:
    print(f"Found {len(vis_files)} training visualization images\n")
    
    # Show the latest few
    for img_path in vis_files[-3:]:
        img = Image.open(img_path)
        plt.figure(figsize=(15, 5))
        plt.imshow(img)
        plt.axis('off')
        plt.title(img_path.split('/')[-1])
        plt.tight_layout()
        plt.show()
else:
    print("No training visualizations found yet. These will appear during validation.")

## 7️⃣ Inference on Test Images

After training (or using a checkpoint), let's test the model on some low-light images.

In [None]:
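# Find the best checkpoint (imports repeated so this cell runs on its own)
import glob
import os

import matplotlib.pyplot as plt
from PIL import Image

checkpoint_path = '/content/drive/MyDrive/DFLLIV/checkpoints/best.pth'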

if not os.path.exists(checkpoint_path):
    # Use latest if best doesn't exist
    checkpoint_path = '/content/drive/MyDrive/DFLLIV/checkpoints/latest.pth'

if os.path.exists(checkpoint_path):
    print(f"✓ Using checkpoint: {checkpoint_path}")
    
    # Get a test image (sorted so the same file is picked on every run)
    test_image = sorted(glob.glob('/content/DFLLIV/data/LOL/eval15/low/*.png'))[0]
    output_path = '/content/drive/MyDrive/DFLLIV/outputs/enhanced_test.png'
    
    print(f"\nEnhancing: {test_image}")
    
    # Run inference
    !python inference.py \
        --checkpoint {checkpoint_path} \
        --input {test_image} \
        --output {output_path} \
        --num-steps 50 \
        --sampling-method ddim
    
    # Display result
    if os.path.exists(output_path):
        img = Image.open(output_path)
        plt.figure(figsize=(15, 5))
        plt.imshow(img)
        plt.axis('off')
        plt.title('Low-Light Input vs Enhanced Output')
        plt.tight_layout()
        plt.show()
else:
    print("❌ No checkpoint found. Please train the model first.")

### Batch Inference on All Test Images

In [None]:
# Process all test images
if os.path.exists(checkpoint_path):
    test_dir = '/content/DFLLIV/data/LOL/eval15/low'
    output_dir = '/content/drive/MyDrive/DFLLIV/outputs/test_results'
    
    print("Enhancing all test images...")
    !python inference.py \
        --checkpoint {checkpoint_path} \
        --input {test_dir} \
        --output {output_dir} \
        --num-steps 50 \
        --sampling-method ddim
    
    print("\n✓ Done! Results saved to Google Drive")
    print(f"Output directory: {output_dir}")
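
The `--num-steps 50 --sampling-method ddim` flags above ask the sampler to use far fewer denoising steps than training typically uses. The repository's sampler is not shown in this notebook, so the following is only a generic sketch of the evenly strided timestep selection commonly used with DDIM. The 1000 training steps are an assumption, and `ddim_timesteps` is a hypothetical name.

```python
def ddim_timesteps(num_train_steps: int, num_sample_steps: int) -> list:
    """Evenly strided subset of training timesteps, ordered from noisiest to cleanest."""
    if not 0 < num_sample_steps <= num_train_steps:
        raise ValueError("num_sample_steps must be between 1 and num_train_steps")
    stride = num_train_steps // num_sample_steps
    steps = list(range(0, num_train_steps, stride))[:num_sample_steps]
    return steps[::-1]  # sampling runs from high noise down to t = 0


# Assumed: 1000 training timesteps, subsampled to the 50 used above
schedule = ddim_timesteps(1000, 50)
print(len(schedule), schedule[:3], schedule[-3:])
```

Fewer sampling steps generally trade some output quality for faster inference, so `--num-steps` is a reasonable value to experiment with.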

## 8️⃣ Evaluate Model Performance

Compute PSNR, SSIM, and LPIPS metrics on the test set:

In [None]:
# Run evaluation
if os.path.exists(checkpoint_path):
    eval_output_dir = '/content/drive/MyDrive/DFLLIV/outputs/evaluation'
    
    !python evaluate.py \
        --checkpoint {checkpoint_path} \
        --dataset-dir /content/DFLLIV/data/LOL/eval15 \
        --output-dir {eval_output_dir} \
        --num-steps 50
    
    # Display metrics
    import json
    metrics_file = os.path.join(eval_output_dir, 'metrics.json')
    
    if os.path.exists(metrics_file):
        with open(metrics_file, 'r') as f:
            metrics = json.load(f)
        
        print("\n📊 Evaluation Results:")
        print("="*50)
        print(f"PSNR:  {metrics['psnr']:.2f} ± {metrics['psnr_std']:.2f} dB")
        print(f"SSIM:  {metrics['ssim']:.4f} ± {metrics['ssim_std']:.4f}")
        print(f"LPIPS: {metrics['lpips']:.4f} ± {metrics['lpips_std']:.4f}")
        print("="*50)
    
    # Show comparison grid
    grid_path = os.path.join(eval_output_dir, 'comparison_grid.png')
    if os.path.exists(grid_path):
        img = Image.open(grid_path)
        plt.figure(figsize=(20, 15))
        plt.imshow(img)
        plt.axis('off')
        plt.title('Comparison Grid: Input vs Enhanced vs Ground Truth')
        plt.tight_layout()
        plt.show()
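
To make the first of these metrics concrete, PSNR is defined as `10 * log10(MAX² / MSE)`, where `MAX` is the largest possible pixel value and `MSE` is the mean squared error against the ground truth. The sketch below is a plain-Python illustration of that formula on flat pixel sequences. It is not the code in `evaluate.py`, which may differ in value range and averaging.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 8-bit example: four ground-truth pixels and their enhanced values
target = [52, 120, 200, 255]
enhanced = [50, 121, 198, 250]
print(f"PSNR: {psnr(target, enhanced):.2f} dB")
```

Higher PSNR and SSIM are better, while lower LPIPS is better. For images scaled to [0, 1], pass `max_value=1.0`.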

## 9️⃣ Download Results

All checkpoints and results are saved to your Google Drive at:
- Checkpoints: `MyDrive/DFLLIV/checkpoints/`
- Logs: `MyDrive/DFLLIV/runs/`
- Outputs: `MyDrive/DFLLIV/outputs/`

You can also download specific files:

In [None]:
from google.colab import files

# Download best checkpoint
checkpoint_path = '/content/drive/MyDrive/DFLLIV/checkpoints/best.pth'
if os.path.exists(checkpoint_path):
    print(f"Best checkpoint found: {checkpoint_path}")
    # files.download(checkpoint_path)  # Uncomment to download through the browser
    print("Note: the file is large, so accessing it from Google Drive is recommended.")
else:
    print("No checkpoint found")

# List all available checkpoints
print("\n📦 Available checkpoints in Google Drive:")
!ls -lh /content/drive/MyDrive/DFLLIV/checkpoints/

## 💡 Tips and Troubleshooting

### Training Tips

1. **Out of Memory Errors**:
   - Reduce batch size to 2 or 1
   - Reduce `crop_size` in config to 128

2. **Speed Up Training**:
   - Use fewer epochs for quick testing (e.g., 20-50)
   - Reduce validation frequency

3. **Session Timeout**:
   - Colab free tier disconnects after ~12 hours
   - Resume training with: `--resume /content/drive/MyDrive/DFLLIV/checkpoints/latest.pth`

4. **Better Results**:
   - Train for more epochs (200-500)
   - Use a larger batch size if the GPU allows (8 or 16)
   - Try the UNet architecture: Change `model.type` to `"unet"` in config
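
For the session-timeout tip above, if `latest.pth` is missing after a disconnect, one fallback is to resume from the highest-numbered periodic checkpoint. The sketch below assumes periodic checkpoints follow the `checkpoint_epoch_<N>.pth` naming used in the commands later in this section. `latest_epoch_checkpoint` is a hypothetical helper.

```python
import re


def latest_epoch_checkpoint(filenames):
    """Return the checkpoint_epoch_<N>.pth name with the largest N, or None."""
    pattern = re.compile(r"checkpoint_epoch_(\d+)\.pth$")
    best_name, best_epoch = None, -1
    for name in filenames:
        match = pattern.search(name)
        if match and int(match.group(1)) > best_epoch:
            best_name, best_epoch = name, int(match.group(1))
    return best_name


# Example with a made-up directory listing
names = ["best.pth", "checkpoint_epoch_25.pth", "checkpoint_epoch_100.pth",
         "checkpoint_epoch_50.pth"]
print(latest_epoch_checkpoint(names))
```

In Colab, pass `os.listdir('/content/drive/MyDrive/DFLLIV/checkpoints')` and hand the returned name to `--resume`. The comparison is numeric, so epoch 100 ranks above epoch 50 even though it sorts lower as a string.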

### Quick Test Run

For a quick test (5-10 minutes), modify the config:
```python
config['training']['num_epochs'] = 5
config['training']['val_every'] = 1
```

### Useful Commands

```bash
# Resume training
!python train.py --config configs/colab.yaml --resume /content/drive/MyDrive/DFLLIV/checkpoints/latest.pth

# Quick inference
!python inference.py --checkpoint /content/drive/MyDrive/DFLLIV/checkpoints/best.pth --input test.jpg --output enhanced.png

# Evaluate specific checkpoint
!python evaluate.py --checkpoint /content/drive/MyDrive/DFLLIV/checkpoints/checkpoint_epoch_50.pth --dataset-dir data/LOL/eval15 --output-dir results
```

---

## 📚 Additional Resources

- [GitHub Repository](https://github.com/DCReddy23/DFLLIV)
- [Project README](https://github.com/DCReddy23/DFLLIV/blob/main/README.md)
- [LOL Dataset Paper](https://arxiv.org/abs/1808.04560)

## 🙏 Acknowledgments

- DDPM/DDIM for diffusion models
- NeRF for neural field inspiration
- LOL dataset by Wei et al.

---

**Happy Training! 🚀**