
HDRNet: Deep Bilateral Learning for Real-Time Image Enhancement


A PyTorch implementation of the paper:

Deep Bilateral Learning for Real-Time Image Enhancement
Michaël Gharbi, Jiawen Chen, Jonathan T. Barron, Samuel W. Hasinoff, Frédo Durand
ACM Transactions on Graphics (SIGGRAPH), 2017

📄 Paper | 🌐 Project Page


✨ Features

  • Real-time processing: Process high-resolution images in milliseconds
  • Bilateral learning: Learn locally-affine color transformations in bilateral space
  • OpenCV + PyTorch: Efficient image loading and GPU-accelerated inference
  • Flexible training: Support for any paired input/output image dataset
  • TensorBoard logging: Monitor training with visualizations

πŸ—οΈ Architecture Overview

HDRNet learns to enhance images by:

  1. Low-res stream: Process a downsampled version to predict a 3D bilateral grid of affine coefficients
  2. Guide network: Learn a guidance map from full-resolution input
  3. Bilateral slicing: Use the guide to slice the grid and apply transformations at full resolution
Full-res Input ──────────────────────────────────────┬──────────────> Output
       │                                             │                  ↑
       │                                             │                  │
       ▼                                             ▼                  │
  Low-res Input ──> Spatial Features ──> Bilateral Grid ──> Bilateral Slice & Apply
                          │                    ↑
                          ▼                    │
                    Global Features ───────────┘
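The slicing step in the diagram above can be sketched with `torch.nn.functional.grid_sample`, which performs the trilinear grid lookup that bilateral slicing requires. This is an illustrative sketch, not the repository's actual implementation; the tensor shapes (a 12-channel grid holding a 3×4 affine matrix per cell) and function names are assumptions.

```python
import torch
import torch.nn.functional as F

def bilateral_slice(grid, guide):
    # grid:  (B, 12, D, Gh, Gw) - 3x4 affine coefficients per bilateral cell
    # guide: (B, 1, H, W)       - learned guidance map, values in [0, 1]
    B = grid.shape[0]
    H, W = guide.shape[-2:]

    # Build a (B, 1, H, W, 3) sampling grid: (x, y) from pixel position,
    # z from the guide value, all normalized to [-1, 1] for grid_sample.
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, H), torch.linspace(-1, 1, W), indexing="ij"
    )
    xs = xs.expand(B, 1, H, W)
    ys = ys.expand(B, 1, H, W)
    zs = guide * 2 - 1
    coords = torch.stack([xs, ys, zs], dim=-1)  # (B, 1, H, W, 3)

    # Trilinear lookup of the 12 coefficients at every full-res pixel.
    sliced = F.grid_sample(grid, coords, align_corners=True)
    return sliced.squeeze(2)  # (B, 12, H, W)

def apply_affine(coeffs, image):
    # coeffs viewed as a per-pixel 3x4 matrix; image: (B, 3, H, W)
    A = coeffs.view(image.shape[0], 3, 4, *image.shape[-2:])
    return (A[:, :, :3] * image.unsqueeze(1)).sum(dim=2) + A[:, :, 3]
```

Because the heavy per-pixel work reduces to one `grid_sample` call plus a pointwise affine transform, the full-resolution path stays cheap regardless of grid resolution.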

🚀 Quick Start

This project is inspired by HDRNet, but it implements a simplified adaptation of the original method rather than a direct reproduction of the paper. HDRNet serves only as a baseline here, since the project targets a different application domain.

Requirements

  • Python 3.8+
  • PyTorch 2.0+
  • OpenCV 4.8+
  • CUDA (recommended for training)

πŸ“ Dataset Preparation

Organize your data in paired input/output folders:

data/
├── train/
│   ├── input/
│   │   ├── image001.jpg
│   │   ├── image002.jpg
│   │   └── ...
│   └── output/
│       ├── image001.jpg  (enhanced version)
│       ├── image002.jpg
│       └── ...
└── val/
    ├── input/
    └── output/

Note: Input and output images must have matching filenames.

Supported Datasets

  • MIT-Adobe FiveK: 5000 RAW photos with 5 expert retouches
  • Custom pairs: Any before/after image pairs (e.g., Lightroom edits)
  • Synthetic: Generated input/output pairs for specific enhancements

πŸ‹οΈ Training

Basic Training

python train.py \
    --data_root data/train \
    --val_root data/val \
    --epochs 100 \
    --batch_size 4 \
    --lr 1e-4

Full Training Options

python train.py \
    --data_root data/train \
    --val_root data/val \
    --input_dir input \
    --output_dir output \
    --low_res_size 256 \
    --full_res_size 512 \
    --base_features 8 \
    --grid_depth 8 \
    --batch_size 4 \
    --epochs 100 \
    --lr 1e-4 \
    --l1_weight 1.0 \
    --l2_weight 0.0 \
    --checkpoint_dir checkpoints \
    --log_dir logs \
    --device cuda

Resume Training

python train.py \
    --data_root data/train \
    --resume checkpoints/checkpoint_epoch50.pth \
    --epochs 100
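A resumable checkpoint typically bundles the model weights, optimizer state, and epoch counter. The sketch below shows the general pattern; the dictionary keys are assumptions, not necessarily those used by `train.py`.

```python
import torch

def save_checkpoint(model, optimizer, epoch, path):
    # Bundle everything needed to resume: weights, optimizer state, epoch.
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)

def load_checkpoint(model, optimizer, path, device="cpu"):
    ckpt = torch.load(path, map_location=device)
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"] + 1  # first epoch to run after resuming
```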

Monitor Training

tensorboard --logdir logs

🎨 Inference

Single Image

python inference.py \
    --input path/to/image.jpg \
    --output path/to/enhanced.jpg \
    --checkpoint checkpoints/best_model.pth

Directory of Images

python inference.py \
    --input path/to/images/ \
    --output path/to/enhanced/ \
    --checkpoint checkpoints/best_model.pth

With Visualizations

python inference.py \
    --input image.jpg \
    --output enhanced.jpg \
    --checkpoint model.pth \
    --visualize \
    --show_guide \
    --show_grid

🐍 Python API

from hdrnet import create_hdrnet
from inference import enhance_image, create_enhancer

# Option 1: One-time enhancement
enhanced = enhance_image(
    'input.jpg',
    checkpoint_path='checkpoints/best_model.pth',
    device='cuda'
)

# Option 2: Create reusable enhancer
enhance = create_enhancer('checkpoints/best_model.pth', device='cuda')

enhanced1 = enhance('image1.jpg')
enhanced2 = enhance('image2.jpg')
enhanced3 = enhance(numpy_array)  # Also accepts numpy arrays

Custom Model Configuration

from hdrnet import create_hdrnet

config = {
    'low_res_size': 256,
    'base_features': 8,
    'grid_depth': 8,
    'use_guide_nn': True
}

model = create_hdrnet(config)

📊 Results

| Metric     | Value                         |
|------------|-------------------------------|
| PSNR       | ~28–32 dB (dataset dependent) |
| SSIM       | ~0.92–0.96                    |
| Speed      | ~15 ms @ 1080p (RTX 3090)     |
| Parameters | ~482K                         |
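For reference, the PSNR figure in the table follows directly from mean squared error. A minimal sketch, assuming images scaled to [0, 1]:

```python
import torch

def psnr(pred, target, max_val=1.0):
    # Peak signal-to-noise ratio: 10 * log10(max_val^2 / MSE).
    mse = torch.mean((pred - target) ** 2)
    return 10 * torch.log10(max_val ** 2 / mse)
```

For example, a uniform error of 0.1 gives an MSE of 0.01 and hence a PSNR of 20 dB.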

📂 Project Structure

hdrnet-pytorch/
├── hdrnet/
│   ├── __init__.py       # Package exports
│   ├── model.py          # HDRNet architecture
│   ├── layers.py         # Bilateral grid operations
│   ├── dataset.py        # Data loading with OpenCV
│   └── utils.py          # Utilities and metrics
├── train.py              # Training script
├── inference.py          # Inference script
├── requirements.txt      # Dependencies
└── README.md

🔧 Advanced Usage

Custom Loss Functions

The training script supports combined losses:

python train.py \
    --l1_weight 1.0 \
    --l2_weight 0.5 \
    --perceptual_weight 0.1  # Uses VGG features

Model Variants

# Standard HDRNet
from hdrnet import HDRNet
model = HDRNet(low_res_size=256, grid_depth=8)

# Lightweight curve-based variant
from hdrnet import HDRNetCurves
model = HDRNetCurves(num_control_points=16)

📚 Citation

If you use this code, please cite the original paper:

@article{gharbi2017deep,
  title={Deep Bilateral Learning for Real-Time Image Enhancement},
  author={Gharbi, Micha{\"e}l and Chen, Jiawen and Barron, Jonathan T and Hasinoff, Samuel W and Durand, Fr{\'e}do},
  journal={ACM Transactions on Graphics (TOG)},
  volume={36},
  number={4},
  pages={1--12},
  year={2017},
  publisher={ACM}
}

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.


πŸ™ Acknowledgments
