Automatically colorize black-and-white comic and manga pages using deep learning. Upload a PDF, get back a fully colorized version — no API keys, no cloud services, everything runs locally on your machine.
- Two colorization modes
  - Auto: fully automatic, no reference needed (manga-colorization-v2)
  - Reference: upload one colored page for higher quality results (MangaNinja, CVPR 2025)
- Post-processing pipeline — L-channel preservation for perfect line fidelity, guided filter for clean edges
- Optional 4x upscaling — built-in Real-ESRGAN for print-quality output
- VRAM-aware model management — only one colorizer loaded at a time, safe for 8 GB GPUs
- Performance-optimized: torch.inference_mode, cuDNN autotuner, GPU-resident tile accumulation, zero-copy pipelines
- GPU accelerated: 2-5 seconds per page in auto mode on CUDA GPUs, with automatic CPU fallback
- GPU detection — analyze your hardware specs before choosing a device
- Cross-page color consistency — LAB color transfer keeps character/environment colors consistent across pages (auto mode)
- Live preview — side-by-side original vs. colorized comparison updates in real-time during processing
- PDF in, PDF out — upload a B&W comic PDF, download a colorized PDF
- Zero cloud dependency — everything runs locally, no API keys needed
- Auto model download — weights are downloaded automatically on first use
Upload PDF → Extract pages at 300 DPI
→ For each page:
mc-v2 colorize (576×576)
→ L-channel preservation (replace L with original)
→ Guided filter (clean edge bleeding)
→ [Optional] Real-ESRGAN 4x upscale
→ Color consistency (LAB transfer, pages 2+)
→ Reassemble PDF → Preview/Download
Upload PDF + reference image → Extract pages
→ For each page:
MangaNinja colorize (512×512, using reference)
→ L-channel preservation
→ Guided filter
→ [Optional] Real-ESRGAN 4x upscale
→ Reassemble PDF → Preview/Download
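The two flows above differ only in the colorizer and in whether cross-page consistency runs. A minimal sketch of that ordering follows; the function name `build_stages` and the stage labels are hypothetical and are not taken from the project's code.

```python
from typing import List


def build_stages(
    mode: str,
    page_index: int,
    *,
    l_channel: bool = True,
    guided_filter: bool = True,
    upscale: bool = False,
) -> List[str]:
    """Return the ordered stage names for one page (labels are illustrative).

    page_index is zero-based, so page_index >= 1 means "pages 2+".
    """
    if mode not in ("auto", "reference"):
        raise ValueError(f"unknown mode: {mode!r}")

    stages = ["colorize_mc_v2" if mode == "auto" else "colorize_manganinja"]
    if l_channel:
        stages.append("l_channel_preservation")
    if guided_filter:
        stages.append("guided_filter")
    if upscale:
        stages.append("esrgan_upscale_4x")
    # Cross-page LAB transfer runs only in auto mode, from the second page on.
    if mode == "auto" and page_index >= 1:
        stages.append("color_consistency")
    return stages


print(build_stages("auto", 1, upscale=True))
```

The keyword flags mirror the POSTPROCESS_* settings, so disabling one simply drops that stage from the list.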
The auto mode uses manga-colorization-v2, a U-Net with an SEResNeXt encoder trained on manga artwork. Cross-page consistency uses Reinhard LAB color transfer on chrominance channels.
The reference mode uses MangaNinja (CVPR 2025), which takes a colored reference page and transfers its color palette to all target pages using a dual UNet architecture with reference attention and point correspondence.
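Reinhard color transfer matches the mean and standard deviation of one channel to those of a reference. The sketch below applies it to a single chrominance channel given as a flat list of values, and blends the result with the original using a strength factor like COLOR_TRANSFER_STRENGTH. It is a standard-library illustration only: the function name is hypothetical, the 0-255 value range is an assumption (8-bit LAB), and a real implementation would more likely operate on NumPy arrays.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def reinhard_channel(
    target: Sequence[float],
    reference: Sequence[float],
    strength: float = 0.7,
) -> List[float]:
    """Shift one chroma channel of `target` toward the statistics of `reference`."""
    t_mean, r_mean = fmean(target), fmean(reference)
    t_std, r_std = pstdev(target), pstdev(reference)
    # Guard against a flat channel, where the target std is (near) zero.
    scale = r_std / t_std if t_std > 1e-6 else 1.0

    out = []
    for value in target:
        matched = (value - t_mean) * scale + r_mean
        # strength = 0.0 keeps the original value, 1.0 applies the full transfer.
        blended = (1.0 - strength) * value + strength * matched
        out.append(min(255.0, max(0.0, blended)))  # assumed 8-bit channel range
    return out


print(reinhard_channel([100, 120, 140], [150, 160, 170], strength=1.0))
```

Applying this only to the A and B channels leaves luminance, and therefore the line art, untouched.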
- Python 3.10+
- PyTorch 2.3+ (with CUDA for GPU acceleration)
- ~500 MB disk space for auto mode weights (downloaded automatically)
- ~6 GB disk space for reference mode weights (downloaded on first use)
- GPU (optional): Any NVIDIA GPU with 2+ GB VRAM for auto mode, 6+ GB for reference mode
| Mode | VRAM | Speed |
|---|---|---|
| Auto (mc-v2) | ~3 GB | ~3 s/page |
| Auto + ESRGAN | ~3.5 GB | ~5 s/page |
| Reference (MangaNinja) | ~6 GB | ~15-30 s/page |
Only one colorizer is loaded at a time. Switching modes automatically unloads the current model and frees VRAM.
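The one-model-at-a-time rule can be sketched as a small manager that unloads the current colorizer before loading another. This is an illustration under assumptions, not the contents of core/model_manager.py: the class shape, the loader callables, and the `on_unload` hook are all hypothetical.

```python
from typing import Callable, Dict, Optional


class ModelManager:
    """Keeps at most one colorizer in memory (hypothetical sketch)."""

    def __init__(
        self,
        loaders: Dict[str, Callable[[], object]],
        on_unload: Optional[Callable[[str], None]] = None,
    ) -> None:
        self._loaders = loaders
        self._on_unload = on_unload
        self._mode: Optional[str] = None
        self._model: Optional[object] = None

    @property
    def current_mode(self) -> Optional[str]:
        return self._mode

    def get(self, mode: str) -> object:
        """Return the model for `mode`, unloading any other model first."""
        if mode not in self._loaders:
            raise KeyError(f"unknown mode: {mode!r}")
        if mode != self._mode:
            self.unload()
            self._model = self._loaders[mode]()
            self._mode = mode
        return self._model

    def unload(self) -> None:
        """Drop the current model and notify the hook, if a model is loaded."""
        if self._mode is None:
            return
        old_mode = self._mode
        self._model = None
        self._mode = None
        if self._on_unload is not None:
            # In a PyTorch app this hook could call gc.collect() and
            # torch.cuda.empty_cache() to return cached VRAM to the driver.
            self._on_unload(old_mode)


freed = []
manager = ModelManager(
    {"auto": lambda: "mc-v2", "reference": lambda: "manganinja"},
    on_unload=freed.append,
)
manager.get("auto")
manager.get("reference")  # unloads "auto" first
print(manager.current_mode, freed)
```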
git clone https://github.com/vikast908/ColorComic.git
cd ColorComic
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate
With GPU (NVIDIA CUDA):
Visit pytorch.org/get-started and select your CUDA version, or:
# CUDA 12.8 (most recent NVIDIA drivers)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
# CUDA 12.4
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu124
# CUDA 11.8
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
CPU only:
pip install torch torchvision
pip install -r requirements.txt
cp .env.example .env
Edit .env if you want to change defaults:
SECRET_KEY=change-this-to-a-random-string
COLORCOMIC_DEVICE=auto # auto | cpu | cuda
COLOR_TRANSFER_STRENGTH=0.7 # 0.0 = no transfer, 1.0 = full transfer
POSTPROCESS_L_CHANNEL=1 # 1 = enabled, 0 = disabled
POSTPROCESS_GUIDED_FILTER=1 # 1 = enabled, 0 = disabled
POSTPROCESS_UPSCALE=0 # 1 = enable Real-ESRGAN 4x upscale
MANGANINJA_DENOISE_STEPS=30 # Denoising steps for reference mode
python app.py
Open http://127.0.0.1:5000 in your browser. On first run, auto mode weights (~140 MB) are downloaded from Google Drive. Reference mode weights (~6 GB) are downloaded from HuggingFace on first use.
- Upload — Drop a B&W comic/manga PDF onto the upload page
- Choose mode — Select "Auto" for automatic colorization or "Reference" for reference-based (upload a colored reference image)
- Detect GPU — Click "Detect GPU" to see your hardware specs and pick CPU or GPU
- Colorize — Hit "Upload & Colorize" and watch the progress with live previews
- Download — Get the colorized PDF or review individual pages
| Variable | Default | Description |
|---|---|---|
| COLORCOMIC_DEVICE | auto | Device for inference. auto picks GPU if available. |
| COLOR_TRANSFER_STRENGTH | 0.7 | Cross-page color alignment strength (0.0–1.0). Auto mode only. |
| POSTPROCESS_L_CHANNEL | 1 | Replace colorized luminance with original grayscale for sharper lines. |
| POSTPROCESS_GUIDED_FILTER | 1 | Smooth color bleeding at edges using the original as guide. |
| POSTPROCESS_UPSCALE | 0 | Enable Real-ESRGAN 4x upscaling (downloads ~17 MB model on first use). |
| MANGANINJA_DENOISE_STEPS | 30 | DDIM denoising steps for reference mode. Lower = faster, higher = better quality. |
| SD15_MODEL_PATH | HuggingFace | Override Stable Diffusion 1.5 model path for reference mode. |
| CLIP_VISION_PATH | HuggingFace | Override CLIP vision model path for reference mode. |
| SECRET_KEY | random | Flask session secret. Set to a fixed string in production. |
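The variables above could be read into a typed settings object along these lines. This is a sketch, not the project's config.py: the `Settings` dataclass and `load_settings` function are hypothetical, and clamping the strength to 0.0-1.0 is an assumption based on the documented range.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    device: str
    color_transfer_strength: float
    l_channel: bool
    guided_filter: bool
    upscale: bool
    denoise_steps: int


def _flag(env: Mapping[str, str], name: str, default: str) -> bool:
    """Interpret "1" as enabled and anything else as disabled."""
    return env.get(name, default).strip() == "1"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, using the documented defaults."""
    device = env.get("COLORCOMIC_DEVICE", "auto").strip().lower()
    if device not in {"auto", "cpu", "cuda"}:
        raise ValueError(f"COLORCOMIC_DEVICE must be auto, cpu, or cuda, got {device!r}")

    strength = float(env.get("COLOR_TRANSFER_STRENGTH", "0.7"))
    strength = min(1.0, max(0.0, strength))  # assumed clamp to the documented range

    return Settings(
        device=device,
        color_transfer_strength=strength,
        l_channel=_flag(env, "POSTPROCESS_L_CHANNEL", "1"),
        guided_filter=_flag(env, "POSTPROCESS_GUIDED_FILTER", "1"),
        upscale=_flag(env, "POSTPROCESS_UPSCALE", "0"),
        denoise_steps=int(env.get("MANGANINJA_DENOISE_STEPS", "30")),
    )


print(load_settings({"POSTPROCESS_UPSCALE": "1"}))
```

Passing a plain dict instead of os.environ makes the loader easy to exercise without touching the real environment.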
ColorComic/
├── app.py # Flask application & routes
├── config.py # Configuration (env vars, paths)
├── requirements.txt # Python dependencies
│
├── core/
│ ├── model_manager.py # VRAM-aware model switching (one colorizer at a time)
│ ├── ml_colorizer.py # manga-colorization-v2 wrapper (auto mode)
│ ├── manga_ninja_colorizer.py # MangaNinja wrapper (reference mode)
│ ├── postprocessor.py # L-channel preservation + guided filter
│ ├── upscaler.py # Real-ESRGAN 4x upscaler (self-contained)
│ ├── color_consistency.py # LAB color transfer for cross-page consistency
│ ├── model_downloader.py # Auto-download weights (Google Drive + HuggingFace)
│ ├── pdf_handler.py # PDF extraction & reassembly (PyMuPDF)
│ └── panel_detector.py # Panel detection (available for future use)
│
├── models/
│ ├── schemas.py # Pydantic data models (JobState, PanelRegion)
│ └── weights/ # Model weights (auto-downloaded, gitignored)
│
├── vendor/
│ ├── manga_colorization_v2/ # Vendored mc-v2 inference code
│ └── manganinja/ # Vendored MangaNinja inference code (CC BY-NC 4.0)
│ ├── pipeline.py # Main diffusion pipeline
│ ├── point_network.py # PointNet for spatial correspondence
│ ├── annotator/ # Lineart extraction
│ └── models/ # Custom UNet, attention, transformer blocks
│
├── templates/ # Jinja2 HTML templates
│ ├── base.html # Layout with dark theme
│ ├── index.html # Upload page with mode selector + GPU detection
│ ├── processing.html # Live progress with side-by-side preview
│ └── preview.html # Page-by-page review & download
│
└── static/
├── css/style.css # Dark theme stylesheet
└── js/
├── app.js # Global JS utilities
└── upload.js # Upload logic, mode toggle, reference upload
| Method | Endpoint | Description |
|---|---|---|
| GET | / | Upload page |
| POST | /upload | Upload PDF (+ optional reference image and mode) |
| POST | /api/colorize/<job_id> | Start colorization pipeline |
| GET | /api/colorize/<job_id>/stream | SSE stream of progress events |
| GET | /api/preview/<job_id>/<page> | Serve a colorized page image |
| GET | /pages/<job_id>/<page> | Serve an original B&W page image |
| GET | /api/download/<job_id> | Download the colorized PDF |
| GET | /api/gpu-info | GPU detection (name, VRAM, compute capability) |
| GET | /api/status | Model health check (device, mode, CUDA status) |
| GET | /processing/<job_id> | Processing page with live preview |
| GET | /preview/<job_id> | Review colorized pages |
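A script that consumes the progress stream needs to split the Server-Sent Events response into individual events. The sketch below parses the generic SSE wire format (data: lines terminated by a blank line). The payload structure of ColorComic's events is not documented here, so the parser returns each payload as a raw string, and the function name is hypothetical.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event in a stream of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # A single leading space after the colon is not part of the value.
            buffer.append(value[1:] if value.startswith(" ") else value)
    # An event without a terminating blank line is incomplete and is dropped.


sample = ["data: page 1 done\n", "\n", ": keep-alive\n", "data: page 2 done\n", "\n"]
print(list(iter_sse_data(sample)))
```

To use it against a running instance, the decoded lines of a streaming GET on /api/colorize/<job_id>/stream could be passed in, for example from urllib.request.urlopen with each line decoded as UTF-8.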
ColorComic auto-detects CUDA GPUs at startup. On the upload page, click "Detect GPU" to see:
- GPU name and model
- Total and free VRAM
- Compute capability and SM count
- CUDA toolkit version
- A recommendation (GPU or CPU based on available VRAM)
Minimum: Any NVIDIA GPU with 2 GB VRAM (auto mode)
Recommended: 6+ GB VRAM for reference mode (RTX 3060, RTX 4070, etc.)
CPU fallback: Automatic if GPU runs out of memory mid-inference
If PyTorch was installed without CUDA support (torch+cpu), GPU will not be available regardless of hardware. Reinstall with the correct CUDA index URL (see Installation).
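To check which device a given PyTorch install can actually use, a small helper along these lines may be useful. It is a sketch, not the project's detection code: `resolve_device` is a hypothetical name, and silently falling back to CPU when cuda is requested but unavailable is an assumption.

```python
def resolve_device(requested: str = "auto") -> str:
    """Map a COLORCOMIC_DEVICE-style value to "cuda" or "cpu"."""
    requested = requested.strip().lower()
    if requested not in {"auto", "cpu", "cuda"}:
        raise ValueError(f"unknown device setting: {requested!r}")
    if requested == "cpu":
        return "cpu"
    try:
        import torch  # imported lazily so the helper also works without PyTorch
    except ImportError:
        return "cpu"
    # A CPU-only build (torch+cpu) reports False here even if a GPU is installed.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device("auto"))
```

If this prints cpu on a machine with an NVIDIA GPU, printing torch.version.cuda shows whether the installed build includes CUDA at all (it is None for CPU-only builds).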
The inference pipeline is optimized to minimize latency and memory overhead:
- torch.inference_mode() wraps all model calls: disables gradient tracking, reduces memory and speeds up tensor operations
- cuDNN benchmark enabled at startup: autotuner selects the fastest convolution kernels for fixed input sizes (576x576, 512x512)
- GPU-resident tile accumulation in Real-ESRGAN: tiles stay on GPU during upscaling with a single CPU transfer at the end, eliminating per-tile PCIe round-trips
- Adaptive interpolation: INTER_AREA for downsampling, INTER_LANCZOS4 for upsampling, matched to each resize direction
- Guided filter downscaling: images >1024px are downscaled before chrominance filtering, then upscaled back; keeps per-page filtering under 100 ms even at 300 DPI
- In-place color transfer: Reinhard LAB transfer uses zero-copy in-place operations with np.clip(out=...)
- Minimal intermediate allocations: BGR→PIL conversion in one step, A/B channels extracted individually instead of full-array float32 conversion
- Lighter output encoding: JPEG quality 85 for colorized intermediates (vs 95), ~40% smaller files with no visible difference
- Job queue cleanup — queues are freed after completion to prevent memory leaks across jobs
- Manga-optimized: Both models are trained on manga/anime artwork. Western comics may get lower-quality results.
- Reference mode is CC BY-NC 4.0: MangaNinja is licensed for non-commercial use only. A notice is displayed in the UI.
- Reference mode needs a colored reference: You must provide one colored page. The model transfers that page's color palette to all others.
- First-time reference download: ~6 GB of weights (SD 1.5, CLIP, MangaNinja) are downloaded on first reference mode use.
- Consistency is approximate: LAB color transfer (auto mode) aligns global color distributions, not specific character elements.
- Single-user: The Flask app uses in-memory job storage. Designed for local/single-user use.
- manga-colorization-v2 by qweasdd — automatic colorization model (auto mode)
- MangaNinja by ali-vilab — reference-based colorization model (CVPR 2025, CC BY-NC 4.0)
- Real-ESRGAN by xinntao — anime-optimized super-resolution
- FFDNet — denoising network used for preprocessing
- PyMuPDF — PDF extraction and reassembly
This project is provided under the MIT License.
The vendored manga-colorization-v2 model code and FFDNet denoiser code retain their original licenses. The vendored MangaNinja code is licensed under CC BY-NC 4.0 (non-commercial use only). See the respective repositories for details.