A complete final year project that colorizes grayscale images and videos using a custom Convolutional Neural Network (CNN) trained from scratch without any pretrained weights.
- Project Overview
- Project Structure
- Installation & Setup
- Training the Model
- Using the Application
- Features
- API Endpoints
- Troubleshooting
- Project Details
Build a deep learning system that converts grayscale images and videos to color using a custom CNN architecture, without relying on any pretrained models.
- Framework: PyTorch
- Color Space: LAB (works with L channel input to predict A and B channels)
- Dataset: CIFAR-10
- Backend: Flask REST API
- Frontend: React with interactive UI
- Video Processing: OpenCV
- Type: Convolutional Neural Network (CNN)
- Input: Grayscale image (L channel only)
- Output: Color channels (A and B)
- Encoder-Decoder: 8 blocks with batch normalization and ReLU activation
- Loss Function: Mean Squared Error (MSE)
Final Project/
├── backend/
│   ├── data_loader.py          # Custom Dataset class and DataLoaders
│   ├── model.py                # CNN architecture (ColorizeNet)
│   ├── train.py                # Training script with logging
│   ├── predict.py              # Inference script for images
│   ├── video_colorize.py       # Video processing and colorization
│   ├── utils.py                # Color space conversions & utilities
│   ├── app.py                  # Flask API server
│   ├── config.py               # Configuration settings
│   ├── requirements.txt        # Python dependencies
│   └── models/                 # Directory for saved models
│       └── colorize_model.pth  # Trained model weights
│
├── frontend/
│   ├── public/
│   │   └── index.html          # HTML entry point
│   ├── src/
│   │   ├── components/         # React components
│   │   │   ├── ImageUploader.js
│   │   │   ├── ImageUploader.css
│   │   │   ├── VideoUploader.js
│   │   │   ├── VideoUploader.css
│   │   │   ├── ImageComparison.js
│   │   │   └── ImageComparison.css
│   │   ├── App.js              # Main React app
│   │   ├── App.css
│   │   ├── index.js
│   │   └── index.css
│   ├── package.json
│   └── .gitignore
│
├── dataset/                    # CIFAR-10 dataset (auto-downloaded)
├── outputs/                    # Output colorized images/videos
├── .github/
│   └── copilot-instructions.md
├── .gitignore
└── README.md                   # This file
- Python 3.8+
- Node.js 14+
- pip and npm package managers
- CUDA 11.0+ (optional, for GPU support)
Backend setup:

1. Navigate to the backend directory:
   cd backend

2. Create a virtual environment (recommended):
   # Windows
   python -m venv venv
   venv\Scripts\activate

   # macOS/Linux
   python3 -m venv venv
   source venv/bin/activate

3. Install dependencies:
   pip install -r requirements.txt

Frontend setup:

1. Navigate to the frontend directory:
   cd frontend

2. Install dependencies:
   npm install
cd backend

Quick training (small run for testing):
python train.py --epochs 5 --batch_size 32 --image_size 256

Full training:
python train.py \
  --epochs 20 \
  --batch_size 64 \
  --learning_rate 0.001 \
  --image_size 256 \
  --model_save_path models/colorize_model.pth

Arguments:
- --epochs: Number of training epochs (default: 20)
- --batch_size: Batch size for training (default: 32)
- --learning_rate: Learning rate for the Adam optimizer (default: 0.001)
- --image_size: Input image resolution (default: 256)
- --model_save_path: Path to save the trained model (default: models/colorize_model.pth)
- --num_workers: Number of data-loading workers (default: 0)

After training, you'll see:
- models/colorize_model.pth: trained model weights
- outputs/training_loss.png: loss curve visualization
- outputs/training_history.json: numerical training history
- The training script downloads CIFAR-10 automatically (~170 MB)
- Uses MSE loss between predicted and ground truth AB channels
- Implements learning rate scheduling (the learning rate is multiplied by 0.5 after 5 epochs without improvement)
- Best model is saved based on validation loss
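The scheduler and best-model checkpointing described above can be sketched in PyTorch roughly as follows. This is a minimal sketch, not the project's actual train.py: the one-layer stand-in model, the constant validation loss, and the checkpoint path are all placeholders for illustration.

```python
import os
import tempfile

import torch

# Stand-in for the real ColorizeNet defined in backend/model.py.
model = torch.nn.Conv2d(1, 2, kernel_size=3, padding=1)

# Adam with weight decay (L2 regularization), as in the training details.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-5)

# Halve the learning rate after 5 epochs without validation-loss improvement.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5
)

ckpt_path = os.path.join(tempfile.gettempdir(), "colorize_model.pth")
best_val_loss = float("inf")
for epoch in range(8):
    val_loss = 0.05  # stand-in for a real validation pass
    scheduler.step(val_loss)
    # Save a checkpoint only when validation loss improves ("best model").
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        torch.save(model.state_dict(), ckpt_path)

print(optimizer.param_groups[0]["lr"])  # 0.0005: halved once after the plateau
```

Because the validation loss never improves after the first epoch, the scheduler trips after its 5-epoch patience and the learning rate drops from 0.001 to 0.0005.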
Start the backend:
cd backend
python app.py
The Flask API will run on http://localhost:5000.

Start the frontend (in a second terminal):
cd frontend
npm start
The React app will run on http://localhost:3000.
- Open a browser to http://localhost:3000
- Select the Image or Video tab
- Adjust color intensity (0.0 - 2.0)
- Toggle denoising if desired
- Upload file (drag & drop or click to select)
- Wait for processing
- Download colorized output
Single image:
cd backend
python predict.py \
  --model_path models/colorize_model.pth \
  --image_path path/to/image.jpg \
  --output_dir outputs \
  --color_intensity 1.0 \
  --denoise

Batch of images:
python predict.py \
  --model_path models/colorize_model.pth \
  --image_dir path/to/image/folder \
  --output_dir outputs

Video:
python video_colorize.py \
  --model_path models/colorize_model.pth \
  --video_path path/to/video.mp4 \
  --output_path outputs/colorized.mp4 \
  --color_intensity 1.0 \
  --skip_frames 1

- ✅ Convert grayscale to color
- ✅ Process color images (recolorize)
- ✅ Adjustable color intensity (0.0 to 2.0)
- ✅ Optional denoising filter
- ✅ Batch processing support
- ✅ Multiple image format support (PNG, JPG, BMP, TIFF)
- ✅ Extract and colorize video frames
- ✅ Frame skipping for faster processing
- ✅ Reconstruct video from colorized frames
- ✅ Preserves original video FPS
- ✅ Multiple video format support
- 🎨 Modern, responsive web interface
- 🖱️ Interactive before/after comparison slider
- 📊 Real-time processing feedback
- 💾 One-click download of results
- 🎛️ Easy adjustment controls
- 🔧 Non-Local Means denoising
- 📉 Noise reduction that preserves details
- ⚙️ Configurable denoising strength
GET /health
Response: { "status": "ok", "message": "..." }
POST /colorize-image
Content-Type: multipart/form-data
Parameters:
- file: Image file (required)
- intensity: Color intensity 0.0-2.0 (default: 1.0)
- denoise: Apply denoising true/false (default: false)
Response: PNG image file
POST /colorize-video
Content-Type: multipart/form-data
Parameters:
- file: Video file (required)
- intensity: Color intensity 0.0-2.0 (default: 1.0)
- denoise: Apply denoising true/false (default: false)
- skip_frames: Process every nth frame (default: 1)
Response: MP4 video file
GET /model-info
Response: Model architecture details
GET /supported-formats
Response: List of supported file formats
Error: Cannot load model at 'models/colorize_model.pth'
Solution: Train the model first using train.py
Error: Access to XMLHttpRequest blocked by CORS policy
Solution: CORS is enabled in Flask app. Check if backend is running on port 5000
Error: CUDA out of memory
Solution:
1. Reduce batch_size: --batch_size 16
2. Use CPU: Set device to 'cpu'
3. Reduce image_size: --image_size 128
Problem: Video processing is slow
Solution: Use the --skip_frames parameter:
python video_colorize.py --skip_frames 2  # Process every 2nd frame
Error: Cannot GET /
Solution:
1. Ensure npm start is running
2. Check http://localhost:3000
3. Check React dev server output for errors
The LAB color space consists of:
- L channel: Lightness (0-100)
- A channel: Green-Red (-128 to 127)
- B channel: Blue-Yellow (-128 to 127)
Our model:
- Input: L channel (grayscale)
- Output: A and B channels (color)
- Advantage: Separates luminance from color, better for colorization
Input: (B, 1, 256, 256) - Grayscale L channel
ENCODER:
├─ Conv2d: 1 → 64 channels
├─ Conv2d: 64 → 64 channels
├─ MaxPool: 256 → 128
├─ Conv2d: 64 → 128 channels
├─ Conv2d: 128 → 128 channels
├─ MaxPool: 128 → 64
├─ Conv2d: 128 → 256 channels
├─ Conv2d: 256 → 256 channels
├─ MaxPool: 64 → 32
├─ Conv2d: 256 → 512 channels (bottleneck)
└─ Conv2d: 512 → 512 channels

DECODER:
├─ ConvTranspose2d: 512 → 256 (32 → 64)
├─ ConvTranspose2d: 256 → 128 (64 → 128)
├─ ConvTranspose2d: 128 → 64 (128 → 256)
├─ ConvTranspose2d: 64 → 32 (256 → 512)
└─ Conv2d: 32 → 2 channels (output: A, B)

Output: (B, 2, 256, 256) - Color AB channels
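The encoder-decoder pattern above can be sketched in PyTorch as a scaled-down model. This is an illustrative sketch with fewer blocks than the real ColorizeNet in backend/model.py; the class name MiniColorizeNet is hypothetical.

```python
import torch
import torch.nn as nn


class MiniColorizeNet(nn.Module):
    """Simplified encoder-decoder sketch: Conv -> BatchNorm -> ReLU blocks
    with MaxPool downsampling and ConvTranspose upsampling."""

    def __init__(self):
        super().__init__()

        def block(c_in, c_out):
            # Conv -> BatchNorm -> ReLU, as described in the training details.
            return nn.Sequential(
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                nn.BatchNorm2d(c_out),
                nn.ReLU(inplace=True),
            )

        self.encoder = nn.Sequential(
            block(1, 64), nn.MaxPool2d(2),     # 256 -> 128
            block(64, 128), nn.MaxPool2d(2),   # 128 -> 64
            block(128, 256), nn.MaxPool2d(2),  # 64 -> 32
            block(256, 512),                   # bottleneck at 32x32
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(512, 256, kernel_size=2, stride=2),  # 32 -> 64
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2),  # 64 -> 128
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, kernel_size=2, stride=2),   # 128 -> 256
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 2, kernel_size=3, padding=1),             # -> A, B
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))


x = torch.randn(1, 1, 256, 256)  # one grayscale L channel
y = MiniColorizeNet()(x)
print(y.shape)  # torch.Size([1, 2, 256, 256])
```

The key design point carries over: the spatial size lost in the encoder (three 2x poolings) is recovered by strided transposed convolutions, so the predicted AB map matches the input L channel pixel for pixel.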
- Loss Function: MSELoss (Mean Squared Error)
- Optimizer: Adam with weight decay (L2 regularization)
- Learning Rate: 0.001 (with ReduceLROnPlateau scheduler)
- Batch Normalization: Applied after each convolution
- Activation: ReLU throughout network
- Data Augmentation: Resize to 256×256
- Normalize RGB [0, 255] → [0, 1]
- Apply gamma correction
- Convert to XYZ using the transformation matrix
- Normalize by the reference white point
- Apply the non-linear transformation
- Calculate LAB values

AB channel scaling for the network:
A: [-128, 127] → [-1, 1] (divide by 128)
B: [-128, 127] → [-1, 1] (divide by 128)
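The conversion steps and AB scaling above can be sketched in NumPy. This is a minimal sRGB → LAB implementation assuming a D65 white point; the project's utils.py may differ in constants and details.

```python
import numpy as np


def rgb_to_lab(rgb):
    """sRGB uint8 image (H, W, 3) -> LAB floats, following the steps above."""
    # 1. Normalize RGB [0, 255] -> [0, 1]
    rgb = rgb.astype(np.float64) / 255.0
    # 2. Gamma correction (inverse sRGB companding)
    rgb = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    # 3. Linear RGB -> XYZ via the sRGB transformation matrix
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = rgb @ m.T
    # 4. Normalize by the D65 reference white point
    xyz /= np.array([0.95047, 1.0, 1.08883])
    # 5. Non-linear transformation
    f = np.where(xyz > 0.008856, np.cbrt(xyz), 7.787 * xyz + 16.0 / 116.0)
    # 6. Calculate L, A, B
    L = 116.0 * f[..., 1] - 16.0
    A = 500.0 * (f[..., 0] - f[..., 1])
    B = 200.0 * (f[..., 1] - f[..., 2])
    return np.stack([L, A, B], axis=-1)


img = np.zeros((2, 2, 3), dtype=np.uint8)  # pure black
lab = rgb_to_lab(img)
print(lab[0, 0])  # black maps to L = 0 with neutral A, B

# Scale AB into [-1, 1] for the network, as above
ab_scaled = lab[..., 1:] / 128.0
```

As a sanity check, pure black should give L = 0 and pure white L = 100, both with A and B near zero, since neutral grays carry no chroma in LAB.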
Adjusted_AB = Original_AB × Intensity_Factor
Range: 0.0 (grayscale) to 2.0 (vibrant colors)
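The intensity formula above amounts to a single scale-and-clamp over the AB channels; here is a short sketch (the clamp to the LAB AB range is an assumption, not confirmed from the source):

```python
import numpy as np


def adjust_intensity(ab, factor):
    """Scale predicted AB channels by the intensity factor and clamp
    to the assumed valid LAB range of [-128, 127]."""
    return np.clip(ab * factor, -128.0, 127.0)


ab = np.array([[10.0, -20.0], [60.0, 100.0]])
print(adjust_intensity(ab, 0.0))  # all chroma removed -> grayscale
print(adjust_intensity(ab, 2.0))  # values doubled, then clamped
```

At factor 0.0 every AB value collapses to zero (a grayscale result), while at 2.0 chroma doubles and out-of-range values such as 200 are clamped to 127.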
Typical results after training on CIFAR-10:
| Metric | Value |
|---|---|
| Best Validation MSE Loss | ~0.02 |
| Training Time (20 epochs, GPU) | 2-4 hours |
| Training Time (20 epochs, CPU) | 8-12 hours |
| Model Size | ~15 MB |
| FPS (Image prediction, GPU) | 30-50 |
| FPS (Image prediction, CPU) | 5-10 |
By working through this project, you'll learn:
1. Deep Learning Fundamentals
   - CNN architecture design
   - Encoder-decoder patterns
   - Batch normalization
   - Loss functions and optimization

2. PyTorch Skills
   - Building custom models
   - Custom Dataset classes
   - DataLoader creation
   - Training loops
   - Model checkpointing

3. Image Processing
   - Color space conversions (RGB, LAB, XYZ)
   - Image normalization
   - Denoising techniques
   - Video frame extraction

4. Full-Stack Development
   - REST API design with Flask
   - React component creation
   - CORS and API integration
   - File upload handling

5. Software Engineering
   - Project structure
   - Configuration management
   - Error handling
   - Code documentation
All code is written with detailed comments explaining:
- Function purpose and parameters
- Algorithm steps
- Mathematical transformations
- Edge cases and special handling
This project is created for educational purposes as a final year project.
For issues or questions:
- Check the Troubleshooting section
- Review code comments for implementation details
- Check Flask/React console output for error messages
Quick reference:
- Train: cd backend && python train.py --epochs 5
- Start backend: cd backend && python app.py
- Start frontend: cd frontend && npm start
- Colorize an image: python backend/predict.py --model_path backend/models/colorize_model.pth --image_path image.jpg
- Colorize a video: python backend/video_colorize.py --model_path backend/models/colorize_model.pth --video_path video.mp4

Created: 2024
Status: Complete Final Year Project
Difficulty: Intermediate to Advanced