
Image Colorization

PyTorch implementation of a VGG-based architecture for image colorization, trained end to end with 3 different loss functions: L2 norm loss, balanced class cross-entropy loss (Balanced class cross-entropy loss reference Paper), and a custom focal loss built on top of the previously defined loss.

Example results (ground truth / black and white / prediction)
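As a rough illustration of how these losses relate, below is a minimal PyTorch sketch of the L2 loss and a focal-style modulation of it; the gamma value, the error normalization, and the exact weighting are assumptions for illustration, not taken from this repository.

```python
import torch

def l2_loss(pred_ab, target_ab):
    # Plain mean squared error over the predicted ab channels.
    return torch.mean((pred_ab - target_ab) ** 2)

def focal_l2_loss(pred_ab, target_ab, gamma=2.0):
    # Focal-style modulation of the per-pixel squared error: pixels with a
    # larger error receive a larger weight, so training focuses on hard
    # (e.g. strongly colored) regions. The gamma value and the max-based
    # normalization are illustrative assumptions.
    err = (pred_ab - target_ab) ** 2
    weight = (err / (err.detach().max() + 1e-8)) ** gamma
    return torch.mean(weight * err)
```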

We present deployed versions of 2 variants: L2 norm loss and focal loss with L2 norm. The models were trained on Google Cloud, using VM instances with the following specifications: n1-highmem-16 (16 vCPUs, 104 GB memory), GPU: 4 x NVIDIA Tesla V100.

Training was parallelized across 3 machines, with each run taking about 1-2 hours on average.

Running

python main.py --images <path to images> [--train] [--new] [--focal]

With --train, the script trains the model and saves it to model_l2[_focal].pt. Otherwise, it tests the model and outputs the best and worst predictions to an output-<random> folder.

With --new, the model is trained from scratch. Otherwise, training starts from the pre-trained model. This flag only applies to training.

With --focal, the model uses the focal loss. Otherwise, it uses the plain L2 loss.

To use pre-trained models, download them and put them in the root folder.
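For reference, the flags above could be parsed along these lines (a sketch only; the actual argument handling in main.py may differ):

```python
import argparse

parser = argparse.ArgumentParser(description="Image colorization")
parser.add_argument("--images", required=True, help="path to the image folder")
parser.add_argument("--train", action="store_true", help="train instead of test")
parser.add_argument("--new", action="store_true", help="train from scratch (training only)")
parser.add_argument("--focal", action="store_true", help="use the focal loss instead of plain L2")
args = parser.parse_args()

# The checkpoint name encodes the loss variant: model_l2.pt or model_l2_focal.pt.
model_path = "model_l2_focal.pt" if args.focal else "model_l2.pt"
```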

Reference Architecture

The balanced class cross-entropy architecture is shown below, following the Balanced class cross-entropy loss reference Paper:

(architecture diagrams from the reference paper)

In this project we used a slightly different version, removing the penultimate conv layer (shown in blue in the figure). The original paper used this conv layer, with 313 output activation filters, to produce a color probability distribution for each pixel over a quantized color space of dimension 313. The ab color frame in the original paper is then created from this distribution using a decode function.

In our implementation, the ab color frame is obtained directly from the output of a conv layer with 2 output activation filters, placed immediately after the conv8 layer, as shown below:

(modified architecture diagram)
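A minimal sketch of this output head, assuming conv8 produces 256 feature maps (the channel count and the Tanh squashing are illustrative assumptions, not taken from the repository):

```python
import torch.nn as nn

# Instead of the 313-filter classification layer from the original paper,
# a single conv layer maps the conv8 features directly to the 2 ab channels.
ab_head = nn.Sequential(
    nn.Conv2d(256, 2, kernel_size=1),  # 2 output filters -> a and b channels
    nn.Tanh(),                         # assumed squashing to a bounded ab range
)
```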

Link to Presentation

Presentation

Link to Datasets

Download trained models

Link to results drive folder

We tested each model against the SUN Images Objects dataset (the training dataset, 16,873 images) and the SUN Images Scenes dataset (37 GB), and selected the best and worst results based on the model loss after training.

These images are split into 4 folders in the drive, each containing 2 subfolders: best and worst. Predicted images and real images can be distinguished by the pred and real annotations in the file names.

Results

Reference papers and useful links

Dependencies

  • scikit-image
  • pytorch
  • matplotlib
  • numpy
  • Pillow
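These can typically be installed with pip (note that the PyTorch package on PyPI is named torch, and the appropriate wheel depends on your CUDA setup):

pip install scikit-image torch matplotlib numpy Pillow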

Folder Structure

(folder structure diagram)
