
SinGAN

image

As a final project in the course “Digital Image Processing” (Computer Science faculty, Technion - Israel Institute of Technology) we were given a list of papers and had to choose one to either apply to a different task or improve its performance. The goal of the project is not to produce a new article, but to investigate the chosen paper in detail, try our own ideas, and analyze and report the outcomes.

I chose the paper “SinGAN: Learning a Generative Model from a Single Natural Image” (Tamar Rott Shaham, Tali Dekel, Tomer Michaeli). It was a natural choice for me, as it was a great opportunity to refresh and expand my theoretical knowledge and practical skills in deep learning, a field I am very enthusiastic about.

(*) Links for the original work:

Project | Arxiv | CVF | Supplementary materials | Talk (ICCV`19)

The first idea I chose to implement is adding an attention mechanism to the GAN at each of the pyramidal levels, in an attempt to give the network the ability to relate features across the whole image (as attention mechanisms do), and measuring performance with the same metrics used for the original SinGAN: SIFID (a single-image variant of the Fréchet Inception Distance) and RMSE. A minimal sketch of such an attention block is shown below.
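For concreteness, one common way to add such a layer is a SAGAN-style self-attention block placed between the convolutional blocks of a single-scale SinGAN generator (and optionally the discriminator). The sketch below is only an illustration of that idea; the class name, channel-reduction factor, and placement are assumptions, not necessarily the exact variant used in this repository.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """SAGAN-style self-attention block (illustrative sketch).
    Intended to be inserted between the conv blocks of one SinGAN scale."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)   # B x HW x C'
        k = self.key(x).view(b, -1, h * w)                       # B x C' x HW
        attn = torch.softmax(torch.bmm(q, k), dim=-1)            # B x HW x HW
        v = self.value(x).view(b, -1, h * w)                     # B x C x HW
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x                              # residual connection
```

Note that the attention map has size HW × HW, which is exactly what makes such a block memory-expensive at the finer (higher-resolution) scales of the pyramid.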

On some examples, a particular attention architecture achieved a better SIFID, although the improvement was not visually noticeable, and evaluating it on the whole test dataset that SinGAN was originally evaluated on was too memory-expensive.

Another idea I examined in this project is creating multiple animations from a single one, in the same manner that SinGAN creates multiple similar images from a single image. The difference from the paper's approach is adding sequence memory to the method instead of only performing a random walk in the learned z-space of one image. For this, I used RNN/LSTM-like architectures and extended the pyramidal SinGAN architecture; a rough sketch of the idea follows.
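To illustrate what “sequence memory” means here, the sketch below passes a sequence of per-frame noise maps through a minimal convolutional LSTM cell, so each frame's latent map depends on the previous frames rather than being an independent step of a random walk. The names, shapes, and the way the output would be fed back into the pyramid are assumptions made for illustration; the architecture described in the report differs in its details.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal convolutional LSTM cell carrying memory across frames."""
    def __init__(self, in_ch, hidden_ch, kernel_size=3):
        super().__init__()
        self.hidden_ch = hidden_ch
        # one convolution producing the input/forget/output/candidate gates
        self.gates = nn.Conv2d(in_ch + hidden_ch, 4 * hidden_ch,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

def correlated_latents(z_frames, hidden_ch=32):
    """Turn independent per-frame noise maps into a temporally correlated
    sequence of latent maps for a (pretrained) SinGAN pyramid."""
    b, in_ch, height, width = z_frames[0].shape
    cell = ConvLSTMCell(in_ch, hidden_ch)
    to_z = nn.Conv2d(hidden_ch, in_ch, kernel_size=1)  # project back to noise channels
    state = (torch.zeros(b, hidden_ch, height, width),
             torch.zeros(b, hidden_ch, height, width))
    out = []
    for z in z_frames:
        h, state = cell(z, state)
        out.append(to_z(h))
    return out
```

Each latent map in the returned sequence would then be injected at a chosen scale of the frozen SinGAN pyramid to render one frame of the animation.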

In both implementations a noticeable bottleneck was the lack of GPU memory, which I handled by investigating less costly attention mechanisms for the first idea and by pruning training at the finest scales of the pyramid for the second (among other technical tweaks). The report first briefly introduces the work and the motivation for it, then overviews and explains the mechanisms and networks used, and finally reproduces, reasonably closely, the paper's results on one of the original 50 images. A sketch of one cheaper attention variant is shown below.
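As an example of what “less costly” can mean, the variant below (building on the SelfAttention sketch above) average-pools the keys and values before the attention product, so the attention map shrinks from HW × HW to HW × (HW / r²). The reduction factor and the pooling choice are illustrative assumptions, not necessarily the variant used here.

```python
import torch
import torch.nn.functional as F

class PooledSelfAttention(SelfAttention):
    """Cheaper attention sketch: keys/values are average-pooled by a factor r,
    shrinking the attention map and its memory footprint."""
    def __init__(self, channels, reduce=4):
        super().__init__(channels)
        self.reduce = reduce

    def forward(self, x):
        b, c, h, w = x.shape
        small = F.avg_pool2d(x, self.reduce)                    # B x C x h/r x w/r
        hw_small = small.shape[2] * small.shape[3]
        q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)   # B x HW x C'
        k = self.key(small).view(b, -1, hw_small)               # B x C' x HW/r^2
        attn = torch.softmax(torch.bmm(q, k), dim=-1)           # B x HW x HW/r^2
        v = self.value(small).view(b, -1, hw_small)             # B x C x HW/r^2
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
        return self.gamma * out + x
```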

Here are some demonstrations of the main results:

Image generation (results similar to the reproduced ones, and slightly better):

image

image

image

Super-resolution (ours achieves better NIQE scores):

image

Finally, here are some GIFs created with the proposed approach (under the memory restrictions we had):

The original (regular and reversed):

Generated:
