# CENG796 Term Project

## HoloGAN: Unsupervised Learning of 3D Representations from Natural Images

Our term project is a re-implementation of the paper *HoloGAN: Unsupervised Learning of 3D Representations from Natural Images* (available at https://arxiv.org/abs/1904.01326). The source code, license, and dataset downloader script can be found in [the GitHub repo](https://github.com/eksuas/HoloGAN-PyTorch).


### Abstract of the paper

We propose a novel generative adversarial network (GAN) for the task of unsupervised learning of 3D representations from natural images. Most generative models rely on 2D kernels to generate images and make few assumptions about the 3D world. These models therefore tend to create blurry images or artefacts in tasks that require a strong 3D understanding, such as novel-view synthesis. HoloGAN instead learns a 3D representation of the world, and to render this representation in a realistic manner. Unlike other GANs, HoloGAN provides explicit control over the pose of generated objects through rigid-body transformations of the learnt 3D features. Our experiments show that using explicit 3D features enables HoloGAN to disentangle 3D pose and identity, which is further decomposed into shape and appearance, while still being able to generate images with similar or higher visual quality than other generative models. HoloGAN can be trained end-to-end from unlabelled 2D images only. Particularly, we do not require pose labels, 3D shapes, or multiple views of the same objects. This shows that HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner.

### Group members of the project

Edanur Demir and Gökhan Özsarı


### Hyper-parameters

These are the hyper-parameters of our model:

```
  --seed N              random seed
  --image-path S        training dataset directory path (default: '../dataset/celebA/')
  --dataset {celebA}    dataset selection (default: celebA)
  --gpu                 flag to enable cuda computation (default: False)
  --batch-size N        training batch size of the model (default: 32)
  --max-epochs N        the maximum number of epochs for training (default: 50)
  --epoch-step N        epoch step to compute the adaptive learning rate (default: 25)
  --z-dim N             the length of the generative model input (default: 128)
  --d-lr N              the learning rate of the discriminator (default: 0.0001)
  --g-lr N              the learning rate of the generator (default: 0.0001)
  --beta1 N             first betas coefficient of the Adam optimizer (default: 0.5)
  --beta2 N             second betas coefficient of the Adam optimizer (default: 0.999)
  --lambda-latent N     the lambda latent coefficient given in the paper (default: 0.0)
  --elevation-low N     the minimum elevation angle (default: 70)
  --elevation-high N    the maximum elevation angle (default: 110)
  --azimuth-low N       the minimum azimuth angle (default: 220)
  --azimuth-high N      the maximum azimuth angle (default: 320)
  --scale-low N         the minimum scaling value of 3D transformation (default: 1.0)
  --scale-high N        the maximum scaling value of 3D transformation (default: 1.0)
  --transX-low N        the minimum translation factor across the X-axis (default: 0)
  --transX-high N       the maximum translation factor across the X-axis (default: 0)
  --transY-low N        the minimum translation factor across the Y-axis (default: 0)
  --transY-high N       the maximum translation factor across the Y-axis (default: 0)
  --transZ-low N        the minimum translation factor across the Z-axis (default: 0)
  --transZ-high N       the maximum translation factor across the Z-axis (default: 0)
  --log-interval N      logging interval in terms of batch size (default: 1000)
  --update-g-every-d N  number of generator updates per discriminator update
  --no-save-model       flag to not save the current model (default: False)
  --rotate-elevation    flag to rotate the z sampling with elevation (default: False)
  --rotate-azimuth      flag to rotate the z sampling with azimuth (default: False)
  --load-dis S          the path for loading and/or evaluating the discriminator
  --load-gen S          the path for loading and/or evaluating the generator
```
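As a rough illustration of how flags like these can be declared and read, here is a minimal `argparse` sketch covering a small subset of the options above. The function name `build_parser` is hypothetical, and the real `initializer` in `init.py` may be organised differently; only the flag names and defaults are taken from the list above.

```python
import argparse

def build_parser():
    """Declare an illustrative subset of the flags listed above (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="HoloGAN options (illustrative subset)")
    parser.add_argument("--batch-size", type=int, default=32, metavar="N",
                        help="training batch size of the model (default: 32)")
    parser.add_argument("--max-epochs", type=int, default=50, metavar="N",
                        help="the maximum number of epochs for training (default: 50)")
    parser.add_argument("--z-dim", type=int, default=128, metavar="N",
                        help="the length of the generative model input (default: 128)")
    parser.add_argument("--d-lr", type=float, default=0.0001, metavar="N",
                        help="the learning rate of the discriminator (default: 0.0001)")
    parser.add_argument("--gpu", action="store_true", default=False,
                        help="flag to enable cuda computation (default: False)")
    parser.add_argument("--rotate-azimuth", action="store_true", default=False,
                        help="flag to rotate the z sampling with azimuth (default: False)")
    return parser

# Passing an explicit list avoids touching sys.argv, which is convenient in a notebook.
args = build_parser().parse_args(["--max-epochs", "3", "--gpu"])
print(args.max_epochs, args.batch_size, args.gpu)  # prints: 3 32 True
```

Note that `argparse` converts dashes in flag names to underscores, so `--max-epochs` is read back as `args.max_epochs`.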

### Creating a Model

We can create an instance of HoloGAN by passing the parsed arguments to its constructor.

```python
import sys
from init import initializer
from hologan import HoloGAN

# Emulate command-line arguments inside the notebook.
sys.argv = ["", "--max-epochs", "3"]
args = initializer()
model = HoloGAN(args)
```

Broken training is detected. Starting epoch is 2




### Training

HoloGAN has its own `train` method, shown below. During training, the discriminator's parameters are updated once per batch, while the generator's parameters are updated twice per batch.

```python
import collections

def train(self, args):
    """HoloGAN trainer.

    This method trains the HoloGAN model.
    """
    d_lr = args.d_lr
    g_lr = args.g_lr
    for epoch in range(args.start_epoch, args.max_epochs):
        # Adaptive learning rate: from `epoch_step` onwards, shrink both rates.
        # The factor multiplies the already-decayed rate, so the decay compounds.
        if epoch >= args.epoch_step:
            adaptive_lr = (args.max_epochs - epoch) / (args.max_epochs - args.epoch_step)
            d_lr *= adaptive_lr
            g_lr *= adaptive_lr
            for param_group in self.optimizer_discriminator.param_groups:
                param_group['lr'] = d_lr
            for param_group in self.optimizer_generator.param_groups:
                param_group['lr'] = g_lr

        result = collections.OrderedDict({"epoch": epoch})
        result.update(self.train_epoch(args, epoch))
        # validate and keep history at each log interval
        self.save_history(args, result)

    # save the model giving the best validation results as a final model
    if not args.no_save_model:
        self.save_model(args, args.max_epochs-1, best=True)

# Calls the `train` method defined on the HoloGAN class.
model.train(args)
```

Epoch: [ 0] [  0/  2] 

KeyboardInterrupt: 
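To make the adaptive learning rate in `train` easier to follow, the sketch below replays only the learning-rate update, without any model or optimizer. The helper name `lr_schedule` is hypothetical; the update rule is copied from the loop above, and the values 50 and 25 are the listed defaults for `--max-epochs` and `--epoch-step`.

```python
def lr_schedule(base_lr, max_epochs, epoch_step, start_epoch=0):
    """Replay the learning-rate update from `train` and return one rate per epoch."""
    lr = base_lr
    history = []
    for epoch in range(start_epoch, max_epochs):
        if epoch >= epoch_step:
            # Same factor as in `train`; it is applied to the current rate,
            # so the reduction accumulates from one epoch to the next.
            lr *= (max_epochs - epoch) / (max_epochs - epoch_step)
        history.append(lr)
    return history

schedule = lr_schedule(base_lr=1e-4, max_epochs=50, epoch_step=25)
for epoch in (0, 25, 26, 30, 49):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.3e}")
```

With these defaults the rate stays at its initial value through epoch 25 (where the factor is exactly 1) and shrinks from epoch 26 onwards. Because each factor multiplies the previous rate, the decay is faster than a straight linear ramp from the initial rate to zero.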

### Saving and Loading

We can load a pre-trained model and compute qualitative samples from it as shown below.

```python
import sys
from init import initializer
from hologan import HoloGAN

# Load the saved discriminator and generator, then sample while rotating the azimuth.
sys.argv = ["", "--rotate-azimuth", "--gpu", "--batch-size", "1",
            "--load-dis", "models/model_final/discriminator.pt",
            "--load-gen", "models/model_final/generator.pt"]
args = initializer()
model = HoloGAN(args)
model.sample(args, trained=True, collection=True)
```

Samples are saved in samples\celebA\sample_1590314062.147041\samples_220.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_230.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_240.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_250.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_260.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_270.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_280.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_290.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_300.png
Samples are saved in samples\celebA\sample_1590314062.147041\samples_310.png
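The file names above suggest that one image is rendered per azimuth angle, from 220 to 310 degrees in steps of 10, which matches the default `--azimuth-low` and `--azimuth-high` values. The sketch below shows how such a sweep and the corresponding rigid-body rotation could be written. The 10-degree step is inferred from the file names, and the helper names and the choice of the Y axis as the vertical axis are assumptions; the actual `sample` method may use a different convention.

```python
import math

def azimuth_angles(low=220, high=320, step=10):
    """Azimuth angles in degrees; `high` is excluded, as in the file names above."""
    return list(range(low, high, step))

def rotation_about_y(angle_deg):
    """3x3 rotation matrix about the Y axis (assumed here to be the vertical axis)."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def rotate(matrix, point):
    """Apply a 3x3 matrix to a 3D point."""
    return [sum(m * p for m, p in zip(row, point)) for row in matrix]

angles = azimuth_angles()
print(angles)  # prints: [220, 230, 240, 250, 260, 270, 280, 290, 300, 310]

# Rotating the unit X vector by 270 degrees about Y moves it onto the Z axis.
rotated = rotate(rotation_about_y(270), [1.0, 0.0, 0.0])
print([round(v, 6) for v in rotated])
```

In HoloGAN the rotation is applied to the learnt 3D feature volume rather than to individual points, so this only illustrates the geometry behind the azimuth parameter.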


### Sampling


