<a href="https://colab.research.google.com/github/duskvirkus/alias-free-gan/blob/notebook-update/notebooks/GPU_Training_Alias_Free_GAN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GPU Training - Alias-Free GAN
by duskvirkus

This is a notebook for training Alias-Free GAN on a Colab GPU instance.

Repository: https://github.com/duskvirkus/alias-free-gan

## GPU Check

If this fails, change the runtime type in `Runtime > Change runtime type > Select GPU`.

In [1]:
!nvidia-smi -L

GPU 0: Tesla V100-SXM2-16GB (UUID: GPU-f1e107b6-811c-edc1-fc11-08cae0acfb96)
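
If you'd rather check the GPU from Python (for example, to stop early when no GPU is attached), here is a minimal sketch that parses text in the `nvidia-smi -L` format shown above. The function name `parse_gpu_list` is an illustration and not part of the repository.

```python
import re

# Matches lines such as "GPU 0: Tesla V100-SXM2-16GB (UUID: GPU-...)".
GPU_LINE = re.compile(r"^GPU \d+: (.+?) \(UUID: [^)]+\)\s*$")


def parse_gpu_list(text):
    """Return the GPU names found in `nvidia-smi -L` style output."""
    names = []
    for line in text.splitlines():
        match = GPU_LINE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


sample = "GPU 0: Tesla V100-SXM2-16GB (UUID: GPU-f1e107b6-811c-edc1-fc11-08cae0acfb96)"
print(parse_gpu_list(sample))
```

In a live session you could feed it the output of `subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True).stdout` and treat an empty list as "no GPU attached".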


## Connect Google Drive

This notebook is designed to be used with Google Drive connected. If you'd like to use it without Google Drive, you'll have to make changes.

The main reason is that Colab sessions automatically shut off after a number of hours (~10 for free and ~20 for Pro). Without persistent storage, you risk losing training progress when that happens.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive
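
Since anything written outside Drive disappears when the session ends, it can help to confirm that a path really sits under the Drive mount before training. The sketch below assumes the `/content/drive/MyDrive` root used in this notebook; `is_on_drive` is a hypothetical helper name.

```python
import os


def is_on_drive(path, drive_root="/content/drive/MyDrive"):
    """Return True if `path` is `drive_root` itself or located inside it."""
    path = os.path.abspath(path)
    drive_root = os.path.abspath(drive_root)
    # commonpath returns the longest shared parent of both paths.
    return os.path.commonpath([path, drive_root]) == drive_root


print(is_on_drive("/content/drive/MyDrive/colab-alias-free-gan/alias-free-gan"))
print(is_on_drive("/content/results"))
```

This only compares path strings. It does not check that Drive is actually mounted, so run it after the mount cell has succeeded.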


## Clone / cd into Repository

In [3]:
import os
drive_path = '/content/drive/MyDrive/'
repo_container_dir = 'colab-alias-free-gan'
repo_name = 'alias-free-gan'
git_repo = 'https://github.com/duskvirkus/alias-free-gan.git'

working_dir = os.path.join(drive_path, repo_container_dir, repo_name)

if os.path.isdir(working_dir):
  %cd {working_dir}
else:
  container_path = os.path.join(drive_path, repo_container_dir)
  os.makedirs(container_path, exist_ok=True)  # don't fail if the container directory already exists
  %cd {container_path}
  !git clone {git_repo}
  %cd {repo_name}
  !mkdir pretrained

/content/drive/MyDrive/colab-alias-free-gan
Cloning into 'alias-free-gan'...
remote: Enumerating objects: 1146, done.
remote: Counting objects: 100% (188/188), done.
remote: Compressing objects: 100% (98/98), done.
remote: Total 1146 (delta 98), reused 136 (delta 74), pack-reused 958
Receiving objects: 100% (1146/1146), 73.49 MiB | 20.55 MiB/s, done.
Resolving deltas: 100% (559/559), done.
Checking out files: 100% (94/94), done.
/content/drive/MyDrive/colab-alias-free-gan/alias-free-gan


## Install Dependencies

In [4]:
!python install.py

Collecting pytorch-lightning
  Downloading pytorch_lightning-1.4.2-py3-none-any.whl (916 kB)
     |████████████████████████████████| 916 kB 15.9 MB/s
Collecting pytorch-lightning-bolts
  Downloading pytorch_lightning_bolts-0.3.2-py3-none-any.whl (253 kB)
     |████████████████████████████████| 253 kB 61.4 MB/s
Collecting wandb
  Downloading wandb-0.12.0-py2.py3-none-any.whl (1.6 MB)
     |████████████████████████████████| 1.6 MB 57.8 MB/s
Collecting ninja
  Downloading ninja-1.10.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB)
     |████████████████████████████████| 108 kB 60.1 MB/s
Collecting pydantic
  Downloading pydantic-1.8.2-cp37-cp37m-manylinux2014_x86_64.whl (10.1 MB)
     |████████████████████████████████| 10.1 MB 64.3 MB/s
Collecting pyhocon
  Downloading pyhocon-0.3.58.tar.gz (114 kB)
     |████████████████████████████████| 114 kB 73.9 MB/s
Collecting opencv-python-headless
  Downloading opencv_python_h

## Convert Dataset

You can skip this section if you already have a dataset in the correct format.

Currently, the converter only supports datasets in which every image has the same dimensions: 256 by 256 **or** 512 by 512 **or** 1024 by 1024.

Prepare your dataset before converting it. Tools for preparing a dataset are beyond the scope of this notebook; [dvschultz/dataset-tools](https://github.com/dvschultz/dataset-tools) is suggested to help with this process.

Structure of your dataset:
```
dataset_root_dir # name of your dataset is suggested
  |- sub_directory # anything (this has to do with labels which is an unsupported feature at current time)
    |- image01.png
    |- images_can_have_any_names.png
    |- they_can_also_be.jpg
    |...continued # Suggested minimum size is 1000+ images.
```

The above example would result in an input of `unconverted_dataset='path/to/dataset_root_dir'`
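
Before running the conversion, you may want to confirm that the folder matches the structure above. This is a rough sketch using only the standard library. The accepted extensions in `IMAGE_SUFFIXES` are an assumption (the example tree only shows `.png` and `.jpg`), and `count_images` is a hypothetical helper, not something the repository provides.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images(dataset_root):
    """Map each immediate sub-directory name to the number of images in it."""
    counts = {}
    for sub in sorted(Path(dataset_root).iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(
                1
                for f in sub.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts
```

For example, `sum(count_images(unconverted_dataset).values())` gives the total number of images, which you can compare against the suggested minimum of 1000. An empty result means the images are probably sitting directly in the root rather than in a sub-directory.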

In [None]:
unconverted_dataset = '/content/drive/MyDrive/dataset-creation/painterly-faces-v2'
out_path = '/content/drive/MyDrive/datasets-aliasfree/painterly-faces-v2-256'
dataset_size = 256 # one of the following 256, 512, 1024
!python scripts/convert_dataset.py --size {dataset_size} {unconverted_dataset} {out_path}

Make dataset of image sizes: 256
  "Argument interpolation should be of type InterpolationMode instead of int. "
  "Argument interpolation should be of type InterpolationMode instead of int. "
1158it [06:45,  2.85it/s]


## Info on training options

Most training options work rather well out of the box. See the training section for suggested arguments.

You can see a full list of training options by running the following cell.

In [None]:
!python scripts/trainer.py --help

## Training

Results from training can be found in the `results` directory.

**Resume from Checkpoint**

Set `--resume_from 'path/to/checkpoint.pt'`

If resuming from a checkpoint that doesn't use the new kimg naming scheme, use `--start_kimg_count` to set the starting count manually.
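
After a Colab disconnect, you usually want the most recent checkpoint. Here is a minimal sketch that picks the newest `.pt` file by modification time. It assumes checkpoints are saved somewhere under `results` with a `.pt` extension; the exact folder layout is not described here, so the search is recursive. `latest_checkpoint` is a hypothetical helper name.

```python
from pathlib import Path


def latest_checkpoint(results_dir="results"):
    """Return the most recently modified .pt file under results_dir, or None."""
    candidates = list(Path(results_dir).rglob("*.pt"))
    if not candidates:
        return None
    # Newest file wins, judged by last-modified timestamp.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

You could then set `resume = str(latest_checkpoint())` in the training cell, after checking that the result is not `None`.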

**Transfer Learning Options**

See the repository for transfer learning options: https://github.com/duskvirkus/alias-free-gan/blob/devel/pretrained_models.json

Use `--resume_from 'model_name'`. The pretrained model is downloaded automatically with wget.

**Training from Scratch**

This is not recommended: transfer learning from any model, even one unrelated to your dataset, will be faster and consume fewer resources. Unless no pretrained model is available or you have an explicit reason not to, use transfer learning. To train from scratch, leave resume blank, like so: `--resume_from ''`.
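
The three cases above (checkpoint path, pretrained model name, or empty string) all go through the same flag, so they can be sketched as one small helper. Only the `--resume_from` and `--start_kimg_count` flags come from this notebook; the helper name `resume_args` is made up for illustration.

```python
import shlex


def resume_args(resume_from="", start_kimg_count=None):
    """Build the resume-related trainer arguments as a list of strings."""
    args = ["--resume_from", resume_from]
    if start_kimg_count is not None:
        # Only needed for checkpoints without the kimg naming scheme.
        args += ["--start_kimg_count", str(start_kimg_count)]
    return args


# shlex.join quotes each argument so an empty string survives the shell.
print(shlex.join(resume_args("rosinality-ffhq-800k")))
print(shlex.join(resume_args()))
```

Quoting matters most for the from-scratch case: an unquoted empty value would vanish from the command line, while `shlex.join` renders it as `''`.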

### Suggested Batch Size

For Colab Pro GPUs (16GB), here are the suggested batch sizes:
- 256: batch size 8 recommended
- 512: batch size 4 recommended
- 1024: batch size 4 recommended

Feel free to experiment to see if you can go higher. For the best performance, try to keep the batch size a power of 2.
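
The suggestions above can be written as a small lookup, plus a check for the power-of-2 advice. This is only a sketch of the table in this section for 16GB GPUs; `suggested_batch` and `is_power_of_two` are illustrative names.

```python
# Suggested batch sizes from the list above (16GB GPU), keyed by image size.
SUGGESTED_BATCH = {256: 8, 512: 4, 1024: 4}


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... and False for everything else."""
    return n > 0 and (n & (n - 1)) == 0


def suggested_batch(model_size):
    """Look up the suggested batch size for a supported model size."""
    if model_size not in SUGGESTED_BATCH:
        raise ValueError(f"unsupported model size: {model_size}")
    return SUGGESTED_BATCH[model_size]


print(suggested_batch(256), is_power_of_two(suggested_batch(256)))
```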

### Troubleshooting

If you get a CUDA out of memory error, try reducing the `batch` value.

If you get another error, please report it at https://github.com/duskvirkus/alias-free-gan/issues/new

If the model makes it through the first epoch, you're unlikely to encounter any errors after that.
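
When you hit an out of memory error, one simple approach is to halve the batch size until training starts. The sketch below only lists the values to try, in order; it does not run the trainer. `batch_fallbacks` is a hypothetical helper name.

```python
def batch_fallbacks(start):
    """Yield start, then repeated halvings (integer division), down to 1."""
    batch = start
    while batch >= 1:
        yield batch
        batch //= 2


print(list(batch_fallbacks(8)))
```

Starting from a power of 2 keeps every fallback value a power of 2 as well.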




In [7]:
model_size = 256
dataset_location = '/content/drive/MyDrive/datasets-aliasfree/painterly-faces-v2-256'
resume = 'rosinality-ffhq-800k'
batch_size = 8

sample_frequency = 1 # in kimgs (thousands of images)
checkpoint_frequency = 4 # in kimgs (thousands of images)
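
To get a feel for what these frequencies mean in training steps, here is a quick bit of arithmetic. It assumes one kimg is 1000 images and that each training step processes `batch_size` images; how the trainer counts kimgs internally is not confirmed here, so treat the numbers as estimates.

```python
import math


def steps_for_kimgs(kimgs, batch_size):
    """Approximate number of training steps needed to process `kimgs` thousand images."""
    return math.ceil(kimgs * 1000 / batch_size)


def steps_per_epoch(num_images, batch_size):
    """Approximate steps per pass over the dataset (last partial batch counted)."""
    return math.ceil(num_images / batch_size)


print(steps_for_kimgs(1, 8))      # sample interval at batch size 8
print(steps_for_kimgs(4, 8))      # checkpoint interval at batch size 8
print(steps_per_epoch(1158, 8))   # 1158-image dataset at batch size 8
```

With the settings in this cell, that works out to a sample roughly every 125 steps and a checkpoint roughly every 500 steps, while one pass over a 1158-image dataset takes about 145 steps.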

In [8]:
!python scripts/trainer.py \
    --gpus 1 \
    --size {model_size} \
    --dataset_path {dataset_location} \
    --resume_from {resume} \
    --batch {batch_size} \
    --save_sample_every_kimgs {sample_frequency} \
    --save_checkpoint_every_kimgs {checkpoint_frequency}

Using Alias-Free GAN version: 1.0.0


Licence and compensation information for rosinality-ffhq-800k pretrained model: test information


Dataset path: /content/drive/MyDrive/datasets-aliasfree/painterly-faces-v2-256
Initialized MultiResolutionDataset dataset with 1158 images
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
2021-08-16 04:37:53.526893: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0

  | Name          | Type          | Params
------------------------------------------------
0 | generator     | Generator     | 17.3 M
1 | g_ema         | Generator     | 17.3 M
2 | discriminator | Discriminator | 28.9 M
------------------------------------------------
63.5 M    Trainable params
0         Non-trainable params
63.5 M    Total params
253.864   Total estimated model params size (MB)
Training: -1it [00:00, ?it