<a href="https://colab.research.google.com/github/duskvirkus/alias-free-gan/blob/notebook/notebooks/GPU_Training_Alias_Free_GAN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# GPU Training - Alias-Free GAN
by duskvirkus

This is a notebook for training Alias-Free GAN on a Colab GPU instance.

Repository: https://github.com/duskvirkus/alias-free-gan

## GPU check

If this fails, change the runtime type in `Runtime > Change runtime type > Select GPU`.

In [1]:
!nvidia-smi -L

GPU 0: Tesla P100-PCIE-16GB (UUID: GPU-708a59ae-a05f-4885-1218-48d39f6d8f69)
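
If you'd rather do this check from Python (for example, to stop early when no GPU is attached), here is a minimal sketch. The `list_gpus` helper is hypothetical and not part of the repository; it only parses the `nvidia-smi -L` line format shown in the output above.

```python
import re
import shutil
import subprocess


def list_gpus(smi_output=None):
    """Return the GPU names found in `nvidia-smi -L` output.

    Hypothetical helper: if no output text is passed in, it runs
    `nvidia-smi -L` itself and returns [] when the tool is missing.
    """
    if smi_output is None:
        if shutil.which("nvidia-smi") is None:
            return []  # no NVIDIA driver tools on this runtime
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True
        )
        smi_output = result.stdout
    # Lines look like: "GPU 0: <name> (UUID: GPU-...)"
    return re.findall(
        r"^GPU \d+: (.+?) \(UUID: [^)]+\)", smi_output, flags=re.MULTILINE
    )
```

Calling `list_gpus()` in a cell returns an empty list when the runtime has no GPU, which you can use to decide whether to change the runtime type.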


## Connect Google Drive

This notebook is designed to be used with Google Drive connected. If you'd like to use it without Google Drive, you'll have to make changes.

The main reason for this is that Colab sessions automatically shut off after a number of hours (~10 for free and ~20 for Pro). This risks losing training progress if it's not saved to persistent storage.

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## Clone / cd into Repository

In [6]:
import os
drive_path = '/content/drive/MyDrive/'
repo_container_dir = 'colab-gpu-alias-free'
repo_name = 'alias-free-gan'
git_repo = 'https://github.com/duskvirkus/alias-free-gan.git'

working_dir = os.path.join(drive_path, repo_container_dir, repo_name)

if os.path.isdir(working_dir):
  %cd {working_dir}
else:
  container_path = os.path.join(drive_path, repo_container_dir)
  os.makedirs(container_path)
  %cd {container_path}
  !git clone {git_repo}
  %cd {repo_name}
  !mkdir pretrained

/content/drive/MyDrive/colab-gpu-alias-free
Cloning into 'alias-free-gan'...
remote: Enumerating objects: 937, done.
remote: Counting objects: 100% (638/638), done.
remote: Compressing objects: 100% (358/358), done.
remote: Total 937 (delta 314), reused 468 (delta 178), pack-reused 299
Receiving objects: 100% (937/937), 73.35 MiB | 13.31 MiB/s, done.
Resolving deltas: 100% (454/454), done.
/content/drive/MyDrive/colab-gpu-alias-free/alias-free-gan


## Install Dependencies

In [None]:
!python install.py

## Convert Dataset

You can skip this section if you already have a dataset in the correct format.

Currently, only one of the following image sizes is supported per dataset: 256 by 256 **or** 512 by 512 **or** 1024 by 1024.

Tools for preparing a dataset for conversion are beyond the scope of this notebook. [dvschultz/dataset-tools](https://github.com/dvschultz/dataset-tools) is suggested to help with this process.

Structure of your dataset:
```
dataset_root_dir # name of your dataset is suggested
  |- sub_directory # anything (this has to do with labels which is an unsupported feature at current time)
    |- image01.png
    |- images_can_have_any_names.png
    |- they_also_be.jpg
    |...continued # Suggested minimum size is 1000+ images.
```

The above example would result in an input of `unconverted_dataset='path/to/dataset_root_dir'`
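
Before running the conversion, it can help to confirm that the folder matches the layout above. The following is a minimal sketch, not part of the repository: `count_dataset_images` and the extension list are assumptions made for illustration (the layout above only shows `.png` and `.jpg` files).

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your own files.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def count_dataset_images(dataset_root):
    """Count image files in each sub-directory of dataset_root.

    Hypothetical helper: returns a dict mapping sub-directory name to
    the number of image files directly inside it.
    """
    root = Path(dataset_root)
    counts = {}
    for sub_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[sub_dir.name] = sum(
            1
            for f in sub_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

For example, `count_dataset_images('path/to/dataset_root_dir')` lets you compare the total against the suggested minimum of 1000+ images. An empty result means the images are probably sitting directly in the root rather than in a sub-directory.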

In [8]:
unconverted_dataset = '/content/drive/MyDrive/dataset-creation/painterly-faces-v2'
out_path = '/content/drive/MyDrive/datasets-aliasfree/painterly-faces-v2-256'
dataset_size = 256 # one of the following 256, 512, 1024
!python scripts/convert_dataset.py --size {dataset_size} {unconverted_dataset} {out_path}

Make dataset of image sizes: 256
  "Argument interpolation should be of type InterpolationMode instead of int. "
  "Argument interpolation should be of type InterpolationMode instead of int. "
1158it [06:45,  2.85it/s]


## Info on training options

Most training options work rather well out of the box. See the training section for suggested arguments.

You can see a full list of training options by running the following cell.

In [None]:
!python scripts/trainer.py --help

## Training

Results from training can be found in the `results` directory.

**Resume from Checkpoint**

Set `--resume_from 'path/to/checkpoint.pt'`

**Transfer Learning Options**

See the repository for transfer learning options: https://github.com/duskvirkus/alias-free-gan/blob/devel/pretrained_models.json

**Training from Scratch**

This is not recommended: transfer learning from any model, even one unrelated to your dataset, will be faster and consume fewer resources. Train from scratch only if no pretrained model is available or you have an explicit reason not to use transfer learning. To train from scratch, simply leave `resume` blank.

### Suggested Batch Size

For Colab Pro GPUs (16GB), here are the suggested batch sizes:
- 256: batch size of 8 and sample of 9 (to get a nice grid)
- 512: batch size and sample of 4
- 1024: 2? I haven't done much training at this size yet; I will update this.

Feel free to experiment to see if you can push these higher. I haven't worked much on optimizing performance, so your guess may be as good as mine.
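
The suggestions above can be kept in a small lookup so the trainer flags follow the model size. This is a minimal sketch under stated assumptions: `SUGGESTED_SETTINGS` and `suggested_flags` are hypothetical names, and the 1024 entry is tentative (its `n_samples` value is a guess, since only the batch size is mentioned above).

```python
# Suggested settings for a 16GB GPU, taken from the list above.
SUGGESTED_SETTINGS = {
    256: {"batch": 8, "n_samples": 9},
    512: {"batch": 4, "n_samples": 4},
    # Tentative: batch 2 is marked "2?" above; n_samples=2 is an assumption.
    1024: {"batch": 2, "n_samples": 2},
}


def suggested_flags(size):
    """Return `--batch` / `--n_samples` flags for a supported model size."""
    if size not in SUGGESTED_SETTINGS:
        raise ValueError(
            f"Unsupported size {size}; expected one of {sorted(SUGGESTED_SETTINGS)}"
        )
    settings = SUGGESTED_SETTINGS[size]
    return f"--batch {settings['batch']} --n_samples {settings['n_samples']}"
```

For instance, `suggested_flags(256)` produces the same `--batch 8 --n_samples 9` pair used in the training cell below. If you hit an out of memory error, lower the values in the table rather than editing the command by hand.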

### Troubleshooting

If you get a CUDA out of memory error, try reducing `--batch` and `--n_samples`.

If you get another error, please report it at https://github.com/duskvirkus/alias-free-gan/issues/new

If the model makes it through the first epoch, you're unlikely to encounter any errors after that.




In [9]:
model_size = 256
dataset_location = '/content/drive/MyDrive/datasets-aliasfree/painterly-faces-v2-256'
resume = 'rosinality-ffhq-800k'

In [12]:
!python scripts/trainer.py \
    --size {model_size} \
    --gpus 1 \
    --dataset_path {dataset_location} \
    --resume_from {resume} \
    --logger True \
    --batch 8 \
    --n_samples 9

Using Alias-Free GAN version: 1.0.0
Downloading rosinality-ffhq-800k from https://drive.google.com/uc?id=15B_Pz-38eIiUBCiVbgPOcv-TIoT7nc5e
Downloading...
From: https://drive.google.com/uc?id=15B_Pz-38eIiUBCiVbgPOcv-TIoT7nc5e
To: /content/drive/MyDrive/colab-gpu-alias-free/alias-free-gan/pretrained/rosinality-ffhq-800k.pt
623MB [00:09, 68.0MB/s]


Licence and compensation information for rosinality-ffhq-800k pretrained model: test information


Dataset path: /content/drive/MyDrive/datasets-aliasfree/painterly-faces-v2-256
Initialized MultiResolutionDataset dataset with 1158 images
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
2021-07-29 07:14:52.115524: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0

  | Name          | Type          | Params
------------------------------------------------
0 | generator     | Gen