# Retraining StyleGAN2 on Kaggle

Inspired by Jeff Heaton's Google Colab and Docker examples for training StyleGAN2:

https://colab.research.google.com/github/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_07_3_style_gan.ipynb
https://hub.docker.com/r/heatonresearch/stylegan2-ada


## Background

StyleGAN2 requires a Windows or Linux machine with at least one GPU. My computer meets neither requirement, so I ran the notebook on Kaggle's GPU time.

# Load Google Drive Files

In [2]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


# Clone the StyleGAN2 Repository and Install Packages

Run the code block below to clone the repository and switch to TensorFlow 1.x. I found that if you install TensorFlow manually with pip instead, you run into missing-GPU errors.

In [3]:
!git clone https://github.com/NVlabs/stylegan2.git
%tensorflow_version 1.x

fatal: destination path 'stylegan2' already exists and is not an empty directory.
TensorFlow 1.x selected.
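
The `fatal: destination path 'stylegan2' already exists` message above is what `git clone` prints when the cell is re-run after the repository has already been cloned. A minimal sketch of a guard that skips the clone in that case follows; `clone_if_missing` is a hypothetical helper name, not part of the StyleGAN2 repository, and the `run` parameter exists only so the function can be exercised without network access.

```python
import os
import subprocess


def clone_if_missing(repo_url, dest_dir, run=subprocess.run):
    """Clone repo_url into dest_dir unless dest_dir already has content.

    Returns True if a clone was started, False if it was skipped.
    """
    if os.path.isdir(dest_dir) and os.listdir(dest_dir):
        # A non-empty directory is what makes `git clone` fail, so skip it.
        return False
    run(["git", "clone", repo_url, dest_dir], check=True)
    return True


# Example call (paths assumed from the cell above):
# clone_if_missing("https://github.com/NVlabs/stylegan2.git", "/content/stylegan2")
```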


Check that the repository was downloaded. The paths below assume Colab with Google Drive mounted (`/content/...`); change them if you are running elsewhere, for example on Kaggle.

In [4]:
!ls /content/stylegan2/
!ls /content/drive/MyDrive/StyleGAN2_data

dataset_tool.py  LICENSE.txt		 README.md	   run_training.py
dnnlib		 metrics		 run_generator.py  test_nvcc.cu
Dockerfile	 pretrained_networks.py  run_metrics.py    training
docs		 projector.py		 run_projector.py
images	images.zip  pytorch-dataset  tf-datasets  tf-results


Verify that TensorFlow 1.x is active and that a GPU is available.

In [5]:
import tensorflow as tf
print(tf.__version__)
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
    raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))


1.15.2
Found GPU at: /device:GPU:0


# Convert your images to data format

If you have already run this section before, you do not need to run it again.

You need to convert your images to the TFRecord format that StyleGAN2 reads. Your images must be square, and the side length must be a power of 2 (e.g. 256 x 256).
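
The size requirement can be checked before running the conversion. The sketch below is only an illustration: `is_power_of_two` and `is_valid_training_size` are hypothetical helper names, and reading the width and height from the image files (for example with an imaging library) is left out.

```python
def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ...; False for zero, negatives, and everything else."""
    # A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
    return n > 0 and (n & (n - 1)) == 0


def is_valid_training_size(width: int, height: int) -> bool:
    """True if the image is square and its side length is a power of two."""
    return width == height and is_power_of_two(width)


print(is_valid_training_size(256, 256))  # True
print(is_valid_training_size(300, 300))  # False: 300 is not a power of 2
print(is_valid_training_size(256, 512))  # False: not square
```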

In the command below, the first argument is the destination path for the output dataset and the second is the source path to your input JPEG images (here, a directory). The dataset path should follow the format ```<path/to/your/datasets>/<your-dataset-name>```. Later, when you run ```run_training.py```, you will need to pass ```path/to/your/datasets``` to ```--data-dir``` and ```your-dataset-name``` to ```--dataset```.
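
To make the relationship between the dataset path and the two training arguments concrete, here is a small sketch that splits one into the other. `split_dataset_path` is a hypothetical helper, and the example path is the one used in this notebook. Normalizing first matters because a trailing slash would otherwise produce an empty dataset name.

```python
import posixpath


def split_dataset_path(dataset_path):
    """Split '<datasets dir>/<dataset name>' into (data_dir, dataset_name)."""
    # normpath drops a trailing slash so the last component is the dataset name.
    normalized = posixpath.normpath(dataset_path)
    data_dir, dataset_name = posixpath.split(normalized)
    return data_dir, dataset_name


data_dir, dataset = split_dataset_path(
    "/content/drive/MyDrive/StyleGAN2_data/tf-datasets/custom/"
)
print(f"--data-dir={data_dir}")  # --data-dir=/content/drive/MyDrive/StyleGAN2_data/tf-datasets
print(f"--dataset={dataset}")    # --dataset=custom
```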

If you are not in Colab, you should change the argument paths.

In [None]:
#!python /content/stylegan2/dataset_tool.py create_from_images  /content/drive/MyDrive/StyleGAN2_data/tf-datasets/custom/ /content/drive/MyDrive/StyleGAN2_data/images/

  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Loading images from "/content/drive/My Drive/StyleGAN2_data/images/"
Creating dataset "/content/drive/My Drive/StyleGAN2_data/tf-datasets/custom/"
  'data': tf.train.Feature(bytes_list=tf.train.BytesList(value=[quant.tostring()]))}))
Added 20701 images.


Confirm that the `.tfrecords` files were created.

In [6]:
!ls /content/drive/MyDrive/StyleGAN2_data/tf-datasets/custom

-r02.tfrecords	-r04.tfrecords	-r06.tfrecords	-r08.tfrecords
-r03.tfrecords	-r05.tfrecords	-r07.tfrecords
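
Each file holds the dataset at one resolution, and the number after `-r` is the base-2 logarithm of the image side length, so `-r08` is 256 x 256. As a quick sanity check, the sketch below maps the file names listed above to resolutions; `tfrecord_resolutions` is a hypothetical helper name.

```python
import re


def tfrecord_resolutions(filenames):
    """Return the sorted image side lengths encoded in '*-rNN.tfrecords' names."""
    pattern = re.compile(r"-r(\d+)\.tfrecords$")
    resolutions = []
    for name in filenames:
        match = pattern.search(name)
        if match:
            # rNN stores images of size 2**NN x 2**NN.
            resolutions.append(2 ** int(match.group(1)))
    return sorted(resolutions)


listing = [
    "-r02.tfrecords", "-r03.tfrecords", "-r04.tfrecords", "-r05.tfrecords",
    "-r06.tfrecords", "-r07.tfrecords", "-r08.tfrecords",
]
print(tfrecord_resolutions(listing))  # [4, 8, 16, 32, 64, 128, 256]
```

The largest value should match the side length of your source images.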


# Train StyleGAN2 on your images!

To resume training from a network snapshot, upload that snapshot to Kaggle as a dataset and attach the dataset to this notebook. You must also pass the `--resume` flag with the path to the snapshot.
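
Before resuming, you need the path of the snapshot to continue from. The sketch below looks for the most recent one in a results directory. It assumes that each run writes into its own numbered subdirectory (such as `00008-stylegan2-custom-1gpu-config-f`) and saves snapshots named `network-snapshot-<kimg>.pkl` with a zero-padded counter. `latest_snapshot` is a hypothetical helper, not part of the repository.

```python
import glob
import os


def latest_snapshot(result_dir):
    """Return the newest network snapshot under result_dir, or None if absent."""
    pattern = os.path.join(result_dir, "*", "network-snapshot-*.pkl")
    snapshots = glob.glob(pattern)
    if not snapshots:
        return None
    # Run directories and snapshot counters are zero-padded, so comparing
    # (run directory name, file name) as strings orders them chronologically.
    return max(
        snapshots,
        key=lambda p: (os.path.basename(os.path.dirname(p)), os.path.basename(p)),
    )


# Example call (path assumed from this notebook):
# latest_snapshot("/content/drive/MyDrive/StyleGAN2_data/tf-results")
```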

The paths below assume Colab with Google Drive mounted (`/content/...`); change them if you are running elsewhere, for example on Kaggle.
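
If you change paths often, it can help to assemble the training command from variables rather than editing one long line. The sketch below rebuilds the command used in the next cell; `build_training_command` and `passes_over_dataset` are hypothetical helpers, and the flag values are copied from this notebook. The second function shows what `--total-kimg` means in practice: it counts thousands of training images shown, so 1000 kimg over the 20701 images converted earlier is roughly 48 passes through the dataset.

```python
import shlex


def build_training_command(data_dir, dataset, result_dir, total_kimg=1000):
    """Assemble the run_training.py command line used in this notebook."""
    args = [
        "python", "/content/stylegan2/run_training.py",
        "--num-gpus=1",
        f"--dataset={dataset}",
        f"--data-dir={data_dir}",
        "--config=config-f",
        "--mirror-augment=true",
        f"--total-kimg={total_kimg}",
        f"--result-dir={result_dir}",
        "--metrics=none",
    ]
    # Quote each argument so paths containing spaces survive the shell.
    return " ".join(shlex.quote(a) for a in args)


def passes_over_dataset(total_kimg, num_images):
    """Approximate number of times each training image is shown."""
    return total_kimg * 1000 / num_images


command = build_training_command(
    data_dir="/content/drive/MyDrive/StyleGAN2_data/tf-datasets",
    dataset="custom",
    result_dir="/content/drive/MyDrive/StyleGAN2_data/tf-results",
)
print(command)
print(round(passes_over_dataset(1000, 20701), 1))
```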

In [8]:
!python /content/stylegan2/run_training.py --num-gpus=1 --dataset=custom --data-dir=/content/drive/MyDrive/StyleGAN2_data/tf-datasets --config=config-f --mirror-augment=true --total-kimg=1000 --result-dir=/content/drive/MyDrive/StyleGAN2_data/tf-results --metrics=none

Local submit - run_dir: /content/drive/MyDrive/StyleGAN2_data/tf-results/00008-stylegan2-custom-1gpu-config-f
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
tcmalloc: large alloc 4294967296 bytes == 0x564ae953a000 @  0x7f361b5fa001 0x7f36180de54f 0x7f361812eb58 0x7f3618132b17 0x7f36181d1203 0x564ae324a544 0x564ae324a240 0x564ae32be627 0x564ae32b8ced 0x564ae324c48c 0x564ae328d159 0x564ae328a0a4 0x564ae324c698 0x564ae32bafe4 0x564ae32b89ee 0x564ae318ae2b 0x564ae32bafe4 0x564ae32b89ee 0x564ae318ae2b 0x564ae32bafe4 0x564ae324bafa 0x564ae32b9915 0x564ae324bafa 0x564ae32b9c0d 0x564ae32b89ee 0x564ae318ae2b 0x564ae32bafe4 0x564ae32b89ee 0x564ae318ae2b 0x564ae32bafe4 0x564ae324bafa
tcmalloc: large alloc 4294967296 bytes == 0x564be953a000 @  0x7f361b5f81e7 0x7f36180de46e 0x7f361812ec7b 0x7f361812f35f 0x7f36181d1103 0x564ae324a544 0x564ae324a240 0x564ae32be627 0x564ae32b89ee 0x564ae324bbda 0x564ae32ba737 0x564ae32b89