# StyleGAN2 operations

### Preparation

Load necessary modules, connect to Google Drive, clone StyleGAN2 repo:  
*(run this cell again if the session is restarted!)*

In [None]:
%tensorflow_version 1.x
!apt-get -qq install ffmpeg
import os

from google.colab import drive
drive.mount('/G', force_remount=True)
gdir = !ls /G/
gdir = '/G/%s/' % str(gdir[0])
%cd $gdir

work_name = 'sg2_eps' # change this as you want
work_dir = gdir + work_name + '/'
if not os.path.isdir(work_dir):
  !git clone -b colab https://github.com/eps696/stylegan2 $work_name
%cd $work_dir
!pip install -r requirements.txt

> All directories mentioned below are inside the StyleGAN2 copy on your connected Google Drive (mounted at `/G`).

## Training

First, let's prepare the data. 

If you work with patterns or shapes (rather than compositions), you can crop square fragments from bigger images, effectively multiplying the number of training samples. Upload your raw images to the `data/src` folder, rename `mydata` below as needed, and run the cell. This cuts the images into `size`x`size` px fragments, overlapping with a shift of `step` px along X and Y:

In [None]:
src_dir = 'data/src'
data_dir = 'data/mydata'
size = 512
step = 256

%cd $work_dir
%run src/util/multicrop.py --in_dir $src_dir --out_dir $data_dir --size $size --step $step
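
To get a feel for how `size` and `step` multiply the data, here is a minimal sketch of overlapping crop positions. The function name `crop_boxes` is hypothetical, and the real `multicrop.py` may treat image borders differently (e.g. by padding or shifting the last fragment); this version simply drops fragments that would not fit.

```python
def crop_boxes(width, height, size=512, step=256):
    """Return (left, top, right, bottom) boxes of size x size px, shifted by step px."""
    boxes = []
    # Assumption: only fragments that fit entirely inside the image are kept.
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            boxes.append((left, top, left + size, top + size))
    return boxes

# A single 1024x768 image yields 3 x 2 = 6 overlapping 512px fragments.
boxes = crop_boxes(1024, 768, size=512, step=256)
print(len(boxes), boxes[0], boxes[-1])
```

A smaller `step` gives more (and more similar) fragments; `step` equal to `size` gives no overlap at all.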

> If you want to edit the input images yourself (e.g. to keep the compositions, or to work with non-square aspect ratios), skip the cell above and upload your prepared data to the `data/mydata` directory (rename as needed):  
*(run this cell again if the session is restarted!)*

In [None]:
%cd $work_dir
data_dir = 'data/mydata'

Now let's make a compact TFRecords dataset from the directory with JPG images, `data/mydata`.  
This will create a `mydata-512x512.tfr` file in the `data` directory.  
> *For images with an alpha channel, remove the `--jpg` option.*  
*For a conditional model, split the data into subfolders (`mydata/1`, `mydata/2`, ..) and add the `--labels` option.*


In [None]:
%run src/training/dataset_tool.py --dataset $data_dir --jpg

Finally, we can train StyleGAN2 on the prepared dataset:  
*(remove `--jpg_data` if your images have an alpha channel)*

In [None]:
%run src/train.py --dataset $data_dir --jpg_data

> This runs the training process according to the options in `src/train.py`. If there is no TFRecords file from the previous step, it will be created at this point. The training results (models and samples) are saved under the `train` directory, similar to the original Nvidia approach. Two types of models are saved: compact (containing only the Gs network for inference) as `<dataset>-...pkl` (e.g. `mydata-512-0360.pkl`), and full (containing the G/D/Gs networks for further training) as `snapshot-...pkl`. 

> By default, the most powerful SG2 config (F) is used; if you run out of memory (OOM), you may resort to `--config E`, which requires less memory (with poorer results, of course). For small datasets (hundreds of images rather than tens of thousands), add the `--d_aug` option to use [Differentiable Augmentation](https://github.com/mit-han-lab/data-efficient-gans) for more effective training. 

> The length of the training is defined by the `--lod_step_kimg X` argument. It is a legacy of [progressive GAN](https://github.com/tkarras/progressive_growing_of_gans) and defines the length of one step of progressive training, in thousands of images. A network with a base resolution of 1024px is trained for 20 such steps, one with 512px for 18 steps, et cetera. A reasonable `lod_step_kimg` value for big datasets is 300-600, while in `--d_aug` mode 20-40 is sufficient.
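
As a rough illustration of what these numbers add up to, the sketch below estimates the total training length. The helper `total_kimg` is hypothetical, and the step count `2 * log2(resolution)` is only an assumption extrapolated from the two figures above (1024px gives 20 steps, 512px gives 18); the actual schedule is defined in `src/train.py`.

```python
import math

def total_kimg(resolution, lod_step_kimg):
    # Assumption: number of steps = 2 * log2(resolution),
    # which matches 1024px -> 20 steps and 512px -> 18 steps.
    steps = 2 * int(math.log2(resolution))
    return steps, steps * lod_step_kimg

for res, kimg in [(1024, 300), (512, 300), (512, 20)]:
    steps, total = total_kimg(res, kimg)
    print(f"{res}px, lod_step_kimg={kimg}: {steps} steps, {total} kimg in total")
```

Under this assumption, a 512px model with `--d_aug --lod_step_kimg 20` would see about 360 thousand images, versus several million for a full run on a big dataset.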

If the training process was interrupted, we can resume it from the last saved model as follows:  
*(replace `000-mydata-512-f` with existing training directory)*

In [None]:
%run src/train.py --dataset $data_dir --jpg_data --resume train/000-mydata-512-f

NB: In most cases it's much easier to use the "transfer learning" trick than to perform full training from scratch. For that, we take an existing well-trained model as a starting point and "finetune" (uptrain) it on our data. This works pretty well, even if our dataset is very different from the one the original model was trained on. 

So here is a faster way to train our GAN (presuming we already have a full trained model `train/ffhq-512.pkl`):

In [None]:
%run src/train.py --dataset $data_dir --jpg_data --resume train/ffhq-512.pkl --d_aug --lod_step_kimg 20 --finetune

There's no need to complete an exact number of steps in this case; you may stop as soon as you're happy with the results. A lower `lod_step_kimg` makes it easier to follow the progress.

## Generation

Let's produce some imagery from the original cat model (download it from [here](https://nvlabs-fi-cdn.nvidia.com/stylegan2/networks/stylegan2-cat-config-f.pkl) and put it in the `models` directory).  
*(run this cell again if the session is restarted!)*

In [None]:
from IPython.display import HTML
from base64 import b64encode
%cd $work_dir

def makevid(seq_dir):
  out_sequence = seq_dir + '/%06d.jpg'
  out_video = seq_dir + '.mp4'
  !ffmpeg -y -v warning -i $out_sequence $out_video
  data_url = "data:video/mp4;base64," + b64encode(open(out_video,'rb').read()).decode()
  return """<video controls><source src="%s" type="video/mp4"></video>""" % data_url

model = 'models/stylegan2-cat-config-f' # without ".pkl" extension
model_pkl = model + '.pkl' # with ".pkl" extension
output = '_out/cats' # output directory
frames = '50-10'

Generate some animation to test the model:

In [None]:
%run src/_genSGAN2.py --model $model_pkl --out_dir $output --frames $frames
HTML(makevid(output))

> Here we loaded the model 'as is' and produced 50 frames at its native resolution, interpolating between random latent space keypoints, with 10 frames between consecutive keypoints.
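
The `50-10` setting can be pictured with a small sketch of linear interpolation between random keypoints. The names `lerp` and `latent_walk`, the 512-dimensional Gaussian latents, and the use of one extra keypoint to close the last segment are all assumptions for illustration; `_genSGAN2.py` may pick and chain keypoints differently.

```python
import random

def lerp(a, b, t):
    """Linear interpolation between two vectors of equal length."""
    return [x + (y - x) * t for x, y in zip(a, b)]

def latent_walk(total_frames=50, step=10, dim=512, seed=0):
    rng = random.Random(seed)
    segments = total_frames // step
    # Assumption: one random keypoint per segment, plus one to end the last segment.
    keys = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(segments + 1)]
    frames = []
    for i in range(segments):
        for f in range(step):
            frames.append(lerp(keys[i], keys[i + 1], f / step))
    return keys, frames

keys, frames = latent_walk(total_frames=50, step=10)
print(len(keys), len(frames))  # 6 keypoints, 50 frames
```

Every tenth frame coincides with a keypoint; the frames in between are blends of the two neighbouring keypoints.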

Now let's generate a more customized animation. For that we omit the model extension, so that the script loads a custom network, which enables special features, e.g. arbitrary resolution (set by the `--size` argument in `X-Y` format).  
The `--cubic` option changes linear interpolation to cubic for smoother animation (there is also a `--gauss` option for additional smoothing).

In [None]:
%run src/_genSGAN2.py --model $model --out_dir $output --frames $frames --size 400-300 --cubic
HTML(makevid(output))

> Adding the `--save_lat` option saves all traversed dlatent points as a Numpy array in a `*.npy` file (useful for further curating).

Generate more varied imagery:

In [None]:
%run src/_genSGAN2.py --model $model --out_dir $output --frames $frames --size 768-256 -n 3-1
HTML(makevid(output))

> Here we get an animated composition of 3 independent frames, blended together horizontally.  
The `--splitfine X` argument controls boundary fineness (0 = smoothest/default, higher => thinner).  

Instead of frame splitting, we can load an external mask from a b/w image file (or a folder with an image sequence):

In [None]:
%run src/_genSGAN2.py --model $model --out_dir $output --frames $frames --size 400-300 --latmask _in/mask.jpg
HTML(makevid(output))

`--digress X` adds some funky displacements of strength X (by tweaking the initial constant layer).  
`--trunc X` controls the truncation psi parameter (0 = boring, 1+ = weird). 

In [None]:
%run src/_genSGAN2.py --model $model --out_dir $output --frames $frames --digress 2 --trunc 0.5
HTML(makevid(output))
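
To see why 0 is "boring" and values above 1 get "weird", here is a minimal sketch of the standard StyleGAN truncation trick: each dlatent is pulled towards (or pushed away from) the average dlatent by a factor psi. The function name `truncate` and the toy 3-element vectors are illustrative only; how the script applies truncation across layers is not shown here.

```python
def truncate(w, w_avg, psi):
    """Scale the offset of dlatent w from the average dlatent w_avg by psi."""
    return [a + psi * (x - a) for x, a in zip(w, w_avg)]

w_avg = [1.0, 0.0, 0.5]   # toy "average" dlatent
w = [3.0, -2.0, 0.5]      # toy sampled dlatent

for psi in (0.0, 0.5, 1.0, 1.5):
    print(psi, truncate(w, w_avg, psi))
```

With psi = 0 every sample collapses to the average image, psi = 1 leaves the sample untouched, and psi > 1 exaggerates its deviation from the average.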

### Latent space exploration

For these experiments, download the [FFHQ model](https://nvlabs-fi-cdn.nvidia.com/stylegan2/networks/stylegan2-ffhq-config-f.pkl) and save it to the `models` directory.

In [None]:
from IPython.display import HTML, Image
from base64 import b64encode
%cd $work_dir

def makevid(seq_dir, size=512):
  out_sequence = seq_dir + '/%06d.jpg'
  out_video = seq_dir + '.mp4'
  !ffmpeg -y -v warning -i $out_sequence $out_video
  data_url = "data:video/mp4;base64," + b64encode(open(out_video,'rb').read()).decode()
  return """<video width=%d height=%d controls><source src="%s" type="video/mp4"></video>""" % (size, size, data_url)

model = 'models/stylegan2-ffhq-config-f' # without ".pkl" extension
model_pkl = model + '.pkl' # with ".pkl" extension

Project external images (aligned face portraits) from `_in/photo` onto the dlatent space of the FFHQ model. 
The results (the found dlatent points as Numpy arrays in `*.npy` files, plus video/still previews) are saved to the `_out/proj` directory.  
NB: first download the [VGG model](https://drive.google.com/uc?id=1N2-m9qszOeVC9Tq77WxsLnuWwOedQiD2) and save it as `models/vgg/vgg16_zhang_perceptual.pkl`

In [None]:
%run src/project_latent.py --model $model_pkl --in_dir _in/photo --out_dir _out/proj 

Image('_out/proj/blonde458/blonde458-1000.jpg', width=512, height=512)

Generate animation between saved dlatent points:

In [None]:
dlat = 'mynpy'
path_in = '_in/' + dlat
path_out = '_out/ffhq-' + dlat

%run src/_play_dlatents.py --model $model --dlatents $path_in --out_dir $path_out --fstep 10
HTML(makevid(path_out))

> This loads saved dlatent points from `_in/mynpy` and produces a smooth looped animation between them (with an interpolation step of 10 frames, as set by `--fstep`). `mynpy` may be a file or a directory with `*.npy` or `*.npz` files. To select only a few frames from a sequence `somename.npy`, create a text file with comma-delimited frame numbers and save it as `somename.txt` in the same directory (check the given examples for the FFHQ model).
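
The frame-selection file can be pictured with the sketch below. The helpers `read_frame_numbers` and `select_frames` are hypothetical, and both the 0-based indexing and the tolerance for spaces around commas are assumptions; check the bundled FFHQ examples for the exact convention used by `_play_dlatents.py`.

```python
import os
import tempfile

def read_frame_numbers(txt_path):
    """Parse a text file with comma-delimited frame numbers."""
    with open(txt_path) as f:
        return [int(tok) for tok in f.read().split(',') if tok.strip()]

def select_frames(sequence, txt_path):
    # Without a companion .txt file, keep the whole sequence.
    if not os.path.isfile(txt_path):
        return list(sequence)
    # Assumption: frame numbers are 0-based indices into the sequence.
    return [sequence[i] for i in read_frame_numbers(txt_path)]

with tempfile.TemporaryDirectory() as tmp:
    txt = os.path.join(tmp, 'somename.txt')
    with open(txt, 'w') as f:
        f.write('0, 3, 7\n')
    picked = select_frames([f'frame{i}' for i in range(10)], txt)

print(picked)
```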

The style-blending argument `--style_npy_file xxx.npy` additionally loads a dlatent from `xxx.npy` and applies it to the higher network layers. `--cubic` smoothing and `--digress X` displacements are also applicable here:

In [None]:
%run src/_play_dlatents.py --model $model --dlatents $path_in --out_dir $path_out --fstep 10 --style_npy_file _in/blonde458.npy --digress 2 --cubic
HTML(makevid(path_out))

Generate an animation by moving the saved dlatent point `_in/blonde458.npy` along feature direction vectors from `_in/vectors_ffhq` (aging/smiling/etc.), one by one:

In [None]:
%run src/_play_vectors.py --model $model_pkl --npy_file _in/blonde458.npy --vector_dir _in/vectors_ffhq --out_dir _out/ffhq_looks
HTML(makevid('_out/ffhq_looks'))
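
Moving along a feature direction boils down to adding a scaled direction vector to the dlatent. The sketch below uses hypothetical helpers `move_along` and `sweep` with toy 2-element vectors; the strength range and the order in which `_play_vectors.py` traverses it are assumptions.

```python
def move_along(w, direction, strength):
    """Shift dlatent w along a direction vector by the given strength."""
    return [x + strength * d for x, d in zip(w, direction)]

def sweep(w, direction, max_strength=2.0, steps=5):
    # Assumption: strengths are spread evenly from -max_strength to +max_strength.
    strengths = [-max_strength + 2 * max_strength * i / (steps - 1) for i in range(steps)]
    return [move_along(w, direction, s) for s in strengths]

w = [0.0, 1.0]           # toy dlatent
smile = [1.0, 0.0]       # toy "smiling" direction
print(sweep(w, smile))   # only the first component changes
```

Each element of the returned list corresponds to one frame; feeding them to the generator gives an animation of the feature growing and fading.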

## Tweaking models

NB: There are no real examples here! The commands are for reference; try them with your own files.

Strip the G/D networks from a full model, leaving only Gs for inference. The resulting file is saved with the `-Gs` suffix. It's recommended to add the `-r` option to reconstruct the network, saving the necessary arguments with it. This is useful for models downloaded from third parties.

In [None]:
%run src/model_convert.py --source snapshot-1024.pkl

Add or remove layers (from a trained model) to adjust its resolution for further finetuning. The command below produces a new model with 512px resolution, populating the weights of the layers up to 256px from the source snapshot (the rest are initialized randomly). It can also decrease the resolution (say, make 512 from 1024). Note that this effectively changes the number of layers in the model.  
This option works with complete (G/D/Gs) models only, since it is intended for transfer learning (the resulting model will contain either partially random weights or wrong `ToRGB` params). 

In [None]:
%run src/model_convert.py --source snapshot-256.pkl --res 512

Crop the resolution of a trained model. The command below produces a working non-square 1024x768 model. Unlike the method above, this one doesn't change the layer count. This is an experimental feature (as stated by its author @Aydao) that also relies on some arbitrary logic, so it works only with basic resolutions.

In [None]:
%run src/model_convert.py --source snapshot-1024.pkl --res 1024-768

Combine the lower layers from one model with the higher layers from another. `<res>` is the resolution at which the models are switched (usually 32/64/128); `<level>` is 0 or 1.

In [None]:
%run src/models_blend.py --pkl1 model1.pkl --pkl2 model2.pkl --res <res> --level <level>
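
Conceptually, layer blending picks each layer from one of the two models depending on its resolution. The sketch below is only an illustration on toy dictionaries keyed by layer resolution; the helper `blend_layers` is hypothetical, the choice to take the layer at exactly `switch_res` from the second model is an assumption, and the meaning of `--level` is not modelled.

```python
def blend_layers(weights1, weights2, switch_res):
    """Take layers below switch_res from model 1 and the rest from model 2."""
    blended = {}
    for res in sorted(weights1):
        # Assumption: the layer at exactly switch_res comes from model 2.
        source = weights1 if res < switch_res else weights2
        blended[res] = source[res]
    return blended

# Toy "models": layer resolution -> placeholder for that layer's weights.
model1 = {4: 'A4', 8: 'A8', 16: 'A16', 32: 'A32', 64: 'A64'}
model2 = {4: 'B4', 8: 'B8', 16: 'B16', 32: 'B32', 64: 'B64'}
print(blend_layers(model1, model2, switch_res=32))
```

Low-resolution layers mostly govern overall composition and pose, while high-resolution layers govern texture and fine detail, which is why the switch point matters so much for the look of the result.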

Mix several models by stochastic weight averaging (SWA), i.e. averaging all their weights. This works properly only for models from one "family", i.e. uptrained (finetuned) from the same original model. 

In [None]:
%run src/models_swa.py --in_dir <models_dir>
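
The idea behind the averaging can be sketched on toy data as follows. The helper `average_weights` is hypothetical and uses a plain uniform mean over models with identical weight names and shapes; the actual `models_swa.py` may weight or order the models differently.

```python
def average_weights(models):
    """Average same-named weight vectors across a list of models."""
    names = models[0].keys()
    return {
        name: [sum(vals) / len(models) for vals in zip(*(m[name] for m in models))]
        for name in names
    }

# Toy "models": weight name -> flat list of values.
models = [
    {'conv': [1.0, 2.0]},
    {'conv': [3.0, 4.0]},
    {'conv': [5.0, 6.0]},
]
print(average_weights(models))
```

Averaging only makes sense when the same position in each model encodes roughly the same thing, which is why the models need to come from the same original checkpoint.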