# StyleGAN2

![styleGAN2 generated image sample](https://github.com/sony/nnabla-examples/raw/master/GANs/stylegan2/images/sample.png)


This demo consists of two parts:
1. A warm-up that generates faces and performs style mixing (similar to Friday's presentation).
2. An exploration of latent-space interpolation. You can also export the resulting videos.


In [None]:
import os
import requests
from tqdm import tqdm_notebook as tqdm

# Helper function to download (windows compatibility)
def download(url, filename):
    response = requests.get(url, stream=True)
    total_size_in_bytes= int(response.headers.get('content-length', 0))
    block_size = 1024 #1 Kibibyte
    progress_bar = tqdm(total=total_size_in_bytes, unit='iB', unit_scale=True)
    
    with open(filename, 'wb') as file:
        for data in response.iter_content(block_size):
            progress_bar.update(len(data))
            file.write(data)
    
    progress_bar.close()

    if total_size_in_bytes != 0 and progress_bar.n != total_size_in_bytes:
        print("ERROR, something went wrong. Please manually delete any residual download files")

# helper function to change directory (windows compat, !cd doesn't work on windows)
def chdir(dir):
  os.chdir(dir)
  print("Changed working directory to: ", dir)

# Warmup

## Preparation
If you're running on Colab, make sure the runtime type is set to GPU (top menu: Runtime → Change runtime type), and click **Connect** in the top right-hand corner of the screen before you start.

This cell installs the NNabla CUDA extension and clones the repository needed for simple image generation.

*Note: botocore package warnings/errors are expected and can be ignored.*

In [None]:
!pip install nnabla-ext-cuda100
!git clone https://github.com/sony/nnabla-examples.git

chdir('nnabla-examples/GANs/stylegan2')

# Get the pretrained weights
Next, we download the pretrained weights for StyleGAN2, then import the required modules and set up the GPU context for the cells that follow.

In [None]:
weights = 'styleGAN2_G_params.h5'
url = 'https://nnabla.org/pretrained-models/nnabla-examples/GANs/stylegan2/styleGAN2_G_params.h5'
download(url, weights)

from generate import *
from IPython.display import Image, display

# init gpu
ctx = get_extension_context("cudnn")
nn.set_default_context(ctx)

batch_size = 1

nn.load_parameters(weights)

print("Done initializing")

# StyleGAN2 input config

The noise input **z** is fed to the **mapping network** to produce the latent code **w**. Then **w** is modified via the **truncation trick** (*not covered in the seminar*), and the modified latent code **w'** is injected into the **synthesis network**.

The **synthesis network** receives the latent code **w'** at multiple layers, transforms the incoming tensor, and gradually converts it into an image.
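
The truncation trick mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the same linear form that the `convertZtoW` helper later in this notebook uses. The function `truncate`, the array `w_avg`, and the random vectors are placeholders for illustration only; they are not part of the NNabla or NVlabs code.

```python
import numpy as np

def truncate(w, w_avg, psi):
    """Pull the latent code w toward the average latent w_avg by the factor psi."""
    return w_avg + psi * (w - w_avg)

rng = np.random.RandomState(0)
w_avg = rng.randn(512)  # stand-in for the learned average latent (assumption)
w = rng.randn(512)      # stand-in for a latent code from the mapping network (assumption)

w_full = truncate(w, w_avg, psi=1.0)  # psi = 1: w is left unchanged
w_half = truncate(w, w_avg, psi=0.5)  # psi = 0.5: halfway between w and the average
w_zero = truncate(w, w_avg, psi=0.0)  # psi = 0: every input collapses to the average
```

Smaller values of `psi` keep the latent code closer to the average, which is why lowering the truncation slider tends to give more "typical" but less varied faces.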


In the following cell, you will choose the random seed used for sampling the noise input **z**, the value for the **truncation trick**, and another random seed used for the additional noise input.

**Note: if running locally, ignore the markdown macros @markdown and @param**

In [None]:
#@markdown Choose the seed for noise input **z**. (This drastically changes the result)
latent_seed = 300  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose the value for truncation trick.
truncation_psi = 0.32  #@param {type: "slider", min: 0.0, max: 1.0, step: 0.01}

#@markdown Choose the seed for stochasticity input.  (This slightly changes the result)
noise_seed = 500  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose the number of layers to inject noise (a.k.a. stochastic variations) into. *This seems to change very little. Default: 18*
num_layers = 18 #@param {type: "slider", min: 0, max: 500, step:1}


#@markdown ---


# Now let's run StyleGAN2 to generate an image (more later)

Executing the following cell runs StyleGAN2. By changing the value used for the **truncation trick**, you will get different results.

**This cell is to test a single output only, you'll be able to generate way more in a cell further down**

In [None]:
import numpy as np

rnd = np.random.RandomState(latent_seed)
z = rnd.randn(batch_size, 512)

style_noise = nn.Variable((batch_size, 512)).apply(d=z)
style_noises = [style_noise for _ in range(num_layers)]

rgb_output = generate(batch_size, style_noises, noise_seed, truncation_psi)
rgb_output.forward()

image = convert_images_to_uint8(rgb_output, drange=[-1, 1])
filename = f"seed{latent_seed}.png"
imsave(filename, image, channel_first=True)

display(Image(filename, width=512, height=512))

# Try Style Mixing

![styleGAN2 generated image sample](https://github.com/sony/nnabla-examples/raw/master/GANs/stylegan2/images/style_mixing_sample.png)

As described above (and in the presentation), the **synthesis network** in StyleGAN2 receives the latent code **w** multiple times while generating an image. In the previous generation, the latent code **w** that the **synthesis network** receives was made from a single noise input **z**. In this case, we can say that **w** controls the *style* of the generated image.

Given that, with a *different* latent code **w2**, made from another noise input **z2**, the **synthesis network** can generate a completely different image. So, what if we use both **w** and **w2**? That is *style mixing*.

To be specific, using two latent codes **w** and **w2**, the **synthesis network** can generate an image that contains elements (e.g. hair style, facial components) present in the images made from **w** (controlling coarse style) and **w2** (controlling fine style).

In the following cell, you will choose one more random seed used for sampling another noise input **z2**. 

You can also choose from which layer it receives the additional latent code **w2**. It slightly changes the result, so try various patterns.
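
To make the crossover concrete, here is a minimal sketch of how the per-layer latent assignment could look. It only labels which latent each layer receives and does not run the generator; the `sources` list is a hypothetical name introduced for illustration, while `num_layers` and `mix_after` mirror the values used in this notebook.

```python
num_layers = 18  # number of style inputs used in this notebook
mix_after = 7    # crossover point, as in the config cell below

# The first `mix_after` layers (coarse styles) receive w,
# the remaining layers (fine styles) receive w2.
sources = ["w"] * mix_after + ["w2"] * (num_layers - mix_after)

for layer, source in enumerate(sources):
    print(f"layer {layer:2d} <- {source}")
```

Moving `mix_after` toward 0 hands more layers to **w2**, and moving it toward 17 hands more layers to **w**.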

**Note: if running locally, ignore the markdown macros @markdown and @param**

In [None]:
#@title StyleGAN2 style mixing config
#@markdown Choose seed for the primary noise input **z**.
latent_seed = 300  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose seed for the secondary noise input **z2**.
latent_seed2 = 444  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose from which layer to use the secondary latent code **w2**.
mix_after = 7  #@param {type: "slider", min: 0, max: 17, step:1}

#@markdown Choose seed for stochasticity input.
noise_seed = 500  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose the value for truncation trick.
truncation_psi = 0.5  #@param {type: "slider", min: 0.0, max: 1.0, step: 0.01}

#@markdown ---


# Let's run style mixing.

Running this cell executes style mixing and displays a generated mixed image and images made solely from **w** / **w2**.

In [None]:
rnd = np.random.RandomState(latent_seed)
z = rnd.randn(batch_size, 512)

rnd2 = np.random.RandomState(latent_seed2)
z2 = rnd2.randn(batch_size, 512)

style_noises = [nn.Variable((batch_size, 512)).apply(d=z) for _ in range(mix_after)]
style_noises += [nn.Variable((batch_size, 512)).apply(d=z2) for _ in range(num_layers - mix_after)]

rgb_output = generate(batch_size, style_noises, noise_seed, truncation_psi)
rgb_output.forward()

image_mix = convert_images_to_uint8(rgb_output, drange=[-1, 1])

for style_noise in style_noises:
    style_noise.d = z
rgb_output.forward()
image_A = convert_images_to_uint8(rgb_output, drange=[-1, 1])

for style_noise in style_noises:
    style_noise.d = z2
rgb_output.forward()
image_B = convert_images_to_uint8(rgb_output, drange=[-1, 1])

top_image = 255 * np.ones(image_mix.shape).astype(np.uint8)
top_image = np.concatenate([top_image, image_B], axis=2)
bottom_image = np.concatenate([image_A, image_mix], axis=2)
grid_image = np.concatenate([top_image, bottom_image], axis=1)
imsave("grid.png", grid_image, channel_first=True)
display(Image("grid.png", width=512, height=512))

# Exploring Latent Space

For this part, we need the original implementation rather than the library wrapper used in the example above.

'Config F' (mentioned in the presentation) will also be downloaded manually.

In [None]:
%tensorflow_version 1.x
import tensorflow as tf

# Download the code
!git clone https://github.com/NVlabs/stylegan2.git
chdir('stylegan2')
!nvcc test_nvcc.cu -o test_nvcc -run

print('Tensorflow version: {}'.format(tf.__version__) )
!nvidia-smi -L
print('GPU Identified at: {}'.format(tf.test.gpu_device_name()))

## Choose pretrained model

Choose between these pretrained models.

'stylegan2-\<DATASET\>-config-f.pkl' is the best choice according to the paper and is downloaded in the cell above! If you choose a different model here, *you must download the weights accordingly in the cell above*.

Note: **Do not change the default base path 'gdrive:networks', as it depends on your local Colab instance and Google account.**

In [None]:
# Download the model of choice
import argparse
import numpy as np
import PIL.Image
import dnnlib
import dnnlib.tflib as tflib
import re
import sys
from io import BytesIO
import IPython.display
from math import ceil
from PIL import Image, ImageDraw
import imageio

import pretrained_networks

# 1024×1024 faces, ffhq stands for FlickrFaceHighQuality
# stylegan2-ffhq-config-a.pkl
# stylegan2-ffhq-config-b.pkl
# stylegan2-ffhq-config-c.pkl
# stylegan2-ffhq-config-d.pkl
# stylegan2-ffhq-config-e.pkl
# stylegan2-ffhq-config-f.pkl

# 512×384 cars
# stylegan2-car-config-a.pkl
# stylegan2-car-config-b.pkl
# stylegan2-car-config-c.pkl
# stylegan2-car-config-d.pkl
# stylegan2-car-config-e.pkl
# stylegan2-car-config-f.pkl

# 256x256 horses
# stylegan2-horse-config-a.pkl
# stylegan2-horse-config-f.pkl

# 256x256 churches
# stylegan2-church-config-a.pkl
# stylegan2-church-config-f.pkl

# 256x256 cats
# stylegan2-cat-config-f.pkl
# stylegan2-cat-config-a.pkl
network_pkl = "gdrive:networks/stylegan2-ffhq-config-f.pkl"

print('Loading networks from "%s"...' % network_pkl)
_G, _D, Gs = pretrained_networks.load_networks(network_pkl)
noise_vars = [var for name, var in Gs.components.synthesis.vars.items() if name.startswith('noise')]

print('Done initializing model')

This cell contains helper functions only!

In [None]:
# Useful utility functions...

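# Download a file to your local machine (works in Colab only)
def download_gdrive_file(file):
  from google.colab import files  # this module only exists inside Colab
  files.download(file)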

# Generates a list of images from a list of latent codes in W space (dlatents).
# The dlatents are passed to the synthesis network as-is; convertZtoW has
# already applied the truncation, so it is not recomputed here.
def generate_images_in_w_space(dlatents, truncation_psi):
    Gs_kwargs = dnnlib.EasyDict()
    Gs_kwargs.output_transform = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)
    Gs_kwargs.randomize_noise = False
    Gs_kwargs.truncation_psi = truncation_psi

    imgs = []
    for dlatent in log_progress(dlatents, name = "Generating images"):
        row_images = Gs.components.synthesis.run(dlatent, **Gs_kwargs)
        imgs.append(PIL.Image.fromarray(row_images[0], 'RGB'))
    return imgs

# Generate array of images
def generate_images(zs, truncation_psi):
    Gs_kwargs = dnnlib.EasyDict()
    Gs_kwargs.output_transform = dict(func=tflib.convert_images_to_uint8, nchw_to_nhwc=True)
    Gs_kwargs.randomize_noise = False
    if not isinstance(truncation_psi, list):
        truncation_psi = [truncation_psi] * len(zs)
        
    imgs = []
    for z_idx, z in log_progress(enumerate(zs), size = len(zs), name = "Generating images"):
        Gs_kwargs.truncation_psi = truncation_psi[z_idx]
        noise_rnd = np.random.RandomState(1) # fix noise
        tflib.set_vars({var: noise_rnd.randn(*var.shape.as_list()) for var in noise_vars}) # [height, width]
        images = Gs.run(z, None, **Gs_kwargs) # [minibatch, height, width, channel]
        imgs.append(PIL.Image.fromarray(images[0], 'RGB'))
    return imgs

# Generate noise input from seeds
def generate_zs_from_seeds(seeds):
    zs = []
    for seed_idx, seed in enumerate(seeds):
        rnd = np.random.RandomState(seed)
        z = rnd.randn(1, *Gs.input_shape[1:]) # [minibatch, component]
        zs.append(z)
    return zs

# Generates a list of images, based on a list of seed for latent vectors (Z), and a list (or a single constant) of truncation_psi's.
def generate_images_from_seeds(seeds, truncation_psi):
    return generate_images(generate_zs_from_seeds(seeds), truncation_psi)

# Save images
def saveImgs(imgs, location):
  for idx, img in log_progress(enumerate(imgs), size = len(imgs), name="Saving images"):
    file = location+ str(idx) + ".png"
    img.save(file)

# Show images in matrix
def imshow(a, format='png', jpeg_fallback=True):
  a = np.asarray(a, dtype=np.uint8)
  str_file = BytesIO()
  PIL.Image.fromarray(a).save(str_file, format)
  im_data = str_file.getvalue()
  try:
    disp = IPython.display.display(IPython.display.Image(im_data))
  except IOError:
    if jpeg_fallback and format != 'jpeg':
      print('Warning: image was too large to display in format "{}"; '
            'trying jpeg instead.'.format(format))
      return imshow(a, format='jpeg')
    else:
      raise
  return disp

def showarray(a, fmt='png'):
    a = np.uint8(a)
    f = BytesIO()  # PIL writes binary data, so a bytes buffer is required
    PIL.Image.fromarray(a).save(f, fmt)
    IPython.display.display(IPython.display.Image(data=f.getvalue()))

# clamp x with min and max       
def clamp(x, minimum, maximum):
    return max(minimum, min(x, maximum))
    
# draw latent 
def drawLatent(image,latents,x,y,x2,y2, color=(255,0,0,100)):
  buffer = PIL.Image.new('RGBA', image.size, (0,0,0,0))
   
  draw = ImageDraw.Draw(buffer)
  cy = (y+y2)/2
  draw.rectangle([x,y,x2,y2],fill=(255,255,255,180), outline=(0,0,0,180))
  for i in range(len(latents)):
    mx = x + (x2-x)*(float(i)/len(latents))
    h = (y2-y)*latents[i]*0.1
    h = clamp(h,cy-y2,y2-cy)
    draw.line((mx,cy,mx,cy+h),fill=color)
  return PIL.Image.alpha_composite(image,buffer)
             
# Create image grid
def createImageGrid(images, scale=0.25, rows=1):
   w,h = images[0].size
   w = int(w*scale)
   h = int(h*scale)
   height = rows*h
   cols = ceil(len(images) / rows)
   width = cols*w
   canvas = PIL.Image.new('RGBA', (width,height), 'white')
   for i,img in enumerate(images):
     img = img.resize((w,h), PIL.Image.LANCZOS)  # ANTIALIAS is an older alias of LANCZOS
     canvas.paste(img, (w*(i % cols), h*(i // cols))) 
   return canvas

# Convert noise to mapping network output
def convertZtoW(latent, truncation_psi=0.7, truncation_cutoff=9):
  dlatent = Gs.components.mapping.run(latent, None) # [seed, layer, component]
  dlatent_avg = Gs.get_var('dlatent_avg') # [component]
  for i in range(truncation_cutoff):
    dlatent[0][i] = (dlatent[0][i]-dlatent_avg)*truncation_psi + dlatent_avg
    
  return dlatent

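# Linearly interpolate between consecutive latents. Each pair yields `steps`
# frames, starting at the first latent and stopping just short of the next one.
def interpolate(zs, steps):
    out = []
    for i in range(len(zs)-1):
        for index in range(steps):
            fraction = index/float(steps)
            out.append(zs[i+1]*fraction + zs[i]*(1-fraction))
    return out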

# Taken from https://github.com/alexanderkuk/log-progress
def log_progress(sequence, every=1, size=None, name='Items'):
    from ipywidgets import IntProgress, HTML, VBox
    from IPython.display import display

    is_iterator = False
    if size is None:
        try:
            size = len(sequence)
        except TypeError:
            is_iterator = True
    if size is not None:
        if every is None:
            if size <= 200:
                every = 1
            else:
                every = int(size / 200)     # every 0.5%
    else:
        assert every is not None, 'sequence is iterator, set every'

    if is_iterator:
        progress = IntProgress(min=0, max=1, value=1)
        progress.bar_style = 'info'
    else:
        progress = IntProgress(min=0, max=size, value=0)
    label = HTML()
    box = VBox(children=[label, progress])
    display(box)

    index = 0
    try:
        for index, record in enumerate(sequence, 1):
            if index == 1 or index % every == 0:
                if is_iterator:
                    label.value = '{name}: {index} / ?'.format(
                        name=name,
                        index=index
                    )
                else:
                    progress.value = index
                    label.value = u'{name}: {index} / {size}'.format(
                        name=name,
                        index=index,
                        size=size
                    )
            yield record
    except:
        progress.bar_style = 'danger'
        raise
    else:
        progress.bar_style = 'success'
        progress.value = index
        label.value = "{name}: {index}".format(
            name=name,
            index=str(index or '?')
        )


## Show images from random seeds!

Modify the `n_images` parameter to generate more/fewer images

In [None]:
# generate some random seeds

n_images = 9

seeds = np.random.randint(10000000, size=n_images)
print('Random seeds: ', seeds)

# show the seeds
imshow(createImageGrid(generate_images_from_seeds(seeds, 0.7), 0.7 , 3))

# Interpolation

In the next section, you choose two random seeds, one for each image. The notebook then interpolates between the two resulting latents and saves the frames as a video!

**Note: if running locally, ignore the markdown macros @markdown and @param**
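
As a rough illustration of what the interpolation does, the sketch below blends two noise vectors linearly. It mirrors the `interpolate` helper defined earlier, but `lerp_path`, `z_a`, and `z_b` are hypothetical names, and the seeds are arbitrary. No generator is needed to run it.

```python
import numpy as np

def lerp_path(z_a, z_b, steps):
    """Return `steps` latents going from z_a toward z_b (z_b itself is excluded)."""
    return [z_a * (1 - i / steps) + z_b * (i / steps) for i in range(steps)]

z_a = np.random.RandomState(1).randn(1, 512)  # latent for image 1 (arbitrary seed)
z_b = np.random.RandomState(2).randn(1, 512)  # latent for image 2 (arbitrary seed)

path = lerp_path(z_a, z_b, steps=4)
print(len(path), path[0].shape)
```

Each entry of `path` would be fed to the generator to produce one video frame, so more steps give a smoother but longer video.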

In [None]:
#@title StyleGAN2 interpolation config
#@markdown Choose seed for the primary noise input **z** (image 1).
seed = 5017689  #@param {type: "slider", min: 0, max: 10000000, step:1}

#@markdown Choose seed for the secondary noise input **z2** (image 2).
seed2 = 1941088  #@param {type: "slider", min: 0, max: 10000000, step:1}

#@markdown Interpolation steps
number_of_steps = 20 #@param {type: "slider", min: 0, max: 10000, step:1}

#@markdown Download the resulting video (it is also shown below the cell, so choose wisely)
save_video = False #@param {type:"boolean"}
#@markdown ---


In [None]:
from IPython.display import HTML
from base64 import b64encode

# Simple (Z) interpolation: interpolation between two input noises
zs = generate_zs_from_seeds([seed , seed2])

latent1 = zs[0]
latent2 = zs[1]

imgs = generate_images(interpolate([latent1,latent2],number_of_steps), 1.0)
number_of_images = len(imgs)

%mkdir -p out

movieName = 'out/mov_interpolation.mp4'

with imageio.get_writer(movieName, mode='I') as writer:
    for image in log_progress(list(imgs), name = "Creating animation"):
        writer.append_data(np.array(image))

if save_video:
  print("Downloading video")
  download_gdrive_file(movieName)

# Display Video. Can't be put into function, because it modifies the cell itself :/
mp4 = open(movieName,'rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML("""
<video width=400 controls>
      <source src="%s" type="video/mp4">
</video>
""" % data_url)

## More complex example

This example interpolates between several images in the output space of the mapping network (the style codes, **W**) rather than in the noise space (**Z**), using one latent code per image.
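
The sketch below shows how chaining several keyframes determines the number of video frames. It assumes W-space latents of shape `[1, 18, 512]` (one code per layer, as for the 1024×1024 face model); `chain_interpolate` and the random `keys` are placeholders rather than real mapping-network outputs.

```python
import numpy as np

def chain_interpolate(keys, steps):
    """Linearly interpolate between each consecutive pair of keyframes."""
    frames = []
    for a, b in zip(keys[:-1], keys[1:]):
        for i in range(steps):
            t = i / steps
            frames.append(a * (1 - t) + b * t)
    return frames

rng = np.random.RandomState(0)
keys = [rng.randn(1, 18, 512) for _ in range(8)]  # placeholder W codes (assumption)

frames = chain_interpolate(keys, steps=100)
print(len(frames))  # (8 - 1) * 100 frames
```

With the default settings in the config cell (8 images, 100 steps), this comes to 700 frames, so generation can take a while.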

In [None]:
#@title StyleGAN2 complex interpolation config

#@markdown number of images to interpolate between. You can change the max value if 20 is not enough for your experiments
number_of_images = 8 #@param {type: "slider", min: 0, max: 20, step:1}

#@markdown number of steps of interpolation between pair of two images
number_of_steps = 100 #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose the value for truncation trick.
truncation_psi = 1.0  #@param {type: "slider", min: 0.0, max: 1.0, step: 0.01}

#@markdown Download the resulting video (it is also shown below the cell, so choose wisely)
save_video = False #@param {type:"boolean"}
#@markdown ---

In [None]:
# more complex example, interpolating in W instead of Z space.
zs = generate_zs_from_seeds(np.random.randint(10000000, size=number_of_images))

# It seems my truncation_psi is slightly less efficient in W space - I probably introduced an error somewhere...

dls = []
for z in zs:
  dls.append(convertZtoW(z ,truncation_psi))

imgs = generate_images_in_w_space(interpolate(dls,number_of_steps), truncation_psi)

%mkdir -p out
movieName = 'out/mov_complex.mp4'

with imageio.get_writer(movieName, mode='I') as writer:
    for image in log_progress(list(imgs), name = "Creating animation"):
        writer.append_data(np.array(image))

if save_video:
  download_gdrive_file(movieName)

# Display Video. Can't be put into function, because it modifies the cell itself :/
mp4 = open(movieName,'rb').read()
data_url = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML("""
<video width=400 controls>
      <source src="%s" type="video/mp4">
</video>
""" % data_url)