# StyleGAN2

![styleGAN2 generated image sample](https://github.com/sony/nnabla-examples/raw/master/image-generation/stylegan2/images/sample.png)

This example demonstrates face image generation using [StyleGAN2](https://github.com/NVlabs/stylegan2), a generative adversarial network (GAN) that can generate high-resolution images.


# Preparation
Let's start by installing nnabla and cloning the [nnabla-examples repository](https://github.com/sony/nnabla-examples). If you're running on Colab, make sure the runtime type is set to GPU from the top menu (Runtime → Change runtime type), and click **Connect** at the top right of the screen before you start.

In [1]:
!pip install nnabla-ext-cuda100
!git clone https://github.com/sony/nnabla-examples.git
%cd nnabla-examples/image-generation/stylegan2

Collecting nnabla-ext-cuda100
  Downloading nnabla_ext_cuda100-1.25.0-cp37-cp37m-manylinux_2_17_x86_64.whl (51.1 MB)
Collecting nnabla==1.25.0
  Downloading nnabla-1.25.0-cp37-cp37m-manylinux_2_17_x86_64.whl (18.1 MB)
Collecting boto3
  Downloading boto3-1.21.44-py3-none-any.whl (132 kB)
Collecting configparser
  Downloading configparser-5.2.0-py3-none-any.whl (19 kB)
Collecting jmespath<2.0.0,>=0.7.1
  Downloading jmespath-1.0.0-py3-none-any.whl (23 kB)
Collecting s3transfer<0.6.0,>=0.5.0
  Downloading s3transfer-0.5.2-py3-none-any.whl (79 kB)
Collecting botocore<1.25.0,>=1.24.44
  Downloading botocore-1.24.44-py3-none-any.whl (8.7 MB)
Collecting urllib3<1.27,>=1.25.4


# Get the pretrained weights
Next, download the pretrained weights for StyleGAN2, import the required modules, and set up the execution context used in the rest of the notebook.

In [2]:
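# Download the pretrained generator weights
!wget https://nnabla.org/pretrained-models/nnabla-examples/GANs/stylegan2/styleGAN2_G_params.h5
# The cells below rely on this star import for generate(), convert_images_to_uint8(),
# and the other names they use (nn, np, F, get_extension_context, imsave)
from generate import *
from IPython.display import Image, display
# Run on the GPU through the cuDNN extension
ctx = get_extension_context("cudnn")
nn.set_default_context(ctx)

num_layers = 18
output_dir = 'results'

# Load the pretrained parameters into the current parameter scope
nn.load_parameters("styleGAN2_G_params.h5")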

--2022-04-21 12:21:13--  https://nnabla.org/pretrained-models/nnabla-examples/GANs/stylegan2/styleGAN2_G_params.h5
Resolving nnabla.org (nnabla.org)... 18.160.200.9, 18.160.200.40, 18.160.200.38, ...
Connecting to nnabla.org (nnabla.org)|18.160.200.9|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 121643776 (116M) [binary/octet-stream]
Saving to: ‘styleGAN2_G_params.h5’


2022-04-21 12:21:17 (34.0 MB/s) - ‘styleGAN2_G_params.h5’ saved [121643776/121643776]



2022-04-21 12:21:17,389 [nnabla][INFO]: Initializing CPU extension...
2022-04-21 12:21:19,012 [nnabla][INFO]: Initializing CUDA extension...
2022-04-21 12:21:19,122 [nnabla][INFO]: Initializing cuDNN extension...




# StyleGAN2 input config

In StyleGAN2, the noise input **z** is fed to the **mapping network** to produce the latent code **w**. Then **w** is modified via the **truncation trick**, and the modified latent code **w'** is injected into the **synthesis network**.

The **synthesis network** receives **w'** multiple times as it transforms the incoming tensor and gradually converts it into an image.

This is how StyleGAN2 generates photo-realistic, high-resolution images.
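
The truncation trick is commonly written as w' = w_avg + psi * (w - w_avg), where w_avg is the average latent code and psi is the truncation value. The snippet below is a minimal NumPy sketch of that formula only. `truncate` is a hypothetical helper, the random `w` stands in for real mapping-network outputs, and the batch mean stands in for the running average that a trained model would keep; `generate` in this repository may implement the step differently.

```python
import numpy as np


def truncate(w, w_avg, psi):
    """Pull latent codes toward the average latent code (hypothetical helper)."""
    return w_avg + psi * (w - w_avg)


rng = np.random.RandomState(0)
w = rng.randn(4, 512)                   # placeholder for mapping-network outputs
w_avg = w.mean(axis=0, keepdims=True)   # placeholder for the tracked average

w_half = truncate(w, w_avg, 0.5)        # halfway between w and the average
w_full = truncate(w, w_avg, 1.0)        # psi = 1 leaves w unchanged
w_zero = truncate(w, w_avg, 0.0)        # psi = 0 collapses every code to the average
```

With this formulation, smaller values of psi trade variety for images closer to the "average" face, which is why moving the `truncation_psi` slider changes the result.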

In the following cell, you will choose the random seed used for sampling the noise input **z**, the value for the **truncation trick**, and another random seed used for the additional noise input.

In [6]:
#@markdown Choose the seed for noise input **z**. (This drastically changes the result)
latent_seed = 524  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose the value for truncation trick.
truncation_psi = 0.5  #@param {type: "slider", min: 0.0, max: 1.0, step: 0.01}

#@markdown Choose the seed for stochasticity input.  (This slightly changes the result)
noise_seed = 500  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Number of images to generate
batch_size = 1  #@param {type: "slider", min: 0, max: 20, step:1}
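
Because **z** is drawn from a seeded generator, a given `latent_seed` always reproduces the same noise, while a different seed gives different noise. The sketch below illustrates this with `np.random.RandomState`, the same sampler used when running the generator; `sample_z` is a hypothetical helper introduced only for this illustration.

```python
import numpy as np


def sample_z(seed, batch_size, dim=512):
    """Draw a (batch_size, dim) standard-normal noise array from a fixed seed."""
    return np.random.RandomState(seed).randn(batch_size, dim)


z_a = sample_z(524, 1)
z_b = sample_z(524, 1)   # same seed: identical noise
z_c = sample_z(525, 1)   # different seed: different noise
```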

# Now let's run StyleGAN2!
Executing the following cell runs StyleGAN2. Try changing the value used for the **truncation trick** to see how the results differ.

In [None]:
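# Sample the noise input z from a seeded generator so the result is reproducible
rnd = np.random.RandomState(latent_seed)
z = rnd.randn(batch_size, 512)

# Execute each function as soon as it is called (dynamic graph mode)
nn.set_auto_forward(True)

style_noise = nn.NdArray.from_numpy_array(z)
# The same noise is passed twice, so no style mixing takes place here
style_noises = [style_noise for _ in range(2)]

rgb_output = generate(batch_size, style_noises, noise_seed, mix_after=7, truncation_psi=truncation_psi)

# Convert the generator output from the range [-1, 1] to uint8 pixel values
images = convert_images_to_uint8(rgb_output, drange=[-1, 1])

# Save and display all the images
for i in range(batch_size):
  filename = f'seed{latent_seed}_{i}.png'
  imsave(filename, images[i], channel_first=True)
  display(Image(filename, width=512, height=512))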
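
The call to `convert_images_to_uint8` with `drange=[-1, 1]` turns the generator output into displayable pixel values. As a rough illustration of what such a conversion involves, the sketch below linearly rescales a float array from a given range to 0..255. `to_uint8` is a hypothetical stand-in written for this example; the actual function in `generate.py` may round or clip differently.

```python
import numpy as np


def to_uint8(x, drange=(-1.0, 1.0)):
    """Linearly map values in drange to 0..255 and cast to uint8 (hypothetical helper)."""
    lo, hi = drange
    scaled = (x - lo) / (hi - lo) * 255.0 + 0.5   # +0.5 so the cast rounds to nearest
    return np.clip(scaled, 0, 255).astype(np.uint8)


# A tiny channel-first "image" with values at the bottom, middle, and top of the range
fake_rgb = np.array([-1.0, 0.0, 1.0]).reshape(3, 1, 1)
pixels = to_uint8(fake_rgb)
```

Values outside the declared range are clipped, so an occasional generator output slightly beyond [-1, 1] still yields a valid pixel.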

# Try Style Mixing

![styleGAN2 generated image sample](https://github.com/sony/nnabla-examples/raw/master/image-generation/stylegan2/images/style_mixing_sample.png)

As described above, in StyleGAN2 the **synthesis network** receives the latent code **w** multiple times while generating an image. In the previous generation, every **w** the **synthesis network** received came from a single noise input **z**. In this case, we can say that **w** controls the *style* of the generated image.

Given that, with a *different* latent code **w2**, made from another noise input **z2**, the **synthesis network** generates a completely different image. So what if we use both **w** and **w2**? That is *style mixing*.

Specifically, using two latent codes **w** and **w2**, the **synthesis network** can generate an image that combines elements (e.g. hair style, facial features) present in the images made from **w** (controlling the coarse style) and **w2** (controlling the fine style).
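
One way to picture style mixing is as a per-layer choice between the two latent codes: early layers take **w** and later layers take **w2**. The NumPy sketch below illustrates that idea with 18 layers (the `num_layers` value set earlier). `mix_styles` is a hypothetical helper, the random arrays stand in for real latent codes, and the exact boundary (whether the layer with index `mix_after` itself takes **w** or **w2**) is an assumption that may differ from `generate`.

```python
import numpy as np


def mix_styles(w, w2, num_layers, mix_after):
    """Build per-layer latent codes: w before mix_after, w2 from mix_after on (assumed)."""
    layer_idx = np.arange(num_layers)[None, :, None]        # shape (1, L, 1)
    w_layers = np.repeat(w[:, None, :], num_layers, axis=1)   # shape (B, L, D)
    w2_layers = np.repeat(w2[:, None, :], num_layers, axis=1)
    return np.where(layer_idx < mix_after, w_layers, w2_layers)


num_layers = 18
mix_after = 7

rng = np.random.RandomState(0)
w = rng.randn(1, 512)    # placeholder for the coarse-style latent code
w2 = rng.randn(1, 512)   # placeholder for the fine-style latent code

per_layer = mix_styles(w, w2, num_layers, mix_after)
print(per_layer.shape)   # (1, 18, 512)
```

Under this picture, a small `mix_after` hands most layers to **w2**, while a large one leaves almost the whole image to **w**.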

In the following cell, you will choose one more random seed, used for sampling another noise input **z2**.

You can also choose the layer from which the network receives the additional latent code **w2**. This changes the result slightly, so try various values.

In [8]:
#@title StyleGAN2 style mixing config
#@markdown Choose seed for the primary noise input **z**. This will represent coarse style.
latent_seed = 600  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose seed for the secondary noise input **z2**. This will represent fine style.
latent_seed2 = 500  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose from which layer to use the secondary latent code **w2**.
mix_after = 7  #@param {type: "slider", min: 0, max: 17, step:1}

#@markdown Choose seed for stochasticity input.
noise_seed = 500  #@param {type: "slider", min: 0, max: 1000, step:1}

#@markdown Choose the value for truncation trick.
truncation_psi = 0.5  #@param {type: "slider", min: 0.0, max: 1.0, step: 0.01}

#@markdown Number of images made solely from coarse style noise
batch_size_A = 1  #@param {type: "slider", min: 0, max: 20, step:1}

#@markdown Number of images made solely from fine style noise
batch_size_B = 4  #@param {type: "slider", min: 0, max: 20, step:1}

# Let's run style mixing.

Running this cell executes style mixing and displays a grid containing the mixed images together with the images made solely from **w** and solely from **w2**.

In [None]:
rnd1 = np.random.RandomState(latent_seed)
z1 = nn.NdArray.from_numpy_array(rnd1.randn(batch_size_A, 512))

rnd2 = np.random.RandomState(latent_seed2)
z2 = nn.NdArray.from_numpy_array(rnd2.randn(batch_size_B, 512))

nn.set_auto_forward(True)

# Generate one mixed image for every (z1[i], z2[j]) pair
mix_image_stacks = []
for i in range(batch_size_A):
  image_column = []
  for j in range(batch_size_B):
    style_noises = [F.reshape(z1[i], (1, 512)), F.reshape(z2[j], (1, 512))]
    rgb_output = generate(1, style_noises, noise_seed, mix_after, truncation_psi)
    image_column.append(convert_images_to_uint8(rgb_output, drange=[-1, 1])[0])
  # Images are channel-first, so axis=2 joins them side by side
  image_column = np.concatenate(image_column, axis=2)
  mix_image_stacks.append(image_column)
# axis=1 stacks the rows on top of each other
mix_image_stacks = np.concatenate(mix_image_stacks, axis=1)

# Images generated solely from z1 (coarse style source), stacked vertically
style_noises = [z1, z1]
rgb_output = generate(batch_size_A, style_noises, noise_seed, mix_after, truncation_psi)
image_A = convert_images_to_uint8(rgb_output, drange=[-1, 1])
image_A = np.concatenate([image for image in image_A], axis=1)

# Images generated solely from z2 (fine style source), placed side by side
style_noises = [z2, z2]
rgb_output = generate(batch_size_B, style_noises, noise_seed, mix_after, truncation_psi)
image_B = convert_images_to_uint8(rgb_output, drange=[-1, 1])
image_B = np.concatenate([image for image in image_B], axis=2)

# Blank white tile for the top-left corner of the grid
top_image = 255 * np.ones(rgb_output[0].shape).astype(np.uint8)

# Assemble the grid: the top row holds the z2 images, the left column the z1 images
top_image = np.concatenate((top_image, image_B), axis=2)
grid_image = np.concatenate((image_A, mix_image_stacks), axis=2)
grid_image = np.concatenate((top_image, grid_image), axis=1)

imsave("grid.png", grid_image, channel_first=True)
display(Image("grid.png", width=256*(batch_size_B+1), height=256*(batch_size_A+1)))
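
The grid assembly above relies on the images being channel-first, so that concatenating along axis 2 places tiles side by side and along axis 1 stacks them vertically. The sketch below reproduces the same layout with tiny constant-valued dummy tiles instead of generated images, which makes the resulting shape easy to check. All names (`dummy`, `images_a`, `images_b`, `mixed`) are placeholders for this illustration only.

```python
import numpy as np

H, W = 4, 4        # tile height and width (tiny, for illustration)
n_a, n_b = 2, 3    # number of coarse-style and fine-style images


def dummy(value):
    """Return a channel-first (3, H, W) tile filled with a single value."""
    return np.full((3, H, W), value, dtype=np.uint8)


images_a = [dummy(10 * (i + 1)) for i in range(n_a)]     # left column tiles
images_b = [dummy(100 + 10 * j) for j in range(n_b)]     # top row tiles
mixed = [[dummy(i * n_b + j) for j in range(n_b)] for i in range(n_a)]

corner = dummy(255)                                         # white top-left tile
top_row = np.concatenate([corner] + images_b, axis=2)       # (3, H, W * (n_b + 1))
left_col = np.concatenate(images_a, axis=1)                 # (3, H * n_a, W)
body = np.concatenate(
    [np.concatenate(row, axis=2) for row in mixed], axis=1  # (3, H * n_a, W * n_b)
)
grid = np.concatenate(
    [top_row, np.concatenate([left_col, body], axis=2)], axis=1
)
print(grid.shape)  # (3, 12, 16)
```

The final shape is `(3, H * (n_a + 1), W * (n_b + 1))`, matching the `(batch_size_A + 1)` by `(batch_size_B + 1)` tile layout used when displaying `grid.png`.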