<a href="https://colab.research.google.com/github/filipecalegario/stylegan2-ada-experiments/blob/main/StyleGAN2_LatentVectors.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advanced StyleGAN Week 3: Latent Vectors

Pretty much everything we’re going to do in the next couple of weeks involves manipulating vectors. To get a better sense of what we’re doing, we should understand exactly what a vector is, how to manipulate one, and the difference between a Z vector and a W vector.

In [None]:
!git clone https://github.com/NVlabs/stylegan2-ada-pytorch
%cd stylegan2-ada-pytorch

!pip install ninja

Let’s also download a StyleGAN model file. You can import your own, or there are many to pick from on the [Awesome StyleGAN2 Pretrained Model page](https://github.com/justinpinkney/awesome-pretrained-stylegan2).

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
!wget http://d36zk2xti64re0.cloudfront.net/stylegan2/networks/stylegan2-cat-config-f.pkl

In [None]:
# %cd ../
%mkdir dvschultz
%cd dvschultz
!git clone https://github.com/dvschultz/stylegan2-ada-pytorch

In [None]:
%cd stylegan2-ada-pytorch/

Let’s import some libraries (some are Python libraries, others come from the StyleGAN repo). After the imports, set the path to your .pkl file.

In [None]:
import os
import re

import dnnlib
import numpy as np
import PIL.Image
import torch

import legacy

In [None]:
network_pkl = '/content/drive/MyDrive/GDL-studies/ERN-FUFI/modelo/ernesto-inicia-network-snapshot-000016.pkl'

In [None]:
print('Loading networks from "%s"...' % network_pkl)

device = torch.device('cuda') # we will use a GPU
with dnnlib.util.open_url(network_pkl) as f:
    G = legacy.load_network_pkl(f)['G_ema'].to(device)

## Generating images from a Z space Vector

Now that we’ve loaded our model, we can generate a random vector.

A `seed`, as used in the StyleGAN scripts, is the value that initializes the random number generator. As long as the seed is the same, we get the same random vector every time.

`G.z_dim` is 512 in most cases. It can be customized, which is why we read it directly from the model.

In [None]:
seed = 3923
z = np.random.RandomState(seed).randn(1, G.z_dim) 

print(z)
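
To check the seed behavior described above without loading a model, here is a minimal NumPy-only sketch. `Z_DIM` and `z_from_seed` are hypothetical names introduced for illustration; `Z_DIM = 512` is an assumed stand-in for `G.z_dim`.

```python
import numpy as np

Z_DIM = 512  # assumption: stand-in for G.z_dim, which is 512 in most models


def z_from_seed(seed, z_dim=Z_DIM):
    """Return a (1, z_dim) standard-normal vector fully determined by the seed."""
    return np.random.RandomState(seed).randn(1, z_dim)


z_a = z_from_seed(3923)
z_b = z_from_seed(3923)  # same seed as z_a
z_c = z_from_seed(3924)  # different seed

print(z_a.shape)                 # (1, 512)
print(np.array_equal(z_a, z_b))  # True: same seed, same vector
print(np.array_equal(z_a, z_c))  # False: different seed, different vector
```

Reading the dimension from the model instead of hard-coding it keeps the same code working for models trained with a different `z_dim`.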

Next, we’ll load this vector into PyTorch.

In [None]:
z = torch.from_numpy(z).to(device)

Now we can generate an image from the vector

In [None]:
truncation_psi = 1
noise_mode = 'const' # 'const', 'random', 'none'
outdir = '/content/drive/MyDrive/GDL-studies/ERN-FUFI/02-09-2021'

# make sure our output directory exists
os.makedirs(outdir, exist_ok=True)

# label is for class-based models. Let's assume we're not doing that here.
label = torch.zeros([1, G.c_dim], device=device)

img = G(z, label, truncation_psi=truncation_psi, noise_mode=noise_mode)
print(img)
img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
print(img)
PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save(f'{outdir}/seed{seed:04d}_1_0_random.png')
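
The conversion line in the cell above does two things: it moves the channel axis to the end and rescales floats from roughly [-1, 1] to bytes in [0, 255]. The sketch below reproduces that arithmetic in plain NumPy on a tiny fake image so the mapping is easy to inspect. `to_uint8_image` and `fake` are hypothetical names for illustration, not part of the StyleGAN repo.

```python
import numpy as np


def to_uint8_image(img):
    """Convert a float batch in NCHW layout (values roughly in [-1, 1]) to uint8 NHWC."""
    img = np.transpose(img, (0, 2, 3, 1))     # NCHW -> NHWC, the layout PIL expects
    img = np.clip(img * 127.5 + 128, 0, 255)  # rescale, then clamp out-of-range values
    return img.astype(np.uint8)               # truncate to whole byte values


# A fake 2x2 "image": first channel all -1, second all 0, third all +1.
fake = np.zeros((1, 3, 2, 2), dtype=np.float32)
fake[0, 0] = -1.0
fake[0, 2] = 1.0

pixels = to_uint8_image(fake)
print(pixels.shape)     # (1, 2, 2, 3)
print(pixels[0, 0, 0])  # channel values 0, 128, 255 for the top-left pixel
```

The clamp matters because the generator output is not strictly bounded, so values slightly outside [-1, 1] would otherwise wrap around when cast to bytes.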

## Linear Interpolation

Let’s look at how to interpolate between zs.

We’ll start by defining a lerp function.

`z[0]*(1-t) + z[1]*t`, where t runs from 0 to 1 and is the fraction of the way from `z[0]` to `z[1]` (one value of t per step)

In [None]:
def lerp(zs, steps):
    out = []
    for i in range(len(zs)-1):
        for index in range(steps):
            t = index/float(steps)
            out.append(zs[i+1]*t + zs[i]*(1-t))
    return out
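
To see what this function returns, here is a small sketch that runs the same `lerp` on two tiny vectors (all zeros and all ones) so each frame's value equals its `t`. The vectors `z_start` and `z_end` are made up for illustration. Note that the final keyframe is never emitted, because `t` stops one step short of 1.

```python
import numpy as np


def lerp(zs, steps):
    """Same function as in the cell above, repeated so this example runs on its own."""
    out = []
    for i in range(len(zs) - 1):
        for index in range(steps):
            t = index / float(steps)
            out.append(zs[i + 1] * t + zs[i] * (1 - t))
    return out


z_start = np.zeros((1, 4))  # illustrative stand-ins for real z vectors
z_end = np.ones((1, 4))

frames = lerp([z_start, z_end], 4)
print(len(frames))                       # 4
print([float(f[0, 0]) for f in frames])  # [0.0, 0.25, 0.5, 0.75]

# Optional: append the last keyframe if the sequence should end exactly on it.
frames_closed = frames + [z_end]
print(len(frames_closed))                # 5
```

Leaving out the endpoint is convenient when chaining several keyframes or looping back to the first one, since no frame gets duplicated at the joins.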

Now let’s create two z vectors, then create the lerp vectors, then render them all as images:

In [None]:
z1 = np.random.RandomState(20).randn(1, G.z_dim)
z2 = np.random.RandomState(100).randn(1, G.z_dim)

frame_zs = lerp([z1,z2], 72)

print('how many lerp frames? ',len(frame_zs))

outdir = '/content/output-frames/'
os.makedirs(outdir, exist_ok=True)

# label is still 0
label = torch.zeros([1, G.c_dim], device=device)

for idx, z in enumerate(frame_zs):
    z = torch.from_numpy(z).to(device)
    print('Generating frame %d/%d' % (idx, len(frame_zs)))
    img = G(z, label, truncation_psi=truncation_psi, noise_mode=noise_mode)
    img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
    PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save(f'{outdir}/frame-{idx:04d}.png')


And finally let’s convert it to a video using ffmpeg

In [None]:
!ffmpeg -framerate 24 -i /content/output-frames/frame-%04d.png -vcodec libx264 -pix_fmt yuv420p /content/lerp.mp4

## W space interpolation

W space produces less entangled interpolations. A couple of notes about W space:

* The process is to take a Z vector, map it to W space, and then interpolate in W space.
* Interpolating in Z and then converting each frame to W won’t change anything. That’s because a Z vector and the W vector it maps to produce exactly the same image.
* You will usually start with a Z vector and map it to W. I can’t see a reason why you would do the opposite (maybe there’s some reason, but it would be an edge case).
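
The order of operations in the notes above matters because the mapping from Z to W is nonlinear. The sketch below illustrates this with a toy function: `toy_mapping` is a hypothetical stand-in invented for this example, not the real `G.mapping` network, and `DIM = 8` is an arbitrary small size.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy size; the real Z and W spaces are usually 512-dimensional
weights = rng.standard_normal((DIM, DIM))


def toy_mapping(z):
    """Hypothetical nonlinear stand-in for G.mapping (NOT the real network)."""
    return np.tanh(z @ weights)


z1 = rng.standard_normal((1, DIM))
z2 = rng.standard_normal((1, DIM))

t = 0.5
# Interpolate in Z first, then map the midpoint to W.
w_via_z = toy_mapping(z1 * (1 - t) + z2 * t)
# Map both endpoints to W first, then interpolate in W.
w_via_w = toy_mapping(z1) * (1 - t) + toy_mapping(z2) * t

print(np.allclose(w_via_z, w_via_w))    # do the two midpoints agree?
print(np.abs(w_via_z - w_via_w).max())  # largest per-coordinate difference
```

Both paths start and end on the same W vectors, but the in-between frames differ, which is why a W-space interpolation only looks different if the lerp itself happens in W.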

So let’s start by making two Z vectors and then converting them to two W vectors.

In [None]:
z1 = np.random.RandomState(20).randn(1, G.z_dim)
z2 = np.random.RandomState(100).randn(1, G.z_dim)

zs = [z1,z2]

ws = []
for z_idx, z in enumerate(zs):
    z = torch.from_numpy(z).to(device)
    w = G.mapping(z, label, truncation_psi=truncation_psi, truncation_cutoff=8)
    ws.append(w)

If a Z vector is 512 dimensions (often shown as `[1, 512]`) then a W vector is multiple "stacks" of 512 dimensions. The number of stacks is often dependent on the resolution of the model (it’s also settable in the training config).

If you use the cat model downloaded in this demo, you’ll find its W vectors have a shape of `[1, 14, 512]`. A 1024x1024 model usually has `[1, 18, 512]`.
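
The two shapes above fit a simple pattern. As a rough sketch, assuming the standard StyleGAN2 synthesis architecture, the number of stacks is `2 * log2(resolution) - 2`. The helper `num_ws_for_resolution` is a hypothetical name for illustration, and a custom architecture may not follow this formula. The sketch also assumes that, with `truncation_psi = 1`, every stack coming out of the mapping network is a copy of the same 512-dimensional vector.

```python
import math

import numpy as np


def num_ws_for_resolution(resolution):
    """Number of W stacks for a standard StyleGAN2 generator (assumed formula)."""
    return 2 * int(math.log2(resolution)) - 2


for res in (256, 512, 1024):
    print(res, num_ws_for_resolution(res))  # 256 14, 512 16, 1024 18

# Build a stacked W by repeating one 512-d vector across all stacks.
w_single = np.random.RandomState(0).randn(1, 512)
w_stacked = np.repeat(w_single[:, np.newaxis, :], num_ws_for_resolution(256), axis=1)
print(w_stacked.shape)  # (1, 14, 512)
```

With a real model you can read the value directly from `ws[0].shape[1]` instead of computing it.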

In [None]:
print(ws[1])

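The lerp code is exactly the same. The W vectors are PyTorch tensors rather than NumPy arrays, but tensors support the same arithmetic operators, so `lerp` works on them unchanged.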

In [None]:
frame_ws = lerp(ws, 100)

print(len(frame_ws))

print(frame_ws[49].shape)

And now we can generate images by using the `G.synthesis` network

In [None]:
outdir = '/content/output-frames-w/'
os.makedirs(outdir, exist_ok=True)

for idx, w in enumerate(frame_ws): 
    img = G.synthesis(w, noise_mode=noise_mode, force_fp32=True)
    img = (img.permute(0, 2, 3, 1) * 127.5 + 128).clamp(0, 255).to(torch.uint8)
    PIL.Image.fromarray(img[0].cpu().numpy(), 'RGB').save(f'{outdir}/frame-{idx:04d}.png')

In [None]:
!ffmpeg -framerate 24 -i /content/output-frames-w/frame-%04d.png -vcodec libx264 -pix_fmt yuv420p /content/lerp-w.mp4