## **Latent Vector Analysis**

We ran experiments that embed a variety of images into latent vectors with Image2Style, and this notebook analyzes the results.

In [None]:
import os

import torch
import torch.nn as nn

import torchvision 
from torchvision import transforms
from torchvision.utils import make_grid
from torchvision.transforms import ToPILImage
from torchvision.utils import save_image
from collections import OrderedDict
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np
import cv2

In [None]:
os.chdir('PyTorch-StyleGAN-Face-Editting')
os.getcwd()

In [None]:
from stylegan_model import G_mapping
from stylegan_model import G_synthesis

device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
print(device)
resolution = 1024
weight_file = 'weights/karras2019stylegan-ffhq-1024x1024.pt'

g_all = nn.Sequential(OrderedDict([
    ('g_mapping', G_mapping()),
    ('g_synthesis', G_synthesis(resolution=resolution))    
]))
g_all.load_state_dict(torch.load(weight_file, map_location=device))
g_all.eval()
g_all.to(device)
 
g_mapping, g_synthesis = g_all[0], g_all[1]

In [None]:
os.chdir('../')
os.getcwd()

In [None]:
# Read an image from a file
def image_reader(image_path, resize=None):
    with open(image_path, "rb") as f:
        image = Image.open(f)
        image = image.convert("RGB")
    if resize is not None:
        image = image.resize((resize, resize))
    transform = transforms.Compose([
        transforms.ToTensor() # [0, 1]
    ])
    image = transform(image)
    image = image.unsqueeze(0) # (N, C, H, W)
    return image

# Return a side-by-side grid: the original image next to the image synthesized from its embedded latent vector
def make_grid_function(img_path, latent_path):
    img_orig = image_reader(img_path, resize=1024)
    latent_code = torch.tensor(np.load(latent_path)).to(device)
    with torch.no_grad():  # inference only, so skip gradient tracking
        generated_image = g_synthesis(latent_code)
    generated_image = (generated_image + 1.0) / 2.0
    generated_image = generated_image.clamp(0, 1)
    final_result = torch.cat([img_orig, generated_image.cpu()])
    # Both images are already in [0, 1], so make_grid needs no extra normalization
    result_image = ToPILImage()(make_grid(final_result, padding=0))

    return result_image

In [None]:
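# Keys double as file names: 고윤정 = Go Youn-jung, 공유 = Gong Yoo, 디카프리오 = DiCaprio, 수지 = Suzy, 아이유 = IU,
# 엠마스톤 = Emma Stone, 오바마 = Obama, 제니 = Jennie, 로제 = Rose, 푸린 = Jigglypuff, 호날두 = Ronaldo,
# 휴잭맨 = Hugh Jackman, 조커 = Joker
img_names = ['고윤정', '공유', '디카프리오', '수지', '아이유', '엠마스톤', '오바마_1', '오바마_2', '제니', '로제', '푸린', '호날두', '휴잭맨', 'cat', '조커', 'tiger_cartoon', 'Jablonski']

# Return the path of the original image and the path of its corresponding latent vector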
def return_paths(img_name):
    img_path = os.path.join("PyTorch-StyleGAN-Face-Editting", "images", img_name+'.jpg')
    latent_path = os.path.join("latent vectors", img_name + '_latent.npy')

    return img_path, latent_path


In [None]:
# Test
img_path, latent_path = return_paths('고윤정')
print(img_path)
print(latent_path)
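
Because every later cell depends on these paths, it can help to check up front which names are missing an image or a latent file. The helper below is only a sketch and not part of the original notebook: `find_missing` is a hypothetical name, and it takes the path function as an argument so that the example stays self-contained.

```python
import os

def find_missing(names, path_fn):
    """Return {name: [missing paths]} for names whose image or latent file is absent."""
    missing = {}
    for name in names:
        # path_fn is expected to return (img_path, latent_path), like return_paths
        absent = [p for p in path_fn(name) if not os.path.exists(p)]
        if absent:
            missing[name] = absent
    return missing
```

In this notebook it could be called as `find_missing(img_names, return_paths)`; an empty dictionary would mean every image and latent file was found.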

### **Good case**

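Empirically, the cases with **good results** tend to share the following characteristics.

1. The subject is Western.
2. The input image has almost no noise and is of good quality.
3. The face takes up roughly 80-90% of the photo (hands, other body parts, and background are barely visible).
4. Frontal shots reconstruct better than profile or side views.

Good cases: Emma Stone, Obama 1, Hugh Jackman, celeba1, Jennie, cat, Joker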
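
Criterion 3 above suggests that cropping tightly around the face may help before embedding. A minimal sketch, assuming a face bounding box is already available from some detector (this notebook does not include one): the hypothetical `square_crop_box` helper computes a square crop in which the face spans a chosen fraction of the side, kept inside the image bounds. The returned tuple can be passed to PIL's `Image.crop`.

```python
def square_crop_box(face_box, image_size, face_ratio=0.85):
    """Compute a square crop (left, upper, right, lower) around a face.

    face_box   : (left, upper, right, lower) of the detected face, in pixels
    image_size : (width, height) of the full image
    face_ratio : fraction of the crop side that the longer face edge should span
    """
    left, upper, right, lower = face_box
    width, height = image_size
    # Side length such that the longer face edge covers `face_ratio` of it,
    # capped so the square still fits inside the image.
    side = max(right - left, lower - upper) / face_ratio
    side = int(round(min(side, width, height)))
    # Center the square on the face, then shift it back inside the image.
    cx, cy = (left + right) / 2, (upper + lower) / 2
    x0 = max(0, min(int(round(cx - side / 2)), width - side))
    y0 = max(0, min(int(round(cy - side / 2)), height - side))
    return (x0, y0, x0 + side, y0 + side)
```

The default `face_ratio=0.85` is simply the midpoint of the 80-90% range noted above, not a tuned value.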

In [None]:
img_path, latent_path = return_paths('엠마스톤')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('오바마_1')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('Jablonski')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('제니')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('로제')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('조커')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

### **Bad case**

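Empirically, the cases with **bad results** tend to share the following characteristics.

1. Most of the subjects are East Asian. StyleGAN itself was pretrained on FFHQ, and we suspect the dataset lacked images that represent East Asian facial features well (the eyes in particular).
2. The input image is noisy and of poor quality.
3. The photo contains many elements besides the face (e.g. a busy background or clothing).
4. The eyes in particular are not generated naturally.
5. Pokémon: as with item 1, these are most likely images the model never saw during pretraining.

Bad cases: Go Youn-jung, Gong Yoo, Suzy, IU, Obama 2, Jigglypuff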
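
The good/bad split above is based on visual inspection. One way to back it with a number, offered here only as a sketch and not as a metric used in these experiments, is the peak signal-to-noise ratio between the original and the reconstruction. The hypothetical `psnr` helper below assumes both images are arrays with values in [0, 1], such as the output of `image_reader` and the clamped generator output after conversion to NumPy.

```python
import numpy as np

def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images in [0, max_value]."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((original - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))
```

Higher values indicate a closer pixel-wise reconstruction. PSNR does not capture perceptual similarity, so it complements the visual comparison rather than replacing it.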

In [None]:
img_path, latent_path = return_paths('고윤정')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('수지')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('공유')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('호날두')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

In [None]:
img_path, latent_path = return_paths('푸린')
result_image = make_grid_function(img_path, latent_path)
w, h = result_image.size  # PIL returns (width, height)
result_image.resize((w // 2, h // 2))

### **Latent Interpolation**

Using the latent vectors that reconstructed well above, we run latent interpolation between pairs of images.
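
The interpolation in the cells that follow is a plain linear blend of two latents, `w = latent1 * (1 - a) + latent2 * a`. As a small self-contained sketch, random NumPy arrays stand in for the saved latents, with an assumed shape of (1, 18, 512) (the usual W+ layout for a 1024x1024 StyleGAN; the `.npy` files used here may differ). The hypothetical `interpolate_latents` helper returns evenly spaced blends that include both endpoints:

```python
import numpy as np

def interpolate_latents(w_start, w_end, num_steps):
    """Return `num_steps` linear blends from w_start to w_end, endpoints included."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2 to include both endpoints")
    # Plain Python floats keep the dtype of the input arrays unchanged.
    alphas = np.linspace(0.0, 1.0, num_steps).tolist()
    return [(1.0 - a) * w_start + a * w_end for a in alphas]

# Stand-in latents; real ones would come from np.load(latent_path).
rng = np.random.default_rng(0)
w1 = rng.standard_normal((1, 18, 512)).astype(np.float32)
w2 = rng.standard_normal((1, 18, 512)).astype(np.float32)
steps = interpolate_latents(w1, w2, num_steps=7)
```

Each element of `steps` has the same shape as the inputs, so it could be converted with `torch.tensor(...)` and passed to `g_synthesis` in the same way as a loaded latent.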

#### Rose to Emma Stone

In [None]:
img_path1, latent_path1 = return_paths('로제')
img_path2, latent_path2 = return_paths('엠마스톤')

In [None]:
latent1 = torch.tensor(np.load(latent_path1)).to(device)  # Rose
latent2 = torch.tensor(np.load(latent_path2)).to(device)  # Emma Stone

In [None]:
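# Image interpolation experiment
num_imgs = 7  # number of interpolation steps
out_dir = "latent Interpolation/Rose_to_Emma"
os.makedirs(out_dir, exist_ok=True)  # save_image does not create missing directories
with torch.no_grad():
    for i in range(num_imgs):
        a = i / (num_imgs - 1)  # runs from 0 to 1 inclusive, so the last frame is latent2
        w = latent1 * (1 - a) + latent2 * a
        syn_img = g_synthesis(w)
        syn_img = (syn_img + 1.0) / 2.0
        save_image(syn_img.clamp(0, 1), os.path.join(out_dir, "Rose_to_Emma{}.png".format(i)))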

In [None]:
# Reload the saved frames in order and stack them into one batch
frames = []
for idx in range(num_imgs):
    total_path = os.path.join('latent Interpolation/Rose_to_Emma', 'Rose_to_Emma' + str(idx) + '.png')
    frames.append(image_reader(total_path, resize=1024))
inter_imgs = torch.cat(frames, dim=0)

print(inter_imgs.shape)
result_image = ToPILImage()(make_grid(inter_imgs))
result_image

#### Rose to Jennie

In [None]:
img_path1, latent_path1 = return_paths('로제')
img_path2, latent_path2 = return_paths('제니')

In [None]:
latent1 = torch.tensor(np.load(latent_path1)).to(device)  # Rose
latent2 = torch.tensor(np.load(latent_path2)).to(device)  # Jennie

In [None]:
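# Image interpolation experiment
num_imgs = 7  # number of interpolation steps
out_dir = "latent Interpolation/Rose_to_Jennie"
os.makedirs(out_dir, exist_ok=True)  # save_image does not create missing directories
with torch.no_grad():
    for i in range(num_imgs):
        a = i / (num_imgs - 1)  # runs from 0 to 1 inclusive, so the last frame is latent2
        w = latent1 * (1 - a) + latent2 * a
        syn_img = g_synthesis(w)
        syn_img = (syn_img + 1.0) / 2.0
        save_image(syn_img.clamp(0, 1), os.path.join(out_dir, "Rose_to_Jennie{}.png".format(i)))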

In [None]:
# Reload the saved frames in order and stack them into one batch
frames = []
for idx in range(num_imgs):
    total_path = os.path.join('latent Interpolation/Rose_to_Jennie', 'Rose_to_Jennie' + str(idx) + '.png')
    frames.append(image_reader(total_path, resize=1024))
inter_imgs = torch.cat(frames, dim=0)

print(inter_imgs.shape)
result_image = ToPILImage()(make_grid(inter_imgs))
result_image

#### Obama to Joker

In [None]:
img_path1, latent_path1 = return_paths('오바마_1')
img_path2, latent_path2 = return_paths('조커')

In [None]:
latent1 = torch.tensor(np.load(latent_path1)).to(device)  # Obama
latent2 = torch.tensor(np.load(latent_path2)).to(device)  # Joker

In [None]:
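# Image interpolation experiment
num_imgs = 7  # number of interpolation steps
out_dir = "latent Interpolation/Obama_to_Joker"
os.makedirs(out_dir, exist_ok=True)  # save_image does not create missing directories
with torch.no_grad():
    for i in range(num_imgs):
        a = i / (num_imgs - 1)  # runs from 0 to 1 inclusive, so the last frame is latent2
        w = latent1 * (1 - a) + latent2 * a
        syn_img = g_synthesis(w)
        syn_img = (syn_img + 1.0) / 2.0
        save_image(syn_img.clamp(0, 1), os.path.join(out_dir, "Obama_to_Joker{}.png".format(i)))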

In [None]:
# Reload the saved frames in order and stack them into one batch
frames = []
for idx in range(num_imgs):
    total_path = os.path.join('latent Interpolation/Obama_to_Joker', 'Obama_to_Joker' + str(idx) + '.png')
    frames.append(image_reader(total_path, resize=1024))
inter_imgs = torch.cat(frames, dim=0)

print(inter_imgs.shape)
result_image = ToPILImage()(make_grid(inter_imgs))
result_image