# The Parker Puzzle: Other Cut-ups

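In this experiment, the cut, warp, and reassemble operations on the original image are expressed by modifying the UV map. The overall code follows `parker_puzzle_colab.ipynb`, but the UV map is generated locally. First, a definition of the UV map:

A UV map is a three-channel RGB image (saved in `png` format) whose channels have the following meaning:

- R: the u coordinate, in the range [0, 1];
- G: the v coordinate, in the range [0, 1];
- B: usually 0.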
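
To make the channel layout concrete, here is a minimal NumPy sketch of how such a map could be built and used. It assumes that u runs left to right, v runs top to bottom, and that pixels are looked up with nearest-neighbor sampling; `identity_uv_map` and `apply_uv_map_nearest` are hypothetical helpers written for illustration, and `rp.apply_uv_map` may use a different convention or interpolation.

```python
import numpy as np

def identity_uv_map(height, width):
    """UV map that sends every pixel to itself (R=u, G=v, B=0), values in [0, 1]."""
    u = np.linspace(0.0, 1.0, width)   # assumption: u runs left to right
    v = np.linspace(0.0, 1.0, height)  # assumption: v runs top to bottom
    uu, vv = np.meshgrid(u, v)         # both have shape (height, width)
    return np.stack([uu, vv, np.zeros_like(uu)], axis=-1)

def apply_uv_map_nearest(image, uv_map):
    """For each output pixel, copy the source pixel that the UV map points to."""
    src_h, src_w = image.shape[:2]
    cols = np.rint(uv_map[..., 0] * (src_w - 1)).astype(int)
    rows = np.rint(uv_map[..., 1] * (src_h - 1)).astype(int)
    return image[rows, cols]

image = np.arange(4 * 6 * 3).reshape(4, 6, 3)
uv = identity_uv_map(4, 6)
same = apply_uv_map_nearest(image, uv)          # identity map leaves the image unchanged

flipped_uv = uv.copy()
flipped_uv[..., 0] = 1.0 - flipped_uv[..., 0]   # reverse u to mirror horizontally
mirrored = apply_uv_map_nearest(image, flipped_uv)
```

Any rearrangement of the image, including a cut-up puzzle, is then just a different choice of (u, v) values per pixel.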

Note that the UV map files are stored in this repository, so you need to clone my fork rather than the original repository.

In [None]:
%%bash
if [ ! -d ".git" ]; then 
    rm -rf * .*; #Get rid of Colab's default junk files
    git clone -b master https://github.com/lh314-pku/Diffusion-Illusions .
fi

In [None]:
%pip install --upgrade -r requirements.txt
%pip install rp --upgrade

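As for the prompts: the model was most likely not trained on a Chinese corpus, so they are still written in English. A prompt should describe the subject, the style, and the setting of the image.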

In [None]:
#SET YOUR PROMPTS HERE

prompt_a = "A ceramic coffee mug with handle viewed from side and steam rising. Beautiful 4k photography unreal engine 3d "
prompt_b = "A donut. A shiny frosted chocolate donut. Beautiful 4k photography unreal engine 3d"


#Optional: Specify what you DON'T want to see
negative_prompt = 'blurry ugly'

print()
print('Negative prompt:',repr(negative_prompt))
print()
print('Chosen prompts:')
print('    prompt_a =', repr(prompt_a)) #This will be the original (unsolved) arrangement
print('    prompt_b =', repr(prompt_b)) #This will be the arrangement after applying uv_map_b (solved)

In [None]:
from rp import *
import numpy as np
import rp
import torch
import torch.nn as nn
import source.stable_diffusion as sd
from easydict import EasyDict
from source.learnable_textures import LearnableImageFourier
from source.stable_diffusion_labels import NegativeLabel
from itertools import chain
import time

In [None]:
if 's' not in dir():
    #You can select the original Stable Diffusion 1.5 or some dreambooth of it
    # model_name="CompVis/stable-diffusion-v1-4"
    model_name="runwayml/stable-diffusion-v1-5"
    # model_name="nitrosocke/Arcane-Diffusion"
    gpu=rp.select_torch_device()
    s=sd.StableDiffusion(gpu,model_name)
device=s.device

In [None]:
label_a = NegativeLabel(prompt_a,negative_prompt)
label_b = NegativeLabel(prompt_b,negative_prompt)

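This is, in my view, one of the highlights of the paper: the implicit representation of the image. Treating an image as a mapping from coordinates to colors leads naturally to representing it with a Fourier feature network (FFN) and optimizing that network instead of the pixels.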
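
The idea can be illustrated with a small NumPy sketch. This is not the implementation of `LearnableImageFourier`, whose internals are not shown in this notebook; it only assumes the general Fourier feature recipe (random frequencies, sin/cos encoding, a small MLP), and the layer sizes below are illustrative values chosen to keep the example tiny. The weights here are random and untrained, whereas in the notebook they would be updated by the optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_features(coords, freqs):
    """Map (N, 2) coordinates to (N, 2 * num_features) sin/cos features."""
    angles = 2.0 * np.pi * coords @ freqs.T
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

height = width = 8
num_features, hidden_dim, scale = 16, 32, 10.0  # small illustrative values

# Pixel coordinates in [0, 1]^2, one row per pixel
ys, xs = np.meshgrid(np.linspace(0, 1, height), np.linspace(0, 1, width), indexing="ij")
coords = np.stack([xs.ravel(), ys.ravel()], axis=-1)

# Fixed random frequencies; a larger `scale` makes fine detail easier to represent
freqs = rng.normal(size=(num_features, 2)) * scale
features = fourier_features(coords, freqs)

# A tiny untrained MLP: features -> hidden layer -> RGB in (0, 1)
w1 = rng.normal(size=(2 * num_features, hidden_dim)) / np.sqrt(2 * num_features)
w2 = rng.normal(size=(hidden_dim, 3)) / np.sqrt(hidden_dim)
hidden = np.maximum(features @ w1, 0.0)        # ReLU
rgb = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # sigmoid keeps colors in (0, 1)
image = rgb.reshape(height, width, 3)
```

Because the image is a function of continuous coordinates, the learnable parameters are the network weights, not individual pixel values.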

In [None]:
#Image Parametrization and Initialization (this section takes vram)

#Select Learnable Image Size (this has big VRAM implications!):
#Note: We use implicit neural representations for better image quality
#They're previously used in our paper "TRITON: Neural Neural Textures make Sim2Real Consistent" (see tritonpaper.github.io)
# ... and that representation is based on Fourier Feature Networks (see bmild.github.io/fourfeat)
learnable_image_maker = lambda: LearnableImageFourier(height=256,width=256,num_features=256,hidden_dim=256,scale=10).to(s.device);SIZE=256

image=learnable_image_maker()

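The `uv_map_b` used here can be replaced with a different UV map. The ones currently available are:

- `voronoi_uv.png`: partitions the image plane according to a Voronoi diagram, then stretches and reassembles the cells.
- `tangram_uv.png`: cuts the image into tangram pieces and reassembles them into a heart shape.

These UV maps are generated with Python; the script is located at `UVs/UV_map_generate.py`.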
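
For readers who want to design their own cut-up, the sketch below builds a much simpler UV map than the Voronoi or tangram ones. It does not reproduce `UVs/UV_map_generate.py`; `quadrant_swap_uv_map` is a hypothetical example in which each quadrant samples from the diagonally opposite quadrant, again assuming u runs along columns and v along rows.

```python
import numpy as np

def quadrant_swap_uv_map(size):
    """Hypothetical cut-up: each quadrant samples from the diagonally opposite one."""
    coords = (np.arange(size) + size // 2) % size   # shift indices by half the image
    u = coords / (size - 1)                         # normalize to [0, 1]
    uu, vv = np.meshgrid(u, u)                      # assumption: u along columns, v along rows
    uv = np.stack([uu, vv, np.zeros_like(uu)], axis=-1)
    return np.rint(uv * 255).astype(np.uint8)       # 8-bit RGB array, suitable for a PNG

uv_png = quadrant_swap_uv_map(256)
```

The array can then be written to a PNG with any image library and loaded in place of `UVs/tangram_uv.png`. If the map is stored with 8 bits per channel, each axis can only address 256 distinct positions, which happens to match the 256×256 learnable image used in this notebook.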

In [None]:
import torch
import torch.nn.functional as F

#Load a locally generated UV map (swap in "UVs/voronoi_uv.png" to try the Voronoi cut-up)
uv_map_b = rp.load_image("UVs/tangram_uv.png")
uv_map_a = rp.get_identity_uv_map(*rp.get_image_dimensions(uv_map_b))

rp.display_image(uv_map_a)
rp.display_image(uv_map_b)

learnable_image_a = lambda: rp.apply_uv_map(image(), uv_map_a)
learnable_image_b = lambda: rp.apply_uv_map(image(), uv_map_b)

optim=torch.optim.SGD(image.parameters(),lr=1e-4)

In [None]:
labels=[label_a,label_b]
learnable_images=[learnable_image_a,learnable_image_b]

#The weight coefficients for each prompt. For example, if we have [0,1], then only prompt_b (the rearranged image) will be optimized
weights=[1,1]

weights=rp.as_numpy_array(weights)
weights=weights/weights.sum()
weights=weights*len(weights)
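
The three normalization lines above rescale the weights so that they average to 1. A small standalone sketch of the same arithmetic, using plain NumPy instead of `rp.as_numpy_array` (`normalize_weights` is a hypothetical helper name):

```python
import numpy as np

def normalize_weights(weights):
    """Rescale weights so they average to 1, i.e. sum to the number of prompts."""
    weights = np.asarray(weights, dtype=float)
    return weights / weights.sum() * len(weights)

equal = normalize_weights([1, 1])    # [1.0, 1.0]: both prompts weighted equally
skewed = normalize_weights([1, 3])   # [0.5, 1.5]: prompt_b counts three times as much
only_b = normalize_weights([0, 1])   # [0.0, 2.0]: prompt_a gets a zero coefficient
```

Since the weight multiplies `noise_coef` in the training loop, keeping the average at 1 means that changing the ratio between prompts does not change the overall step size.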

In [None]:
#For saving a timelapse
ims=[]

In [None]:
def get_display_image():
    return rp.tiled_images(
        [
            rp.as_numpy_image(learnable_image_a()),
            rp.as_numpy_image(learnable_image_b()),
        ],
        length=len(learnable_images),
        border_thickness=0,
    )

In [None]:
NUM_ITER=5000

#Set the minimum and maximum noise timesteps for the dream loss (aka score distillation loss)
s.max_step=MAX_STEP=990
s.min_step=MIN_STEP=10 

television = rp.JupyterDisplayChannel()
television.display()

display_eta=rp.eta(NUM_ITER, title='Status')

DISPLAY_INTERVAL = 200

print('Every %i iterations we display an image in the form [image_a, image_b], where'%DISPLAY_INTERVAL)
print('    image_a = (the original, unsolved arrangement)')
print('    image_b = (image_a rearranged by uv_map_b)')
print()
print('Interrupt the kernel at any time to return the currently displayed image')
print('You can run this cell again to resume training later on')
print()
print('Please expect this to take quite a while to get good images (especially on the slower Colab GPUs)! The longer you wait the better they\'ll be')

try:
    for iter_num in range(NUM_ITER):
        display_eta(iter_num) #Print the remaining time

        preds=[]
        for label,learnable_image,weight in rp.random_batch(list(zip(labels,learnable_images,weights)), batch_size=1):
            pred=s.train_step(
                label.embedding,
                learnable_image()[None],

                #PRESETS (uncomment one):
                noise_coef=.1*weight,guidance_scale=100,#10
                # noise_coef=0,image_coef=-.01,guidance_scale=50,
                # noise_coef=0,image_coef=-.005,guidance_scale=50,
                # noise_coef=.1,image_coef=-.010,guidance_scale=50,
                # noise_coef=.1,image_coef=-.005,guidance_scale=50,
                # noise_coef=.1*weight, image_coef=-.005*weight, guidance_scale=50,
            )
            preds+=list(pred)

        with torch.no_grad():
            if iter_num and not iter_num%(DISPLAY_INTERVAL*50):
                #Wipe the slate every 50 displays so they don't get cut off
                from IPython.display import clear_output
                clear_output()

            if not iter_num%(DISPLAY_INTERVAL//4):
                im = get_display_image()
                ims.append(im)
                television.update(im)
                
                if not iter_num%DISPLAY_INTERVAL:
                    rp.display_image(im)

        optim.step()
        optim.zero_grad()
except KeyboardInterrupt:
    print()
    print('Interrupted early at iteration %i'%iter_num)
    im = get_display_image()
    ims.append(im)
    rp.display_image(im)

In [None]:
print('Unsolved Image:')
rp.display_image(rp.as_numpy_image(learnable_image_a()))

print('Solved Image:')
rp.display_image(rp.as_numpy_image(learnable_image_b()))

In [None]:
def save_run(name):
    folder="untracked/parker_puzzle_runs/%s"%name
    if rp.path_exists(folder):
        folder+='_%i'%time.time()
    rp.make_directory(folder)
    ims_names=['ims_%04i.png'%i for i in range(len(ims))]
    with rp.SetCurrentDirectoryTemporarily(folder):
        rp.save_images(ims,ims_names,show_progress=True)
    print()
    print('Saved timelapse to folder:',repr(folder))
    
save_run('-'.join([prompt_a,prompt_b])) #You can give it a good custom name if you want!