Copyright 2021 DeepMind Technologies Limited

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at [https://www.apache.org/licenses/LICENSE-2.0](https://www.apache.org/licenses/LICENSE-2.0).
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


# Arnheim 3 - Collage

**Piotr Mirowski, Dylan Banarse, Mateusz Malinowski, Yotam Doron, Oriol Vinyals, Simon Osindero, Chrisantha Fernando**

DeepMind, 2021

![picture](https://github.com/deepmind/arnheim/raw/main/images/arnheim3_examples.png)
Clockwise from top left: "Sri Lankan objects" (200 transparent patches); "Waves" (70 masked transparent patches with background); "Fruit bowl" (100 opaque patches); "Fruit bowl" (100 transparent patches); "Face" (7 opaque patches); "Swans" (100 masked transparent patches); "Chicken" (70 masked transparent patches); "Dancer" (40 transparent patches). See description in the [videos](https://www.youtube.com/watch?v=HKDQsrO5xF4&list=PLKhLdFXp1JN5SEV56w9OWWsT5pAz9z7G_) for settings.

##An Exploration of Architectures and Losses for Painting and Drawing

Arnheim 3 is an algorithm which generates collages by training (by gradient descent) a network that applies affine transformations, i.e translation, scaling, rotation, and shear, to a set of image patches. The set of image patches is subject to evolution in the outer optimisation loop. 

The signal for how good an image is comes from CLIP, a text-image dual encoder. This work simplifies and extends Arnheim 2 which also used CLIP but generated SVG strokes using a more complex hierarchical stroke grammar. 

Here you can experiment with a variety of rendering methods for combining patches in a learnable way.

##Quickstart
1. Click "Connect" in the top right corner.
1. Select __"Runtime -> Run all"__.

Play around:
* Experiment with basic settings in the the __Configure Collage__ cell.
* Under the __Advanced Parameters__ heading are several sections for more detailed control over collage creatio. Read the paper for insight into the different settings.
* After changing an setting, select menu option __"Runtime -> Run after"__ to run all subsequent cells to generate a collage.

**Note that the Colab can easily run out of memory with large populations, many patches and large patch sizes! If you start to encounter CUDA memory issues, try lowering the number of patches and restarting the Colab.**

#More details


##New Features
1. Tiling

  Multiple images can now be tiled to create arbitrary large images. The individual images (referred to as *tiles*) are drawn sequentially starting at the top left. All the tiles overlap each other so the drawing process can blend content of neighbouring tiles. 

1. Compositional Images

  Uses 3x3 prompts covering over-lapping regions of the image to specify different content across the whole image. The main prompt guides the direction of overall image.

1. Coloured Background

  User-selectable background colour or use of uploaded images.

1. Interactive Patch Placement

  Stop the "Create collage loop" cell at any time and run the "Tinker with patches" cell below it to adjust individual patches with sliders. Then re-run the "Create collage loop" cell to continue generation.

##Tips

**Compositional** uses 10 parallel CLIP evaluators; nine in a 3x3 overlapping configuration covering the image, and the tenth evaluating the whole image. Each region of the image can be given a different prompt, together with the global prompt. For example, a global prompt of "A realistic landscape" can be combined with the following local prompts and settings:
* NUM_PATCHES = 70
* COMPOSITIONAL_IMAGE = ON
* PROMPT_X0_Y0 = "a photorealistic sky with sun"
* PROMPT_X1_Y0 = "a photorealistic sky"
* PROMPT_X2_Y0 = "a photorealistic sky with moon"
* PROMPT_X0_Y1 = "a photorealistic tree"
* PROMPT_X1_Y1 = "a photorealistic tree"
* PROMPT_X2_Y1 = "a photorealistic tree"
* PROMPT_X0_Y2 = "a photorealistic field"
* PROMPT_X1_Y2 = "a photorealistic field"
* PROMPT_X2_Y2 = "a photorealistic chicken"

This process is more memory intensive so reducing the number of patches per tile helps avoid out of memory errors. 

**Tiling** produces hard edges if patches go outside the tile canvas. To alleviate this restrict the patch translation and keep them relatively small, try using these settings:

* MIN_TRANS = -0.66
* MAX_TRANS = 0.8
* PATCH_MAX_PROPORTION = 5
* FIXED_PATCH_SCALE = OFF

**opacity rendering** uses alpha and depth to render semi-opaque overlapping patches which allow gradients to be used during learning. The translucency is reduced over the course of learning to end with opaque patches. When using a small number of patches evolution can perform better than learning alone. For example, with only 7 patches, a population of 10 with the Evolutionary Strategies method applied at every step can yield good results. The settings to get the face image above were:

* GLOBAL_PROMPT = “Face”
* 7 patches
* opacity
* 400 steps
* ES evolution every step
* POP_SIZE = 10
* ROT_POS_MUTATION = 0.05
* SCALE_MUTATION = 0.02
* PATCH_MUTATION = 0.2

**Transparency rendering** works well as gradients are more effective. Note that colours are additive so setting INITIAL_MIN_RGB=0.1 and INITIAL_MAX_RGB=0.5 helps reduce bleaching. Something to try could be:

* 80 patches
* Transparency
* INITIAL_MIN_RGB = 0.1
* INITIAL_MAX_RGB = 0.5
* 15000 steps
* Microbial GA every 100 steps
* POP_SIZE = 2
* LEARNING_RATE = 0.07
* PATCH_MUTATION_PROBABILITY = 1
* GLOBAL_PROMPT = "Swans on a pond"

**Masked Transparency rendering** also works well as gradients are more effective.  Something to try could be:

* 200 patches
* Masked Transparency clipped
* INITIAL_MIN_RGB = 0.7
* INITIAL_MAX_RGB = 1.0
* 15000 steps
* Microbial GA every 100 steps
* POP_SIZE = 2
* LEARNING_RATE = 0.07
* PATCH_MUTATION_PROBABILITY = 1
* GLOBAL_PROMPT = "Swans on a pond"


# Preliminaries

In [1]:
#@title Installation of libraries {vertical-output: true}

!nvidia-smi -L

import subprocess
import os

os.environ['CUDA_VISIBLE_DEVICES'] = "1" ##

CUDA_version = [s for s in subprocess.check_output(["nvcc", "--version"]).decode("UTF-8").split(", ") if s.startswith("release")][0].split(" ")[-1]
print("CUDA version:", CUDA_version)

if CUDA_version == "10.0":
  torch_version_suffix = "+cu100"
elif CUDA_version == "10.1":
  torch_version_suffix = "+cu101"
elif CUDA_version == "10.2":
  torch_version_suffix = ""
else:
  torch_version_suffix = "+cu110"

# %cd /content/
# !pip install cssutils
# !pip install torch-tools
# !pip install visdom
# !pip install kornia==0.6.0
# !pip install ftfy regex tqdm
# !pip install git+https://github.com/openai/CLIP.git --no-deps
# !pip3 install imageio==2.4.1
# !pip3 install moviepy
# !pip install -U scikit-image
# !pip install opencv-python ##
# #!pip install torch==1.8.1+cu101 torchvision==0.9.1+cu101 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
# !pip install torch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1
# !git clone https://github.com/deepmind/arnheim.git
# !pip3 install zipfile

GPU 0: TITAN Xp (UUID: GPU-83b4675b-e839-4487-c036-931039191c0b)
GPU 1: TITAN Xp (UUID: GPU-2dd9ecde-77e9-844c-4c86-623340e1135d)
GPU 2: TITAN Xp (UUID: GPU-395cfbd7-7f65-c5e8-b09d-b25c63e831f8)
GPU 3: TITAN Xp (UUID: GPU-f0eec441-be13-aa19-6ff0-8bb6a4439571)
GPU 4: TITAN Xp (UUID: GPU-d6bd63fb-96c8-694d-a7d7-137bdc963d6f)
GPU 5: TITAN Xp (UUID: GPU-b3d839de-edde-165b-0f9a-d90b9182e701)
GPU 6: TITAN Xp (UUID: GPU-1c4af742-41e2-c232-057a-9ac7d0c26539)
GPU 7: TITAN Xp (UUID: GPU-809be035-99e1-e279-1ddc-8a05839ac090)
GPU 8: TITAN Xp (UUID: GPU-7ffb3ec2-5ab4-d00f-522c-177545db431e)
GPU 9: TITAN Xp (UUID: GPU-b90bff4a-8021-2344-2e6e-b7b2e079aa1d)
CUDA version: 10.1


## Imports and libraries

In [2]:
#@title Imports {vertical-output: true}
import clip
import copy
import cv2
import datetime
import glob
# from google.colab import drive
# from google.colab import files
# from google.colab.patches import cv2_imshow
import matplotlib.pyplot as plt #for server
import imageio
import io
from kornia.color import hsv
from matplotlib import pyplot as plt
import moviepy.editor as mvp
from moviepy.video.io.ffmpeg_writer import FFMPEG_VideoWriter
import numpy as np
# import os
import pathlib
import random
import requests
from skimage.transform import resize
import time
import torch, gc
import torch.nn.functional as F
import torchvision.transforms as transforms
import yaml
import zipfile ##
os.environ["FFMPEG_BINARY"] = "ffmpeg"
print("Torch version:", torch.__version__)

import arnheim_3.src #arnheim. 생략
import arnheim_3.src.collage as collage #arnheim. 생략
import arnheim_3.src.video_utils as video_utils #arnheim. 생략


Torch version: 1.12.1+cu102


In [3]:
#@title Initialise and load CLIP model {vertical-output: true}

torch_device="cuda"
device = torch.device(torch_device)
CLIP_MODEL = "ViT-B/32"
print(f"Downloading CLIP model {CLIP_MODEL}...")
clip_model, _ = clip.load(CLIP_MODEL, device, jit=False)

LOCAL_PATCH_SET = None
LOCAL_PATCH_FILE = None

!nvidia-smi

Downloading CLIP model ViT-B/32...
Fri Jan 13 23:05:01 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  TITAN Xp            Off  | 00000000:07:00.0 Off |                  N/A |
| 26%   46C    P2    60W / 250W |   9873MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN Xp            Off  | 00000000:08:00.0 Off |                  N/A |
| 23%   37C    P2    59W / 250W |    983MiB / 12196MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  TITAN Xp            Off  | 00000000:0B:00.0 Of

# Configure Collage

In [4]:
#@markdown ##Prompt
#@markdown Enter a **global** description of the image, e.g. 'a photorealistic chicken':
# PROMPT = "A photorealistic chicken"  #@param {type:"string"}

GLOBAL_PROMPT = "a picture of among us player"  #@param {type:"string"}

#@markdown ##Rendering
#@markdown * **opacity** - patches are mostly opaque.
#@markdown * **masked_transparency_clipped** - blended patches appear opaque on background.
#@markdown * **transparency** - colours are added so black is transparent.
#@markdown * **masked transparency normed** - very translucent blending.
RENDER_METHOD = "transparency"  #@param ["opacity", "masked_transparency_clipped", "transparency", "masked_transparency_normed"]

#@markdown ##Patch settings
#@markdown Select a patch set  - select example sets or your own:
EXAMPLE_PATCH_SET = "Sea glass" #@param ['Fruit and veg', 'Sea glass', 'Handwritten MNIST', 'Animals', 'Waste', 'Human artefacts', 'Leaves', 'Broken plate', 'NONE OF ABOVE (use advanced options)']

#@markdown Number of patches to use in image:
NUM_PATCHES =   400 #@param {type:"integer"}

#**opacity** patches overlay each other using a combination of alpha and depth,
#**transparency** _adds_ patch colours (black therefore appearing transparent),
#**masked transparency normed** blends patches using a normalised alpha channel where areas of maximum patch overlap are opaque and all other areas are translucent.
#and **masked transparency clipped** blends patches using a clipped alpha channel where all regions with alpha > 1 are opaque.

#@markdown ##Optimisation steps
#@markdown More optimisation steps generally produce better results but take longer:
OPTIM_STEPS = 2000  #@param{type:"slider", min:200, max:20000, step:100}

#@markdown ##Monitor and visualisation
#@markdown How often to show progress image:
TRACE_EVERY =   400#@param {type:"integer"}
#@markdown How often to create a frame for video animation:
VIDEO_STEPS =   100#@param {type:"integer"}



# Advanced Settings

## Spatial and Colour Transforms

In [5]:
#@title Collage configuration
COLOUR_TRANSFORMATIONS = "RGB space"  #@param ["none", "RGB space", "HSV space"]
#@markdown Invert image colours to have a white background?
INVERT_COLOURS = False #@param {type:"boolean"}

CANVAS_WIDTH = 224 ##224
CANVAS_HEIGHT = 224 ##224
MULTIPLIER_BIG_IMAGE = 18 ##4

In [6]:
#@title Affine transform settings

#@markdown Initial translation bounds for X and Y.
INITIAL_MIN_TRANS = -1.0  #@param{type:"slider", min:-1.0, max:1.0, step:0.01}
INITIAL_MAX_TRANS = 1.0  #@param{type:"slider", min:-1.0, max:1.0, step:0.01}
#@markdown Translation bounds for X and Y.
MIN_TRANS = -1  #@param{type:"slider", min:-1.0, max:1.0, step:0.01}
MAX_TRANS = 1  #@param{type:"slider", min:-1.0, max:1.0, step:0.01}
#@markdown Scale bounds (> 1 means zoom out and < 1 means zoom in).
MIN_SCALE =   1#@param {type:"number"}
MAX_SCALE =   2#@param {type:"number"
#@markdown Bounds on ratio between X and Y scale (default 1).
MIN_SQUEEZE =   0.5#@param {type:"number"}
MAX_SQUEEZE =   2.0#@param {type:"number"}
#@markdown Shear deformation bounds (default 0).
MIN_SHEAR = -0.2  #@param{type:"slider", min:-1.0, max:1.0, step:0.01}
MAX_SHEAR = 0.2  #@param{type:"slider", min:-1.0, max:1.0, step:0.01}
#@markdown Rotation bounds.
MIN_ROT_DEG = -180 #@param{type:"slider", min:-180, max:180, step:1}
MAX_ROT_DEG = 180 #@param{type:"slider", min:-180, max:180, step:1}
MIN_ROT = MIN_ROT_DEG * np.pi / 180.0
MAX_ROT = MAX_ROT_DEG * np.pi / 180.0


In [7]:
#@title Colour transform settings

#@markdown RGB
MIN_RGB = -0.21  #@param {type:"slider", min: -1, max: 1, step: 0.01}
MAX_RGB = 1.0  #@param {type:"slider", min: 0, max: 1, step: 0.01}
INITIAL_MIN_RGB = 0.8  #@param {type:"slider", min: 0, max: 1, step: 0.01}
INITIAL_MAX_RGB = 0.9  #@param {type:"slider", min: 0, max: 1, step: 0.01}
#@markdown HSV
MIN_HUE = 0.  #@param {type:"slider", min: 0, max: 1, step: 0.01}
MAX_HUE_DEG = 360 #@param {type:"slider", min: 0, max: 360, step: 1}
MAX_HUE = MAX_HUE_DEG * np.pi / 180.0
MIN_SAT = 0.  #@param {type:"slider", min: 0, max: 1, step: 0.01}
MAX_SAT = 1.  #@param {type:"slider", min: 0, max: 1, step: 0.01}
MIN_VAL = 0.  #@param {type:"slider", min: -1, max: 1, step: 0.01}
MAX_VAL = 1.  #@param {type:"slider", min: 0, max: 1, step: 0.01}

## Optimisation Settings

In [8]:
#@title Training settings

# Reasonable defaults:
# OPTIM_STEP = 10000 to 20000
# LEARNING_RATE = 0.05
# NUM_AUGS = 16
# GRADIENT_CLIPPING = 10.0
# USE_NORMALIZED_CLIP = True

LEARNING_RATE = 0.05    #@param{type:"slider", min:0.0, max:0.6, step:0.01}
#@markdown Number of augmentations to use in evaluation:
USE_IMAGE_AUGMENTATIONS = True #@param{type:"boolean"}
NUM_AUGS = 16  #@param {type:"integer"}

#@markdown Normalize colours for CLIP, generally leave this as True:
USE_NORMALIZED_CLIP = False  #@param {type:"boolean"}

#@markdown Gradient clipping during optimisation:
GRADIENT_CLIPPING = 10.0  #@param {type:"number"}

#@markdown Initial random search size (1 means no search):
INITIAL_SEARCH_SIZE = 1 #@param {type:"slider", min:1, max:50, step:1}

#@markdown Number of gradient steps in initial search (1 means no optimisation):
INITIAL_SEARCH_NUM_STEPS = 1 #@param {type:"slider", min:1, max:100, step:1}


In [9]:
#@title Evolution settings

# Reasonable defaults:
# POP_SIZE = 2
# EVOLUTION_FREQUENCY = 100
# MUTION SCALES = ~0.1
# MAX_MULTIPLE_VISUALISATIONS = 7

#@markdown For evolution set POP_SIZE greater than 1:
POP_SIZE =    2  #@param{type:"slider", min:1, max:100}
EVOLUTION_FREQUENCY =  100#@param {type:"integer"}

#@markdown ### Genetic algorithm methods

#@markdown **Microbial** - loser of randomly selected pair is replaced by mutated winner. A low selection pressure.

#@markdown **Evolutionary Strategies** - mutantions of the best individual replace the rest of the population. Much higher selection pressure than Microbial GA.
GA_METHOD = "Microbial"  #@param ["Evolutionary Strategies", "Microbial"]
#@markdown ### Mutation levels
#@markdown Scale mutation applied to position and rotation, scale, distortion, colour and patch swaps.
POS_AND_ROT_MUTATION_SCALE = 0.02  #@param{type:"slider", min:0.0, max:0.3, step:0.01}
SCALE_MUTATION_SCALE = 0.02  #@param{type:"slider", min:0.0, max:0.3, step:0.01}
DISTORT_MUTATION_SCALE = 0.02  #@param{type:"slider", min:0.0, max:0.3, step:0.01}
COLOUR_MUTATION_SCALE = 0.02  #@param{type:"slider", min:0.0, max:0.3, step:0.01}
PATCH_MUTATION_PROBABILITY = 1  #@param{type:"slider", min:0.0, max:1.0, step:0.1}
#@markdown Limit the number of individuals shown during training.
MAX_MULTIPLE_VISUALISATIONS =   5#@param {type:"integer"}
#@markdown Save video of population sample over time.
POPULATION_VIDEO = True  #@param (type:"boolean")

USE_EVOLUTION = POP_SIZE > 1

In [10]:
# @title Saving images on Drive
#@markdown Displayed results can also be stored on Google Drive.
STORE_ON_GOOGLE_DRIVE = False  #@param {type:"boolean"}
GOOGLE_DRIVE_RESULTS_DIR = ""  #@param {type:"string"}

MOUNT_DIR = "/content/drive"

if STORE_ON_GOOGLE_DRIVE:
  from google.colab import drive
  drive.mount(MOUNT_DIR)
  DIR_RESULTS = pathlib.PurePath(MOUNT_DIR, "MyDrive", GOOGLE_DRIVE_RESULTS_DIR)
  print(f"Storing results on Google Drive in {DIR_RESULTS}")
else:
  DIR_RESULTS = "./results"
  DIR_RESULTS += datetime.datetime.strftime(
      datetime.datetime.now(), '%Y%m%d_%H%M%S')
  print(f"Storing results in Colab in {DIR_RESULTS}")

pathlib.Path(DIR_RESULTS).mkdir(parents=True, exist_ok=True)

Storing results in Colab in ./results20230113_230502


## Patch Settings

In [11]:
#@title Select segmented patches

#@markdown Load patch sets from elsewhere:
ADVANCED_PATCH_SET = "Load from Google Drive" #@param ["Upload to Colab", "Load from URL", "Load from Google Drive", "Multiple (below)"]
#@markdown URL if downloading .npy file from website:
URL_TO_PATCH_FILE = "https://github.com/deepmind/arnheim/tree/main/collage_patches" #@param {type:"string"}
#@markdown Path if loading .npy file from Google Drive:
DRIVE_PATH_TO_PATCH_FILE = "" #@param {type:"string"}
#@markdown Reuse uploaded patch file if "Upload to Colab" selected:
REUSE_UPLOAD = True  #@param {type: "boolean"}

examples = {"Fruit and veg" : "fruit.npy", 
            "Sea glass" : "shore_glass.npy",
            "Handwritten MNIST" : "handwritten_mnist.npy",
            "Animals" : "animals.npy",
            "Animal Forms": "animal_forms.npy",
            "Plant Forms": "plant_forms.npy",
            "Waste": "waste.npy",
            "Human artefacts": "human_artefacts.npy",
            "Leaves": "open_leaves.npy",
            "Broken plate": "broken_plate.npy",
            "Natural forms": "natural_forms.npy"
            }

# Example patch set selection overrides settings here.
if EXAMPLE_PATCH_SET in examples:
  repo_root = "https://storage.googleapis.com/dm_arnheim_3_assets"
  URL_TO_PATCH_FILE=f"{repo_root}/collage_patches/{examples[EXAMPLE_PATCH_SET]}"
  PATCH_SET = "Load from URL"
else:
  PATCH_SET = ADVANCED_PATCH_SET
  print("don't exits patches in URL")

if PATCH_SET == "Load from Google Drive":
  drive.mount(MOUNT_DIR)
  URL_TO_PATCH_FILE = str(pathlib.PurePath(MOUNT_DIR, "MyDrive", 
                          DRIVE_PATH_TO_PATCH_FILE))
  patchset_filename = os.path.basename(URL_TO_PATCH_FILE)
  # Copy patch set file locally
  !cp {URL_TO_PATCH_FILE} {patchset_filename}
  print("Using patch set", URL_TO_PATCH_FILE)

if PATCH_SET == "Upload to Colab":
  if (REUSE_UPLOAD
      and LOCAL_PATCH_SET is not None
      and LOCAL_PATCH_FILE is not None):
    PATCH_SET = LOCAL_PATCH_SET
    URL_TO_PATCH_FILE = LOCAL_PATCH_FILE
  else: 
    from google.colab import files
    uploaded = files.upload()
    for k, v in uploaded.items():
      !rm -f {k}
      open(k, 'wb').write(v)
      print('User patch file saved to "{name}" with length {length} bytes'.format(
          name=k, length=len(v)))
      PATCH_SET = k
      URL_TO_PATCH_FILE = f"./{k}"
      LOCAL_PATCH_SET = PATCH_SET
      LOCAL_PATCH_FILE = URL_TO_PATCH_FILE

In [12]:
#@title Edit this cell for multiple patch set support

# Define lists below to use different patch settings for each tile.
# Settings are used in order for each tile, with the list repeated as necessary.
# SETTING THESE WILL OVERRIDE THE PATCH SETTINGS IN THE FOLLOWING CELL.
# Set these to empty strings ("") to disable their use.
#
# For example 
# MULTIPLE_PATCH_SET=["shore_glass.npy", "animals.npy"]
# will use patches shore_glass for the first tile and animals for the second.
# Because the list is repeated if necessary, if there are more tiles then 
# shore_glass will be used for all the odd tiles animals for all the even.

# Use the npy file names here. They are loaded from PATCH_REPO_ROOT set here.
PATCH_REPO_ROOT="https://storage.googleapis.com/dm_arnheim_3_assets/collage_patches"
MULTIPLE_PATCH_SET=["human_artefacts.npy", "waste.npy", "animal_form.npy", "vegetal_form.npy"]  # e.g.  ["shore_glass.npy", "animals.npy"]
MULTIPLE_FIXED_SCALE_PATCHES=""  # e.g.  [true, true, false]
MULTIPLE_FIXED_SCALE_COEFF=""  # e.g.  [0.8, 0.3]
MULTIPLE_PATCH_MAX_PROPORTION=""  # e.g. [3, 5, 5]

In [13]:
#@title Image patch sizing for low- and high-res.

#@markdown Scale patches to within (image size / PATCH_MAX_PROPORTION). 
#@markdown E.g. 5 produces small patches good for tiled images.
PATCH_MAX_PROPORTION =  4  #@param{type:"slider", min:2, max:8, step:1}

#@markdown Alternatively, scale all patches by same amount.
FIXED_SCALE_PATCHES = False #@param {type:"boolean"}
FIXED_SCALE_COEFF =   0.3#@param {type:"number"}

#@markdown Brighten patches.
NORMALIZE_PATCH_BRIGHTNESS = False  #@param {type: "boolean"}

PATCH_WIDTH_MIN = 16  
PATCH_HEIGHT_MIN = 16 


## Background, Composition and Tiling
and Background Settings

In [14]:
# @title Configure background

#@markdown Configure a background, e.g. uploaded picture or solid colour.
# NOTE!! Check code in the rest of this cell if modifying these text strings.
BACKGROUND = "None (black)" #@param ["None (black)", "Solid colour below", "Upload image to Colab", "Load image from URL"]
# BACKGROUND = "Load image from Google Drive" #@param ["None (black)", "Solid colour below", "Upload image to Colab", "Load image from URL", "Load image from Google Drive"]
#@markdown Background usage: Global = use image across whole image; Local = reuse same image for every tile.
BACKGROUND_USE = "Global" #@param ["Global", "Local"]

#@markdown Colour configuration for solid colour background.
BACKGROUND_RED = 195 #@param {type:"slider", min:0, max:255, step:1}
BACKGROUND_GREEN = 181 #@param {type:"slider", min:0, max:255, step:1}
BACKGROUND_BLUE = 172 #@param {type:"slider", min:0, max:255, step:1}

#@markdown URL if downloading image file from website:
BACKGROUND_IMAGE_URL = "" #@param {type:"string"}
#markdown Path if loading image file from Google Drive:
# BACKGROUND_IMAGE_DRIVE_PATH = "Art/Collage/Backgrounds/biggest_chicken_ever.jpg" #@param {type:"string"}

PROMPTS = [GLOBAL_PROMPT]

background_image = None

def cached_url_download(url, format):
  cache_filename = os.path.basename(url)
  cache = pathlib.Path(cache_filename)
  if not cache.is_file():
    print("Downloading " + cache_filename)
    r = requests.get(url)
    bytesio_object = io.BytesIO(r.content)
    with open(cache_filename, "wb") as f:
        f.write(bytesio_object.getbuffer())
  else:
    print("Using cached version of " + cache_filename)
  if format == "numpy":
    return np.load(cache, allow_pickle=True)
  elif format == "image as RGB":
    return load_image(cache_filename, show=True) ##show=True

def upload_files():
  # Upload and save to Colab's disk.
  uploaded = files.upload()
  # Save to disk
  for k, v in uploaded.items():
    open(k, 'wb').write(v)
  return list(uploaded.keys())

def load_image(filename, as_cv2_image=False, show=False):
  # Load an image as [0,1] RGB numpy array or cv2 image format.
  img = cv2.imread(filename)
  if show:
    cv2_imshow(img)
  if as_cv2_image:
    return img  # With colour format BGR
  img = np.asarray(img)
  return img[..., ::-1] / 255.  # Reverse colour dim to convert BGR to RGB

if BACKGROUND == "None (black)":
  # 'No background' is actually a black background.
  BACKGROUND = "Solid colour below"
  BACKGROUND_RED = 0 
  BACKGROUND_GREEN = 0
  BACKGROUND_BLUE = 0

if BACKGROUND == "Load image from URL":
  background_image = cached_url_download(BACKGROUND_IMAGE_URL,
                                         format="image as RGB")
elif BACKGROUND == "Solid colour below":
  background_image = np.ones((10, 10, 3), dtype=np.float32)
  background_image[:, :, 0] = BACKGROUND_RED
  background_image[:, :, 1] = BACKGROUND_GREEN
  background_image[:, :, 2] = BACKGROUND_BLUE
  background_image /= 255.
  print('Defined background colour ({}, {}, {})'.format(
      BACKGROUND_RED, BACKGROUND_GREEN, BACKGROUND_BLUE))
elif BACKGROUND == "Load image from Google Drive":
  drive.mount(MOUNT_DIR)
  data_file = pathlib.PurePath(MOUNT_DIR, "MyDrive", 
                               BACKGROUND_IMAGE_DRIVE_PATH)
  print("Reading", data_file)
  background_image = load_image(data_file)
else:  # "Upload image to Colab"
  backgrounds = upload_files()
  background_image = load_image(backgrounds[0], show=True) ##show=True



Defined background colour (0, 0, 0)


In [15]:
# @title Composition prompts (i.e. for regions within a tile)

#@markdown Use additional prompts for each region:
COMPOSITIONAL_IMAGE = False #@param {type:"boolean"}

#@markdown **Single image composition prompts** (i.e. no tiling) for the 3x3 regions (left to right, starting at the top):
PROMPT_x0_y0 = "a photorealistic sky with sun"   #@param {type:"string"}
PROMPT_x1_y0 = "a photorealistic sky"   #@param {type:"string"}
PROMPT_x2_y0 = "a photorealistic sky with moon"   #@param {type:"string"}
PROMPT_x0_y1 = "a photorealistic tree"   #@param {type:"string"}
PROMPT_x1_y1 = "a photorealistic tree"   #@param {type:"string"}
PROMPT_x2_y1 = "a photorealistic tree"   #@param {type:"string"}
PROMPT_x0_y2 = "a photorealistic field"   #@param {type:"string"}
PROMPT_x1_y2 = "a photorealistic field"   #@param {type:"string"}
PROMPT_x2_y2 = "a photorealistic chicken"   #@param {type:"string"}

#@markdown **Tiled images composition prompts** (use when tiling)

#@markdown This string is formated to autogenerate compositional prompts for each tile, using each tile's prompt. e.g. "close-up of {}"
TILE_PROMPT_FORMATING = "close-up of {}"  #@param {type:"string"}

# Example prompt lists for different settings, where
# PROMPT = "Roman"
# TILE_PROMPT_FORMATING = "close-up of {}"
# TILE_PROMPT_STRING = "sun | clouds | sky / fields | fields | trees"

# 1. Single image with **global** prompt
#   * Tile 0 prompts: ['Roman']
# 1. Single image with **composition** prompts (tested)
#   * Tile 0 prompts: ['close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'Roman']
# 1. Tiled images with **global** prompt for each tile.
#   * Tile 0 prompts: ['Roman']
#   * Tile 1 prompts: ['Roman']
#   * Tile 2 prompts: ['Roman']
#   * Tile 3 prompts: ['Roman']
#   * Tile 4 prompts: ['Roman']
#   * Tile 5 prompts: ['Roman']
# 1. Tiled images with **global** prompt for each tile.
#   * Tile 0 prompts: ['sun']
#   * Tile 1 prompts: ['clouds']
#   * Tile 2 prompts: ['sky']
#   * Tile 3 prompts: ['fields']
#   * Tile 4 prompts: ['fields']
#   * Tile 5 prompts: ['trees']
# 1. Tiled images with separate **composition** prompts for each tile.
#   * Tile 0 prompts: ['close-up of sun', 'close-up of sun', 'close-up of sun', 'close-up of sun', 'close-up of sun', 'close-up of sun', 'close-up of sun', 'close-up of sun', 'close-up of sun', 'sun']
#   * Tile 1 prompts: ['close-up of clouds', 'close-up of clouds', 'close-up of clouds', 'close-up of clouds', 'close-up of clouds', 'close-up of clouds', 'close-up of clouds', 'close-up of clouds', 'close-up of clouds', 'clouds']
#   * Tile 2 prompts: ['close-up of sky', 'close-up of sky', 'close-up of sky', 'close-up of sky', 'close-up of sky', 'close-up of sky', 'close-up of sky', 'close-up of sky', 'close-up of sky', 'sky']
#   * Tile 3 prompts: ['close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'fields']
#   * Tile 4 prompts: ['close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'close-up of fields', 'fields']
#   * Tile 5 prompts: ['close-up of trees', 'close-up of trees', 'close-up of trees', 'close-up of trees', 'close-up of trees', 'close-up of trees', 'close-up of trees', 'close-up of trees', 'close-up of trees', 'trees']
# [188]
# 
# 1. Tiled images with **global** **composition** prompts for each tile.
#   * Tile 0 prompts: ['close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'Roman']
#   * Tile 1 prompts: ['close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'Roman']
#   * Tile 2 prompts: ['close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'Roman']
#   * Tile 3 prompts: ['close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'Roman']
#   * Tile 4 prompts: ['close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'Roman']
#   * Tile 5 prompts: ['close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'close-up of Roman', 'Roman']

In [16]:
# @title Tile prompts and tiling settings

TILE_IMAGES = False #@param {type:"boolean"} ##False

TILES_WIDE = 2  #@param {type:"slider", min:1, max:10, step:1}
TILES_HIGH = 2  #@param {type:"slider", min:1, max:10, step:1}

# Turn off tiling if either boolean is set or width/height set to 1.
if not TILE_IMAGES or (TILES_WIDE == 1 and TILES_HIGH == 1):
  TILES_WIDE = 1
  TILES_HIGH = 1
  TILE_IMAGES = False
  
#@markdown **Prompt(s) for tiles**

#@markdown **Global tile prompt** uses GLOBAL_PROMPT (previous cell) for *all* tiles (e.g. "Roman mosaic of an unswept floor").
GLOBAL_TILE_PROMPT = False #@param {type:"boolean"}

#@markdown Otherwise, specify **separate prompt for each tile** (overriding GLOBAL_PROMPT) with columns separated by | and / to delineate new row.

#@markdown E.g. multiple prompts for a 3x2 "landscape" image : "sun | clouds | sky / fields | fields | trees".

TILE_PROMPT_STRING = "photorealistic sun | photorealistic clouds / photograph of colorful buildings | crowds of people"   #@param {type:"string"}

if not TILE_IMAGES or GLOBAL_TILE_PROMPT:
  TILE_PROMPTS = [GLOBAL_PROMPT] * TILES_HIGH * TILES_WIDE
else:
  TILE_PROMPTS = []
  count_y = 0
  count_x = 0
  for row in TILE_PROMPT_STRING.split("/"):
    for prompt in row.split("|"):
      prompt = prompt.strip()
      TILE_PROMPTS.append(prompt)
      count_x += 1
    if count_x != TILES_WIDE:
      raise ValueError(f"Insufficient prompts for row {count_y}; expected {TILES_WIDE} but got {count_x}")
    count_x = 0
    count_y += 1
  if count_y != TILES_HIGH:
    raise ValueError(f"Insufficient prompt rows; expected {TILES_HIGH} but got {count_y}")

print("Tile prompts: ", TILE_PROMPTS)


Tile prompts:  ['a picture of among us player']


## Initial Checkpoint

In [17]:
#@title Select initial checkpoint

# Default filename for the initial checkpoint.
DEFAULT_INIT_CHECKPOINT = "generator.pt"

#@markdown Use an existing `generator.pt` file as checkpoint (if it is present in the local colab), upload it from the computer or upload it from Google Drive:
CHOICE_INIT_CHECKPOINT = "Use existing checkpoint file (if present)"  #@param ["Use existing checkpoint file (if present)", "Upload checkpoint file to colab", "Upload checkpoint file from Google Drive"]

#@markdown Enable this to use that checkpoint to initialise the collage.
USE_INIT_CHECKPOINT = False #@param {type:"boolean"}

#@markdown Path if loading `generator.pt` file from Google Drive:
DRIVE_PATH_TO_INIT_CHECKPOINT = "" #@param {type:"string"}

if CHOICE_INIT_CHECKPOINT == "Upload checkpoint file from Google Drive":
  print(f"Mounting directory {DRIVE_PATH_TO_INIT_CHECKPOINT}")
  drive.mount(MOUNT_DIR)
  URL_TO_INIT_CHECKPOINT = str(pathlib.PurePath(MOUNT_DIR, "MyDrive",
                                                DRIVE_PATH_TO_INIT_CHECKPOINT))
  init_checkpoint_filename = os.path.basename(URL_TO_INIT_CHECKPOINT)
  # Copy patch set file locally
  try:
    INIT_CHECKPOINT = DEFAULT_INIT_CHECKPOINT
    print(f"Copying checkpoint {INIT_CHECKPOINT} from {URL_TO_INIT_CHECKPOINT}")
    !cp "{URL_TO_INIT_CHECKPOINT}/{INIT_CHECKPOINT}" {INIT_CHECKPOINT}
  except:
    USE_INIT_CHECKPOINT = False
    INIT_CHECKPOINT = ''
    print(f"Could not copy file from {URL_TO_INIT_CHECKPOINT}")

if CHOICE_INIT_CHECKPOINT == "Upload checkpoint file to colab":
  uploaded = files.upload()

if USE_INIT_CHECKPOINT:
  INIT_CHECKPOINT = DEFAULT_INIT_CHECKPOINT
else:
  INIT_CHECKPOINT = ''

# Check that the file is present.
path_ckpt = "/content/" + INIT_CHECKPOINT
is_checkpoint_present = os.path.isfile("/content/" + INIT_CHECKPOINT)
if USE_INIT_CHECKPOINT is True and is_checkpoint_present is True:
  print(f"Initialising collage from checkpoint stored in file {path_ckpt}")
if USE_INIT_CHECKPOINT is True and is_checkpoint_present is False:
  USE_INIT_CHECKPOINT = False
  INIT_CHECKPOINT = ""
  print(f"Checkpoint {path_ckpt} is missing.")
if USE_INIT_CHECKPOINT is False:
  INIT_CHECKPOINT = ""
  print(f"Not using a checkpoint to initialise collage.")


Not using a checkpoint to initialise collage.


#Make Collage

In [18]:
#@title Create config

# Do not edit this directly as it may not have an effect as some assets will
# have already been created at this point, e.g. the background.

# Safety check on checkpoint.
is_checkpoint_present = os.path.isfile("/content/" + INIT_CHECKPOINT)
if is_checkpoint_present is False:
  INIT_CHECKPOINT = ""

config = dict(
  background_blue=BACKGROUND_BLUE,
  background_green=BACKGROUND_GREEN,
  background_red=BACKGROUND_RED,
  background_url=BACKGROUND_IMAGE_URL,
  background_use=BACKGROUND_USE,
  canvas_height=CANVAS_HEIGHT,
  canvas_width=CANVAS_WIDTH,
  clean_up=False,
  clip_model=CLIP_MODEL,
  colour_mutation_scale=COLOUR_MUTATION_SCALE,
  colour_transformations=COLOUR_TRANSFORMATIONS,
  compositional_image=COMPOSITIONAL_IMAGE,
  cuda=False, ##True
  distort_mutation_scale=DISTORT_MUTATION_SCALE,
  evolution_frequency=EVOLUTION_FREQUENCY,
  fixed_scale_coeff=FIXED_SCALE_COEFF,
  fixed_scale_patches=FIXED_SCALE_PATCHES,
  ga_method=GA_METHOD,
  global_prompt=GLOBAL_PROMPT,
  global_tile_prompt=GLOBAL_TILE_PROMPT,
  gradient_clipping=GRADIENT_CLIPPING,
  gui=False,
  high_res_multiplier=MULTIPLIER_BIG_IMAGE,
  init_checkpoint=INIT_CHECKPOINT,
  initial_max_rgb=INITIAL_MAX_RGB,
  initial_min_rgb=INITIAL_MIN_RGB,
  initial_search_size=INITIAL_SEARCH_SIZE,
  initial_search_num_steps=INITIAL_SEARCH_NUM_STEPS,
  invert_colours=INVERT_COLOURS,
  learning_rate=LEARNING_RATE,
  max_block_size_high_res=2000,
  max_hue_deg=MAX_HUE_DEG,
  max_multiple_visualizations=MAX_MULTIPLE_VISUALISATIONS,
  max_rgb=MAX_RGB,
  max_rot_deg=MAX_ROT_DEG,
  max_sat=MAX_SAT,
  max_scale=MAX_SCALE,
  max_shear=MAX_SHEAR,
  max_squeeze=MAX_SQUEEZE,
  max_trans=MAX_TRANS,
  max_trans_init=INITIAL_MAX_TRANS,
  max_val=MAX_VAL,
  min_hue_deg=MAX_HUE_DEG,
  min_rgb=MIN_RGB,
  min_rot_deg=MIN_ROT_DEG,
  min_sat=MIN_SAT,
  min_scale=MIN_SCALE,
  min_shear=MIN_SHEAR,
  min_squeeze=MIN_SQUEEZE,
  min_trans=MIN_TRANS,
  min_trans_init=INITIAL_MIN_TRANS,
  min_val=MIN_VAL,
  multiple_patch_set=MULTIPLE_PATCH_SET,  # e.g.  ["shore_glass.npy", "animals.npy"]
  multiple_fixed_scale_patches=MULTIPLE_FIXED_SCALE_PATCHES,  # e.g.  [true, true, false]
  multiple_fixed_scale_coeff=MULTIPLE_FIXED_SCALE_COEFF,  # e.g.  [0.8, 0.3]
  multiple_patch_max_proportion=MULTIPLE_PATCH_MAX_PROPORTION,  # e.g. [3, 5, 5]
  normalize_patch_brightness=NORMALIZE_PATCH_BRIGHTNESS,
  num_augs=NUM_AUGS,
  num_patches=NUM_PATCHES,
  optim_steps=OPTIM_STEPS,
  output_dir=DIR_RESULTS,
  patch_height_min=PATCH_HEIGHT_MIN,
  patch_max_proportion=PATCH_MAX_PROPORTION,
  patch_mutation_probability=PATCH_MUTATION_PROBABILITY,
  patch_repo_root=PATCH_REPO_ROOT,
  patch_set=PATCH_SET,
  patch_width_min=PATCH_WIDTH_MIN,
  pop_size=POP_SIZE,
  population_video=POPULATION_VIDEO,
  pos_and_rot_mutation_scale=POS_AND_ROT_MUTATION_SCALE,
  prompt_x0_y0=PROMPT_x0_y0,
  prompt_x0_y1=PROMPT_x0_y1,
  prompt_x0_y2=PROMPT_x0_y2,
  prompt_x1_y0=PROMPT_x1_y0,
  prompt_x1_y1=PROMPT_x1_y1,
  prompt_x1_y2=PROMPT_x1_y2,
  prompt_x2_y0=PROMPT_x2_y0,
  prompt_x2_y1=PROMPT_x2_y1,
  prompt_x2_y2=PROMPT_x2_y2,
  render_method=RENDER_METHOD,
  save_all_arrays=False,
  scale_mutation_scale=SCALE_MUTATION_SCALE,
  tile_images=TILE_IMAGES,
  tile_prompt_formating=TILE_PROMPT_FORMATING,
  tile_prompt_string=TILE_PROMPT_STRING,
  tiles_high=TILES_HIGH,
  tiles_wide=TILES_WIDE,
  torch_device=torch_device,
  trace_every=TRACE_EVERY,
  url_to_patch_file=URL_TO_PATCH_FILE,
  use_image_augmentations=USE_IMAGE_AUGMENTATIONS,
  use_normalized_clip=USE_NORMALIZED_CLIP,
  video_steps=VIDEO_STEPS,
)


In [19]:
#@title Initialisation

#TODO(dylski) Move this code into a module for Colab and main.py to share.
# Adjust config for compositional image.
if config["compositional_image"] == True:
  print("Generating compositional image")
  config['canvas_width'] *= 2
  config['canvas_height'] *= 2
  config['high_res_multiplier'] = int(config['high_res_multiplier'] / 2)
  print("Using one image augmentations for compositional image creation.")
  config["use_image_augmentations"] = True
  config["num_augs"] = 1

print('A')

# Turn off tiling if either boolean is set or width/height set to 1.
if (not config["tile_images"] or
    (config["tiles_wide"] == 1 and config["tiles_high"] == 1)):
  print("No tiling.")
  config["tiles_wide"] = 1
  config["tiles_high"] = 1
  config["tile_images"] = False
    
print('B')

# Default output dir. Make sure it is a new directory for each generation.
if len(config["output_dir"]) == 0 or STORE_ON_GOOGLE_DRIVE is False:
  config["output_dir"] = "output_"
  config["output_dir"] += datetime.datetime.strftime(
      datetime.datetime.now(), '%Y%m%d_%H%M%S')
  config["output_dir"] += '/'

print('C')

# Make output dir.
output_dir = config["output_dir"]
print(f"Storing results in {output_dir}\n")
pathlib.Path(output_dir).mkdir(parents=True, exist_ok=True)

# Save the config.
config_filename = config["output_dir"] + "/" + "config.yaml"
with open(config_filename, "w") as f:
  yaml.dump(config, f, default_flow_style=False, allow_unicode=True)

# TODO(dylski) Put this into the package code.
# Tiling.
if not config["tile_images"] or config["global_tile_prompt"]:
  tile_prompts = (
    [config["global_prompt"]] * config["tiles_high"] * config["tiles_wide"])
else:
  tile_prompts = []
  count_y = 0
  count_x = 0
  for row in config["tile_prompt_string"].split("/"):
    for prompt in row.split("|"):
      prompt = prompt.strip()
      tile_prompts.append(prompt)
      count_x += 1
    if count_x != config["tiles_wide"]:
      w = config["tiles_wide"]
      raise ValueError(
        f"Insufficient prompts for row {count_y}; expected {w}, got {count_x}")
    count_x = 0
    count_y += 1
  if count_y != config["tiles_high"]:
    h = config["tiles_high"]
    raise ValueError(f"Insufficient prompt rows; expected {h}, got {count_y}")


print("Tile prompts: ", tile_prompts)
# Prepare duplicates of config data if required for tiles.
tile_count = 0
all_prompts = []
for y in range(config["tiles_high"]):
  for x in range(config["tiles_wide"]):
    list_tile_prompts = []
    if config["compositional_image"]:
      if config["tile_images"]:
        list_tile_prompts = [
            config["tile_prompt_formating"].format(tile_prompts[tile_count])
            ] * 9
      else:
        list_tile_prompts = [
            config["prompt_x0_y0"], config["prompt_x1_y0"],
            config["prompt_x2_y0"],
            config["prompt_x0_y1"], config["prompt_x1_y1"],
            config["prompt_x2_y1"],
            config["prompt_x0_y2"], config["prompt_x1_y2"],
            config["prompt_x2_y2"]]
    list_tile_prompts.append(tile_prompts[tile_count])
    tile_count += 1
    all_prompts.append(list_tile_prompts)
print(f"All prompts: {all_prompts}")

ct = collage.CollageTiler(
    all_prompts, background_image, clip_model, device, config)

ct.initialise() #maybe here is error, ct.initialise()에 cv2.imshow()가 있기 때문

generator = ct._collage_maker.generator
img = generator({'gamma': 0})
img = img.permute(0, 3, 1, 2)
_ = video_utils.show_and_save(img, config=config, show=False) #show를 False로 설정함

A
No tiling.
B
C
Storing results in output_20230113_230502/

Tile prompts:  ['a picture of among us player']
All prompts: [['a picture of among us player']]
Tiling 1x1 collages
Optimisation:
Tile size: 224x224
Global size: 224x224 (WxH)
High res:
Tile size: 4032x4032
Global size: 4032x4032 (WxH)
Tile 0 prompts: ['a picture of among us player']

New collage creator for y0, x0 with bg
image (not stitch) min 0.0, max 0.0
Using cached version of shore_glass.npy
Patch set human_artefacts.npy, fixed_scale_patches? False, fixed_scale_coeff=0.3, patch_max_proportion=4
Max patch size on large img: (1008.0, 1008.0)
<class 'bool'>
No video writing implemented
No video writing implemented
CLIP prompt a picture of among us player
PopulationAffineTransforms is_high_res=False, requires_grad=True
PopulationColourRGBTransforms for 400 patches, 2 individuals
PopulationColourRGBTransforms requires_grad=True
Background image of size torch.Size([3, 224, 224])
image (stitch) min 0.0, max 1.0


In [None]:
#@title Create collage loop
#@markdown To edit patches interrupt this cell and run the one below this. Re-run this cell afterwards to continue generating the image.

# gc.collect() ##add
# torch.cuda.empty_cache() ##add
output = ct.loop()


Starting optimization of collage.
image (stitch) min 0.0, max 1.0
Saving temporary image output_20230113_230502//optim_0.png (shape=(224, 448, 3))
Iteration   0, rendering loss -3.778931, 0.311s/iter
image (stitch) min 0.0, max 1.0
Saving temporary image output_20230113_230502//optim_400.png (shape=(224, 448, 3))
Iteration 400, rendering loss -5.127441, 0.598s/iter
image (stitch) min 0.0, max 1.0
Saving temporary image output_20230113_230502//optim_800.png (shape=(224, 448, 3))
Iteration 800, rendering loss -6.008423, 0.592s/iter
image (stitch) min 0.0, max 1.0
Saving temporary image output_20230113_230502//optim_1200.png (shape=(224, 448, 3))
Iteration 1200, rendering loss -5.944458, 0.596s/iter
image (stitch) min 0.0, max 1.0
Saving temporary image output_20230113_230502//optim_1600.png (shape=(224, 448, 3))
Iteration 1600, rendering loss -6.157837, 0.600s/iter
image (stitch) min 0.0, max 1.0
Saving model to output_20230113_230502/...
PopulationAffineTransforms is_high_res=True, req

In [None]:
#@title Tinker with patches
#@markdown Enable this cell to allow patch editing:
PATCH_TINKERING = False #@param {type:"boolean"}

#@markdown Interupt the cell above mid-optimisation and run this cell to manually adjust the patches. Run it several times to adjust different patches. Then re-run the cell above to continue optimising.

if PATCH_TINKERING:
  from ipywidgets import interactive
  import IPython.display
  from google.colab.output import eval_js
  import base64
  
  # Render the current collage(s).
  generator = ct._collage_maker.generator
  step = ct._collage_maker.step
  params = {'gamma': step / OPTIM_STEPS}
  img = generator(params)
  img = img.permute(0, 3, 1, 2)  # NHWC -> NCHW
  print('Current collage(s)')
  res_img = show_stitched_batch(img,
                                max_display=MAX_MULTIPLE_VISUALISATIONS,
                                show=False)
  filename_temp = f"./temp.png"
  res_img = cv2.cvtColor(res_img, cv2.COLOR_BGR2RGB) * 255
  cv2.imwrite(filename_temp, res_img)
  
  # HTML code to plot the image and detect the mouse cursor.
  canvas_html = """
  <canvas width=%d height=%d></canvas>
  <script>
    var filename_image = "%s"
    var canvas = document.querySelector('canvas')
    var ctx = canvas.getContext('2d')
    ctx.lineWidth = 1
    var mouse = {x: 0, y: 0}
    canvas.addEventListener('mousemove', function(e) {
      mouse.x = e.pageX - this.offsetLeft
      mouse.y = e.pageY - this.offsetTop
    })
    canvas.onmousedown = ()=>{
      ctx.beginPath()
      ctx.moveTo(mouse.x, mouse.y)
      canvas.addEventListener('mousemove', onPaint)
    }
    var onPaint = ()=>{
      ctx.lineTo(mouse.x, mouse.y)
      ctx.stroke()
    }
    var data = new Promise(resolve=>{
      canvas.onmouseup = ()=>{
        canvas.removeEventListener('mousemove', onPaint)
        resolve(mouse)
      }
    })
    function draw_collage_image() {
      collage_image = new Image();
      collage_image.src = filename_image;
      collage_image.onload = function(){
        ctx.drawImage(collage_image, 0, 0);
      }
    }
    draw_collage_image();
  </script>
  """
  
  im = IPython.display.Image(filename_temp, embed=True)
  # IPython.display.display(im)
  filename_embed = 'data:image/png;base64,'
  filename_embed += base64.b64encode(im.data).decode('ascii')
  
  # Display an HTML canvas with the image.
  canvas = IPython.display.HTML(
      canvas_html % (CANVAS_WIDTH * POP_SIZE, CANVAS_HEIGHT, filename_embed))
  print('Click with the mouse on the desired image and patch:')
  IPython.display.display(canvas)
  
  # Select the image and pixel coordinates.
  def draw():
    print('draw()')
    mouse = eval_js('data')
    return mouse
  mouse = draw()
  pop_id_mouse = int(np.floor(mouse['x'] / CANVAS_WIDTH))
  x_mouse = int(mouse['x'] % CANVAS_WIDTH)
  y_mouse = int(mouse['y'])
  print(f'Selected image {pop_id_mouse} at ({x_mouse}, {y_mouse})')
  
  def find_patch(generator, id, u, v):
    # Render only the spatial transforms of the patches.
    rendered_patches = generator.spatial_transformer(generator.patches)
    rendered_patches = rendered_patches.detach().cpu().numpy()
    patch_id = np.argmax(rendered_patches[id, :, 3, u, v] * rendered_patches[id, :, 4, u, v])
    return patch_id
  
  # Select the patch.
  patch_id = find_patch(generator, pop_id_mouse, y_mouse, x_mouse)
  print(f'Found matching patch {patch_id}')
  
  # Extract the patch's current affine transform paramaters.
  with torch.no_grad():
    x0 = generator.spatial_transformer.translation[pop_id_mouse, patch_id, 0, 0]
    x0 = float(x0.detach().cpu().numpy())
    y0 = generator.spatial_transformer.translation[pop_id_mouse, patch_id, 1, 0]
    y0 = float(y0.detach().cpu().numpy())
    rot0 = generator.spatial_transformer.rotation[pop_id_mouse, patch_id, 0, 0]
    rot0 = float(rot0.detach().cpu().numpy())
    scale0 = generator.spatial_transformer.scale[pop_id_mouse, patch_id, 0, 0]
    scale0 = float(scale0.detach().cpu().numpy())
    squeeze0 = generator.spatial_transformer.squeeze[pop_id_mouse, patch_id, 0, 0]
    squeeze0 = float(squeeze0.detach().cpu().numpy())
    shear0 = generator.spatial_transformer.shear[pop_id_mouse, patch_id, 0, 0]
    shear0 = float(shear0.detach().cpu().numpy())
    patch_info = {'pop_id': pop_id_mouse, 'patch_id': patch_id,
                  'x0': x0, 'y0': y0, 'rot0': rot0,
                  'scale0': scale0, 'squeeze0': squeeze0, 'shear0': shear0,
                  'x': x0, 'y': y0, 'rot': rot0,
                  'scale': scale0, 'squeeze': squeeze0, 'shear': shear0}
  
  def show_modified(dx, dy, drot, dscale, dsqueeze, dshear):
    """Visualization callback function with affine transform deltas."""
    with torch.no_grad():
      x = patch_info['x0'] - dx
      y = patch_info['y0'] + dy
      rot = patch_info['rot0'] - drot
      scale = patch_info['scale0'] - dscale
      squeeze = patch_info['squeeze0'] + dsqueeze
      shear = patch_info['shear0'] + dshear
      generator.spatial_transformer.translation[pop_id_mouse, patch_id, 0, 0] = x
      generator.spatial_transformer.translation[pop_id_mouse, patch_id, 1, 0] = y
      generator.spatial_transformer.rotation[pop_id_mouse, patch_id, 0, 0] = rot
      generator.spatial_transformer.scale[pop_id_mouse, patch_id, 0, 0] = scale
      generator.spatial_transformer.squeeze[pop_id_mouse, patch_id, 0, 0] = squeeze
      generator.spatial_transformer.shear[pop_id_mouse, patch_id, 0, 0] = shear
    patch_info['x'] = x
    patch_info['y'] = y
    patch_info['rot'] = rot
    patch_info['shear'] = shear
    patch_info['squeeze'] = squeeze
    patch_info['shear'] = shear
    params = {'gamma': step / OPTIM_STEPS}
    img = generator(params)
    img = img.permute(0, 3, 1, 2)  # NHWC -> NCHW
    _ = show_stitched_batch(img,
                            max_display=MAX_MULTIPLE_VISUALISATIONS)
  
  # Interactive editing of the patch's affine transform parameters.
  interactive_plot = interactive(show_modified,
                                dx=(-MAX_TRANS * 2, MAX_TRANS * 2, 0.01),
                                dy=(-MAX_TRANS * 2, MAX_TRANS * 2, 0.01),
                                drot=(-MAX_ROT * 2, MAX_ROT * 2, 0.01),
                                dscale=(-MAX_SCALE * 2, MAX_SCALE * 2, 0.01),
                                dsqueeze=(-MAX_SQUEEZE * 2, MAX_SQUEEZE * 2, 0.01),
                                dshear=(-MAX_SHEAR * 2, MAX_SHEAR * 2, 0.01))
  output = interactive_plot.children[-1]
  output.layout.height = '350px'
else:
  interactive_plot = "Patch tinkering not enabled."
interactive_plot
  

In [None]:
#@title Render high res image and finish up.

ct.assemble_tiles()

In [None]:
#@title Save and download assets

#@markdown Enable this to allow everything to be zipped up and downloaded
DOWNLOAD_FILES = True #@param {type:"boolean"}

if DOWNLOAD_FILES:
  zipname = f"{config['output_dir'].rstrip('/')}.zip"
  print(f"Output {config['output_dir']} will be downladed as {zipname}")
  !zip -r {zipname} {ct._output_dir}
#   from google.colab import files
#   files.download(zipname)
  result_file = zipfile.ZipFile(f"{zipname}", 'w')
  result_file.write(ct._output_dir)


In [None]:
#@title Copy model checkpoint to use as initialisation for next collage

#@markdown Enable this to copy file `generator.pt` from the current output directory. This model file contains the learned spatial and colour transforms from the current collage. If `USE_INIT_CHECKPOINT` is set to `True`, then next collage generation will try to use that model as initialisation.
COPY_CHECKPOINT_FOR_INITIALISATION = True #@param {type:"boolean"}

if COPY_CHECKPOINT_FOR_INITIALISATION:
  !cp "{ct._output_dir}/generator.pt" .
  print(f"Copied {ct._output_dir}/generator.py to root directory.")

In [None]:
raise ValueError("Stop here.")

#Patch file Creation

Patch files can be created from PNGs (best) and JPGS. Both methods produce a `.npy` a file that contains all the patches. They can be uploaded and used in this Colab by

* Set `Configure -> Patch Settings -> Select a patch set -> EXAMPLE_PATCH_SET` to `None of the above`
* Set `Advanced Settings -> Patch Settings -> Select Segmented Patches -> Load patch sets from elsewhere -> ADVANCED_PATCH_SET` to `Upload to Colab`

* PNGs reqiure the subject to have been manually segmented and on a transparent background. The cells below can be used with such PNGs.

* JPGs use the [Arnheim 3 Patch Maker](https://colab.research.google.com/github/deepmind/arnheim/blob/main/arnheim_3_patch_maker.ipynb) Colab which uses a segmentation algorithm to attempt to cut out the subject. This can be hit or miss. Instructions are in the Colab.

## Note:
Make sure you run the `Preliminaries` calls at the top of the Colab before running these.

In [None]:
#@title Functions to create patch files.

def upload_files(image_path):
  """Upload files to target directory."""
  print(f"Uploading files to {image_path}")
  !rm -rf {image_path}
  !mkdir {image_path}
  uploaded = files.upload()
  for k, v in uploaded.items():
    open(image_path + "/" + k, 'wb').write(v)
  return list(uploaded.keys())


def convert_pngs(image_path, destination_file):
  """Convert pngs to single numpy array file."""
  png_imgs = []
  for png_im_path in glob.glob(image_path + "/*.png"):
    png_im = imageio.imread(png_im_path)
    png_imgs.append(png_im)
  
  np.save(destination_file, np.array(png_imgs), allow_pickle=True)


def show_patches(filename):
  patches = np.load(filename, allow_pickle=True)
  for patch in patches:
    print(patch.shape)
    #cv2_imshow(patch)
    plt.imshow(patch)  ###

In [None]:
#@title Create patch file from PNGs
#@markdown Requires PNG files with alpha channel.

OUTPUT_PATCH_FILE_NAME = "my_patches.npy"  # @param{type:"string"}
target_patch_path = f"/content/{OUTPUT_PATCH_FILE_NAME}"
png_path = "/content/pngs"
SHOW_PATCHES = True  #@param{type: "boolean"}

print("Select PNG files to be converted:")
upload_files(png_path)
print("Converting images.")
convert_pngs(png_path, target_patch_path)
if SHOW_PATCHES:
  show_patches(target_patch_path)
files.download(target_patch_path)