# Text-guided image synthesis - Part 1: preparing the environment

In this notebook we install all dependencies and download AI models for this project.

## Assumptions

 - We run on Linux.
 - The apt package manager is available.
 - We are root.
 - PyTorch, torchvision and the NVIDIA drivers are installed.
 - Jupyter Notebook or JupyterLab is installed.

This is essentially the setup we get when we run the PyTorch Docker image (https://hub.docker.com/r/pytorch/pytorch/).
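
The assumptions above can be checked from a notebook cell before going further. The following is a minimal sketch, not part of the original setup: the function name `check_assumptions` is hypothetical, and the presence of the `nvidia-smi` command is only used as a rough proxy for working NVIDIA drivers.

```python
import importlib.util
import os
import shutil
import sys


def check_assumptions():
    """Return a dict mapping each assumption to True (met) or False (not met)."""
    return {
        "linux": sys.platform.startswith("linux"),
        "apt available": shutil.which("apt") is not None,
        # os.geteuid exists only on Unix-like systems
        "running as root": hasattr(os, "geteuid") and os.geteuid() == 0,
        "torch installed": importlib.util.find_spec("torch") is not None,
        "torchvision installed": importlib.util.find_spec("torchvision") is not None,
        # Assumption: nvidia-smi on PATH hints that the driver is installed
        "nvidia-smi found": shutil.which("nvidia-smi") is not None,
    }


for name, ok in check_assumptions().items():
    print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

Any line marked MISSING points at an assumption that does not hold on the current machine.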

## Dependencies from apt

We need curl, wget and git. The command below installs whichever of them is missing.

In [13]:
! apt install -y curl wget git

Reading package lists... Done
Building dependency tree       
Reading state information... Done
curl is already the newest version (7.58.0-2ubuntu3.16).
git is already the newest version (1:2.17.1-1ubuntu0.9).
The following NEW packages will be installed:
  wget
0 upgraded, 1 newly installed, 0 to remove and 3 not upgraded.
Need to get 316 kB of archives.
After this operation, 954 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 wget amd64 1.19.4-1ubuntu2.2 [316 kB]
Fetched 316 kB in 1s (343 kB/s)
debconf: delaying package configuration, since apt-utils is not installed

Selecting previously unselected package wget.
(Reading database ... 9787 files and directories currently installed.)
Preparing to unpack .../wget_1.19.4-1ubuntu2.2_amd64.deb ...

## Install required packages using pip

(pip has to be installed.)

We install:

 - taming-transformers (source code of the Taming Transformers paper)
 - TensorFlow (for super-resolution)
 - Image Super-Resolution (ISR) package
 - CLIP source code
 - PyTorch Lightning
 - OmegaConf

In [9]:
!pip install omegaconf pytorch-lightning tensorflow
!pip install git+https://github.com/bfirsh/taming-transformers.git
!pip install git+https://github.com/openai/CLIP.git
!pip install git+https://github.com/idealo/image-super-resolution.git



Collecting git+https://github.com/bfirsh/taming-transformers.git
  Cloning https://github.com/bfirsh/taming-transformers.git to /tmp/pip-req-build-h5cl1p7l
  Running command git clone -q https://github.com/bfirsh/taming-transformers.git /tmp/pip-req-build-h5cl1p7l
  Resolved https://github.com/bfirsh/taming-transformers.git to commit 8ec57d77c125b19f8f6c047496fd0216db3b700f
Collecting git+https://github.com/openai/CLIP.git
  Cloning https://github.com/openai/CLIP.git to /tmp/pip-req-build-41ifntb3
  Running command git clone -q https://github.com/openai/CLIP.git /tmp/pip-req-build-41ifntb3
  Resolved https://github.com/openai/CLIP.git to commit 40f5484c1c74edd83cb9cf687c6ab92b28d8b656
Collecting git+https://github.com/idealo/image-super-resolution.git
  Cloning https://github.com/idealo/image-super-resolution.git to /tmp/pip-req-build-ss87y1_b
  Running command git clone -q https://github.com/idealo/image-super-resolution.git /tmp/pip-req-build-ss87y1_b
  Resolved https://github.com/id

Building wheels for collected packages: ISR
  Building wheel for ISR (setup.py) ... done
  Created wheel for ISR: filename=ISR-2.2.0-py3-none-any.whl size=33513 sha256=da0fd5d82f28a20309a7649a55f71becd458b3cef28738ac1fa2871bfe1506d4
  Stored in directory: /tmp/pip-ephem-wheel-cache-u3usms2w/wheels/dc/fc/9d/b8d248780705e5bdf35cc9fbaa30f0f2c583e4f02275e73d27
Successfully built ISR
Installing collected packages: h5py, pyaml, imageio, ISR
  Attempting uninstall: h5py
    Found existing installation: h5py 3.6.0
    Uninstalling h5py-3.6.0:
      Successfully uninstalled h5py-3.6.0
Successfully installed ISR-2.2.0 h5py-2.10.0 imageio-2.16.1 pyaml-21.10.1
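
After the installation it can be useful to confirm that every package is importable. The sketch below only looks each module up without importing it. The import names (`taming`, `clip`, `ISR`, and so on) are assumptions based on how these projects are usually imported, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the packages installed above
MODULES = ["omegaconf", "pytorch_lightning", "tensorflow", "taming", "clip", "ISR"]


def missing_modules(names):
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing modules:", missing_modules(MODULES) or "none")
```

Note that the pip output above shows ISR replacing h5py 3.6.0 with 2.10.0, so other code that relies on the newer h5py may need attention.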


## Download AI models

We will download them to the weights directory, next to this notebook.

First we download the CLIP model. We pick the ViT-L-14 variant, which is larger than ViT-B-32 and generally performs better.

In [10]:
!mkdir -p weights
!curl -o weights/ViT-L-14.pt https://openaipublic.azureedge.net/clip/models/b8cca3fd41ae0c99ba7e8951adf17d267cdb84cd88be6f7c2e0eca1737a03836/ViT-L-14.pt

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  889M  100  889M    0     0  28.6M      0  0:00:31  0:00:31 --:--:-- 31.6M
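
The CLIP download URL contains a 64-character hexadecimal segment just before the file name. We assume here that it is the SHA-256 checksum of the file, which is how the CLIP package appears to verify its own downloads. Under that assumption, a sketch for checking the downloaded file could look like this (`expected_sha256` and `sha256_of` are hypothetical helper names):

```python
import hashlib
from pathlib import Path

CLIP_URL = (
    "https://openaipublic.azureedge.net/clip/models/"
    "b8cca3fd41ae0c99ba7e8951adf17d267cdb84cd88be6f7c2e0eca1737a03836/ViT-L-14.pt"
)


def expected_sha256(url):
    """Return the second-to-last path segment, assumed to be the checksum."""
    return url.split("/")[-2]


def sha256_of(path, chunk_size=1 << 20):
    """Hash the file in chunks so that large files do not fill up memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


weights = Path("weights/ViT-L-14.pt")
if weights.exists():
    print("checksum matches:", sha256_of(weights) == expected_sha256(CLIP_URL))
else:
    print(f"{weights} not found, nothing to verify")
```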


Now we download the VQGAN model. Which model we choose depends on what we plan to generate.
Each cell below downloads one model. Together the models take around 17.7GB, so it is recommended to download only the ones you plan to use.
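
To decide which models fit on disk, we can tally the checkpoint sizes before downloading. The sketch below uses the sizes that curl reports in the outputs of the download cells in this notebook (curl's "M" is treated as MiB). The dictionary keys, the helper names and the example selection are hypothetical.

```python
import shutil

# Checkpoint sizes in MiB, taken from the curl outputs in this notebook
MODEL_SIZES_MIB = {
    "imagenet_16384": 934,
    "coco": 8045,
    "faceshq": 3789,
    "wikiart_16384": 958,
    "sflckr": 4065,
}


def required_mib(selected):
    """Sum the checkpoint sizes of the selected models."""
    return sum(MODEL_SIZES_MIB[name] for name in selected)


def enough_space(selected, path="."):
    """Compare the required space with the free space on the disk holding path."""
    free_mib = shutil.disk_usage(path).free // (1024 * 1024)
    return free_mib >= required_mib(selected)


selected = ["imagenet_16384", "wikiart_16384"]  # hypothetical choice
print(required_mib(selected), "MiB needed; fits on disk:", enough_space(selected))
```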

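ImageNet 16384 model - ImageNet is a large dataset of images covering many different object classes. The 16384 in the model name is the size of the VQGAN codebook, not the number of object classes. The model is good for generating objects.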

In [3]:
!curl -L -o weights/vqgan_imagenet_f16_16384.ckpt -C - 'https://heibox.uni-heidelberg.de/f/867b05fc8c4841768640/?dl=1'
!curl -L -o weights/vqgan_imagenet_f16_16384.yaml -C - 'https://heibox.uni-heidelberg.de/f/274fb24ed38341bfa753/?dl=1'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  934M  100  934M    0     0  14.8M      0  0:01:03  0:01:03 --:--:-- 14.9M
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   692  100   692    0     0   1272      0 --:--:-- --:--:-- --:--:--  1272


COCO model - The dataset contains objects in everyday environments. Another model that is good for generating objects.

In [4]:
!curl -L -o weights/coco.yaml -C - 'https://dl.nmkd.de/ai/clip/coco/coco.yaml'
!curl -L -o weights/coco.ckpt -C - 'https://dl.nmkd.de/ai/clip/coco/coco.ckpt'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1980    0     0   2625      0 --:--:-- --:--:-- --:--:--  2625
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 8045M  100 8045M    0     0  29.6M      0  0:04:31  0:04:31 --:--:-- 26.4M


FacesHQ model - Dataset containing faces. In principle the model is good for generating portraits. In practice expect monsters, because this method does not generate good portraits.

In [4]:
!curl -L -o weights/faceshq.yaml -C - 'https://drive.google.com/uc?export=download&id=1fHwGx_hnBtC8nsq7hesJvs-Klv-P0gzT'
!curl -L -o weights/faceshq.ckpt -C - 'https://app.koofr.net/content/links/a04deec9-0c59-4673-8b37-3d696fe63a5d/files/get/last.ckpt?path=%2F2020-11-13T21-41-45_faceshq_transformer%2Fcheckpoints%2Flast.ckpt'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  1451  100  1451    0     0   1545      0 --:--:-- --:--:-- --:--:--  1545
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3789M  100 3789M    0     0  32.3M      0  0:01:57  0:01:57 --:--:-- 59.7M
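
File-sharing links sometimes answer with an HTML page (for example a confirmation or error page) instead of the file itself, and curl saves that page under the requested name. A quick sanity check, sketched below with a hypothetical helper `looks_like_html`, is to look at the first bytes of each downloaded file.

```python
from pathlib import Path


def looks_like_html(path, probe_bytes=512):
    """Return True if the file starts like an HTML document."""
    with open(path, "rb") as f:
        head = f.read(probe_bytes).lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


for name in ["weights/faceshq.yaml", "weights/faceshq.ckpt"]:
    path = Path(name)
    if path.exists():
        status = "HTML page, download again" if looks_like_html(path) else "looks fine"
        print(f"{name}: {status}")
    else:
        print(f"{name}: not downloaded")
```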


Wikiart model - Dataset containing paintings. Model will be good for generating paintings.

In [5]:
!curl -L -o weights/wikiart_16384.ckpt -C - 'http://eaidata.bmk.sh/data/Wikiart_16384/wikiart_f16_16384_8145600.ckpt'
!curl -L -o weights/wikiart_16384.yaml -C - 'http://eaidata.bmk.sh/data/Wikiart_16384/wikiart_f16_16384_8145600.yaml'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  958M  100  958M    0     0  7304k      0  0:02:14  0:02:14 --:--:-- 6968k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   920  100   920    0     0   2058      0 --:--:-- --:--:-- --:--:--  2058


Flickr model - Dataset containing a lot of landscapes. Model will be good for generating landscapes.

In [11]:
!curl -L -o weights/sflckr.yaml -C - 'https://heibox.uni-heidelberg.de/d/73487ab6e5314cb5adba/files/?p=%2Fconfigs%2F2020-11-09T13-31-51-project.yaml&dl=1'
!curl -L -o weights/sflckr.ckpt -C - 'https://heibox.uni-heidelberg.de/d/73487ab6e5314cb5adba/files/?p=%2Fcheckpoints%2Flast.ckpt&dl=1'

** Resuming transfer from byte position 1603
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
** Resuming transfer from byte position 516096
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 4065M  100 4065M    0     0  14.7M      0  0:04:35  0:04:35 --:--:-- 14.9M
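
The `-C -` option used in the download cells tells curl to look at the existing output file and continue from where it stopped, which is why the output above reports "Resuming transfer from byte position ...". Over HTTP this is done with a Range header. The sketch below illustrates the same idea with the standard library. It is not what curl does internally, and the function names are hypothetical.

```python
import os
import shutil
import urllib.request


def build_resume_request(url, dest):
    """Build a request that asks only for the bytes missing from dest."""
    offset = os.path.getsize(dest) if os.path.exists(dest) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        request.add_header("Range", f"bytes={offset}-")
    return request, offset


def resume_download(url, dest):
    """Append the missing bytes to dest, or start over if ranges are unsupported."""
    request, _ = build_resume_request(url, dest)
    # Note: a server may reject the range of an already complete file with
    # HTTP 416, which urlopen raises as urllib.error.HTTPError.
    with urllib.request.urlopen(request) as response:
        # Status 206 means the server honoured the Range header
        mode = "ab" if response.status == 206 else "wb"
        with open(dest, mode) as out:
            shutil.copyfileobj(response, out)
```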


Super-resolution models. These are pretrained weights for the Image Super-Resolution (ISR) package installed above.
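
The names of the two weight files downloaded in the next cell appear to encode the network hyperparameters (C, D, G, G0 and the upscaling factor x). We assume these match the `arch_params` dictionary that ISR's RDN model expects when the weights are loaded later. A small sketch with a hypothetical helper `parse_rdn_name` extracts them from a file name:

```python
import re


def parse_rdn_name(filename):
    """Extract RDN hyperparameters from an ISR weight file name."""
    match = re.search(r"rdn-C(\d+)-D(\d+)-G(\d+)-G0(\d+)-x(\d+)", filename)
    if match is None:
        raise ValueError(f"unrecognised RDN weight file name: {filename}")
    c, d, g, g0, x = (int(value) for value in match.groups())
    return {"C": c, "D": d, "G": g, "G0": g0, "x": x}


print(parse_rdn_name("rdn-C3-D10-G64-G064-x2_PSNR_epoch134.hdf5"))
print(parse_rdn_name("rdn-C6-D20-G64-G064-x2_ArtefactCancelling_epoch219.hdf5"))
```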

In [14]:
!wget https://public-asai-dl-models.s3.eu-central-1.amazonaws.com/ISR/rdn-C3-D10-G64-G064-x2/PSNR-driven/rdn-C3-D10-G64-G064-x2_PSNR_epoch134.hdf5
!wget https://public-asai-dl-models.s3.eu-central-1.amazonaws.com/ISR/rdn-C6-D20-G64-G064-x2/ArtefactCancelling/rdn-C6-D20-G64-G064-x2_ArtefactCancelling_epoch219.hdf5

--2022-03-27 18:32:50--  https://public-asai-dl-models.s3.eu-central-1.amazonaws.com/ISR/rdn-C3-D10-G64-G064-x2/PSNR-driven/rdn-C3-D10-G64-G064-x2_PSNR_epoch134.hdf5
Resolving public-asai-dl-models.s3.eu-central-1.amazonaws.com (public-asai-dl-models.s3.eu-central-1.amazonaws.com)... 52.219.170.46
Connecting to public-asai-dl-models.s3.eu-central-1.amazonaws.com (public-asai-dl-models.s3.eu-central-1.amazonaws.com)|52.219.170.46|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10694096 (10M) [binary/octet-stream]
Saving to: ‘rdn-C3-D10-G64-G064-x2_PSNR_epoch134.hdf5’


2022-03-27 18:32:54 (3.19 MB/s) - ‘rdn-C3-D10-G64-G064-x2_PSNR_epoch134.hdf5’ saved [10694096/10694096]

--2022-03-27 18:32:54--  https://public-asai-dl-models.s3.eu-central-1.amazonaws.com/ISR/rdn-C6-D20-G64-G064-x2/ArtefactCancelling/rdn-C6-D20-G64-G064-x2_ArtefactCancelling_epoch219.hdf5
Resolving public-asai-dl-models.s3.eu-central-1.amazonaws.com (public-asai-dl-models.s3.eu-central-1.amazon

In [15]:
!mv *.hdf5 weights
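
As a last step we can list which of the files from this notebook actually ended up in the weights directory. This is a sketch: the file names are the ones used in the download cells above, `missing_files` is a hypothetical helper, and the list should be trimmed to the models you chose to download.

```python
from pathlib import Path

# File names used by the download cells in this notebook
EXPECTED_FILES = [
    "ViT-L-14.pt",
    "vqgan_imagenet_f16_16384.ckpt",
    "vqgan_imagenet_f16_16384.yaml",
    "coco.ckpt",
    "coco.yaml",
    "faceshq.ckpt",
    "faceshq.yaml",
    "wikiart_16384.ckpt",
    "wikiart_16384.yaml",
    "sflckr.ckpt",
    "sflckr.yaml",
    "rdn-C3-D10-G64-G064-x2_PSNR_epoch134.hdf5",
    "rdn-C6-D20-G64-G064-x2_ArtefactCancelling_epoch219.hdf5",
]


def missing_files(directory, expected):
    """Return the expected file names that are not present in directory."""
    directory = Path(directory)
    return [name for name in expected if not (directory / name).is_file()]


print("Missing from weights:", missing_files("weights", EXPECTED_FILES) or "nothing")
```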