Josh and I have talked about using Stable Diffusion to generate a new version of the logo on our
landing page every day. Stable Diffusion alone hasn't been able to generate anything meaningful, so
let's try creating seed images with DalleMini and then using Stable Diffusion img2img to refine them.

_Based on: https://github.com/huggingface/diffusers/blob/main/README.md#image-to-image-text-guided-generation-with-stable-diffusion_

In [None]:
!pip install git+https://github.com/run-house/runhouse.git@latest_patch

In [1]:
import runhouse as rh
import torch
from PIL import Image
import random

INFO | 2022-12-21 08:40:20,340 | Loaded Runhouse config from /root/.rh/config.yaml


In [None]:
!runhouse login

### Log in to Runhouse to load secrets.

In [None]:
# You can add token=<your token> if you want to be able to run this without pasting into stdin
rh.login(download_secrets=True, download_config=True)

In [None]:
# Only if you're using GCP and running inside Colab!
!gcloud init
!gcloud auth application-default login
!cp -r /content/.config/* ~/.config/gcloud

In [None]:
# Check that secrets are loaded in properly and at least one cloud is ready to use.
!sky check

First, let's try DalleMini on its own. It runs best on an A100, but AWS doesn't offer single A100s (only clusters of 8),
so let's run it on GCP.
gcp_gpu = rh.cluster(name='a100', instance_type='A100:1', provider='gcp', use_spot=False, autostop_mins=60)

In [2]:
def dm_generate(prompt, num_images_sqrt=1, supercondition_factor=32, is_mega=True, seed=50, top_k=64):
    from min_dalle import MinDalle
    import torch
    from PIL import Image
    torch.cuda.empty_cache()
    torch.set_grad_enabled(False)  # a bare torch.no_grad() call has no effect outside a `with` block
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cuda.matmul.allow_fp16_reduced_precision_reduction = True
    dalle = MinDalle(device='cuda', is_mega=is_mega, is_reusable=False, dtype=torch.float16)
    images = dalle.generate_images(prompt, seed=seed, grid_size=num_images_sqrt,
                                   temperature=1, top_k=top_k, supercondition_factor=supercondition_factor)
    images = images.to(torch.uint8).to('cpu').numpy()
    return [Image.fromarray(images[i]) for i in range(num_images_sqrt**2)]

In [None]:
gcp_gpu = rh.cluster(name='a100', instance_type='A100:1', provider='gcp', use_spot=False)
# gcp_gpu = rh.cluster(name='a10g', instance_type='A10G:1', provider='cheapest', use_spot=False)
dm_generate_gpu = rh.send(fn=dm_generate, 
                          hardware=gcp_gpu,
                          reqs=['./', 'torch', 'min-dalle'],
                          load_from=['rns'], save_to=['rns'],
                          name='dm_generate')

In [None]:
rh_prompt = 'A digital illustration of a woman running on the roof of a house.'

seed = random.randint(0, 1000)
rh_logo_dm_images = dm_generate_gpu(rh_prompt, 
                                    seed=seed,
                                    is_mega=False,
                                    num_images_sqrt=2,
                                    supercondition_factor=256)
[image.show() for image in rh_logo_dm_images]
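
Because the goal is a fresh logo every day, it may be handy to derive the seed from the date rather than from `random.randint`, so that a given day's image can be regenerated later. Below is a minimal sketch, assuming a hypothetical helper `daily_seed` (not part of Runhouse or min-dalle) and the same 0 to 1000 range used above:

```python
import datetime
import hashlib


def daily_seed(day=None, max_seed=1000):
    """Map a calendar date to a stable integer seed in [0, max_seed].

    Hypothetical helper for illustration; not part of Runhouse or min-dalle.
    """
    day = day or datetime.date.today()
    # Hash the ISO date string so consecutive days give unrelated seeds.
    digest = hashlib.sha256(day.isoformat().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (max_seed + 1)


# The same date always yields the same seed.
seed = daily_seed(datetime.date(2022, 12, 21))
```

The result could then be passed as `seed=daily_seed()` in the call above.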

Now let's try feeding those images into StableDiffusionImg2Img. We could put this on the A100, but it might OOM,
so let's put it on a V100 on AWS.

In [None]:
def sd_img2img_generate(prompt, base_images, num_images=1,
                        steps=100, strength=0.75, guidance_scale=7.5, model_id="stabilityai/stable-diffusion-2-base"):
    from diffusers import StableDiffusionImg2ImgPipeline
    import torch
    torch.cuda.empty_cache()
    torch.set_grad_enabled(False)  # a bare torch.no_grad() call has no effect outside a `with` block
    sd_pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id)
    sd_pipe = sd_pipe.to('cuda')
    ret = []
    for image in base_images:
        ret = ret + sd_pipe([prompt] * num_images, init_image=image.resize((512, 512)),
                            num_inference_steps=steps, strength=strength,
                            guidance_scale=guidance_scale).images
    return ret

sd_img2img_generate_gpu = rh.send(fn=sd_img2img_generate, hardware='a100',
                                  reqs=['./', 'transformers', 'diffusers'],
                                  load_secrets=True,
                                  load_from=['rns'], save_to=['rns'],
                                  name='sd_img2img_generate')

In [None]:
rh_logo_dm2sd_images = sd_img2img_generate_gpu(rh_prompt, rh_logo_dm_images, strength=.75,
                                               guidance_scale=7.5, steps=25)
[image.show() for image in rh_logo_dm2sd_images]
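
One thing to keep in mind when tuning `steps` and `strength`: in the diffusers img2img pipeline, `strength` controls how much of the noise schedule is actually run, so the number of denoising steps performed is roughly `int(steps * strength)`. The sketch below is only an approximation for building intuition. The helper name `approx_denoising_steps` is hypothetical, and the exact count may differ slightly between diffusers versions.

```python
def approx_denoising_steps(steps, strength):
    """Approximate number of denoising steps img2img actually runs.

    Hypothetical helper; applies the min(int(steps * strength), steps)
    rule of thumb rather than calling diffusers itself.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(steps * strength), steps)


# The settings used in the cell above: 25 steps at strength 0.75.
print(approx_denoising_steps(25, 0.75))  # 18
```

So lowering `strength` both keeps the output closer to the seed image and shortens the run.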

Now let's test passing an existing Runhouse logo image to StableDiffusionImg2Img.

In [None]:
rh_base_image = Image.open('rh_logo.png').convert("RGB").resize((512, 512))
rh_logo_sd_images = sd_img2img_generate_gpu(rh_prompt, [rh_base_image],
                                            strength=.5, guidance_scale=5,
                                            num_images=2, steps=100)
[display(image) for image in rh_logo_sd_images]
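
Note that `.resize((512, 512))` stretches a non-square image. If the logo file isn't square, one option is to center-crop it before resizing. Below is a minimal sketch, assuming a hypothetical helper `center_square_box` whose return value follows the `(left, upper, right, lower)` box format accepted by PIL's `Image.crop`:

```python
def center_square_box(width, height):
    """Return (left, upper, right, lower) for the largest centered square.

    Hypothetical helper; the tuple matches the box format of PIL's Image.crop.
    """
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


# Example: an 800x600 image is cropped to its central 600x600 region.
print(center_square_box(800, 600))  # (100, 0, 700, 600)
```

With PIL this would look like `img.crop(center_square_box(*img.size)).resize((512, 512))`.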