# **Interactive notebook for Stable Diffusion**

<img src='https://drive.google.com/uc?id=18fVPcbmSnLdzp2dGWU5gPrNMhitKKoPV'>

This notebook is a version of the [interactive Stable Diffusion notebook](https://github.com/cpacker/stable-diffusion/blob/interactive-notebook/scripts/stable_diffusion_interactive.ipynb) modified for Google Colab. The goal of the notebooks (both Jupyter Lab and Colab versions) is to make generating images with a local copy of Stable Diffusion as easy as it is with the Discord bot.

Note that this notebook assumes that **you already have access to the Stable Diffusion weights** (you must provide the weights yourself).

---

On a Tesla T4 assigned by Colab (using the free version, not Pro or Pro+), **it takes approximately 90s (1.5m) to run** [**the example prompt from the original README**](https://github.com/CompVis/stable-diffusion#stable-diffusion-v1) (`python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms`), which generates 6 images at 512x512 resolution (shown above).

This does not count the install process, which should take ~3-10m. Additionally, the first time you run the model it will take slightly longer since it requires additional downloads (subsequent runs should be ~90s on a T4).

If you have access to your own powerful GPU (e.g., RTX 3090), I'd recommend using the [Jupyter Lab notebook](https://github.com/cpacker/stable-diffusion/blob/interactive-notebook/scripts/stable_diffusion_interactive.ipynb) or [running this Colab notebook locally](https://research.google.com/colaboratory/local-runtimes.html) - it will likely be significantly faster than the free GPUs on Colab.

---

This notebook was tested using [commit ce05de2](https://github.com/CompVis/stable-diffusion/commit/ce05de28194041e030ccfc70c635fe3707cdfc30) of the Stable Diffusion repo and the v1.3 model weights (`sd-v1-3.ckpt`).

For suggestions and bug reports regarding **this particular notebook**, leave a comment (or open a pull request) [here](https://github.com/cpacker/stable-diffusion/pull/1). For questions regarding Stable Diffusion in general, contact the original authors at https://github.com/CompVis/stable-diffusion.

## Clone repo and install packages

*This will take a few minutes.*

In [None]:
#@title Run once { display-mode: "form" }

# clone repo
!git clone https://github.com/CompVis/stable-diffusion.git
%cd ./stable-diffusion

# base colab installs cause issues in lightning.seed_everything
!pip uninstall -y torchtext

# Copy-pasta of https://github.com/cpacker/stable-diffusion/blob/main/environment.yaml
# But try skipping the torch and cudatoolkit installs
#!pip install numpy==1.19.2  # omit, causes this issue: https://stackoverflow.com/questions/66060487/valueerror-numpy-ndarray-size-changed-may-indicate-binary-incompatibility-exp
!pip install albumentations==0.4.3
!pip install opencv-python==4.1.2.30
!pip install pudb==2019.2
!pip install imageio==2.9.0
!pip install imageio-ffmpeg==0.4.2
!pip install pytorch-lightning==1.4.2
!pip install omegaconf==2.1.1
!pip install "test-tube>=0.7.5"
!pip install "streamlit>=0.73.1"
!pip install einops==0.3.0
!pip install torch-fidelity==0.3.0
!pip install transformers==4.19.2
!pip install torchmetrics==0.6.0
!pip install kornia==0.6
!pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers
!pip install -e git+https://github.com/openai/CLIP.git@main#egg=clip
!pip install -e .

# Check what GPU we're using
!nvidia-smi

# Colab broke widget support on 8/19/2022, here's the temp fix:
# https://github.com/googlecolab/colabtools/issues/3020


In [None]:
!pip install "ipywidgets>=7,<8"
!pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers

Upload your copy of the weights (e.g., `sd-v1-3.ckpt`) to a folder on your Drive called `stable-diffusion-checkpoints`. If you store the weights in a different folder or under a different filename, update the source path in the `ln -s` command below to match.

You can also mount the Drive folder using the Colab file browser.
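
Before creating the symlink, it can help to confirm that the checkpoint file is actually visible from the runtime. The following is a minimal sketch, not part of the original notebook: `find_checkpoint` is a hypothetical helper, and both candidate paths are assumptions that should be replaced with the location of your own weights.

```python
from pathlib import Path

def find_checkpoint(candidates):
    """Return the first candidate path that exists as a file, or None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_file():
            return path
    return None

# Hypothetical locations; adjust to wherever your weights actually live.
candidate_paths = [
    "/content/drive/MyDrive/stable-diffusion-checkpoints/sd-v1-3.ckpt",
    "/stable-diffusion-checkpoints/sd-v1-3-full-ema.ckpt",
]

ckpt = find_checkpoint(candidate_paths)
print(ckpt if ckpt else "No checkpoint found; check the Drive mount and the path.")
```

If this prints the fallback message, the symlink would point at a missing file and loading the model would likely fail later.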

If you haven't already, generate an access token for HuggingFace [here](https://huggingface.co/settings/tokens).

Then run the following code but change `HUGG_USER_NAME` and `HUGG_TOKEN` to your HuggingFace username and token.

In [None]:

!mkdir -p models/ldm/stable-diffusion-v1/
!ln -s /stable-diffusion-checkpoints/sd-v1-3-full-ema.ckpt models/ldm/stable-diffusion-v1/model.ckpt
!ls -l models/ldm/stable-diffusion-v1



## Model and widget setup code (run once)

Before running this - **make sure you restarted your runtime (after doing the install)!**

`Runtime -> Restart Runtime`

In [None]:
!pip install rich

In [None]:
#@title Run once { display-mode: "form" }

# Slightly modified version of: https://github.com/CompVis/stable-diffusion/blob/main/scripts/txt2img.py
import argparse, os, sys, glob    
import torch    
import numpy as np    
from omegaconf import OmegaConf    
from PIL import Image    
#from tqdm.auto import tqdm, trange  # NOTE: updated for notebook
from tqdm import tqdm, trange  # NOTE: updated for notebook
from itertools import islice    
from einops import rearrange    
from torchvision.utils import make_grid    
import time    
import rich
from pytorch_lightning import seed_everything    
from torch import autocast    
from contextlib import contextmanager, nullcontext    
    
from ldm.util import instantiate_from_config    
from ldm.models.diffusion.ddim import DDIMSampler    
from ldm.models.diffusion.plms import PLMSSampler
from scripts.txt2img import chunk, load_model_from_config

from IPython.display import clear_output

# Code to turn kwargs into Jupyter widgets
import ipywidgets as widgets
from collections import OrderedDict

if torch.cuda.is_available():
    device = torch.device("cuda")
else:
    print("Warning - running in CPU mode!")
    device = torch.device("cpu")

def load_model(opt):
    """Separates the loading of the model from the inference"""
    
    # if opt.laion400m:
    #     print("Falling back to LAION 400M model...")
    #     opt.config = "configs/latent-diffusion/txt2img-1p4B-eval.yaml"
    #     opt.ckpt = "models/ldm/text2img-large/model.ckpt"
    #     opt.outdir = "outputs/txt2img-samples-laion400m"

    config = OmegaConf.load(f"{opt.config}")
    model = load_model_from_config(config, f"{opt.ckpt}")

    model = model.to(device)
    
    return model

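def slerp(t, v0, v1, DOT_THRESHOLD=0.9995):
    """Helper function to spherically interpolate between two arrays, v0 and v1."""

    # Track whether the inputs were torch tensors so the result can be converted back.
    # Initialising the flag avoids a NameError when NumPy arrays are passed in.
    inputs_are_torch = False
    if not isinstance(v0, np.ndarray):
        inputs_are_torch = True
        input_device = v0.device
        v0 = v0.cpu().numpy()
        v1 = v1.cpu().numpy()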

    dot = np.sum(v0 * v1 / (np.linalg.norm(v0) * np.linalg.norm(v1)))
    if np.abs(dot) > DOT_THRESHOLD:
        v2 = (1 - t) * v0 + t * v1
    else:
        theta_0 = np.arccos(dot)
        sin_theta_0 = np.sin(theta_0)
        theta_t = theta_0 * t
        sin_theta_t = np.sin(theta_t)
        s0 = np.sin(theta_0 - theta_t) / sin_theta_0
        s1 = sin_theta_t / sin_theta_0
        v2 = s0 * v0 + s1 * v1

    if inputs_are_torch:
        v2 = torch.from_numpy(v2).to(input_device)

    return v2

def diffuse(opt, model, sampler, batch_size, start_code, c):
    uc = None
    if opt.scale != 1.0:
        uc = model.get_learned_conditioning(batch_size * [""])
    shape = [opt.C, opt.H // opt.f, opt.W // opt.f]
    samples_ddim, _ = sampler.sample(S=opt.ddim_steps,
                                                         conditioning=c,
                                                         batch_size=opt.n_samples,
                                                         shape=shape,
                                                         verbose=False,
                                                         unconditional_guidance_scale=opt.scale,
                                                         unconditional_conditioning=uc,
                                                         eta=opt.ddim_eta,
                                                         x_T=start_code)
    print("samples_ddim", samples_ddim.shape)
    x_samples_ddim = model.decode_first_stage(samples_ddim)
    x_samples_ddim = torch.clamp((x_samples_ddim + 1.0) / 2.0, min=0.0, max=1.0)
    return x_samples_ddim


def run_inference(opt, model):
    """Separates the loading of the model from the inference
    
    Additionally, slightly modified to display generated images inline
    """
    seed_everything(opt.seed)

    if opt.plms:
        sampler = PLMSSampler(model)
    else:
        sampler = DDIMSampler(model)

    os.makedirs(opt.outdir, exist_ok=True)
    outpath = opt.outdir

    batch_size = opt.n_samples
    n_rows = opt.n_rows if opt.n_rows > 0 else batch_size

    
    prompts = opt.prompts

    print("embedding prompts")
    cs = [model.get_learned_conditioning(prompt) for prompt in prompts]
    print("done embedding prompts",cs)

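    # Repeat each embedding along the batch dimension. Multiplying the tensor by
    # batch_size would scale its values instead of replicating it.
    datas = [[c.repeat(batch_size, 1, 1)] for c in cs]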

    print("datas", len(datas), len(datas[0]), len(datas[0][0]), len(datas[0][0][0]),len(datas[0][0][0][0]))
    #return

    
    samples_path = os.path.join(outpath, "samples")
    os.makedirs(samples_path, exist_ok=True)

    run_count = len(os.listdir(samples_path)) + 1

    sample_path = os.path.join(samples_path, f"run_{run_count}")
    os.makedirs(sample_path, exist_ok=True)
    
    base_count = len(os.listdir(sample_path))
    
    grid_count = len(os.listdir(outpath)) - 1

    start_code = None
    if opt.fixed_code:
        start_code = torch.randn([opt.n_samples, opt.C, opt.H // opt.f, opt.W // opt.f], device=device)

    precision_scope = autocast if opt.precision=="autocast" else nullcontext
    with torch.no_grad():
        with precision_scope("cuda"):
            with model.ema_scope():
                tic = time.time()
                all_samples = list()
                for n in trange(opt.n_iter, desc="Sampling"):
                    for data_a,data_b in zip(datas,datas[1:]):
                        for t in np.linspace(0, 1, opt.num_interpolation_steps):
                            print("data_a",data_a)
                            data = [slerp(float(t), data_a[0], data_b[0])]
                            for c in tqdm(data, desc="data"):

                                print("learned_conditioning",c.shape)
                                x_samples_ddim = diffuse(opt, model, sampler, batch_size, start_code, c)

                                if not opt.skip_save:
                                    for x_sample in x_samples_ddim:
                                        x_sample = 255. * rearrange(x_sample.cpu().numpy(), 'c h w -> h w c')
                                        Image.fromarray(x_sample.astype(np.uint8)).save(
                                            os.path.join(sample_path, f"{base_count:05}.png"))
                                        base_count += 1

                                if not opt.skip_grid:
                                    all_samples.append(x_samples_ddim)

                if not opt.skip_grid:
                    # additionally, save as grid
                    grid = torch.stack(all_samples, 0)
                    grid = rearrange(grid, 'n b c h w -> (n b) c h w')
                    grid = make_grid(grid, nrow=n_rows)

                    # to image
                    grid = 255. * rearrange(grid, 'c h w -> h w c').cpu().numpy()
                    Image.fromarray(grid.astype(np.uint8)).save(os.path.join(outpath, f'grid-{grid_count:04}.png'))
                    grid_count += 1
                    
                    # display
                    if opt.display_inline:
                        clear_output()
                        display(Image.fromarray(grid.astype(np.uint8)))

                toc = time.time()

    print(f"Your samples have been saved to: \n{outpath} \n"
          f" \nEnjoy.")




class WidgetDict2(OrderedDict):
    def __getattr__(self,val):
        return self[val]


# Package into box and render
#primary_options = ['prompt', 'outdir']  # options to put up top
#secondary_options = [k for k in options.keys() if k not in primary_options]  # rest, ordered by insertion

# NOTE: `options` is defined in the next cell, so that cell must be run before these lines.
load_options = ['config', 'ckpt']
inference_options = [k for k in options.keys() if k not in load_options]  # rest, ordered by insertion
assert all([k in inference_options + load_options for k in options.keys()])  # make sure we didn't miss any options
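
To make the behaviour of `slerp` above easier to check by hand, here is a NumPy-only sketch of the same formula applied to two orthogonal unit vectors. `slerp_np` is a name introduced for this illustration only; it omits the torch conversion and is not a replacement for the function used by `run_inference`.

```python
import numpy as np

def slerp_np(t, v0, v1, dot_threshold=0.9995):
    """Spherical interpolation between two NumPy arrays (illustrative sketch)."""
    # Cosine of the angle between the two (flattened) arrays
    dot = np.sum(v0 * v1) / (np.linalg.norm(v0) * np.linalg.norm(v1))
    if np.abs(dot) > dot_threshold:
        # Nearly parallel inputs: fall back to linear interpolation
        return (1 - t) * v0 + t * v1
    theta_0 = np.arccos(dot)      # angle between v0 and v1
    theta_t = theta_0 * t         # angle travelled at fraction t
    s0 = np.sin(theta_0 - theta_t) / np.sin(theta_0)
    s1 = np.sin(theta_t) / np.sin(theta_0)
    return s0 * v0 + s1 * v1

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp_np(0.5, a, b)
print(mid, np.linalg.norm(mid))
```

For these unit-length inputs the midpoint keeps unit norm, whereas plain linear interpolation would give a vector of norm about 0.71. Keeping the norm roughly constant is the usual motivation for interpolating conditioning vectors spherically.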



In [137]:


# args from argparse converted to widgets:
# https://github.com/CompVis/stable-diffusion/blob/main/scripts/txt2img.py#L48-L177

sins = ["Lust","Gluttony","Greed","Sloth","Wrath","Envy","Pride"]

options = WidgetDict2()
options['prompts'] = [f"{sin} by mati klarwein" for sin in sins]
#    "anatomical illustration of a jellyfish by ernst haeckel",
    # "anatomical illustration of an alien by ernst haeckel",
    # "anatomical illustration of an octopus by ernst haeckel",
    # "anatomical illustration of an angel by ernst haeckel",
    # "anatomical illustration of a buddha by ernst haeckel",
    # "anatomical illustration of a biological cell by ernst haeckel",    
#    ]
options['outdir'] ="outputs/txt2img-samples"
options['skip_grid'] = True
options['skip_save'] = False
options['ddim_steps'] = 50
options['plms'] = True
options['laion400m'] = False
options['fixed_code'] = True
options['ddim_eta'] = 0.0
options['n_iter'] = 1
options['H'] = 512
options['W'] = 512
options['C'] = 4
options['f'] = 8
options['n_samples'] = 1
options['n_rows'] = 0
options['scale'] = 7.5
options['from_file'] = None
options['config'] = "configs/stable-diffusion/v1-inference.yaml"
options['ckpt'] ="models/ldm/stable-diffusion-v1/model.ckpt"
options['seed'] = 66
options['precision'] = "autocast"  # or "full"
# Extra option for the notebook
options['display_inline'] = False
options['num_interpolation_steps'] = 300
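
With these settings the run is long, so it may be worth estimating its size first. The sketch below is a rough back-of-the-envelope calculation rather than part of the notebook: `interpolation_budget` and `latent_shape` are hypothetical helpers, the 6.8 seconds per frame is taken from the logged run further down (your GPU may differ), and the 4 frames per second matches the `-r 4` flag of the ffmpeg cell.

```python
def interpolation_budget(num_prompts, num_interpolation_steps, n_iter=1,
                         n_samples=1, seconds_per_frame=6.8, fps=4):
    """Estimate frame count, sampling time, and video length for a prompt walk."""
    # run_inference walks consecutive prompt pairs via zip(datas, datas[1:])
    num_pairs = max(num_prompts - 1, 0)
    frames = n_iter * num_pairs * num_interpolation_steps * n_samples
    return {
        "frames": frames,
        "render_hours": frames * seconds_per_frame / 3600,
        "video_seconds": frames / fps,
    }

def latent_shape(C, H, W, f):
    """Shape of one latent sample: channels, then height and width divided by f."""
    return (C, H // f, W // f)

budget = interpolation_budget(num_prompts=7, num_interpolation_steps=300)
print(budget["frames"], "frames")
print(f"{budget['render_hours']:.1f} h of sampling, {budget['video_seconds']:.0f} s of video")
print(latent_shape(C=4, H=512, W=512, f=8))
```

Because `np.linspace(0, 1, num_interpolation_steps)` includes both endpoints, the last frame of one prompt pair and the first frame of the next use the same conditioning. Lowering `num_interpolation_steps` is the most direct way to shorten a run.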


## Interactive loop

Change options using the GUI, then run the next cell - **no need to re-run/display the GUI cell (the GUI will automatically update the variables)**

**Just edit the options, then run the next cell (`Run to start dreaming`) to run the model.**

In [None]:

model = load_model(options)



In [None]:
import transformers
print(transformers.__version__)
#import taming
!git clone https://github.com/CompVis/taming-transformers.git
sys.path.append("./taming-transformers")

!git clone https://github.com/openai/CLIP.git
sys.path.append("./CLIP")


In [138]:
run_inference(options, model)

Global seed set to 66


embedding prompts
done embedding prompts [tensor([[[-0.3884,  0.0229, -0.0522,  ..., -0.4899, -0.3066,  0.0675],
         [-1.0161, -0.6827,  1.2633,  ...,  1.3344,  0.8378,  0.5656],
         [-0.1854,  0.7642,  2.8608,  ...,  0.1620, -0.7441, -1.0416],
         ...,
         [ 0.3158, -0.9338,  0.4018,  ..., -0.5944,  0.2031, -0.9343],
         [ 0.3116, -0.9309,  0.4027,  ..., -0.5929,  0.2033, -0.9449],
         [ 0.2601, -0.8891,  0.4942,  ..., -0.6409,  0.2081, -0.9391]]],
       device='cuda:0'), tensor([[[-0.3884,  0.0229, -0.0522,  ..., -0.4899, -0.3066,  0.0675],
         [-1.3656,  0.2489,  1.8134,  ..., -0.1425, -0.7799, -0.2908],
         [-0.3530, -0.4841,  0.9187,  ..., -0.1721,  0.1183, -0.7913],
         ...,
         [ 0.3527,  0.0053, -0.0133,  ...,  0.0450,  0.1052, -1.0591],
         [ 0.3480,  0.0083, -0.0162,  ...,  0.0555,  0.1064, -1.0626],
         [ 0.2633,  0.0443,  0.0890,  ..., -0.0519,  0.1281, -1.0275]]],
       device='cuda:0'), tensor([[[-0.3884,  0.02

Sampling:   0%|                                                 | 0/1 [00:00<?, ?it/s]

data_a [tensor([[[-0.3884,  0.0229, -0.0522,  ..., -0.4899, -0.3066,  0.0675],
         [-1.0161, -0.6827,  1.2633,  ...,  1.3344,  0.8378,  0.5656],
         [-0.1854,  0.7642,  2.8608,  ...,  0.1620, -0.7441, -1.0416],
         ...,
         [ 0.3158, -0.9338,  0.4018,  ..., -0.5944,  0.2031, -0.9343],
         [ 0.3116, -0.9309,  0.4027,  ..., -0.5929,  0.2033, -0.9449],
         [ 0.2601, -0.8891,  0.4942,  ..., -0.6409,  0.2081, -0.9391]]],
       device='cuda:0')]




learned_conditioning torch.Size([1, 77, 768])
Data shape for PLMS sampling is (1, 4, 64, 64)
Running PLMS Sampling with 50 timesteps



PLMS Sampler: 100%|███████████████████████████████████| 50/50 [00:06<00:00,  7.57it/s]
data: 100%|█████████████████████████████████████████████| 1/1 [00:06<00:00,  6.81s/it]


samples_ddim torch.Size([1, 4, 64, 64])

In [None]:
encoding_options = "-c:v libx264 -crf 20 -preset slow -vf format=yuv420p -c:a aac -movflags +faststart"

!ffmpeg -y -r 4 -i outputs/txt2img-samples/samples/run_3/%*.png -r 4 $encoding_options interpolation.mp4
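
The command above hardcodes `run_3`, while `run_inference` creates a new `run_<N>` folder on every call. A small helper can locate the most recent one. This is a sketch under that naming assumption: `latest_run_dir` is a hypothetical function, not something provided by the Stable Diffusion repo.

```python
import re
from pathlib import Path

def latest_run_dir(samples_path):
    """Return the run_<N> subdirectory with the largest N, or None if there is none."""
    root = Path(samples_path)
    if not root.is_dir():
        return None
    runs = []
    for child in root.iterdir():
        match = re.fullmatch(r"run_(\d+)", child.name)
        if child.is_dir() and match:
            runs.append((int(match.group(1)), child))
    if not runs:
        return None
    # Compare numerically so that run_10 sorts after run_9
    return max(runs, key=lambda item: item[0])[1]

run_dir = latest_run_dir("outputs/txt2img-samples/samples")
print(run_dir if run_dir else "No run_<N> folders found yet.")
```

The resulting path can then be substituted for `run_3` in the ffmpeg input pattern.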

In [None]:

!python stablediffusionwalk.py --prompt "blueberry spaghetti" --name blueberry