# Disco Tutorials

Disco Diffusion is a Colab project built through the collaboration of many talented people. Algorithmically, it is based on guided-diffusion and CLIP-guided diffusion.

The generation process of Disco Diffusion consists of the following steps:
1. Disco Diffusion takes user-provided text prompts and image prompts as inputs.
2. A pretrained CLIP model computes the embeddings of the text and image prompts.
3. The UNet predicts the original sample.
4. A cutter generates a list of image cutouts from the predicted original sample.
5. The generation process is guided by computing the similarity between the embeddings of the image cutouts and those of the text and image prompts, together with other losses.
6. Finally, with the carefully designed loss functions, the model performs classifier-guidance sampling to generate the desired images according to the text and image prompts.
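
The similarity step in the list above can be sketched in plain Python. This is a minimal illustration, not the code used by the library: the two-dimensional embeddings are made up, and the spherical distance used here is one common choice in CLIP-guided diffusion code, assumed rather than confirmed for this implementation.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def spherical_dist_loss(a, b):
    """Half the squared angle between two embeddings (0 when they align)."""
    a, b = normalize(a), normalize(b)
    chord = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 2 * math.asin(chord / 2) ** 2


# Hypothetical 2-D embeddings; real CLIP embeddings have hundreds of dimensions.
text_embedding = [1.0, 0.0]
cutout_embeddings = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]

# One loss value per cutout; guidance would push the image to lower these.
losses = [spherical_dist_loss(c, text_embedding) for c in cutout_embeddings]
print(losses)
```

A cutout whose embedding points in the same direction as the prompt embedding contributes no loss, and the loss grows as the two directions diverge.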

Following the process above, the sections below introduce the arguments and configs you can set for generation and show how adjusting them changes the results.

The contents of this tutorial are as follows:

[1.Runtime Settings](#1-Runtime-Settings)

[2.Unet Settings](#2-Unet-Settings)

[3.CLIP Models Settings](#3-CLIP-Models-Settings)

[4.Diffusion Scheduler Settings](#4-Diffusion-Scheduler-Settings)

[5.Loss Settings](#5-Loss-Settings)

[6.Cutter Settings](#6-Cutter-Settings)

## Install MMagic

In [None]:
# Check PyTorch version and CLIP version
!pip list | grep torch
!pip list | grep clip

In [None]:
# Install mmcv dependency via openmim
!pip install openmim
!mim install 'mmcv>=2.0.0'

In [None]:
# Install mmagic from source
%cd /content/
!rm -rf mmagic
!git clone https://github.com/open-mmlab/mmagic.git 
%cd mmagic
!pip install -r requirements.txt
!pip install -e .

## 1. Runtime Settings

In [None]:
import torch
from mmengine import Config, MODELS
from mmengine.registry import init_default_scope
from mmagic.registry import MODULES
from mmcv import tensor2imgs
from matplotlib import pyplot as plt

from torchvision.transforms import ToPILImage, Normalize, Compose
from IPython.display import Image

init_default_scope('mmagic')


def show_tensor(image_tensor, index=0):
    normalized_image = ((image_tensor + 1.) / 2.).clamp(0, 1)
    out = tensor2imgs(normalized_image * 255, to_rgb=False)
    plt.imshow(out[index])
    plt.show()


In [None]:
config = 'configs/disco_diffusion/disco-diffusion_adm-u-finetuned_imagenet-512x512.py'
disco = MODELS.build(Config.fromfile(config).model).cuda().eval()
text_prompts = {
    0: ["clouds surround the mountains and Chinese palaces, sunshine, lake, overlook, overlook, unreal engine, light effect, Dream, Greg Rutkowski, James Gurney, artstation"]
}

seed = 2022
num_inference_steps = 250
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, seed=seed)['samples']
show_tensor(image)

### Image Resolution
Within the limits of your device, you can set the height and width of the image as you like.

In [None]:
# image resolution
image = disco.infer(width=768, height=1280, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, seed=seed)['samples']
show_tensor(image)
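
UNet-style models downsample by powers of two, so side lengths that are multiples of a power of two are usually the safe choice. The helper below is a small sketch for snapping a requested size; the name `snap_to_multiple` and the value 64 are assumptions for illustration, not something this tutorial specifies.

```python
def snap_to_multiple(value, multiple=64):
    """Round value down to a multiple of `multiple`, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


# Hypothetical requested sizes that are not multiples of 64.
width = snap_to_multiple(800)
height = snap_to_multiple(1300)
print(width, height)
```

The snapped values can then be passed as `width` and `height` in the same way as in the cell above.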

### image_prompts
 Work in progress.

In [None]:
# # image prompts
# image_prompts = ['path_of_image']
# image = disco.infer(width=768, height=1280, image_prompts=image_prompts, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, seed=seed)['samples']
# show_tensor(image)


### clip_guidance_scale
`clip_guidance_scale` (CGS) is one of the most important parameters you will use. It tells Disco Diffusion (DD) how strongly CLIP should push the image toward your prompt at each timestep. Higher is generally better, but if CGS is too strong it will overshoot the goal and distort the image. A happy medium is needed, and it takes experience to learn how to adjust CGS.

In [None]:
# clip guidance scale
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, clip_guidance_scale=8000, seed=seed)['samples']
show_tensor(image)

image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, clip_guidance_scale=4000, seed=seed)['samples']
show_tensor(image)
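
The overshoot behaviour can be pictured with a toy analogy: gradient steps on a one-dimensional quadratic, where the step scale plays the role of the guidance strength. This is only an analogy under that assumption, not how the guidance scale is applied inside the model.

```python
def guided_descent(scale, steps=20, start=1.0):
    """Minimize f(x) = x**2 with gradient steps of size `scale`."""
    x = start
    for _ in range(steps):
        grad = 2 * x          # derivative of x**2
        x = x - scale * grad  # a larger scale means a stronger push per step
    return abs(x)             # distance from the target at x = 0


weak = guided_descent(0.01)       # moves toward the target, but slowly
balanced = guided_descent(0.4)    # lands very close to the target
too_strong = guided_descent(1.1)  # every step jumps past the target
print(weak, balanced, too_strong)
```

With a scale that is too large, each step lands farther from the target than the one before, which mirrors the distortion described above.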

## 2. Unet Settings

Disco Diffusion provides several UNets. We offer configs for these UNets together with converted weights, so you only need to select the corresponding config to switch between them.

In [None]:
# 256x256_diffusion_uncond
text_prompts = {
    0: ["clouds surround the mountains and Chinese palaces,sunshine,lake,overlook,overlook,unreal engine,light effect,Dream,Greg Rutkowski,James Gurney,artstation"]
}
config = 'configs/disco_diffusion/disco-diffusion_adm-u-finetuned_imagenet-256x256.py'
disco = MODELS.build(Config.fromfile(config).model).cuda().eval()
image = disco.infer(width=512, height=448, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, seed=seed)['samples']
show_tensor(image)

In [None]:

# 512x512_diffusion_uncond_finetune_008100
config = 'configs/disco_diffusion/disco-diffusion_adm-u-finetuned_imagenet-512x512.py'
disco = MODELS.build(Config.fromfile(config).model).cuda().eval()
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, seed=seed)['samples']
show_tensor(image)


In [None]:

# portrait_generator_v001
text_prompts = {
    0: ["a portrait of supergirl, by artgerm, rosstran, trending on artstation."]
}
config = 'configs/disco_diffusion/disco-diffusion_portrait-generator-v001.py'
disco = MODELS.build(Config.fromfile(config).model).cuda().eval()
image = disco.infer(width=512, height=512, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, tv_scale=5000, range_scale = 5000, sat_scale = 15250, clip_guidance_scale=20000, seed=seed)['samples']
show_tensor(image)


### The remaining UNets are coming soon!
- pixelartdiffusion_expanded
- pixel_art_diffusion_hard_256
- pixelartdiffusion4k
- watercolordiffusion_2
- watercolordiffusion
- PulpSciFiDiffusion
- secondary


## 3. CLIP Models Settings

Disco Diffusion uses different CLIP models to guide image generation, and the images guided by different CLIP models have different characteristics. In practice, we combine multiple CLIP models to generate images.
To study the effect of each CLIP model, the following examples use only one CLIP model at a time and keep all other settings the same, so you can see the characteristics of each model by comparing the results.

In [None]:
from mmagic.models.editors.disco_diffusion.guider import ImageTextGuider


config = 'configs/disco_diffusion/disco-diffusion_adm-u-finetuned_imagenet-512x512.py'
disco = MODELS.build(Config.fromfile(config).model).cuda().eval()
text_prompts = {0: ["A beautiful painting of a map of the city of Atlantis"]}
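
The cells in this section each build a one-entry `clip_models_cfg`. To combine several CLIP models, as mentioned above, the same list can hold more than one entry. The sketch below only builds the config dicts, using the keys that appear in the cells of this section; the helper names `openai_clip_cfg` and `open_clip_cfg` are hypothetical.

```python
def openai_clip_cfg(name):
    """Config dict for an OpenAI CLIP checkpoint, e.g. 'ViT-B/32' or 'RN50'."""
    return dict(type='ClipWrapper', clip_type='clip', name=name, jit=False)


def open_clip_cfg(model_name, pretrained):
    """Config dict for an open_clip checkpoint, e.g. ('ViT-B-16', 'laion400m_e32')."""
    return dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name=model_name,
        pretrained=pretrained,
        device='cuda')


# A mixed list of three models taken from the examples in this section.
clip_models_cfg = [
    openai_clip_cfg('ViT-B/32'),
    openai_clip_cfg('RN50'),
    open_clip_cfg('ViT-B-16', 'laion400m_e32'),
]
print(len(clip_models_cfg), 'CLIP configs prepared')
```

The resulting list can be passed through the same `MODULES.build` loop and `ImageTextGuider` call used in the cells of this section. Memory use grows with each added model.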



### ViTB32


In [None]:

clip_models = []
clip_models_cfg = [
    dict(type='ClipWrapper', clip_type='clip', name='ViT-B/32', jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTB16

In [None]:


clip_models = []
clip_models_cfg = [
    dict(type='ClipWrapper', clip_type='clip', name='ViT-B/16', jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTL14

In [None]:

clip_models = []
clip_models_cfg = [
    dict(type='ClipWrapper', clip_type='clip', name='ViT-L/14', jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTL14_336px

In [None]:

clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper', clip_type='clip', name='ViT-L/14@336px',
        jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN50

In [None]:
clip_models = []
clip_models_cfg = [
    dict(type='ClipWrapper', clip_type='clip', name='RN50', jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN50x4

In [None]:
clip_models = []
clip_models_cfg = [
    dict(type='ClipWrapper', clip_type='clip', name='RN50x4', jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN50x16

In [None]:
clip_models = []
clip_models_cfg = [
    dict(type='ClipWrapper', clip_type='clip', name='RN50x16', jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN50x64

In [None]:
clip_models = []
clip_models_cfg = [
    dict(type='ClipWrapper', clip_type='clip', name='RN50x64', jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN101

In [None]:
clip_models = []
clip_models_cfg = [
    dict(type='ClipWrapper', clip_type='clip', name='RN101', jit=False),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTB32_laion2b_e16

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='ViT-B-32',
        pretrained='laion2b_e16',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTB32_laion400m_e31

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='ViT-B-32',
        pretrained='laion400m_e31',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTB32_laion400m_e32

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='ViT-B-32',
        pretrained='laion400m_e32',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTB32quickgelu_laion400m_e31

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='ViT-B-32-quickgelu',
        pretrained='laion400m_e31',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)

### ViTB32quickgelu_laion400m_e32

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='ViT-B-32-quickgelu',
        pretrained='laion400m_e32',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTB16_laion400m_e31

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='ViT-B-16',
        pretrained='laion400m_e31',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### ViTB16_laion400m_e32

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='ViT-B-16',
        pretrained='laion400m_e32',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN50_yfcc15m

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='RN50',
        pretrained='yfcc15m',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN50_cc12m

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='RN50',
        pretrained='cc12m',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN50_quickgelu_yfcc15m

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='RN50-quickgelu',
        pretrained='yfcc15m',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN50_quickgelu_cc12m

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='RN50-quickgelu',
        pretrained='cc12m',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN101_yfcc15m

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='RN101',
        pretrained='yfcc15m',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


### RN101_quickgelu_yfcc15m

In [None]:
clip_models = []
clip_models_cfg = [
    dict(
        type='ClipWrapper',
        clip_type='open_clip',
        model_name='RN101-quickgelu',
        pretrained='yfcc15m',
        device='cuda'),
]
for clip_cfg in clip_models_cfg:
    clip_models.append(MODULES.build(clip_cfg))
disco.guider = ImageTextGuider(clip_models).cuda()

image = disco.infer(
    width=1280,
    height=768,
    text_prompts=text_prompts,
    show_progress=True,
    num_inference_steps=num_inference_steps,
    eta=0.8,
    seed=seed)['samples']
show_tensor(image)


## 4. Diffusion Scheduler Settings

Typically, a diffusion model generates images through a reverse (sampling) scheduler. Many researchers are working on different reverse schedulers to improve diffusion models. Currently, only the DDIM scheduler is supported; more reverse schedulers are coming soon.

### skip_steps
When you set `init_image`, you can adjust `skip_steps` for creative reasons. With a low `skip_steps` you get a result "inspired by" the `init_image`, which retains the colors and the rough layout and shapes but looks quite different. With a high `skip_steps` you preserve most of the `init_image` contents and only fine-tune the texture.

In [None]:
!wget https://user-images.githubusercontent.com/22982797/205579254-30c3b446-63bb-4172-bbfe-8d1d05e151cf.png -O init.png

In [None]:
init_path = 'init.png'
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, init_image=init_path, skip_steps=50, seed=seed)['samples']
show_tensor(image)
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, init_image=init_path, skip_steps=200, seed=seed)['samples']
show_tensor(image)


Note that you can still use `skip_steps` to reduce rendering time even if you don't set `init_image`.
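
As a rough piece of arithmetic, the number of denoising steps that actually run is the total minus the skipped ones. The sketch below assumes `skip_steps` is counted in the same units as `num_inference_steps`; the helper name `remaining_steps` is hypothetical.

```python
def remaining_steps(num_inference_steps, skip_steps):
    """Number of denoising steps left after skipping the first `skip_steps`."""
    if not 0 <= skip_steps < num_inference_steps:
        raise ValueError('skip_steps must be in [0, num_inference_steps)')
    return num_inference_steps - skip_steps


# The two settings used in the cell above, with num_inference_steps = 250.
for skip in (50, 200):
    left = remaining_steps(250, skip)
    print(f'skip_steps={skip}: {left} steps remain ({left / 250:.0%} of the schedule)')
```

Fewer remaining steps mean less time to move away from the starting image, which matches the behaviour described for low and high `skip_steps`.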

### steps
Increasing `steps` (passed as `num_inference_steps` in the cells below) gives the model more opportunities to adjust the image, and each adjustment is smaller, which yields a more precise, detailed image. A larger `steps` generally increases image quality but also increases the render time. Some intricate images can take 1000, 2000, or more steps; it is really up to the user.

In [None]:
steps = 100
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=steps, eta=0.8, seed=seed)['samples']
show_tensor(image)

steps = 1000
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=steps, eta=0.8, seed=seed)['samples']
show_tensor(image)

## 5. Loss Settings

We define `loss_cfg` in the config as shown below. The loss function measures the distance between the generated image and the expected target, so you can adjust the loss settings for your purpose.

In [None]:
loss_cfg = dict(tv_scale=0, range_scale=150, sat_scale=0, init_scale=1000)

In [None]:
config = 'configs/disco_diffusion/disco-diffusion_adm-u-finetuned_imagenet-512x512.py'
disco = MODELS.build(Config.fromfile(config).model).cuda().eval()

### range_scale
Optional; set to zero to turn off. `range_scale` adjusts color contrast. A lower `range_scale` increases contrast. Very low values create a reduced color palette, resulting in more vibrant or poster-like images. A higher `range_scale` reduces contrast, for more muted images.

In [None]:
# range_scale
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, range_scale=50, seed=seed)['samples']
show_tensor(image)

image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, range_scale=200, seed=seed)['samples']
show_tensor(image)
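
One common form of range loss in CLIP-guided diffusion code penalizes predicted pixel values that fall outside `[-1, 1]`. The sketch below shows that idea on a flat list of values; whether the loss weighted by `range_scale` here has exactly this form is an assumption.

```python
def range_loss(pixels, low=-1.0, high=1.0):
    """Mean squared distance of each value from the interval [low, high]."""
    total = 0.0
    for p in pixels:
        clamped = min(max(p, low), high)  # nearest value inside the interval
        total += (p - clamped) ** 2       # zero for values already in range
    return total / len(pixels)


in_range = range_loss([-0.9, 0.0, 0.8])   # nothing to penalize
overshoot = range_loss([-1.5, 0.0, 1.2])  # two values exceed the range
print(in_range, overshoot)
```

A larger weight on this term pulls extreme values back toward the valid range more strongly, which is consistent with the more muted images described for a high `range_scale`.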


### tv_scale
Total variation denoising. Optional; set to zero to turn off. Controls the smoothness of the final output. If used, `tv_scale` tries to smooth out the final image to reduce overall noise. If your image is too 'crunchy', increase `tv_scale`. TV denoising is good at preserving edges while smoothing away noise in flat regions.

In [None]:

# tv_scale
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, tv_scale=0.1, seed=seed)['samples']
show_tensor(image)

image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, tv_scale=0.9, seed=seed)['samples']
show_tensor(image)
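
A total-variation-style penalty sums the differences between neighboring pixels, so smooth regions cost nothing and pixel-level noise costs a lot. The sketch below works on a single-channel image stored as nested lists; it is a simplified illustration, assumed rather than taken from the library.

```python
def tv_loss(image):
    """Mean squared difference between each pixel and its right and lower neighbors."""
    rows, cols = len(image), len(image[0])
    total, count = 0.0, 0
    for r in range(rows - 1):
        for c in range(cols - 1):
            dx = image[r][c + 1] - image[r][c]  # horizontal change
            dy = image[r + 1][c] - image[r][c]  # vertical change
            total += dx * dx + dy * dy
            count += 1
    return total / count


flat = [[0.5] * 4 for _ in range(4)]
checkerboard = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
print(tv_loss(flat), tv_loss(checkerboard))
```

Scaling this term up makes noisy, high-frequency patterns more expensive, which is why increasing `tv_scale` smooths the result.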


### sat_scale
Saturation scale. Optional; set to zero to turn off. If used, `sat_scale` helps mitigate oversaturation. If your image is too saturated, increase `sat_scale` to reduce saturation.

In [None]:

# sat_scale
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, sat_scale=0.1, seed=seed)['samples']
show_tensor(image)

image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, sat_scale=0.9, seed=seed)['samples']
show_tensor(image)


### init_scale
This controls how strongly CLIP will try to match the `init_image` provided. It is balanced against the `clip_guidance_scale` (CGS) above. An extremely large value of `init_scale` won't change the results dramatically, while an extremely large value of CGS will destroy the `init_image`.

In [None]:

# init_scale
init_path = 'init.png'
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, init_image=init_path, skip_steps=125, init_scale=1000, seed=seed)['samples']
show_tensor(image)

image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, init_image=init_path, skip_steps=125, init_scale=100, seed=seed)['samples']
show_tensor(image)

## 6. Cutter Settings

### Cutter Settings
This section determines the schedule of CLIP cuts, the snapshots that CLIP uses to evaluate your image while processing. In DD, there are two types of cuts: overview cuts, which take a snapshot of the entire image and evaluate it against the prompt, and inner cuts, which are smaller crops from the interior of the image and help tune fine details. The size of the inner cuts can be adjusted with the `cut_ic_pow` parameter.
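
The schedules used in the cells below are lists of 1000 entries. The sketch below shows one way such a list can be read, assuming it is indexed by a diffusion timestep that counts down from 1000 to 1, so that the first entries apply to the earliest, noisiest steps. This indexing is an assumption for illustration, and `schedule_value` is a hypothetical helper.

```python
def schedule_value(schedule, timestep, total=1000):
    """Look up a per-step setting; `timestep` counts down from `total` to 1."""
    return schedule[total - timestep]


cut_overview = [12] * 100 + [4] * 900

early = schedule_value(cut_overview, 1000)  # first, noisiest step
late = schedule_value(cut_overview, 1)      # final step
print(early, late)
```

Under this reading, `[12] * 100 + [4] * 900` spends many overview cuts while the overall layout is forming and few once details are being refined.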

### cut_overview
The schedule of overview cuts, given as the number of cuts to take at each step.

In [None]:

# cut_overview
cut_overview = [12] * 100 + [4] * 900
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cut_overview=cut_overview, seed=seed)['samples']
show_tensor(image)

cut_overview = [12] * 900 + [4] * 100
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cut_overview=cut_overview, seed=seed)['samples']
show_tensor(image)


### cut_innercut
The schedule of inner cuts, given as the number of cuts to take at each step.

In [None]:

# cut_innercut
cut_innercut = [4] * 100 + [12] * 900
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cut_innercut=cut_innercut, seed=seed)['samples']
show_tensor(image)

cut_innercut = [4] * 900 + [12] * 100
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cut_innercut=cut_innercut, seed=seed)['samples']
show_tensor(image)


### cut_ic_pow
This sets the size of the border used for inner cuts. High `cut_ic_pow` values give larger borders, so the cuts themselves are smaller and provide finer details. If you have too many or too-small inner cuts, you may lose overall image coherency and/or get an undesirable 'mosaic' effect. Low `cut_ic_pow` values allow the inner cuts to be larger, helping image coherency while still helping with some details.

In [None]:

# cut_ic_pow
cut_ic_pow = [1] * 200 + [0] * 800
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cut_ic_pow=cut_ic_pow, seed=seed)['samples']
show_tensor(image)

cut_ic_pow = [1] * 800 + [0] * 200
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cut_ic_pow=cut_ic_pow, seed=seed)['samples']
show_tensor(image)
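
One way to see why a higher power gives smaller cuts: if the cut size is drawn as a uniform random number raised to that power and then scaled between a minimum and maximum size, a larger exponent pushes the draws toward the minimum. The formula and the sizes 224 and 768 below are assumptions for illustration.

```python
import random


def inner_cut_size(u, ic_pow, min_size, max_size):
    """Map a uniform draw u in [0, 1) to a cut side length in pixels."""
    return int(u ** ic_pow * (max_size - min_size) + min_size)


rng = random.Random(0)
draws = [rng.random() for _ in range(1000)]

# Average cut size for a low and a high exponent over the same draws.
mean_low = sum(inner_cut_size(u, 0.5, 224, 768) for u in draws) / len(draws)
mean_high = sum(inner_cut_size(u, 4.0, 224, 768) for u in draws) / len(draws)
print(round(mean_low), round(mean_high))
```

Because every draw lies in `[0, 1)`, raising it to a higher power can only shrink it, so the high-exponent average is the smaller of the two.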


### cut_icgray_p
In addition to the overall cut schedule, a portion of the cuts can be set to grayscale instead of color. This may improve the definition of shapes and edges, especially in the early diffusion steps where the image structure is being defined. `cut_icgray_p` affects both overview and inner cuts.

In [None]:

# cut_icgray_p
cut_icgray_p=[0.2] * 200 + [0] * 800
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cut_icgray_p=cut_icgray_p, seed=seed)['samples']
show_tensor(image)

cut_icgray_p=[0.2] * 800 + [0] * 200
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cut_icgray_p=cut_icgray_p, seed=seed)['samples']
show_tensor(image)
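
The idea of turning a fraction of the cuts gray can be sketched as an independent coin flip per cutout. In the sketch below a single RGB pixel stands in for each cutout, and the BT.601 luma weights are a common grayscale convention; both are simplifying assumptions, and the helper names are hypothetical.

```python
import random


def to_gray(rgb):
    """Collapse an (r, g, b) pixel to a gray pixel using BT.601 luma weights."""
    r, g, b = rgb
    luma = 0.299 * r + 0.587 * g + 0.114 * b
    return (luma, luma, luma)


def maybe_gray(cutouts, gray_p, rng):
    """Convert each cutout to grayscale with probability gray_p."""
    return [to_gray(c) if rng.random() < gray_p else c for c in cutouts]


rng = random.Random(2022)
cutouts = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

never_gray = maybe_gray(cutouts, 0.0, rng)   # probability 0: colors untouched
always_gray = maybe_gray(cutouts, 1.0, rng)  # probability 1: every cutout gray
print(never_gray, always_gray)
```

A schedule such as `[0.2] * 200 + [0] * 800` would then correspond to a 20% chance per cutout for the first 200 entries and none afterwards.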


### cutn_batches
In each iteration, the model cuts the image into smaller pieces known as cuts and compares each cut to the prompt to decide how to guide the next diffusion step. More cuts generally lead to better images, since DD has more chances to fine-tune the image at each timestep.

In [None]:

# cutn_batches
cutn_batches = 2
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cutn_batches=cutn_batches, seed=seed)['samples']
show_tensor(image)

cutn_batches = 8
image = disco.infer(width=1280, height=768, text_prompts=text_prompts, show_progress=True, num_inference_steps=num_inference_steps, eta=0.8, cutn_batches=cutn_batches, seed=seed)['samples']
show_tensor(image)
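
As a back-of-the-envelope estimate, the number of cutouts evaluated per step grows linearly with `cutn_batches`. The sketch below assumes each batch draws the scheduled overview and inner cuts once per CLIP model; that accounting and the helper name `cuts_per_step` are assumptions.

```python
def cuts_per_step(overview, innercut, cutn_batches, num_clip_models=1):
    """Estimated number of cutouts scored by CLIP in one diffusion step."""
    return (overview + innercut) * cutn_batches * num_clip_models


# The two cutn_batches values from the cell above, with 12 overview and 4 inner cuts.
few = cuts_per_step(overview=12, innercut=4, cutn_batches=2)
many = cuts_per_step(overview=12, innercut=4, cutn_batches=8)
print(few, many)
```

Render time per step can be expected to grow roughly in proportion, so a higher `cutn_batches` trades speed for a steadier guidance signal.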