In [None]:
!pip install -qU diffusers transformers huggingface_hub

In [None]:
from huggingface_hub import notebook_login
notebook_login()

# Scheduler features

The scheduler controls the entire denoising (or sampling) process.

## Timestep schedules

The timestep or noise schedule determines the amount of noise at each sampling step. The scheduler uses this to generate an image with the corresponding amount of noise at each step. The timestep schedule is generated from the scheduler's default configuration.

For example, Align Your Steps (AYS) is a method for optimizing a sampling schedule to generate a high-quality image in as few as 10 steps. The optimal 10-step schedule for SDXL is:

In [1]:
from diffusers.schedulers import AysSchedules

sampling_schedule = AysSchedules['StableDiffusionXLTimesteps']
sampling_schedule

[999, 845, 730, 587, 443, 310, 193, 116, 53, 13]
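
To get a feel for what this schedule does, we can compare the gaps between consecutive AYS timesteps with those of an evenly spaced schedule. This is only an illustrative sketch: the uniform baseline and the `step_sizes` helper are hypothetical and assume 1000 training timesteps.

```python
# AYS 10-step schedule for SDXL, copied from the output above.
ays_timesteps = [999, 845, 730, 587, 443, 310, 193, 116, 53, 13]

# Hypothetical baseline for comparison: 10 evenly spaced timesteps
# counted down from the last training timestep (assumes 1000 training steps).
uniform_timesteps = [999 - 100 * i for i in range(10)]


def step_sizes(timesteps):
    """Return the gap between each pair of consecutive timesteps."""
    return [a - b for a, b in zip(timesteps, timesteps[1:])]


ays_gaps = step_sizes(ays_timesteps)
uniform_gaps = step_sizes(uniform_timesteps)

print("AYS gaps:    ", ays_gaps)
print("uniform gaps:", uniform_gaps)
```

The AYS gaps shrink toward the end of sampling, so more of the 10 steps are spent at low noise levels than in the evenly spaced baseline.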

In [None]:
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
import torch

pipeline = StableDiffusionXLPipeline.from_pretrained(
    'SG161222/RealVisXL_V4.0',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
    pipeline.scheduler.config,
    algorithm_type='sde-dpmsolver++'
)

In [None]:
prompt = "A cinematic shot of a cute little rabbit wearing a jacket and doing a thumbs up"
generator = torch.Generator('cpu').manual_seed(111)
image = pipeline(
    prompt=prompt,
    negative_prompt="",
    generator=generator,
    timesteps=sampling_schedule # use the AYS sampling schedule
).images[0]

## Timestep spacing

How the sampling steps are selected from the schedule affects the quality of the generated image. This matters most when rescaling the noise schedule, which enables a model to generate much brighter or darker images.

Available timestep spacing methods:
* `leading` creates evenly spaced steps
* `linspace` includes the first and last steps and evenly selects the remaining intermediate steps
* `trailing` only includes the last step and evenly selects the remaining intermediate steps starting from the end

It is recommended to use the `trailing` spacing method because it generates higher quality images with more details when there are fewer sample steps.
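
The three spacing methods above can be sketched in plain Python. This is a minimal illustration of the general formulas as we understand them, not the library's implementation. The function names are hypothetical, and real schedulers may add details such as a `steps_offset`, so the exact values can differ by scheduler and configuration.

```python
def leading(num_train_timesteps, num_inference_steps, steps_offset=0):
    """Evenly spaced steps starting from timestep 0 (the last training timestep is skipped)."""
    step_ratio = num_train_timesteps // num_inference_steps
    steps = [i * step_ratio + steps_offset for i in range(num_inference_steps)]
    return steps[::-1]  # sampling runs from high noise to low noise


def linspace(num_train_timesteps, num_inference_steps):
    """Include both the first and last training timesteps, evenly spaced in between."""
    last = num_train_timesteps - 1
    if num_inference_steps == 1:
        return [0]
    steps = [round(i * last / (num_inference_steps - 1)) for i in range(num_inference_steps)]
    return steps[::-1]


def trailing(num_train_timesteps, num_inference_steps):
    """Start at the last training timestep and step evenly backwards."""
    step_ratio = num_train_timesteps / num_inference_steps
    return [round(num_train_timesteps - i * step_ratio) - 1 for i in range(num_inference_steps)]


# Assumes 1000 training timesteps and the 5 inference steps used in this section.
for fn in (leading, linspace, trailing):
    print(f"{fn.__name__:>8}: {fn(1000, 5)}")
```

With only 5 steps, `leading` starts at timestep 800 and never visits the noisiest timestep, while `trailing` starts at timestep 999.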

In [None]:
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
import torch

pipeline = StableDiffusionXLPipeline.from_pretrained(
    'SG161222/RealVisXL_V4.0',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
    pipeline.scheduler.config,
    timestep_spacing='trailing'
)

In [None]:
prompt = "A cinematic shot of a cute little black cat sitting on a pumpkin at night"
generator = torch.Generator(device="cpu").manual_seed(111)
image = pipeline(
    prompt=prompt,
    negative_prompt="",
    generator=generator,
    num_inference_steps=5,
).images[0]
image

For comparison, switch to the `leading` spacing method and generate an image with the same prompt and seed.

In [None]:
pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
    pipeline.scheduler.config,
    timestep_spacing='leading'
)

prompt = "A cinematic shot of a cute little black cat sitting on a pumpkin at night"
generator = torch.Generator(device="cpu").manual_seed(111)
image = pipeline(
    prompt=prompt,
    negative_prompt="",
    generator=generator,
    num_inference_steps=5,
).images[0]
image

## Sigmas

The `sigmas` parameter is the amount of noise added at each timestep according
to the timestep schedule.

When we use custom `sigmas`, the `timesteps` are calculated from those `sigmas` and the scheduler's default configuration is ignored.

For example, we can manually pass the `sigmas` for the 10-step AYS schedule.

In [None]:
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler
import torch

pipeline = StableDiffusionXLPipeline.from_pretrained(
    'stabilityai/stable-diffusion-xl-base-1.0',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

pipeline.scheduler = EulerDiscreteScheduler.from_config(pipeline.scheduler.config)

In [None]:
sigmas = [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113, 0.0]
prompt = "anthropomorphic capybara wearing a suit and working with a computer"
generator = torch.Generator(device='cuda').manual_seed(123)
image = pipeline(
    prompt=prompt,
    num_inference_steps=10,
    sigmas=sigmas,
    generator=generator
).images[0]

We can check the scheduler's `timesteps` and see that they match the AYS timestep schedule, because the timestep schedule is calculated from the custom `sigmas` we passed:

In [None]:
pipeline.scheduler.timesteps
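
One way to see why the two schedules line up is to compute the noise level for each training timestep and look it up at the AYS timesteps. The sketch below assumes the beta schedule we believe SDXL uses by default (`scaled_linear`, `beta_start=0.00085`, `beta_end=0.012`, 1000 training timesteps). The `sigmas_from_betas` helper is hypothetical and is not part of the library.

```python
import math


def sigmas_from_betas(beta_start=0.00085, beta_end=0.012, num_train_timesteps=1000):
    """Return sigma_t = sqrt((1 - alpha_bar_t) / alpha_bar_t) for every training timestep."""
    sqrt_start, sqrt_end = math.sqrt(beta_start), math.sqrt(beta_end)
    sigmas, alpha_bar = [], 1.0
    for t in range(num_train_timesteps):
        # "scaled_linear": interpolate linearly in sqrt(beta), then square.
        frac = t / (num_train_timesteps - 1)
        beta = (sqrt_start + frac * (sqrt_end - sqrt_start)) ** 2
        alpha_bar *= 1.0 - beta  # running cumulative product of (1 - beta)
        sigmas.append(math.sqrt((1.0 - alpha_bar) / alpha_bar))
    return sigmas


train_sigmas = sigmas_from_betas()

# AYS timesteps and the non-zero sigmas passed to the pipeline in this section.
ays_timesteps = [999, 845, 730, 587, 443, 310, 193, 116, 53, 13]
ays_sigmas = [14.615, 6.315, 3.771, 2.181, 1.342, 0.862, 0.555, 0.380, 0.234, 0.113]

for t, sigma in zip(ays_timesteps, ays_sigmas):
    print(f"timestep {t:>3}: computed sigma {train_sigmas[t]:.3f}, AYS sigma {sigma}")
```

Under these assumptions, the computed values should land close to the AYS sigmas, which is why passing either `timesteps` or `sigmas` describes roughly the same schedule.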

### Karras sigmas

Karras sigmas should not be used for models that were not trained with them. For example, the base SDXL model should not use Karras sigmas, but DreamShaperXL can because it was trained with them.

Karras schedulers use the timestep schedule and sigmas from the *Elucidating the Design Space of Diffusion-Based Generative Models* paper. Compared to other schedulers, this variant applies a smaller amount of noise per step as it approaches the end of the sampling process, which can increase the level of detail in the generated image.
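
The sigma spacing from that paper can be sketched as follows. The exponent `rho=7` is the value proposed in the paper. The `sigma_min` and `sigma_max` defaults are assumptions chosen to resemble a Stable Diffusion noise range, and the `karras_sigmas` function name is hypothetical.

```python
def karras_sigmas(num_steps, sigma_min=0.0292, sigma_max=14.6146, rho=7.0):
    """Interpolate between sigma_max and sigma_min linearly in sigma ** (1 / rho)."""
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    sigmas = []
    for i in range(num_steps):
        ramp = i / (num_steps - 1) if num_steps > 1 else 0.0
        sigmas.append((max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho)
    return sigmas


sigmas = karras_sigmas(10)
gaps = [a - b for a, b in zip(sigmas, sigmas[1:])]

print("sigmas:", [round(s, 3) for s in sigmas])
print("gaps:  ", [round(g, 3) for g in gaps])
```

The gaps between consecutive sigmas shrink toward the end of the list, which reflects the smaller noise changes per step late in sampling.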

In [None]:
from diffusers import StableDiffusionXLPipeline, DPMSolverMultistepScheduler
import torch

pipeline = StableDiffusionXLPipeline.from_pretrained(
    'SG161222/RealVisXL_V4.0',
    torch_dtype=torch.float16,
    variant='fp16'
).to('cuda')

pipeline.scheduler = DPMSolverMultistepScheduler.from_config(
    pipeline.scheduler.config,
    algorithm_type='sde-dpmsolver++',
    use_karras_sigmas=True, # enable Karras sigmas
)

In [None]:
prompt = "A cinematic shot of a cute little black cat sitting on a pumpkin at night"
generator = torch.Generator(device="cpu").manual_seed(111)
image = pipeline(
    prompt=prompt,
    negative_prompt="",
    generator=generator,
).images[0]
image

## Rescale noise schedule

In *Common Diffusion Noise Schedules and Sample Steps are Flawed*, the authors show that common noise schedules allow some signal to leak into the last timestep. This signal leakage at inference can cause models to only generate images with medium brightness. By enforcing a zero signal-to-noise ratio (SNR) for the timestep schedule and sampling from the last timestep, the model can be improved to generate very bright or dark images.
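
The zero-SNR rescaling can be sketched in plain Python, following the procedure described in the paper as we understand it: shift the square root of the cumulative alphas so the last value is zero, then scale so the first value is unchanged. The `rescale_zero_terminal_snr` helper and the linear beta schedule used for the demo are illustrative assumptions.

```python
import math


def rescale_zero_terminal_snr(betas):
    """Rescale betas so the final timestep carries no signal (SNR of zero)."""
    # sqrt of the cumulative product of alphas, where alpha = 1 - beta
    alphas_bar_sqrt, alpha_bar = [], 1.0
    for beta in betas:
        alpha_bar *= 1.0 - beta
        alphas_bar_sqrt.append(math.sqrt(alpha_bar))

    first, last = alphas_bar_sqrt[0], alphas_bar_sqrt[-1]
    # Shift so the last value is zero, then scale so the first value is preserved.
    rescaled = [(a - last) * first / (first - last) for a in alphas_bar_sqrt]

    # Convert the rescaled cumulative values back to per-step betas.
    alphas_bar = [a * a for a in rescaled]
    alphas = [alphas_bar[0]] + [cur / prev for prev, cur in zip(alphas_bar, alphas_bar[1:])]
    return [1.0 - a for a in alphas]


# Illustrative linear beta schedule with 1000 training timesteps (an assumption).
betas = [0.0001 + i * (0.02 - 0.0001) / 999 for i in range(1000)]
new_betas = rescale_zero_terminal_snr(betas)

print("original last beta:", betas[-1])
print("rescaled last beta:", new_betas[-1])
```

After rescaling, the last beta is 1, meaning the final timestep is pure noise, while the first beta stays essentially unchanged.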

For inference, we need a model that has been trained with `v_prediction`. For example, the `ptx0/pseudo-journey-v2` checkpoint was trained with `v_prediction` and the `DDIMScheduler`.

In [None]:
from diffusers import DiffusionPipeline, DDIMScheduler
import torch

pipeline = DiffusionPipeline.from_pretrained(
    'ptx0/pseudo-journey-v2',
    use_safetensors=True
)
pipeline.scheduler = DDIMScheduler.from_config(
    pipeline.scheduler.config,
    rescale_betas_zero_snr=True, # rescale the noise schedule to zero SNR
    timestep_spacing='trailing', # start sampling from the last timestep
)
pipeline.to('cuda')

In [None]:
prompt = "cinematic photo of a snowy mountain at night with the northern lights aurora borealis overhead, 35mm photograph, film, professional, 4k, highly detailed"
generator = torch.Generator('cpu').manual_seed(111)

image = pipeline(
    prompt,
    guidance_rescale=0.7, # prevent over-exposure
    generator=generator,
).images[0]
image

For comparison, generate an image with the same prompt and seed but without `guidance_rescale`.

In [None]:
prompt = "cinematic photo of a snowy mountain at night with the northern lights aurora borealis overhead, 35mm photograph, film, professional, 4k, highly detailed"
generator = torch.Generator('cpu').manual_seed(111)

image = pipeline(
    prompt,
    generator=generator,
).images[0]
image
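
To illustrate what `guidance_rescale` does, here is a minimal sketch of the rescaling idea from the same paper, as we understand it: the classifier-free guidance output is scaled to match the standard deviation of the text-conditioned prediction, then blended with the original guidance output. The helper names and the toy noise values are hypothetical, and real pipelines operate on tensors rather than lists.

```python
import math


def std(values):
    """Population standard deviation of a list of floats."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def rescale_guidance(noise_cfg, noise_text, guidance_rescale=0.7):
    """Blend the guidance output with a copy rescaled to the text prediction's std."""
    factor = std(noise_text) / std(noise_cfg)
    return [
        guidance_rescale * (v * factor) + (1.0 - guidance_rescale) * v
        for v in noise_cfg
    ]


# Toy noise predictions (hypothetical values for illustration only).
noise_uncond = [0.1, -0.2, 0.3, -0.4]
noise_text = [0.5, -0.1, 0.2, -0.6]
guidance_scale = 7.5

# Standard classifier-free guidance.
noise_cfg = [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]
rescaled = rescale_guidance(noise_cfg, noise_text, guidance_rescale=0.7)

print("std of text prediction:", round(std(noise_text), 3))
print("std of guidance output:", round(std(noise_cfg), 3))
print("std after rescaling:   ", round(std(rescaled), 3))
```

A high guidance scale inflates the standard deviation of the prediction, which is one source of over-exposure. Rescaling pulls it back toward the text-conditioned prediction.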