In [None]:
!pip install diffusers==0.11.1
!pip install transformers scipy ftfy accelerate

In [None]:
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("ayoubkirouane/Stable-Cats-Generator", torch_dtype=torch.float16)

Next, let's move the pipeline to the GPU for faster inference.

In [None]:
pipe = pipe.to("cuda")

And we are ready to generate images:

In [None]:
prompt = "A photo of a picture-perfect white cat."
image = pipe(prompt).images[0]  # image here is in [PIL format](https://pillow.readthedocs.io/en/stable/)

# To keep the image, save it to disk:
image.save("cat.png")

# or, if you're in Google Colab, display it directly by ending the cell with
image

Running the above cell multiple times will give you a different image every time. If you want deterministic output, you can pass a seeded random generator to the pipeline. Every time you use the same seed, you'll get the same image.

In [None]:
import torch

generator = torch.Generator("cuda").manual_seed(1024)

image = pipe(prompt, generator=generator).images[0]

image

You can change the number of inference steps using the num_inference_steps argument. In general, results are better the more steps you use. Stable Diffusion, being one of the latest models, works great with a relatively small number of steps, so we recommend using the default of 50. If you want faster results, you can use a smaller number.

The following cell uses the same seed as before, but with fewer steps. Note how some details are less realistic and less defined than in the previous image:

In [None]:
import torch

generator = torch.Generator("cuda").manual_seed(1024)

image = pipe(prompt, num_inference_steps=15, generator=generator).images[0]

image
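
To compare several step counts side by side, one option is a small helper that re-runs the pipeline with a freshly seeded generator each time, so the step count is the only thing that changes. This is a sketch rather than part of diffusers: `sweep_inference_steps` and `make_generator` are hypothetical names, and it assumes `pipe` accepts the `num_inference_steps` and `generator` keyword arguments used in the cells above.

```python
def sweep_inference_steps(pipe, prompt, step_counts, make_generator):
    """Run `pipe` once per step count and return {steps: first image}.

    `make_generator` is a hypothetical zero-argument callable that returns a
    newly seeded generator. A fresh one is created for every run because a
    generator's state advances as it is used, so reusing a single generator
    would give each run different starting noise.
    """
    results = {}
    for steps in step_counts:
        generator = make_generator()  # same seed, so same starting noise
        output = pipe(prompt, num_inference_steps=steps, generator=generator)
        results[steps] = output.images[0]
    return results


# Usage in this notebook (assumes `pipe` and `prompt` from the cells above):
# import torch
# images = sweep_inference_steps(
#     pipe, prompt, [15, 30, 50],
#     make_generator=lambda: torch.Generator("cuda").manual_seed(1024),
# )
# for steps, img in images.items():
#     img.save(f"cat_{steps}_steps.png")
```

Passing the generator factory in as an argument keeps the helper independent of the device and seed you choose.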

Stable Diffusion produces images of 512 × 512 pixels by default. But it's very easy to override the default using the height and width arguments, so you can create rectangular images in portrait or landscape ratios.

Here are some recommendations for choosing good image sizes:

- Make sure height and width are both multiples of 8.
- Going below 512 might result in lower quality images.
- Going over 512 in both directions will repeat image areas (global coherence is lost).
- The best way to create non-square images is to use 512 in one dimension, and a value larger than that in the other one.

In [None]:
image = pipe(prompt, height=512, width=768).images[0]
image
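
The size recommendations above can be turned into a quick check that runs before calling the pipeline. The helper below is only a sketch: `check_size` is a hypothetical name, not a diffusers function, and rounding down to the nearest multiple of 8 is one possible policy among several.

```python
def check_size(height, width, base=512):
    """Snap a requested size to multiples of 8 and collect advisory notes.

    Returns (height, width, notes). `base` is the default resolution (512)
    that the recommendations above refer to.
    """
    if height < 8 or width < 8:
        raise ValueError("height and width must each be at least 8")

    # Round each dimension down to the nearest multiple of 8.
    snapped_h = height - height % 8
    snapped_w = width - width % 8

    notes = []
    if (snapped_h, snapped_w) != (height, width):
        notes.append(f"rounded to {snapped_h}x{snapped_w} (multiples of 8)")
    if snapped_h < base or snapped_w < base:
        notes.append(f"a dimension below {base} might lower image quality")
    if snapped_h > base and snapped_w > base:
        notes.append(f"both dimensions above {base} may repeat image areas")
    return snapped_h, snapped_w, notes


# Usage (assumes `pipe` and `prompt` from the cells above):
# height, width, notes = check_size(512, 770)
# for note in notes:
#     print("note:", note)
# image = pipe(prompt, height=height, width=width).images[0]
```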