<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# Stable Diffusion - Generate image from text

**Tags:** #stable-diffusion #image-generation #text-to-image #ai #machine-learning #deep-learning

**Author:** [Oussama El Bahaoui](https://www.linkedin.com/in/oelbahaoui/)

**Description:** This notebook generates an image from a text prompt using Stable Diffusion. It is useful for organizations that want to create visuals from text for marketing purposes.

**References:**
- [Stability.ai - Stable Diffusion](https://stability.ai/stable-diffusion)
- [Stability.ai - Text to Image](https://stability.ai/text-to-image)

## Input

### Install and update libraries

In [None]:
!pip install --user --upgrade transformers diffusers accelerate

### Import libraries

In [2]:
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
from PIL import Image
import matplotlib.pyplot as plt
import torch

### Setup Variables

In [2]:
# Model to use: this is the identifier for the model repository that we're going to use. 
# In this case, we're using the "stable-diffusion-2" model from the "stabilityai" repository.
REPO_ID = "stabilityai/stable-diffusion-2"

# Number of denoising steps the model runs during inference.
# More steps usually improve image quality but make generation slower.
NUM_INFERENCE_STEPS = 25

# Image output path
IMAGE_PATH = "output.png"
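
To make the step count more concrete, the sketch below shows one simple way a scheduler could pick `NUM_INFERENCE_STEPS` timesteps out of a longer training schedule. It assumes 1000 training timesteps and evenly spaced selection; the function name `spaced_timesteps` is hypothetical, and the scheduler actually used in this notebook may space its steps differently.

```python
def spaced_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Pick evenly spaced timesteps, ordered from noisiest to cleanest."""
    if not 0 < num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be between 1 and num_train_timesteps")
    # Assumption: a fixed stride between consecutive timesteps.
    stride = num_train_timesteps // num_inference_steps
    return [i * stride for i in reversed(range(num_inference_steps))]


timesteps = spaced_timesteps(25)
print(len(timesteps), timesteps[:3], timesteps[-1])  # 25 [960, 920, 880] 0
```

With 25 steps the model is evaluated at only 25 of the 1000 noise levels, which is why a lower value runs faster.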

In [3]:
# Define the prompt: this is the description that the model will use as a basis to generate the image. 

prompt = "A person walking through a field of tall grass"

## Model

### Generate image from text

Using Stable Diffusion, we can generate an image from text.

In [None]:
# Create the DiffusionPipeline from the pretrained model in the specified repository.
# torch_dtype=torch.float16 runs computations in 16-bit floating point precision,
# which saves memory and can speed up inference, with a slight tradeoff in precision.
# revision="fp16" selects the repository revision (branch) that stores the weights in half precision.
pipe = DiffusionPipeline.from_pretrained(REPO_ID, torch_dtype=torch.float16, revision="fp16")

# Replace the scheduler of the pipeline
# The scheduler controls how noise is removed at each step of the diffusion process.
# Here, we swap the pipeline's default scheduler for DPMSolverMultistepScheduler,
# built from the current scheduler's configuration so that compatible settings carry over.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# Move the pipeline to GPU
# This moves all the computations of the pipeline to the GPU, which is typically much faster than the CPU for these types of tasks.
pipe = pipe.to("cuda")
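
The cell above assumes a CUDA GPU is available. As a hedged sketch, a small helper (hypothetical name `choose_device_and_dtype`) could fall back to CPU with 32-bit floats when no GPU is present, on the assumption that half precision is poorly supported on many CPUs. The commented lines show how it might plug into the pipeline setup; generation on CPU is expected to be much slower.

```python
def choose_device_and_dtype(cuda_available):
    """Return (device, dtype name): half precision on GPU, full precision on CPU."""
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


# Possible use in this notebook (commented out so the sketch runs without torch):
# device, dtype_name = choose_device_and_dtype(torch.cuda.is_available())
# pipe = DiffusionPipeline.from_pretrained(REPO_ID, torch_dtype=getattr(torch, dtype_name))
# pipe = pipe.to(device)

print(choose_device_and_dtype(False))  # ('cpu', 'float32')
```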


## Output

### Generate the image

In [None]:
# Generate an image from a prompt
# This line uses the diffusion pipeline to generate an image from the provided prompt.
# The number of inference steps specified is used in the generation process.
# The output of the pipeline is a batch of images, and we're taking the first one from this batch.
image = pipe(prompt, num_inference_steps=NUM_INFERENCE_STEPS).images[0]

### Save and show the image

In [None]:
# Save the image
image.save(IMAGE_PATH)

# Display the image using matplotlib
img = Image.open(IMAGE_PATH)
plt.imshow(img)
plt.show()
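
Because `IMAGE_PATH` is a fixed file name, each run overwrites the previous image. One possible way to keep earlier results is to derive the file name from the prompt and add a counter when the file already exists. The helper below is a minimal sketch; the name `unique_image_path` and its parameters are hypothetical, not part of the notebook.

```python
import re
from pathlib import Path


def unique_image_path(prompt, directory=".", max_words=6):
    """Build a PNG path from the prompt that does not clash with existing files."""
    # Keep only lowercase letters and digits, limited to the first few words.
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:max_words]
    slug = "-".join(words) or "image"
    candidate = Path(directory) / f"{slug}.png"
    counter = 1
    # Append -1, -2, ... until the name is free.
    while candidate.exists():
        candidate = Path(directory) / f"{slug}-{counter}.png"
        counter += 1
    return candidate


print(unique_image_path("A person walking through a field of tall grass"))
```

The returned path could then be passed to `image.save(...)` in place of `IMAGE_PATH`.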