# Simple PyTorch-based Stable Diffusion

This notebook demonstrates how to run Stable Diffusion on Intel GPUs using PyTorch with Intel Extension for PyTorch (IPEX). Stable Diffusion is a latent text-to-image diffusion model that generates photorealistic images from a text prompt. The implementation uses the hardware acceleration available on Intel GPUs to speed up inference.

In this notebook, you will:
- Configure the environment for Intel GPU acceleration
- Install necessary dependencies
- Load a pre-trained Stable Diffusion model
- Generate an image from a text prompt

## Set Up PyTorch to Use Intel Extensions & Confirm GPU

This section configures PyTorch to work with Intel GPUs through Intel Extension for PyTorch (IPEX). IPEX optimizes PyTorch operations for Intel hardware, which can speed up deep learning workloads such as Stable Diffusion inference.

We'll start by setting up the Python environment and ensuring access to user-installed packages. Then we'll verify that the Intel GPU is properly detected and available for computation.

In [None]:
import os
import sys

# Resolve the per-user bin directory from the home directory; this also
# works when USER is unset or the home directory is not under /home
user_bin_path = os.path.expanduser("~/.local/bin")
sys.path.append(user_bin_path)
print(sys.path)
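
If the goal is to import packages installed with `pip install --user`, the standard library's `site` module reports the directory Python uses for them. The following is a minimal sketch of that check, offered as an alternative rather than a description of what the cell above does:

```python
import site
import sys

# Directory where `pip install --user` places packages for this interpreter
user_site = site.getusersitepackages()

# Add it to the import path only if it is not already there
if user_site not in sys.path:
    sys.path.append(user_site)

print(user_site)
```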

### Install Required Dependencies

The following packages are essential for running Stable Diffusion:

- **diffusers**: Hugging Face's library that provides implementations for state-of-the-art diffusion models
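- **accelerate**: Hugging Face's library for device placement, distributed execution, and mixed precision; diffusers uses it to load models faster with less memory
- **transformers**: Provides the pre-trained text encoder and tokenizer that Stable Diffusion uses to process the prompt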
- **tqdm**: Adds progress bars for long-running operations

The cell below installs these packages, or upgrades them to the latest versions that pip can resolve.

In [None]:
!{sys.executable} -m pip install --upgrade diffusers accelerate transformers tqdm
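
After installation, it can help to confirm which versions are actually present. The sketch below uses only the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical and not part of any of these packages.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


required = ["diffusers", "accelerate", "transformers", "tqdm"]
for name, ver in installed_versions(required).items():
    print(f"{name}: {ver or 'not installed'}")
```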

### Understanding Intel XPU Architecture

XPU is Intel's term for its unified programming model across Intel hardware accelerators. In PyTorch it appears as the `xpu` device type, which covers:

- Intel GPUs (like the Intel Arc series)
- Intel Data Center GPUs (like the Intel Data Center GPU Max 1100)
- Intel CPUs with integrated graphics

XPU provides several advantages for deep learning workloads:

1. **Performance Optimization**: Hardware-specific optimizations for neural network operations
2. **Memory Management**: Efficient memory usage for large models like Stable Diffusion
3. **Mixed Precision**: Support for lower precision formats (FP16, BF16) which accelerate computation
4. **Graph Optimization**: Runtime optimizations to eliminate redundant operations

In this notebook, we use `pipe.to("xpu")` to move our model to the Intel GPU, leveraging these acceleration capabilities.
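
To make the mixed-precision point concrete, the sketch below estimates how much memory a model's weights occupy at each precision. The parameter count is an illustrative round number, not a measurement of any particular model, and the helper name `weights_size_gib` is hypothetical. Activations and other runtime buffers are not included.

```python
# Bytes needed to store one value in each floating-point format
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2, "bf16": 2}


def weights_size_gib(num_params, dtype):
    """Approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_ELEMENT[dtype] / 2**30


example_params = 1_000_000_000  # hypothetical model size, for illustration only
for dtype in BYTES_PER_ELEMENT:
    print(f"{dtype}: {weights_size_gib(example_params, dtype):.2f} GiB")
```

Halving the bytes per value halves the weight footprint, which is one reason half-precision weights are attractive on GPUs with limited memory.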

In [None]:
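import torch
# Import IPEX after torch; it adds XPU support to PyTorch
import intel_extension_for_pytorch as ipex
# Print the name of the first Intel GPU to confirm it is detected
print(ipex.xpu.get_device_name(0))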
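
If the notebook might run on a machine without an Intel GPU, a defensive check can choose the device instead of failing. This is a minimal sketch with a hypothetical helper, `pick_device`. It assumes that `torch.xpu.is_available()` is the right availability check for your PyTorch and IPEX versions, and it falls back to the CPU otherwise.

```python
def pick_device():
    """Return "xpu" when PyTorch reports an Intel GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    try:
        # On IPEX-based setups, importing the extension enables XPU support
        import intel_extension_for_pytorch  # noqa: F401
    except Exception:
        # Assumption: a missing or broken IPEX install should not abort the check
        pass
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


device = pick_device()
print(device)
```

The result could then be passed to `pipe.to(device)`. Note that FP16 inference on a CPU may be slow or unsupported for some operations.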

## Set Up the Stable Diffusion Pipeline

Now we'll load and configure the Stable Diffusion model pipeline. This is a machine learning system that converts text descriptions into high-quality images.

The pipeline consists of several neural networks working together:
1. A text encoder that converts the input prompt into an embedding
2. A U-Net that gradually denoises random noise guided by the text embedding
3. A VAE decoder that converts the denoised latent representation into an RGB image
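
The data passed between these three stages can be sketched as shape arithmetic. The constants below are properties of Stable Diffusion v1.x that this notebook does not state: an 8x spatial downsampling in the VAE, 4 latent channels, and a text encoder that emits 77 token embeddings of width 768. The helper name `pipeline_shapes` is hypothetical.

```python
# Assumed architecture constants for Stable Diffusion v1.x
VAE_SCALE_FACTOR = 8   # each latent pixel covers an 8x8 patch of the image
LATENT_CHANNELS = 4    # channels in the latent the U-Net denoises
MAX_TOKENS = 77        # prompt is padded or truncated to this many tokens
TEXT_EMBED_DIM = 768   # width of each token embedding


def pipeline_shapes(height=512, width=512):
    """Return the tensor shape at each pipeline stage for one image."""
    if height % VAE_SCALE_FACTOR or width % VAE_SCALE_FACTOR:
        raise ValueError("height and width must be multiples of 8")
    return {
        "text_embedding": (1, MAX_TOKENS, TEXT_EMBED_DIM),
        "latent": (1, LATENT_CHANNELS,
                   height // VAE_SCALE_FACTOR, width // VAE_SCALE_FACTOR),
        "image": (1, 3, height, width),
    }


for stage, shape in pipeline_shapes().items():
    print(f"{stage}: {shape}")
```

The U-Net works on the small latent rather than the full image, which is what makes the diffusion process "latent" and keeps it tractable.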

We're using the widely adopted Stable Diffusion v1.5 model from Runway ML, loaded from a local copy on this system. The weights are loaded in half precision (FP16), which halves their memory footprint relative to FP32 and improves performance on the Intel GPU. After loading, we'll move the model to the Intel XPU device and generate our first image.

In [None]:
from diffusers import StableDiffusionPipeline

# Load the Stable Diffusion model from the specified path
pipe = StableDiffusionPipeline.from_pretrained("/home/common/data/Big_Data/GenAI/runwayml/stable-diffusion-v1-5",  
                                               revision="fp16", 
                                               torch_dtype=torch.float16)
# Move the model to the Intel GPU (Data Center GPU Max 1100)
pipe = pipe.to("xpu")

# The model is ready; generate an image from a text prompt
result = pipe("A cat riding a surfboard on a big wave")

# Display the generated image
image = result.images[0]
display(image)

# Save the image if desired
# image.save("cat_surfing.png")

## Results and Next Steps

Above, you can see the generated image of "A cat riding a surfboard on a big wave". The image demonstrates how Stable Diffusion interprets a text prompt and creates a corresponding visual representation. Because generation starts from random noise, each run produces a different image.

### Experiment Further

Try modifying the text prompt to generate different images. Here are some ideas:

- Adjust the level of detail in your prompt (e.g., "A detailed oil painting of a cat riding a blue surfboard on a massive wave at sunset")
- Try different artistic styles (e.g., "A watercolor sketch of a cat surfing")
- Experiment with completely different subjects and scenarios
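
To try several prompts in one go, a small loop can run each one and derive a file name from it. This is a sketch under assumptions: `generate_all` and `prompt_to_filename` are hypothetical helpers, and `pipe` is expected to behave like the pipeline loaded earlier, returning an object with an `images` list.

```python
import re


def prompt_to_filename(prompt, max_length=60):
    """Turn a prompt into a short, filesystem-friendly PNG file name."""
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return (slug[:max_length].rstrip("_") or "image") + ".png"


def generate_all(pipe, prompts):
    """Run each prompt through `pipe` and return (filename, image) pairs."""
    outputs = []
    for prompt in prompts:
        image = pipe(prompt).images[0]
        outputs.append((prompt_to_filename(prompt), image))
    return outputs


prompts = [
    "A detailed oil painting of a cat riding a blue surfboard on a massive wave at sunset",
    "A watercolor sketch of a cat surfing",
]

# With the notebook's `pipe` loaded, the results could be saved like this:
# for filename, image in generate_all(pipe, prompts):
#     image.save(filename)
```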

### Advanced Options

For more control over the generation process, you can customize additional parameters:

```python
result = pipe(
    prompt="Your detailed prompt here",
    negative_prompt="low quality, blurry",  # Features to steer away from
    num_inference_steps=50,               # More steps usually improve quality but take longer
    guidance_scale=7.5                    # Higher values follow the prompt more closely
)
```

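The values shown for `num_inference_steps` (50) and `guidance_scale` (7.5) are the pipeline defaults, so they make a good starting point. Change one parameter at a time to see its effect on the result.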
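
For intuition about `guidance_scale`: in classifier-free guidance, the model makes two noise predictions at each denoising step, one conditioned on the prompt and one unconditioned (or conditioned on the negative prompt, when one is given). The scale controls how far the result is pushed from the second toward the first. The sketch below shows that combination on plain Python lists rather than the pipeline's tensors; `apply_guidance` is a hypothetical name, not a diffusers function.

```python
def apply_guidance(noise_uncond, noise_text, guidance_scale=7.5):
    """Combine two noise predictions with the classifier-free guidance formula."""
    return [u + guidance_scale * (t - u) for u, t in zip(noise_uncond, noise_text)]


# Toy two-element "predictions" to show the effect of the scale
uncond = [0.0, 1.0]
text = [1.0, 1.0]

print(apply_guidance(uncond, text, 1.0))  # [1.0, 1.0], the text-conditioned prediction
print(apply_guidance(uncond, text, 7.5))  # [7.5, 1.0], pushed further along the difference
```

Where the two predictions agree (the second element), the scale has no effect. Where they differ, a larger scale amplifies the difference, which is why high values follow the prompt more strictly but can reduce variety.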