# OpenVINO-Easy: Stable Diffusion Example

This notebook demonstrates how to use OpenVINO-Easy with Stable Diffusion for text-to-image generation.

## Features Demonstrated
- Automatic model downloading and conversion
- Device selection (NPU → GPU → CPU)
- INT8 quantization
- Performance benchmarking
- CLI usage

## Installation

First, install OpenVINO-Easy with Stable Diffusion support:

In [None]:
# Install OpenVINO-Easy with Stable Diffusion support
# (the requirement is quoted so shells such as zsh do not expand the brackets)
!pip install "openvino-easy[sd]"

## Import and Setup

In [None]:
import oe
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import json

# Fall back to "dev" when the package does not expose a version attribute
print(f"OpenVINO-Easy version: {getattr(oe, '__version__', 'dev')}")
print(f"Available devices: {oe.devices()}")

## Load Stable Diffusion Model

OpenVINO-Easy will automatically:
1. Download the model from Hugging Face
2. Convert it to OpenVINO format
3. Select the best available device
4. Cache the converted model for future use
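
The device selection in step 3 can be pictured as a first-match search over a preference list. The sketch below is not OpenVINO-Easy's implementation: `pick_device` and the hard-coded device lists are hypothetical and only illustrate the NPU → GPU → CPU order. It assumes device names may carry an index suffix such as `GPU.0`.

```python
from typing import Iterable, Sequence


def pick_device(
    available: Iterable[str],
    preference: Sequence[str] = ("NPU", "GPU", "CPU"),
) -> str:
    """Return the first preferred device family that is available (hypothetical helper)."""
    # Compare on the part before the dot so "GPU.0" counts as "GPU".
    families = {name.split(".")[0].upper() for name in available}
    for candidate in preference:
        if candidate.upper() in families:
            return candidate
    raise RuntimeError(f"None of {tuple(preference)} found in {sorted(families)}")


print(pick_device(["CPU", "GPU.0"]))  # GPU
print(pick_device(["CPU"]))           # CPU
```

With this ordering, a machine without an NPU or GPU still resolves to `CPU` instead of failing.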

In [None]:
# Load Stable Diffusion model with FP16 precision
print("Loading Stable Diffusion model...")
pipe = oe.load(
    "runwayml/stable-diffusion-v1-5",
    dtype="fp16",
    device_preference=("NPU", "GPU", "CPU"),
)

print(f"Model loaded on device: {pipe.device}")
print(f"Model info: {pipe.runtime.get_model_info()}")

## Generate Images

Now let's generate some images using the loaded model:

In [None]:
# Define prompts
prompts = [
    "a beautiful sunset over mountains, digital art",
    "a cyberpunk city at night with neon lights",
    "a cute cat sitting in a garden, watercolor style",
]

# Generate images
images = []
for i, prompt in enumerate(prompts):
    print(f"Generating image {i + 1}: {prompt}")

    # Run inference
    result = pipe.infer(prompt)

    # Convert result to a PIL Image (assuming result is a NumPy array in HWC layout)
    if isinstance(result, np.ndarray):
        # Drop a leading batch dimension or a single-channel axis
        result = result.squeeze()

        # Float outputs in [0, 1] must be scaled before the uint8 cast
        if np.issubdtype(result.dtype, np.floating) and result.max() <= 1.0:
            result = result * 255

        # Clip so out-of-range values do not wrap around, then cast for PIL
        result = np.clip(result, 0, 255).astype(np.uint8)
        img = Image.fromarray(result)
    else:
        # Handle other result types (for example, an object that is already a PIL Image)
        img = result

    images.append(img)
    print(f"Image {i + 1} generated successfully!")

print(f"Generated {len(images)} images")

## Display Generated Images

In [None]:
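# Display images side by side; squeeze=False keeps axes 2-D even for a single image
fig, axes = plt.subplots(1, len(images), figsize=(5 * len(images), 5), squeeze=False)

for ax, img, prompt in zip(axes[0], images, prompts):
    ax.imshow(img)
    # Truncate only prompts that are longer than 30 characters
    ax.set_title(prompt if len(prompt) <= 30 else f"{prompt[:30]}...")
    ax.axis("off")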

plt.tight_layout()
plt.show()

## Benchmark Performance

Let's benchmark the model to see its performance characteristics:

In [None]:
# Run benchmark
print("Running benchmark...")
stats = pipe.benchmark(warmup=3, runs=10)

# Display results
print("\nBenchmark Results:")
print(f"Device: {stats['device']}")
print(f"Average Latency: {stats['mean_ms']:.2f} ms")
print(f"FPS: {stats['fps']:.1f}")
print(f"P50 Latency: {stats['p50_ms']:.2f} ms")
print(f"P90 Latency: {stats['p90_ms']:.2f} ms")
print(f"Min Latency: {stats['min_ms']:.2f} ms")
print(f"Max Latency: {stats['max_ms']:.2f} ms")
print(f"Standard Deviation: {stats['std_ms']:.2f} ms")
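
To make the reported fields concrete, here is a minimal sketch of how statistics with these names could be derived from per-run timings. It is not the code behind `pipe.benchmark()`: the `summarize` and `benchmark` helpers are hypothetical, and the choices of FPS as `1000 / mean_ms`, inclusive percentiles, and population standard deviation are assumptions.

```python
import statistics
import time
from typing import Callable, Dict, List


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Reduce per-run latencies (in milliseconds) to summary statistics."""
    mean_ms = statistics.fmean(latencies_ms)
    # quantiles(n=10) returns the nine decile cut points; the last one is P90.
    deciles = statistics.quantiles(latencies_ms, n=10, method="inclusive")
    return {
        "mean_ms": mean_ms,
        "p50_ms": statistics.median(latencies_ms),
        "p90_ms": deciles[-1],
        "min_ms": min(latencies_ms),
        "max_ms": max(latencies_ms),
        "std_ms": statistics.pstdev(latencies_ms),
        # Assumed definition: runs per second at the mean latency.
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def benchmark(fn: Callable[[], object], warmup: int = 3, runs: int = 10) -> Dict[str, float]:
    """Call fn() `warmup` times untimed, then time `runs` further calls."""
    for _ in range(warmup):
        fn()  # warm-up results are discarded so one-time setup cost is excluded
    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return summarize(latencies_ms)


demo = summarize([10.0, 20.0, 30.0, 40.0, 50.0])
print(demo["mean_ms"], demo["p90_ms"])  # 30.0 46.0
```

Discarding the warm-up runs matters because the first calls often include compilation or memory allocation that later calls do not repeat.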

## Compare FP16 vs INT8 Quantization

Load the same model with INT8 quantization and compare it against the FP16 baseline measured above:

In [None]:
# Load model with INT8 quantization
print("Loading model with INT8 quantization...")
pipe_int8 = oe.load(
    "runwayml/stable-diffusion-v1-5",
    dtype="int8",
    device_preference=("NPU", "GPU", "CPU"),
)

# Benchmark INT8 model
print("Benchmarking INT8 model...")
stats_int8 = pipe_int8.benchmark(warmup=3, runs=10)

# Compare results
print("\nPerformance Comparison:")
print(
    f"FP16 - Device: {stats['device']}, FPS: {stats['fps']:.1f}, Latency: {stats['mean_ms']:.2f}ms"
)
print(
    f"INT8 - Device: {stats_int8['device']}, FPS: {stats_int8['fps']:.1f}, Latency: {stats_int8['mean_ms']:.2f}ms"
)

# Calculate improvement
fps_improvement = (stats_int8["fps"] - stats["fps"]) / stats["fps"] * 100
latency_improvement = (
    (stats["mean_ms"] - stats_int8["mean_ms"]) / stats["mean_ms"] * 100
)

print("\nINT8 vs FP16:")
print(f"FPS Improvement: {fps_improvement:+.1f}%")
print(f"Latency Improvement: {latency_improvement:+.1f}%")
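
The two percentages above are not interchangeable, because FPS is the reciprocal of latency. The worked example below uses made-up numbers (not measurements) and a hypothetical `percent_change` helper to show that a 20% latency reduction corresponds to a 25% FPS gain, assuming FPS is computed as `1000 / mean_ms`.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100.0


# Illustrative latencies only; substitute the values from your own benchmark.
fp16_ms, int8_ms = 100.0, 80.0

# Lower latency is better, so the sign is flipped to report an improvement.
latency_gain = -percent_change(fp16_ms, int8_ms)
fps_gain = percent_change(1000.0 / fp16_ms, 1000.0 / int8_ms)

print(f"{latency_gain:+.1f}% latency, {fps_gain:+.1f}% FPS")  # +20.0% latency, +25.0% FPS
```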

## CLI Usage

You can also use OpenVINO-Easy from the command line:

In [None]:
# List available devices
!oe devices

In [None]:
# Run inference via CLI
!oe run runwayml/stable-diffusion-v1-5 -p "a beautiful landscape" --dtype fp16

In [None]:
# Benchmark via CLI
!oe bench runwayml/stable-diffusion-v1-5 --warmup 3 --runs 10 --output benchmark_results.json

## Save and Load Benchmark Results

In [None]:
# Save the FP16 benchmark results (this overwrites the file written by the CLI cell above)
with open("benchmark_results.json", "w") as f:
    json.dump(stats, f, indent=2)

print("Benchmark results saved to benchmark_results.json")

# Load and display saved results
with open("benchmark_results.json", "r") as f:
    loaded_stats = json.load(f)

print("\nLoaded benchmark results:")
print(json.dumps(loaded_stats, indent=2))
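
Whether the stats dictionary holds plain Python numbers or NumPy scalars is not stated in this notebook. If it holds NumPy types such as `float32` or `int64`, `json.dump` raises a `TypeError`. The sketch below shows one possible fallback hook; `to_jsonable` and `FakeScalar` are hypothetical names, and `FakeScalar` only stands in for a NumPy scalar so the example runs without NumPy.

```python
import json


def to_jsonable(value):
    """Fallback for json.dump: convert array-like or scalar-like objects."""
    # NumPy arrays and scalars expose tolist() and item(); duck typing avoids a NumPy import.
    if hasattr(value, "tolist"):
        return value.tolist()
    if hasattr(value, "item"):
        return value.item()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


class FakeScalar:
    """Stand-in for a NumPy scalar (hypothetical, for demonstration only)."""

    def __init__(self, v):
        self._v = v

    def item(self):
        return self._v


text = json.dumps({"mean_ms": FakeScalar(12.5), "device": "CPU"}, default=to_jsonable)
print(text)  # {"mean_ms": 12.5, "device": "CPU"}
```

The same hook can be passed to `json.dump(stats, f, indent=2, default=to_jsonable)` when writing to a file.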

## Advanced Usage: Custom Cache Directory

In [None]:
# Load model with custom cache directory
pipe_custom = oe.load(
    "runwayml/stable-diffusion-v1-5", dtype="fp16", cache_dir="./my_cache"
)

print("Model loaded with custom cache directory")
print("Cache location: ./my_cache")
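
How OpenVINO-Easy names its cache entries is not shown in this notebook. As an illustration of the general idea only, a converted-model cache can key each entry on the model ID and precision, so that FP16 and INT8 builds of the same model do not collide. `cache_path` and `is_cached` below are hypothetical helpers, not part of the library.

```python
import hashlib
from pathlib import Path


def cache_path(model_id: str, dtype: str, cache_dir: str = "./my_cache") -> Path:
    """Map a (model, precision) pair to a stable directory under cache_dir."""
    key = f"{model_id}|{dtype}".encode("utf-8")
    # A short hash keeps directory names filesystem-safe regardless of the model ID.
    digest = hashlib.sha256(key).hexdigest()[:16]
    return Path(cache_dir) / digest


def is_cached(path: Path) -> bool:
    """Treat a non-empty directory as a cache hit."""
    return path.is_dir() and any(path.iterdir())


fp16_dir = cache_path("runwayml/stable-diffusion-v1-5", "fp16")
int8_dir = cache_path("runwayml/stable-diffusion-v1-5", "int8")
print(fp16_dir != int8_dir)  # True
```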

## Model Information

Get detailed information about the loaded model:

In [None]:
# Get model information
model_info = pipe.runtime.get_model_info()

print("Model Information:")
print(f"Device: {model_info['device']}")
print(f"Model ID: {model_info['model_id']}")
print(f"Data Type: {model_info['dtype']}")
print("\nInput Information:")
for name, info in model_info["input_info"].items():
    print(f"  {name}: shape={info['shape']}, dtype={info['dtype']}")
print("\nOutput Information:")
for name, info in model_info["output_info"].items():
    print(f"  {name}: shape={info['shape']}, dtype={info['dtype']}")

## Summary

This notebook demonstrated:

1. **Easy Model Loading** - One line to download, convert, and load models
2. **Automatic Device Selection** - NPU → GPU → CPU fallback
3. **Quantization Support** - FP16 and INT8 precision
4. **Performance Benchmarking** - Mean, percentile (P50/P90), and FPS measurements
5. **CLI Integration** - Command-line tools for quick testing
6. **Smart Caching** - Avoid redundant downloads and conversions

The whole workflow in this notebook rests on three calls:
- `oe.load()` - Load any model
- `pipe.infer()` - Run inference
- `pipe.benchmark()` - Measure performance