# Stable Diffusion API Server for Multimodal RAG

This notebook runs a Stable Diffusion API server on a Google Colab GPU runtime and exposes it through an ngrok tunnel so that you can call it from your local machine.

**Requirements:**
- Google Colab with GPU runtime (T4 or better)
- ngrok account (free) for tunneling
- (Optional) Hugging Face token for gated models

## 1. Install Dependencies

In [None]:
!pip install -q diffusers transformers accelerate safetensors flask pyngrok pillow
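
Before loading a model, it can help to confirm that the runtime actually has a GPU attached. The following is a minimal sketch, not part of the original notebook: `gpu_status` is a hypothetical helper that looks for the `nvidia-smi` binary and, if it is present, asks it for the GPU name.

```python
import shutil
import subprocess


def gpu_status():
    """Return the GPU name reported by nvidia-smi, or None if no GPU is visible."""
    # nvidia-smi is only on PATH when an NVIDIA driver is installed
    if shutil.which("nvidia-smi") is None:
        return None
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return None
    names = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return names[0] if names else None


name = gpu_status()
if name:
    print(f"GPU detected: {name}")
else:
    print("No GPU detected. In Colab, pick a GPU under Runtime > Change runtime type.")
```

If the helper returns `None`, the model-loading cell in section 3 will fail at `pipe.to("cuda")`, so it is worth fixing the runtime type first.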

## 2. (Optional) Setup Hugging Face Token

**Only needed if you use gated models such as `stabilityai/stable-diffusion-2-1`.**

Get your token from: https://huggingface.co/settings/tokens

In [None]:
# Option 1: Use Colab Secrets (Recommended)
# Add HF_TOKEN to your Colab secrets
from google.colab import userdata
try:
    HF_TOKEN = userdata.get('HF_TOKEN')
    print("✅ HF_TOKEN loaded from Colab secrets")
except Exception:
    # The secret is missing or this notebook has no access to it
    print("⚠️ HF_TOKEN not found in secrets")
    # Option 2: Paste token directly (less secure)
    HF_TOKEN = ""  # Paste your token here if needed

# Login to Hugging Face
if HF_TOKEN:
    from huggingface_hub import login
    login(token=HF_TOKEN)
    print("✅ Logged in to Hugging Face")
else:
    print("⚠️ No HF token provided. Will use public models only.")
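
When debugging authentication it is tempting to print the token itself. A safer habit is to log only a masked form. The sketch below is illustrative: `mask_token` and `token_from_env` are hypothetical helpers, not part of `huggingface_hub`, and the fallback to an `HF_TOKEN` environment variable is an assumption for runs outside Colab.

```python
import os


def mask_token(token, visible=4):
    """Return a log-safe version of a secret, keeping only the last few characters."""
    if not token:
        return "<none>"
    if len(token) <= visible:
        return "*" * len(token)
    return "*" * (len(token) - visible) + token[-visible:]


def token_from_env(var_name="HF_TOKEN"):
    """Read a token from the environment, returning "" when it is unset."""
    return os.environ.get(var_name, "").strip()


print("Token in use:", mask_token(token_from_env()))
```

Only the masked string should reach the cell output, since Colab notebooks are often shared with their outputs intact.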

## 3. Load Stable Diffusion Model

**Choose one of the following models:**
- `runwayml/stable-diffusion-v1-5` - No auth required, fast
- `stabilityai/stable-diffusion-2-1` - Requires HF token, better quality
- `CompVis/stable-diffusion-v1-4` - No auth required
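
The choice of model and the presence of a token can be checked before any download starts, which fails faster than waiting for the Hub to reject the request. This is a minimal sketch under the assumptions stated in the list above; `MODEL_NEEDS_TOKEN` and `check_model_choice` are hypothetical names, and the auth flags simply mirror this notebook's list rather than the current model cards.

```python
# Auth requirements as listed in this notebook; adjust if the model cards change
MODEL_NEEDS_TOKEN = {
    "runwayml/stable-diffusion-v1-5": False,
    "stabilityai/stable-diffusion-2-1": True,
    "CompVis/stable-diffusion-v1-4": False,
}


def check_model_choice(model_id, hf_token=""):
    """Raise ValueError early if the chosen model needs a token that is missing."""
    if model_id not in MODEL_NEEDS_TOKEN:
        raise ValueError(f"Unknown model: {model_id}")
    if MODEL_NEEDS_TOKEN[model_id] and not hf_token:
        raise ValueError(f"{model_id} requires a Hugging Face token")
    return model_id


print(check_model_choice("runwayml/stable-diffusion-v1-5"))
```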

In [None]:
import torch
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
from PIL import Image
import io
import base64

# Model configuration - Choose one:
# Option 1: No authentication required (Recommended for quick start)
model_id = "runwayml/stable-diffusion-v1-5"

# Option 2: Better quality but requires HF token
# model_id = "stabilityai/stable-diffusion-2-1"

# Option 3: Alternative public model
# model_id = "CompVis/stable-diffusion-v1-4"

print(f"Loading Stable Diffusion model: {model_id}...")

try:
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        safety_checker=None,
        # `token` replaces the deprecated `use_auth_token` argument
        token=HF_TOKEN if 'HF_TOKEN' in globals() and HF_TOKEN else None
    )
    
    # Use DPM solver for faster generation
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    
    # Move to GPU
    pipe = pipe.to("cuda")
    
    print("✅ Model loaded successfully!")
    print(f"   Model: {model_id}")
    print(f"   Device: {pipe.device}")
    
except Exception as e:
    print(f"❌ Error loading model: {e}")
    print("\nTroubleshooting:")
    print("1. If using stabilityai/stable-diffusion-2-1:")
    print("   - Accept the license at: https://huggingface.co/stabilityai/stable-diffusion-2-1")
    print("   - Add HF_TOKEN to Colab secrets")
    print("2. Or use runwayml/stable-diffusion-v1-5 (no auth needed)")
    raise

## 4. Create Flask API Server

In [None]:
from flask import Flask, request, jsonify, send_file
from io import BytesIO

app = Flask(__name__)

@app.route('/health', methods=['GET'])
def health():
    return jsonify({"status": "healthy", "model": model_id})

@app.route('/generate', methods=['POST'])
def generate():
    try:
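        # Tolerate a missing or non-JSON body instead of failing with a 500
        data = request.get_json(silent=True) or {}
        
        prompt = data.get('prompt', '')
        if not prompt:
            return jsonify({"error": "prompt is required"}), 400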
        negative_prompt = data.get('negative_prompt', 'blurry, bad quality, distorted')
        num_inference_steps = data.get('num_inference_steps', 30)
        guidance_scale = data.get('guidance_scale', 7.5)
        height = data.get('height', 512)
        width = data.get('width', 512)
        
        print(f"Generating image for prompt: {prompt}")
        
        # Generate image
        image = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=num_inference_steps,
            guidance_scale=guidance_scale,
            height=height,
            width=width
        ).images[0]
        
        # Convert to bytes
        img_io = BytesIO()
        image.save(img_io, 'PNG')
        img_io.seek(0)
        
        return send_file(img_io, mimetype='image/png')
        
    except Exception as e:
        print(f"Error: {e}")
        return jsonify({"error": str(e)}), 500

print("✅ Flask app created!")
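
The endpoint passes client values straight to the pipeline. Stable Diffusion pipelines expect `height` and `width` to be divisible by 8, and very large step counts or image sizes can exhaust GPU memory or time out the request. A minimal sketch of input validation follows; `clean_params` and its limits (`MAX_STEPS`, `MAX_SIDE`) are hypothetical choices, not values from the original notebook.

```python
MAX_STEPS = 100   # assumed upper bound on denoising steps
MAX_SIDE = 1024   # assumed upper bound on image height and width


def clean_params(data):
    """Validate a /generate request body and return pipeline keyword arguments."""
    prompt = str(data.get("prompt", "")).strip()
    if not prompt:
        raise ValueError("prompt is required")

    steps = int(data.get("num_inference_steps", 30))
    if not 1 <= steps <= MAX_STEPS:
        raise ValueError(f"num_inference_steps must be between 1 and {MAX_STEPS}")

    size = {}
    for key in ("height", "width"):
        value = int(data.get(key, 512))
        if value % 8 != 0 or not 64 <= value <= MAX_SIDE:
            raise ValueError(f"{key} must be a multiple of 8 between 64 and {MAX_SIDE}")
        size[key] = value

    return {
        "prompt": prompt,
        "negative_prompt": data.get("negative_prompt", "blurry, bad quality, distorted"),
        "num_inference_steps": steps,
        "guidance_scale": float(data.get("guidance_scale", 7.5)),
        **size,
    }


print(clean_params({"prompt": "a red bicycle", "width": 640}))
```

Inside `generate()`, the result could be passed as `pipe(**clean_params(data))`, with `ValueError` and `TypeError` mapped to a 400 response so that client mistakes are not reported as server errors.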

## 5. Setup ngrok Tunnel

Get your ngrok auth token from: https://dashboard.ngrok.com/get-started/your-authtoken

In [None]:
from pyngrok import ngrok

# Set your ngrok auth token
NGROK_AUTH_TOKEN = "YOUR_NGROK_TOKEN_HERE"  # Replace with your token

ngrok.set_auth_token(NGROK_AUTH_TOKEN)

# Create tunnel
# ngrok.connect returns an NgrokTunnel object; keep only its URL string
public_url = ngrok.connect(5000).public_url
print("\n" + "="*60)
print("🚀 Stable Diffusion API is running!")
print("="*60)
print(f"\nPublic URL: {public_url}")
print("\nAdd this URL to your .env file:")
print(f"SD_API_URL={public_url}")
print("\n" + "="*60)
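
On the local machine, the client needs to turn the `SD_API_URL` value into full endpoint URLs. The sketch below assumes the variable has already been loaded into the process environment (for example by exporting it in the shell); `endpoint` is a hypothetical helper, and the URL in the usage line is a placeholder rather than a real tunnel address.

```python
import os


def endpoint(path, base_url=None):
    """Join the tunnel base URL with an API path such as '/generate'."""
    base = base_url or os.environ.get("SD_API_URL", "")
    if not base:
        raise RuntimeError("SD_API_URL is not set")
    # Normalize slashes so both 'https://host/' and '/path' inputs work
    return base.rstrip("/") + "/" + path.lstrip("/")


# Placeholder URL for illustration; use the one printed by the cell above
print(endpoint("/health", base_url="https://example.invalid/"))
```

Note that the tunnel URL typically changes each time the tunnel is recreated, so the `.env` entry has to be updated after a Colab restart.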

## 6. Start the Server

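**Important:** Keep this cell running to maintain the API server. `app.run` blocks until it is interrupted, so cells below it will not execute while the server is up.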

In [None]:
# Run the Flask app
app.run(port=5000)
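
If you want to keep using the notebook while the server is up, one option is to start the blocking call in a background thread instead. This is a sketch of that pattern, not the original notebook's approach; `start_in_background` is a hypothetical helper, and the commented call assumes the `app` object from section 4.

```python
import threading


def start_in_background(target, *args, **kwargs):
    """Run a blocking callable in a daemon thread and return the thread."""
    thread = threading.Thread(target=target, args=args, kwargs=kwargs, daemon=True)
    thread.start()
    return thread


# In this notebook the call would look like this (assumes `app` from section 4):
# server_thread = start_in_background(app.run, port=5000, use_reloader=False)

# Stand-in demonstration with a trivial callable
done = threading.Event()
start_in_background(done.set).join(timeout=2)
print("background task ran:", done.is_set())
```

A daemon thread ends when the Colab runtime stops, so the server still lives only as long as the session does.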

## 7. Test the API (Optional)

In [None]:
import requests
from PIL import Image
from io import BytesIO

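# Test generation. When running outside the notebook, set public_url to the
# URL printed in section 5.
# Accept either a plain URL string or a pyngrok NgrokTunnel object
base_url = getattr(public_url, "public_url", public_url)
url = f"{base_url}/generate"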

payload = {
    "prompt": "a beautiful sunset over the ocean, vibrant colors, photorealistic",
    "num_inference_steps": 30,
    "guidance_scale": 7.5
}

print("Sending request...")
response = requests.post(url, json=payload, timeout=120)

if response.status_code == 200:
    img = Image.open(BytesIO(response.content))
    display(img)
    print("✅ Image generated successfully!")
else:
    print(f"❌ Error: {response.status_code}")
    print(response.text)
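
From the local machine, the same call can be made without any third-party packages. The following is a minimal sketch using only the standard library; `build_generate_request` and `generate_to_file` are hypothetical helpers, and the base URL in the usage line is a placeholder for the tunnel URL printed in section 5.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, **options):
    """Build a POST request for the /generate endpoint defined in section 4."""
    body = json.dumps({"prompt": prompt, **options}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_to_file(base_url, prompt, out_path, timeout=120, **options):
    """Send the request and write the returned PNG bytes to out_path."""
    request = build_generate_request(base_url, prompt, **options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        png_bytes = response.read()
    with open(out_path, "wb") as f:
        f.write(png_bytes)
    return out_path


# Placeholder base URL; building the request makes no network call
req = build_generate_request(
    "https://example.invalid", "a lighthouse at dusk", num_inference_steps=20
)
print(req.get_method(), req.full_url)
```

Calling `generate_to_file(base_url, "a lighthouse at dusk", "out.png")` with the real tunnel URL would save the image locally; `urlopen` raises `urllib.error.HTTPError` when the server answers with an error status.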