# Animated Storyboard Generator


This Colab notebook creates a web interface using **Gradio** to automatically generate animated storyboards. It integrates several powerful AI models to transform a simple topic into a full video with images and voiceover.

### Key Features:

* **AI-Powered Script & Prompt Generation:** Utilizes **OpenAI's GPT** to generate a complete storyboard script (visuals and narration) from a user-provided topic. It then automatically converts these scene descriptions into optimized prompts for the image generation model.

* **High-Quality Image Generation:** Employs **Stable Diffusion XL** (with a base and refiner model) to create visually appealing images corresponding to each scene in the script.

* **Realistic Voiceovers:** Integrates with the **Eleven Labs API** to generate high-quality, natural-sounding voiceovers for the narration part of the script.

* **Automated Animation:** Uses the **moviepy** library to seamlessly combine the generated images and audio into a final animated video storyboard.

* **Interactive Web Interface:** The entire process is wrapped in a user-friendly **Gradio** interface with two main tabs:
    * **Generate Storyboard:** Input a topic to create the script and generate all the necessary images.
    * **Create Animation:** A single click to compile the generated images and narrations into a final video.

### Core Technologies Used:

* **Web UI:** Gradio
* **Image Generation:** Stable Diffusion XL (via `diffusers`)
* **Language Models:** OpenAI GPT, BERT (for text embeddings)
* **Audio Generation:** Eleven Labs API
* **Video Processing:** moviepy
* **Core Libraries:** `transformers`, `accelerate`, `pillow`, `scikit-learn`, `openai`

## 1. Install Dependencies
# Installs the required libraries using pip. The -q flag suppresses most output.

In [None]:
## 1. Install Dependencies
!pip install -q diffusers transformers accelerate scikit-learn pillow huggingface-hub gradio openai==0.28 moviepy

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m76.5/76.5 kB[0m [31m6.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m363.4/363.4 MB[0m [31m4.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m13.8/13.8 MB[0m [31m23.3 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m24.6/24.6 MB[0m [31m81.2 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m883.7/883.7 kB[0m [31m49.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m664.8/664.8 MB[0m [31m850.1 kB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m211.5/211.5 MB[0m [31m6.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m56.3/56.3 MB[0m [31m18.1 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

# Import necessary libraries for the application.


In [None]:
import torch
from PIL import Image
import os
import io
import gradio as gr
import numpy as np
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline, DiffusionPipeline, EulerDiscreteScheduler, AutoencoderKL
from transformers import AutoTokenizer, AutoModel
from sklearn.metrics.pairwise import cosine_similarity
import zipfile
from google.colab import files
from moviepy.editor import ImageClip,AudioFileClip,concatenate_videoclips,CompositeAudioClip
from moviepy.config import change_settings
import requests
from typing import List, Tuple
import time
import json
import openai
import tqdm




# Define constant variables for models and API keys.


In [None]:
base_model = "stabilityai/stable-diffusion-xl-base-1.0"
style_model = "blink7630/storyboard-sketch"
ELEVENLABS_API_KEY = ""
VOICE_ID = "onwK4e9ZLuTAKqWW03F9"
GPT_MODEL = "gpt-4.1"
GPT_MODEL_2  = "gpt-4o"
openai_api_key = ""
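
The cell above leaves `ELEVENLABS_API_KEY` and `openai_api_key` as empty strings to be filled in by hand. One possible alternative, sketched here, is to read them from environment variables so the secrets never get saved in the notebook. The helper `read_api_key` and the environment variable names are assumptions for illustration and are not part of the original notebook.

```python
import os

def read_api_key(env_name: str, fallback: str = "") -> str:
    """Return the key stored in an environment variable, or the fallback if unset or blank."""
    value = os.environ.get(env_name, "").strip()
    return value if value else fallback

# Hypothetical environment variable names; adjust them to your own setup.
ELEVENLABS_API_KEY = read_api_key("ELEVENLABS_API_KEY")
openai_api_key = read_api_key("OPENAI_API_KEY")

if not ELEVENLABS_API_KEY or not openai_api_key:
    print("Warning: one or more API keys are missing; the API calls will fail.")
```

With this approach the keys would be set once per session (for example through `os.environ[...]` in a cell that is not shared) before the configuration cell runs.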

# Check and print available GPU information.


In [None]:
print(f"GPU Available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU Device: {torch.cuda.get_device_name(0)}")
    print(f"GPU Memory: {torch.cuda.get_device_properties(0).total_memory / (1024**3):.2f} GB")

GPU Available: True
GPU Device: Tesla T4
GPU Memory: 14.74 GB


## 3. Core Functions
# Functions to load the Stable Diffusion XL and BERT models and to compare prompts using BERT embeddings.


In [None]:
## 3. Core Functions
def load_models(model_id=base_model):
    """Load the SDXL base and refiner pipelines in fp16, plus the BERT tokenizer and model."""

    dtype = torch.float16
    # fp16-friendly VAE shared by the base and refiner pipelines
    vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=dtype)
    try:
        base = DiffusionPipeline.from_pretrained(
            model_id,  # honor the requested model instead of always loading base_model
            vae=vae,
            torch_dtype=dtype,
            variant="fp16",
            use_safetensors=True,
        )

        # Apply the storyboard-sketch LoRA style on top of the base model
        base.load_lora_weights(style_model)
        base.to("cuda")
        refiner = DiffusionPipeline.from_pretrained(
            "stabilityai/stable-diffusion-xl-refiner-1.0",
            text_encoder_2=base.text_encoder_2,
            vae=base.vae,
            torch_dtype=dtype,
            use_safetensors=True,
            variant="fp16",
        )
        refiner.to("cuda")
        bert_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        bert_model = AutoModel.from_pretrained("bert-base-uncased").to("cuda")
        return base, refiner, bert_tokenizer, bert_model
    except Exception as e:
        print(f"Error loading models: {e}")
        return None, None, None, None

def bert_sentence_embedding(sentence, tokenizer, model, device):
    """Get BERT embedding for a sentence"""
    tokens = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        output = model(**{k: v.to(device) for k, v in tokens.items()})
    return output.last_hidden_state[:, 0, :].cpu().numpy()

def find_similar_prompt(query, string_list, tokenizer, model, device):
    """Find the most similar prompt using BERT embeddings"""
    if not string_list:
        return 0
    query_embedding = bert_sentence_embedding(query, tokenizer, model, device)
    string_embeddings = [bert_sentence_embedding(s, tokenizer, model, device) for s in string_list]
    similarities = [cosine_similarity(query_embedding, s.reshape(1, -1))[0, 0] for s in string_embeddings]
    return similarities.index(max(similarities))
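
To make the selection step in `find_similar_prompt` easier to follow, here is a minimal sketch of the same idea without BERT: compute the cosine similarity between a query vector and each candidate, then return the index of the best match. The toy vectors stand in for the `[CLS]` embeddings, and the names `cosine` and `most_similar_index` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def most_similar_index(query, candidates):
    """Return the index of the candidate most similar to the query."""
    if not candidates:
        return 0  # mirrors the empty-list behaviour of find_similar_prompt
    scores = [cosine(query, c) for c in candidates]
    return scores.index(max(scores))

# Toy vectors standing in for sentence embeddings
query = [1.0, 0.0, 1.0]
candidates = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.1], [-1.0, 0.0, -1.0]]
print(most_similar_index(query, candidates))  # prints 1
```

The second candidate points in nearly the same direction as the query, so it wins even though its magnitude differs; cosine similarity ignores vector length.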

# Function to load models globally to avoid reloading.


In [None]:
# Global variables to store models
global_models = None

def load_global_models(model_id):
    """Load models globally to avoid reloading"""
    global global_models
    device = "cuda"
    if global_models is None or global_models[0] != model_id:
        base, refiner, bert_tokenizer, bert_model = load_models(model_id)
        global_models = (model_id, base, refiner, bert_tokenizer, bert_model)

    return global_models[1], global_models[2], global_models[3], global_models[4]

## 4. Gradio Interface Function


In [None]:
## 4. Gradio Interface Function
def generate_storyboard(
    prompts,
    model_id=base_model,
    prefix=" ",
    strength=0.8,
    guidance_scale=7.5,
    use_similarity=True,
    progress=gr.Progress()
):
    prompt_list = [p.strip() for p in prompts.splitlines() if p.strip()]
    if not prompt_list:
        return None, "Error: No valid prompts provided"
    device = "cuda"
    try:
        progress(0, desc="Loading models...")
        base, refiner, bert_tokenizer, bert_model = load_global_models(model_id)

        generated_images = []

        for idx, prompt in enumerate(prompt_list):
            full_prompt = prefix + prompt
            progress_val = 0.05 + 0.9 * (idx / len(prompt_list))
            progress(progress_val, desc=f"Generating image {idx+1}/{len(prompt_list)}: {prompt}")

            # Step 1: Generate latent image from base model
            base_output = base(
                prompt=full_prompt,
                num_inference_steps=50,
                guidance_scale=guidance_scale,
                output_type="latent",
                return_dict=True
            )

            # Step 2: Refine using refiner
            refined_output = refiner(
                prompt=full_prompt,
                num_inference_steps=20,
                guidance_scale=guidance_scale,
                image=base_output.images,
                return_dict=True
            )

            final_image = refined_output.images[0]
            generated_images.append(final_image)

        # Create a grid of images
        rows = (len(generated_images) + 2) // 3
        grid_height = rows * 512
        grid_width = min(3, len(generated_images)) * 512

        grid = Image.new('RGB', (grid_width, grid_height))
        for i, img in enumerate(generated_images):
            row = i // 3
            col = i % 3
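            # Resize each frame to the 512x512 tile so larger SDXL outputs do not overlap in the grid
            grid.paste(img.resize((512, 512)), (col * 512, row * 512))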

        os.makedirs("output", exist_ok=True)
        for i, img in enumerate(generated_images):
            img.save(f"output/frame_{i+1}.png")

        with open("output/prompts.txt", "w") as f:
            for i, prompt in enumerate(prompt_list):
                f.write(f"Frame {i+1}: {prompt}\n")

        with zipfile.ZipFile("output/storyboard.zip", "w") as zipf:
            for i in range(len(generated_images)):
                zipf.write(f"output/frame_{i+1}.png", f"frame_{i+1}.png")
            zipf.write("output/prompts.txt", "prompts.txt")

        return grid, "Storyboard generated successfully! Click the download button to get all images."

    except Exception as e:
        return None, f"Error generating storyboard: {e}"

def create_download_btn():
    """Create download button for the zip file"""
    if os.path.exists("output/storyboard.zip"):
        with open("output/storyboard.zip", "rb") as f:
            content = f.read()
        return content
    return None
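
The grid arithmetic in `generate_storyboard` (three tiles per row, 512 pixels per tile) can be sketched as a small pure function. The name `grid_layout` and its parameters are assumptions for illustration; the formulas follow the ones used above.

```python
def grid_layout(n_images: int, cols: int = 3, tile: int = 512):
    """Return (width, height, positions) for a row-major grid of square tiles."""
    if n_images <= 0:
        return 0, 0, []
    rows = (n_images + cols - 1) // cols  # ceiling division, same as (n + 2) // 3 for cols=3
    width = min(cols, n_images) * tile
    height = rows * tile
    # Top-left corner of each tile, filled left to right, then top to bottom
    positions = [((i % cols) * tile, (i // cols) * tile) for i in range(n_images)]
    return width, height, positions

width, height, positions = grid_layout(5)
print(width, height)   # prints 1536 1024
print(positions[-1])   # prints (512, 512)
```

Any image larger than `tile` would need to be resized before being pasted at these positions, otherwise neighbouring tiles overlap.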

# Function to generate a cute voice from text using Eleven Labs API.


In [None]:
def generate_cute_voice(prompt, output_audio_path):
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"
    headers = {
        "xi-api-key": ELEVENLABS_API_KEY,
        "Content-Type": "application/json"
    }
    data = {
        "text": prompt,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.75,
            "style": 0.5,
            "use_speaker_boost": True
        }
    }

    response = requests.post(url, json=data, headers=headers)
    if response.status_code == 200:
        with open(output_audio_path, 'wb') as f:
            f.write(response.content)
    else:
        raise Exception(f"Failed to generate voice: {response.text}")
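
As a sketch of how the request in `generate_cute_voice` is put together, the helper below assembles the URL, headers, and JSON payload without sending anything, which makes the pieces easy to inspect or test offline. The function name `build_tts_request` is hypothetical; the endpoint, header names, and settings are copied from the function above.

```python
def build_tts_request(text: str, voice_id: str, api_key: str):
    """Assemble the URL, headers, and JSON payload for a text-to-speech call."""
    if not text.strip():
        raise ValueError("Narration text is empty")
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    headers = {
        "xi-api-key": api_key,
        "Content-Type": "application/json",
    }
    payload = {
        "text": text,
        "model_id": "eleven_monolingual_v1",
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.75,
            "style": 0.5,
            "use_speaker_boost": True,
        },
    }
    return url, headers, payload

url, headers, payload = build_tts_request("Light reaches the leaf.", "onwK4e9ZLuTAKqWW03F9", "demo-key")
print(url)
```

The result could then be passed to `requests.post(url, json=payload, headers=headers, timeout=60)`; adding a timeout is a suggestion here, not something the original function does.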

# Function to create an animated storyboard video with generated voiceovers.


In [None]:
def create_storyboard_animation(
    images,
    prompts,
    output_path="output/storyboard_animation.mp4",
    fps=24,
    gap_between_frames=1.0
):
    clips = []

    for i, (img, prompt) in enumerate(zip(images, prompts)):
        print(f"Generating voice for frame {i+1}...")

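        # Save under a separate name so these copies are not later mistaken for "frame_*" storyboard frames
        img_path = f"output/anim_frame_{i+1}.png"
        img.save(img_path)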

        audio_path = f"output/audio_{i}.mp3"
        generate_cute_voice(prompt, audio_path)

        audio_clip = AudioFileClip(audio_path)
        duration = audio_clip.duration + gap_between_frames

        image_clip = (
            ImageClip(img_path)
            .set_duration(duration)
            .set_audio(audio_clip)
            .fadein(0.3)
            .fadeout(0.3)
        )
        clips.append(image_clip)

    final_video = concatenate_videoclips(clips, method="compose")

    print(f"Exporting final video to: {output_path}")
    final_video.write_videofile(output_path, codec='libx264', audio_codec='aac', fps=fps)

    print("Video created successfully.")
    return output_path
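
The timing logic in `create_storyboard_animation` gives each frame the length of its voiceover plus `gap_between_frames`, and the clips play back to back. A minimal sketch of that arithmetic, with the hypothetical helper `clip_timeline` and made-up audio durations, shows where each clip would start and how long the final video would be.

```python
def clip_timeline(audio_durations, gap_between_frames=1.0):
    """Return ([(start, length), ...], total_length) for clips played back to back."""
    timeline = []
    start = 0.0
    for duration in audio_durations:
        clip_length = duration + gap_between_frames  # voiceover plus a pause on the frame
        timeline.append((start, clip_length))
        start += clip_length
    return timeline, start

# Example voiceover lengths in seconds (illustrative values only)
timeline, total = clip_timeline([2.0, 3.5], gap_between_frames=1.0)
print(timeline)  # prints [(0.0, 3.0), (3.0, 4.5)]
print(total)     # prints 7.5
```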

In [None]:
'''Act as a storyboard concept artist. Create an animated short for high-school and college students on the theme "{topic}".

STORY RULES
Produce 8 to 12 scenes.

Characters / Visual Focus:
1. Analyze the topic: decide whether it is best explained through human actions (e.g., "Teamwork", "Public Speaking", "Garbage Management") or through non-human elements / abstract visuals (e.g., "Photosynthesis", "Water Cycle", "Cloud Formation").
2. Establish the initial visual focus in Scene 1:
   • Generic human characters interacting with the topic, or
   • Specific non-human elements (plants, animals, objects, environments, diagrams, etc.).
3. Maintain consistency: whatever focus you establish in Scene 1 (people-centric or non-human) must remain consistent in every subsequent scene.

Ensure logical continuity so each scene flows naturally from the previous one (cause-and-effect, escalating stakes, or step-by-step problem solving).
Write narration only in third person—describe actions and emotions; no direct dialogue.
Keep all content age-appropriate, engaging, and relevant to older students.

OUTPUT FORMAT
Return only a JSON array—no extra text. Each element must follow this schema:

{{
  "scene_number": <integer>,
  "visual_cue": "Brief note on the setting or key visual action",
  "narration": "Third-person description of what is happening and why it matters"
}}
'''


# Define GPT models and API key for generating storyboard scripts.


In [None]:
GPT_MODEL = "gpt-4.1"
GPT_MODEL_2  = "gpt-4o"
openai_api_key = ""

# Store the last generated scenes and their narrations for later steps
last_scenes = []
narrations = []

def generate_storyboard_script(topic: str) -> list:
    if not topic.strip():
        raise ValueError("Topic cannot be empty")
    system_msg = (
    f'''Act as a storyboard concept artist. Create an animated short for high-school and college students on the theme: "{topic}".

STORY RULES

- Produce exactly **3 scenes**.
- Dynamically adapt the storytelling style:
  - If the topic is **non-technical** (e.g., "Teamwork", "Peer Pressure", "Procrastination"), use a **character-driven** approach with relatable human figures in everyday settings.
  - If the topic is **technical or scientific** (e.g., "Climate Change", "Blockchain", "Photosynthesis"), use a **diagrammatic or illustration-driven** approach with non-human elements like animated symbols, natural elements, or labeled visuals.
- In **Scene 1**, clearly establish the visual direction (character-based or illustration-based). Maintain this style consistently through all scenes.
- Ensure a **clear narrative flow** from one scene to the next—based on:
  - Cause and effect
  - Escalating tension
  - Sequential explanation
  - Step-by-step discovery
- Ensure that visual cues make sense for a text-to-image model.

VOICEOVER STYLE

- **Narration must be in third person** only. No direct speech.
- **Tone**: Conversational, engaging, slightly energetic—suitable for teen and young adult attention spans.
- **Length**: Keep each narration **crisp and punchy**, like lines in a well-paced animation.
- Describe **actions, feelings, or discoveries** to support visual storytelling.

OUTPUT FORMAT
Return only a JSON array—no extra text. Each element must follow this schema:

{{
  "scene_number": <integer>,
  "visual_cue": "Brief note on the setting or key visual description",
  "narration": "Third-person description of what is happening and why it matters"
}}
''')

    messages = [{"role": "system", "content": system_msg}]
    openai.api_key = openai_api_key
    resp = openai.ChatCompletion.create(
        model=GPT_MODEL,
        messages=messages,
        temperature=1.0,
    )
    raw = resp.choices[0].message.content.strip()
    try:
        scenes = json.loads(raw)

        global last_scenes
        last_scenes = scenes

        # Reset narrations so a new topic does not reuse voiceovers from an earlier run
        narrations.clear()

        print("Generated Scenes:")
        for sc in scenes:
            print(f"Scene {sc['scene_number']}:")
            print(f"  Visual Cue: {sc['visual_cue']}")
            print(f"  Narration: {sc['narration']}\n")
            narrations.append(sc['narration'])

        return scenes

    except json.JSONDecodeError as e:
        print("Error parsing JSON:", e)
        return []

# Instructions for the Stable Diffusion prompt builder.
_SD_INSTRUCTIONS = r"""
You are a **Stable-Diffusion prompt-builder** for **animation-style images**.
Given an idea {{input_idea}}, return **one comma-separated keyword string**—nothing else—suitable for SD-XL **≤ 77 tokens**.

─────────────────
GLOBAL RULE — Animation Focus
1. Every prompt must read like a frame from an animated film.
2. Start with **one** prefix that best matches the vibe (choose automatically unless the user names one):
   • animation still · animated frame · cartoon scene · cel-shaded concept art · hand-drawn sketch frame · stop-motion capture · claymation shot · paper-cut artwork
3. If the user *explicitly* requests **Anime**, override with → **masterpiece, best quality, (Anime:1.4),** at the very start.

─────────────────
PROMPT-BUILDING CHECKLIST
• Formula “[chosen animation prefix] of [main subject], [1–3 style cues]”
• Main subject 1–2 concrete nouns + adjectives; no long verb chains.
• Style cues concept-art tags, aesthetics (steampunk, vaporwave…), named artists (≤2), mediums (oil on canvas if you want painterly animation, etc.).
• Composition / camera portrait, ultrawide, macro, bird’s-eye… (optional).
• Color / lighting cinematic lighting, vivid colors, neon glow, golden hour…
• Positive phrasing state what *is*, never what *is not*.
• Specific counts use singular nouns or explicit numbers.
• Distill complexity bold, readable visuals trump over-detailed ones.
• Anything unstated is random—specify only what truly matters.
• Add "Vibrant colours" keyword in the prompt.



Special prefix for **photographic** requests (rare in animation workflows):
(((photographic, photo, photogenic))), extremely high quality high detail RAW color photo,

Forbidden anywhere in the output:
category labels (Subject, Medium, Style, Artist, Website, Resolution, Additional details, Color, Lighting), articles (“a”, “the”, “there”), quote marks, phrases “the image”, “the overall tone”, “by artist”.

If nudity is present, include **nude** and omit “tasteful” or “respectful”.

─────────────────
TRY LISTS (optionally pick 0–2 from each)
• Objects / Figures wizard, angel, necromancer, city, queen, temple, farm, rockstar …
• Feelings / Themes “sense of awe”, “birth of time”, “desire for knowledge”, “shores of infinity” …
• Styles cyberpunk, solarpunk, surreal, vaporwave, psychedelic, minimalism, impressionism …
• Mediums watercolor painting, charcoal sketch, woodblock print, graffiti mural, stone sculpture …
• Artists James Gurney, MC Escher, Salvador Dali, Alphonse Mucha, Greg Rutkowski, Studio Ghibli …


─────────────────
EXAMPLES

Input idea: **“Photosynthesis explained visually”**
Output keywords (≤77 tokens):
animation still of sunflower cross-section glowing chloroplasts absorbing sunlight, educational infographic style, vivid green and gold palette, warm backlight, ultra-detailed, concept art, by James Gurney, ArtStation

Input idea: **“Cyberpunk shinto priest”**
Output keywords:
cartoon scene of cyberpunk shinto priest in neon alley holding holographic ofuda, rain-slick pavement reflection, dramatic rim light, ultra-detailed, vaporwave palette, by Greg Rutkowski and Ross Tran
"""

# Function to generate a Stable Diffusion prompt for a given scene using the GPT model.
def sd_prompt_for_scene(scene: dict) -> str:
    """Ask the GPT model for one SD-XL keyword prompt describing the given scene."""
    openai.api_key = openai_api_key  # Set openai_api_key in the configuration cell
    response = openai.ChatCompletion.create(
        model=GPT_MODEL_2,
        temperature=1.0,
        messages=[
            {"role": "system", "content": _SD_INSTRUCTIONS},
            # The scene dict is sent as its string representation
            {"role": "user", "content": f"Scene:\n{scene}"},
        ],
    )
    text = response.choices[0].message.content.strip()
    print(text)
    return text

# Function to generate a list of Stable Diffusion prompts from a list of scenes.
def generate_sd_prompts(scenes: list) -> list:
    prompts = []
    for i, scene in enumerate(tqdm.tqdm(scenes, desc="Generating SD prompts")):
        prompt = sd_prompt_for_scene(scene)
        prompts.append(prompt)
        if i < len(scenes) - 1:
            time.sleep(5)
    return prompts

# Function to generate a storyboard from a list of prompts using Stable Diffusion models.
def generate_storyboard(
    prompt_list: list,
    model_id=base_model,
    prefix="high quality, detailed digital art of ",
    strength=0.8,
    guidance_scale=7.5,
    use_similarity=True,
    progress=gr.Progress()
):
    """Generate a storyboard from text prompts"""

    if not prompt_list:
        return None, "Error: No valid prompts provided"
    device = "cuda"
    try:
        progress(0, desc="Loading models...")
        base, refiner, bert_tokenizer, bert_model = load_global_models(model_id)
        generated_images = []
        for idx, prompt in enumerate(prompt_list):
            full_prompt = prefix + prompt
            progress_val = 0.05 + 0.9 * (idx / len(prompt_list))
            progress(progress_val, desc=f"Generating image {idx+1}/{len(prompt_list)}: {prompt}")

            # Step 1: Generate latent image from base model
            base_output = base(
                prompt=full_prompt,
                num_inference_steps=50,
                guidance_scale=guidance_scale,
                output_type="latent",
                return_dict=True
            )

            # Step 2: Refine using refiner
            refined_output = refiner(
                prompt=full_prompt,
                num_inference_steps=20,
                guidance_scale=guidance_scale,
                image=base_output.images,
                return_dict=True
            )

            final_image = refined_output.images[0]
            generated_images.append(final_image)

        # Create a grid of images
        rows = (len(generated_images) + 2) // 3  # 3 images per row
        grid_height = rows * 512
        grid_width = min(3, len(generated_images)) * 512

        grid = Image.new('RGB', (grid_width, grid_height))
        for i, img in enumerate(generated_images):
            row = i // 3
            col = i % 3
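            # Resize each frame to the 512x512 tile so larger SDXL outputs do not overlap in the grid
            grid.paste(img.resize((512, 512)), (col * 512, row * 512))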

        # Save images and prompts to output folder
        os.makedirs("output", exist_ok=True)
        for i, img in enumerate(generated_images):
            img.save(f"output/frame_{i+1}.png")

        with open("output/prompts.txt", "w") as f:
            for i, prompt in enumerate(prompt_list):
                f.write(f"Frame {i+1}: {prompt}\n")

        with zipfile.ZipFile("output/storyboard.zip", "w") as zipf:
            for i in range(len(generated_images)):
                zipf.write(f"output/frame_{i+1}.png", f"frame_{i+1}.png")
            zipf.write("output/prompts.txt", "prompts.txt")

        return grid, "Storyboard generated successfully! Click the download button to get all images."

    except Exception as e:
        return None, f"Error generating storyboard: {e}"

# Function to generate a storyboard and its corresponding Stable Diffusion prompts from a given topic.
def generate_from_topic(
    topic, model_id, prefix, strength, guidance_scale, use_similarity
):
    scenes = generate_storyboard_script(topic)
    prompt_pairs = generate_sd_prompts(scenes)
    return generate_storyboard(
        prompt_pairs,
        model_id=model_id,
        prefix=prefix,
        strength=strength,
        guidance_scale=guidance_scale,
        use_similarity=use_similarity
    )
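
`generate_storyboard_script` relies on the model returning a bare JSON array. Chat models sometimes wrap JSON in a Markdown code fence, so a slightly more forgiving parser may help. The sketch below is one possible approach, not the notebook's own method: `parse_scenes` is a hypothetical helper that strips an optional fence and checks that each element carries the three keys required by the output schema in the prompt.

```python
import json

FENCE = "`" * 3  # a Markdown code fence marker
REQUIRED_KEYS = {"scene_number", "visual_cue", "narration"}

def parse_scenes(raw: str) -> list:
    """Parse a model reply into a list of scene dicts, tolerating a surrounding code fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line (may carry a language tag)
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]         # drop the closing fence line
        text = "\n".join(lines)
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("Expected a JSON array of scenes")
    for item in data:
        if not isinstance(item, dict) or REQUIRED_KEYS - set(item):
            raise ValueError(f"Scene is missing required keys: {item!r}")
    return data

reply = FENCE + 'json\n[{"scene_number": 1, "visual_cue": "a leaf", "narration": "Light hits the leaf."}]\n' + FENCE
print(parse_scenes(reply)[0]["narration"])  # prints Light hits the leaf.
```

Malformed JSON still raises `json.JSONDecodeError`, a subclass of `ValueError`, so a caller can catch `ValueError` to handle both parse failures and schema failures in one place.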



# Function to create an animation from generated storyboard images and narrations.


In [None]:
def create_animation():
    try:
        # Check if images exist
        if not os.path.exists("output/frame_1.png"):
            return None, "Please generate a storyboard first"
        frame_files = [
            f for f in os.listdir("output")
            if f.startswith("frame_") and f.endswith(".png")
        ]

        frame_files.sort(key=lambda x: int(x.split("_")[1].split(".")[0]))

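        # Read frames and pair them with the stored narrations
        images = [Image.open(f"output/{f}") for f in frame_files]
        if not narrations:
            return None, "No narrations found. Please generate a storyboard first"
        min_len = min(len(images), len(narrations))
        images = images[:min_len]

        video_path = create_storyboard_animation(images=images, prompts=narrations[:min_len])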

        return video_path, "Animation created successfully! Click the download button to get the video."

    except Exception as e:
        return None, f"Error creating animation: {e}"
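
`create_animation` sorts the frame files by the number embedded in their names rather than alphabetically, which matters once there are ten or more frames. A small sketch of that idea, using a regular expression instead of string splitting (the helper `sorted_frames` is hypothetical), also skips any file that does not match the `frame_<n>.png` pattern.

```python
import re

FRAME_PATTERN = re.compile(r"^frame_(\d+)\.png$")

def sorted_frames(filenames):
    """Return only frame_<n>.png names, ordered by their numeric index."""
    indexed = []
    for name in filenames:
        match = FRAME_PATTERN.match(name)
        if match:
            indexed.append((int(match.group(1)), name))
    return [name for _, name in sorted(indexed)]

files = ["frame_10.png", "frame_2.png", "prompts.txt", "frame_1.png", "storyboard.zip"]
print(sorted_frames(files))  # prints ['frame_1.png', 'frame_2.png', 'frame_10.png']
```

A plain alphabetical sort would place `frame_10.png` before `frame_2.png`, which would pair frames with the wrong narration.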

## 5. Create Gradio Interface
# Defines the Gradio interface for the application.

In [None]:
custom_css = """
body {
    background-color: #000814 !important;
    color: #00f0ff !important;
    font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
}

.gradio-container, .gr-block, .gr-panel, .gr-box {
    background-color: #000814 !important;
    color: #00f0ff !important;
    border: none !important;
}

h1, h2, h3,
.gr-markdown h1, .gr-markdown h2, .gr-markdown h3 {
    color: #00f0ff !important;
    text-shadow: 0 0 6px rgba(0, 240, 255, 0.6);
}

input, textarea, select {
    background-color: #001d2e !important;
    color: #00f0ff !important;
    border: 1px solid #00f0ff !important;
    box-shadow: 0 0 10px rgba(0, 240, 255, 0.3) inset !important;
    border-radius: 6px;
}

button, .gr-button {
    background-color: #001d2e !important;
    color: #00f0ff !important;
    border: 1px solid #00f0ff !important;
    box-shadow: 0 0 12px rgba(0, 240, 255, 0.5);
    border-radius: 6px;
    transition: background-color 0.3s;
}
button:hover, .gr-button:hover {
    background-color: #003344 !important;
    box-shadow: 0 0 16px rgba(0, 240, 255, 0.75);
}

.gr-slider .noUi-target {
    background: #001a26 !important;
    border: 1px solid #00f0ff !important;
}
.gr-slider .noUi-connect {
    background: #00f0ff !important;
}
.gr-slider .noUi-handle {
    background: #003344 !important;
    border: 2px solid #00f0ff !important;
    box-shadow: 0 0 10px rgba(0, 240, 255, 0.6);
}

.gr-tabs, .gr-tabitem {
    background-color: #00121f !important;
    color: #00f0ff !important;
}
.gr-tabs .tabitem.selected {
    background-color: #003344 !important;
    border-bottom: 2px solid #00f0ff !important;
}

.gr-image img, .gr-video video {
    border: 1px solid #00f0ff !important;
    border-radius: 8px;
    box-shadow: 0 0 16px rgba(0, 240, 255, 0.3);
}

input[type="checkbox"] {
    accent-color: #00f0ff !important;
}
.gr-checkbox label {
    color: #00f0ff !important;
}
"""

# Add css=custom_css to your interface:
with gr.Blocks(title="Stable Diffusion Storyboard Generator", css=custom_css) as demo:
    gr.Markdown("# 🎬 Stable Diffusion Storyboard Generator")
    gr.Markdown("Generate a sequence of images for a visual storyboard from text prompts")

    with gr.Tabs():
        with gr.TabItem("Generate Storyboard"):
            with gr.Row():
                with gr.Column():
                    topic = gr.Textbox(
                        label="Storyboard Topic (one line)",
                        placeholder="e.g. Photosynthesis process in a leaf…"
                    )

                    with gr.Row():
                        model_id = gr.Dropdown(
                            label="Model",
                            choices=[base_model],
                            value=base_model
                        )
                        prefix = gr.Textbox(
                            label="Style Prefix",
                            value="high quality, detailed digital art of "
                        )

                    with gr.Row():
                        strength = gr.Slider(
                            label="Transformation Strength",
                            minimum=0.1,
                            maximum=1.0,
                            value=0.8,
                            step=0.05
                        )
                        guidance_scale = gr.Slider(
                            label="Guidance Scale",
                            minimum=1.0,
                            maximum=20.0,
                            value=7.5,
                            step=0.5
                        )

                    use_similarity = gr.Checkbox(
                        label="Use similarity for continuity",
                        value=True
                    )

                    generate_btn = gr.Button("🚀 Generate Storyboard", variant="primary")
                    download_btn = gr.File(label="Download ZIP with all images", interactive=False)
                    status = gr.Textbox(label="Status", interactive=False)

                with gr.Column():
                    gallery = gr.Image(label="Generated Storyboard", interactive=False)
                    Feedback = gr.Textbox(
                        label="Feedback",
                        placeholder="Provide Feedback for the images generated.",
                        lines=10
                    )

            generate_btn.click(
                fn=generate_from_topic,
                inputs=[
                    topic,
                    model_id,
                    prefix,
                    strength,
                    guidance_scale,
                    use_similarity
                ],
                outputs=[gallery, status]
            ).then(
                fn=create_download_btn,
                inputs=None,
                outputs=download_btn
            )

        with gr.TabItem("Create Animation"):
            with gr.Row():
                with gr.Column():
                    animate_btn = gr.Button("🎞️ Create Animation", variant="primary")
                    video_status = gr.Textbox(label="Animation Status", interactive=False)
                    video_download = gr.File(label="Download Animation", interactive=False)

                with gr.Column():
                    video_output = gr.Video(label="Preview Animation")

            animate_btn.click(
                fn=create_animation,
                inputs=[],
                outputs=[video_output, video_status]
            )

    gr.Markdown("""
    ## How to use
    1. Enter a storyboard topic (one line)
    2. Adjust generation parameters as needed
    3. Click the "Generate Storyboard" button
    4. Download the ZIP file containing all images
    5. Open the "Create Animation" tab and click "Create Animation" to build the narrated video

    ## Parameters
    - **Style Prefix**: Text added to the beginning of each prompt
    - **Transformation Strength**: Accepted by the interface but not currently used by the generation code
    - **Guidance Scale**: How closely to follow the prompt (higher = more faithful)
    - **Use similarity for continuity**: Accepted by the interface but not currently used by the generation code
    """)

demo.launch(share=True, debug=True)


Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://3398649297fc5445dc.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
