<a href="https://www.kaggle.com/code/pgvishnu526/prompt2manga?scriptVersionId=282177114" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [1]:

!pip install diffusers transformers accelerate torch safetensors fpdf2

Collecting fpdf2
  Downloading fpdf2-2.8.5-py3-none-any.whl.metadata (76 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch)
  Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manyl

# 📚 Prompt2Manga – Multi-Agent Manga Generation with Gemini

## 1. Problem Statement

Creating a short manga or comic usually requires:
- Character design
- Story writing
- Dialog scripting
- Panel-wise illustration
- Layout and publishing

Doing this manually is time-consuming and requires multiple skills (writer, artist, designer).

## 2. Solution Overview

**Prompt2Manga** is an AI-powered, **Sequential Multi-Agent System** that converts a simple user prompt into a **5-page manga PDF**.

Given a user prompt (e.g., *"Create a 5-page manga about a robot who learns human emotions"*), the system:

1. Generates characters and their traits  
2. Builds a story outline  
3. Converts the story into a panel-wise script  
4. Generates images using a diffusion model  
5. Compiles everything into a final manga PDF

All of this runs inside a **single Kaggle notebook**.

---

## 3. Key Features (Course Concepts Used)

This project demonstrates multiple agentic / GenAI concepts:

- ✅ **Sequential Multi-Agent Pipeline**  
  Each agent focuses on one task and passes structured output to the next agent.

- ✅ **LLM-Orchestrated Workflow**  
  Gemini (or another LLM) is used to:
  - Generate characters
  - Plan the story
  - Write dialog/script

- ✅ **Tool-Using Agent**  
  One sub-agent triggers image generation using a **Stable Diffusion pipeline** (tool use).

- ✅ **Automated Document Generation**  
  A final “publisher agent” composes images and text into a **multi-page manga PDF**.

---

## 4. High-Level Architecture

The system has the following agents:

1. **Character Agent**  
   - Input: User prompt  
   - Output: JSON describing the main protagonist (name, type, appearance keywords, and personality).

2. **Story Agent**  
   - Input: User prompt + character list  
   - Output: Story outline divided into pages and panels.

3. **Script Agent**  
   - Input: Story outline  
   - Output: Panel-wise script including dialogues and scene descriptions.

4. **Illustration Agent**  
   - Input: Panel descriptions  
   - Output: Panel images generated using Stable Diffusion.

5. **Publisher Agent**  
   - Input: Script + images  
   - Output: A **5-page PDF** saved as `Final_Manga_Comic.pdf`.

Data flows in a **linear sequence**:  
**User Prompt → Character Agent → Story Agent → Script Agent → Illustration Agent → Publisher Agent**.

---

## 5. Architecture Diagram (Logical View)

You can visualize the system like this:

**User Prompt**  
⬇  
**Character Agent** → Characters JSON  
⬇  
**Story Agent** → Story Outline (pages + panels)  
⬇  
**Script Agent** → Panel Script (dialog + descriptions)  
⬇  
**Illustration Agent** → Panel Images  
⬇  
**Publisher Agent** → 📄 `Final_Manga_Comic.pdf`
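
The linear hand-off above can be sketched in plain Python, without any LLM or diffusion model. This is only an illustration of the pattern: the stage functions, the `State` dictionary, and the placeholder values are hypothetical and are not the notebook's actual agents.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Stage = Callable[[State], State]


def character_stage(state: State) -> State:
    # Placeholder: a real agent would ask the LLM for character data.
    return {**state, "characters": [{"name": "Hero", "role": "protagonist"}]}


def story_stage(state: State) -> State:
    # Placeholder outline with two panels instead of a full story.
    panels = [{"index": i, "description": f"panel {i}"} for i in (1, 2)]
    return {**state, "panels": panels}


def script_stage(state: State) -> State:
    # Attach a dialogue field to every panel produced by the previous stage.
    panels = [{**p, "dialogue": "..."} for p in state["panels"]]
    return {**state, "panels": panels}


def illustration_stage(state: State) -> State:
    # Record where each image would be written; nothing is rendered here.
    panels = [{**p, "image_path": f"scene_{p['index']}.png"} for p in state["panels"]]
    return {**state, "panels": panels}


def publisher_stage(state: State) -> State:
    # Placeholder: a real publisher would build the PDF and return its path.
    return {**state, "pdf_path": "manga.pdf"}


def run_pipeline(prompt: str, stages: List[Stage]) -> State:
    """Run each stage in order, feeding it the state left by the previous one."""
    state: State = {"prompt": prompt}
    for stage in stages:
        state = stage(state)
    return state


STAGES = [character_stage, story_stage, script_stage, illustration_stage, publisher_stage]
result = run_pipeline("a robot who learns human emotions", STAGES)
print(result["pdf_path"], [p["image_path"] for p in result["panels"]])
```

Each stage only reads keys written by earlier stages, which is what makes the order of the list significant.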

---

## 6. Technologies Used

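- **Language Model**: Gemini (`gemini-2.5-flash-lite`), orchestrated with Google ADK
- **Image Generation**: Stable Diffusion via Diffusers (`stablediffusionapi/anything-v5` checkpoint)
- **PDF Creation**: fpdf2 (imported as `fpdf`)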
- **Runtime**: Kaggle Notebook (Python)
- **Hardware**: GPU (T4) for faster image generation

---

## 7. How to Run (Kaggle Setup)

1. **Enable GPU**
   - Go to **Settings → Accelerator → GPU (T4)**

2. **Install Dependencies** (run the setup cell)
   - `diffusers`, `transformers`, `accelerate`, `fpdf2`, etc.

3. **Run All Cells**
   - From the top menu: **Run → Run All**
   - Or run each section in order:
     1. Imports & Setup  
     2. Model Loading  
     3. Agent Definitions  
     4. Manga Generation Pipeline  
     5. PDF Creation

4. **Output Location**
   - The final manga is saved at:
     ```bash
     /kaggle/working/Final_Manga_Comic.pdf
     ```

---

## 8. File Outputs

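- `Final_Manga_Comic.pdf` – 5-page manga generated from the user prompt  
- Individual panel images (`scene_1.png`, `scene_2.png`, ...) stored in `/kaggle/working/images/`  
- `manga_script.json` and `manga_state.json` in `/kaggle/working/` (the saved script, and the script plus image paths)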

---

## 9. Limitations & Future Work

- Image generation speed depends on the GPU.
- The page count is fixed at 5: the story agent always produces 10 scenes and the PDF tool places 2 per page.
- Future improvements:
  - Interactive UI for prompt input
  - Parameter control for style (chibi, realistic, etc.)
  - Multi-language support for dialogues


In [2]:
import os
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
GEMINI_API_KEY = user_secrets.get_secret("GEMINI_API_KEY")
os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY

print("✅ API Key successfully loaded into Environment.")

✅ API Key successfully loaded into Environment.


**Creating the tools for our use case**

In [3]:
import json
import torch
import time
from fpdf import FPDF
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler

# ============================================================
# 1. MODEL SETUP – Stable Diffusion (Anything V5 for Manga/Anime)
# ============================================================

print("🔄 Loading Anime Model...")
try:
    # "Anything V5" is a Stable Diffusion variant tuned for anime/manga style artwork
    model_id = "stablediffusionapi/anything-v5"
    
    # Load the diffusion pipeline in half precision (float16) on GPU
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16
    ).to("cuda")
    
    # Use an efficient scheduler for faster and cleaner images
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    
    # Disable the safety checker to allow action scenes / intense visuals
    pipe.safety_checker = None
    pipe.requires_safety_checker = False
    
    print("✅ Model Loaded & Safety Checker Disabled (Action Scenes Allowed).")
except Exception as e:
    # If model load fails, we keep pipe=None so later tools can fall back gracefully
    print(f"⚠️ Model Load Error: {e}")
    pipe = None


# ============================================================
# 2. GLOBAL PATHS – Where we store images, script, and state
# ============================================================

WORK_DIR   = "/kaggle/working"
IMG_DIR    = os.path.join(WORK_DIR, "images")          # All generated panel images
SCRIPT_FILE = os.path.join(WORK_DIR, "manga_script.json")  # Script from the LLM/agent
STATE_FILE  = os.path.join(WORK_DIR, "manga_state.json")   # Script + image paths for PDF

os.makedirs(IMG_DIR, exist_ok=True)


# ============================================================
# 3. HELPER – Fix encoding issues for PDF text
# ============================================================

def sanitize_text(text):
    """
    Cleans Unicode punctuation (like smart quotes, em-dash, ellipsis)
    and converts text to a PDF-safe Latin-1 representation.
    """
    if not isinstance(text, str):
        return str(text)
    
    replacements = {
        "\u2026": "...",  # ellipsis
        "\u2018": "'",    # left single quote
        "\u2019": "'",    # right single quote
        "\u201c": '"',    # left double quote
        "\u201d": '"',    # right double quote
        "\u2013": "-",    # en dash
        "\u2014": "-"     # em dash
    }
    for k, v in replacements.items():
        text = text.replace(k, v)
    
    # Ensure the final string is Latin-1 compatible for FPDF
    return text.encode('latin-1', 'replace').decode('latin-1')


# ============================================================
# 4. TOOL 1 – Save Script from Agent
# ============================================================

def save_script_tool(json_input: str):
    """
    TOOL 1 – Accepts a JSON string from the LLM/agent, 
    cleans it, parses it, and saves to SCRIPT_FILE.
    
    Input : json_input (string, may contain ```json ... ``` wrapper)
    Output: Status string ("SCRIPT_SAVED" or error message)
    """
    try:
        # Remove common Markdown fencing like ```json ... ```
        clean = json_input.replace("```json", "").replace("```", "").strip()
        
        # Parse JSON into Python structure
        data = json.loads(clean)
        
        # Save pretty-printed JSON for easier debugging
        with open(SCRIPT_FILE, 'w') as f:
            json.dump(data, f, indent=2)
        
        return "SCRIPT_SAVED"
    except Exception as e:
        return f"Save Error: {e}"


# ============================================================
# 5. TOOL 2 – Generate Images from Script
# ============================================================

def generate_images_tool(_ignored_input: str):
    """
    TOOL 2 – Reads the saved script and generates panel images.
    
    - Reads SCRIPT_FILE
    - For each scene:
        - Build a Stable Diffusion prompt from description + mood
        - Generate an image (if model is loaded)
        - Attach image path back to the scene
    - Saves merged result to STATE_FILE
    
    Input : (unused, required by tool interface)
    Output: Status string ("IMAGES_GENERATED" or error message)
    """
    try:
        if not os.path.exists(SCRIPT_FILE):
            return "Error: No script file."
        
        # Load script (could be dict or nested JSON string)
        with open(SCRIPT_FILE, 'r') as f:
            data = json.load(f)

        if isinstance(data, str):
            data = json.loads(data)
            
        scenes = data.get('scenes', [])
        updated_scenes = []
        
        print(f"üé® Generating {len(scenes)} Panels...")
        
        for i, scene in enumerate(scenes):
            # Ensure scene is always a dict for safety
            if isinstance(scene, str):
                scene = json.loads(scene)
                
            desc = scene.get('description', '')
            mood = scene.get('mood', '')
            img_path = os.path.join(IMG_DIR, f"scene_{i+1}.png")
            
            if pipe:
                # Positive prompt: emphasize manga style, clean art
                prompt = (
                    "masterpiece, best quality, manga style, monochrome, greyscale, lineart, "
                    f"{desc}, {mood}, intense action, highly detailed, 4k"
                )
                
                # Negative prompt: avoid unwanted artifacts
                negative = (
                    "color, 3d, realistic, blurry, messy, sketch, text, watermark, "
                    "bad anatomy, deformed"
                )
                
                # Use autocast for faster, memory-efficient generation
                with torch.autocast("cuda"):
                    image = pipe(
                        prompt,
                        negative_prompt=negative,
                        num_inference_steps=25,  # speed/quality trade-off
                        width=512,
                        height=768,
                        guidance_scale=8.0
                    ).images[0]
                
                image.save(img_path)
                scene['image_path'] = img_path
            else:
                # Fallback if model failed to load; keeps pipeline alive for testing
                scene['image_path'] = "dummy.png"
            
            updated_scenes.append(scene)
        
        # Save combined state (script + image paths)
        with open(STATE_FILE, 'w') as f:
            json.dump({"scenes": updated_scenes}, f, indent=2)
            
        return "IMAGES_GENERATED"
    except Exception as e:
        return f"Gen Error: {e}"


# ============================================================
# 6. TOOL 3 – Create Manga-Style PDF from Images + Script
# ============================================================

def create_pdf_tool(_ignored_input: str):
    """
    TOOL 3 – Uses STATE_FILE (scenes + image paths) to create a manga PDF.
    
    Layout:
    - A4 portrait
    - 2 scenes (panels) per page: top and bottom
    - Each panel: image + narration box + dialogue box
    
    Input : (unused, required by tool interface)
    Output: Status string with PDF path, or error message
    """
    try:
        if not os.path.exists(STATE_FILE):
            return "Error: No state file."
        
        # ---- 1. Load and robustly parse state JSON ----
        with open(STATE_FILE, 'r') as f:
            raw_data = f.read().strip()
        
        data = None
        data_str = raw_data
        
        # Try multiple times in case of nested JSON strings
        for _ in range(5):
            try:
                data = json.loads(data_str)
                if not isinstance(data, str):
                    break  # parsed into final structure (dict)
                data_str = data
            except json.JSONDecodeError:
                break
        
        scenes = data.get('scenes', []) if isinstance(data, dict) else []
        
        # ---- 2. Setup PDF (A4) ----
        pdf = FPDF(orientation='P', unit='mm', format='A4')
        pdf.set_auto_page_break(auto=False)  # manual layout
        
        # Helper: draw narration/dialog boxes on top of the panel image
        def draw_text_box(pdf, text, x, y, w, h, is_dialogue=False):
            if not text:
                return
            
            text = sanitize_text(text)
            
            # Dialogue: white bubble, Narration: light grey box
            if is_dialogue:
                pdf.set_fill_color(255, 255, 255)    # white
            else:
                pdf.set_fill_color(240, 240, 240)    # light grey
                
            pdf.set_draw_color(0, 0, 0)   # black border
            pdf.set_line_width(0.3)
            
            # Draw the filled rectangle (box)
            pdf.rect(x, y, w, h, style='FD')
            
            # Text properties
            pdf.set_xy(x + 2, y + 2)
            pdf.set_text_color(0, 0, 0)
            font_size = 10 if is_dialogue else 9
            font_style = 'B' if is_dialogue else 'I'
            pdf.set_font("Arial", font_style, font_size)
            
            # Wrap text inside the box
            pdf.multi_cell(w - 4, 5, text, align='C' if is_dialogue else 'L')

        # ---- 3. Layout each scene as a panel ----
        for i, scene in enumerate(scenes):
            if not isinstance(scene, dict):
                continue

            # New page for every 2 scenes
            if i % 2 == 0:
                pdf.add_page()
                
                # Optional title only on the first page
                if i == 0:
                    pdf.set_font("Arial", "B", 16)
                    pdf.cell(0, 10, "AI Generated Manga", ln=True, align='C')

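            # Top or bottom half of the page. The first page's top panel starts
            # lower (y=20) to leave room for the title, so the bottom slot sits at
            # y=152 to clear the 130 mm panel above it (20 + 130 = 150).
            base_y = 20 if (i == 0) else 10
            y = base_y if (i % 2 == 0) else 152  # 2 panels per page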
            
            img_path = scene.get('image_path', '')
            x_img = 10
            w_img = 190
            h_img = 130
            
            # Draw panel image with a border
            if os.path.exists(img_path):
                pdf.image(img_path, x=x_img, y=y, w=w_img, h=h_img)
                pdf.set_draw_color(0, 0, 0)
                pdf.set_line_width(1.0)
                pdf.rect(x_img, y, w_img, h_img)

            # Short narration box (top-left)
            narrative = scene.get('description', '')[:100]
            if narrative:
                draw_text_box(
                    pdf, narrative,
                    x=x_img + 2, y=y + 2,
                    w=80, h=20,
                    is_dialogue=False
                )
            
            # Dialogue: can be list of lines or a single string
            dialogue = scene.get('dialogue')
            dialogue_text = ""
            
            if isinstance(dialogue, list):
                for d in dialogue:
                    line = d.get('line', '') if isinstance(d, dict) else str(d)
                    dialogue_text += line + " "
            elif isinstance(dialogue, str):
                dialogue_text = dialogue
            
            # Dialogue bubble (bottom-right)
            if len(dialogue_text) > 0:
                # Adjust height based on approximate length
                box_h = 20 + (len(dialogue_text) // 30) * 5
                draw_text_box(
                    pdf, dialogue_text,
                    x=x_img + 110,
                    y=y + h_img - box_h - 5,
                    w=75, h=box_h,
                    is_dialogue=True
                )
        
        # ---- 4. Save final PDF ----
        out_path = "/kaggle/working/Final_Manga_Comic.pdf"
        pdf.output(out_path)
        
        return f"PDF_CREATED: {out_path}"
        
    except Exception as e:
        return f"PDF Error: {e}"


2025-11-27 11:04:44.131711: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1764241484.353611      47 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1764241484.417573      47 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered


AttributeError: 'MessageFactory' object has no attribute 'GetPrototype'


🔄 Loading Anime Model...



✅ Model Loaded & Safety Checker Disabled (Action Scenes Allowed).
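
`save_script_tool` in the cell above removes Markdown fences with plain `str.replace`, which works when the model returns nothing but a fenced JSON block. A slightly more tolerant variant is sketched below, assuming the reply may also contain text around the fence. The helper name `extract_json` and the regular expression are illustrative and not part of the notebook.

```python
import json
import re

# Matches a ``` or ```json fence and captures what lies between the markers.
FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)\s*```", re.DOTALL)


def extract_json(text: str):
    """Parse JSON from an LLM reply, with or without a Markdown fence."""
    match = FENCE_RE.search(text)
    # Fall back to the whole reply when no fence is present.
    payload = match.group(1) if match else text.strip()
    return json.loads(payload)


print(extract_json('```json\n{"scenes": []}\n```'))
print(extract_json('Here you go:\n```\n[1, 2]\n```\nDone.'))
```

Like the original tool, this still raises `json.JSONDecodeError` on malformed JSON, so the caller's `try`/`except` remains necessary.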


In [4]:
!pip install -U google-genai google-adk


Collecting google-genai
  Downloading google_genai-1.52.0-py3-none-any.whl.metadata (46 kB)
Collecting google-adk
  Downloading google_adk-1.19.0-py3-none-any.whl.metadata (13 kB)
Collecting google-cloud-bigquery-storage>=2.0.0 (from google-adk)
  Downloading google_cloud_bigquery_storage-2.34.0-py3-none-any.whl.metadata (10 kB)
Collecting cachetools<6.0,>=2.0.0 (from google-auth<3.0.0,>=2.14.1->google-genai)
  Downloading cachetools-5.5.2-py3-none-any.whl.metadata (5.4 kB)
Collecting protobuf!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<7.0.0,>=3.20.2 (from google-cloud-aiplatform<2.0.0,>=1.125.0->google-cloud-aiplatform[agent-engines]<2.0.0,>=1.125.0->google-adk)
  Downloading protobuf-5.29.5-cp38-abi3-manylinux2014_x86_64.whl.metadata (592 bytes)
Downloading google_genai-1.52.0

## 🧬 Detailed Architecture & Data Flow

This notebook follows a **Sequential Agent Architecture**:

1. **Input Layer**
   - A single user text prompt:
     - Example: *"Create a 5-page manga about a shy girl who becomes a hero."*

2. **Agent 1 – Character Agent**
   - Uses Gemini to define the main protagonist:
     - Name and type (person, animal, fruit, or object).
     - Appearance keywords and a brief personality archetype.
   - Output format: **JSON**, stored under the `char_data` key.

3. **Agent 2 – Story Agent**
   - Takes user prompt + character list.
   - Produces a **page-wise story outline**:
     - Page 1 → Introduction
     - Page 2–4 → Conflict & development
     - Page 5 → Resolution

4. **Agent 3 – Script Agent**
   - Converts the outline into a **panel-wise script**:
     - For each panel:
       - Scene description
       - Characters present
       - Dialogues

5. **Agent 4 – Illustration Agent**
   - For each panel description:
     - Builds a Stable Diffusion prompt.
     - Calls the diffusion pipeline to generate an image.
   - Saves images as `scene_1.png`, `scene_2.png`, etc. in `/kaggle/working/images/`.

6. **Agent 5 – Publisher Agent**
   - Uses FPDF to:
     - Create a PDF.
     - Place panel images and dialogues on each page.
   - Final output: `Final_Manga_Comic.pdf`.

---

## 🔁 Control Flow

1. Initialize all agents.
2. Pass the user prompt to **Character Agent**.
3. Pass characters → **Story Agent**.
4. Pass story → **Script Agent**.
5. Pass script → **Illustration Agent**.
6. Pass script + images → **Publisher Agent**.
7. Show final path of the PDF in the notebook output.
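
The PDF tool in this notebook places two scenes on each page, so the 10 scenes requested from the story agent fill 5 pages. The arithmetic behind that mapping can be sketched as follows; the helper names `page_count` and `panel_position` are illustrative and do not appear in the notebook.

```python
def page_count(n_scenes: int, panels_per_page: int = 2) -> int:
    """Pages needed for n_scenes, rounding up when the last page is half full."""
    return -(-n_scenes // panels_per_page)  # ceiling division


def panel_position(scene_index: int, panels_per_page: int = 2):
    """Map a 0-based scene index to (1-based page number, 0-based slot on that page)."""
    page = scene_index // panels_per_page + 1
    slot = scene_index % panels_per_page  # 0 = top panel, 1 = bottom panel
    return page, slot


print(page_count(10))      # 10 scenes at 2 per page
print(panel_position(0))   # first scene: page 1, top slot
print(panel_position(9))   # last scene: page 5, bottom slot
```

An odd scene count still works: the final page simply holds a single top panel.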


In [5]:
from google.adk.agents import LlmAgent, SequentialAgent
from google.adk.models.google_llm import Gemini
from google.adk.runners import InMemoryRunner
from google.adk.tools import AgentTool, FunctionTool, google_search
from google.genai import types

In [6]:
retry_config=types.HttpRetryOptions(
    attempts=5,  
    exp_base=7, 
    initial_delay=1,
    http_status_codes=[429, 500, 503, 504], 
)
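
To get a feel for what `attempts=5`, `exp_base=7`, and `initial_delay=1` imply, the sketch below lists the waits between attempts. It assumes a plain exponential schedule, `initial_delay * exp_base ** n`, and that `attempts` counts the first try. The real client may add jitter or cap the delay, so treat these numbers as an upper-level estimate rather than the library's exact behaviour; `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(attempts: int, initial_delay: float, exp_base: float):
    """Waits (in seconds) between attempts under a plain exponential schedule.

    Assumption: `attempts` includes the first try, so there are attempts - 1 waits.
    """
    return [initial_delay * exp_base ** n for n in range(attempts - 1)]


delays = backoff_delays(attempts=5, initial_delay=1, exp_base=7)
print(delays)
```

Under this assumption the final wait alone is several minutes, so a smaller `exp_base` may be worth considering if the notebook should fail fast.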

In [7]:
config_model = Gemini(
    model="gemini-2.5-flash-lite", 
    retry_options=retry_config
)

In [8]:
# 1. CHARACTER AGENT

char_agent = LlmAgent(
    name ="character_agent",
    model = config_model,
    instruction="""
        Role: Lead Character Designer.
        Task: Define the main protagonist for a Manga.
        Output Requirement: strictly valid JSON with keys:
        - "name": Character name.
        - "type": EXACTLY one of: ["Person", "Animal", "Fruit", "Object"].
        - "appearance": A dense, comma-separated string of visual keywords matching the 'type'.
          (e.g., "cybernetic wolf, metallic fur, glowing red eyes, forest background").
        - "personality": Brief archetype.
        
        NO CHAT. ONLY JSON.
        """,
    output_key = "char_data"
    
)

In [9]:
# 2. STORY AGENT

story_agent = LlmAgent(
    name="story_agent",
    model=config_model,
    instruction="""
        Role: Manga Director.
        Input: {char_data}
        Task: Create a detailed **10-panel story sequence** (approx 5 pages).
        
        Structure the pacing carefully:
        - Scenes 1-2: Introduction (Establish the setting and the character).
        - Scenes 3-4: The Conflict (An enemy appears or a problem starts).
        - Scenes 5-8: The Action/Climax (Dynamic battles, chases, or intensity).
        - Scenes 9-10: Resolution (The aftermath and cool ending pose).

        Output Requirement: strictly valid JSON with key "scenes", containing a list of 10 objects. Each object must have:
        - "scene_index": The number (1-10).
        - "description": Detailed visual instructions for the artist. **Focus on ACTION and CAMERA ANGLES** (e.g., "Low angle shot," "Close up on eyes," "Wide shot of explosion").
        - "mood": The atmosphere (e.g., "Tense", "Victory", "Dark").
        
        **CRITICAL SAFETY**: Describe action with 'energy', 'impact', 'motion blur'. Do NOT use 'blood' or 'gore'.
        NO CHAT. ONLY JSON.
        """,
    output_key = "story_data"
)

In [10]:
# 3. SCRIPT AGENT (Uses Tool)

script_agent = LlmAgent(
    name="dialogue_agent",
    model=config_model,
    instruction="""
        Role: Manga Script Writer.
        Input: {story_data}
        Task: 
        1. Review the full 10-scene sequence to understand the flow.
        2. Add 'dialogue' to EACH scene. 
           - Early scenes: Set the mystery.
           - Middle scenes: Short, punchy shouts or sound effects (SFX).
           - Final scenes: A cool one-liner.
        3. CALL `save_script_tool` with the full JSON.
        4. Output the tool result.
        
        Constraint: Keep dialogue under 20 words per bubble.
        NO CHAT.
        """,
    output_key = "save_status",
    tools=[save_script_tool] 
)

In [11]:
# 4. ILLUSTRATOR AGENT (Uses Tool)

illustrator_agent = LlmAgent(
    name="illustrator_agent",
    model=config_model,
    instruction="""
        Role: Render Engine Trigger.
        Task: The script is ready. Initiate the visual pipeline.
        Action: CALL `generate_images_tool` with argument 'start'.
        Do not explain. JUST CALL THE TOOL.
        """,
    output_key = "illus_status",
    tools=[generate_images_tool] 
)

In [12]:
# 5. PUBLISHER AGENT (Uses Tool)

publisher_agent = LlmAgent(
    name="publisher_agent",
    model=config_model,
    instruction="""
        Role: Publisher.
        Task: CALL `create_pdf_tool` with argument 'start'.
        Output the filename.
        NO CHAT.
        """,
    description="Compiles the PDF.",
    tools=[create_pdf_tool] 
)

In [13]:
# ROOT AGENT (Sequential Pattern)

root_agent = SequentialAgent(
    name="root",
    sub_agents=[
        char_agent,   
        story_agent,      
        script_agent,     
        illustrator_agent,  
        publisher_agent     
    ],
    
    description="Robust Manga Pipeline"
)

In [14]:
from google.adk.runners import InMemoryRunner 

runner = InMemoryRunner(agent=root_agent)
response = await runner.run_debug(
    "A boy going to forest with his grandpa and cousin , they stay nigth there and suddenly a meteor falling on earth near the place they staying and boy run to say it and goes close to say that a watch cameout form the box  and pasted on his hand and suddenly went to alien transformation"
)



 ### Created new session: debug_session_id

User > A boy going to forest with his grandpa and cousin , they stay nigth there and suddenly a meteor falling on earth near the place they staying and boy run to say it and goes close to say that a watch cameout form the box  and pasted on his hand and suddenly went to alien transformation
character_agent > ```json
{
  "name": "Kaito",
  "type": "Person",
  "appearance": "young boy, messy black hair, determined expression, tattered simple clothes, forest background, alien watch on wrist",
  "personality": "curious and brave"
}
```
story_agent > ```json
{
  "scenes": [
    {
      "scene_index": 1,
      "description": "WIDE SHOT of a dense, sun-dappled forest. Kaito, a young boy with messy black hair and wearing tattered clothes, walks ahead on a dirt path, a determined expression on his face. His grandfather and cousin are further back, out of focus. The atmosphere is peaceful and adventurous.",
      "mood": "Adventurous"
    },
    {
   

Token indices sequence length is longer than the specified maximum sequence length for this model (89 > 77). Running this sequence through the model will result in indexing errors
The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['adventurous., adventurous, intense action, highly detailed, 4 k']


🎨 Generating 10 Panels...


The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['highly detailed, 4 k']

The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['4 k']

The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['action, highly detailed, 4 k']

The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['forest around him is subtly altered, perhaps with glowing flora., powerful, intense action, highly detailed, 4 k']

The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens: ['intense action, highly detailed, 4 k']

  pdf.set_font("Arial", "B", 16)
  pdf.cell(0, 10, "AI Generated Manga", ln=True, align='C')
  pdf.set_font("Arial", font_style, font_size)


publisher_agent > The PDF has been created and can be found at the following location: /kaggle/working/Final_Manga_Comic.pdf
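
The CLIP warnings in the output above show that long scene descriptions push the end of the prompt past the 77-token limit, so trailing keywords are dropped. One possible mitigation, sketched below, is to put the style keywords first and cap the description length. Word count is used here only as a rough stand-in for token count, and `build_prompt`, `STYLE_PREFIX`, and the `max_words=35` budget are illustrative assumptions; an exact check would count tokens with the pipeline's own tokenizer.

```python
STYLE_PREFIX = "masterpiece, best quality, manga style, monochrome, greyscale, lineart"


def build_prompt(description: str, mood: str, max_words: int = 35) -> str:
    """Build a prompt with style keywords first and a length-capped description."""
    words = description.split()[:max_words]       # drop words beyond the budget
    short_desc = " ".join(words).rstrip(",.")     # avoid a dangling "., " join
    parts = [STYLE_PREFIX, short_desc, mood]
    return ", ".join(p for p in parts if p)       # skip empty pieces


long_desc = " ".join(f"w{i}" for i in range(100))
prompt = build_prompt(long_desc, "Tense")
print(prompt)
```

Because the style keywords lead the prompt, any truncation that still happens cuts scene detail rather than the manga styling.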
