# 🎨 MangaGen - AI Manga Generation Pipeline

Generate complete manga pages with consistent characters from text prompts!

**Features:**
- 📝 Story → Scene JSON (Gemini 2.0 Flash)
- 🎨 SDXL + IP-Adapter for consistent characters
- 💬 Smart dialogue bubble placement
- 📄 PDF output with zip download

---

## 📋 Prerequisites

Before running, add these secrets in **Kaggle Settings → Add-ons → Secrets**:
1. `GEMINI_API_KEY` - Get from https://aistudio.google.com/app/apikey
2. `HF_TOKEN` (optional) - HuggingFace token for model downloads

---

## 🔧 Cell 1: Clone Repository & Install Dependencies

In [None]:
%%bash
# Clone the repository
cd /kaggle/working
if [ -d "manga-gen" ]; then
    echo "Repository already exists, pulling latest..."
    cd manga-gen && git pull
else
    echo "Cloning repository..."
    git clone --branch mvp/kaggle-flux https://github.com/Barun-2005/manga-gen-ai-pipeline.git manga-gen
    cd manga-gen
fi
echo ""
echo "Latest commit:"
git log -1 --oneline

In [None]:
# Install dependencies using our Kaggle-specific script
import os
os.chdir('/kaggle/working/manga-gen')
!bash install_kaggle_deps.sh

In [None]:
# CRITICAL: Verify diffusers is installed!
print("🔍 Verifying critical packages...")
try:
    import diffusers
    print(f"✅ diffusers: {diffusers.__version__}")
except ImportError:
    print("❌ diffusers NOT FOUND - installing now...")
    !pip install diffusers==0.27.2 transformers==4.40.2 accelerate==0.29.3
    import diffusers
    print(f"✅ diffusers: {diffusers.__version__}")

import torch
print(f"✅ PyTorch: {torch.__version__}")
print(f"✅ CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"✅ GPU: {torch.cuda.get_device_name(0)}")
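
The check above only confirms that `diffusers` imports. A minimal sketch of a stricter check follows, comparing installed versions against the pins from the fallback `pip install` line without importing the packages. It assumes those pins are the versions the pipeline expects, which `install_kaggle_deps.sh` may or may not enforce; `check_versions` is a hypothetical helper, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the fallback `pip install` line in the cell above.
EXPECTED = {
    "diffusers": "0.27.2",
    "transformers": "4.40.2",
    "accelerate": "0.29.3",
}


def check_versions(expected):
    """Return {package: (installed_or_None, wanted)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (installed, wanted)
    return mismatches


for name, (installed, wanted) in check_versions(EXPECTED).items():
    print(f"{name}: installed={installed}, expected={wanted}")
```

An empty result means every pinned package is present at the expected version.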

## 🔑 Cell 2: Set Up API Keys

In [None]:
import os
import sys

# Add manga-gen to path
sys.path.insert(0, '/kaggle/working/manga-gen')
os.chdir('/kaggle/working/manga-gen')

# Load API keys from Kaggle Secrets
try:
    from kaggle_secrets import UserSecretsClient
    secrets = UserSecretsClient()
    
    # Required: Gemini API Key
    os.environ['GEMINI_API_KEY'] = secrets.get_secret('GEMINI_API_KEY')
    print("✅ GEMINI_API_KEY loaded from Kaggle secrets")

    # Optional: HuggingFace Token
    try:
        os.environ['HF_TOKEN'] = secrets.get_secret('HF_TOKEN')
        print("✅ HF_TOKEN loaded")
    except Exception:
        print("ℹ️ HF_TOKEN not set (optional)")

except Exception as e:
    print(f"⚠️ Could not load secrets: {e}")
    print("")
    print("🔧 To add secrets:")
    print("   1. Click 'Add-ons' menu at top")
    print("   2. Select 'Secrets'")
    print("   3. Add 'GEMINI_API_KEY' with your API key")
    print("")
    print("Get your Gemini API key at: https://aistudio.google.com/app/apikey")
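
Because the cell above only prints a warning when secrets fail to load, later cells can still run without a key. A small sketch of a fail-fast check is shown here; `missing_env_vars` is a hypothetical helper, and treating an empty string as "missing" is an assumption.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


missing = missing_env_vars(["GEMINI_API_KEY"])
if missing:
    print(f"Missing required secrets: {', '.join(missing)}")
else:
    print("All required secrets are set.")
```

`HF_TOKEN` is left out of the required list because the prerequisites mark it as optional.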

## ⚙️ Cell 3: Configuration

Customize your manga generation settings here!

In [None]:
# ============================================
# 🎨 MANGA CONFIGURATION - EDIT THIS!
# ============================================

# Your story prompt - describe your manga scene
STORY_PROMPT = """
Astra, a determined space scavenger with messy silver hair and a grease-stained orange jumpsuit, 
explores a derelict spaceship. She finds a glowing blue artifact in the cockpit.
""".strip()

# Visual style: "bw_manga" (black & white) or "color_anime" (colorful)
STYLE = "bw_manga"

# Panel layout: "2x2" (4 panels), "vertical_webtoon" (3 panels), "3_panel", "single"
LAYOUT = "2x2"

# Generation quality (higher = better but slower)
INFERENCE_STEPS = 25  # 20-30 for testing, 40-50 for quality
GUIDANCE_SCALE = 7.5  # 6-9 recommended

# ============================================

print("📋 Configuration:")
print(f"   Story: {STORY_PROMPT[:80]}...")
print(f"   Style: {STYLE}")
print(f"   Layout: {LAYOUT}")
print(f"   Steps: {INFERENCE_STEPS}")
print(f"   Guidance: {GUIDANCE_SCALE}")
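
A typo in `STYLE` or `LAYOUT` would otherwise only surface once the scripts run. The sketch below validates the settings against the options and ranges listed in the comments of the configuration cell. `validate_config` is a hypothetical helper, and treating 20-50 steps and 6-9 guidance as bounds is an assumption drawn from those comments, not a limit enforced by the scripts.

```python
# Allowed values copied from the comments in the configuration cell.
ALLOWED_STYLES = {"bw_manga", "color_anime"}
ALLOWED_LAYOUTS = {"2x2", "vertical_webtoon", "3_panel", "single"}


def validate_config(style, layout, steps, guidance):
    """Return a list of readable problems; an empty list means no issues found."""
    problems = []
    if style not in ALLOWED_STYLES:
        problems.append(f"Unknown style {style!r}; expected one of {sorted(ALLOWED_STYLES)}")
    if layout not in ALLOWED_LAYOUTS:
        problems.append(f"Unknown layout {layout!r}; expected one of {sorted(ALLOWED_LAYOUTS)}")
    if not 20 <= steps <= 50:
        problems.append(f"steps={steps} is outside the suggested 20-50 range")
    if not 6 <= guidance <= 9:
        problems.append(f"guidance={guidance} is outside the recommended 6-9 range")
    return problems


# Default values from the configuration cell.
for problem in validate_config("bw_manga", "2x2", 25, 7.5):
    print(problem)
```

In the notebook, the call would take `STYLE`, `LAYOUT`, `INFERENCE_STEPS`, and `GUIDANCE_SCALE` instead of the literals.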

## 📝 Cell 4: Generate Scene Plan (Gemini)

In [None]:
import json

# Generate scene plan using Gemini
!python scripts/generate_scene_json.py "{STORY_PROMPT}" --style {STYLE} --layout {LAYOUT} --output scene_plan.json

# Display the generated scene plan
print("\n" + "="*50)
print("📋 Generated Scene Plan")
print("="*50)

if os.path.exists('scene_plan.json'):
    with open('scene_plan.json', 'r') as f:
        scene_plan = json.load(f)
    
    print(f"\nTitle: {scene_plan.get('title', 'Untitled')}")
    print(f"Style: {scene_plan.get('style', 'unknown')}")
    print(f"Layout: {scene_plan.get('layout', 'unknown')}")
    
    print(f"\n📚 Characters ({len(scene_plan.get('characters', []))})")
    for char in scene_plan.get('characters', []):
        print(f"   • {char['name']}: {char['hair_color']} hair")

    print(f"\n🖼️ Panels ({len(scene_plan.get('panels', []))})")
    for panel in scene_plan.get('panels', []):
        print(f"   {panel['panel_number']}. [{panel['camera_angle']}] {panel['description'][:50]}...")
else:
    print("❌ Scene plan generation failed. Check the error above.")
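
The display code above indexes fields such as `char['hair_color']` directly, so a plan that omits one raises `KeyError`. A minimal sketch of a pre-check is given below. The field lists are inferred only from what that cell reads, so the real schema produced by `generate_scene_json.py` may contain more; `find_missing_fields` and the example plan are hypothetical.

```python
# Fields read by the display cell above; the real schema may contain more.
CHARACTER_FIELDS = ("name", "hair_color")
PANEL_FIELDS = ("panel_number", "camera_angle", "description")


def find_missing_fields(scene_plan):
    """Return strings such as 'panels[0].camera_angle' for absent fields."""
    missing = []
    for section, fields in (("characters", CHARACTER_FIELDS), ("panels", PANEL_FIELDS)):
        for index, entry in enumerate(scene_plan.get(section, [])):
            for field in fields:
                if field not in entry:
                    missing.append(f"{section}[{index}].{field}")
    return missing


# Hypothetical plan with one panel that lacks a camera angle.
example_plan = {
    "characters": [{"name": "Astra", "hair_color": "silver"}],
    "panels": [{"panel_number": 1, "description": "Astra enters the cockpit."}],
}
print(find_missing_fields(example_plan))  # ['panels[0].camera_angle']
```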

## 🖼️ Cell 5: Generate Panel Images (GPU)

This is the main image generation step. Uses:
- SDXL for high-quality anime/manga style
- IP-Adapter for character consistency (if installed)

**Time estimate:** ~3-8 minutes for 4 panels on Kaggle T4 GPU
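
To get a rough figure for other layouts, the quoted range can be scaled by panel count. This sketch assumes generation time grows linearly with the number of panels, which ignores fixed costs such as model loading; `estimate_minutes` is a hypothetical helper.

```python
# Quoted range: roughly 3-8 minutes for 4 panels on a Kaggle T4.
BASELINE_PANELS = 4
BASELINE_MINUTES = (3.0, 8.0)


def estimate_minutes(panel_count):
    """Scale the quoted range linearly with the panel count (an assumption)."""
    low, high = BASELINE_MINUTES
    scale = panel_count / BASELINE_PANELS
    return low * scale, high * scale


for panels in (1, 3, 4):
    low, high = estimate_minutes(panels)
    print(f"{panels} panel(s): ~{low:.1f}-{high:.1f} min")
```

Since loading SDXL is a one-time cost per run, small layouts will likely take longer than this linear estimate suggests.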

In [None]:
import torch
import time

# Check GPU availability
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    gpu_mem = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"✅ GPU Available: {gpu_name} ({gpu_mem:.1f} GB)")
    USE_MOCK = False
else:
    print("⚠️ No GPU detected - using mock mode")
    USE_MOCK = True

# Create output directory
os.makedirs('outputs', exist_ok=True)

# Build command
cmd = f"python scripts/generate_panels.py --scene scene_plan.json --output outputs/ --steps {INFERENCE_STEPS} --guidance {GUIDANCE_SCALE}"
if USE_MOCK:
    cmd += " --mock"

print(f"\n🎨 Running: {cmd}")
print("\n" + "="*50)
start_time = time.time()

!{cmd}

elapsed = time.time() - start_time
print(f"\n⏱️ Generation time: {elapsed/60:.1f} minutes")

# If it was very fast and not mock mode, something might be wrong
if elapsed < 30 and not USE_MOCK:
    print("\n⚠️ Generation was very fast - might have fallen back to mock mode!")
    print("   Check if diffusers is installed: !pip show diffusers")
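
The cell above builds the command as one string and infers failure from elapsed time. As an alternative sketch, the same flags can be assembled as an argument list, which avoids shell quoting problems and lets `subprocess.run(..., check=True)` raise if the script exits with an error. `build_panel_command` is a hypothetical helper; the flags are the ones used in the cell above.

```python
import shlex


def build_panel_command(scene, output, steps, guidance, mock=False):
    """Build the generate_panels.py call as an argument list."""
    cmd = [
        "python", "scripts/generate_panels.py",
        "--scene", scene,
        "--output", output,
        "--steps", str(steps),
        "--guidance", str(guidance),
    ]
    if mock:
        cmd.append("--mock")
    return cmd


cmd = build_panel_command("scene_plan.json", "outputs/", 25, 7.5)
print(shlex.join(cmd))  # shell-safe rendering, for logging only
# To execute it from the notebook: subprocess.run(cmd, check=True)
```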

In [None]:
# Display generated panels
from IPython.display import display, Image as IPImage
import glob

print("\n🖼️ Generated Panels:")
print("="*50)

panels = sorted(glob.glob('outputs/panel_*.png'))
if panels:
    for panel in panels:
        if 'with_bubbles' not in panel:
            print(f"\n{os.path.basename(panel)}")
            display(IPImage(filename=panel, width=400))
else:
    print("❌ No panels found! Check errors above.")

# Show character references if they exist
refs = glob.glob('outputs/character_refs/*.png')
if refs:
    print("\n📸 Character References:")
    for ref in refs:
        print(f"\n{os.path.basename(ref)}")
        display(IPImage(filename=ref, width=200))
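
The plain `sorted()` call above orders filenames lexicographically, so a hypothetical `panel_10.png` would appear before `panel_2.png`. This does not matter for the four-panel layouts here, but a numeric sort key is a cheap safeguard. The sketch assumes panels are named `panel_<number>.png`, which the source only implies through the `panel_*.png` glob.

```python
import re


def panel_sort_key(path):
    """Sort key that compares digit runs in the filename numerically."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", path)]


# Hypothetical filenames, deliberately out of order.
names = ["outputs/panel_10.png", "outputs/panel_2.png", "outputs/panel_1.png"]
print(sorted(names, key=panel_sort_key))
```

In the notebook this would be `sorted(glob.glob('outputs/panel_*.png'), key=panel_sort_key)`.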

## 💬 Cell 6: Place Dialogue Bubbles

In [None]:
# Calculate bubble positions
!python scripts/place_bubbles.py --panels outputs/ --scene scene_plan.json --output bubbles.json

# Display bubble data
if os.path.exists('bubbles.json'):
    with open('bubbles.json', 'r') as f:
        bubbles = json.load(f)
    
    print("\n💬 Bubble Placements:")
    for panel_key, panel_bubbles in bubbles.items():
        print(f"   {panel_key}: {len(panel_bubbles)} bubble(s)")
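
When a page comes out with missing dialogue, it helps to know which panels received no bubbles. The sketch below summarizes a mapping in the panel-key to list shape that the loop above iterates over. The example data is hypothetical and the bubble entries are left empty, since their fields are not shown in this notebook.

```python
def summarize_bubbles(bubbles):
    """Return (total bubble count, panel keys that received no bubbles)."""
    total = sum(len(items) for items in bubbles.values())
    empty = [key for key, items in bubbles.items() if not items]
    return total, empty


# Hypothetical data; entry contents are elided because only counts matter here.
example = {"panel_1": [{}, {}], "panel_2": []}
total, empty = summarize_bubbles(example)
print(f"{total} bubble(s) placed; panels without bubbles: {empty}")
```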

## 📄 Cell 7: Compose Final Page & PDF

In [None]:
# Compose final page with bubbles
!python scripts/compose_page.py --panels outputs/ --bubbles bubbles.json --scene scene_plan.json --output outputs/

# Display final page
if os.path.exists('outputs/manga_page.png'):
    print("\n🎨 Final Manga Page:")
    display(IPImage(filename='outputs/manga_page.png', width=600))

    # Show file info
    print("\n📁 Output Files:")
    for f in ['outputs/manga_page.png', 'outputs/manga_page.pdf', 'manga_output.zip']:
        if os.path.exists(f):
            size_mb = os.path.getsize(f) / 1024 / 1024
            print(f"   ✅ {f} ({size_mb:.2f} MB)")

## 📦 Cell 8: Download Your Manga!

Use the Kaggle file browser (left sidebar) to download `manga_output.zip`

In [None]:
import os
from IPython.display import display, HTML, FileLink

print('\n' + '='*50)
print('🎉 YOUR MANGA IS READY!')
print('='*50)

zip_path = 'manga_output.zip'

if os.path.exists(zip_path):
    size_mb = os.path.getsize(zip_path) / 1024 / 1024
    print(f'\n📦 File: {zip_path} ({size_mb:.2f} MB)')

    print('\n📥 HOW TO DOWNLOAD:')
    print('   1. Look at the LEFT SIDEBAR (file browser)')
    print('   2. Navigate to: kaggle/working/manga-gen/')
    print('   3. Right-click on "manga_output.zip"')
    print('   4. Click "Download"')

    print('\n   Or try clicking the link below:')
    try:
        display(FileLink(zip_path))
    except Exception:
        print('   (Use file browser instead)')

    print('\n📋 Zip contains:')
    print('   • manga_page.pdf - Final manga with bubbles')
    print('   • manga_page.png - Full resolution page')
    print('   • panel_*.png - Individual panels')
    print('   • character_refs/ - Character reference images')
else:
    print('⚠️ Zip file not found!')
    print('\n   Check the compose_page.py output above for errors.')
    print('   You can download individual files from outputs/ folder.')
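
If `manga_output.zip` is missing but `outputs/` contains files, the folder can be bundled by hand so there is still a single download. This is a fallback sketch using the standard `zipfile` module, not what `compose_page.py` does internally; `zip_directory` and the `manga_output_manual.zip` name are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_directory(src_dir, zip_path):
    """Write every file under src_dir into zip_path; return the archived names."""
    src = Path(src_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(src).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names


# Only runs when an outputs/ folder exists next to the notebook.
if Path("outputs").is_dir():
    print(zip_directory("outputs", "manga_output_manual.zip"))
```

The archive is written outside `outputs/` so that it does not end up including itself.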

---

## 📊 Pipeline Summary

| Step | Component | Time |
|------|-----------|------|
| 1 | Clone & Install | ~1 min |
| 2 | API Keys | instant |
| 3 | Scene Plan (Gemini) | ~5 sec |
| 4 | Panel Generation (SDXL) | ~3-8 min |
| 5 | Bubble Placement | ~10 sec |
| 6 | PDF Composition | ~5 sec |

**Total time:** ~5-10 minutes
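
As a quick check of that total, the per-step times from the table can be summed directly. The only assumption in this sketch is that "instant" counts as 0 seconds.

```python
# (low, high) durations in seconds, taken from the table above.
STEP_SECONDS = {
    "Clone & Install": (60, 60),
    "API Keys": (0, 0),  # "instant" treated as 0 seconds
    "Scene Plan (Gemini)": (5, 5),
    "Panel Generation (SDXL)": (180, 480),
    "Bubble Placement": (10, 10),
    "PDF Composition": (5, 5),
}

low = sum(lo for lo, _ in STEP_SECONDS.values())
high = sum(hi for _, hi in STEP_SECONDS.values())
print(f"Total: {low / 60:.1f}-{high / 60:.1f} min")  # Total: 4.3-9.3 min
```

The sum lands slightly under the rounded 5-10 minute figure, and panel generation accounts for almost all of the spread.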

---

Made with ❤️ by Barun | [GitHub](https://github.com/Barun-2005/manga-gen-ai-pipeline)