# Corey Book - Character Consistent Image Generation

This notebook uses an SDXL Canny ControlNet, conditioned on your character reference images, to generate consistent characters for the children's book "The Chef at the Store".

**Key Features:**
- Uses your character reference images for consistency
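- Canny ControlNet carries over the pose and composition of the reference image
- A fixed character description in every prompt keeps Corey's appearance consistent (the code below does not load an IP-Adapter)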
- Free to run on Google Colab

**Cost: FREE** (uses Colab's free GPU)

## 1. Setup and Installation

In [None]:
# Simplified installation: xformers is optional and installed separately to avoid conflicts
print("📦 Installing required packages...")

# Install core dependencies
!pip install -q diffusers transformers accelerate
!pip install -q controlnet-aux opencv-python pillow

# Optional: Try to install xformers (but don't fail if it doesn't work)
print("\n🔧 Attempting to install xformers (optional)...")
!pip install -q xformers || echo "XFormers installation failed, continuing without it"

print("\n✅ Installation complete!")

In [None]:
import torch
import numpy as np
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt
import json
import os
from pathlib import Path
import requests
import zipfile
from io import BytesIO
import base64

from diffusers import (
    StableDiffusionXLControlNetPipeline,
    ControlNetModel,
    StableDiffusionXLImg2ImgPipeline,
    AutoencoderKL
)
from transformers import CLIPVisionModelWithProjection
from diffusers.utils import load_image
from controlnet_aux import CannyDetector, OpenposeDetector

# Check GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")
if device == "cuda":
    print(f"GPU: {torch.cuda.get_device_name()}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory // 1024**3} GB")

## 2. Upload Your Project Files

You need to upload:
1. Your character reference images (cartoon-characters/ folder)
2. Your page prompts (page-prompts/ folder)

**Option A:** Upload manually using the file browser on the left

**Option B:** Upload a zip file with your project
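
Option B could be handled along these lines. This is a minimal sketch, not part of the original notebook: the helper name `extract_project_zip` and the archive name `project.zip` are assumptions, and the zip is assumed to contain `cartoon-characters/` and `page-prompts/` at its top level. It uses only the standard library and skips any entry that would land outside the destination folder.

```python
import zipfile
from pathlib import Path


def extract_project_zip(zip_path, dest="."):
    """Extract a project archive into dest and return the extracted file names.

    Hypothetical helper: assumes the archive holds cartoon-characters/ and
    page-prompts/ at its top level.
    """
    dest_root = Path(dest).resolve()
    extracted = []
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            target = (dest_root / member).resolve()
            # Skip entries that would escape the destination folder.
            if target != dest_root and dest_root not in target.parents:
                print(f"Skipping unsafe path: {member}")
                continue
            zf.extract(member, dest_root)
            if not member.endswith("/"):
                extracted.append(member)
    return sorted(extracted)
```

In Colab, the archive can first be uploaded through the file browser, after which `extract_project_zip("project.zip")` can be run before the file check below.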

In [None]:
# Create directories
os.makedirs('cartoon-characters', exist_ok=True)
os.makedirs('page-prompts', exist_ok=True)
os.makedirs('generated_images', exist_ok=True)

print("📁 Directories created. Please upload your files:")
print("1. cartoon-characters/corey1.jpg (main character reference)")
print("2. cartoon-characters/emily.jpg (wife reference)")
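print("3. cartoon-characters/store-cartoon.jpg (store reference)")
print("4. cartoon-characters/wentworth-family-foglio.jpg (family reference, used for family scenes)")
print("5. page-prompts/page-00-cover.md through page-03.md")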
print("\nUse the file browser on the left to upload these files.")

In [None]:
# Check uploaded files
def check_files():
    required_files = [
        'cartoon-characters/corey1.jpg',
        'page-prompts/page-00-cover.md',
        'page-prompts/page-01.md',
        'page-prompts/page-02.md',
        'page-prompts/page-03.md'
    ]
    
    missing = []
    for file in required_files:
        if os.path.exists(file):
            print(f"✅ {file}")
        else:
            print(f"❌ {file} - MISSING")
            missing.append(file)
    
    if missing:
        print(f"\n⚠️  Please upload the missing files before continuing.")
        return False
    else:
        print(f"\n🎉 All required files found!")
        return True

check_files()

## 3. Load Models

In [None]:
# Load ControlNet models
print("📦 Loading ControlNet models...")

# Canny ControlNet for edge detection
canny_controlnet = ControlNetModel.from_pretrained(
    "diffusers/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16
)

# Base SDXL pipeline with ControlNet
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=canny_controlnet,
    torch_dtype=torch.float16,
    use_safetensors=True
)

# Offload idle model components to the CPU to reduce GPU memory use
pipe.enable_model_cpu_offload()

# Try to enable xformers, but don't fail if it's not available
try:
    pipe.enable_xformers_memory_efficient_attention()
    print("✅ XFormers memory efficient attention enabled!")
except Exception:
    print("⚠️ XFormers not available, using standard attention (slightly slower but still works)")
    # Use alternative memory optimization
    pipe.enable_attention_slicing()

print("✅ Models loaded!")

In [None]:
# Load image processors
canny_detector = CannyDetector()

print("🔧 Image processors ready!")

## 4. Generation Functions

In [None]:
def load_page_prompt(file_path):
    """Load and parse a page prompt markdown file."""
    with open(file_path, 'r', encoding='utf-8') as f:
        content = f.read()
    
    lines = content.split('\n')
    # str.split always returns at least one element, so fall back when the first line is empty
    title = lines[0].replace('# ', '', 1).strip() or "Unknown Page"
    
    # Extract sections
    page_text = ""
    image_prompt = ""
    
    in_page_text = False
    in_image_prompt = False
    
    for line in lines:
        if line.startswith("## PAGE TEXT"):
            in_page_text = True
            in_image_prompt = False
            continue
        elif line.startswith("## IMAGE PROMPT"):
            in_page_text = False
            in_image_prompt = True
            continue
        elif line.startswith("## "):
            in_page_text = False
            in_image_prompt = False
            continue
        
        if in_page_text and line.strip():
            page_text += line.strip() + " "
        elif in_image_prompt and line.strip() and not line.startswith("**Art Style**"):
            image_prompt += line.strip() + " "
    
    return {
        'title': title.strip(),
        'page_text': page_text.strip(), 
        'image_prompt': image_prompt.strip(),
        'file_path': str(file_path)
    }

def create_controlnet_prompt(page_data):
    """Create optimized prompt for ControlNet generation."""
    prompt = "high quality children's book illustration, cartoon style, "
    
    # Add character consistency
    prompt += "COREY: completely bald chef, no hair, round face, navy apron, friendly smile. "
    
    # Add scene
    prompt += page_data['image_prompt']
    
    # Style keywords
    prompt += " vibrant colors, cel-shading, bold outlines, professional illustration"
    
    return prompt

def prepare_reference_image(ref_path):
    """Prepare reference image for ControlNet."""
    if not os.path.exists(ref_path):
        print(f"❌ Reference image not found: {ref_path}")
        return None
        
    # Load and resize reference image
    ref_image = Image.open(ref_path).convert('RGB')
    ref_image = ref_image.resize((1024, 1024))
    
    # Generate Canny edge map
    canny_image = canny_detector(ref_image)
    
    return ref_image, canny_image

print("🛠️ Helper functions loaded!")
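
The parser above expects each page file to start with a `# Title` line followed by `## PAGE TEXT` and `## IMAGE PROMPT` sections. The sketch below restates that logic in compact, standalone form and runs it on a made-up page. The sample text and the name `parse_page_prompt` are illustrative assumptions, not taken from the actual book files.

```python
# Hypothetical page file; the real prompts live in page-prompts/*.md.
SAMPLE_PAGE = """# Page 1 - Morning at the Store

## PAGE TEXT
Corey opens the store
before the sun comes up.

## IMAGE PROMPT
Corey unlocking the front door of a small store at dawn.
**Art Style**: flat colors

## NOTES
Ignored by the parser.
"""


def parse_page_prompt(content):
    """Standalone restatement of load_page_prompt's section parsing."""
    lines = content.split("\n")
    title = lines[0].replace("# ", "").strip()
    sections = {"PAGE TEXT": [], "IMAGE PROMPT": []}
    current = None
    for line in lines:
        if line.startswith("## "):
            # Any unrecognized "## " heading closes the current section.
            current = next((k for k in sections if line.startswith(f"## {k}")), None)
            continue
        if current is None or not line.strip():
            continue
        if current == "IMAGE PROMPT" and line.startswith("**Art Style**"):
            continue  # style lines are dropped, as in the original parser
        sections[current].append(line.strip())
    return {
        "title": title,
        "page_text": " ".join(sections["PAGE TEXT"]),
        "image_prompt": " ".join(sections["IMAGE PROMPT"]),
    }


parsed = parse_page_prompt(SAMPLE_PAGE)
print(parsed["title"])
print(parsed["page_text"])
print(parsed["image_prompt"])
```

Multi-line sections are joined into a single line, and anything under a heading other than the two recognized ones is ignored.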

In [None]:
def generate_consistent_image(page_data, reference_image_path="cartoon-characters/corey1.jpg"):
    """Generate image with character consistency using ControlNet."""
    
    print(f"🎨 Generating: {page_data['title']}")
    
    # Prepare reference
    ref_result = prepare_reference_image(reference_image_path)
    if ref_result is None:
        return None
        
    ref_image, canny_image = ref_result
    
    # Create prompt
    prompt = create_controlnet_prompt(page_data)
    negative_prompt = "low quality, blurry, deformed, extra limbs, bad anatomy, text, watermark, signature"
    
    print(f"📝 Prompt: {prompt[:100]}...")
    
    # Generate with ControlNet
    try:
        image = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            image=canny_image,
            controlnet_conditioning_scale=0.7,  # How much to follow the reference structure
            num_inference_steps=30,
            guidance_scale=7.5,
            width=1024,
            height=1024,
        ).images[0]
        
        return image, ref_image, canny_image
        
    except Exception as e:
        print(f"❌ Generation failed: {e}")
        return None

print("🚀 Generation function ready!")

## 5. Test Generation

In [None]:
# Test with cover page first
if os.path.exists('page-prompts/page-00-cover.md'):
    print("🧪 Testing with cover page...")
    
    # Load page data
    page_data = load_page_prompt('page-prompts/page-00-cover.md')
    
    # Generate image
    result = generate_consistent_image(page_data)
    
    if result:
        generated_image, reference_image, canny_image = result
        
        # Display results
        fig, axes = plt.subplots(1, 3, figsize=(15, 5))
        
        axes[0].imshow(reference_image)
        axes[0].set_title("Reference Image")
        axes[0].axis('off')
        
        axes[1].imshow(canny_image, cmap='gray')
        axes[1].set_title("Canny Control")
        axes[1].axis('off')
        
        axes[2].imshow(generated_image)
        axes[2].set_title("Generated Image")
        axes[2].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        # Save the generated image
        output_path = 'generated_images/page-00-cover.png'
        generated_image.save(output_path)
        print(f"💾 Saved: {output_path}")
        
    else:
        print("❌ Test generation failed")
else:
    print("❌ Cover page not found. Please upload page-prompts/page-00-cover.md")

## 6. Generate All Pages (0-3)

In [None]:
# Generate pages 0-3
pages_to_generate = ['page-00-cover.md', 'page-01.md', 'page-02.md', 'page-03.md']

print("🎨 Starting batch generation...")
print(f"📊 Generating {len(pages_to_generate)} pages")
print(f"💰 Cost: FREE (using Colab GPU)")

results = []

for i, page_file in enumerate(pages_to_generate, 1):
    page_path = f'page-prompts/{page_file}'
    
    if not os.path.exists(page_path):
        print(f"⏭️  Skipping {page_file} (not found)")
        continue
    
    print(f"\n🖼️  [{i}/{len(pages_to_generate)}] Processing {page_file}...")
    
    # Load page data
    page_data = load_page_prompt(page_path)
    
    # Choose reference image based on content
    if 'store' in page_data['image_prompt'].lower() and 'corey' not in page_data['image_prompt'].lower():
        ref_path = 'cartoon-characters/store-cartoon.jpg'
    elif 'family' in page_data['image_prompt'].lower():
        ref_path = 'cartoon-characters/wentworth-family-foglio.jpg'
    else:
        ref_path = 'cartoon-characters/corey1.jpg'  # Default to Corey
    
    # Generate image
    result = generate_consistent_image(page_data, ref_path)
    
    if result:
        generated_image, reference_image, canny_image = result
        
        # Save image
        output_name = page_file.replace('.md', '.png')
        output_path = f'generated_images/{output_name}'
        generated_image.save(output_path)
        
        results.append({
            'page': page_file,
            'image': generated_image,
            'path': output_path,
            'title': page_data['title']
        })
        
        print(f"✅ Saved: {output_path}")
        
        # Display the result
        plt.figure(figsize=(8, 8))
        plt.imshow(generated_image)
        plt.title(f"{page_data['title']}")
        plt.axis('off')
        plt.show()
        
    else:
        print(f"❌ Failed to generate {page_file}")

print(f"\n🎉 Generation complete!")
print(f"✅ Successfully generated: {len(results)} images")
print(f"📁 Images saved in: generated_images/")

## 7. Download Results

In [None]:
# Create zip file with all generated images
import zipfile
from datetime import datetime

# Create zip filename with timestamp
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
zip_filename = f"corey_book_images_{timestamp}.zip"

# Collect the PNGs once so the reported count matches what goes into the zip
png_files = sorted(f for f in os.listdir('generated_images/') if f.endswith('.png'))

with zipfile.ZipFile(zip_filename, 'w') as zipf:
    # Add all generated images
    for file in png_files:
        zipf.write(f'generated_images/{file}', file)

print(f"📦 Created zip file: {zip_filename}")
print(f"📁 Contains {len(png_files)} images")
print(f"\n💾 Right-click the file in the file browser to download it")

# Show file sizes
print("\n📊 Generated Images:")
for file in sorted(os.listdir('generated_images/')):
    if file.endswith('.png'):
        size_mb = os.path.getsize(f'generated_images/{file}') / (1024*1024)
        print(f"  {file}: {size_mb:.1f} MB")

## 8. Tips for Better Results

### Character Consistency:
- The notebook uses your `corey1.jpg` as a structural reference
- Canny edge detection preserves the overall pose/composition
- Adjust `controlnet_conditioning_scale` (0.5-1.0; the notebook uses 0.7): higher values follow the reference edges more closely, lower values give the prompt more freedom

### Quality Improvements:
- Use higher resolution reference images (1024x1024 or larger)
- Increase `num_inference_steps` (30-50) for better quality
- Adjust `guidance_scale` (5-10) to control prompt adherence
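
One way to explore the ranges above is to build a small grid of settings and generate the same page once per combination. This sketch only builds the keyword arguments. The specific values are illustrative picks from the ranges in the tips, and `build_parameter_grid` is a hypothetical helper name. Each dictionary could be passed on to the pipeline call, for example `pipe(prompt=..., image=canny_image, **settings)`, which would mean extending `generate_consistent_image` to accept them.

```python
from itertools import product


def build_parameter_grid(scales=(0.5, 0.7, 1.0), steps=(30, 50), guidance=(5.0, 7.5, 10.0)):
    """Return one kwargs dict per combination of the tunable settings."""
    return [
        {
            "controlnet_conditioning_scale": s,
            "num_inference_steps": n,
            "guidance_scale": g,
        }
        for s, n, g in product(scales, steps, guidance)
    ]


grid = build_parameter_grid()
print(f"{len(grid)} combinations")
print(grid[0])
```

A full grid multiplies quickly, so on a free Colab GPU it is probably more practical to vary one parameter at a time on a single test page.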

### Different Reference Images:
- Use `store-cartoon.jpg` for building-focused scenes
- Use `wentworth-family-foglio.jpg` for family scenes
- Create pose-specific references for different Corey positions
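
The keyword-based choice used in the batch loop can be pulled into a small function that also falls back to the Corey reference when the preferred file has not been uploaded. This is a sketch under assumptions: the function name `choose_reference` and the `exists` parameter (injected so the logic can be tried without files on disk) are not in the original notebook. The keywords and file names are the ones used in the batch cell.

```python
import os

DEFAULT_REF = "cartoon-characters/corey1.jpg"


def choose_reference(image_prompt, exists=os.path.exists):
    """Pick a reference image from prompt keywords, defaulting to Corey."""
    text = image_prompt.lower()
    if "store" in text and "corey" not in text:
        candidate = "cartoon-characters/store-cartoon.jpg"
    elif "family" in text:
        candidate = "cartoon-characters/wentworth-family-foglio.jpg"
    else:
        candidate = DEFAULT_REF
    # Fall back to the default if the preferred reference is missing.
    return candidate if exists(candidate) else DEFAULT_REF
```

Pose-specific references could be added as further branches in the same function, keyed on whatever words the page prompts use for those poses.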

### Cost:
- **FREE** on Google Colab's free tier, subject to its daily GPU limits
- Character consistency is much better than with text-only prompting
- All 56 pages can be generated in 1-2 sessions

**Next Steps:**
1. Download your generated images
2. Review character consistency
3. Run again with remaining pages (4-56)
4. Adjust parameters if needed
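
For step 3, the page list for the next run can be built rather than typed out. The sketch assumes the remaining prompt files follow the same `page-NN.md` naming as pages 1 to 3 and that finished images are saved under `generated_images/` with a `.png` extension, as in the batch cell. `remaining_pages` is a hypothetical helper. Pages whose image already exists are skipped, so an interrupted session can be resumed.

```python
from pathlib import Path


def remaining_pages(first=4, last=56, output_dir="generated_images"):
    """List page prompt files in [first, last] that have no PNG yet."""
    out = Path(output_dir)
    pages = []
    for n in range(first, last + 1):
        name = f"page-{n:02d}.md"
        # Same .md -> .png mapping as the batch generation cell.
        if not (out / name.replace(".md", ".png")).exists():
            pages.append(name)
    return pages


pages_to_generate = remaining_pages()
print(f"{len(pages_to_generate)} pages left, starting with {pages_to_generate[:1]}")
```

The resulting `pages_to_generate` list can replace the hard-coded one in the batch generation cell.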