# Video Pipeline Orchestration

**Module:** 03-Video-Orchestration  
**Level:** Advanced  
**Technologies:** Text->Image->Video pipelines, batch generation, upscale+interpolation  
**Estimated duration:** 60 minutes  
**VRAM:** ~18 GB (sequential loading)  

## Learning Objectives

- [ ] Build a text -> image (DALL-E/SDXL) -> video (SVD) pipeline
- [ ] Build a text -> video -> upscale -> interpolation pipeline
- [ ] Orchestrate batch generation across several prompts
- [ ] Manage sequential model loading (limited VRAM)
- [ ] Assemble the results: clip concatenation, transitions
- [ ] Compare the pipelines in terms of quality and cost

## Prerequisites

- GPU with 18+ GB VRAM (RTX 3090 / RTX 4090)
- Notebook 03-1 completed (multi-model benchmark)
- Packages: `diffusers>=0.32`, `transformers`, `torch`, `accelerate`, `imageio`, `imageio-ffmpeg`, `openai`, `Pillow`
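
The package list above can be checked before running the notebook. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the distribution names are copied from the list above. It reports installed versions but does not enforce the `diffusers>=0.32` constraint.

```python
from importlib import metadata

# Distribution names copied from the prerequisites list.
REQUIRED = [
    "diffusers", "transformers", "torch", "accelerate",
    "imageio", "imageio-ffmpeg", "openai", "Pillow",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name:16s} {version or 'MISSING'}")
```

Querying distribution metadata avoids importing heavy packages such as `torch` just to read a version string.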

**Navigation:** [<< 03-1](03-1-Multi-Model-Video-Comparison.ipynb) | [Index](../README.md) | [Next >>](03-3-ComfyUI-Video-Workflows.ipynb)

In [1]:
# Parametres Papermill - JAMAIS modifier ce commentaire

# Configuration notebook
notebook_mode = "interactive"        # "interactive" ou "batch"
skip_widgets = False               # True pour mode batch MCP
debug_level = "INFO"

# Parametres pipeline
pipeline_mode = "text_image_video"  # "text_image_video" ou "text_video_upscale"
image_model = "dall-e-3"           # Modele image : "dall-e-3" ou "sdxl"
video_model = "svd"                # Modele video : "svd" ou "ltx"
upscale = True                     # Activer l'upscaling
device = "cuda"                    # Device de calcul

# Parametres generation
num_frames = 25                    # Nombre de frames video
num_inference_steps = 25           # Etapes de debruitage
guidance_scale = 6.0               # CFG scale
fps_output = 8                     # FPS de sortie
seed = 42                          # Graine pour reproductibilite

# Configuration
run_pipeline = True                # Executer les pipelines
save_as_mp4 = True                 # Sauvegarder en MP4
save_results = True

In [2]:
# Parameters
notebook_mode = "batch"
skip_widgets = True


In [3]:
# Setup environnement et imports
import os
import sys
import json
import time
import gc
import warnings
from pathlib import Path
from datetime import datetime
from typing import Dict, List, Any, Optional
import numpy as np
from PIL import Image, ImageDraw, ImageFilter
import matplotlib.pyplot as plt
import logging

warnings.filterwarnings('ignore', category=DeprecationWarning)
warnings.filterwarnings('ignore', category=FutureWarning)

# Import helpers GenAI
GENAI_ROOT = Path.cwd()
while GENAI_ROOT.name != 'GenAI' and len(GENAI_ROOT.parts) > 1:
    GENAI_ROOT = GENAI_ROOT.parent

HELPERS_PATH = GENAI_ROOT / 'shared' / 'helpers'
if HELPERS_PATH.exists():
    sys.path.insert(0, str(HELPERS_PATH.parent))
    try:
        from helpers.genai_helpers import setup_genai_logging
        print("Helpers GenAI importes")
    except ImportError:
        print("Helpers GenAI non disponibles - mode autonome")

OUTPUT_DIR = GENAI_ROOT / 'outputs' / 'pipeline_video'
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

logging.basicConfig(level=getattr(logging, debug_level))
logger = logging.getLogger('pipeline_video')

print(f"Orchestration de Pipelines Video")
print(f"Date : {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print(f"Mode : {notebook_mode}")
print(f"Pipeline : {pipeline_mode}")
print(f"Image : {image_model}, Video : {video_model}")
print(f"Upscale : {upscale}")

Helpers GenAI importes
Orchestration de Pipelines Video
Date : 2026-02-26 08:10:51
Mode : batch
Pipeline : text_image_video
Image : dall-e-3, Video : svd
Upscale : True


In [4]:
# Chargement .env et verification GPU
from dotenv import load_dotenv

current_path = Path.cwd()
found_env = False
for _ in range(4):
    env_path = current_path / '.env'
    if env_path.exists():
        load_dotenv(env_path)
        print(f"Fichier .env charge depuis : {env_path}")
        found_env = True
        break
    current_path = current_path.parent

if not found_env:
    print("Aucun fichier .env trouve")

# Verification des API keys
print("\n--- VERIFICATION API ---")
print("=" * 40)

openai_key = os.environ.get('OPENAI_API_KEY', '')
if openai_key:
    print(f"OPENAI_API_KEY : configure ({openai_key[:8]}...)")
else:
    print("OPENAI_API_KEY : non configure")
    if image_model == "dall-e-3":
        print("  Pipeline text->image->video utilisera SDXL en fallback")
        image_model = "sdxl"

# Verification GPU
print("\n--- VERIFICATION GPU ---")
print("=" * 40)

import torch

if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    # The device-properties attribute is `total_memory` (there is no `total_mem`)
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    vram_total = total_bytes / 1024**3
    vram_free = (total_bytes - torch.cuda.memory_allocated(0)) / 1024**3
    
    print(f"GPU : {gpu_name}")
    print(f"VRAM totale : {vram_total:.1f} GB")
    print(f"VRAM libre : {vram_free:.1f} GB")
    print(f"CUDA : {torch.version.cuda}")
else:
    print("CUDA non disponible.")
    print("Les pipelines locaux necessitent un GPU. Le notebook montrera le code sans executer.")
    run_pipeline = False
    device = "cpu"

# Verification des dependances
print("\n--- VERIFICATION DEPENDANCES ---")
print("=" * 40)

deps_ok = True

try:
    import diffusers
    print(f"diffusers : v{diffusers.__version__}")
    from diffusers.utils import export_to_video
except ImportError:
    print("diffusers NON INSTALLE")
    deps_ok = False

try:
    import imageio
    print(f"imageio : v{imageio.__version__}")
except ImportError:
    print("imageio NON INSTALLE")
    deps_ok = False

if not deps_ok:
    print("\nDependances manquantes. Le notebook montrera le code sans executer.")
    run_pipeline = False

print(f"\nDevice : {device}")
print(f"Pipeline active : {run_pipeline}")

Fichier .env charge depuis : D:\Dev\CoursIA.worktrees\GenAI_Series\MyIA.AI.Notebooks\GenAI\.env

--- VERIFICATION API ---
OPENAI_API_KEY : configure (sk-proj-...)

--- VERIFICATION GPU ---


CUDA non disponible.
Les pipelines locaux necessitent un GPU. Le notebook montrera le code sans executer.

--- VERIFICATION DEPENDANCES ---


diffusers : v0.36.0
imageio : v2.37.2

Device : cpu
Pipeline active : False


## Section 1: Text -> Image -> Video Pipeline

This first pipeline combines an image generation model (DALL-E 3 or SDXL)
with an animation model (SVD) to create videos from text descriptions.

```
Text (prompt) --> [DALL-E 3 / SDXL] --> High-quality image
                                              |
                                              v
                                        [SVD / LTX-Video] --> Animated video
```

### Advantages of this approach

| Aspect | Two-stage pipeline | Direct generation |
|--------|--------------------|-------------------|
| Visual control | High (intermediate image) | Low |
| First-frame quality | Excellent (DALL-E 3) | Variable |
| Flexibility | Image and video models chosen independently | Tied to the model |
| VRAM cost | Sequential (one model at a time) | Single model |
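
Because the image stage is independent of the video stage, image backends can be tried in order until one succeeds (the notebook falls back from DALL-E 3 to SDXL to a synthetic image). A minimal sketch of that fallback chain follows; `first_successful` and the three stub generators are hypothetical names that stand in for the real backends.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def first_successful(
    generators: Sequence[Tuple[str, Callable[[str], Optional[T]]]],
    prompt: str,
) -> Tuple[Optional[str], Optional[T]]:
    """Try each (name, generator) in order; return the first non-None result."""
    for name, generate in generators:
        try:
            result = generate(prompt)
        except Exception as exc:  # a failing backend must not stop the chain
            print(f"{name} failed: {type(exc).__name__}")
            continue
        if result is not None:
            return name, result
    return None, None


# Stub backends (assumptions): the first raises, the second returns nothing.
def cloud_stub(prompt: str) -> Optional[str]:
    raise RuntimeError("no API key")


def local_stub(prompt: str) -> Optional[str]:
    return None


def synthetic_stub(prompt: str) -> Optional[str]:
    return f"placeholder for {prompt!r}"


backend, image = first_successful(
    [("cloud", cloud_stub), ("local", local_stub), ("synthetic", synthetic_stub)],
    "lighthouse at sunset",
)
print(backend, image)
```

Returning the backend name alongside the result makes it easy to record which model actually produced the source image.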

In [5]:
# Pipeline 1 : Text -> Image -> Video
print("\n--- PIPELINE TEXT -> IMAGE -> VIDEO ---")
print("=" * 50)


def release_vram():
    """Release GPU VRAM aggressively."""
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        torch.cuda.synchronize()


def generate_image_dalle(prompt: str) -> Optional[Image.Image]:
    """Genere une image via DALL-E 3 (API OpenAI)."""
    try:
        from openai import OpenAI
        import base64
        from io import BytesIO
        
        client = OpenAI()
        response = client.images.generate(
            model="dall-e-3",
            prompt=prompt,
            size="1792x1024",
            quality="hd",
            n=1,
            response_format="b64_json"
        )
        
        img_data = base64.b64decode(response.data[0].b64_json)
        img = Image.open(BytesIO(img_data)).convert('RGB')
        # Redimensionner pour SVD (1024x576)
        img = img.resize((1024, 576), Image.LANCZOS)
        return img
        
    except Exception as e:
        print(f"  Erreur DALL-E 3 : {type(e).__name__}: {str(e)[:100]}")
        return None


def generate_image_sdxl(prompt: str) -> Optional[Image.Image]:
    """Genere une image via SDXL local."""
    try:
        from diffusers import StableDiffusionXLPipeline
        
        pipe_sdxl = StableDiffusionXLPipeline.from_pretrained(
            "stabilityai/stable-diffusion-xl-base-1.0",
            torch_dtype=torch.float16,
            variant="fp16"
        ).to(device)
        pipe_sdxl.enable_vae_slicing()
        
        generator = torch.Generator(device=device).manual_seed(seed)
        img = pipe_sdxl(
            prompt=prompt,
            negative_prompt="low quality, blurry, distorted",
            num_inference_steps=30,
            guidance_scale=7.5,
            height=576,
            width=1024,
            generator=generator
        ).images[0]
        
        del pipe_sdxl
        release_vram()
        return img
        
    except Exception as e:
        print(f"  Erreur SDXL : {type(e).__name__}: {str(e)[:100]}")
        return None


def animate_with_svd(image: Image.Image) -> Optional[List]:
    """Anime une image avec SVD."""
    try:
        from diffusers import StableVideoDiffusionPipeline
        
        pipe_svd = StableVideoDiffusionPipeline.from_pretrained(
            "stabilityai/stable-video-diffusion-img2vid-xt",
            torch_dtype=torch.float16,
            variant="fp16"
        ).to(device)
        # StableVideoDiffusionPipeline has no enable_vae_slicing();
        # decode_chunk_size (below) bounds VAE decode memory instead
        
        generator = torch.Generator(device=device).manual_seed(seed)
        output = pipe_svd(
            image=image,
            num_frames=num_frames,
            fps=7,
            motion_bucket_id=127,
            noise_aug_strength=0.02,
            num_inference_steps=num_inference_steps,
            decode_chunk_size=8,
            generator=generator
        )
        
        frames = output.frames[0]
        del pipe_svd
        release_vram()
        return frames
        
    except Exception as e:
        print(f"  Erreur SVD : {type(e).__name__}: {str(e)[:100]}")
        return None


# Execution du pipeline text -> image -> video
pipeline1_prompt = "a majestic lighthouse on a rocky cliff at sunset, dramatic clouds, golden light, cinematic photography"
pipeline1_result = {"prompt": pipeline1_prompt}

if run_pipeline:
    print(f"Prompt : {pipeline1_prompt}")
    
    # Etape 1 : Generation de l'image
    print(f"\n  Etape 1 : Generation image ({image_model})...")
    start_time = time.time()
    
    if image_model == "dall-e-3":
        source_image = generate_image_dalle(pipeline1_prompt)
    else:
        source_image = generate_image_sdxl(pipeline1_prompt)
    
    if source_image is None:
        # Fallback : creer une image synthetique
        print("  Fallback : image synthetique")
        source_image = Image.new('RGB', (1024, 576))
        draw = ImageDraw.Draw(source_image)
        for y in range(576):
            t = y / 576
            r = int(255 - 120 * t)
            g = int(160 - 60 * t)
            b = int(80 + 100 * t)
            draw.line([(0, y), (1024, y)], fill=(r, g, b))
        # Silhouette de phare
        draw.rectangle([480, 100, 544, 400], fill=(60, 55, 50))
        draw.ellipse([460, 70, 564, 130], fill=(255, 240, 180))
    
    img_time = time.time() - start_time
    pipeline1_result["image_time"] = img_time
    pipeline1_result["source_image"] = source_image
    print(f"  Image generee en {img_time:.1f}s ({source_image.size[0]}x{source_image.size[1]})")
    
    # Sauvegarde image intermediaire
    img_path = OUTPUT_DIR / "pipeline1_source.png"
    source_image.save(str(img_path))
    
    # Etape 2 : Animation avec SVD
    print(f"\n  Etape 2 : Animation video (SVD)...")
    start_time = time.time()
    
    video_frames = animate_with_svd(source_image)
    
    if video_frames:
        video_time = time.time() - start_time
        pipeline1_result["video_time"] = video_time
        pipeline1_result["frames"] = video_frames
        pipeline1_result["total_time"] = img_time + video_time
        pipeline1_result["success"] = True
        
        print(f"  Video generee en {video_time:.1f}s ({len(video_frames)} frames)")
        print(f"  Temps total pipeline : {pipeline1_result['total_time']:.1f}s")
        
        # Affichage : image source + frames
        fig, axes = plt.subplots(1, 5, figsize=(18, 3.5))
        axes[0].imshow(source_image)
        axes[0].set_title(f"Source ({image_model})", fontsize=9, fontweight='bold')
        axes[0].axis('off')
        
        frame_indices = np.linspace(0, len(video_frames) - 1, 4, dtype=int)
        for i, fi in enumerate(frame_indices):
            axes[i + 1].imshow(video_frames[fi])
            axes[i + 1].set_title(f"Frame {fi + 1}", fontsize=9)
            axes[i + 1].axis('off')
        
        plt.suptitle(f"Pipeline Text -> Image -> Video", fontsize=13, fontweight='bold')
        plt.tight_layout()
        plt.show()
        
        if save_as_mp4:
            mp4_path = OUTPUT_DIR / "pipeline1_result.mp4"
            export_to_video(video_frames, str(mp4_path), fps=fps_output)
            print(f"  MP4 : {mp4_path.name}")
    else:
        pipeline1_result["success"] = False
        print("  Animation echouee")
else:
    print("Pipeline desactive")
    print("\nSchema du pipeline :")
    print("  1. DALL-E 3 / SDXL genere une image haute qualite")
    print("  2. SVD anime l'image en 25 frames")
    print("  3. Export en MP4")


--- PIPELINE TEXT -> IMAGE -> VIDEO ---
Pipeline desactive

Schema du pipeline :
  1. DALL-E 3 / SDXL genere une image haute qualite
  2. SVD anime l'image en 25 frames
  3. Export en MP4


### Interpretation: Text -> Image -> Video pipeline

| Step | Model | Typical time | Cost |
|------|-------|--------------|------|
| Image generation | DALL-E 3 (API) | 5-15s | ~$0.12 (HD 1792x1024; ~$0.04 for standard 1024x1024) |
| Image generation | SDXL (local) | 10-20s | Free (GPU) |
| Animation | SVD | 20-40s | Free (GPU) |

**Key points**:
1. The intermediate image can be checked and adjusted before animation
2. DALL-E 3 produces more detailed images but requires an API key
3. Sequential loading (SDXL, then SVD) keeps peak VRAM usage under 24 GB
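
The sequential-loading pattern from the last key point can be sketched generically: each stage loads its model, runs, and drops the model before the next stage starts. This is an illustration with hypothetical names (`run_stage`, `run_sequential`) and string stubs in place of real pipelines. In the notebook, the `release` hook would be a function like `release_vram`.

```python
import gc
from typing import Any, Callable, Iterable, Optional, Tuple

# A stage is (loader, runner): loader() builds the model, runner(model, x) uses it.
Stage = Tuple[Callable[[], Any], Callable[[Any, Any], Any]]


def run_stage(
    load: Callable[[], Any],
    run: Callable[[Any, Any], Any],
    payload: Any,
    release: Optional[Callable[[], None]] = None,
) -> Any:
    """Load one model, run it, then drop it before returning the result."""
    model = load()
    try:
        return run(model, payload)
    finally:
        # The only reference lives in this frame, so deleting it lets the
        # model be collected before the next stage loads its own model.
        del model
        gc.collect()
        if release is not None:
            release()  # e.g. a hook that empties the CUDA cache


def run_sequential(
    stages: Iterable[Stage],
    payload: Any,
    release: Optional[Callable[[], None]] = None,
) -> Any:
    """Chain stages so that at most one model is alive at a time."""
    for load, run in stages:
        payload = run_stage(load, run, payload, release)
    return payload


# Stub stages (assumptions): strings stand in for the image and video models.
stages = [
    (lambda: "image-model", lambda model, x: f"{x} -> image[{model}]"),
    (lambda: "video-model", lambda model, x: f"{x} -> video[{model}]"),
]
print(run_sequential(stages, "prompt"))
```

Keeping the model reference local to `run_stage` matters: a model bound to a variable in the calling cell would stay alive, and its memory would not be freed.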

## Section 2: Text -> Video -> Upscale Pipeline

This second pipeline first generates a raw video, then improves it
with spatial upscaling (Real-ESRGAN) and temporal interpolation.

```
Text --> [LTX-Video] --> Low-resolution video
                               |
                               v
                         [Real-ESRGAN] --> HD frames
                               |
                               v
                         [Interpolation] --> Smooth HD video
```
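
The chain in the diagram can be timed stage by stage, which is what the notebook does by hand with `time.time()`. A minimal sketch follows; `run_timed` is a hypothetical helper, and the three stub stages only mimic the shape of the data (a list of frames), not the real models.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_timed(stages: List[Stage], payload: Any) -> Tuple[Any, Dict[str, float]]:
    """Run named stages in order and record the wall-clock time of each."""
    timings: Dict[str, float] = {}
    for name, step in stages:
        start = time.perf_counter()
        payload = step(payload)
        timings[name] = time.perf_counter() - start
    timings["total"] = sum(timings.values())
    return payload, timings


# Stub stages (assumptions): integers stand in for frames.
stub_stages: List[Stage] = [
    ("generate", lambda prompt: list(range(25))),
    ("upscale", lambda frames: [f * 2 for f in frames]),
    ("interpolate", lambda frames: frames + frames[:-1]),
]

frames, timings = run_timed(stub_stages, "a flower blooming")
for name, seconds in timings.items():
    print(f"{name:12s} {seconds:.4f}s")
```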

In [6]:
# Pipeline 2 : Text -> Video -> Upscale -> Interpolation
print("\n--- PIPELINE TEXT -> VIDEO -> UPSCALE ---")
print("=" * 50)


def generate_video_ltx(prompt: str) -> Optional[List]:
    """Genere une video avec LTX-Video (rapide, basse resolution)."""
    try:
        from diffusers import LTXPipeline
        
        pipe_ltx = LTXPipeline.from_pretrained(
            "Lightricks/LTX-Video",
            torch_dtype=torch.bfloat16  # dtype used in the diffusers LTX-Video examples
        ).to(device)
        pipe_ltx.vae.enable_slicing()  # slicing is enabled on the VAE itself
        
        generator = torch.Generator(device=device).manual_seed(seed)
        output = pipe_ltx(
            prompt=prompt,
            negative_prompt="low quality, blurry, distorted",
            num_frames=num_frames,
            guidance_scale=guidance_scale,
            num_inference_steps=num_inference_steps,
            height=320,
            width=512,
            generator=generator
        )
        
        frames = output.frames[0]
        del pipe_ltx
        release_vram()
        return frames
        
    except Exception as e:
        print(f"  Erreur LTX-Video : {type(e).__name__}: {str(e)[:100]}")
        return None


def upscale_frames(frames: List, scale: int = 2) -> List:
    """
    Upscale les frames avec Real-ESRGAN ou fallback bicubique.
    
    Args:
        frames: Liste d'images PIL
        scale: Facteur d'agrandissement
    
    Returns:
        Liste d'images PIL upscalees
    """
    upscaled = []
    use_esrgan = False
    
    # Tenter Real-ESRGAN
    try:
        from realesrgan import RealESRGANer
        from basicsr.archs.rrdbnet_arch import RRDBNet
        
        model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
        upsampler = RealESRGANer(
            scale=4,  # native scale of RealESRGAN_x4plus; the final size is set by outscale in enhance()
            model_path='https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth',
            model=model,
            half=True
        )
        use_esrgan = True
        print(f"  Real-ESRGAN disponible (scale x{scale})")
    except ImportError:
        print(f"  Real-ESRGAN non disponible, fallback bicubique (scale x{scale})")
    
    for i, frame in enumerate(frames):
        if use_esrgan:
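            # Real-ESRGAN follows the OpenCV convention (BGR in, BGR out); PIL is RGB
            img_array = np.array(frame)[:, :, ::-1]
            output, _ = upsampler.enhance(img_array, outscale=scale)
            upscaled.append(Image.fromarray(np.ascontiguousarray(output[:, :, ::-1])))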
        else:
            w, h = frame.size
            upscaled.append(frame.resize((w * scale, h * scale), Image.LANCZOS))
    
    if use_esrgan:
        del upsampler
        release_vram()
    
    return upscaled


def interpolate_frames(frames: List, factor: int = 2) -> List:
    """
    Insert intermediate frames by linear blending.
    
    Args:
        frames: List of PIL images
        factor: Frame-rate multiplier; factor - 1 blended frames are inserted
            between each pair of consecutive frames
    
    Returns:
        List of PIL images: (len(frames) - 1) * factor + 1 frames in total
    """
    interpolated = []
    for i in range(len(frames) - 1):
        interpolated.append(frames[i])
        f1 = np.array(frames[i]).astype(float)
        f2 = np.array(frames[i + 1]).astype(float)
        for j in range(1, factor):
            alpha = j / factor
            blended = ((1 - alpha) * f1 + alpha * f2).astype(np.uint8)
            interpolated.append(Image.fromarray(blended))
    interpolated.append(frames[-1])
    return interpolated


# Execution du pipeline 2
pipeline2_prompt = "a time-lapse of a flower blooming in a garden, macro photography, soft focus background"
pipeline2_result = {"prompt": pipeline2_prompt}

if run_pipeline:
    print(f"Prompt : {pipeline2_prompt}")
    
    # Etape 1 : Generation video brute
    print(f"\n  Etape 1 : Generation video brute (LTX-Video)...")
    start_time = time.time()
    raw_frames = generate_video_ltx(pipeline2_prompt)
    
    if raw_frames:
        raw_time = time.time() - start_time
        pipeline2_result["raw_time"] = raw_time
        print(f"  Video brute : {len(raw_frames)} frames en {raw_time:.1f}s")
        print(f"  Resolution : {raw_frames[0].size[0]}x{raw_frames[0].size[1]}")
        
        # Etape 2 : Upscaling
        if upscale:
            print(f"\n  Etape 2 : Upscaling (x2)...")
            start_time = time.time()
            upscaled_frames = upscale_frames(raw_frames, scale=2)
            upscale_time = time.time() - start_time
            pipeline2_result["upscale_time"] = upscale_time
            print(f"  Upscale en {upscale_time:.1f}s")
            print(f"  Resolution : {upscaled_frames[0].size[0]}x{upscaled_frames[0].size[1]}")
        else:
            upscaled_frames = raw_frames
        
        # Etape 3 : Interpolation temporelle
        print(f"\n  Etape 3 : Interpolation temporelle (x2)...")
        start_time = time.time()
        final_frames = interpolate_frames(upscaled_frames, factor=2)
        interp_time = time.time() - start_time
        pipeline2_result["interp_time"] = interp_time
        pipeline2_result["frames"] = final_frames
        pipeline2_result["success"] = True
        
        total_time = raw_time + pipeline2_result.get("upscale_time", 0) + interp_time
        pipeline2_result["total_time"] = total_time
        
        print(f"  Interpolation en {interp_time:.1f}s")
        print(f"  Frames finales : {len(final_frames)} (x2 par interpolation)")
        print(f"  Temps total pipeline : {total_time:.1f}s")
        
        # Affichage : brut vs upscale
        fig, axes = plt.subplots(2, 4, figsize=(16, 8))
        raw_indices = np.linspace(0, len(raw_frames) - 1, 4, dtype=int)
        for i, fi in enumerate(raw_indices):
            axes[0][i].imshow(raw_frames[fi])
            axes[0][i].set_title(f"Brut - Frame {fi+1}", fontsize=9)
            axes[0][i].axis('off')
        
        final_indices = np.linspace(0, len(final_frames) - 1, 4, dtype=int)
        for i, fi in enumerate(final_indices):
            axes[1][i].imshow(final_frames[fi])
            axes[1][i].set_title(f"Final - Frame {fi+1}", fontsize=9)
            axes[1][i].axis('off')
        
        plt.suptitle("Pipeline : Brut vs Upscale + Interpolation", fontsize=13, fontweight='bold')
        plt.tight_layout()
        plt.show()
        
        if save_as_mp4:
            mp4_path = OUTPUT_DIR / "pipeline2_result.mp4"
            export_to_video(final_frames, str(mp4_path), fps=fps_output * 2)
            print(f"  MP4 : {mp4_path.name}")
    else:
        pipeline2_result["success"] = False
        print("  Generation video echouee")
else:
    print("Pipeline desactive")
    print("\nSchema du pipeline :")
    print("  1. LTX-Video genere une video basse resolution")
    print("  2. Real-ESRGAN upscale chaque frame")
    print("  3. Interpolation temporelle double le nombre de frames")


--- PIPELINE TEXT -> VIDEO -> UPSCALE ---
Pipeline desactive

Schema du pipeline :
  1. LTX-Video genere une video basse resolution
  2. Real-ESRGAN upscale chaque frame
  3. Interpolation temporelle double le nombre de frames


### Interpretation: Video -> Upscale pipeline

| Step | Input | Output | Typical time |
|------|-------|--------|--------------|
| LTX-Video | Text prompt | 25 frames 512x320 | 15-30s |
| Real-ESRGAN x2 | 25 frames 512x320 | 25 frames 1024x640 | 10-20s |
| Interpolation x2 | 25 frames | 49 frames | <1s |

Frame counts assume the default `num_frames = 25`.

**Key points**:
1. Per-frame upscaling is simple but does not enforce temporal consistency
2. Linear interpolation produces smooth transitions but no new motion; it cross-fades neighbouring frames
3. For higher-quality interpolation, motion-aware models such as RIFE or FILM would be preferable
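
The frame arithmetic behind the interpolation step is easy to get wrong by one, so here is a small sketch of it. The function names are hypothetical. The formula mirrors the notebook's `interpolate_frames`, which inserts `factor - 1` blended frames into each of the `n - 1` gaps and keeps the last frame.

```python
from typing import List


def interpolated_frame_count(n_frames: int, factor: int) -> int:
    """Number of frames after inserting factor - 1 blends into each gap."""
    if n_frames < 1 or factor < 1:
        raise ValueError("n_frames and factor must be >= 1")
    return (n_frames - 1) * factor + 1


def blend_weights(factor: int) -> List[float]:
    """Weights alpha of the second frame for each inserted blend."""
    return [j / factor for j in range(1, factor)]


def clip_duration(n_frames: int, fps: float) -> float:
    """Playback duration in seconds."""
    return n_frames / fps


n_in, fps_in, factor = 25, 8, 2
n_out = interpolated_frame_count(n_in, factor)
print(f"{n_in} frames -> {n_out} frames")
print(f"blend weights: {blend_weights(factor)}")
print(f"duration before: {clip_duration(n_in, fps_in):.3f}s")
print(f"duration after at {fps_in * factor} fps: {clip_duration(n_out, fps_in * factor):.3f}s")
```

Exporting at `fps * factor`, as the notebook does, keeps the clip duration almost unchanged while doubling temporal resolution. Exporting at the original fps would instead slow the motion down.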

## Section 3: Multi-prompt batch generation

We will generate several videos in a batch to simulate a production workflow.

In [7]:
# Generation batch multi-prompts
print("\n--- GENERATION BATCH ---")
print("=" * 40)

batch_prompts = [
    {"text": "ocean waves crashing on a rocky shore at sunset, aerial view", "label": "Ocean"},
    {"text": "a candle flame flickering in a dark room, warm light, close-up", "label": "Bougie"},
    {"text": "clouds moving over a mountain valley, timelapse, dramatic sky", "label": "Nuages"},
]

batch_results = []

if run_pipeline:
    # Charger un seul pipeline pour tout le batch
    print(f"Chargement du pipeline LTX-Video pour le batch...")
    try:
        from diffusers import LTXPipeline
        pipe_batch = LTXPipeline.from_pretrained(
            "Lightricks/LTX-Video",
            torch_dtype=torch.bfloat16  # dtype used in the diffusers LTX-Video examples
        ).to(device)
        pipe_batch.vae.enable_slicing()  # slicing is enabled on the VAE itself
        print("Pipeline charge")
        
        for p_idx, prompt_info in enumerate(batch_prompts):
            print(f"\n  Generation {p_idx + 1}/{len(batch_prompts)} : {prompt_info['label']}")
            print(f"  Prompt : {prompt_info['text'][:60]}...")
            
            generator = torch.Generator(device=device).manual_seed(seed + p_idx)
            start_time = time.time()
            
            try:
                output = pipe_batch(
                    prompt=prompt_info['text'],
                    negative_prompt="low quality, blurry, distorted",
                    num_frames=num_frames,
                    guidance_scale=guidance_scale,
                    num_inference_steps=num_inference_steps,
                    height=320,
                    width=512,
                    generator=generator
                )
                
                gen_time = time.time() - start_time
                frames = output.frames[0]
                
                batch_results.append({
                    "label": prompt_info['label'],
                    "prompt": prompt_info['text'],
                    "frames": frames,
                    "time": gen_time,
                    "success": True
                })
                
                print(f"  OK en {gen_time:.1f}s ({len(frames)} frames)")
                
                if save_as_mp4:
                    mp4_path = OUTPUT_DIR / f"batch_{prompt_info['label'].lower()}.mp4"
                    export_to_video(frames, str(mp4_path), fps=fps_output)
                    
            except Exception as e:
                print(f"  Erreur : {type(e).__name__}: {str(e)[:100]}")
                batch_results.append({"label": prompt_info['label'], "success": False})
        
        del pipe_batch
        release_vram()
        
    except Exception as e:
        print(f"Erreur chargement pipeline batch : {type(e).__name__}: {str(e)[:100]}")
    
    # Affichage des resultats batch
    successful_batch = [r for r in batch_results if r.get('success', False)]
    if successful_batch:
        n_videos = len(successful_batch)
        n_preview = 4
        fig, axes = plt.subplots(n_videos, n_preview, figsize=(3.5 * n_preview, 3 * n_videos))
        if n_videos == 1:
            axes = [axes]
        
        for v_idx, br in enumerate(successful_batch):
            frame_indices = np.linspace(0, len(br['frames']) - 1, n_preview, dtype=int)
            for f_idx, fi in enumerate(frame_indices):
                axes[v_idx][f_idx].imshow(br['frames'][fi])
                axes[v_idx][f_idx].axis('off')
                if f_idx == 0:
                    # axis('off') hides the ylabel, so show the label as a title instead
                    axes[v_idx][f_idx].set_title(br['label'], fontsize=11, fontweight='bold')
        
        plt.suptitle("Generation Batch - LTX-Video", fontsize=13, fontweight='bold')
        plt.tight_layout()
        plt.show()
        
        total_batch_time = sum(r.get('time', 0) for r in successful_batch)
        print(f"\nBatch : {len(successful_batch)} videos en {total_batch_time:.1f}s total")
        print(f"Moyenne : {total_batch_time / len(successful_batch):.1f}s par video")
else:
    print("Generation batch desactivee")
    print("\nLe batch reutilise un seul pipeline charge en memoire :")
    print("  1. Charger le pipeline une fois")
    print("  2. Generer N videos avec des prompts differents")
    print("  3. Liberer le pipeline")


--- GENERATION BATCH ---
Generation batch desactivee

Le batch reutilise un seul pipeline charge en memoire :
  1. Charger le pipeline une fois
  2. Generer N videos avec des prompts differents
  3. Liberer le pipeline


### Interpretation: Batch generation

| Aspect | Single | Batch |
|--------|--------|-------|
| Model loading | Once per video | Once for N videos |
| Total time (3 videos) | 3 x (load + gen) | 1 x load + 3 x gen |
| VRAM | Variable | Stable |

**Key points**:
1. In batch mode, the model loading time is amortized over all generations
2. VRAM stays stable during the batch (the same model remains loaded)
3. Batch generation is the most efficient strategy when many videos share the same model
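
The amortization argument in the table reduces to a one-line cost model. The sketch below uses a hypothetical function name and made-up timings (40 s to load, 20 s per generation) purely for illustration; real values depend on the model, disk, and GPU.

```python
def total_time(n_videos: int, load_s: float, gen_s: float, reuse_pipeline: bool) -> float:
    """Total seconds to produce n_videos, with or without reloading the model."""
    if n_videos == 0:
        return 0.0
    loads = 1 if reuse_pipeline else n_videos
    return loads * load_s + n_videos * gen_s


# Illustrative numbers only (assumptions, not measurements).
LOAD_S, GEN_S = 40.0, 20.0

for n in (1, 3, 10):
    single = total_time(n, LOAD_S, GEN_S, reuse_pipeline=False)
    batch = total_time(n, LOAD_S, GEN_S, reuse_pipeline=True)
    print(f"{n:2d} videos: single={single:.0f}s  batch={batch:.0f}s  saved={single - batch:.0f}s")
```

The saving is `(N - 1) * load_s`, so it grows linearly with the batch size and vanishes for a single video.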

## Pipeline comparison

| Pipeline | Strength | Weakness | Use case |
|----------|----------|----------|----------|
| Text->Image->Video | Control, first-frame quality | Two models, slower | Polished production |
| Text->Video->Upscale | End-to-end pipeline | Per-frame upscaling (not temporal) | Quality enhancement |
| Batch multi-prompts | Efficient for N videos | Single model for all prompts | Series production |

In [8]:
# Mode interactif : pipeline personnalise
if notebook_mode == "interactive" and not skip_widgets:
    print("\n--- MODE INTERACTIF ---")
    print("=" * 40)
    print("Entrez une description pour generer une video via le pipeline complet.")
    print("Pipeline : Text -> Image (SDXL ou synthetique) -> Video (SVD)")
    print("(Laissez vide pour passer a la suite)")
    
    try:
        user_desc = input("\nVotre description : ").strip()
        
        if user_desc and run_pipeline:
            print(f"\nGeneration avec : {user_desc}")
            
            # Etape 1 : Image synthetique basee sur la description
            print("  Etape 1 : Generation image (SDXL ou synthetique)...")
            user_image = generate_image_sdxl(user_desc)
            if user_image is None:
                user_image = Image.new('RGB', (1024, 576), (120, 140, 180))
                draw = ImageDraw.Draw(user_image)
                for y in range(576):
                    t = y / 576
                    draw.line([(0, y), (1024, y)], fill=(int(120+80*t), int(140-30*t), int(180-60*t)))
                print("  Image synthetique utilisee")
            
            # Etape 2 : Animation
            print("  Etape 2 : Animation (SVD)...")
            user_frames = animate_with_svd(user_image)
            
            if user_frames:
                n_display = min(6, len(user_frames))
                fig, axes = plt.subplots(1, n_display + 1, figsize=(2.5 * (n_display + 1), 3))
                axes[0].imshow(user_image)
                axes[0].set_title("Source", fontsize=9, fontweight='bold')
                axes[0].axis('off')
                
                indices = np.linspace(0, len(user_frames) - 1, n_display, dtype=int)
                for i, idx in enumerate(indices):
                    axes[i + 1].imshow(user_frames[idx])
                    axes[i + 1].set_title(f"Frame {idx+1}", fontsize=8)
                    axes[i + 1].axis('off')
                plt.suptitle(f"Pipeline : {user_desc[:50]}...", fontweight='bold')
                plt.tight_layout()
                plt.show()
                
                if save_as_mp4:
                    user_mp4 = OUTPUT_DIR / "user_pipeline.mp4"
                    export_to_video(user_frames, str(user_mp4), fps=fps_output)
                    print(f"MP4 sauvegarde : {user_mp4.name}")
            else:
                print("Animation echouee")
        elif user_desc:
            print("Pipeline non disponible")
        else:
            print("Mode interactif ignore")
    
    except (KeyboardInterrupt, EOFError) as e:
        print(f"\nMode interactif interrompu ({type(e).__name__})")
    except Exception as e:
        error_type = type(e).__name__
        if "StdinNotImplemented" in error_type or "input" in str(e).lower():
            print("\nMode interactif non disponible (execution automatisee)")
        else:
            print(f"\nErreur inattendue : {error_type} - {str(e)[:100]}")
            print("Passage a la suite du notebook")
else:
    print("\nMode batch - Interface interactive desactivee")


Mode batch - Interface interactive desactivee


## Orchestration best practices

### Pipeline selection guide

| Need | Recommended pipeline | Reason |
|------|----------------------|--------|
| Maximum control | Text->Image->Video | The intermediate image can be checked |
| HD quality | Text->Video->Upscale | Post-generation enhancement |
| Series production | Batch multi-prompts | The model is loaded once |
| Limited VRAM budget | Sequential pipeline | One model at a time |

### Advanced strategies

| Strategy | Description |
|----------|-------------|
| **Hybrid pipeline** | Combine DALL-E 3 (cloud) with SVD (local) to get the strengths of both |
| **Intermediate cache** | Save intermediate images/videos so a later stage can be rerun without regenerating earlier ones |
| **Parallelization** | Generate images in parallel (API), animate sequentially (GPU) |
| **Automatic validation** | Check quality before the next step (e.g. a sharpness threshold) |
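
As an illustration of the intermediate-cache strategy, the sketch below derives a file name from the stage name and its parameters, and only calls the producer when that file is missing. All names (`cache_key`, `cached`, `fake_render`) are hypothetical, and the producer is a stub that writes placeholder bytes rather than a real image.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def cache_key(stage: str, params: dict) -> str:
    """Stable short hash of a stage name and its JSON-serializable parameters."""
    payload = json.dumps({"stage": stage, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def cached(
    cache_dir: Path,
    stage: str,
    params: dict,
    suffix: str,
    produce: Callable[[Path], None],
) -> Path:
    """Return the cached file for (stage, params), producing it on a miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{stage}_{cache_key(stage, params)}{suffix}"
    if not target.exists():
        produce(target)  # the producer is responsible for writing `target`
    return target


calls: List[str] = []


def fake_render(path: Path) -> None:
    """Stub producer (assumption): records the call and writes placeholder bytes."""
    calls.append(path.name)
    path.write_bytes(b"placeholder image bytes")


with tempfile.TemporaryDirectory() as tmp:
    params = {"prompt": "lighthouse at sunset", "seed": 42}
    first = cached(Path(tmp), "image", params, ".png", fake_render)
    second = cached(Path(tmp), "image", params, ".png", fake_render)
    print(first == second, len(calls))
```

Including the seed and every generation parameter in the key means a changed setting produces a new file instead of silently reusing a stale one. A production version would also write to a temporary name and rename, so that an interrupted run does not leave a partial file in the cache.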

In [9]:
# Statistiques de session et prochaines etapes
print("\n--- STATISTIQUES DE SESSION ---")
print("=" * 40)

print(f"Date : {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
print(f"Mode : {notebook_mode}")
print(f"Pipeline mode : {pipeline_mode}")
print(f"Image model : {image_model}")
print(f"Video model : {video_model}")
print(f"Upscale : {upscale}")

# Recapitulatif des pipelines executes
print(f"\nResultats des pipelines :")
if pipeline1_result.get('success'):
    print(f"  Pipeline 1 (Text->Image->Video) : OK en {pipeline1_result.get('total_time', 0):.1f}s")
else:
    print(f"  Pipeline 1 (Text->Image->Video) : non execute")

if pipeline2_result.get('success'):
    print(f"  Pipeline 2 (Text->Video->Upscale) : OK en {pipeline2_result.get('total_time', 0):.1f}s")
else:
    print(f"  Pipeline 2 (Text->Video->Upscale) : non execute")

if batch_results:
    n_ok = sum(1 for r in batch_results if r.get('success'))
    print(f"  Batch : {n_ok}/{len(batch_results)} videos generees")

if device == "cuda" and torch.cuda.is_available():
    vram_current = torch.cuda.memory_allocated(0) / 1024**3
    print(f"\nVRAM actuelle : {vram_current:.1f} GB")

if save_results and OUTPUT_DIR.exists():
    generated_files = list(OUTPUT_DIR.glob('*'))
    print(f"\nFichiers generes ({len(generated_files)}) :")
    for f in sorted(generated_files):
        size_kb = f.stat().st_size / 1024
        print(f"  {f.name} ({size_kb:.1f} KB)")

print(f"\n--- PROCHAINES ETAPES ---")
print(f"1. Notebook 03-3 : Workflows ComfyUI pour la generation video")
print(f"2. Module 04 : Applications production (education, creatif, bout-en-bout)")
print(f"3. Combiner les pipelines pour des workflows de production automatises")

print(f"\nNotebook 03-2 Orchestration de Pipelines Video termine - {datetime.now().strftime('%H:%M:%S')}")


--- STATISTIQUES DE SESSION ---
Date : 2026-02-26 08:10:57
Mode : batch
Pipeline mode : text_image_video
Image model : dall-e-3
Video model : svd
Upscale : True

Resultats des pipelines :
  Pipeline 1 (Text->Image->Video) : non execute
  Pipeline 2 (Text->Video->Upscale) : non execute

Fichiers generes (0) :

--- PROCHAINES ETAPES ---
1. Notebook 03-3 : Workflows ComfyUI pour la generation video
2. Module 04 : Applications production (education, creatif, bout-en-bout)
3. Combiner les pipelines pour des workflows de production automatises

Notebook 03-2 Orchestration de Pipelines Video termine - 08:10:57
