AI Video Generation Pipeline

Production-grade pipeline for generating 15-20 minute AI videos from text prompts using open-source models.

Python 3.8+ | License: MIT

🎬 Overview

This pipeline transforms a single text prompt into a complete long-form video by:

  1. Splitting the concept into scenes using an LLM
  2. Planning individual camera shots for each scene
  3. Generating 3-12 second video clips using AI models
  4. Maintaining continuity between clips via frame analysis
  5. Creating audio (narration + background music)
  6. Stitching everything with FFmpeg into a final MP4
User Prompt → Scene Splitter → Shot Planner → Clip Generator → Continuity Manager
                                                     ↓
                                             Audio Pipeline → FFmpeg Stitcher → Final Video
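
The final stitch (step 6) is a standard FFmpeg concat. A minimal sketch, assuming the clips have already been rendered to disk; the clip paths and output name here are illustrative, not the pipeline's actual layout:

# Sketch of the FFmpeg concat step; file names are hypothetical.
import subprocess

clips = ["clips/shot_001.mp4", "clips/shot_002.mp4"]
with open("concat.txt", "w") as f:
    f.writelines(f"file '{c}'\n" for c in clips)

# Stream-copy the clips into one MP4 without re-encoding.
subprocess.run(
    ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
     "-i", "concat.txt", "-c", "copy", "final_video.mp4"],
    check=True,
)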

⚠️ Hardware Requirements

Component    Minimum         Recommended
RAM          16GB            64GB
GPU VRAM     None (cloud)    24GB+ (local)
Storage      50GB            200GB
CPU          8 cores         16+ cores

Note: This pipeline is designed for cloud-first execution (Replicate, HuggingFace). Local video generation requires 12GB+ VRAM.
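
To decide between cloud and local backends, a quick VRAM check helps. A minimal sketch, assuming PyTorch is installed in the environment:

# Suggest a backend based on available GPU memory; the 12GB threshold
# matches the note above. Assumes PyTorch is installed.
import torch

if torch.cuda.is_available():
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
    backend = "local" if vram_gb >= 12 else "replicate"
else:
    backend = "replicate"
print(f"Suggested backend: {backend}")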

🚀 Quick Start

1. Clone and Setup

git clone https://github.com/reth0608/Long_AI_Video_Generator.git AI_VIDEO
cd AI_VIDEO
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

2. Configure API Keys (Optional but Recommended)

# For video generation
export REPLICATE_API_TOKEN="your_token_here"

# For scene splitting and HuggingFace models
export HF_TOKEN="your_token_here"

Get free tokens: Replicate (https://replicate.com/account/api-tokens) and HuggingFace (https://huggingface.co/settings/tokens).

3. Run Your First Video

# Dry run to test setup
python -m pipeline.orchestrator --dry-run --example

# Generate a 90-second test video
python -m pipeline.orchestrator "A cinematic sunset over mountains" --duration 1.5

# Full 12-minute documentary
python -m pipeline.orchestrator "Your video concept here" --duration 12

πŸ“ Project Structure

AI_VIDEO/
├── pipeline/                   # Core pipeline modules
│   ├── __init__.py
│   ├── orchestrator.py         # Main coordinator
│   ├── scene_splitter.py       # Prompt → Scenes (LLM)
│   ├── shot_planner.py         # Scenes → Shots
│   ├── prompt_templates.py     # Jinja2 templates
│   ├── clip_worker.py          # Video generation
│   ├── continuity_manager.py   # Visual continuity
│   └── audio_pipeline.py       # TTS + Music
├── scripts/
│   └── ffmpeg_tools.sh         # FFmpeg commands
├── examples/
│   └── salt_flats_shots.json   # Sample shot sequence
├── tests/
│   └── test_prompts.py         # Unit tests
├── config.yaml                 # Configuration
├── requirements.txt            # Dependencies
├── sample_run.sh               # Example script
└── manifest.json               # Project manifest

🎯 Usage Examples

Basic Video Generation

from pipeline import VideoPipeline

pipeline = VideoPipeline()
result = pipeline.run(
    user_prompt="A serene Japanese garden in autumn, falling maple leaves, koi pond",
    target_duration=5.0  # minutes
)

print(f"Final video: {result['final_video']}")

Custom Configuration

from pipeline.orchestrator import PipelineConfig, VideoPipeline

config = PipelineConfig(
    resolution=1080,
    default_clip_duration=8.0,
    primary_backend="replicate",
    tts_engine="bark",
    skip_music=False
)

pipeline = VideoPipeline(config)
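
The configured instance is then run the same way as the basic example above (assuming the same run() signature):

result = pipeline.run(
    user_prompt="A neon-lit city street at night, rain reflections",
    target_duration=5.0,  # minutes
)
print(f"Final video: {result['final_video']}")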

CLI Options

# --duration     Target duration in minutes
# --resolution   720 or 1080
# --backend      replicate, huggingface, or local
# --skip-audio   Skip audio generation
# --dry-run      Test without generation
python -m pipeline.orchestrator "Your prompt" \
    --duration 10 \
    --resolution 1080 \
    --backend replicate \
    --skip-audio \
    --dry-run

🎨 Sample: "Ghosts of the Salt Flats"

The pipeline includes a complete example for a 12-minute documentary:

# View the sample shot breakdown
cat examples/salt_flats_shots.json

# Generate using the example
python -m pipeline.orchestrator --example --duration 2

Sample Scenes:

  1. Opening - Empty Horizon (vast salt flats, dusk)
  2. First Glimpse - The Traveler (silhouette in green coat)
  3. Lost Machine - The Plane (rusted aircraft in salt)

🔧 Configuration

Edit config.yaml to customize:

video:
  resolution: 720
  fps: 24
  default_clip_duration: 6

backend:
  primary: "replicate"
  fallback: "huggingface"

audio:
  tts:
    engine: "bark"
    voice: "v2/en_speaker_6"
  music:
    engine: "musicgen"

🧪 Testing

# Run unit tests
python -m pytest tests/ -v

# Validate FFmpeg installation
bash scripts/ffmpeg_tools.sh --validate

# Test module imports
python -c "from pipeline import VideoPipeline; print('OK')"

📊 Evaluation Metrics

The pipeline tracks:

Metric               Description                        Target
Continuity Score     Visual consistency between clips   >0.8
Audio Sync           Narration/video alignment          ±0.5s
Color Match          Cross-clip color consistency       ΔE <10
Generation Success   Clips generated vs planned         >95%
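
As an illustration of the color-match check, the sketch below compares boundary frames of consecutive clips. It uses a simple mean-RGB distance as a stand-in for a true CIE ΔE computation; numpy and Pillow are assumed, and the frame paths are hypothetical:

# Rough color-match check between adjacent clips; the threshold of 10
# mirrors the ΔE <10 target but is illustrative for this RGB stand-in.
import numpy as np
from PIL import Image

def mean_color_distance(frame_a, frame_b):
    a = np.asarray(Image.open(frame_a).convert("RGB"), dtype=float)
    b = np.asarray(Image.open(frame_b).convert("RGB"), dtype=float)
    # Compare the average color of each frame.
    return float(np.linalg.norm(a.mean(axis=(0, 1)) - b.mean(axis=(0, 1))))

if mean_color_distance("clip_01_last.png", "clip_02_first.png") > 10:
    print("Possible continuity break between clips 01 and 02")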

🔄 Pipeline Flow

flowchart TD
    A[User Prompt] --> B[Scene Splitter]
    B --> C[Shot Planner]
    C --> D{For Each Shot}
    D --> E[Get Continuity Context]
    E --> F[Generate Clip]
    F --> G[Extract Last Frame]
    G --> H[Analyze for Continuity]
    H --> D
    D --> I[Audio Pipeline]
    I --> J[TTS Narration]
    I --> K[Music Generation]
    J --> L[FFmpeg Stitch]
    K --> L
    F --> L
    L --> M[Final MP4]

⚡ Performance Tips

  1. Batch Processing: Generate clips overnight to work within free tier limits
  2. Checkpointing: Pipeline saves progress; resume with --continue-from
  3. Local Testing: Use --backend local for fast pipeline testing (placeholders only)
  4. Resolution: Start with 720p; upscale final video if needed

🆘 Troubleshooting

"No API token provided"

export REPLICATE_API_TOKEN="your_token"
# or
export HF_TOKEN="your_token"

"FFmpeg not found"

Install FFmpeg: https://ffmpeg.org/download.html

"Out of GPU memory" (local mode)

Use cloud backends instead:

python -m pipeline.orchestrator "prompt" --backend replicate

"Rate limit exceeded"

Wait and retry, or use checkpoints to resume later.

📄 License

MIT License - see LICENSE file.

πŸ™ Credits

  • Video Models: Stability AI, Replicate, HuggingFace
  • Audio: Bark TTS, MusicGen
  • Processing: FFmpeg
