# Video Dubbing Service V2 - Google Colab Edition

This notebook allows you to run the video dubbing service on Google Colab with GPU support.

## Setup Instructions:
1. Upload this notebook to Google Colab
2. Upload your `v2` project folder to Colab (or clone from repository)
3. Mount Google Drive (optional - for saving files)
4. Upload your video files to `/content/videos/` (or use the upload widget)
5. Run all cells sequentially
6. Process your videos!
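
Before running the cells, it can help to confirm that the runtime has what the steps above expect. The following is a minimal sketch and not part of the v2 project: `check_runtime` is a hypothetical helper, and the defaults `/content/v2` and `/content/videos` are the paths this notebook assumes. `ffmpeg` and `sox` will only be found once the install cell has run.

```python
import shutil
from pathlib import Path


def check_runtime(project_dir="/content/v2", video_dir="/content/videos",
                  tools=("ffmpeg", "sox")):
    """Return a dict mapping each prerequisite name to True or False."""
    status = {
        "project folder": Path(project_dir).is_dir(),
        "video folder": Path(video_dir).is_dir(),
    }
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        status[tool] = shutil.which(tool) is not None
    return status


# Print a short report for the default (assumed) locations
for name, ok in check_runtime().items():
    print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

Anything reported as `MISSING` should be fixed before moving on to the import cell.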

In [None]:
# Install required dependencies
!pip install -q torch torchaudio transformers openai-whisper demucs moviepy pydub pyrubberband python-dotenv TTS

# Install system dependencies for audio processing
!apt-get update -qq
!apt-get install -y -qq ffmpeg sox

print("✅ Dependency installation finished. Check the output above for errors.")

  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
  Preparing metadata (setup.py) ... done
ERROR: Ignored the following versions that require a different python version: 0.0.10.2 Requires-Python >=3.6.0, <3.9; 0.0.10.3 Requires-Python >=3.6.0, <3.9; 0.0.11 Requires-Python >=3.6.0, <3.9; 0.0.12 Requires-Python >=3.6.0, <3.9; 0.0.13.1 Requires-Python >=3.6.0, <3.9; 0.0.13.2 Requires-Python >=3.6.0, <3.9; 0.0.14.1 Requires-Python >=3.6.0, <3.9; 0.0.15 Requires-Python >=3.6.0, <3.9; 0.0.15.1 Requires-Python >=3.6.0, <3.9; 0.0.9 Requires-Python >=3.6.0, <3.9; 0.0.9.1 Requires-Python >=3.6.0, <3.9; 0.0.9.2 Requires-Python >=3.6.0, <3.9; 0.0.9a10 Requires-Python >=3.6.0, <3.9; 0.0.9a9 Requires-Python >=3.6.0, <3.9; 0.1.0 Requires-Python >=3.6.0, <3.10; 0.1.1 Requires-Python >=3.6.0, <3.10; 0.1.2 Requires-Python >=3.6.0, <3.10; 0.1.3 Requires-Python >=3.6.0,

In [None]:
# Setup environment and paths
import os
import sys
from pathlib import Path

# Mount Google Drive (optional; comment out the next two lines if you don't need to save files there)
from google.colab import drive
drive.mount('/content/drive')

# Set working directory to /content
WORK_DIR = Path('/content')
os.chdir(WORK_DIR)

# Add the folder that contains v2 to the import path
# Assumes the v2 folder was uploaded to /content/v2
V2_PATH = Path('/content/v2')

if V2_PATH.is_dir():
    # The parent directory must be on sys.path so that 'v2' can be imported
    project_root = str(V2_PATH.parent)
    if project_root not in sys.path:
        sys.path.insert(0, project_root)
    print(f"✅ v2 project found at: {V2_PATH}")
else:
    print(f"⚠️  v2 project not found at {V2_PATH}")
    print("Please upload the v2 folder to /content/v2 in Colab")
    print("Or adjust V2_PATH if you placed it elsewhere")

print(f"Current working directory: {os.getcwd()}")
print(f"Python path includes: {[p for p in sys.path if 'content' in p][:3]}")

❌ Error: Could not find v2 directory!
   Current directory: /content
   Please ensure you're running from the project directory
   Expected structure: .../Dubbing-Service/v2/services/


In [None]:
# Import all necessary modules from v2 project
import asyncio
import logging
from pathlib import Path

# Configure logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'
)
logger = logging.getLogger(__name__)

# Import v2 services
from v2.services.dubbing_processor import DubbingProcessor
from v2.services.video_processor import VideoProcessor
from v2.services.transcription_processor import TranscriptionProcessor
from v2.services.translation_processor import TranslationProcessor
from v2.services.tts_processor import TTSProcessor
from v2.services.audio_assembler import AudioAssembler

print("✅ All modules imported successfully!")

📋 Python path includes: []
❌ Import error: No module named 'v2'

🔧 Troubleshooting:
   1. Make sure Cell 2 ran successfully
   2. Check that sys.path includes the project root directory
   3. Verify v2/services/ directory exists with all required files

   Current sys.path entries related to project:


ModuleNotFoundError: No module named 'v2'

## Main Dubbing Function

The function below processes videos from a directory. It will:
1. Find all video files in the specified directory
2. Process each video through the dubbing pipeline
3. Save output videos to the output directory
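
The first and third steps can be sketched in isolation. This is a minimal illustration, assuming the extension list and the `_dubbed` naming used in this notebook; `find_video_files` and `dubbed_name` are hypothetical helpers, not part of the v2 project.

```python
from pathlib import Path

# Extensions this notebook treats as video files
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".flv", ".wmv", ".m4v"}


def find_video_files(directory):
    """Return the video files directly inside `directory`, sorted by path.

    The extension check is case-insensitive, so 'clip.MP4' and 'clip.Mp4'
    are both picked up, and each file is listed only once.
    """
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    )


def dubbed_name(video_path):
    """Build the output file name: <stem>_dubbed<suffix>."""
    video_path = Path(video_path)
    return f"{video_path.stem}_dubbed{video_path.suffix}"
```

Matching on `suffix.lower()` avoids globbing once per extension and per letter case, which can miss mixed-case names or list a file twice on case-insensitive filesystems.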

In [None]:
async def process_video_directory(
    video_directory: str,
    source_language: str,
    target_language: str,
    output_directory: str = None
):
    """
    Process all videos in a directory through the dubbing pipeline
    
    Args:
        video_directory: Path to directory containing video files
        source_language: Source language code (e.g., 'en', 'fr', 'es')
        target_language: Target language code (e.g., 'yo', 'ig', 'ha')
        output_directory: Optional output directory (defaults to video_directory/dubbed)
    
    Returns:
        List of results for each processed video
    """
    video_dir = Path(video_directory)
    if not video_dir.exists():
        raise ValueError(f"Video directory not found: {video_directory}")
    
    # Set output directory
    if output_directory is None:
        output_dir = video_dir / "dubbed"
    else:
        output_dir = Path(output_directory)
    
    output_dir.mkdir(parents=True, exist_ok=True)
    
    # Find all video files (case-insensitive match on the extension)
    video_extensions = {'.mp4', '.avi', '.mov', '.mkv', '.flv', '.wmv', '.m4v'}
    video_files = sorted(
        p for p in video_dir.iterdir()
        if p.is_file() and p.suffix.lower() in video_extensions
    )
    
    if not video_files:
        logger.warning(f"No video files found in {video_directory}")
        return []
    
    logger.info(f"Found {len(video_files)} video file(s) to process")
    
    results = []
    
    # Process each video
    for video_path in video_files:
        logger.info(f"\n{'='*60}")
        logger.info(f"Processing: {video_path.name}")
        logger.info(f"{'='*60}")
        
        try:
            # Create dubbing processor
            processor = DubbingProcessor()
            
            # Process video
            result = await processor.process_video(
                str(video_path),
                source_language,
                target_language
            )
            
            # Record which input this result belongs to (used by the summary cell)
            result.setdefault('video', str(video_path))

            if result.get('success'):
                # Copy the output into the designated output directory
                output_path = Path(result['output_path'])
                final_output = output_dir / f"{video_path.stem}_dubbed{video_path.suffix}"

                import shutil
                shutil.copy2(output_path, final_output)
                logger.info(f"✅ Saved dubbed video to: {final_output}")

                result['final_output_path'] = str(final_output)
            else:
                logger.error(f"❌ Failed to process {video_path.name}: {result.get('error', 'Unknown error')}")
            results.append(result)

        except Exception as e:
            logger.error(f"❌ Error processing {video_path.name}: {e}", exc_info=True)
            results.append({
                'success': False,
                'video': str(video_path),
                'error': str(e)
            })
    
    logger.info(f"\n{'='*60}")
    succeeded = sum(1 for r in results if r.get('success'))
    logger.info(f"Processing complete! Processed {succeeded}/{len(video_files)} videos")
    logger.info(f"Output directory: {output_dir}")
    logger.info(f"{'='*60}")
    
    return results


def process_videos(
    video_directory: str,
    source_language: str,
    target_language: str,
    output_directory: str = None
):
    """
    Synchronous wrapper for process_video_directory
    
    Use this function directly in Colab cells
    """
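    import concurrent.futures

    coro = process_video_directory(
        video_directory,
        source_language,
        target_language,
        output_directory
    )
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop is running (e.g. a plain Python script)
        return asyncio.run(coro)

    # Colab/Jupyter already run an event loop in the main thread, and
    # run_until_complete() cannot be nested inside it. Run the coroutine
    # in a worker thread with its own event loop instead.
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()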


print("✅ Dubbing functions defined successfully!")

In [None]:
# CONFIGURATION - Adjust these parameters

# Path to directory containing your video files
VIDEO_DIRECTORY = "/content/videos"  # Change this to your video directory path

# Source language (language spoken in the video)
SOURCE_LANGUAGE = "en"  # Options: 'en', 'fr', 'es', 'de', 'ru', 'zh', etc.

# Target language (language you want to dub to)
TARGET_LANGUAGE = "yo"  # Options: 'yo' (Yoruba), 'ig' (Igbo), 'ha' (Hausa), etc.

# Output directory (optional - will create 'dubbed' subdirectory if not specified)
OUTPUT_DIRECTORY = None  # e.g., "/content/drive/MyDrive/dubbed_videos"

print(f"Configuration:")
print(f"  Video Directory: {VIDEO_DIRECTORY}")
print(f"  Source Language: {SOURCE_LANGUAGE}")
print(f"  Target Language: {TARGET_LANGUAGE}")
print(f"  Output Directory: {OUTPUT_DIRECTORY or 'Auto (dubbed subdirectory)'}")

In [None]:
# Create video directory if it doesn't exist
os.makedirs(VIDEO_DIRECTORY, exist_ok=True)
print(f"📁 Video directory ready: {VIDEO_DIRECTORY}")

# List existing video files
video_files = []
if os.path.exists(VIDEO_DIRECTORY):
    video_files = [f for f in os.listdir(VIDEO_DIRECTORY) 
                   if f.lower().endswith(('.mp4', '.avi', '.mov', '.mkv', '.flv', '.wmv', '.m4v'))]

if video_files:
    print(f"📹 Found {len(video_files)} video file(s) in {VIDEO_DIRECTORY}:")
    for vf in video_files[:5]:  # Show first 5
        print(f"  - {vf}")
    if len(video_files) > 5:
        print(f"  ... and {len(video_files) - 5} more")
else:
    print(f"⚠️  No video files found in {VIDEO_DIRECTORY}")
    print("\n📤 To upload videos, you can:")
    print("   1. Use the file upload widget in the next cell")
    print("   2. Drag and drop files to the Colab file browser sidebar")
    print("   3. Upload to Google Drive and access via mounted drive")

In [None]:
# Upload video files (optional - if you haven't uploaded them yet)

from google.colab import files
import shutil

print("📤 Upload your video files:")
print("   Click 'Choose Files' and select your video files")
print("   Files will be saved to:", VIDEO_DIRECTORY)

# Uncomment the four lines below to enable the file upload widget:
# uploaded = files.upload()
# for filename in uploaded.keys():
#     shutil.move(filename, os.path.join(VIDEO_DIRECTORY, filename))
#     print(f"✅ Saved: {filename}")

print("\n💡 Tip: You can also drag and drop files directly in the Colab file browser!")

In [None]:
# Run the dubbing process
# This will process all videos in the specified directory

print("🚀 Starting dubbing process...")
print("This may take a while, especially for the first run (model downloads)")
print("-" * 60)

results = process_videos(
    video_directory=VIDEO_DIRECTORY,
    source_language=SOURCE_LANGUAGE,
    target_language=TARGET_LANGUAGE,
    output_directory=OUTPUT_DIRECTORY
)

# Display results summary
print("\n" + "="*60)
print("PROCESSING SUMMARY")
print("="*60)

for i, result in enumerate(results, 1):
    status = "✅ SUCCESS" if result.get('success') else "❌ FAILED"
    video_name = result.get('video', 'Unknown')
    if isinstance(video_name, str):
        video_name = Path(video_name).name
    print(f"{i}. {video_name}: {status}")
    if result.get('success') and 'final_output_path' in result:
        print(f"   Output: {result['final_output_path']}")
    elif 'error' in result:
        print(f"   Error: {result['error']}")

print("="*60)

## Download Results

After processing, you can download the dubbed videos from Colab or access them in your Google Drive (if you mounted it).
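
If Google Drive is mounted, one way to keep the results is to copy them out of the temporary Colab filesystem. The following is a minimal sketch, assuming outputs follow the `_dubbed` naming used in this notebook; `copy_dubbed_videos` is a hypothetical helper, and the paths in the usage comment are examples to adjust.

```python
import shutil
from pathlib import Path


def copy_dubbed_videos(source_dir, destination_dir):
    """Copy every '*_dubbed.*' file from source_dir to destination_dir.

    Returns the list of destination paths that were written.
    """
    source = Path(source_dir)
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)

    copied = []
    for video in sorted(source.glob("*_dubbed.*")):
        if video.is_file():
            target = destination / video.name
            shutil.copy2(video, target)  # copy2 also preserves timestamps
            copied.append(target)
    return copied


# Example (paths are assumptions; adjust to your setup):
# copy_dubbed_videos("/content/videos/dubbed", "/content/drive/MyDrive/dubbed_videos")
```

Files under `/content` are lost when the Colab runtime is recycled, so copying to Drive is worth doing before closing the session.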

In [None]:
# Download all dubbed videos as a ZIP file
from zipfile import ZipFile
import shutil
from google.colab import files

# Find output directory
if OUTPUT_DIRECTORY:
    output_dir = Path(OUTPUT_DIRECTORY)
else:
    output_dir = Path(VIDEO_DIRECTORY) / "dubbed"

dubbed_files = sorted(output_dir.glob("*_dubbed.*")) if output_dir.exists() else []

if dubbed_files:
    zip_path = Path("/content/dubbed_videos.zip")

    # Create ZIP file
    with ZipFile(zip_path, 'w') as zipf:
        for video_file in dubbed_files:
            zipf.write(video_file, video_file.name)

    print(f"✅ Created ZIP file: {zip_path}")
    print(f"   Contains {len(dubbed_files)} dubbed video(s)")
    print("\n📥 Downloading the ZIP file...")

    # Download the ZIP file
    files.download(str(zip_path))
elif output_dir.exists():
    print(f"⚠️  No dubbed videos found in: {output_dir}")
else:
    print(f"⚠️  Output directory not found: {output_dir}")

## Additional Help

### Supported Languages

**Source Languages:** Any language supported by Whisper (e.g., 'en', 'fr', 'es', 'de', 'ru', 'zh', 'ja', 'ko')

**Target Languages:**
- **Nigerian Languages:** 'yo' (Yoruba), 'ig' (Igbo), 'ha' (Hausa)
- **Other Languages:** 'en', 'fr', 'es', 'de', 'ru', 'zh', 'sw', etc.
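
NLLB-200 identifies languages with FLORES-200 codes such as `yor_Latn` rather than two-letter codes. How the v2 project maps between the two is not shown in this notebook, so the sketch below is only one assumed way to do it. `NLLB_CODES` and `to_nllb_code` are hypothetical names, and the table covers only the target codes listed above.

```python
# Two-letter codes from this notebook mapped to FLORES-200 codes used by NLLB-200
NLLB_CODES = {
    "en": "eng_Latn",
    "fr": "fra_Latn",
    "es": "spa_Latn",
    "de": "deu_Latn",
    "ru": "rus_Cyrl",
    "zh": "zho_Hans",  # assumption: 'zh' means Simplified Chinese
    "sw": "swh_Latn",
    "yo": "yor_Latn",
    "ig": "ibo_Latn",
    "ha": "hau_Latn",
}


def to_nllb_code(language):
    """Translate a two-letter code into its NLLB-200 code.

    Raises ValueError for codes that are not in the table above.
    """
    key = language.strip().lower()
    if key not in NLLB_CODES:
        known = ", ".join(sorted(NLLB_CODES))
        raise ValueError(f"Unknown language code {language!r}; known codes: {known}")
    return NLLB_CODES[key]


print(to_nllb_code("yo"))
```

Failing early on an unknown code is cheaper than discovering the problem after transcription has already run.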

### Notes:
- The first run downloads the models (NLLB-200 is about 2.5 GB, plus Whisper and Demucs)
- Processing time depends on video length and complexity
- A GPU is recommended but not required
- Output videos are saved as `<name>_dubbed<extension>`
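
To see whether the runtime will use a GPU, a check along these lines can be run after the install cell. It is a minimal sketch: `pick_device` is a hypothetical helper, and whether the v2 services accept a device argument is not shown in this notebook.

```python
def pick_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed yet (for example, before the install cell ran)
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Models would run on: {pick_device()}")
```

If this prints `cpu` on Colab, a GPU can usually be enabled under Runtime > Change runtime type.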