# 🎬 AI Shorts Generator - Complete Edition

Generate AI-powered short clips from long videos with both CLI and Gradio interfaces.

**⚠️ IMPORTANT: Run Setup Cells First!**
Before importing any libraries, you **MUST** run the setup cells below to install all required dependencies. If you skip this step, you'll get import errors like "ModuleNotFoundError: No module named 'pytubefix'".

**Setup Steps:**
1. Run every cell under "🚀 Setup: System Dependencies & Python Packages" (system packages, Python packages, GPU check)
2. Run the "📦 Import Required Libraries" cell
3. Then proceed with the rest of the notebook

**Features:**
- ✅ Upload videos or use YouTube URLs
- ✅ Optional SRT subtitle upload
- ✅ AI-powered highlight detection (OpenAI/Gemini)
- ✅ Automatic title generation
- ✅ Multiple aspect ratios (9:16, 16:9, 1:1)
- ✅ Face tracking or center crop modes
- ✅ Karaoke-style subtitle burning
- ✅ Watermark/logo overlay
- ✅ Batch processing
- ✅ Both CLI and GUI interfaces

**Run in:** Google Colab (recommended) or local Jupyter

## 🚀 Setup: System Dependencies & Python Packages

In [None]:
# Install system dependencies (for Colab)
!apt-get update -qq
!apt-get install -y -qq ffmpeg imagemagick fonts-freefont-ttf > /dev/null
!sed -i 's/<policy domain="path" rights="none" pattern="@\*" \/>/<\!-- <policy domain="path" rights="none" pattern="@\*" \/> --\>/g' /etc/ImageMagick-6/policy.xml || true

In [None]:
# Install Python packages
!pip install --upgrade pip wheel setuptools
# Quote version specifiers: an unquoted `<` is treated by the shell as input redirection
!pip install 'gradio==4.*' moviepy==2.2.1 imageio-ffmpeg
!pip install 'numpy<2.0' opencv-python-headless pytubefix pydub pysrt
!pip install faster-whisper google-generativeai 'openai>=1.35.0'

print("✅ All dependencies installed successfully!")
print("📦 Key packages installed:")
print("  - pytubefix: YouTube video downloading")
print("  - moviepy: Video processing and editing")
print("  - faster-whisper: AI transcription")
print("  - gradio: Web interface")
print("  - openai & google-generativeai: AI services")

In [None]:
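# Setup GPU support (Colab only)
import os
try:
    import torch  # preinstalled on Colab; the cells above do not install it
    has_gpu = torch.cuda.is_available()
except ImportError:
    has_gpu = False

if has_gpu:
    os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'
    print("✅ GPU support enabled")
else:
    print("⚠️ GPU not available, using CPU")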

## 📦 Import Required Libraries

In [None]:
import os, zipfile, sys
import numpy as np
from typing import List, Dict, Tuple, Optional
from pathlib import Path

# Video processing
try:
    from moviepy import VideoFileClip, TextClip, CompositeVideoClip, ImageClip
    from moviepy.video.fx import Crop as mp_crop
    print("✅ MoviePy imported successfully")
except ImportError as e:
    print(f"❌ MoviePy import failed: {e}")
    print("Please run the installation cell above first!")

# YouTube downloading
try:
    from pytubefix import YouTube
    print("✅ PyTubeFix imported successfully")
except ImportError as e:
    print(f"❌ PyTubeFix import failed: {e}")
    print("Please run the installation cell above first!")

# Audio processing
try:
    from faster_whisper import WhisperModel
    print("✅ FasterWhisper imported successfully")
except ImportError as e:
    print(f"❌ FasterWhisper import failed: {e}")
    print("Please run the installation cell above first!")

# AI services
try:
    from openai import OpenAI
    import google.generativeai as genai
    print("✅ AI services imported successfully")
except ImportError as e:
    print(f"❌ AI services import failed: {e}")
    print("Please run the installation cell above first!")

# Utilities
try:
    import cv2
    import gradio as gr
    print("✅ Utilities imported successfully")
except ImportError as e:
    print(f"❌ Utilities import failed: {e}")
    print("Please run the installation cell above first!")

print("\n🎯 All imports completed! Ready to proceed.")

## 🔧 Core Functions

In [None]:
# SRT/Subtitle utilities
def parse_srt_segments(path: str) -> List[Dict]:
    """Parse SRT file and return segments with timing"""
    import re
    with open(path, 'r', encoding='utf-8', errors='ignore') as f:
        content = f.read()
    
    blocks = re.split(r'\n\s*\n', content.strip())
    segs: List[Dict] = []
    
    for b in blocks:
        lines = [l.strip('\ufeff ') for l in b.splitlines() if l.strip()]
        if not lines: continue
        
        time_line = None
        for l in lines:
            if '-->' in l:
                time_line = l
                break
        if not time_line: continue
        
        try:
            t0, t1 = [x.strip() for x in time_line.split('-->')]
            def to_s(ts: str) -> float:
                h, m, rest = ts.split(':')
                s, ms = (rest + ',0').split(',')[:2]
                return int(h)*3600 + int(m)*60 + int(s) + int(ms)/1000
            st, et = to_s(t0), to_s(t1)
            if et > st:
                text = ' '.join([l for l in lines if l != time_line and not l.isdigit()])
                segs.append({'start': st, 'end': et, 'text': text})
        except Exception:
            pass
    return segs

def segs_to_text(segs: List[Dict]) -> str:
    """Convert segments to continuous text"""
    return ' '.join(s.get('text', '').strip() for s in segs)

def words_from_segs(segs: List[Dict]) -> List[Dict]:
    """Extract words with timing from segments"""
    import re
    out: List[Dict] = []
    for s in segs:
        if s.get('words'):
            out.append(s)
            continue
        text = s.get('text', '').strip()
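        # Split on whitespace so punctuation stays attached to its word
        words = text.split()
        dur = max(0.001, s['end'] - s['start'])
        n = max(1, len(words))
        step = dur / n
        wlist = []
        for i, w in enumerate(words):
            ws = s['start'] + i*step
            we = min(s['end'], ws + step)
            # Leading space mirrors faster-whisper's word format, so joined words stay separated
            wlist.append({'start': ws, 'end': we, 'text': (' ' if i else '') + w})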
        s2 = dict(s)
        s2['words'] = wlist
        out.append(s2)
    return out
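
The timing logic in the subtitle utilities above can be checked in isolation. The sketch below reimplements the two core steps, converting an SRT timestamp to seconds and spreading a segment's duration evenly across its words. The names `srt_to_seconds` and `spread_words` are illustrative and not part of the notebook, and whitespace tokenization is an assumption made for this sketch.

```python
from typing import Dict, List


def srt_to_seconds(ts: str) -> float:
    """Convert an SRT timestamp such as '00:01:02,500' to seconds."""
    h, m, rest = ts.strip().split(':')
    # Appending ',0' tolerates timestamps that omit the millisecond part
    s, ms = (rest + ',0').split(',')[:2]
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def spread_words(text: str, start: float, end: float) -> List[Dict]:
    """Give each whitespace-separated word an equal share of the segment."""
    words = text.split()
    if not words:
        return []
    step = (end - start) / len(words)
    return [
        {'start': start + i * step, 'end': start + (i + 1) * step, 'text': w}
        for i, w in enumerate(words)
    ]


print(srt_to_seconds('00:01:02,500'))
for w in spread_words('hello big world', 10.0, 13.0):
    print(w)
```

Even spacing is only an approximation of real speech timing; word-level timestamps from Whisper are preferable when they are available.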

In [None]:
# Video processing utilities
def aspect_tuple(s: str) -> Tuple[int, int]:
    """Convert aspect ratio string to tuple"""
    a, b = s.split(':')
    return int(a), int(b)

def compute_center_crop(w: int, h: int, ratio: str) -> Tuple[int, int, int, int]:
    """Calculate center crop dimensions"""
    aw, ah = aspect_tuple(ratio)
    tr = aw / ah
    sr = w / h
    if sr > tr:
        cw = int(h * tr)
        ch = h
    else:
        cw = w
        ch = int(w / tr)
    # Round down to even sizes: libx264 with yuv420p rejects odd dimensions
    cw -= cw % 2
    ch -= ch % 2
    x = (w - cw) // 2
    y = (h - ch) // 2
    return x, y, cw, ch

def crop_center(v: VideoFileClip, ratio: str) -> VideoFileClip:
    """Apply center crop to video"""
    x, y, cw, ch = compute_center_crop(v.w, v.h, ratio)
    # MoviePy 2.x: Crop is an effect class and is applied through with_effects()
    return v.with_effects([mp_crop(x1=x, y1=y, width=cw, height=ch)])

# Face detection and tracking
_HAAR: Optional[cv2.CascadeClassifier] = None

def _load_haar():
    """Load Haar cascade classifier"""
    global _HAAR
    if _HAAR is None:
        try:
            candidates = []
            haar_dir = getattr(cv2.data, 'haarcascades', '')
            if haar_dir:
                candidates.append(os.path.join(haar_dir, 'haarcascade_frontalface_default.xml'))
            candidates.append('haarcascade_frontalface_default.xml')
            candidates.append(os.path.join('models', 'haarcascade_frontalface_default.xml'))
            path = next((p for p in candidates if os.path.exists(p)), '')
            if path:
                _HAAR = cv2.CascadeClassifier(path)
            else:
                _HAAR = None
        except Exception:
            _HAAR = None

def detect_face(frame) -> Optional[Tuple[int, int, int, int]]:
    """Detect face in frame"""
    _load_haar()
    if _HAAR is None: return None
    # MoviePy frames are RGB (not OpenCV's default BGR)
    gray = cv2.cvtColor(frame, cv2.COLOR_RGB2GRAY)
    det = _HAAR.detectMultiScale(gray, 1.2, 3)
    if len(det) == 0: return None
    det = sorted(det, key=lambda d: d[2]*d[3], reverse=True)[0]
    return int(det[0]), int(det[1]), int(det[2]), int(det[3])

def crop_face_track(v: VideoFileClip, ratio: str, sample_fps: float = 4.0, smooth: float = 0.8) -> VideoFileClip:
    """Apply face tracking crop to video"""
    aw, ah = aspect_tuple(ratio)
    tr = aw / ah
    w, h = v.w, v.h
    sr = w / h
    if sr > tr:
        ch = h
        cw = int(h * tr)
    else:
        cw = w
        ch = int(w / tr)
    # Round down to even sizes: libx264 with yuv420p rejects odd dimensions
    cw -= cw % 2
    ch -= ch % 2
    duration = v.duration
    times = np.arange(0, duration, 1.0/max(1.0, sample_fps))
    path: List[Tuple[float, int, int]] = []
    prev = None
    for t in times:
        try:
            frame = v.get_frame(t)
            b = detect_face(frame)
            if b:
                x, y, bw, bh = b
                cx, cy = x + bw/2, y + bh/2
            else:
                cx, cy = prev if prev else (w/2, h/2)
            if prev is None:
                sx, sy = cx, cy
            else:
                sx = smooth*prev[0] + (1-smooth)*cx
                sy = smooth*prev[1] + (1-smooth)*cy
            prev = (sx, sy)
            x1 = max(0, min(w - cw, int(sx - cw/2)))
            y1 = max(0, min(h - ch, int(sy - ch/2)))
            path.append((t, x1, y1))
        except Exception:
            continue
    if not path:
        return crop_center(v, ratio)
    ts = [p[0] for p in path]
    xs = [p[1] for p in path]
    ys = [p[2] for p in path]

    def interp(series: List[int]) -> callable:
        def f(t: float) -> float:
            if t <= ts[0]: return float(series[0])
            if t >= ts[-1]: return float(series[-1])
            i = max(0, np.searchsorted(ts, t) - 1)
            t0, t1 = ts[i], ts[i+1]
            v0, v1 = series[i], series[i+1]
            if t1 == t0: return float(v1)
            a = (t - t0) / (t1 - t0)
            return float(v0*(1-a) + v1*a)
        return f

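    fx = interp(xs)
    fy = interp(ys)

    def crop_frame(get_frame, t):
        # The Crop effect takes fixed coordinates, so slice each frame at the
        # interpolated top-left corner instead
        frame = get_frame(t)
        x1, y1 = int(round(fx(t))), int(round(fy(t)))
        return frame[y1:y1 + ch, x1:x1 + cw]

    return v.transform(crop_frame)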
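
The smoothing step in `crop_face_track` is an exponential moving average over the detected face centers, with the last smoothed position reused when no face is found. The sketch below isolates that step on plain tuples so it can be run without OpenCV or MoviePy. The helper name `smooth_centers` and its `fallback` parameter are illustrative, not part of the notebook.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def smooth_centers(centers: List[Optional[Point]], fallback: Point,
                   smooth: float = 0.8) -> List[Point]:
    """Exponentially smooth detected centers; None means 'no face in this sample'."""
    out: List[Point] = []
    prev: Optional[Point] = None
    for c in centers:
        # Reuse the last smoothed position (or the fallback) when detection fails
        cx, cy = c if c is not None else (prev if prev is not None else fallback)
        if prev is None:
            sx, sy = cx, cy
        else:
            sx = smooth * prev[0] + (1 - smooth) * cx
            sy = smooth * prev[1] + (1 - smooth) * cy
        prev = (sx, sy)
        out.append(prev)
    return out


path = smooth_centers([(100.0, 50.0), None, (200.0, 50.0)],
                      fallback=(960.0, 540.0), smooth=0.5)
print(path)
```

A higher `smooth` value makes the crop window follow the face more slowly, which reduces jitter at the cost of lag.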

In [None]:
# AI-powered content analysis
def pick_highlights(transcription: str, provider: str, api_key: str, max_clips: int, min_len: int, max_len: int) -> List[Dict]:
    """Use AI to identify highlight segments from transcription"""
    import json
    
    # Named system_prompt so it does not shadow the imported sys module
    system_prompt = (
        f"You are an expert at finding viral video moments. Return up to {max_clips} segments between {min_len} and {max_len} seconds "
        "as JSON array with keys start,end,content. Only return JSON. If none, return []."
    )
    
    if provider == 'OpenAI':
        client = OpenAI(api_key=api_key)
        r = client.chat.completions.create(
            model='gpt-4o-mini', temperature=0.5,
            messages=[{'role':'system','content':system_prompt},{'role':'user','content':transcription}]
        )
        txt = r.choices[0].message.content
    else:
        genai.configure(api_key=api_key)
        m = genai.GenerativeModel('gemini-2.5-flash')
        txt = m.generate_content(system_prompt + '\n\n' + transcription).text
    
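    txt = (txt or '').strip()
    # Strip a Markdown code fence without deleting the word "json" inside the payload
    if txt.startswith('```'):
        txt = txt.strip('`').strip()
        if txt.lower().startswith('json'):
            txt = txt[4:].strip()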
    try:
        arr = json.loads(txt) if txt else []
    except Exception:
        arr = []
    
    out = []
    for h in arr:
        try:
            s = float(h.get('start', 0)); e = float(h.get('end', 0))
            if e > s and min_len <= e - s <= max_len:
                out.append({'start': s, 'end': e, 'content': h.get('content','')})
        except Exception:
            continue
    return out

def generate_titles_from_highlights(highs: List[Dict], provider: str, api_key: str) -> List[str]:
    """Generate titles for highlight segments using AI"""
    import json
    if not highs: return []
    
    prompt = 'Create ultra-short (<=40 chars), high-energy titles with emojis for these clip summaries. Return JSON array of strings only.\n' + \
             json.dumps([h.get('content','') for h in highs])
    
    try:
        if provider == 'OpenAI':
            client = OpenAI(api_key=api_key)
            r = client.chat.completions.create(model='gpt-4o-mini', temperature=0.7, messages=[{'role':'user','content':prompt}])
            txt = r.choices[0].message.content
        else:
            genai.configure(api_key=api_key)
            m = genai.GenerativeModel('gemini-2.5-flash')
            txt = m.generate_content(prompt).text
        
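        txt = (txt or '').strip()
        # Strip a Markdown code fence without deleting the word "json" inside the payload
        if txt.startswith('```'):
            txt = txt.strip('`').strip()
            if txt.lower().startswith('json'):
                txt = txt[4:].strip()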
        arr = json.loads(txt) if txt else []
        return [str(a)[:60] for a in arr]
    except Exception:
        return [h.get('content','Clip')[:40] for h in highs]
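
Both AI helpers above rely on the model returning a bare JSON array, yet replies often arrive wrapped in a Markdown code fence. The sketch below shows one way to unwrap such a reply without altering the payload. The name `parse_json_array` is hypothetical, and returning an empty list for anything that is not a JSON array is a design choice of this sketch.

```python
import json
import re
from typing import Any, List

FENCE = '`' * 3  # Markdown code-fence marker


def parse_json_array(reply: str) -> List[Any]:
    """Extract a JSON array from an LLM reply, tolerating a surrounding code fence."""
    text = (reply or '').strip()
    if text.startswith(FENCE):
        # Drop the opening fence plus an optional language tag such as "json"
        text = re.sub(r'^[a-zA-Z]*\s*', '', text[len(FENCE):])
        if text.endswith(FENCE):
            text = text[:-len(FENCE)]
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return []
    return data if isinstance(data, list) else []


reply = FENCE + 'json\n[{"start": 12.0, "end": 40.5, "content": "json tips"}]\n' + FENCE
print(parse_json_array(reply))
```

Only the fence and its language tag are removed, so clip summaries that mention "json" survive intact.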

In [None]:
# Main pipeline functions
def download_youtube(url: str) -> Optional[str]:
    """Download YouTube video and return path"""
    try:
        yt = YouTube(url)
        stream = (yt.streams.filter(progressive=True, file_extension='mp4').order_by('resolution').desc().first() or
                  yt.streams.filter(file_extension='mp4').order_by('resolution').desc().first())
        os.makedirs('videos', exist_ok=True)
        return stream.download(output_path='videos')
    except Exception:
        return None

def transcribe(video_path: str, logger=print):
    """Transcribe video using Whisper"""
    try:
        import torch
        device = 'cuda' if getattr(torch, 'cuda', None) and torch.cuda.is_available() else 'cpu'
    except Exception:
        device = 'cpu'
    
    model = WhisperModel('base.en', device=device, compute_type='float16' if device=='cuda' else 'int8')
    seg_iter, _ = model.transcribe(video_path, beam_size=5, language='en', word_timestamps=True)
    segs = []
    for s in seg_iter:
        words = []
        if getattr(s, 'words', None):
            for w in s.words:
                words.append({'start': float(w.start), 'end': float(w.end), 'text': w.word})
        segs.append({'start': float(s.start), 'end': float(s.end), 'text': s.text.strip(), 'words': words})
    return segs, segs_to_text(segs)

def add_title_overlay(video_path: str, out_path: str, title_text: str, platform: str = 'TikTok'):
    """Add title overlay to video"""
    with VideoFileClip(video_path) as v:
        w, h = v.w, v.h
        margin = int(0.10*h if platform == 'TikTok' else 0.08*h)
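        # MoviePy 2.x API: text= and font_size= keywords, with_* setters instead of set_*.
        # Fonts are resolved through Pillow, so a full .ttf path may be needed on some systems.
        txt = TextClip(text=title_text, font='FreeMono', font_size=max(36, int(h*0.05)), color='white', stroke_color='black', stroke_width=2)
        txt = txt.with_position(('center', margin)).with_duration(v.duration)
        CompositeVideoClip([v, txt]).write_videofile(out_path, codec='libx264', audio_codec='aac')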

def add_watermark(video_path: str, wm_path: str, out_path: str):
    """Add watermark to video"""
    with VideoFileClip(video_path) as v:
        # MoviePy 2.x API: with_duration / resized / with_position
        wm = (ImageClip(wm_path).with_duration(v.duration).resized(height=int(max(48, v.h*0.06))).with_position(('right','top')))
        CompositeVideoClip([v, wm]).write_videofile(out_path, codec='libx264', audio_codec='aac')

def write_ass_karaoke(segs: List[Dict], path: str, t0: float, t1: float, resolution: Tuple[int, int]):
    """Write ASS karaoke subtitle file"""
    import re
    W, H = resolution
    header = (
        "[Script Info]\n"
        "ScriptType: v4.00+\n"
        f"PlayResX: {W}\n"
        f"PlayResY: {H}\n"
        "ScaledBorderAndShadow: yes\n\n"
        "[V4+ Styles]\n"
        "Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding\n"
        "Style: Karaoke,Arial,60,&H00FFFFFF,&H0000FFFF,&H00000000,&H80000000,0,0,0,0,100,100,0,0,1,2,0,2,80,80,140,1\n\n"
        "[Events]\n"
        "Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text\n"
    )
    lines = [header]
    for s in segs:
        if s['end'] < t0 or s['start'] > t1: continue
        a = max(t0, s['start'])
        b = min(t1, s['end'])
        words = [w for w in s.get('words', []) if not (w['end'] < a or w['start'] > b)]
        if not words: continue
        parts = []
        for w in words:
            ws = max(a, w['start'])
            we = min(b, w['end'])
            k = max(1, int((we - ws) * 100))
            txt = re.sub(r'[{}\\\\]', '', w['text'])
            parts.append(f"{{\\k{k}}}{txt}")
        text = ''.join(parts)
        lines.append(
            f"Dialogue: 0,{_ass_ts(a-t0)},{_ass_ts(b-t0)},Karaoke,,0000,0000,0000,,{text}\n"
        )
    with open(path, 'w', encoding='utf-8') as f:
        f.writelines(lines)

def _ass_ts(t: float) -> str:
    """Convert time to ASS timestamp format"""
    h = int(t // 3600)
    m = int((t % 3600) // 60)
    s = int(t % 60)
    cs = int((t - int(t)) * 100)
    return f"{h:01d}:{m:02d}:{s:02d}.{cs:02d}"

def burn_ass_to_video(input_path: str, ass_path: str, output_path: str):
    """Burn ASS subtitles to video using ffmpeg"""
    import subprocess
    # Pass arguments as a list so no shell quoting is needed
    cmd = ['ffmpeg', '-y', '-i', input_path, '-vf', f'subtitles={ass_path}',
           '-c:a', 'aac', '-c:v', 'libx264', '-pix_fmt', 'yuv420p', output_path]
    subprocess.run(cmd, check=True)

def write_srt_for_range(segs: List[Dict], path: str, t0: float, t1: float):
    """Write SRT file for time range"""
    def _srt_ts(t: float) -> str:
        # Work in whole milliseconds so rounding can never produce ",1000"
        total_ms = int(round(max(0.0, t) * 1000))
        h, rem = divmod(total_ms, 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
    
    idx = 1
    lines = []
    for s in segs:
        if s['end'] <= t0 or s['start'] >= t1: continue
        a = max(t0, s['start'])
        b = min(t1, s['end'])
        text = s.get('text', '').strip()
        if not text: continue
        sa = _srt_ts(a - t0)
        sb = _srt_ts(b - t0)
        lines.append(f"{idx}\n{sa} --> {sb}\n{text}\n\n")
        idx += 1
    with open(path, 'w', encoding='utf-8') as f:
        f.writelines(lines)
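
The karaoke writer in this cell relies on the ASS `\k` tag, which takes a duration in centiseconds and controls how long each word stays highlighted. The sketch below builds one `Dialogue` line from word timings. The helper names `ass_timestamp` and `karaoke_text` are illustrative, and rounding to the nearest centisecond (rather than truncating) is a choice made for this sketch.

```python
from typing import Dict, List


def ass_timestamp(t: float) -> str:
    """Format seconds as an ASS timestamp (H:MM:SS.cc, centisecond precision)."""
    total_cs = int(round(max(0.0, t) * 100))
    h, rem = divmod(total_cs, 360_000)
    m, rem = divmod(rem, 6_000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def karaoke_text(words: List[Dict]) -> str:
    """Prefix each word with a karaoke tag holding its duration in centiseconds."""
    parts = []
    for w in words:
        k = max(1, int(round((w['end'] - w['start']) * 100)))
        parts.append(f"{{\\k{k}}}{w['text']}")
    return ''.join(parts)


words = [
    {'start': 0.0, 'end': 0.5, 'text': 'Hello'},
    {'start': 0.5, 'end': 1.25, 'text': ' world'},
]
line = (f"Dialogue: 0,{ass_timestamp(0.0)},{ass_timestamp(1.25)},"
        f"Karaoke,,0,0,0,,{karaoke_text(words)}")
print(line)
```

Because the tagged words are concatenated directly, each word after the first needs its own leading space, which is the format faster-whisper uses for word text.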

In [None]:
# Main pipeline function
def generate_pipeline(youtube_url, video_file, srt_file, provider, openai_key, gemini_key, 
                     min_len, max_len, max_clips, aspect, crop_mode, karaoke, export_srt, 
                     title_mode, custom_title, platform, out_prefix, watermark_file, 
                     seo_text='', logger=print):
    """Main pipeline to generate shorts from video"""
    
    # Get video path
    path = None
    if youtube_url:
        path = download_youtube(youtube_url)
        if path: logger(f"Downloaded YouTube -> {path}")
    if not path and video_file is not None:
        path = video_file.name
    if not path:
        logger('No video provided.')
        return None

    # Transcription
    if srt_file is not None:
        segs = parse_srt_segments(srt_file.name)
        text = segs_to_text(segs)
        segs = words_from_segs(segs)
    else:
        logger('Starting Whisper transcription...')
        segs, text = transcribe(path, logger)
    
    if not text:
        logger('Empty transcription')
        return None

    # AI highlight selection
    api_key = openai_key if provider == 'OpenAI' else gemini_key
    if not api_key:
        logger(f"Missing API key for {provider}. Please provide a valid key.")
        return None
    
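    logger(f'Finding highlights using {provider}...')
    # Send per-segment timestamps: plain text alone gives the model no basis for start/end times
    timed_text = '\n'.join(f"[{s['start']:.2f}-{s['end']:.2f}] {s.get('text', '').strip()}" for s in segs)
    highs = pick_highlights(timed_text, provider, api_key, int(max_clips), int(min_len), int(max_len))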
    if not highs:
        logger('No highlights found.')
        return None

    # Title generation
    if title_mode == 'Auto':
        logger('Generating titles...')
        titles = generate_titles_from_highlights(highs, provider, api_key)
    elif title_mode == 'Custom':
        titles = [custom_title or ''] * len(highs)
    else:
        titles = [''] * len(highs)

    out_pref = out_prefix or 'short'
    outputs: List[str] = []
    srt_outputs: List[str] = []

    # Process each highlight
    for i, h in enumerate(highs, start=1):
        s, e = float(h['start']), float(h['end'])
        clip_path = f"{out_pref}_{i}.mp4"
        
        with VideoFileClip(path) as v:
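            # MoviePy 2.x renamed subclip() to subclipped(); clamp the end to the source duration
            sub = v.subclipped(s, min(e, v.duration))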
            if crop_mode == 'Face-track':
                sub = crop_face_track(sub, aspect)
            else:
                sub = crop_center(sub, aspect)
            
            logger(f"Rendering clip {i}: {s:.2f}s to {e:.2f}s")
            sub.write_videofile(clip_path, codec='libx264', audio_codec='aac')

        # Export per-clip SRT
        if export_srt and segs:
            try:
                srt_path = f"{out_pref}_{i}.srt"
                write_srt_for_range(segs, srt_path, s, e)
                srt_outputs.append(srt_path)
            except Exception as ex:
                logger(f"SRT export failed for clip {i}: {ex}")

        cur = clip_path
        
        # Add karaoke subtitles
        if karaoke and segs:
            ass = f"{out_pref}_{i}.ass"
            # Match the subtitle canvas to the output aspect ratio, including 1:1
            res = {'9:16': (1080, 1920), '1:1': (1080, 1080)}.get(aspect, (1920, 1080))
            write_ass_karaoke(segs, ass, s, e, res)
            kara = f"{out_pref}_{i}_karaoke.mp4"
            try:
                burn_ass_to_video(cur, ass, kara)
                cur = kara
            except Exception as ex:
                logger(f"Karaoke burn failed: {ex}")

        # Add title overlay
        ttl = titles[i-1] if i-1 < len(titles) else ''
        if ttl:
            ttl_out = f"{out_pref}_{i}_title.mp4"
            try:
                add_title_overlay(cur, ttl_out, ttl, platform)
                cur = ttl_out
            except Exception as ex:
                logger(f"Title overlay failed: {ex}")

        # Add watermark
        if watermark_file is not None:
            wm_out = f"{out_pref}_{i}_wm.mp4"
            try:
                add_watermark(cur, watermark_file.name, wm_out)
                cur = wm_out
            except Exception as ex:
                logger(f"Watermark failed: {ex}")

        outputs.append(cur)

    # Write SEO/description if provided
    if seo_text:
        try:
            with open(f"{out_pref}_description.txt", 'w', encoding='utf-8') as f:
                f.write(seo_text.strip() + "\n")
        except Exception:
            pass

    # Create ZIP file
    zip_path = f"{out_pref}_results.zip"
    with zipfile.ZipFile(zip_path, 'w') as z:
        for f in outputs:
            if os.path.exists(f):
                z.write(f)
        
        for srt in srt_outputs:
            if os.path.exists(srt):
                z.write(srt)
        
        if export_srt and segs:
            with open('transcription.txt','w',encoding='utf-8') as f:
                f.write(text)
            z.write('transcription.txt')
        
        if seo_text and os.path.exists(f"{out_pref}_description.txt"):
            z.write(f"{out_pref}_description.txt")
    
    return zip_path

## 🎛️ CLI Interface

In [None]:
import argparse

def parse_args():
    """Parse command line arguments"""
    p = argparse.ArgumentParser(description="AI Shorts Generator - CLI runner")

    src = p.add_mutually_exclusive_group(required=True)
    src.add_argument("--youtube-url", type=str, help="YouTube video URL to download and process")
    src.add_argument("--video-file", type=str, help="Local video file path to process")

    p.add_argument("--srt-file", type=str, help="Optional SRT file to skip transcription and use its timing/text")

    p.add_argument("--provider", choices=["OpenAI", "Gemini"], default="OpenAI",
                   help="LLM provider to use for highlight selection and title generation")
    p.add_argument("--openai-key", type=str, default=os.getenv("OPENAI_API_KEY", ""),
                   help="OpenAI API key (fallback to env OPENAI_API_KEY)")
    p.add_argument("--gemini-key", type=str, default=os.getenv("GEMINI_API_KEY", ""),
                   help="Gemini API key (fallback to env GEMINI_API_KEY)")

    p.add_argument("--min-len", type=float, default=15, help="Minimum clip length (seconds)")
    p.add_argument("--max-len", type=float, default=60, help="Maximum clip length (seconds)")
    p.add_argument("--max-clips", type=int, default=5, help="Maximum number of clips to generate")

    p.add_argument("--aspect", choices=["9:16", "16:9", "1:1"], default="9:16",
                   help="Target aspect ratio for output clips")
    p.add_argument("--crop-mode", choices=["Center", "Face-track"], default="Center",
                   help="Cropping mode: simple center crop or face tracking where possible")

    p.add_argument("--karaoke", action="store_true", help="Burn karaoke-style subtitles into the clips")
    p.add_argument("--export-srt", action="store_true", help="Export per-clip SRT files alongside clips")

    p.add_argument("--title-mode", choices=["Auto", "Custom", "None"], default="Auto",
                   help="How to set titles for the clips")
    p.add_argument("--custom-title", type=str, default="", help="Custom title text if --title-mode=Custom")
    p.add_argument("--platform", choices=["TikTok", "YouTube", "Instagram"], default="TikTok",
                   help="Platform to adjust title overlay layout slightly")

    p.add_argument("--watermark", type=str, help="Path to watermark/logo image to overlay")

    p.add_argument("--out-prefix", type=str, default="short", help="Prefix for output files")

    seo = p.add_mutually_exclusive_group(required=False)
    seo.add_argument("--seo-text", type=str, default="", help="Optional SEO/description text to include in zip")
    seo.add_argument("--seo-text-file", type=str, help="Path to a text file with SEO/description content")

    return p.parse_args()

class NamedPath:
    """Lightweight wrapper to mimic Gradio's uploaded file objects"""
    def __init__(self, path: Optional[str]):
        self.name = path if path else ''

def run_cli():
    """Run the CLI interface"""
    args = parse_args()

    # Setup parameters
    youtube_url = args.youtube_url or None
    video_file = NamedPath(args.video_file) if args.video_file else None
    srt_file = NamedPath(args.srt_file) if args.srt_file else None
    watermark_file = NamedPath(args.watermark) if args.watermark else None

    # Load SEO text from file if provided
    seo_text = args.seo_text or ""
    if args.seo_text_file and os.path.exists(args.seo_text_file):
        try:
            with open(args.seo_text_file, "r", encoding="utf-8", errors="ignore") as f:
                seo_text = f.read()
        except Exception:
            pass

    # Run pipeline
    zip_path = generate_pipeline(
        youtube_url=youtube_url,
        video_file=video_file,
        srt_file=srt_file,
        provider=args.provider,
        openai_key=args.openai_key,
        gemini_key=args.gemini_key,
        min_len=args.min_len,
        max_len=args.max_len,
        max_clips=args.max_clips,
        aspect=args.aspect,
        crop_mode=args.crop_mode,
        karaoke=args.karaoke,
        export_srt=args.export_srt,
        title_mode=args.title_mode,
        custom_title=args.custom_title,
        platform=args.platform,
        out_prefix=args.out_prefix,
        watermark_file=watermark_file,
        seo_text=seo_text,
        logger=print,
    )

    if zip_path:
        print(f"\n✅ Success! Results saved to: {zip_path}")
    else:
        print("\n❌ Pipeline did not produce results. Check logs above for details.")

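# Example CLI usage. Python cannot run an .ipynb directly: export the notebook to a script
# first (e.g. `jupyter nbconvert --to script`), drop the setup cells, and call run_cli().
# python AI_Shorts_Generator_Complete.py --video-file your_video.mp4 --provider OpenAI --openai-key YOUR_KEY --min-len 15 --max-len 60 --max-clips 3 --aspect 9:16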

## 🌐 Gradio Web Interface

In [None]:
def create_gradio_interface():
    """Create Gradio web interface"""
    
    def run_gradio_ui(youtube_url, video_file, srt_file, provider, openai_key, gemini_key, 
                      target_len, tol, max_clips, aspect, crop_mode, karaoke, export_srt, 
                      title_mode, custom_title, platform, out_prefix, watermark_file, seo_text):
        """Gradio interface handler"""
        logs_buf = []
        
        def log(m):
            logs_buf.append(str(m))
        
        min_len = max(5, int(target_len) - int(tol))
        max_len = int(target_len) + int(tol)
        
        try:
            zip_path = generate_pipeline(
                youtube_url, video_file, srt_file, provider, openai_key, gemini_key, 
                min_len, max_len, int(max_clips), aspect, crop_mode, bool(karaoke), 
                bool(export_srt), title_mode, custom_title, platform, out_prefix, 
                watermark_file, seo_text, logger=log
            )
            log('Done.' if zip_path else 'Failed to generate.')
            return zip_path, '\n'.join(logs_buf)
        except Exception as ex:
            log(f'Error: {ex}')
            return None, '\n'.join(logs_buf)
    
    # Create interface
    with gr.Blocks(title="AI Shorts Generator") as demo:
        gr.Markdown("# 🎬 AI Shorts Generator - Complete Edition")
        gr.Markdown("Generate AI-powered short clips from long videos with both CLI and GUI interfaces.")
        
        with gr.Row():
            with gr.Column(scale=1):
                youtube_url = gr.Textbox(label='YouTube URL (optional)', placeholder='https://www.youtube.com/watch?v=...')
                video_file = gr.File(label='Or upload a video file', file_types=['video'])
                srt_file = gr.File(label='Upload SRT subtitles (optional)', file_types=['.srt'])
                
                provider = gr.Dropdown(['OpenAI','Gemini'], label='AI Provider', value='OpenAI')
                openai_key = gr.Textbox(label='OpenAI API Key', type='password', visible=True, 
                                      placeholder='sk-...')
                gemini_key = gr.Textbox(label='Gemini API Key', type='password', visible=False,
                                       placeholder='your-gemini-key')
                
                with gr.Row():
                    target_len = gr.Slider(10, 120, value=60, step=1, label='Target short length (s)')
                    tol = gr.Slider(2, 30, value=10, step=1, label='Length tolerance ± (s)')
                max_clips = gr.Slider(1, 10, value=3, step=1, label='Maximum number of clips')
                
                aspect = gr.Dropdown(['9:16','16:9','1:1'], value='9:16', label='Aspect ratio')
                crop_mode = gr.Dropdown(['Center','Face-track'], value='Face-track', label='Crop mode')
                
                with gr.Row():
                    karaoke = gr.Checkbox(label='Burn karaoke subtitles', value=True)
                    export_srt = gr.Checkbox(label='Export SRT files', value=True)
                
                title_mode = gr.Dropdown(['Auto','Custom','None'], value='Auto', label='Title overlay mode')
                custom_title = gr.Textbox(label='Custom title (if Custom mode)', 
                                        placeholder='Enter your custom title...')
                platform = gr.Dropdown(['TikTok','YouTube','Instagram'], value='TikTok', label='Platform')
                
                out_prefix = gr.Textbox(label='Output name prefix', value='short')
                watermark_file = gr.File(label='Watermark image (optional)', file_types=['image'])
                seo_text = gr.Textbox(label='SEO description (optional)', lines=3, 
                                    placeholder='Enter description for your clips...')
                
                go = gr.Button('🚀 Generate Shorts', variant='primary')
            
            with gr.Column(scale=1):
                logs = gr.Textbox(label='Process logs', lines=20, interactive=False)
                out_zip = gr.File(label='📦 Download Results ZIP')
        
        def toggle_provider(p):
            return (gr.update(visible=p=='OpenAI'), gr.update(visible=p=='Gemini'))
        
        provider.change(toggle_provider, provider, [openai_key, gemini_key])
        go.click(run_gradio_ui, 
                [youtube_url, video_file, srt_file, provider, openai_key, gemini_key, 
                 target_len, tol, max_clips, aspect, crop_mode, karaoke, export_srt, 
                 title_mode, custom_title, platform, out_prefix, watermark_file, seo_text], 
                [out_zip, logs])
    
    return demo

# Launch Gradio interface
print("🎯 Gradio interface defined. Uncomment the two lines below to launch it.")
print("📝 The CLI (run_cli) requires exporting the notebook to a .py script first.")

# Uncomment to launch Gradio (will show public link in Colab)
# demo = create_gradio_interface()
# demo.launch(debug=True, share=True)

## 📝 Usage Instructions

### 🎯 **Option 1: Gradio Web Interface (Easiest)**
1. Uncomment the last 2 lines in the Gradio section above (`demo = create_gradio_interface()` and `demo.launch(...)`)
2. Run the cell
3. Click the public link that appears
4. Upload your video and configure settings
5. Click "Generate Shorts" and download results
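
In the web interface, the "Target short length" and "Length tolerance" sliders are turned into a minimum and maximum clip length, and highlights outside that window are discarded. The sketch below mirrors that arithmetic. The function names `length_window` and `keep_in_window` are illustrative, not part of the notebook.

```python
from typing import Dict, List, Tuple


def length_window(target_len: float, tol: float, floor: int = 5) -> Tuple[int, int]:
    """Turn a target length and a tolerance into (min_len, max_len) in seconds."""
    min_len = max(floor, int(target_len) - int(tol))
    max_len = int(target_len) + int(tol)
    return min_len, max_len


def keep_in_window(clips: List[Dict], window: Tuple[int, int]) -> List[Dict]:
    """Keep only clips whose duration falls inside the window (inclusive)."""
    lo, hi = window
    return [c for c in clips if lo <= c['end'] - c['start'] <= hi]


window = length_window(60, 10)
clips = [{'start': 0.0, 'end': 55.0}, {'start': 100.0, 'end': 120.0}]
print(window, keep_in_window(clips, window))
```

With the default sliders (60 s target, 10 s tolerance) only clips between 50 and 70 seconds are kept, so a narrow tolerance can easily filter out every suggestion.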

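### 💻 **Option 2: CLI Interface (Advanced)**
Python cannot execute an `.ipynb` file directly. Export the notebook to a script first, remove the setup cells (the `!apt-get` and `!pip` lines), and add a call to `run_cli()` at the end of the exported file. The commands below assume the export is named `AI_Shorts_Generator_Complete.py`.
```bash
# One-time export of the notebook to a Python script
jupyter nbconvert --to script AI_Shorts_Generator_Complete.ipynb

# Basic usage with local video
python AI_Shorts_Generator_Complete.py \
  --video-file your_video.mp4 \
  --provider OpenAI \
  --openai-key YOUR_OPENAI_KEY \
  --min-len 15 \
  --max-len 60 \
  --max-clips 3 \
  --aspect 9:16

# With YouTube URL
python AI_Shorts_Generator_Complete.py \
  --youtube-url "https://www.youtube.com/watch?v=..." \
  --provider Gemini \
  --gemini-key YOUR_GEMINI_KEY \
  --karaoke \
  --export-srt
```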
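
The input-source rule (exactly one of `--youtube-url` or `--video-file`) can be tried without a shell by handing an argument list to the parser. The reduced parser below is a sketch covering only a few of the flags. `build_parser` and `parse` are hypothetical helpers, not functions from the notebook.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Reduced version of the CLI parser with only a handful of flags."""
    p = argparse.ArgumentParser(description="Reduced AI Shorts CLI (sketch)")
    src = p.add_mutually_exclusive_group(required=True)
    src.add_argument("--youtube-url", type=str)
    src.add_argument("--video-file", type=str)
    p.add_argument("--min-len", type=float, default=15)
    p.add_argument("--max-len", type=float, default=60)
    p.add_argument("--karaoke", action="store_true")
    return p


def parse(argv: Optional[List[str]] = None) -> argparse.Namespace:
    # Passing argv explicitly avoids picking up the Jupyter kernel's own arguments
    return build_parser().parse_args(argv)


args = parse(["--video-file", "your_video.mp4", "--max-len", "45", "--karaoke"])
print(args.video_file, args.min_len, args.max_len, args.karaoke)
```

Supplying both sources, or neither, makes `argparse` print a usage error and exit, which is why the pipeline never has to choose between them in CLI mode.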

### ⚙️ **Configuration Options:**
- **Video Input**: Upload file or provide YouTube URL
- **AI Provider**: OpenAI (gpt-4o-mini) or Google Gemini (2.5-flash)
- **Clip Length**: Target of 10-120 seconds per clip, with a tolerance of ±2-30 seconds
- **Aspect Ratio**: 9:16 (TikTok), 16:9 (YouTube), 1:1 (Instagram)
- **Crop Mode**: Center crop or face tracking
- **Subtitles**: Optional karaoke-style burning
- **Titles**: Auto-generated, custom, or none
- **Watermark**: Optional logo overlay
- **Export**: SRT files, transcription, SEO text
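
To make the aspect-ratio and center-crop options concrete, the sketch below computes the crop box for a 1920×1080 source at each supported ratio. It uses integer arithmetic and rounds sizes down to even numbers. Both are assumptions of this sketch (even dimensions are generally required for yuv420p H.264 output), and `center_crop_box` is an illustrative name.

```python
from typing import Tuple


def center_crop_box(w: int, h: int, ratio: str) -> Tuple[int, int, int, int]:
    """Return (x, y, crop_w, crop_h) for the largest centered box with the given ratio."""
    aw, ah = (int(p) for p in ratio.split(':'))
    if w * ah > h * aw:
        # Source is wider than the target: keep full height, trim the sides
        crop_w, crop_h = h * aw // ah, h
    else:
        # Source is as tall as or taller than the target: keep full width, trim top and bottom
        crop_w, crop_h = w, w * ah // aw
    # Assumption: round down to even sizes for yuv420p-friendly output
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    return (w - crop_w) // 2, (h - crop_h) // 2, crop_w, crop_h


for ratio in ('9:16', '16:9', '1:1'):
    print(ratio, center_crop_box(1920, 1080, ratio))
```

For a landscape source, the 9:16 crop keeps only about a third of the frame width, which is why face tracking is often the better crop mode for talking-head footage.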

### 🎬 **Output Includes:**
- Multiple MP4 short clips
- Per-clip SRT subtitle files (optional)
- Full transcription text (when SRT export is enabled)
- SEO description (if provided)
- All files in a ZIP download

### 🔑 **API Keys Setup:**
- **OpenAI**: Get from [platform.openai.com](https://platform.openai.com)
- **Gemini**: Get from [aistudio.google.com](https://aistudio.google.com)
- Keys are required for highlight detection and title generation
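
Rather than pasting keys into cells, they can be read from the `OPENAI_API_KEY` and `GEMINI_API_KEY` environment variables that the CLI already uses as fallbacks. The sketch below shows one way to resolve a key and fail early when none is set. `resolve_api_key` is a hypothetical helper, and its `env` parameter exists only to make the example testable.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(provider: str, explicit: Optional[str] = None,
                    env: Mapping[str, str] = os.environ) -> str:
    """Prefer an explicitly supplied key, else fall back to the provider's env var."""
    env_names = {'OpenAI': 'OPENAI_API_KEY', 'Gemini': 'GEMINI_API_KEY'}
    if provider not in env_names:
        raise ValueError(f"Unknown provider: {provider}")
    key = (explicit or env.get(env_names[provider], '')).strip()
    if not key:
        raise RuntimeError(f"Missing API key for {provider}; set {env_names[provider]}")
    return key


print(resolve_api_key('Gemini', env={'GEMINI_API_KEY': 'demo-key'}))
```

Failing before transcription starts avoids spending minutes on Whisper only to stop at the highlight step.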

**Ready to create amazing shorts! 🎬✨**