<a href="https://colab.research.google.com/github/thedronemenace/media-tools/blob/main/Clip_Factory_Bot_Autopilot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🔥 Clip Factory Bot — Autopilot (TikTok + Instagram)
**What this notebook does (hands‑free content factory):**
1. Mounts your Google Drive
2. Pulls safe trending sources (podcasts, interviews, speeches) via `yt-dlp`
3. Auto-cuts **10 clips/day** (default) into **20–30s vertical (9:16)** videos
4. Generates **subtitles** with Whisper and **burns them** into each clip
5. Adds your **watermark** (e.g., `@thedronemenace`)
6. Auto-writes a caption + **rotating hashtags** (5 for TikTok, 20+ for Instagram) into a `.txt` next to each video
7. Saves everything to your Drive folder: **`Autopost_Clips`**

> Once you connect this Drive folder to your scheduler/emulator workflow, your posts can go out automatically.

In [1]:
# @title STEP 0 — Install dependencies (5–7 min on first run)
# This cell installs: ffmpeg, yt-dlp, OpenAI Whisper (for subtitles)

!apt-get -y update >/dev/null
!apt-get -y install ffmpeg >/dev/null

!pip -q install yt-dlp ffmpeg-python openai-whisper numpy pandas >/dev/null

print("✅ Dependencies installed.")

W: Skipping acquire of configured file 'main/source/Sources' as repository 'https://r2u.stat.illinois.edu/ubuntu jammy InRelease' does not seem to provide it (sources.list entry misspelt?)
✅ Dependencies installed.


In [2]:
# @title STEP 1 — Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')
print("✅ Drive mounted. You will see 'drive/MyDrive' available.")

Mounted at /content/drive
✅ Drive mounted. You will see 'drive/MyDrive' available.


In [3]:
# @title STEP 2 — Configure your bot (edit the fields then run)
HANDLE = "@thedronemenace"  # @title Your watermark handle (e.g., @thedronemenace)
OUTPUT_DIR = "Autopost_Clips"  # @title Drive folder to save clips to
CLIPS_PER_DAY = 10  # @title How many clips to produce per run (recommended: 10)
MIN_CLIP_SEC = 20   # @title Min clip length (seconds)
MAX_CLIP_SEC = 30   # @title Max clip length (seconds)

# Safe viral-ish sources (podcast/interview/news channels).
# You can add/remove links. The bot will pull the latest videos from these.
SAFE_SOURCES = [
  "https://www.youtube.com/@PowerfulJRE/videos",
  "https://www.youtube.com/@lexfridman/videos",
  "https://www.youtube.com/@TED/videos",
  "https://www.youtube.com/@valutainment/videos",
  "https://www.youtube.com/@impacttheory/videos",
  "https://www.youtube.com/@PBDPodcast/videos",
  "https://www.youtube.com/@theDiaryOfACEO/videos"
]

print(f"✅ Config set: {CLIPS_PER_DAY} clips/run → Drive/{OUTPUT_DIR} | Watermark: {HANDLE}")

✅ Config set: 10 clips/run → Drive/Autopost_Clips | Watermark: @thedronemenace


In [4]:
# @title STEP 3 — Prepare workspace folders
import os

WORK_DIR = "/content/work"
RAW_DIR = os.path.join(WORK_DIR, "raw")
CUT_DIR = os.path.join(WORK_DIR, "cut")
SUB_DIR = os.path.join(WORK_DIR, "subs")
OUT_DIR = "/content/drive/MyDrive/" + OUTPUT_DIR

for d in (WORK_DIR, RAW_DIR, CUT_DIR, SUB_DIR, OUT_DIR):
    os.makedirs(d, exist_ok=True)

print("✅ Folders ready:")
print("RAW  →", RAW_DIR)
print("CUT  →", CUT_DIR)
print("SUBS →", SUB_DIR)
print("OUT  →", OUT_DIR)

✅ Folders ready:
RAW  → /content/work/raw
CUT  → /content/work/cut
SUBS → /content/work/subs
OUT  → /content/drive/MyDrive/Autopost_Clips


In [5]:
# @title STEP 4 — Fetch latest videos from safe sources (yt-dlp)
import os, subprocess
from glob import glob

MAX_VIDS_PER_SOURCE = 3  # up to 3 per source

def download_from_source(url):
    cmd = [
        "yt-dlp",
        "--no-warnings",
        "--ignore-errors",
        "--dateafter", "now-1month",
        "-f", "bv*[ext=mp4]+ba[ext=m4a]/b[ext=mp4]/b",
        "-o", os.path.join(RAW_DIR, "%(uploader)s__%(title).80B__%(id)s.%(ext)s"),
        "--max-downloads", str(MAX_VIDS_PER_SOURCE),
        url
    ]
    print("→ Grabbing:", url)
    subprocess.run(cmd, check=False)

for src in SAFE_SOURCES:
    try:
        download_from_source(src)
    except Exception as e:
        print("Failed:", src, e)

raw_files = sorted(glob(os.path.join(RAW_DIR, "*.mp4")))
print(f"✅ Downloaded raw videos: {len(raw_files)}")

→ Grabbing: https://www.youtube.com/@PowerfulJRE/videos
→ Grabbing: https://www.youtube.com/@lexfridman/videos
→ Grabbing: https://www.youtube.com/@TED/videos
→ Grabbing: https://www.youtube.com/@valutainment/videos
→ Grabbing: https://www.youtube.com/@impacttheory/videos
→ Grabbing: https://www.youtube.com/@PBDPodcast/videos
→ Grabbing: https://www.youtube.com/@theDiaryOfACEO/videos
✅ Downloaded raw videos: 12


In [None]:
# @title STEP 5 — Cut 20–30s vertical segments and generate subtitles
import os, subprocess, random, json
from glob import glob
from datetime import datetime

def get_duration(path):
    cmd = [
        "ffprobe", "-v", "error", "-print_format", "json", "-show_format", "-show_streams", path
    ]
    out = subprocess.check_output(cmd).decode("utf-8", "ignore")
    info = json.loads(out)
    dur = float(info["format"]["duration"])
    return dur

def pick_segment(duration, min_sec, max_sec):
    length = random.randint(min_sec, max_sec)
    if duration <= length + 2:
        return 0, min(duration-1, length)
    import random as _r
    start = _r.uniform(0, max(0, duration - length - 1))
    return start, length

made = 0
random.shuffle(raw_files)

for src in raw_files:
    if made >= CLIPS_PER_DAY:
        break
    try:
        dur = get_duration(src)
        ss, ll = pick_segment(dur, MIN_CLIP_SEC, MAX_CLIP_SEC)

        base = os.path.splitext(os.path.basename(src))[0]
        ts = datetime.utcnow().strftime("%Y%m%d_%H%M%S")
        cut_path = os.path.join("/content/work/cut", f"{base}__{ts}.mp4")

        # 1) Cut a 20-30s chunk
        subprocess.run([
            "ffmpeg", "-y",
            "-ss", str(ss), "-t", str(ll),
            "-i", src,
            "-c:v", "libx264", "-c:a", "aac",
            "-movflags", "+faststart",
            cut_path
        ], check=False, stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)

        # 2) Generate subtitles with Whisper
        import whisper
        model = whisper.load_model("small")
        result = model.transcribe(cut_path, fp16=False, language='en')

        # Write SRT
        try:
            import srt
        except:
            !pip -q install srt >/dev/null
            import srt
        import datetime as dt
        subs = []
        for i, seg in enumerate(result.get("segments", [])):
            start = dt.timedelta(seconds=max(seg.get("start", 0), 0))
            end = dt.timedelta(seconds=max(seg.get("end", 0), 0))
            subs.append(srt.Subtitle(index=i+1, start=start, end=end, content=str(seg.get("text", "")).strip()))
        srt_path = os.path.join("/content/work/subs", os.path.splitext(os.path.basename(cut_path))[0] + ".srt")
        with open(srt_path, "w", encoding="utf-8") as f:
            f.write(srt.compose(subs))

        # 3) Make vertical 9:16, burn subs, add watermark
        out_name = os.path.basename(cut_path).replace(".mp4", "__9x16_" + HANDLE[1:] + ".mp4")
        out_path = os.path.join("/content/drive/MyDrive/" + OUTPUT_DIR, out_name)

        vf = (
            "scale=-2:1920:flags=lanczos,"
            "crop=1080:1920,"
            "subtitles='" + srt_path.replace("'", "\\'") + "':force_style='Fontsize=24,Outline=2,Shadow=1',"
            "drawtext=fontfile=/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf:"
            "text='" + HANDLE.replace(":", "\\:") + "':x=(w-tw-40):y=(h-th-40):fontsize=28:box=1:boxcolor=black@0.4:boxborderw=10"
        )

        subprocess.run([
            "ffmpeg", "-y", "-i", cut_path,
            "-vf", vf,
            "-c:v", "libx264", "-preset", "veryfast", "-crf", "23",
            "-c:a", "aac", "-b:a", "128k",
            "-movflags", "+faststart",
            out_path
        ], check=False, stdout=subprocess.DEVNULL, stderr=subprocess.STDOUT)

        print("✅ Created:", out_path)
        made += 1
    except Exception as e:
        print("Skip due to error:", e)
        continue

print(f"✅ Finished: {made} vertical clips saved to Drive/{OUTPUT_DIR}")

  ts = datetime.utcnow().strftime("%Y%m%d_%H%M%S")
100%|███████████████████████████████████████| 461M/461M [00:14<00:00, 34.1MiB/s]


✅ Created: /content/drive/MyDrive/Autopost_Clips/The Diary Of A CEO__World No.1 Fasting Expert： The Link Between Cancer & Fasting That They're Hiding__jDG1m_b5Ih0__20250902_174725__9x16_thedronemenace.mp4


  ts = datetime.utcnow().strftime("%Y%m%d_%H%M%S")


✅ Created: /content/drive/MyDrive/Autopost_Clips/PBD Podcast__Benjamin Netanyahu ADMITS Genocide, Slams AIPAC Critics & Trump Owning Gaza ｜ PB__0nsgCE4HC0U__20250902_175247__9x16_thedronemenace.mp4


  ts = datetime.utcnow().strftime("%Y%m%d_%H%M%S")


In [None]:
# @title STEP 6 — Auto-captions + rotating hashtags (creates .txt next to each video)
import os, re, random
from glob import glob
OUT_DIR = "/content/drive/MyDrive/" + OUTPUT_DIR

HASHTAGS_TT_BUCKETS = [
    ["#fyp", "#viral", "#trending", "#xyzbca", "#foryou"],
    ["#viralvideo", "#reels", "#tiktok", "#explore", "#mustwatch"],
    ["#motivation", "#podcast", "#interview", "#mindset", "#success"],
    ["#wow", "#shocking", "#insane", "#waitforit", "#watchtillend"],
]

HASHTAGS_IG = [
    "#reels", "#reelsinstagram", "#reelitfeelit", "#explorepage", "#viral",
    "#trending", "#instadaily", "#discover", "#motivation", "#mindset",
    "#inspiration", "#podcast", "#interview", "#clips", "#dailyreels",
    "#mustwatch", "#foryou", "#wow", "#insane", "#watchtillend"
]

def slugify(t):
    t = re.sub(r"[^\w\s-]", "", t)
    t = re.sub(r"\s+", " ", t).strip()
    return t

mp4s = sorted(glob(os.path.join(OUT_DIR, "*.mp4")))
for vid in mp4s:
    base = os.path.splitext(os.path.basename(vid))[0]
    txt_path = os.path.join(OUT_DIR, base + ".txt")

    # simple caption from filename
    cap_core = slugify(base.split("__")[0]).replace("_", " ")
    hook = random.choice([
        "Wait for it…", "You won’t believe this.", "This changed my mind.",
        "Unreal moment.", "What do you think?"
    ])
    tt_tags = random.choice(HASHTAGS_TT_BUCKETS)
    ig_tags = HASHTAGS_IG

    caption = hook + "\n\n" + HANDLE + "\n\n" + \
              "TikTok: " + " ".join(tt_tags) + "\n" + \
              "Instagram: " + " ".join(ig_tags)

    with open(txt_path, "w", encoding="utf-8") as f:
        f.write(caption)

print("✅ Captions written for", len(mp4s), "videos. (.txt files next to each .mp4)")

## ✅ Done
- Your clips are in **`Drive/MyDrive/Autopost_Clips`** with matching `.txt` captions.
- Pair this with your **emulator bot** (Bluestacks macro) to auto-post on schedule.

### Notes
- First run can take longer while Whisper downloads a model.
- You can switch `model = whisper.load_model("small")` to `'tiny'` for faster (lower quality) or `'medium'/'large'` for higher quality.
- Add/remove `SAFE_SOURCES` to tune the niche.
- To run daily on autopilot: Use Colab’s “Schedule cell execution” or a simple desktop automation to open Colab and click *Run all* each day.