Markiplier Archive Pipeline

A local media pipeline that discovers, catalogs, verifies, and downloads Markiplier content that has been deleted or removed from his active YouTube channel. It then serves that content through a local Plex media server, viewable on Amazon Fire TV and in any browser on the local network.

The Golden Rule

Cross-reference everything against his live channel first. If a video exists on Markiplier's active YouTube channel right now, skip it entirely. Only surface and download content that is confirmed absent from his live channel.
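In code, the rule reduces to a single existence check against the live-inventory table before anything is queued. A minimal sketch, assuming a live_videos table with a video_id column (illustrative names, not necessarily the project's actual schema):

import sqlite3

def is_on_live_channel(conn: sqlite3.Connection, video_id: str) -> bool:
    # live_videos is rebuilt on every inventory refresh (Phase 1)
    row = conn.execute(
        "SELECT 1 FROM live_videos WHERE video_id = ?", (video_id,)
    ).fetchone()
    return row is not None

# Discovered items only proceed to verification and download when this is False.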

Tech Stack

  • Runtime: Python 3.11+
  • Downloading: yt-dlp
  • Media Server: Plex (Plex Pass)
  • Database: SQLite via SQLAlchemy
  • Web Dashboard: FastAPI + HTML/JS frontend
  • APIs: YouTube Data API v3, Internet Archive API
  • Package Management: Homebrew + pip
  • Target Machine: Mac mini, 1TB storage, macOS

Quick Start

One-Line Setup

chmod +x setup.sh && ./setup.sh

The setup script will:

  1. Check for Homebrew, install if missing
  2. Install Python 3.11, yt-dlp, ffmpeg via Homebrew
  3. Create a virtualenv and install Python dependencies
  4. Create the folder structure on your specified drive
  5. Prompt for YouTube API key and Plex token, write to .env
  6. Initialize the SQLite database
  7. Run first channel inventory pull

Manual Setup

python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your API keys
python main.py init

Usage

All commands are manual-trigger only — nothing runs on a schedule.

source venv/bin/activate

python main.py inventory      # Phase 1: Pull/refresh live channel inventory
python main.py discover       # Phase 2: Run archive discovery (all sources)
python main.py verify         # Phase 3: Verify pending links
python main.py download       # Phase 4: Download verified items
python main.py plex-scan      # Phase 6: Trigger Plex library scan
python main.py full           # Run all phases in sequence
python main.py import <file>  # Import URLs from a text file
python main.py dashboard      # Start web dashboard at http://localhost:8080
python main.py init           # Initialize database only
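For orientation, a minimal sketch of the kind of subcommand dispatch the command list above implies, using argparse; the actual wiring in main.py may differ:

import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("inventory", "discover", "verify", "download",
                 "plex-scan", "full", "dashboard", "init"):
        sub.add_parser(name)
    # `import` takes one positional argument: the URL list file
    sub.add_parser("import").add_argument("file")
    return parser

args = build_parser().parse_args()
print(f"dispatching: {args.command}")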

Pipeline Phases

Phase 1 — Channel Inventory

Uses YouTube Data API v3 to pull the complete inventory of Markiplier's current live channel (UCX6OQ3DkcsbYNE6H8uQQuVA). Stores all video IDs, titles, playlist memberships, and upload dates in SQLite as the "live" reference table. This table gets refreshed every time the tool runs so the cross-reference stays current.
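A minimal sketch of that pull against the raw playlistItems endpoint (the repo may use a client library instead; playlist memberships would need additional playlists.list calls, omitted here). A channel's uploads playlist ID is its channel ID with the leading UC swapped for UU:

import os
import requests

API = "https://www.googleapis.com/youtube/v3/playlistItems"
UPLOADS = "UUX6OQ3DkcsbYNE6H8uQQuVA"  # uploads playlist of UCX6OQ3DkcsbYNE6H8uQQuVA

def fetch_live_inventory(api_key: str) -> list[dict]:
    videos, page_token = [], None
    while True:
        params = {"part": "snippet", "playlistId": UPLOADS,
                  "maxResults": 50, "key": api_key}
        if page_token:
            params["pageToken"] = page_token
        data = requests.get(API, params=params, timeout=30).json()
        for item in data.get("items", []):
            snip = item["snippet"]
            videos.append({"video_id": snip["resourceId"]["videoId"],
                           "title": snip["title"],
                           "published_at": snip["publishedAt"]})
        page_token = data.get("nextPageToken")
        if not page_token:
            return videos

inventory = fetch_live_inventory(os.environ["YOUTUBE_API_KEY"])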

Phase 2 — Archive Discovery

Queries these sources for Markiplier content not in the live channel inventory:

  1. Internet Archive (archive.org) — Searches subject:markiplier and creator:markiplier (see the query sketch after this list). Pulls metadata: title, date, URL, format availability.
  2. YouTube search (via API) — Searches for fan reupload channels and known archive channels. Cross-references video titles/IDs against the live channel.
  3. Manual URL list — Accepts a plaintext file of URLs (one per line, # for comments) and runs the same cross-reference logic against them.
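A minimal sketch of the Internet Archive query from item 1, using the public advancedsearch.php endpoint; each hit's item page lives at https://archive.org/details/ followed by its identifier:

import requests

def search_archive(query: str, rows: int = 100) -> list[dict]:
    params = {"q": query,
              "fl[]": ["identifier", "title", "date"],
              "rows": rows, "output": "json"}
    resp = requests.get("https://archive.org/advancedsearch.php",
                        params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["response"]["docs"]

for doc in search_archive("subject:markiplier OR creator:markiplier"):
    print(doc["identifier"], doc.get("title"), doc.get("date"))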

Phase 3 — Link Verification

Before anything hits the download queue:

  • Checks every discovered URL with yt-dlp --simulate — confirms each item is downloadable without actually downloading it
  • Checks Internet Archive links with a HEAD request (both checks are sketched after this list)
  • Tags each item: verified | dead | region_locked | pending
  • Dead links get flagged in the dashboard but never queue for download
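A minimal sketch of both checks, shelling out to yt-dlp and using requests for the HEAD probe. Distinguishing region_locked from dead would require inspecting yt-dlp's error output, which is omitted here:

import subprocess
import requests

def verify_with_ytdlp(url: str) -> str:
    # --simulate resolves the video without downloading; exit code 0 = downloadable
    result = subprocess.run(["yt-dlp", "--simulate", "--quiet", url],
                            capture_output=True, text=True)
    return "verified" if result.returncode == 0 else "dead"

def verify_archive_link(url: str) -> str:
    try:
        resp = requests.head(url, allow_redirects=True, timeout=15)
        return "verified" if resp.ok else "dead"
    except requests.RequestException:
        return "dead"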

Phase 4 — Download Pipeline

  • yt-dlp handles all downloads
  • Output formatted for Plex TV Show library:
/Volumes/MarkiplierArchive/
└── Shows/
    └── Markiplier - {Series Name}/
        └── Season {Year}/
            └── S{Year}E{###} - {Title}.mp4
  • Downloads the best available quality up to 1080p, to conserve the 1 TB of storage (see the sketch after this list)
  • Pulls thumbnail, description, and subtitles alongside each video
  • Marks each item in DB as downloaded on completion
  • Skips anything already marked downloaded on rerun
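A minimal sketch of a single download using yt-dlp's Python API. The series, year, and episode values are assumed to come from the database, and filename sanitization of the title is omitted:

import yt_dlp

def download(url: str, series: str, year: int, episode: int, title: str) -> None:
    outtmpl = (f"/Volumes/MarkiplierArchive/Shows/Markiplier - {series}/"
               f"Season {year}/S{year}E{episode:03d} - {title}.%(ext)s")
    opts = {
        "format": "bestvideo[height<=1080]+bestaudio/best[height<=1080]",
        "outtmpl": outtmpl,
        "writethumbnail": True,
        "writedescription": True,
        "writesubtitles": True,
        "merge_output_format": "mp4",  # merging streams requires ffmpeg on PATH
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        ydl.download([url])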

Phase 5 — Web Dashboard

FastAPI backend + browser UI accessible at http://localhost:8080

Dashboard views:

View          Description
Catalog       All discovered content, filterable by series/year/status
Queue         What's pending download, what's in progress
Dead Links    Archive of undownloadable content with source info
Live Channel  Current Markiplier channel inventory for reference
Settings      URL import, storage path, refresh controls

Status tags on every item:

On Live Channel | Archived - Available | Archived - Dead Link | Downloaded
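A minimal sketch of one dashboard route in the FastAPI backend; the /api/catalog path and the in-memory rows are illustrative stand-ins for the real routes and SQLite queries:

from fastapi import FastAPI

app = FastAPI()

# Stand-in rows; the real backend reads these from SQLite
CATALOG = [
    {"title": "Example Video", "series": "Example Series",
     "year": 2013, "status": "Archived - Available"},
]

@app.get("/api/catalog")
def catalog(status: str | None = None):
    # Optional ?status= query parameter filters by status tag
    items = [i for i in CATALOG if status is None or i["status"] == status]
    return {"items": items}

# Run with: uvicorn app:app --port 8080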

Phase 6 — Plex Integration

  • Point Plex library at /Volumes/MarkiplierArchive/Shows/
  • Library type: TV Shows
  • After each download batch, triggers Plex library scan via API:
    GET http://localhost:32400/library/sections/{id}/refresh?X-Plex-Token={token}
    
  • Series show up in Plex automatically after the scan and are watchable on Fire TV via the Plex app
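The same trigger in Python, reading the connection details from the .env values described in the Configuration section below:

import os
import requests

def trigger_plex_scan() -> None:
    # Hits Plex's library refresh endpoint for the configured section
    url = (f"{os.environ['PLEX_URL']}/library/sections/"
           f"{os.environ['PLEX_LIBRARY_ID']}/refresh")
    requests.get(url, params={"X-Plex-Token": os.environ["PLEX_TOKEN"]},
                 timeout=15).raise_for_status()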

Configuration

All configuration is via the .env file:

YOUTUBE_API_KEY=your_key_here
PLEX_TOKEN=your_plex_token
PLEX_LIBRARY_ID=your_library_section_id
PLEX_URL=http://localhost:32400
ARCHIVE_SEARCH_TERMS=markiplier
STORAGE_PATH=/Volumes/MarkiplierArchive
MAX_QUALITY=1080
DASHBOARD_PORT=8080
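config.py is listed in the project structure as the environment config loader. A minimal sketch of such a loader, assuming python-dotenv (an assumption, not a confirmed dependency):

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory

YOUTUBE_API_KEY = os.environ["YOUTUBE_API_KEY"]   # required
PLEX_TOKEN = os.environ["PLEX_TOKEN"]             # required
PLEX_URL = os.getenv("PLEX_URL", "http://localhost:32400")
STORAGE_PATH = os.getenv("STORAGE_PATH", "/Volumes/MarkiplierArchive")
MAX_QUALITY = int(os.getenv("MAX_QUALITY", "1080"))
DASHBOARD_PORT = int(os.getenv("DASHBOARD_PORT", "8080"))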

Getting Your Plex Token

  1. Open Plex Web App and sign in
  2. Navigate to any media item and click "Get Info"
  3. Click "View XML" — the URL will contain X-Plex-Token=YOUR_TOKEN

Getting Your Plex Library ID

Run the dashboard and check Plex settings, or:

curl -s "http://localhost:32400/library/sections?X-Plex-Token=YOUR_TOKEN" | xmllint --format -

The key attribute on each <Directory> is the library ID.

Project Structure

Markiplier/
├── config.py                    # Environment config loader
├── main.py                      # CLI entry point
├── setup.sh                     # One-shot Mac setup script
├── requirements.txt             # Python dependencies
├── .env.example                 # Config template
├── database/
│   ├── models.py                # LiveVideo + ArchiveItem SQLAlchemy models
│   └── session.py               # DB engine and session management
├── phases/
│   ├── channel_inventory.py     # Phase 1 — YouTube API channel pull
│   ├── archive_discovery.py     # Phase 2 — Internet Archive + reuploads + manual
│   ├── link_verification.py     # Phase 3 — yt-dlp simulate + HEAD checks
│   ├── download_pipeline.py     # Phase 4 — yt-dlp downloads, Plex folders
│   └── plex_integration.py      # Phase 6 — Plex library scan trigger
└── dashboard/
    ├── app.py                   # FastAPI backend with all API routes
    └── templates/
        └── index.html           # Dark-themed dashboard UI

What This Does NOT Do

  • Does not touch any content currently on his live channel
  • Does not monetize or redistribute anything
  • Does not run on a schedule automatically — manual trigger only, you control when it runs
