Markiplier Archive Pipeline

A local media pipeline that discovers, catalogs, verifies, and downloads Markiplier content that has been deleted or removed from his active YouTube channel. It then serves that content through a local Plex media server, viewable on Amazon Fire TV and in any browser on the local network.

The Golden Rule

Cross-reference everything against his live channel first. If a video exists on Markiplier's active YouTube channel right now, skip it entirely. Only surface and download content that is confirmed absent from his live channel.
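In code, the rule reduces to a single existence check against the live-inventory table before anything is queued. A minimal sketch, assuming a live_videos table with a video_id column (illustrative names, not necessarily the project's actual schema):

import sqlite3

def is_on_live_channel(conn: sqlite3.Connection, video_id: str) -> bool:
    # live_videos is rebuilt on every inventory refresh (Phase 1)
    row = conn.execute(
        "SELECT 1 FROM live_videos WHERE video_id = ?", (video_id,)
    ).fetchone()
    return row is not None

# Discovered items only proceed to verification and download when this is False.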

Tech Stack

  • Runtime: Python 3.11+
  • Downloading: yt-dlp
  • Media Server: Plex (Plex Pass)
  • Database: SQLite via SQLAlchemy
  • Web Dashboard: FastAPI + HTML/JS frontend
  • APIs: YouTube Data API v3, Internet Archive API
  • Package Management: Homebrew + pip
  • Target Machine: Mac mini, 1TB storage, macOS

Quick Start

One-Line Setup

chmod +x setup.sh && ./setup.sh

The setup script will:

  1. Check for Homebrew, install if missing
  2. Install Python 3.11, yt-dlp, ffmpeg via Homebrew
  3. Create a virtualenv and install Python dependencies
  4. Create the folder structure on your specified drive
  5. Prompt for YouTube API key and Plex token, write to .env
  6. Initialize the SQLite database
  7. Run first channel inventory pull

Manual Setup

python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your API keys
python main.py init

Usage

All commands are manual-trigger only — nothing runs on a schedule.

source venv/bin/activate

python main.py inventory      # Phase 1: Pull/refresh live channel inventory
python main.py discover       # Phase 2: Run archive discovery (all sources)
python main.py verify         # Phase 3: Verify pending links
python main.py download       # Phase 4: Download verified items
python main.py plex-scan      # Phase 6: Trigger Plex library scan
python main.py full           # Run all phases in sequence
python main.py import <file>  # Import URLs from a text file
python main.py dashboard      # Start web dashboard at http://localhost:8080
python main.py init           # Initialize database only
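For orientation, a minimal sketch of the kind of subcommand dispatch the command list above implies, using argparse; the actual wiring in main.py may differ:

import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("inventory", "discover", "verify", "download",
                 "plex-scan", "full", "dashboard", "init"):
        sub.add_parser(name)
    # `import` takes one positional argument: the URL list file
    sub.add_parser("import").add_argument("file")
    return parser

args = build_parser().parse_args()
print(f"dispatching: {args.command}")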

Pipeline Phases

Phase 1 — Channel Inventory

Uses YouTube Data API v3 to pull the complete inventory of Markiplier's current live channel (UCX6OQ3DkcsbYNE6H8uQQuVA). Stores all video IDs, titles, playlist memberships, and upload dates in SQLite as the "live" reference table. This table gets refreshed every time the tool runs so the cross-reference stays current.
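A minimal sketch of that pull against the raw playlistItems endpoint (the repo may use a client library instead; playlist memberships would need additional playlists.list calls, omitted here). A channel's uploads playlist ID is its channel ID with the leading UC swapped for UU:

import os
import requests

API = "https://www.googleapis.com/youtube/v3/playlistItems"
UPLOADS = "UUX6OQ3DkcsbYNE6H8uQQuVA"  # uploads playlist of UCX6OQ3DkcsbYNE6H8uQQuVA

def fetch_live_inventory(api_key: str) -> list[dict]:
    videos, page_token = [], None
    while True:
        params = {"part": "snippet", "playlistId": UPLOADS,
                  "maxResults": 50, "key": api_key}
        if page_token:
            params["pageToken"] = page_token
        data = requests.get(API, params=params, timeout=30).json()
        for item in data.get("items", []):
            snip = item["snippet"]
            videos.append({"video_id": snip["resourceId"]["videoId"],
                           "title": snip["title"],
                           "published_at": snip["publishedAt"]})
        page_token = data.get("nextPageToken")
        if not page_token:
            return videos

inventory = fetch_live_inventory(os.environ["YOUTUBE_API_KEY"])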

Phase 2 — Archive Discovery

Queries these sources for Markiplier content not in the live channel inventory:

  1. Internet Archive (archive.org) — Searches subject:markiplier and creator:markiplier (see the query sketch after this list). Pulls metadata: title, date, URL, format availability.
  2. YouTube search (via API) — Searches for fan reupload channels and known archive channels. Cross-references video titles/IDs against the live channel.
  3. Manual URL list — Accepts a plaintext file of URLs (one per line, # for comments) and runs the same cross-reference logic against them.
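A minimal sketch of the Internet Archive query from item 1, using the public advancedsearch.php endpoint; each hit's item page lives at https://archive.org/details/ followed by its identifier:

import requests

def search_archive(query: str, rows: int = 100) -> list[dict]:
    params = {"q": query,
              "fl[]": ["identifier", "title", "date"],
              "rows": rows, "output": "json"}
    resp = requests.get("https://archive.org/advancedsearch.php",
                        params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["response"]["docs"]

for doc in search_archive("subject:markiplier OR creator:markiplier"):
    print(doc["identifier"], doc.get("title"), doc.get("date"))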

Phase 3 — Link Verification

Before anything hits the download queue:

  • Checks every discovered URL with yt-dlp --simulate — confirms each item is downloadable without actually downloading it
  • Checks Internet Archive links with a HEAD request (both checks are sketched after this list)
  • Tags each item: verified | dead | region_locked | pending
  • Dead links get flagged in the dashboard but never queue for download
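A minimal sketch of both checks, shelling out to yt-dlp and using requests for the HEAD probe. Distinguishing region_locked from dead would require inspecting yt-dlp's error output, which is omitted here:

import subprocess
import requests

def verify_with_ytdlp(url: str) -> str:
    # --simulate resolves the video without downloading; exit code 0 = downloadable
    result = subprocess.run(["yt-dlp", "--simulate", "--quiet", url],
                            capture_output=True, text=True)
    return "verified" if result.returncode == 0 else "dead"

def verify_archive_link(url: str) -> str:
    try:
        resp = requests.head(url, allow_redirects=True, timeout=15)
        return "verified" if resp.ok else "dead"
    except requests.RequestException:
        return "dead"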

Phase 4 — Download Pipeline

  • yt-dlp handles all downloads
  • Output formatted for Plex TV Show library:
/Volumes/MarkiplierArchive/
└── Shows/
    └── Markiplier - {Series Name}/
        └── Season {Year}/
            └── S{Year}E{###} - {Title}.mp4
  • Downloads the best available quality up to 1080p, to conserve the 1 TB of storage (see the sketch after this list)
  • Pulls thumbnail, description, and subtitles alongside each video
  • Marks each item in DB as downloaded on completion
  • Skips anything already marked downloaded on rerun
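A minimal sketch of a single download using yt-dlp's Python API. The series, year, and episode values are assumed to come from the database, and filename sanitization of the title is omitted:

import yt_dlp

def download(url: str, series: str, year: int, episode: int, title: str) -> None:
    outtmpl = (f"/Volumes/MarkiplierArchive/Shows/Markiplier - {series}/"
               f"Season {year}/S{year}E{episode:03d} - {title}.%(ext)s")
    opts = {
        "format": "bestvideo[height<=1080]+bestaudio/best[height<=1080]",
        "outtmpl": outtmpl,
        "writethumbnail": True,
        "writedescription": True,
        "writesubtitles": True,
        "merge_output_format": "mp4",  # merging streams requires ffmpeg on PATH
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        ydl.download([url])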

Phase 5 — Web Dashboard

FastAPI backend + browser UI accessible at http://localhost:8080

Dashboard views:

View          Description
Catalog       All discovered content, filterable by series/year/status
Queue         What's pending download, what's in progress
Dead Links    Archive of undownloadable content with source info
Live Channel  Current Markiplier channel inventory for reference
Settings      URL import, storage path, refresh controls

Status tags on every item:

On Live Channel | Archived - Available | Archived - Dead Link | Downloaded
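A minimal sketch of one dashboard route in the FastAPI backend; the /api/catalog path and the in-memory rows are illustrative stand-ins for the real routes and SQLite queries:

from fastapi import FastAPI

app = FastAPI()

# Stand-in rows; the real backend reads these from SQLite
CATALOG = [
    {"title": "Example Video", "series": "Example Series",
     "year": 2013, "status": "Archived - Available"},
]

@app.get("/api/catalog")
def catalog(status: str | None = None):
    # Optional ?status= query parameter filters by status tag
    items = [i for i in CATALOG if status is None or i["status"] == status]
    return {"items": items}

# Run with: uvicorn app:app --port 8080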

Phase 6 — Plex Integration

  • Point Plex library at /Volumes/MarkiplierArchive/Shows/
  • Library type: TV Shows
  • After each download batch, triggers Plex library scan via API:
    GET http://localhost:32400/library/sections/{id}/refresh?X-Plex-Token={token}
    
  • Series show up in Plex automatically after the scan and are watchable on Fire TV via the Plex app
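The same trigger in Python, reading the connection details from the .env values described in the Configuration section below:

import os
import requests

def trigger_plex_scan() -> None:
    # Hits Plex's library refresh endpoint for the configured section
    url = (f"{os.environ['PLEX_URL']}/library/sections/"
           f"{os.environ['PLEX_LIBRARY_ID']}/refresh")
    requests.get(url, params={"X-Plex-Token": os.environ["PLEX_TOKEN"]},
                 timeout=15).raise_for_status()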

Configuration

All configuration is via the .env file:

YOUTUBE_API_KEY=your_key_here
PLEX_TOKEN=your_plex_token
PLEX_LIBRARY_ID=your_library_section_id
PLEX_URL=http://localhost:32400
ARCHIVE_SEARCH_TERMS=markiplier
STORAGE_PATH=/Volumes/MarkiplierArchive
MAX_QUALITY=1080
DASHBOARD_PORT=8080
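config.py is listed in the project structure as the environment config loader. A minimal sketch of such a loader, assuming python-dotenv (an assumption, not a confirmed dependency):

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory

YOUTUBE_API_KEY = os.environ["YOUTUBE_API_KEY"]   # required
PLEX_TOKEN = os.environ["PLEX_TOKEN"]             # required
PLEX_URL = os.getenv("PLEX_URL", "http://localhost:32400")
STORAGE_PATH = os.getenv("STORAGE_PATH", "/Volumes/MarkiplierArchive")
MAX_QUALITY = int(os.getenv("MAX_QUALITY", "1080"))
DASHBOARD_PORT = int(os.getenv("DASHBOARD_PORT", "8080"))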

Getting Your Plex Token

  1. Open Plex Web App and sign in
  2. Navigate to any media item and click "Get Info"
  3. Click "View XML" — the URL will contain X-Plex-Token=YOUR_TOKEN

Getting Your Plex Library ID

Run the dashboard and check Plex settings, or:

curl -s "http://localhost:32400/library/sections?X-Plex-Token=YOUR_TOKEN" | xmllint --format -

The key attribute on each <Directory> is the library ID.

Project Structure

Markiplier/
├── config.py                    # Environment config loader
├── main.py                      # CLI entry point
├── setup.sh                     # One-shot Mac setup script
├── requirements.txt             # Python dependencies
├── .env.example                 # Config template
├── database/
│   ├── models.py                # LiveVideo + ArchiveItem SQLAlchemy models
│   └── session.py               # DB engine and session management
├── phases/
│   ├── channel_inventory.py     # Phase 1 — YouTube API channel pull
│   ├── archive_discovery.py     # Phase 2 — Internet Archive + reuploads + manual
│   ├── link_verification.py     # Phase 3 — yt-dlp simulate + HEAD checks
│   ├── download_pipeline.py     # Phase 4 — yt-dlp downloads, Plex folders
│   └── plex_integration.py      # Phase 6 — Plex library scan trigger
└── dashboard/
    ├── app.py                   # FastAPI backend with all API routes
    └── templates/
        └── index.html           # Dark-themed dashboard UI

What This Does NOT Do

  • Does not touch any content currently on his live channel
  • Does not monetize or redistribute anything
  • Does not run on a schedule automatically — manual trigger only, you control when it runs
