maple-underscore/youtube-scraper

License: CC BY-NC-SA 4.0

YouTube Video Scraper

A powerful and user-friendly YouTube video downloader with support for multiple formats, codecs, and quality options, plus proxy and Tor tunneling. It can also re-encode, crop, and rescale existing videos with parallel processing, and it shows progress bars in a rich console interface.

Features

Download Features

  • 📥 Batch Downloads: Download multiple videos from a queue file
  • 🎬 Multiple Codecs: Support for AV1, H.264, H.265, and VP9
  • 🎵 Audio Quality Options: Choose from 320kbps, 256kbps, 192kbps, 128kbps, and 96kbps
  • 📺 Quality Presets: 8K, 4K, 1440p, 1080p, 720p, 480p, 360p
  • 🔄 Smart Fallback: Automatically selects the highest available quality if preferred quality is unavailable
  • 🌐 Proxy Support: Built-in support for custom proxies and Tor network
  • Parallel Downloads: Download multiple videos simultaneously
  • 🍪 Cookie Support: Bypass bot detection using browser cookies

Re-encoding Features

  • 🔄 Codec Conversion: Convert videos to H.264, H.265, AV1, or VP9
  • 📉 Quality Control: Multiple quality presets or custom CRF values
  • 📏 Video Downscaling: Reduce resolution to save space (e.g., 1080p to 720p)
  • 💾 Size-based Compression: Compress videos to a target file size
  • Parallel Re-encoding: Process multiple videos simultaneously
  • 🎨 Beautiful Progress Bars: Real-time encoding progress with rich UI

Crop/Rescale Features

  • ✂️ Aspect Ratio Conversion: Convert videos to standard aspect ratios (16:9, 4:3, 21:9, etc.)
  • 📐 Custom Dimensions: Rescale to specific pixel dimensions
  • 🎯 Two Processing Modes: Scale (with letterbox/pillarbox) or Crop (center crop)
  • Parallel Processing: Process multiple videos simultaneously
  • 🎨 Real-time Progress: Beautiful progress bars with time estimates

Installation

Prerequisites

  • Python 3.10 or higher
  • FFmpeg (required for merging video and audio, and for re-encoding)

Install FFmpeg

Ubuntu/Debian:

sudo apt update
sudo apt install ffmpeg

macOS:

brew install ffmpeg

Windows: Download a build from ffmpeg.org and add the folder containing ffmpeg.exe to PATH

Install Python Dependencies

pip install -r requirements.txt

Usage

Quick Start

  1. Create a downloadqueue.txt file with YouTube URLs (one per line):
https://www.youtube.com/watch?v=dQw4w9WgXcQ
https://www.youtube.com/watch?v=jNQXAC9IVRw
  2. Run the scraper in interactive mode:
python youtube_utils.py

Command Line Options

# Download with custom quality and codec
python youtube_utils.py -q 4k -c av1 -a 320 -i downloadqueue.txt

# Specify output directory
python youtube_utils.py -o /path/to/output -i downloadqueue.txt

# Use Tor for anonymity
python youtube_utils.py --tor -i downloadqueue.txt

# Use custom proxy
python youtube_utils.py --proxy socks5://127.0.0.1:1080 -i downloadqueue.txt

# Force interactive mode
python youtube_utils.py --interactive

Re-encoding Existing Videos

The scraper now includes powerful video re-encoding capabilities:

# Convert videos to H.265 with medium quality
python youtube_utils.py --reencode --reencode-dir ./downloads --reencode-codec h265 --reencode-quality medium

# Parallel re-encoding with 4 jobs
python youtube_utils.py --reencode --reencode-dir ./videos -p 4 --reencode-codec h264

# Downscale videos to 720p
python youtube_utils.py --reencode --reencode-dir ./videos --reencode-scale 720

# Compress videos to 100MB each
python youtube_utils.py --reencode --reencode-dir ./videos --reencode-size 100

# Convert to AV1 with custom CRF
python youtube_utils.py --reencode --reencode-dir ./videos --reencode-codec av1 --reencode-crf 30

# Combine options: downscale to 720p and use H.265 with fast preset
python youtube_utils.py --reencode --reencode-dir ./videos --reencode-codec h265 --reencode-scale 720 --reencode-preset fast -p 4
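
The README does not say how --reencode-size reaches its target. One common approach is to derive a video bitrate from the target size and the clip duration, then encode at that bitrate. The sketch below shows only that arithmetic; the function name and the 192 kbps audio allowance are assumptions for illustration, not the scraper's actual code.

```python
def target_video_bitrate_kbps(target_size_mb: float, duration_s: float,
                              audio_kbps: int = 192) -> int:
    """Estimate the video bitrate (kbps) that fits a target file size.

    Hypothetical helper: uses decimal units (1 MB = 8000 kilobits) and
    ignores container overhead, so real files may come out slightly larger.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_kbps = target_size_mb * 8000 / duration_s  # whole-file budget
    video_kbps = int(total_kbps - audio_kbps)        # leave room for audio
    if video_kbps <= 0:
        raise ValueError("target size is too small for this duration")
    return video_kbps


# A 10-minute clip squeezed into 100 MB with 192 kbps audio
print(target_video_bitrate_kbps(100, 600))
```

Very long videos with a small target size leave almost no bitrate for the picture, which is why the helper rejects non-positive results instead of returning them.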

Crop/Rescale Existing Videos

Convert videos to different aspect ratios or dimensions:

# Convert videos to 16:9 aspect ratio (scale mode with letterbox)
python youtube_utils.py --crop --crop-dir ./downloads --crop-aspect-ratio 16:9

# Crop videos to 16:9 (center crop, no letterbox)
python youtube_utils.py --crop --crop-dir ./videos --crop-aspect-ratio 16:9 --crop-mode crop

# Convert to 4:3 aspect ratio for classic displays
python youtube_utils.py --crop --crop-dir ./videos --crop-aspect-ratio 4:3

# Rescale to specific dimensions (1920x1080)
python youtube_utils.py --crop --crop-dir ./videos --crop-width 1920 --crop-height 1080

# Convert videos to vertical 9:16 (Instagram/TikTok format)
python youtube_utils.py --crop --crop-dir ./videos --crop-aspect-ratio 9:16

# Parallel processing with 4 jobs
python youtube_utils.py --crop --crop-dir ./videos --crop-aspect-ratio 16:9 -p 4

Available aspect ratios: 16:9, 16:10, 4:3, 21:9 (ultrawide), 1:1 (square), 9:16 (vertical)

Processing modes:

  • scale (default): Resize video to fit aspect ratio, adding letterbox/pillarbox as needed
  • crop: Center crop video to exact aspect ratio, removing edges
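
The geometry behind the two modes can be sketched as follows. This is an illustration only: the function names are hypothetical, and the scraper may additionally resize to a fixed output resolution, which this sketch does not model. Dimensions are kept even because most video encoders expect even widths and heights.

```python
from fractions import Fraction
from math import ceil


def _parse_ratio(ratio: str) -> Fraction:
    """Turn a string such as '16:9' into an exact fraction."""
    num, den = (int(part) for part in ratio.split(":"))
    return Fraction(num, den)


def center_crop_box(src_w: int, src_h: int, ratio: str) -> tuple[int, int, int, int]:
    """Crop mode: largest centered (w, h, x, y) box with the target ratio."""
    target = _parse_ratio(ratio)
    if Fraction(src_w, src_h) > target:
        w, h = int(src_h * target), src_h   # source too wide: trim the sides
    else:
        w, h = src_w, int(src_w / target)   # source too tall: trim top/bottom
    w -= w % 2                              # round down to even dimensions
    h -= h % 2
    return w, h, (src_w - w) // 2, (src_h - h) // 2


def letterbox_canvas(src_w: int, src_h: int, ratio: str) -> tuple[int, int, int, int]:
    """Scale mode: smallest (canvas_w, canvas_h, pad_x, pad_y) holding the source."""
    target = _parse_ratio(ratio)
    if Fraction(src_w, src_h) > target:
        cw, ch = src_w, ceil(src_w / target)   # bars above and below (letterbox)
    else:
        cw, ch = ceil(src_h * target), src_h   # bars left and right (pillarbox)
    cw += cw % 2                               # round up to even dimensions
    ch += ch % 2
    return cw, ch, (cw - src_w) // 2, (ch - src_h) // 2


print(center_crop_box(1920, 1080, "4:3"))    # (1440, 1080, 240, 0)
print(letterbox_canvas(1920, 1080, "4:3"))   # (1920, 1440, 0, 180)
```

Crop mode discards 240 pixels from each side of a 1920x1080 frame to reach 4:3, while scale mode keeps every pixel and adds 180-pixel bars above and below.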

Available Options

Download Options

| Option | Description | Default |
|---|---|---|
| -i, --input | Path to download queue file | downloadqueue.txt |
| -o, --output | Output directory for downloads | ./downloads |
| -q, --quality | Video quality (8k, 4k, 1440p, 1080p, 720p, 480p, 360p) | 1080p |
| -c, --codec | Video codec (av1, h264, h265, vp9) | h264 |
| -a, --audio-bitrate | Audio bitrate in kbps (320, 256, 192, 128, 96) | 192 |
| -p, --parallel | Number of parallel downloads (1-10) | 1 |
| --container | Video container format (mp4, webm, mkv, auto) | auto |
| --proxy | Proxy URL (e.g., socks5://127.0.0.1:1080) | None |
| --tor | Use Tor network | False |
| --cookies-from-browser | Extract cookies from browser (chrome, firefox, edge, safari, etc.) | None |
| --cookies | Path to cookies.txt file | None |
| --interactive | Run in interactive mode | False |

Re-encoding Options

| Option | Description | Default |
|---|---|---|
| --reencode | Enable re-encoding mode | False |
| --reencode-dir | Directory containing videos to re-encode (required) | None |
| --reencode-output | Output directory for re-encoded videos | <input-dir>/reencoded |
| --reencode-codec | Target codec (h264, h265, av1, vp9) | h264 |
| --reencode-quality | Quality preset (best, high, medium, low) | None |
| --reencode-crf | Custom CRF value (codec-dependent) | None |
| --reencode-size | Target file size in MB | None |
| --reencode-scale | Downscale to height (e.g., 720 for 720p) | None |
| --reencode-preset | FFmpeg preset (ultrafast to veryslow) | medium |
| -p, --parallel | Number of parallel re-encoding jobs (1-10) | 1 |

Crop/Rescale Options

| Option | Description | Default |
|---|---|---|
| --crop | Enable crop/rescale mode | False |
| --crop-dir | Directory containing videos to crop/rescale (required) | None |
| --crop-output | Output directory for processed videos | <input-dir>/cropped |
| --crop-aspect-ratio | Aspect ratio preset (16:9, 16:10, 4:3, 21:9, 1:1, 9:16) | None |
| --crop-width | Target width in pixels (requires --crop-height) | None |
| --crop-height | Target height in pixels (requires --crop-width) | None |
| --crop-mode | Processing mode: scale or crop | scale |
| -p, --parallel | Number of parallel processing jobs (1-10) | 1 |

Configuration File

You can customize default settings by editing config.ini:

[DEFAULT]
output_dir = ./downloads
quality = 1080p
video_codec = h264
audio_bitrate = 192
queue_file = downloadqueue.txt

[PROXY]
enabled = false
use_tor = false
proxy_url = 

[ADVANCED]
merge_format = mp4
max_concurrent = 1
retry_count = 3
wait_time = 0
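
A file in this layout can be read with Python's standard configparser module. The sketch below is an assumption about how such a file might be consumed, not the scraper's own loader; the function name load_settings and the returned keys are hypothetical.

```python
import configparser

SAMPLE_CONFIG = """
[DEFAULT]
output_dir = ./downloads
quality = 1080p
video_codec = h264
audio_bitrate = 192
queue_file = downloadqueue.txt

[PROXY]
enabled = false
use_tor = false
proxy_url =

[ADVANCED]
merge_format = mp4
max_concurrent = 1
retry_count = 3
wait_time = 0
"""


def load_settings(text: str) -> dict:
    """Parse config.ini-style text into a plain dict with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "output_dir": parser["DEFAULT"]["output_dir"],
        "quality": parser["DEFAULT"]["quality"],
        "audio_bitrate": parser["DEFAULT"].getint("audio_bitrate"),
        "use_tor": parser["PROXY"].getboolean("use_tor"),
        # An empty proxy_url is treated as "no proxy configured".
        "proxy_url": parser["PROXY"].get("proxy_url") or None,
        "retry_count": parser["ADVANCED"].getint("retry_count"),
    }


settings = load_settings(SAMPLE_CONFIG)
print(settings["quality"], settings["use_tor"], settings["proxy_url"])
```

Note that configparser treats [DEFAULT] specially: its keys are visible from every other section, so parser["PROXY"]["quality"] would also return 1080p.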

Download Queue File Format

The downloadqueue.txt file should contain one YouTube URL per line:

# You can add comments with #
https://www.youtube.com/watch?v=VIDEO_ID_1
https://www.youtube.com/watch?v=VIDEO_ID_2

# Empty lines are ignored
https://www.youtube.com/watch?v=VIDEO_ID_3
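
The rules above (one URL per line, # comments, blank lines ignored) can be sketched as a small parser. The function name parse_queue is hypothetical, and the sketch assumes that only whole-line comments are supported, since the README does not mention trailing comments after a URL.

```python
def parse_queue(text: str) -> list[str]:
    """Return the URLs in a queue file, skipping comments and blank lines."""
    urls = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or whole-line comment
        urls.append(line)
    return urls


sample = """
# You can add comments with #
https://www.youtube.com/watch?v=VIDEO_ID_1

https://www.youtube.com/watch?v=VIDEO_ID_2
"""
print(parse_queue(sample))
```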

Video Codecs

  • AV1: Modern, efficient codec with excellent compression (requires compatible hardware/software)
  • H.264 (AVC): Most compatible, plays on virtually all devices
  • H.265 (HEVC): Better compression than H.264, but needs newer devices for playback
  • VP9: Google's open codec, good quality and compression

Quality Options

The scraper automatically selects the best available quality up to your specified limit:

  • 8K: 7680×4320 (4320p)
  • 4K: 3840×2160 (2160p)
  • 1440p: 2560×1440 (QHD)
  • 1080p: 1920×1080 (Full HD)
  • 720p: 1280×720 (HD)
  • 480p: 854×480 (SD)
  • 360p: 640×360 (Low)

If your selected quality is not available, the scraper will automatically download the highest available quality.
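
The fallback rule can be sketched as a lookup over the heights a video actually offers. This is a minimal illustration, not the scraper's code: the names are hypothetical, and the behavior when every available stream is above the limit (here, pick the smallest one) is an assumption the README does not spell out.

```python
# Preset names and heights taken from the list above.
QUALITY_HEIGHTS = {
    "8k": 4320, "4k": 2160, "1440p": 1440, "1080p": 1080,
    "720p": 720, "480p": 480, "360p": 360,
}


def pick_height(preset: str, available: list[int]) -> int:
    """Choose the best available height that does not exceed the preset."""
    if not available:
        raise ValueError("no formats available")
    limit = QUALITY_HEIGHTS[preset.lower()]
    at_or_below = [h for h in available if h <= limit]
    if at_or_below:
        return max(at_or_below)
    # Assumption: if everything is above the limit, take the closest one.
    return min(available)


print(pick_height("4k", [360, 720, 1080]))      # 1080
print(pick_height("1080p", [720, 1080, 2160]))  # 1080
```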

Proxy & Tunneling

Using Tor

  1. Install and run Tor:
# Ubuntu/Debian
sudo apt install tor
sudo service tor start

# macOS
brew install tor
tor
  2. Run the scraper with Tor:
python youtube_utils.py --tor

Using Custom Proxy

# SOCKS5 proxy
python youtube_utils.py --proxy socks5://127.0.0.1:1080

# HTTP proxy
python youtube_utils.py --proxy http://proxy.example.com:8080

# With authentication
python youtube_utils.py --proxy socks5://user:pass@127.0.0.1:1080

Bypassing Bot Detection with Cookies

YouTube may sometimes block downloads with "Sign in to confirm you're not a bot" errors. You can bypass this by using cookies from your browser.

Option 1: Extract Cookies from Browser (Recommended)

The easiest method is to let yt-dlp extract cookies directly from your browser:

# Extract from Chrome
python youtube_utils.py --cookies-from-browser chrome

# Extract from Firefox
python youtube_utils.py --cookies-from-browser firefox

# Extract from Edge
python youtube_utils.py --cookies-from-browser edge

# Extract from Safari (macOS)
python youtube_utils.py --cookies-from-browser safari

Supported browsers: chrome, firefox, edge, safari, opera, brave, chromium, vivaldi

Note: Make sure you're logged into YouTube in the browser you're extracting cookies from.

Option 2: Export Cookies to File

If automatic extraction doesn't work, you can manually export cookies:

  1. Install a cookie export extension:

  2. Export cookies:

    • Navigate to YouTube while logged in
    • Click the extension icon
    • Export cookies as cookies.txt in Netscape format
    • Save to your project directory
  3. Use the cookies file:

python youtube_utils.py --cookies cookies.txt

Interactive Mode Cookie Setup

When using interactive mode, you'll be prompted:

Use cookies to bypass bot detection? [y/N]: y

Cookie options:
  1. Extract from browser (chrome, firefox, edge, safari, etc.)
  2. Use cookies.txt file
Choose option [1]: 1

Available browsers:
  chrome, firefox, edge, safari, opera, brave, chromium
Enter browser name [chrome]: chrome

Troubleshooting Cookie Issues

  • "Could not find browser" error: Make sure the browser is installed and yt-dlp can access it
  • "Could not extract cookies" error: Try closing the browser and running the scraper again
  • Still getting bot detection: Try logging out and back into YouTube, then extract cookies again
  • Permission errors: On Linux/macOS, you may need to close the browser before extracting cookies

Examples

Download in highest quality with AV1 codec

python youtube_utils.py -q 8k -c av1 -a 320

Download 4K videos with high quality audio through Tor

python youtube_utils.py -q 4k -a 320 --tor

Download to specific directory with custom settings

python youtube_utils.py -o ~/Videos/YouTube -q 1440p -c h265 -a 256

Bypass bot detection with browser cookies

# Using Chrome cookies
python youtube_utils.py --cookies-from-browser chrome

# Using Firefox cookies
python youtube_utils.py --cookies-from-browser firefox

# Using cookies.txt file
python youtube_utils.py --cookies cookies.txt

Parallel downloads

# Download 3 videos simultaneously
python youtube_utils.py -p 3 -i downloadqueue.txt

# Download in 4K with 5 parallel jobs
python youtube_utils.py -q 4k -p 5 -i downloadqueue.txt
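
Parallel downloads of this kind are commonly built on a thread pool, because the work is network-bound. The sketch below is a generic pattern, not the scraper's implementation; run_parallel and download_one are hypothetical names, and the 1-10 limit mirrors the range documented for -p.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def run_parallel(urls: list[str], download_one: Callable[[str], None],
                 workers: int = 1) -> dict:
    """Run download_one for each URL; map each URL to None or its exception."""
    if not 1 <= workers <= 10:
        raise ValueError("workers must be between 1 and 10")
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(download_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                future.result()
                results[url] = None   # finished without error
            except Exception as exc:  # one failure must not stop the batch
                results[url] = exc
    return results


def fake_download(url: str) -> None:
    """Stand-in for a real download; fails for one URL to show error capture."""
    if url.endswith("bad"):
        raise RuntimeError("simulated failure")


outcome = run_parallel(["https://example.com/ok", "https://example.com/bad"],
                       fake_download, workers=3)
print(outcome)
```

Collecting exceptions per URL lets the batch finish and report failures at the end instead of aborting on the first error.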

Re-encoding examples

# Convert all MP4 files to H.265
python youtube_utils.py --reencode --reencode-dir ./downloads --reencode-codec h265

# Compress videos to 50MB each with 3 parallel jobs
python youtube_utils.py --reencode --reencode-dir ./videos --reencode-size 50 -p 3

# Downscale to 720p and convert to AV1
python youtube_utils.py --reencode --reencode-dir ./videos --reencode-codec av1 --reencode-scale 720 --reencode-quality high

# Fast re-encoding with ultrafast preset
python youtube_utils.py --reencode --reencode-dir ./videos --reencode-preset ultrafast -p 4

Crop/Rescale examples

# Convert all videos to 16:9 aspect ratio
python youtube_utils.py --crop --crop-dir ./downloads --crop-aspect-ratio 16:9

# Crop videos to 4:3 (remove edges, no letterbox)
python youtube_utils.py --crop --crop-dir ./videos --crop-aspect-ratio 4:3 --crop-mode crop

# Rescale to 1280x720 with 4 parallel jobs
python youtube_utils.py --crop --crop-dir ./videos --crop-width 1280 --crop-height 720 -p 4

# Convert to square format for Instagram
python youtube_utils.py --crop --crop-dir ./videos --crop-aspect-ratio 1:1 --crop-mode crop

Troubleshooting

"FFmpeg not found" or "You have requested merging of multiple formats but ffmpeg is not installed"

Install FFmpeg as described in the Installation section.

"Sign in to confirm you're not a bot"

YouTube is blocking the download. Use cookies to bypass this:

python youtube_utils.py --cookies-from-browser chrome

See the "Bypassing Bot Detection with Cookies" section for more details.

"No formats found"

The video might not be available in your requested quality/codec. Try a different quality preset or codec.

Slow downloads

  • Try using a different video codec (h264 is usually fastest to encode)
  • Lower the quality setting
  • Check your internet connection
  • Disable proxy if not needed

Proxy connection failed

  • Ensure your proxy/Tor is running
  • Check the proxy URL format
  • Verify firewall settings

License

CC BY-NC-SA 4.0 License - See LICENSE and NOTICE files for details. Non-commercial use only.

Credits

Built with:

  • yt-dlp - YouTube downloader
  • rich - Beautiful terminal formatting

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
