


Adamhunter108/youtube_scraper


youtube_scraper

About

  • The backend for a proof-of-concept internal tool to find and monitor copyrighted content on YouTube.

  • This is a Python Flask application that searches the YouTube Data API, filters specific channels out of the results, and saves the data to a live PostgreSQL database.

  • The data includes:

    Channel Title, Channel ID, Video ID, Description, Thumbnail URL, and Publish Time

  • The app is deployed continuously to Heroku and the PostgreSQL database is hosted on Supabase.
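
As a rough illustration of the filtering step and the saved fields, the sketch below flattens one item of a YouTube Data API `search.list` response into a record and drops excluded channels. The function names `to_record` and `filter_excluded` are hypothetical and not taken from this repository. The case-insensitive match on channel title and the fallback from `publishTime` to `publishedAt` are assumptions.

```python
def to_record(item):
    """Flatten one search.list result item into the fields listed above."""
    snippet = item.get("snippet", {})
    thumbnails = snippet.get("thumbnails", {})
    return {
        "channel_title": snippet.get("channelTitle"),
        "channel_id": snippet.get("channelId"),
        "video_id": item.get("id", {}).get("videoId"),
        "description": snippet.get("description"),
        "thumbnail_url": thumbnails.get("default", {}).get("url"),
        # Assumption: prefer publishTime, fall back to publishedAt if absent.
        "publish_time": snippet.get("publishTime") or snippet.get("publishedAt"),
    }


def filter_excluded(records, excluded_channels):
    """Drop records whose channel title is excluded (assumed case-insensitive)."""
    blocked = {name.strip().lower() for name in excluded_channels if name.strip()}
    return [r for r in records if (r["channel_title"] or "").lower() not in blocked]
```

Each record returned by `to_record` maps one-to-one onto the six fields above, so it could be passed to whatever database layer the app uses.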

Endpoint

⚠️ note: this endpoint is protected with an authorization Bearer token.

  • The live base URL:
https://flask-youtube-scraper-a55f990bea9f.herokuapp.com/
  • Local development URL:
localhost:5000/
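
For readers unfamiliar with Bearer token protection, here is a minimal sketch of how a server might validate the `Authorization` header. The helper `is_authorized` is hypothetical and only illustrates the general technique. It is not this repository's implementation, and where the expected token comes from (for example, a variable in `.env`) is an assumption.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True if the header has the form 'Bearer <token>' with the expected token."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest performs a constant-time comparison to limit timing leaks.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())
```

A request handler would call this with the incoming header value and reject the request (typically with a 401 status) when it returns `False`.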

Search

{{URL}}/api/search?query=<YOUR_SEARCH_QUERY>

Optionally exclude specific channels by name:

{{URL}}/api/search?query=<YOUR_SEARCH_QUERY>&exclude=ChannelNameToExclude,AnotherChannelToExclude

For example, to search for "Lil Wayne" but exclude his official channel by its channel name:

{{URL}}/api/search?query=lil%20wayne&exclude=LilWayneVEVO
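
The following sketch shows one way a client could build this request using only the Python standard library. The helper name `build_search_request` and the placeholder token are hypothetical. Using the GET method and the `http://` scheme for local development are assumptions based on the URL patterns above.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_search_request(base_url, query, token, exclude=None):
    """Build a request for /api/search with a Bearer token header."""
    params = {"query": query}
    if exclude:
        # Channel names are sent as one comma-separated value.
        params["exclude"] = ",".join(exclude)
    # quote_via=quote encodes spaces as %20; safe="," keeps the separator literal.
    query_string = urlencode(params, quote_via=quote, safe=",")
    url = f"{base_url.rstrip('/')}/api/search?{query_string}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


req = build_search_request(
    "http://localhost:5000/", "lil wayne", "YOUR_TOKEN", exclude=["LilWayneVEVO"]
)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. The format of the response body is not documented here, so it is left out of the sketch.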

Run Locally

‼️ Requirements:

  • Rename .env.example to .env and add your environment variables
$ # Create virtual environment
$ python3 -m venv venv
$ # Activate virtual environment
$ # If on Mac or Linux
$ source venv/bin/activate
$ # If on Windows
$ venv\Scripts\activate
$ # Install dependencies
$ pip install -r requirements.txt
$ # Export Flask app
$ export FLASK_APP=app.py
$ # Run the development server
$ flask run