# YouTube Channel Video Tracker: Usage Examples

## Introduction

This notebook is a guide to the YouTube Channel Video Tracker tool. We'll cover each feature in turn, explaining how it works and why it's useful.

## Setup and Basic Concepts

First, let's import our module and set up the environment:

In [5]:
import get_infoYT as gt
from dotenv import load_dotenv

load_dotenv()  # Load API key from .env file

True

### Notes on YouTube API

This program uses the YouTube Data API (v3). To run it, you must set up a project and obtain an API key. For details, see https://developers.google.com/youtube/registering_an_application

Once you have an API key, store it in an environment variable named `YOUTUBE_API_KEY`, for example in the `.env` file that `load_dotenv()` reads. Here's a quick setup example:

In [6]:
import os
from googleapiclient.discovery import build

DEVELOPER_KEY = os.getenv('YOUTUBE_API_KEY')
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, developerKey=DEVELOPER_KEY)
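
If the variable is missing, `os.getenv` returns `None`, and the problem may only surface later, when a request is rejected. The sketch below shows one way to fail early with a clear message. The helper name `require_api_key` is hypothetical and is not part of `get_infoYT`.

```python
import os


def require_api_key(env_name="YOUTUBE_API_KEY", environ=None):
    """Return the API key stored under env_name, or raise a clear error.

    Illustrative helper only; it is not part of get_infoYT.
    """
    # Allow a custom mapping to be passed in, which makes the check easy to test.
    environ = os.environ if environ is None else environ
    key = environ.get(env_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_name} is not set; add it to your .env file or shell environment."
        )
    return key
```

The returned value could then be passed as `developerKey` in the `build(...)` call above.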

IMPORTANT: API usage is capped by a daily quota, and every request consumes part of it. Keep track of how many calls you make so that you do not run out mid-session.
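
To make the quota concrete, here is a rough budgeting sketch. The unit costs and the 10,000-unit default allocation reflect Google's published quota table at the time of writing and should be treated as assumptions to verify. Which endpoints `get_infoYT` calls internally is not shown in this notebook, so the planned call counts are purely illustrative.

```python
# Unit costs per call, as documented by Google at the time of writing.
# Treat these as assumptions and confirm them in the official quota calculator.
QUOTA_COST = {
    "search.list": 100,
    "videos.list": 1,
    "channels.list": 1,
    "playlistItems.list": 1,
}
DEFAULT_DAILY_QUOTA = 10_000  # default allocation for a new project (assumption)


def estimate_quota(calls):
    """Sum the quota units for a mapping of API method name -> number of calls."""
    return sum(QUOTA_COST[method] * count for method, count in calls.items())


# Illustrative plan: 20 searches plus 50 video-detail lookups.
planned = {"search.list": 20, "videos.list": 50}
used = estimate_quota(planned)
print(f"Planned usage: {used} of {DEFAULT_DAILY_QUOTA} units")
```

The point of the sketch is that search requests dominate the budget, so a handful of them can cost more than hundreds of plain list calls.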

## Creating a Channel Instance

Let's start by creating an instance for a YouTube channel:

In [8]:
url1 = 'https://www.youtube.com/@WeightliftingHouse'
channel1 = gt.InfoYT(url1)

The file WeightliftingHouse_videos.json doesn't exist yet in the Channel_Videos/ folder. 
There is no history record for this channel.

INFO ABOUT THE CHANNEL:
The username for this channel is: Weightlifting House.
The channel id is: UCd5WxLFvKjEbJl5xyUqyHSw
The number of videos published by this channel is: 1061.


When we create an instance, it prints information about existing records and channel details. This is useful for quickly understanding the channel's size and our current data status.
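
The URLs passed to `InfoYT` in this notebook come in a few shapes: a channel handle, a handle followed by `/videos`, and a link to a single video. How `InfoYT` parses them is not shown here. The sketch below is only one plausible way to tell these shapes apart, and `classify_youtube_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlparse


def classify_youtube_url(url):
    """Return ("handle", name) or ("video", video_id) for a YouTube URL.

    Hypothetical helper for illustration; InfoYT's own parsing may differ.
    """
    parsed = urlparse(url)
    segments = [segment for segment in parsed.path.split("/") if segment]
    # Channel pages addressed by handle look like /@Name or /@Name/videos.
    if segments and segments[0].startswith("@"):
        return ("handle", segments[0][1:])
    # Video pages look like /watch?v=<id>&...
    if segments and segments[0] == "watch":
        video_ids = parse_qs(parsed.query).get("v", [])
        if video_ids:
            return ("video", video_ids[0])
    raise ValueError(f"Unrecognized YouTube URL: {url}")
```

A video ID would still need one API lookup to find the channel that owns it, whereas a handle identifies the channel directly.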

## Retrieving Recent Videos

We can fetch the most recent videos from the channel:

In [9]:
channel1.get_recent_videos(max_result=3)

[{'video_id': 'F-6535Aeh_4',
  'title': 'Make this lift &amp; go to the Olympics! #LuoShifang #weightlifting #olympicweightlifting',
  'published_at': '2024-07-17T13:34:05Z',
  'description': '',
  'timestamps': None,
  'duration': 'PT29S',
  'tags': None},
 {'video_id': '52bB3Q1s1oA',
  'title': 'How This Guy Saved Chinese Weightlifting | Liu Huanhua aka Gigachad #weightlifting',
  'published_at': '2024-07-16T15:25:48Z',
  'description': '',
  'timestamps': None,
  'duration': 'PT59S',
  'tags': None},
 {'video_id': 'u51_xd3GJA4',
  'title': 'How This Guy is Saving Chinese Weightlifting',
  'published_at': '2024-07-16T12:40:11Z',
  'description': "Liu Huanhua, aka Giga Chad, has become arguably China's best weightlifter on the men's team. After years of total dominance, China's squad has started missing out on Gold medals. \n\nIn Thailand, at their most important competition of the entire Olympic quad, 'only' two of the six took home gold medals, two more managed medals, and two fell 

This method is handy for quickly checking the latest content without processing the entire channel history.
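
Two fields in the output above usually need a little post-processing. `duration` is an ISO 8601 duration string such as `PT29S`, and titles can contain HTML entities such as `&amp;`. The sketch below handles both with the standard library. The function names are illustrative, and the duration parser only covers the day, hour, minute and second components.

```python
import html
import re

# Matches durations such as PT29S, PT1H34M11S or P1DT2H (days through seconds only).
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def duration_to_seconds(duration):
    """Convert an ISO 8601 duration such as 'PT1H34M11S' to whole seconds."""
    match = _DURATION_RE.match(duration)
    if match is None:
        raise ValueError(f"Unsupported duration: {duration!r}")
    # Components that are absent from the string come back as None; count them as 0.
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return (
        parts["days"] * 86400
        + parts["hours"] * 3600
        + parts["minutes"] * 60
        + parts["seconds"]
    )


def clean_title(title):
    """Decode HTML entities (for example '&amp;') in a title string."""
    return html.unescape(title)
```

With durations expressed in seconds, it becomes straightforward to separate short clips from full-length videos.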

## Working with Stored Data

Let's examine a channel with existing stored data:

In [10]:
url2 = 'https://www.youtube.com/@KeithGalli/videos'
channel2 = gt.InfoYT(url2)

print(f'The total number of videos already stored is: {len(channel2.all_videos)}.')
channel2.all_videos
channel2.get_videos_dataframe()

We already have history record for this channel in the file KeithGalli_videos.json.

INFO ABOUT THE CHANNEL:
The username for this channel is: Keith Galli.
The channel id is: UCq6XkhO5SZ66N04IcPbqNcw
The number of videos published by this channel is: 89.
The number of videos already retrieved is: 89
The oldest video was published on: 2016-12-27 21:12:20
The most recent video was published on: 2024-06-29 14:10:02
The total number of videos already stored is: 89.


Unnamed: 0,video_id,title,published_at,duration,description,tags,timestamps
0,2uvysYbKdjM,Complete Python Pandas Data Science Tutorial! ...,2024-06-29 14:10:02+00:00,PT1H34M11S,"Hey, what's up everyone? Welcome back to anoth...","[Keith Galli, python, programming, python 3, d...","{'0:00': '- Video Overview', '1:11': '- Gettin..."
1,rPrFkmZ5Lws,The Most Epic Data Science Tutorial Promo Video!,2024-06-22 14:21:55+00:00,PT1M9S,Mark your calendars! New pandas data science t...,"[Keith Galli, python, programming, python 3, d...",
2,DcI_AZqfZVc,Advanced Web Scraping Tutorial! (w/ Python Bea...,2024-06-08 13:57:23+00:00,PT42M43S,Get started w/ Bright Data + $15 free credit u...,"[Keith Galli, python, programming, python 3, d...","{'0:00': '- Intro & Overview', '1:30': '- Iden..."
3,oad9tVEsfI0,Real-World Dataset Cleaning with Python Pandas...,2024-04-20 08:10:35+00:00,PT2H2M26S,I'm prepping a dataset for an upcoming tutoria...,"[Keith Galli, python, programming, python 3, d...","{'0:00': '- Livestream Overview', '4:00': '- A..."
4,i7v2m-ebXB4,Solving 100 Python Pandas Problems! (from easy...,2024-04-13 17:43:22+00:00,PT5H20M18S,"In this tutorial, you'll gain hands-on experie...","[Keith Galli, python, programming, python 3, d...","{'0:00': '- Intro & Setup', '2:14': '- Problem..."
...,...,...,...,...,...,...,...
84,5O6f1GTLLeQ,Simple explanation of Asymptotic Notation!,2017-01-17 06:39:43+00:00,PT9M47S,In this video I give a brief introduction on A...,"[KeithGLearning, Keith Galli, MIT Student, MIT...",
85,m0h6XzKfulM,Simplest way to remember Complementary vs Supp...,2017-01-14 20:43:23+00:00,PT3M20S,Quick video on a trick to remember the differe...,"[KeithGLearning, Keith Galli, MIT Student, MIT...",
86,jMpbYpaKtao,How to win at Battleship almost every time!,2017-01-02 20:51:35+00:00,PT7M53S,"In this video, I walk you through the best way...","[KeithGLearning, Keith Galli, MIT Student, MIT...",
87,sJgLi32jMo0,How to win at Othello almost every time!,2017-01-02 19:14:52+00:00,PT8M59S,This video gives you strategy tips for how to ...,"[MIT Student, Keith Galli, KeithGLearning, Bes...",


The `all_videos` attribute contains our stored data, while `get_videos_dataframe()` presents it in a pandas DataFrame for easy analysis.
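
The `timestamps` column maps chapter start times such as `'0:00'` or `'1:11'` to chapter labels. As a sketch of the kind of analysis this enables, the helper below converts those keys to seconds. It assumes each value is a Python dict, as the output above suggests, and treats anything else (such as a missing value) as "no chapters". The function names and the sample labels in the comments are illustrative.

```python
def timestamp_to_seconds(stamp):
    """Convert 'M:SS' or 'H:MM:SS' to a number of seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def chapters_in_seconds(timestamps):
    """Return (start_second, label) pairs sorted by start time.

    Non-dict input (None, NaN, ...) yields an empty list.
    """
    if not isinstance(timestamps, dict):
        return []
    return sorted(
        # Labels in the stored data start with '- ', so strip that prefix.
        (timestamp_to_seconds(stamp), label.lstrip("- ").strip())
        for stamp, label in timestamps.items()
    )


# Illustrative input in the same shape as the stored data.
print(chapters_in_seconds({"1:30": "- Setup", "0:00": "- Intro"}))
```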

## Updating and Retrieving More Data

There are two methods for updating our data.

The first, `update_videos`, checks the most recent uploads for videos we have not stored yet:

In [11]:
channel2.update_videos(max_result=10)

I've found 0 new videos to be added!


This is useful for regularly updating our dataset with new videos.
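
Conceptually, an update like this has to compare freshly fetched records against the stored ones and keep only those with an unseen `video_id`. The sketch below illustrates that idea on plain lists of dicts. It is not the implementation of `update_videos`, and `merge_new_videos` is a hypothetical name.

```python
def merge_new_videos(stored, fetched):
    """Return (merged, added), where added holds fetched records with a new video_id.

    Both inputs are assumed to be ordered newest-first, so the new records
    are placed in front of the stored ones.
    """
    known_ids = {video["video_id"] for video in stored}
    added = []
    for video in fetched:
        if video["video_id"] not in known_ids:
            known_ids.add(video["video_id"])  # also guards against duplicates in fetched
            added.append(video)
    return added + stored, added


stored = [{"video_id": "b"}, {"video_id": "c"}]
fetched = [{"video_id": "a"}, {"video_id": "b"}]
merged, added = merge_new_videos(stored, fetched)
print(f"Found {len(added)} new videos to add")
```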

The second, `get_all_videos`, retrieves older videos that are not stored yet:

In [12]:
channel2.get_all_videos(max_videos=100)

All the videos in the channel have already been retrieved!


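This method helps fill in historical data, which is especially useful for newly tracked channels.

Let's see an example with a channel that has incomplete data. Note that the URL below points to a single video rather than a channel page; as the output shows, `InfoYT` resolves it to the channel that published the video: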

In [17]:
url3 = 'https://www.youtube.com/watch?v=XJTMQtE-MIo&t=1s'
channel3 = gt.InfoYT(url3)

We already have history record for this channel in the file LexFridman_videos.json.

INFO ABOUT THE CHANNEL:
The username for this channel is: Lex Fridman.
The channel id is: UCSHZKyawb77ixDdsGog4iWA
The number of videos published by this channel is: 802.
The number of videos already retrieved is: 302
The oldest video was published on: 2020-09-05 17:31:48
The most recent video was published on: 2024-06-19 20:42:46


In [18]:
print(len(channel3.all_videos))
channel3.get_all_videos(max_videos=100)
print(len(channel3.all_videos))

302
The number of videos already retrieved is 302. 
This download will retrieve videos published before 2020-09-05 17:31:48.
This download has retrieved 104 videos.
405


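This demonstrates how we can expand our dataset for channels with partial information: a single call grows the stored list from 302 to 405 videos, working backwards from the oldest stored publication date.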
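
The output above shows that the download resumes from the oldest stored publication date. The sketch below shows how such a cutoff could be derived from stored records and formatted as an RFC 3339 timestamp, the format accepted by the `publishedBefore` parameter of `search.list`. Whether `get_all_videos` uses that parameter is an assumption, as is the `published_at` string format, which is taken from the `get_recent_videos` output.

```python
from datetime import datetime, timezone

API_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def oldest_cutoff(videos):
    """Return the earliest 'published_at' among the records as an RFC 3339 UTC string.

    Raises ValueError if videos is empty.
    """
    oldest = min(
        datetime.strptime(video["published_at"], API_TIME_FORMAT).replace(
            tzinfo=timezone.utc
        )
        for video in videos
    )
    return oldest.strftime(API_TIME_FORMAT)


# Records in the shape returned by get_recent_videos (other fields omitted).
sample = [
    {"video_id": "F-6535Aeh_4", "published_at": "2024-07-17T13:34:05Z"},
    {"video_id": "u51_xd3GJA4", "published_at": "2024-07-16T12:40:11Z"},
]
print(oldest_cutoff(sample))
```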

## Saving Updated Data

After retrieving new data, save it to storage:

In [None]:
#channel3.save_to_json()

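This function overwrites the previous file, so after each save the stored file reflects the most recently retrieved data. The call is commented out in the cell above; uncomment it when you actually want to write the file.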
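
Because the file is overwritten, an interruption in the middle of a write could leave a damaged history file. How `save_to_json` writes the file is not shown in this notebook. The sketch below illustrates one common safeguard, writing to a temporary file and then swapping it into place, using a hypothetical helper named `save_json_atomically`.

```python
import json
import os
import tempfile


def save_json_atomically(records, path):
    """Write records to path via a temporary file, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Create the temporary file next to the target so both share a filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(records, handle, ensure_ascii=False, indent=2)
        # Replace the target in a single step; the old file stays intact until then.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this approach the previous file is only replaced once the new one has been written completely.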

## Handling Missing Videos

To catch videos that might have been missed:

In [None]:
channel3.run_reverse_order(max_videos=100)

This function helps ensure comprehensive coverage of a channel's content by searching for potentially missed videos.
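
The internals of `run_reverse_order` are not shown here. Whatever the retrieval strategy, the final check amounts to comparing the IDs a channel reports with the IDs already stored. A minimal sketch of that comparison, using a hypothetical `find_missing_ids` helper:

```python
def find_missing_ids(channel_ids, stored):
    """Return the IDs from channel_ids that have no stored record, in their original order."""
    stored_ids = {video["video_id"] for video in stored}
    return [video_id for video_id in channel_ids if video_id not in stored_ids]


# Illustrative data: 'b' is present on the channel but absent from the stored records.
print(find_missing_ids(["a", "b", "c"], [{"video_id": "a"}, {"video_id": "c"}]))
```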

## Sample API Request

Here's an example of using the YouTube API directly:

In [15]:
# Sample API request: search for a specific keyword
query = 'weightlifting'
request = youtube.search().list(
    part="snippet",
    q=query,
    # type="channel",  # uncomment to restrict results to channels
    maxResults=5,
)
response = request.execute()
response

{'kind': 'youtube#searchListResponse',
 'etag': 'NdHjoxZ14Tqe1V8z7PSCjIatG_0',
 'nextPageToken': 'CAUQAA',
 'regionCode': 'IT',
 'pageInfo': {'totalResults': 1000000, 'resultsPerPage': 5},
 'items': [{'kind': 'youtube#searchResult',
   'etag': 'NSIR0y0cF51BhpGOZOsQrm3NpW8',
   'id': {'kind': 'youtube#video', 'videoId': 'u51_xd3GJA4'},
   'snippet': {'publishedAt': '2024-07-16T12:40:11Z',
    'channelId': 'UCd5WxLFvKjEbJl5xyUqyHSw',
    'title': 'How This Guy is Saving Chinese Weightlifting',
    'description': "Liu Huanhua, aka Giga Chad, has become arguably China's best weightlifter on the men's team. After years of total dominance, ...",
    'thumbnails': {'default': {'url': 'https://i.ytimg.com/vi/u51_xd3GJA4/default.jpg',
      'width': 120,
      'height': 90},
     'medium': {'url': 'https://i.ytimg.com/vi/u51_xd3GJA4/mqdefault.jpg',
      'width': 320,
      'height': 180},
     'high': {'url': 'https://i.ytimg.com/vi/u51_xd3GJA4/hqdefault.jpg',
      'width': 480,
      'height

This demonstrates how to perform custom queries, which can be useful for more specific data needs beyond channel analysis.
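
The raw response is deeply nested, so in practice you will usually pull out just the fields you need. The sketch below reduces a search response to `(kind, id, title)` tuples. `summarize_search_items` is an illustrative helper, and the sample dict is a trimmed copy of the response shown above.

```python
def summarize_search_items(response):
    """Return (kind, resource_id, title) tuples from a search.list response dict."""
    rows = []
    for item in response.get("items", []):
        id_block = item.get("id", {})
        kind = id_block.get("kind", "")
        # The ID field name depends on the result kind: videoId, channelId or playlistId.
        resource_id = (
            id_block.get("videoId")
            or id_block.get("channelId")
            or id_block.get("playlistId")
        )
        rows.append((kind, resource_id, item.get("snippet", {}).get("title")))
    return rows


# Trimmed copy of the response above, keeping only the fields used here.
sample_response = {
    "nextPageToken": "CAUQAA",
    "items": [
        {
            "id": {"kind": "youtube#video", "videoId": "u51_xd3GJA4"},
            "snippet": {"title": "How This Guy is Saving Chinese Weightlifting"},
        }
    ],
}
print(summarize_search_items(sample_response))
```

To fetch the next page of results, pass the response's `nextPageToken` value as `pageToken` in a new `search().list(...)` call. Each additional page is a separate request and counts against the quota.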