# YouTube Video Scraper – First Drop

This notebook is the initial draft of my YouTube scraping project. The goal is to extract video-level metadata from a specific channel using Python automation tools.

### What this notebook currently does:
- Opens the **GeeksforGeeks YouTube channel** using Selenium
- Loads the channel's **Videos** section
- Parses the HTML content using BeautifulSoup
- Extracts:
  - **Video titles**
  - **Video links (relative URLs)**
  - **Text content of individual video blocks**
  
The early cells experiment with a few entries before attempting a full scrape. This is a working foundation; the structure will be extended to:
- Loop over all visible videos
- Save data in a structured format (CSV/JSON)
- Possibly include extra fields like upload date, duration, views, and likes

This is the **first drop**, the starting point for a full-fledged YouTube data analysis tool.


In [4]:
import pandas as pd
import numpy as np
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
import chromedriver_binary  # importing this package adds its bundled chromedriver to PATH

In [6]:
chromedriver_binary.chromedriver_filename

'C:\\Users\\Infinix\\anaconda3\\Lib\\site-packages\\chromedriver_binary\\chromedriver.exe'

In [8]:
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options

options = Options()
service = Service()  # no explicit driver path; it is resolved from PATH
browser = webdriver.Chrome(service=service, options=options)
# Open the channel's Videos tab
browser.get('https://www.youtube.com/@GeeksforGeeksVideos/videos')
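
The Videos tab loads more entries only as the page is scrolled, so collecting the whole channel means scrolling to the bottom before reading `page_source`. A minimal sketch of one way to automate that is below. The helper name `scroll_until_stable`, the pause length, and the round limit are assumptions for illustration and are not part of this notebook; the only Selenium call it relies on is `execute_script`.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=500):
    """Scroll to the bottom repeatedly until the page height stops changing.

    `driver` is any object exposing Selenium's `execute_script` method.
    Returns the number of scrolls that made the page grow.
    """
    height_js = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(height_js)
    grew = 0
    for _ in range(max_rounds):
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);"
        )
        time.sleep(pause)  # give the page time to load the next batch
        new_height = driver.execute_script(height_js)
        if new_height == last_height:
            break  # nothing new was loaded; assume the end of the list
        last_height = new_height
        grew += 1
    return grew


# Hypothetical use in this notebook, after browser.get(...):
# scroll_until_stable(browser)
```

A fixed pause is the simplest choice but can stop early on a slow connection; a longer pause or a few retries before giving up would make it more robust.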


In [10]:
# Parse the DOM as the browser has rendered it so far
soup = BeautifulSoup(browser.page_source, 'html.parser')

In [12]:
# Container element that holds the grid of video items
sp = soup.find('ytd-rich-grid-renderer')

In [14]:
sp.find_all('ytd-rich-item-renderer')[6].text

'\n\n7:34\n    7:34\n  Now playing\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n7:34\n7:34\n    7:34\n  Now playing\n\n\n\n\n\n\n\n\n\n\n\nCourse Walkthrough - How to Utilize the Free Courses in Nation Skillup | GeeksforGeeks\n\n\n\n\n\n\n\n\n\n\n\n  Verified\n\n\n\n•\n\n\n\n\n•\n9.7K views\n11 days ago\n\n\n\n\n\n\n\n\n'
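
The raw `.text` of a video block is mostly whitespace with repeated fragments. As a rough sketch, the helper below (the name `clean_block_text` is made up for this example) keeps the non-empty lines and drops repeats while preserving order. The sample string is a shortened version of the output above, not the full block.

```python
def clean_block_text(raw):
    """Return the non-empty lines of a block, stripped and de-duplicated in order."""
    lines = (line.strip() for line in raw.splitlines())
    # dict.fromkeys keeps the first occurrence of each line and preserves order
    return list(dict.fromkeys(line for line in lines if line))


# Shortened sample of the block text shown above
raw = "\n\n7:34\n    7:34\n  Now playing\n\n9.7K views\n11 days ago\n"
print(clean_block_text(raw))
# → ['7:34', 'Now playing', '9.7K views', '11 days ago']
```

De-duplicating this way would also merge two fields that happen to contain the same text, so it is only suitable for quick inspection, not for extracting fields.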

In [16]:
sp.find_all('a',class_="yt-simple-endpoint focus-on-expand style-scope ytd-rich-grid-media")[1]

<a aria-label="Nation SkillUp FAQ | FAQ All Answered | Utilize Nation Skillup to Learn &amp; Win Rewards! 8 minutes, 6 seconds" class="yt-simple-endpoint focus-on-expand style-scope ytd-rich-grid-media" href="/watch?v=pFVxy3f9kAE" id="video-title-link" title="Nation SkillUp FAQ | FAQ All Answered | Utilize Nation Skillup to Learn &amp; Win Rewards!"><yt-formatted-string class="style-scope ytd-rich-grid-media" id="video-title">Nation SkillUp FAQ | FAQ All Answered | Utilize Nation Skillup to Learn &amp; Win Rewards!</yt-formatted-string></a>

In [18]:
link_class = "yt-simple-endpoint focus-on-expand style-scope ytd-rich-grid-media"
meta_class = "inline-metadata-item style-scope ytd-video-meta-block"
first_link = sp.find('a', class_=link_class)   # first video anchor in the grid
meta = sp.find_all('span', class_=meta_class)  # metadata spans in page order
title = first_link.text
video_link = first_link.get('href')
views = meta[0].text
date_time = meta[1].text
thumbnail_link = sp.find('img').get('src').split('?')[0]  # drop the query string

In [107]:
title

'NIMCET 2026 | How To Prepare NIMCET 2026? | Big Update For NIMCET 2026 Batch'

In [111]:
video_link

'/watch?v=Ho_PIAAVLlE'
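
The `href` values are relative, and some of them carry extra query parameters such as `pp`. A small sketch of turning them into clean absolute URLs with the standard library follows. The function name `absolute_video_url` and the choice to keep only the `v` parameter are assumptions made for this example.

```python
from urllib.parse import urljoin, urlparse, parse_qs

BASE_URL = "https://www.youtube.com"


def absolute_video_url(href):
    """Turn a relative /watch link into a full URL that keeps only the `v` parameter."""
    full = urljoin(BASE_URL, href)
    video_id = parse_qs(urlparse(full).query).get("v", [None])[0]
    if video_id is None:
        return full  # not a /watch?v=... link; return it unchanged
    return f"{BASE_URL}/watch?v={video_id}"


print(absolute_video_url('/watch?v=Ho_PIAAVLlE'))
# → https://www.youtube.com/watch?v=Ho_PIAAVLlE
```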

In [113]:
views

'1.3K views'
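
View counts arrive as display strings, which cannot be sorted or summed directly. The sketch below converts them to integers; `parse_views` is a hypothetical helper, and the suffix table assumes the English `K`/`M`/`B` abbreviations seen in this scrape. Because the page shows rounded figures, the result is an approximation of the real count.

```python
import re
from decimal import Decimal

_SUFFIXES = {"": 1, "K": 1_000, "M": 1_000_000, "B": 1_000_000_000}
_VIEWS_RE = re.compile(
    r"^\s*(\d+(?:[.,]\d+)*)\s*([KMB]?)\s+views?\s*$", re.IGNORECASE
)


def parse_views(text):
    """Convert a string like '1.3K views' to an int, or return None if it does not match."""
    match = _VIEWS_RE.match(text) if isinstance(text, str) else None
    if match is None:
        return None
    number, suffix = match.groups()
    # Decimal avoids float rounding surprises when scaling values like 1.3
    return int(Decimal(number.replace(",", "")) * _SUFFIXES[suffix.upper()])


print(parse_views('1.3K views'))   # → 1300
print(parse_views('492K views'))   # → 492000
```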

In [115]:
date_time

'1 day ago'
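
The upload time is a relative phrase rather than a date. One possible way to make it comparable is to convert it into an approximate `timedelta`, as sketched below. The helper name `parse_age` is hypothetical, and the month and year lengths (30 and 365 days) are deliberate approximations, so the result is only as precise as the phrase itself.

```python
import re
from datetime import timedelta

# Approximate length of each unit in days (months and years are rounded)
_UNIT_DAYS = {
    "second": 1 / 86400, "minute": 1 / 1440, "hour": 1 / 24,
    "day": 1, "week": 7, "month": 30, "year": 365,
}
_AGE_RE = re.compile(
    r"(\d+)\s+(second|minute|hour|day|week|month|year)s?\s+ago", re.IGNORECASE
)


def parse_age(text):
    """Convert a phrase like '9 years ago' to an approximate timedelta, or None."""
    match = _AGE_RE.search(text) if isinstance(text, str) else None
    if match is None:
        return None
    amount, unit = int(match.group(1)), match.group(2).lower()
    return timedelta(days=amount * _UNIT_DAYS[unit])


print(parse_age('1 day ago'))    # → 1 day, 0:00:00
```

Subtracting the result from the scrape time gives an estimated upload date.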

In [207]:
link_class = "yt-simple-endpoint focus-on-expand style-scope ytd-rich-grid-media"
meta_class = "inline-metadata-item style-scope ytd-video-meta-block"
data = []

# Use `item` rather than `sp` so the grid container from the earlier cell is not overwritten
for item in soup.find_all('ytd-rich-item-renderer'):
    link = item.find('a', class_=link_class)
    if link is None:
        continue  # grid item without a video title link; skip it
    title = link.text
    video_link = link.get('href')

    # Missing metadata becomes NaN instead of being hidden by a bare `except`
    meta = item.find_all('span', class_=meta_class)
    views = meta[0].text if len(meta) > 0 else np.nan
    date_time = meta[1].text if len(meta) > 1 else np.nan

    # An <img> may be absent or may lack a src attribute
    img = item.find('img')
    src = img.get('src') if img is not None else None
    thumbnail_link = src.split('?')[0] if src is not None else np.nan

    data.append([title, views, date_time, video_link, thumbnail_link])

    

In [209]:
len(data)

2047

In [213]:
df = pd.DataFrame(data, columns=['title', 'views', 'date_time', 'video_link', 'thumbnail_link'])


In [215]:
df

,title,views,date_time,video_link,thumbnail_link
0,NIMCET 2026 | How To Prepare NIMCET 2026? | Bi...,1.3K views,1 day ago,/watch?v=Ho_PIAAVLlE,https://i.ytimg.com/vi/Ho_PIAAVLlE/hqdefault.jpg
1,NIMCET Roadmap | NIMCET 2026 Preparation | NIM...,2.3K views,1 day ago,/watch?v=cgeb6Gojoho,https://i.ytimg.com/vi/cgeb6Gojoho/hqdefault.jpg
2,AI Engineer Roadmap – How to Learn AI in 2025 ...,15K views,2 days ago,/watch?v=JagRXz_mTU8,https://i.ytimg.com/vi/JagRXz_mTU8/hqdefault.jpg
3,How to Score 9+ CGPA in College 🔥 Complete Roa...,10K views,3 days ago,/watch?v=UttzVuaF-f0,https://i.ytimg.com/vi/UttzVuaF-f0/hqdefault.jpg
4,Course Walkthrough - How to Utilize the Free C...,9K views,8 days ago,/watch?v=Dl-eEZlv_pk&pp=0gcJCccJAYcqIYzv,https://i.ytimg.com/vi/Dl-eEZlv_pk/hqdefault.jpg
...,...,...,...,...,...
2042,Length of shortest chain to reach a target wor...,44K views,9 years ago,/watch?v=6pIC20wCm20,https://i.ytimg.com/vi/6pIC20wCm20/hqdefault.jpg
2043,Binary Search | GeeksQuiz,192K views,9 years ago,/watch?v=T2sFYY-fT5o,https://i.ytimg.com/vi/T2sFYY-fT5o/hqdefault.jpg
2044,Number of Triangles in an Undirected Graph | G...,19K views,9 years ago,/watch?v=ChdNz1Ui1uc,https://i.ytimg.com/vi/ChdNz1Ui1uc/hqdefault.jpg
2045,Write a program to print all permutations of a...,492K views,9 years ago,/watch?v=AfxHGNRtFac,https://i.ytimg.com/vi/AfxHGNRtFac/hqdefault.jpg


In [217]:
df.isnull().sum()

title               0
views               0
date_time           0
video_link          0
thumbnail_link    108
dtype: int64
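
The only gaps are in `thumbnail_link`. In the rows shown above, every thumbnail URL has the form `https://i.ytimg.com/vi/<video id>/hqdefault.jpg`, so the missing values could plausibly be rebuilt from `video_link`. The sketch below does that under the assumption that the pattern holds for every video; it is inferred from this scrape, not from any official documentation, and `thumbnail_from_link` is a hypothetical helper name.

```python
from urllib.parse import urlparse, parse_qs


def thumbnail_from_link(video_link):
    """Build a thumbnail URL from a /watch?v=<id> link, or return None."""
    video_id = parse_qs(urlparse(video_link).query).get("v", [None])[0]
    if video_id is None:
        return None
    # URL pattern inferred from the scraped rows (an assumption)
    return f"https://i.ytimg.com/vi/{video_id}/hqdefault.jpg"


# Hypothetical use on the DataFrame built above:
# missing = df['thumbnail_link'].isna()
# df.loc[missing, 'thumbnail_link'] = df.loc[missing, 'video_link'].map(thumbnail_from_link)
```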

In [219]:
df.to_csv('data.csv', index=False)
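
The introduction mentions JSON as a second output format. A minimal standard-library sketch of writing the same rows as JSON is below. The helper names `rows_to_records` and `save_json` and the file name `data.json` are assumptions for this example. NaN values are replaced with `None` because NaN is not valid JSON.

```python
import json

COLUMNS = ['title', 'views', 'date_time', 'video_link', 'thumbnail_link']


def rows_to_records(rows, columns=COLUMNS):
    """Convert row lists into dicts, mapping float NaN to None."""
    records = []
    for row in rows:
        record = {}
        for key, value in zip(columns, row):
            # NaN is the only float that is not equal to itself
            is_nan = isinstance(value, float) and value != value
            record[key] = None if is_nan else value
        records.append(record)
    return records


def save_json(rows, path, columns=COLUMNS):
    """Write the rows to `path` as a JSON array of objects."""
    with open(path, 'w', encoding='utf-8') as fh:
        json.dump(rows_to_records(rows, columns), fh, ensure_ascii=False, indent=2)


# Hypothetical use with the `data` list built in the loop above:
# save_json(data, 'data.json')
```

`ensure_ascii=False` keeps non-ASCII characters in titles, such as emoji, readable in the output file.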
