## AssQ 22 Feb Python Web Scraping

In [None]:
Q1. Write a Python program to extract the video URLs of the first five videos.

In [None]:
To extract the URLs of the first five videos on a YouTube channel, 
fetch the channel page with the requests library and parse the returned HTML with BeautifulSoup. 
The program below takes that approach. As its output shows, it finds no links, most likely because YouTube builds 
the video grid with JavaScript in the browser, so the anchor tags it searches for are absent from the HTML that requests receives.

In [3]:
! pip install requests
! pip install beautifulsoup4



In [4]:
import requests
from bs4 import BeautifulSoup

def extract_video_urls(channel_url, num_videos):
    # Send a request to the channel URL
    response = requests.get(channel_url)
    if response.status_code != 200:
        print("Error: Unable to fetch the webpage.")
        return []

    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all video links on the page
    video_links = soup.find_all('a', {'class': 'yt-simple-endpoint'})

    # Extract video URLs from the links
    video_urls = []
    for link in video_links:
        # Use .get() so anchors without an href do not raise KeyError
        url = link.get('href', '')
        if url.startswith('/watch') and len(video_urls) < num_videos:
            video_url = f"https://www.youtube.com{url}"
            video_urls.append(video_url)

    return video_urls

if __name__ == "__main__":
    channel_url = "https://www.youtube.com/@PW-Foundation/videos"
    num_videos = 5

    video_urls = extract_video_urls(channel_url, num_videos)

    if video_urls:
        print("First 5 video URLs:")
        for url in video_urls:
            print(url)
    else:
        print("No video URLs found.")

No video URLs found.


In [None]:
Make sure the requests and beautifulsoup4 libraries are installed;
the pip commands in the install cell above take care of that.

Please note that web scraping can violate the terms of service of some websites, 
so always check that you are allowed to scrape the content you are interested in.
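
One possible reason for the empty result above is that the video grid is not present as anchor tags in the HTML that requests downloads. At the time of writing, YouTube pages appear to ship the grid data as a JSON object assigned to a JavaScript variable named `ytInitialData`; this is an undocumented detail and may change. The sketch below uses only the standard library and a tiny hand-written HTML sample (not real YouTube output) to show the idea: locate the marker, decode the JSON that follows it, and walk the result for `videoRenderer` entries. The helper names, the marker string, the key names, and the sample are all illustrative assumptions.

```python
import json


def extract_embedded_json(html, marker="var ytInitialData = "):
    """Return the JSON value that follows `marker` in `html`, or None."""
    start = html.find(marker)
    if start == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores whatever follows it
        data, _ = json.JSONDecoder().raw_decode(html, start + len(marker))
    except json.JSONDecodeError:
        return None
    return data


def find_values(node, key):
    """Yield every value stored under `key` anywhere in nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def first_video_urls(html, num_videos=5):
    """Build watch URLs from the first `num_videos` videoRenderer entries."""
    data = extract_embedded_json(html)
    if data is None:
        return []
    urls = []
    for video in find_values(data, "videoRenderer"):
        # "videoRenderer" and "videoId" are assumed key names
        video_id = video.get("videoId") if isinstance(video, dict) else None
        if video_id:
            urls.append(f"https://www.youtube.com/watch?v={video_id}")
        if len(urls) == num_videos:
            break
    return urls


# Hand-written sample standing in for response.text (an assumption, not real data)
SAMPLE_HTML = """<html><body><script>var ytInitialData = {"contents": [
  {"videoRenderer": {"videoId": "abc123"}},
  {"videoRenderer": {"videoId": "def456"}}
]};</script></body></html>"""

print(first_video_urls(SAMPLE_HTML))
```

With a live page, `response.text` from the program above would be passed in place of `SAMPLE_HTML`. Whether the marker and key names match is something to verify against the actual page source.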

In [None]:
Q2. Write a Python program to extract the URL of the video thumbnails of the first five videos.

In [None]:
The thumbnail URLs of the first five videos can be extracted the same way: 
fetch the channel page with requests, parse it with BeautifulSoup, and read the src attribute of each thumbnail img element. 
The program below does this; as its output shows, it also finds nothing in the HTML that requests receives.

In [5]:
import requests
from bs4 import BeautifulSoup

def extract_thumbnail_urls(channel_url, num_videos):
    # Send a request to the channel URL
    response = requests.get(channel_url)
    if response.status_code != 200:
        print("Error: Unable to fetch the webpage.")
        return []

    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all thumbnail image elements
    thumbnail_elements = soup.find_all('img', {'class': 'style-scope yt-img-shadow'})

    # Extract thumbnail URLs from the elements
    thumbnail_urls = []
    for element in thumbnail_elements:
        # Lazily loaded images may lack a src attribute, so skip those
        thumbnail_url = element.get('src')
        if thumbnail_url and len(thumbnail_urls) < num_videos:
            thumbnail_urls.append(thumbnail_url)

    return thumbnail_urls

if __name__ == "__main__":
    channel_url = "https://www.youtube.com/@PW-Foundation/videos"
    num_videos = 5

    thumbnail_urls = extract_thumbnail_urls(channel_url, num_videos)

    if thumbnail_urls:
        print("Thumbnail URLs of the first 5 videos:")
        for url in thumbnail_urls:
            print(url)
    else:
        print("No thumbnail URLs found.")

No thumbnail URLs found.
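
Since the img elements are not found in the downloaded HTML, another route is to derive a thumbnail URL from a video's watch URL. The sketch below assumes the widely used `https://i.ytimg.com/vi/<video id>/hqdefault.jpg` pattern; that pattern is a convention observed in practice, not something stated in this notebook, and the function names are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def video_id_from_url(video_url):
    """Return the value of the `v` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(video_url).query)
    values = query.get("v")
    return values[0] if values else None


def thumbnail_url_for(video_url, quality="hqdefault"):
    """Build a thumbnail URL from a watch URL (URL pattern is an assumption)."""
    video_id = video_id_from_url(video_url)
    if video_id is None:
        return None
    return f"https://i.ytimg.com/vi/{video_id}/{quality}.jpg"


# Placeholder video ID, used only for illustration
print(thumbnail_url_for("https://www.youtube.com/watch?v=abc123"))
```

Using `urllib.parse` rather than string slicing keeps the ID extraction correct when the URL carries extra query parameters such as a timestamp.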


In [None]:
Q3. Write a Python program to extract the title of the first five videos.

In [None]:
To extract the titles of the first five videos from a YouTube channel,
fetch the channel page with requests, parse it with BeautifulSoup, and read the text of each anchor whose id is video-title.
The program below does this:

In [None]:
import requests
from bs4 import BeautifulSoup

def extract_video_titles(channel_url, num_videos):
    # Send a request to the channel URL
    response = requests.get(channel_url)
    if response.status_code != 200:
        print("Error: Unable to fetch the webpage.")
        return []

    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all video title elements
    title_elements = soup.find_all('a', {'id': 'video-title'})

    # Extract video titles from the elements
    video_titles = []
    for element in title_elements:
        if len(video_titles) < num_videos:
            video_title = element.get_text().strip()
            video_titles.append(video_title)

    return video_titles

if __name__ == "__main__":
    channel_url = "https://www.youtube.com/@PW-Foundation/videos"
    num_videos = 5

    video_titles = extract_video_titles(channel_url, num_videos)

    if video_titles:
        print("Titles of the first 5 videos:")
        for title in video_titles:
            print(title)
    else:
        print("No video titles found.")
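
When titles come from page data in JSON form rather than from anchor tags, the text is not always a plain string. The sketch below assumes two shapes that YouTube's page data appears to use, `{"simpleText": ...}` and `{"runs": [{"text": ...}, ...]}`, and flattens either into a stripped string. The key names, the `text_of` helper, and the sample dictionaries are assumptions made for illustration.

```python
def text_of(node):
    """Flatten a text node that is a str, {'simpleText': ...} or {'runs': [...]}."""
    if node is None:
        return ""
    if isinstance(node, str):
        return node.strip()
    if "simpleText" in node:
        return node["simpleText"].strip()
    # Join the text fragments of every run; missing keys contribute nothing
    return "".join(run.get("text", "") for run in node.get("runs", [])).strip()


# Hand-written sample entries (not real YouTube data)
sample_videos = [
    {"title": {"runs": [{"text": "Intro to "}, {"text": "Python"}]}},
    {"title": {"simpleText": "  Web scraping basics "}},
    {},  # an entry with no title at all
]

titles = [text_of(video.get("title")) for video in sample_videos]
print(titles)
```

Returning an empty string for a missing title keeps the output list aligned with the input list, which makes it easier to pair titles with URLs later.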

In [None]:
Q4. Write a Python program to extract the number of views of the first five videos.

In [None]:
To extract the view counts of the first five videos from a YouTube channel,
fetch the channel page with requests, parse it with BeautifulSoup, and read the text of the metadata span elements. 
The program below does this:

In [None]:
import requests
from bs4 import BeautifulSoup

def extract_video_views(channel_url, num_videos):
    # Send a request to the channel URL
    response = requests.get(channel_url)
    if response.status_code != 200:
        print("Error: Unable to fetch the webpage.")
        return []

    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all video views elements
    views_elements = soup.find_all('span', {'class': 'style-scope ytd-grid-video-renderer'})

    # Extract video views from the elements
    video_views = []
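    for element in views_elements:
        view_text = element.get_text().strip()
        # Spans with this class may hold other metadata (such as the posting time),
        # so keep only text that mentions views
        if 'view' in view_text.lower() and len(video_views) < num_videos:
            video_views.append(view_text)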

    return video_views

if __name__ == "__main__":
    channel_url = "https://www.youtube.com/@PW-Foundation/videos"
    num_videos = 5

    video_views = extract_video_views(channel_url, num_videos)

    if video_views:
        print("Number of views of the first 5 videos:")
        for views in video_views:
            print(views)
    else:
        print("No view counts found.")
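
The program above collects view counts as display text. If numbers are needed, for example to sort videos by popularity, the text has to be converted. The sketch below assumes strings shaped like "1.2K views" or "12,345 views"; the exact wording YouTube uses can vary by locale, and `parse_view_count` is a hypothetical helper.

```python
import re

# Multipliers for the abbreviated suffixes assumed to appear in view counts
_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

# A number with optional thousands separators and decimals, then an optional suffix
_VIEW_PATTERN = re.compile(r"(\d[\d,]*(?:\.\d+)?)\s*([KMB])?\b", re.IGNORECASE)


def parse_view_count(text):
    """Convert text such as '1.2K views' to an int, or return None if no number is found."""
    match = _VIEW_PATTERN.search(text)
    if not match:
        return None
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").upper()
    return round(number * _SUFFIXES.get(suffix, 1))


samples = ["1.2K views", "12,345 views", "3M views", "No views"]
print([parse_view_count(s) for s in samples])
```

Abbreviated counts are rounded by the site, so "1.2K" becomes 1200 here even though the true figure could be anywhere near that value.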

In [None]:
Q5. Write a Python program to extract the posting time of the first five videos.

In [None]:
To extract the posting time of the first five videos from a YouTube channel, 
fetch the channel page with requests, parse it with BeautifulSoup, and read the text of the span inside each metadata div.
The program below does this:

In [None]:
import requests
from bs4 import BeautifulSoup

def extract_video_post_times(channel_url, num_videos):
    # Send a request to the channel URL
    response = requests.get(channel_url)
    if response.status_code != 200:
        print("Error: Unable to fetch the webpage.")
        return []

    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(response.text, 'html.parser')

    # Find all video time elements
    time_elements = soup.find_all('div', {'class': 'style-scope ytd-grid-video-renderer'})

    # Extract video post times from the elements
    video_post_times = []
    for element in time_elements:
        if len(video_post_times) >= num_videos:
            break
        # find() returns None when the div has no matching span, so guard against that
        span = element.find('span', {'class': 'style-scope ytd-grid-video-renderer'})
        if span is not None:
            video_post_times.append(span.get_text().strip())

    return video_post_times

if __name__ == "__main__":
    channel_url = "https://www.youtube.com/@PW-Foundation/videos"
    num_videos = 5

    video_post_times = extract_video_post_times(channel_url, num_videos)

    if video_post_times:
        print("Time of posting for the first 5 videos:")
        for post_time in video_post_times:
            print(post_time)
    else:
        print("No post times found.")
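
Posting times on channel pages are typically shown as relative text such as "3 days ago" rather than as dates. The sketch below converts such text into an approximate `datetime`, assuming English wording and treating a month as 30 days and a year as 365 days. `approximate_post_time` is a hypothetical helper, and the result is only as precise as the rounded text it starts from.

```python
import re
from datetime import datetime, timedelta

# Approximate length of each unit; months and years are rough by design
_UNITS = {
    "second": timedelta(seconds=1),
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}

_AGO_PATTERN = re.compile(
    r"(\d+)\s+(second|minute|hour|day|week|month|year)s?\s+ago", re.IGNORECASE
)


def approximate_post_time(text, now):
    """Return `now` minus the interval described in `text`, or None if unparseable."""
    match = _AGO_PATTERN.search(text)
    if not match:
        return None
    amount = int(match.group(1))
    unit = match.group(2).lower()
    return now - amount * _UNITS[unit]


# A fixed reference time keeps the example reproducible
reference = datetime(2024, 1, 31, 12, 0, 0)
for sample in ["3 days ago", "Streamed 2 weeks ago", "1 hour ago", "yesterday"]:
    print(sample, "->", approximate_post_time(sample, reference))
```

Passing `now` explicitly instead of calling `datetime.now()` inside the function makes the conversion easy to test and keeps the scrape time under the caller's control.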
