# "Web Scraping YouTube with Selenium"
> "Analyzing the Top 10 YouTube Tech Channels"

- toc: false
- badges: true
- comments: true
- categories: [Selenium, Web Scraping, Pandas, Youtube, Python]
- image: "images/thumbnails/header_youtube_web.png"

Notebook Created by: __David Rusho__ ([Github Blog](https://drusho.github.io/blog) | [Tableau](https://public.tableau.com/app/profile/drusho/) | [Linkedin](https://linkedin.com/in/davidrusho))

## About the Data

Web scraping was performed on the _Top 10 Tech Channels_ on YouTube using _[Selenium](https://selenium-python.readthedocs.io/)_, a browser automation tool (driver) controlled from Python that is often used for web scraping and web testing. The channels were selected from the __[Top 10 Tech Youtubers](https://blog.bit.ai/top-tech-youtubers/)__ list on blog.bit.ai.

Data was scraped from about 2,000 videos, which is roughly the 200 most popular videos per channel.

## Introduction

## Collecting and Cleaning Data


### Web Scraping YouTube Channels

In [None]:
#collapse

import pandas as pd
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

# Chrome driver location (for M1 macbook air)
DRIVER_PATH = "/opt/homebrew/bin/chromedriver"

# activate driver
driver = webdriver.Chrome(executable_path=DRIVER_PATH)


# Scroll to bottom of page
def scroll_page():
    for x in range(7):
        html = driver.find_element_by_tag_name("html")
        html.send_keys(Keys.END)
        time.sleep(2)


def scrap_videos():
    scroll_page()

    chan_xpath = '//*[@id="channel-name"]'
    subs_xpath = '//*[@id="subscriber-count"]'
    videos_class = "style-scope ytd-grid-video-renderer"
    views_xpath = './/*[@id="metadata-line"]/span[1]'
    post_date_xpath = './/*[@id="metadata-line"]/span[2]'

    title_xpath = './/*[@id="video-title"]'

    # Scrape channel name (None if the element is missing)
    try:
        channel_name = driver.find_element_by_xpath(chan_xpath).text
    except Exception:
        channel_name = None

    # Scrape number of subscribers (None if the element is missing)
    try:
        subscribers = driver.find_element_by_xpath(subs_xpath).text
    except Exception:
        subscribers = None

    # Reassign variable to recalculate all videos
    videos = driver.find_elements_by_class_name(videos_class)

    # Loop through all videos
    for video in videos:

        # grab title if available; fall back to None so a failed lookup
        # never reuses the value from the previous video
        try:
            title = video.find_element_by_xpath(title_xpath).text
        except Exception:
            title = None

        # grab url if available
        try:
            url = video.find_element_by_xpath(title_xpath).get_attribute("href")
        except Exception:
            url = None

        # grab views if available
        try:
            views = video.find_element_by_xpath(views_xpath).text
        except Exception:
            views = None

        # grab post date if available
        try:
            post_date = video.find_element_by_xpath(post_date_xpath).text
        except Exception:
            post_date = None

        video_items = {
            "channel_name": channel_name,
            "subscribers": subscribers,
            "title": title,
            "views": views,
            "post_date": post_date,
            "url": url,
        }

        vid_list.append(video_items)

    return vid_list


# scrap Channel About section
def scrap_about():

    chan_name_xp = '//*[@id="channel-name"]'
    chan_join = './/*[@id="right-column"]/yt-formatted-string[2]/span[2]'
    chan_views = './/*[@id="right-column"]/yt-formatted-string[3]'
    chan_desc = './/*[@id="description"]'

    # Scrape channel name (None if the element is missing)
    try:
        channel_name = driver.find_element_by_xpath(chan_name_xp).text
    except Exception:
        channel_name = None

    # Scrape channel join date (about)
    try:
        channel_join = driver.find_element_by_xpath(chan_join).text
    except Exception:
        channel_join = None

    # Scrape channel views (about)
    try:
        channel_views = driver.find_element_by_xpath(chan_views).text
    except Exception:
        channel_views = None

    # Scrape channel description (about)
    try:
        channel_description = driver.find_element_by_xpath(chan_desc).text
    except Exception:
        channel_description = None

    about_items = {
        "channel_name": channel_name,
        "channel_join_date": channel_join,
        "channel_views": channel_views,
        "channel_description": channel_description,
    }

    vid_list.append(about_items)
    return vid_list


# top youtubers based off 'https://blog.bit.ai'
top_youtubers = [
    "ijustine",
    "AndroidAuthority",
    "Mrwhosetheboss",
    "TechnoBuffalo",
    "TLD",
    "austinevans",
    "unboxtherapy",
    "LinusTechTips",
    "UrAvgConsumer",
    "mkbhd",
]

# empty list to hold video details
vid_list = []

# visit each channel's videos page (sorted by most popular), then its about page
for youtuber in top_youtubers:
    print(f"processing {youtuber}")
    url = f"https://www.youtube.com/{youtuber}/videos?view=0&sort=p&flow=grid"
    driver.get(url)
    scroll_page()
    scrap_videos()  # appends one row per video to vid_list
    about_url = f"https://www.youtube.com/{youtuber}/about"
    driver.get(about_url)
    driver.implicitly_wait(10)
    scrap_about()  # appends the channel's about row to vid_list

# Close Chrome browser
driver.quit()

# create pandas df for video info
df_channel = pd.DataFrame(vid_list)

# export df to csv
df_channel.to_csv("yt_channel_scrap.csv")
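
The scraping functions above repeat the same try/except pattern for every field. A minimal sketch of factoring that pattern into one helper is shown below, so that a failed lookup yields a default instead of leaving a variable unset. `safe_lookup` and `FakeElement` are hypothetical names that are not part of the original notebook, and `FakeElement` only stands in for a Selenium element so the example runs without a browser.

```python
def safe_lookup(getter, default=None):
    """Call getter() and return its result, or default if it raises."""
    try:
        return getter()
    except Exception:
        return default


class FakeElement:
    """Stand-in for a Selenium element (assumption, for illustration only)."""

    text = "Some video title"

    def missing(self):
        # mimics a lookup that fails because the element is not on the page
        raise LookupError("no such element")


elem = FakeElement()
title = safe_lookup(lambda: elem.text)         # lookup succeeds
views = safe_lookup(lambda: elem.missing())    # lookup fails, default is used
print(title, views)
```

In the scraper, the lambda would wrap a call such as `video.find_element_by_xpath(title_xpath).text`.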

### Web Scraping YouTube Videos

In [None]:
#collapse

import pandas as pd
import time
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from datetime import datetime
from selenium import webdriver


# driver options (size and headless)
options = Options()
options.add_argument("--headless")
options.add_argument("--window-size=1920,1080")

# Chrome driver location (for M1 macbook air)
DRIVER_PATH = "/opt/homebrew/bin/chromedriver"

# activate driver
driver = webdriver.Chrome(executable_path=DRIVER_PATH, options=options)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")


# partial video description
def par_description():
    vid_desc = "//div[@class='watch-main-col']/meta[@itemprop='description']"
    elems = driver.find_elements_by_xpath(vid_desc)
    for elem in elems:
        return elem.get_attribute("content")


# publish_date
def publish():
    pub_date = "//div[@class='watch-main-col']/meta[@itemprop='datePublished']"
    elems = driver.find_elements_by_xpath(pub_date)
    for elem in elems:
        return elem.get_attribute("content")


# upload_date
def upload():
    upload_date = "//div[@class='watch-main-col']/meta[@itemprop='uploadDate']"
    elems = driver.find_elements_by_xpath(upload_date)
    for elem in elems:
        return elem.get_attribute("content")


# genre
def genre():
    genre = "//div[@class='watch-main-col']/meta[@itemprop='genre']"
    elems = driver.find_elements_by_xpath(genre)
    for elem in elems:
        return elem.get_attribute("content")


# video_width
def width():
    v_width = "//div[@class='watch-main-col']/meta[@itemprop='width']"
    elems = driver.find_elements_by_xpath(v_width)
    for elem in elems:
        return elem.get_attribute("content")


# video_height
def height():
    v_height = "//div[@class='watch-main-col']/meta[@itemprop='height']"
    elems = driver.find_elements_by_xpath(v_height)
    for elem in elems:
        return elem.get_attribute("content")


# video duration
# NOTE: duration() is called in the main loop but is not defined in this cell;
# this version assumes a 'duration' itemprop analogous to the other meta tags
def duration():
    v_duration = "//div[@class='watch-main-col']/meta[@itemprop='duration']"
    elems = driver.find_elements_by_xpath(v_duration)
    for elem in elems:
        return elem.get_attribute("content")


# Interaction Count
def interactions():
    interactions = "//div[@class='watch-main-col']/meta[@itemprop='interactionCount']"
    elems = driver.find_elements_by_xpath(interactions)
    for elem in elems:
        return elem.get_attribute("content")


# Video_title
def video_title():
    video_title = "//div[@class='watch-main-col']/meta[@itemprop='name']"
    elems = driver.find_elements_by_xpath(video_title)
    for elem in elems:
        return elem.get_attribute("content")


# Channel_name
def channel_name():
    channel_name = (
        "//div[@class='watch-main-col']/span[@itemprop='author']/link[@itemprop='name']"
    )
    elems = driver.find_elements_by_xpath(channel_name)
    for elem in elems:
        return elem.get_attribute("content")


# Number Likes
def likes():
    likes_xpath = "(//div[@id='top-level-buttons-computed']//*[contains(@aria-label,' likes')])[last()]"
    return driver.find_element_by_xpath(likes_xpath).text


# Total Comments
def comments():
    # Move Page to display comments
    # set scroll pause time
    SCROLL_PAUSE_TIME = 0.5

    # scroll to page bottom
    driver.execute_script("window.scrollTo(0, 1080)")

    # Wait for page load
    time.sleep(SCROLL_PAUSE_TIME)

    # scroll to page bottom
    driver.execute_script("window.scrollTo(300, 1080)")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    com = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located(
            (By.XPATH, '//*[@id="count"]/yt-formatted-string')
        )
    )
    return com.text


# import csv of youtube channels data
df_channels = pd.read_csv(
    "yt_channel_scrap.csv",
)

# new df of channel names and urls
df_videos = df_channels[["channel_name", "url"]].dropna()

# isolate video urls to a list
url_list = df_videos.url.to_list()

vid_list = []
url_fails_ls = []

count = 0

# record the start time so the total run time can be reported
start_time = datetime.now()

# visit each video url and scrape its details
for url in url_list:
    driver.get(url)
    count += 1
    time.sleep(3)
    subscribe_button = '//*[@id="subscribe-button"]'
    WebDriverWait(driver, 30).until(
        EC.presence_of_element_located((By.XPATH, subscribe_button))
    )

    try:
        comments_num = comments()
        likes_num = likes()
        chan_name = channel_name()
        v_duration = duration()
        p_description = par_description()
        publish_date = publish()
        upload_date = upload()
        v_genre = genre()
        v_width = width()
        v_height = height()
        title = video_title()
        interaction_count = interactions()

    except Exception:
        # log the failure and skip the row, so values left over from the
        # previous url are never stored under this one
        print(f"EXCEPTION RAISED for {url}")
        url_fails_ls.append(url)

    else:
        video_items = {
            "url": url,  # primary key
            "Channel Name": chan_name,
            "Title": title,
            "Duration": v_duration,
            "Partial Description": p_description,
            "Publish Date": publish_date,
            "Upload_date": upload_date,
            "Genre": v_genre,
            "Width": v_width,
            "Height": v_height,
            "Likes": likes_num,
            "Comments": comments_num,
            "Interaction Count": interaction_count,
        }

        vid_list.append(video_items)

    # print progress every 10th url
    if count % 10 == 0:
        print(f"URL {count} of {len(url_list)} processed.")

driver.quit()
end_time = datetime.now()

# create dfs for videos and failed urls
df_videos = pd.DataFrame(vid_list)

# store urls that raised an exception while scraping
url_fails_dict = {"url": url_fails_ls}
df_url_fails = pd.DataFrame(url_fails_dict)


print("Driver Quit")
print("Code Duration: {}".format(end_time - start_time))
print(f"Videos Processed: {len(vid_list)}")
print(f"Failures: {len(url_fails_ls)}")

# export df to csv
df_url_fails.to_csv(
    "url_fails.csv"
)

df_videos.to_csv(
    "yt_videos_scrap.csv"
)
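
Most of the helper functions in the cell above differ only in the `itemprop` name they look up. The sketch below shows one way to collapse them into a single function. `meta_xpath`, `meta_content`, `StubDriver` and `StubElement` are hypothetical names introduced for illustration; the stub classes only imitate a driver so the example runs without a browser. The call `driver.find_elements("xpath", ...)` relies on the generic `find_elements(by, value)` method, where the string `"xpath"` is the value of `By.XPATH`.

```python
def meta_xpath(itemprop):
    """Build the XPath used in the cell above for a given itemprop name."""
    return f"//div[@class='watch-main-col']/meta[@itemprop='{itemprop}']"


def meta_content(driver, itemprop):
    """Return the content attribute of the first matching meta tag, or None."""
    elems = driver.find_elements("xpath", meta_xpath(itemprop))
    return elems[0].get_attribute("content") if elems else None


class StubElement:
    """Imitates a Selenium element that only knows its content attribute."""

    def __init__(self, content):
        self._content = content

    def get_attribute(self, name):
        return self._content if name == "content" else None


class StubDriver:
    """Imitates a driver; maps an XPath string to a content value."""

    def __init__(self, metas):
        self._metas = metas

    def find_elements(self, by, value):
        return [StubElement(c) for xp, c in self._metas.items() if xp == value]


stub = StubDriver({meta_xpath("genre"): "Comedy"})
genre_value = meta_content(stub, "genre")      # tag present
missing_value = meta_content(stub, "width")    # tag absent, returns None
print(genre_value, missing_value)
```

With a real driver, `genre()` would become `meta_content(driver, "genre")`, `width()` would become `meta_content(driver, "width")`, and so on.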

### Importing and Cleaning the Data

> Note: The code in the cell below comes from [this notebook](https://colab.research.google.com/drive/1urQPIhLlr8U8LRB2pHQHkSuLyae60nly?usp=sharing), which I originally created to clean and merge the data.

In [None]:
# collapse

import pandas as pd

# load channel csv
yt = pd.read_csv("yt_channel_scrap.csv", parse_dates=["channel_join_date"])

# create df of Channel details
channel_details = yt[yt.channel_join_date.notna()]
channel_details = channel_details.drop(
    columns=["Unnamed: 0", "subscribers", "title", "views", "post_date"]
).reset_index(drop=True)

# create df Video details
video_details = yt[yt.channel_join_date.isna()]
video_details = video_details.drop(
    columns=[
        "Unnamed: 0",
        "channel_join_date",
        "channel_views",
        "channel_description",
        "post_date",
    ]
).reset_index(drop=True)

# merge dfs
merged = channel_details.merge(video_details, on="channel_name")


# drop 2nd url column and rename remaining url col
merged.drop(columns=("url_x"), inplace=True)
merged.rename(columns={"url_y": "url"}, inplace=True)

# dtypes to float for views and subscribers
merged.subscribers = (
    merged.subscribers.str.replace("M subscribers", "").astype("float") * 1000000
)

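# modify views col dtype to float
def fix_views(col):
    # missing values (NaN) pass through unchanged
    if not isinstance(col, str):
        return col
    if "M" in col:
        return float(col.replace("M views", "")) * 1000000
    elif "K" in col:
        return float(col.replace("K views", "")) * 1000
    elif "ago" in col:
        # no view count was shown, so the post date likely landed in this field
        return 0
    else:
        # plain counts below 1,000, e.g. "987 views"; anything else becomes None
        digits = col.replace(" views", "").replace(" view", "").replace(",", "")
        return float(digits) if digits.isdigit() else None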


merged["views"] = merged["views"].apply(fix_views)

# Correct channel view column to display num only
merged["channel_views"] = (
    merged["channel_views"].str.replace(",", "").str.replace(" views", "").astype("int")
)


# import Videos csv data
df_videos = pd.read_csv(
    "yt_videos_scrap_big_data.csv", parse_dates=["Publish Date", "Upload_date"]
)
df_videos.drop(
    columns=["Unnamed: 0", "Duration", "Channel Name", "Title"], inplace=True
)


# comments dtype to int
df_videos["Comments"] = (
    df_videos["Comments"].str.replace("Comments", "").str.replace(",", "").astype("int")
)

# modify likes col dtype to float
def fix_likes(col):
    if "M" in col:
        return float(col.replace("M", "")) * 1000000
    elif "K" in col:
        return float(col.replace("K", "")) * 1000
    else:
        return float(col)


# Fix Likes Column
df_videos["Likes"] = df_videos["Likes"].apply(fix_likes)


# Fix Width and Height, remove '.' and '0' from end of str
df_videos["Width"] = df_videos["Width"].astype("str").str.split(".", expand=True)[0]
df_videos["Height"] = df_videos["Height"].astype("str").str.split(".", expand=True)[0]

vc_merged = merged.merge(df_videos, on="url")

# rename columns to increase readability in analysis plots and tables
vc_merged.rename(
    columns={
        "channel_name": "Channel Name",
        "channel_join_date": "Channel Join Date",
        "channel_views": "Channel Views (M)",
        "subscribers": "Subscribers (M)",
        "Interaction Count": "Interactations (M)",
        "views": "Video Views (M)",
        "Partial Description": "Video Desc",
        "Publish Date": "Publish Date",
        "Upload_date": "Upload Date",
        "Genre": "Video Genre",
        "Width": "Width",
        "Height": "Height",
        "Comments": "Video Comments",
        "title": "Video Title",
        "url": "Video URL",
    },
    inplace=True,
)
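
The cleaning cell above converts abbreviated counts such as "18M subscribers", "415K" and "23,437 Comments" with three separate pieces of code. A minimal sketch of a single parser that covers all three shapes follows. `parse_count` is a hypothetical helper, the accepted unit words are assumptions based on the strings handled above, and the `B` suffix is an assumption that does not appear in the scraped data.

```python
import re

# multipliers for the abbreviated suffixes; "B" is an assumption
_MULTIPLIERS = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}

# number, optional suffix, optional unit word, nothing else
_COUNT_RE = re.compile(
    r"^\s*([\d.,]+)\s*([KMB])?\s*(views?|subscribers?|comments?|likes?)?\s*$",
    re.IGNORECASE,
)


def parse_count(text):
    """Convert strings like '2.5M views' to a float; return None if unparseable."""
    if not isinstance(text, str):
        return None
    match = _COUNT_RE.match(text)
    if match is None:
        return None
    try:
        number = float(match.group(1).replace(",", ""))
    except ValueError:
        return None
    suffix = (match.group(2) or "").upper()
    return number * _MULTIPLIERS.get(suffix, 1)


for sample in ["2.5M views", "415K", "23,437 Comments", "1 year ago"]:
    print(sample, "->", parse_count(sample))
```

Strings that do not look like a count, such as a post date that ended up in the views field, return `None` rather than a guessed number, which leaves the decision about how to treat them to the caller.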


### Data Cleaning Complete

The data from the YouTube channels and all of their videos is now fully cleaned and merged.

> Note: The columns __Channel Views (M)__, __Subscribers (M)__, __Video Views (M)__, and __Interactions (M)__ are in millions.  

Example: iJustine Channel has 6.89 **M** Subscribers.

In [None]:
#hide
# convert large counts to millions to shorten the displayed numbers

vc_merged['Channel Views (M)'] = round(vc_merged['Channel Views (M)']/1000000,2)
vc_merged['Video Views (M)'] = vc_merged['Video Views (M)']/1000000
vc_merged['Subscribers (M)'] = vc_merged['Subscribers (M)']/1000000
vc_merged['Interactations (M)'] = round(vc_merged['Interactations (M)']/1000000,2)

vc_merged.head(2)

| |Channel Name|Channel Join Date|Channel Views (M)|channel_description|Subscribers (M)|Video Title|Video Views (M)|Video URL|Video Desc|Publish Date|Upload Date|Video Genre|Width|Height|Likes|Video Comments|Interactations (M)|
|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
|0|iJustine|2006-05-07|0.0|Tech, video games, failed cooking attempts, vl...|7e-06|Black Eyed Peas - I gotta Feeling (Parody)|1.8e-05|https://www.youtube.com/watch?v=iPgaTmsYTT8|Thanks for watching! Don't forget to subscribe...|2009-07-30|2009-07-30|Comedy|1280|720|102000.0|23437|0.0|
|1|iJustine|2006-05-07|0.0|Tech, video games, failed cooking attempts, vl...|7e-06|Cake Decorating Challenge with Ro \| Nerdy Numm...|1.2e-05|https://www.youtube.com/watch?v=y7xZ-kJDgvM|Thanks for watching! Don't forget to subscribe...|2016-02-18|2016-02-18|Howto & Style|1280|720|99000.0|8421|0.0|
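
The sample rows above show values such as `7e-06` for __Subscribers (M)__, while the example states 6.89 M. One possible explanation, which is an assumption and not something the notebook confirms, is that the in-place scaling cell was executed more than once, dividing the columns by one million each time. A minimal sketch of a re-runnable alternative is shown below: it scales a copy and leaves the raw frame untouched. `raw` and `to_millions` are hypothetical names and the numbers are illustrative.

```python
import pandas as pd

# hypothetical raw counts (not yet scaled), standing in for vc_merged
raw = pd.DataFrame(
    {
        "Subscribers (M)": [6_890_000.0],
        "Video Views (M)": [18_000_000.0],
    }
)


def to_millions(df, columns):
    """Return a copy with the given columns divided by one million."""
    scaled = df.copy()
    scaled[columns] = scaled[columns] / 1_000_000
    return scaled


cols = ["Subscribers (M)", "Video Views (M)"]
first = to_millions(raw, cols)
second = to_millions(raw, cols)  # simulated re-run of the cell

print(first)
print(first.equals(second))
```

Because every run starts from `raw`, running the cell again gives the same scaled values instead of dividing a second time.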


#### Column Descriptions

|Column Name  | Description |
|:--|:--|
|Channel Name|Name of Youtube Channel  |
|Channel Join Date|Date Channel was created|
|Channel Views (M)|Total views the channel has received (in millions)|
|Channel Description|Description of Youtube Channel|
|Subscribers (M)|Number of channel subscribers (in millions)|
|Video Title|Video title|
|Video Views (M)|Total views for video (in millions)|
|Video URL|Video url|
|Video Desc|Description of video|
|Publish Date|Date video was published|
|Upload Date|Date video was uploaded|
|Video Genre|Genre of video|
|Width|Width of video|
|Height|Height of video|
|Likes|Total likes for video|
|Video Comments|Total comments for video|
|Interactions (M)|Number of interactions video has received (in millions)|


## Data Analysis

### Channels Ordered by Join Date

> Note: Join Date is the date that the Youtube Channel was created.

In [None]:
#collapse
# List of Video Channels
yt_chan = vc_merged.groupby(['Channel Join Date','Channel Name','Channel Views (M)'])['Subscribers (M)'].max().to_frame().reset_index()

# rename columns to increase readability
yt_chan.rename(columns={
    'Channel Name':'Channel',
    'Channel Join Date':'Join Date',
    'Subscribers (M)':'Subscribers',
    'Channel Views (M)':'Channel Views'
    },inplace=True)

# style DataFrame to highlight highest values
yt_chan.style.format(formatter={'Subscribers': "{:,} M",
                                 'Channel Views': "{:,} M",
                                 'Join Date': "{:%Y-%m-%d}"}).background_gradient(subset=['Channel Views',
                                                                                          'Subscribers'], 
                                                                                  cmap='Wistia').hide_index()

|Join Date|Channel|Channel Views|Subscribers|
|:--|:--|:--|:--|
|2006-05-07|iJustine|1,288.99 M|6.89 M|
|2007-06-07|Jon Rettinger|574.95 M|1.59 M|
|2007-08-04|Austin Evans|1,118.91 M|5.07 M|
|2008-03-21|Marques Brownlee|2,597.03 M|14.3 M|
|2008-11-24|Linus Tech Tips|4,934.74 M|13.7 M|
|2010-03-24|Jonathan Morrison|430.64 M|2.64 M|
|2010-12-21|Unbox Therapy|4,091.68 M|18.0 M|
|2011-04-03|Android Authority|767.86 M|3.36 M|
|2011-04-20|Mrwhosetheboss|1,208.15 M|7.71 M|
|2012-01-01|UrAvgConsumer|430.38 M|3.11 M|


### Top 10 Most Viewed Videos

> Note: __70%__ of the videos in this list are about phones.

In [None]:
#collapse
# Top 10 Videos by Views
top_chan = vc_merged.groupby(['Video Title',
                              'Channel Name',
                              'Publish Date'])['Video Views (M)'].max().sort_values(ascending=False).head(10).reset_index()

# rename columns to increase readability
top_chan.rename(columns={
    'Channel Name':'Channel',
    'Video Views (M)':'Video Views'
    },inplace=True)

top_chan.style.format(formatter={'Video Views': "{:,} M",
                                 'Publish Date': "{:%Y-%m-%d}"}).background_gradient(subset=['Video Views',
                                                                                                   'Publish Date'], cmap='Wistia').hide_index()


|Video Title|Channel|Publish Date|Video Views|
|:--|:--|:--|:--|
|iPhone 6 Plus Bend Test|Unbox Therapy|2014-09-23|73.0 M|
|Retro Tech: Game Boy|Marques Brownlee|2019-04-19|28.0 M|
|BROKE vs PRO Gaming|Austin Evans|2019-08-03|22.0 M|
|Samsung Galaxy Fold Unboxing: Magnets!|Marques Brownlee|2019-04-16|22.0 M|
|Turn your Smartphone into a 3D Hologram \| 4K|Mrwhosetheboss|2015-08-01|22.0 M|
|OnePlus 6 Review: Right On the Money!|Marques Brownlee|2018-05-25|21.0 M|
|This Smartphone Changes Everything...|Unbox Therapy|2018-06-19|21.0 M|
|The 4 Dollar Android Smartphone|Unbox Therapy|2016-03-11|20.0 M|
|This Cup Is Unspillable - What Magic Is This?|Unbox Therapy|2016-07-03|20.0 M|
|Unboxing The $20,000 Smartphone|Unbox Therapy|2016-12-25|19.0 M|

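The 70% figure in the note above can be checked with a simple keyword match on the ten titles. This is only a rough sketch: the keyword list in `PHONE_KEYWORDS` is an assumption chosen for these titles, and `is_phone_video` is a hypothetical helper, not the method used in the original analysis.

```python
# titles copied from the Top 10 Most Viewed Videos table
top_titles = [
    "iPhone 6 Plus Bend Test",
    "Retro Tech: Game Boy",
    "BROKE vs PRO Gaming",
    "Samsung Galaxy Fold Unboxing: Magnets!",
    "Turn your Smartphone into a 3D Hologram | 4K",
    "OnePlus 6 Review: Right On the Money!",
    "This Smartphone Changes Everything...",
    "The 4 Dollar Android Smartphone",
    "This Cup Is Unspillable - What Magic Is This?",
    "Unboxing The $20,000 Smartphone",
]

# assumed keywords that mark a title as phone-related
PHONE_KEYWORDS = ("phone", "galaxy", "oneplus")


def is_phone_video(title):
    """True if the lowercased title contains any phone keyword."""
    lowered = title.lower()
    return any(keyword in lowered for keyword in PHONE_KEYWORDS)


phone_titles = [t for t in top_titles if is_phone_video(t)]
share = len(phone_titles) / len(top_titles)
print(f"{len(phone_titles)} of {len(top_titles)} titles match ({share:.0%})")
```

A keyword match like this is brittle, so a different list of titles would likely need a different keyword list.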

### Channels Grouped by Total Video Views

Sum of views across all scraped videos for each channel.

> Note: The ranking by total __Video Views__ closely follows the ranking by __Subscribers__; only iJustine and UrAvgConsumer sit slightly out of order.


In [None]:
#collapse
# Total Views by Channel

chan_views = vc_merged.groupby(['Channel Name','Subscribers (M)'])['Video Views (M)'].sum().sort_values(ascending=False).reset_index()

# rename columns to increase readability
chan_views.rename(columns={
    'Channel Name':'Channel',
    'Video Views (M)':'Video Views',
    'Subscribers (M)':'Subscribers'
    },inplace=True)

chan_views.style.format(formatter={'Video Views': '{0:,.0f} M',
                                 'Subscribers': "{:,} M"}).background_gradient(subset=['Video Views','Subscribers'], cmap='Wistia').hide_index()

|Channel|Subscribers|Video Views|
|:--|:--|:--|
|Unbox Therapy|18.0 M|1,522 M|
|Marques Brownlee|14.3 M|1,286 M|
|Linus Tech Tips|13.7 M|1,158 M|
|Mrwhosetheboss|7.71 M|816 M|
|Austin Evans|5.07 M|600 M|
|iJustine|6.89 M|597 M|
|Android Authority|3.36 M|288 M|
|Jonathan Morrison|2.64 M|249 M|
|UrAvgConsumer|3.11 M|249 M|
|Jon Rettinger|1.59 M|193 M|
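
The relationship between subscribers and total video views can be quantified from the ten rows in the table above. The sketch below computes a Pearson correlation by hand on those channel-level totals. `pearson` is a hypothetical helper written for this example. The result is not directly comparable to the figure in the Correlations section, which is computed per video row.

```python
from math import sqrt

# values copied from the table above (both in millions)
subscribers_m = [18.0, 14.3, 13.7, 7.71, 5.07, 6.89, 3.36, 2.64, 3.11, 1.59]
video_views_m = [1522, 1286, 1158, 816, 600, 597, 288, 249, 249, 193]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson(subscribers_m, video_views_m)
print(f"Pearson r = {r:.2f}")
```

With only ten channels the coefficient is a rough indicator, but a value close to 1 is consistent with the note above.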


### Top 10 Liked Videos

> Note: ["Reflecting on the Color of My Skin"](https://www.youtube.com/watch?v=o-_WXXVye3Y) created by __Marques Brownlee__  and  ["I've been thinking of retiring"](https://www.youtube.com/watch?v=hAsZCTL__lo) created by **Linus Tech Tips** are videos that don't review a tech product.

In [None]:
#collapse
# Top 10 Videos by Likes
top_chan = vc_merged.groupby(['Video Title',
                              'Channel Name',
                              'Publish Date'])['Likes'].max().sort_values(ascending=False).head(10).reset_index()


# rename columns to increase readability
top_chan.rename(columns={
    'Channel Name':'Channel'
    },inplace=True)

top_chan.style.format(formatter={'Likes': "{:,}",
                                 'Publish Date': "{:%Y-%m-%d}"}).background_gradient(subset=['Publish Date','Likes'],
                                                                                     cmap='Wistia').hide_index()


|Video Title|Channel|Publish Date|Likes|
|:--|:--|:--|:--|
|How THIS wallpaper kills your phone.|Mrwhosetheboss|2020-06-04|831000.0|
|Reflecting on the Color of My Skin|Marques Brownlee|2020-06-04|620000.0|
|PlayStation 5 Unboxing & Accessories!|Marques Brownlee|2020-10-27|572000.0|
|Talking Tech with Elon Musk!|Marques Brownlee|2018-08-17|497000.0|
|I've been thinking of retiring.|Linus Tech Tips|2020-01-22|480000.0|
|How THIS instagram story kills your phone.|Mrwhosetheboss|2021-05-06|456000.0|
|iPhone 12 Unboxing Experience + MagSafe Demo!|Marques Brownlee|2020-10-20|426000.0|
|This Cup Is Unspillable - What Magic Is This?|Unbox Therapy|2016-07-03|415000.0|
|AirPods Max Unboxing & Impressions: $550?!|Marques Brownlee|2020-12-10|396000.0|
|RETRO TECH: MACINTOSH|Marques Brownlee|2020-12-10|396000.0|


### Top Video Likes Over Time (Scatter Plot)

### Frequency of Words in Video Titles (Word Wall - Youtube Logo)

### Frequency of Nouns in Video Titles (Bar Chart)

### Frequency of Superlatives in Titles (Bar Chart)


### Correlations

In [None]:
#collapse
vc_merged.corr().style.background_gradient(subset=['Channel Views (M)',
                                                   'Subscribers (M)',
                                                   'Video Views (M)',
                                                   'Likes',
                                                   'Video Comments',
                                                   'Interactations (M)'],
                                           cmap='Wistia')

| |Channel Views (M)|Subscribers (M)|Video Views (M)|Likes|Video Comments|Interactations (M)|
|:--|:--|:--|:--|:--|:--|:--|
|Channel Views (M)|1.0|0.907635|0.586217|0.570409|0.138889|0.583878|
|Subscribers (M)|0.907635|1.0|0.65992|0.652701|0.163038|0.659026|
|Video Views (M)|0.586217|0.65992|1.0|0.708341|0.155869|0.996397|
|Likes|0.570409|0.652701|0.708341|1.0|0.23568|0.715335|
|Video Comments|0.138889|0.163038|0.155869|0.23568|1.0|0.156037|
|Interactations (M)|0.583878|0.659026|0.996397|0.715335|0.156037|1.0|
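
A correlation matrix is only defined for numeric columns, and recent pandas versions raise an error when `corr()` meets text columns instead of silently dropping them. A hedged sketch of selecting the numeric columns first is shown below. `toy` is a hypothetical frame with made-up values, used here in place of `vc_merged`.

```python
import pandas as pd

# hypothetical miniature of vc_merged with made-up values
toy = pd.DataFrame(
    {
        "Video Title": ["a", "b", "c", "d"],
        "Likes": [10.0, 20.0, 30.0, 40.0],
        "Video Comments": [4, 1, 3, 2],
        "Video Views (M)": [1.0, 2.0, 3.0, 4.0],
    }
)

# keep only numeric columns before computing the correlation matrix
corr = toy.select_dtypes(include="number").corr()

# correlations of comment counts with the other numeric columns
comments_corr = corr["Video Comments"].drop("Video Comments").sort_values()

print(corr)
print(comments_corr)
```

Pulling out a single row or column of the matrix, as done for `Video Comments` here, makes it easier to see which variable has the weakest relationships.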


## Conclusion

* Video comment counts correlate only weakly with every other numeric column collected in this project (coefficients between roughly 0.14 and 0.24 in the correlation table).

## Resources

- [Top 25 Selenium Functions That Will Make You Pro In Web Scraping](https://towardsdatascience.com/top-25-selenium-functions-that-will-make-you-pro-in-web-scraping-5c937e027244)

- [How to build a Web Scraper or Bot in Python using Selenium](https://medium.com/daily-programming-tips/how-to-build-a-web-scraper-or-bot-in-python-using-selenium-2815f20023f7)

- [Web Scraping: Introduction, Best Practices & Caveats](https://medium.com/velotio-perspectives/web-scraping-introduction-best-practices-caveats-9cbf4acc8d0f)

- [Web Scraping Job Postings from Indeed.com using Selenium](https://towardsdatascience.com/web-scraping-job-postings-from-indeed-com-using-selenium-5ae58d155daf)


- [How I Use Selenium to Automate the Web With Python. Pt1 - John Watson Rooney](https://www.youtube.com/watch?v=pUUhvJvs-R4)