<div class="markdown-google-sans">
  <h1>CS410 Final Project: VADER Sentiment Analysis on YouTube Videos</h1>
</div>

Goal: Apply VADER sentiment analysis to YouTube comments on specific League of Legends champion skins and evaluate how useful it is.

In [3]:
# Mounting and setting up Environment
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
MAIN_DIR = "drive/MyDrive/CS410Final"

In [None]:
#!pip install --upgrade google-api-python-client
#!pip install certifi
#!pip install nltk
#!pip install pandas
#!pip install requests

In [5]:
# Import relevant packages
import json
import requests
import pandas as pd
import nltk
import ipywidgets as widgets


from nltk.sentiment.vader import SentimentIntensityAnalyzer
from googleapiclient.discovery import build
from IPython.display import IFrame, display, clear_output

In [6]:
nltk.download("vader_lexicon")

[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

In [45]:
# Load the poll IDs from a Reddit post that collected user votes; the votes are used for reference and evaluation.
# https://www.reddit.com/r/leagueoflegends/comments/124kb99/best_skins_per_champ_2022/

df = pd.read_excel(f"{MAIN_DIR}/strawpollData.xlsx", header=None, sheet_name='Sheet1', usecols="G")
df.head()

Unnamed: 0,6
0,PbZqR6DblyN
1,w4nWroaolyA
2,GPgV6Dzpkga
3,LVyK8eXjRn0
4,xVg7jQ4XOnr


In [46]:
# Next we grab the actual vote data using the strawpoll API

ENDPOINT = "https://api.strawpoll.com/v3"
API_KEY = "YOUR_STRAWPOLL_API_KEY"  # Insert your own API key here if you want to run this cell

poll_ids = df.iloc[:, 0].tolist()

champions = []
skins = []
votes = []

for poll_id in poll_ids:
    response = requests.get(ENDPOINT + '/polls/' + poll_id, headers={'X-API-KEY': API_KEY })
    if response.ok:
        poll_raw = response.json()
        champion_name = poll_raw['title']

        for poll_data in poll_raw["poll_options"]:
            skin_name = poll_data['value']
            vote_count = poll_data['vote_count']
            champions.append(champion_name)
            skins.append(skin_name)
            votes.append(vote_count)

    else:
        error = response.json()
        print(f'Error fetching poll {poll_id}: {error}')

skin_vote_df = pd.DataFrame({'Champion': champions, 'Skin': skins, 'Votes': votes})
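The flattening step inside the loop above can be tried without calling the API. The sketch below is a minimal illustration, assuming a payload that contains only the fields the loop reads (`title`, `poll_options`, `value`, `vote_count`). The helper name `flatten_poll` and the hand-written `sample_poll` dictionary are hypothetical; the sample values are copied from the Aatrox rows shown later in this notebook.

```python
def flatten_poll(poll_raw):
    """Turn one poll payload into a list of (champion, skin, votes) rows."""
    champion = poll_raw["title"]
    return [
        (champion, option["value"], option["vote_count"])
        for option in poll_raw["poll_options"]
    ]


# Hand-written sample (an assumption): only the keys read by the loop above are included.
sample_poll = {
    "title": "Aatrox",
    "poll_options": [
        {"value": "Classic", "vote_count": 44},
        {"value": "Justicar", "vote_count": 73},
    ],
}

rows = flatten_poll(sample_poll)
for row in rows:
    print(row)
```

Keeping the flattening in a small function like this makes it possible to check the row shape before spending API calls on every poll.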

In [47]:
# Quick check on number of unique champions to make sure everything loaded properly
unique_champion_count = skin_vote_df['Champion'].nunique()
print("Number of unique champion names:", unique_champion_count)

Number of unique champion names: 163


In [7]:
skin_vote_df.to_csv(f'{MAIN_DIR}/champion_vote_results.csv', index=False)

NameError: ignored

Since I already have the data saved to the CSV file, we can skip the previous step, which takes a while to fetch all the data.
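One way to make this skip automatic is a small "load the cache if it exists, otherwise fetch and save" helper. The sketch below is only an illustration using the standard library `csv` module rather than pandas. The function name `load_or_fetch` and the stand-in `fake_fetch` are hypothetical, and the single sample row is taken from the Aatrox output in this notebook.

```python
import csv
import tempfile
from pathlib import Path

FIELDS = ["Champion", "Skin", "Votes"]


def load_or_fetch(cache_path, fetch_rows):
    """Return cached rows if the CSV exists; otherwise fetch, save, and return them."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with cache_path.open(newline="", encoding="utf-8") as f:
            # CSV stores everything as text, so convert Votes back to int.
            return [{**row, "Votes": int(row["Votes"])} for row in csv.DictReader(f)]
    rows = fetch_rows()
    with cache_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows


calls = []  # records how many times the (slow) fetch actually ran


def fake_fetch():
    """Stand-in for the slow API loop (an assumption for this demo)."""
    calls.append(1)
    return [{"Champion": "Aatrox", "Skin": "Mecha", "Votes": 312}]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "champion_vote_results.csv"
    first = load_or_fetch(path, fake_fetch)   # cache miss: fetches and writes the file
    second = load_or_fetch(path, fake_fetch)  # cache hit: reads the file instead

print(first == second, len(calls))
```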

In [8]:
skin_vote_df = pd.read_csv(f'{MAIN_DIR}/champion_vote_results.csv', index_col=False)
skin_vote_df.head(5)

Unnamed: 0,Champion,Skin,Votes
0,Aatrox,Classic,44
1,Aatrox,Justicar,73
2,Aatrox,Mecha,312
3,Aatrox,Sea Hunter,35
4,Aatrox,Blood Moon,331


In [9]:
skin_count = skin_vote_df.groupby('Champion')['Skin'].count().reset_index().sort_values(by='Skin', ascending=False)
skin_count.head(5)

Unnamed: 0,Champion,Skin
78,Miss Fortune,19
29,Ezreal,18
72,Lux,18
57,Katarina,17
117,Sivir,17


Quick little cell that you can interact with to see all the available champions and what the votes for their skins were.

In [26]:
selected_champion = 'Aatrox'

champion_select = widgets.Dropdown(
    options=skin_vote_df['Champion'].unique(),
    value=selected_champion,
    description='Select a champion:',
    style={'description_width': 'initial'}
)

display(champion_select)
output = widgets.Output()

display(output)

def update_df(change):
    global selected_champion
    selected_champion = change['new']
    filtered_df = skin_vote_df[skin_vote_df['Champion'] == selected_champion].sort_values(by='Votes', ascending=False).reset_index(drop=True)
    with output:
      clear_output(wait=True)
      display(filtered_df)

champion_select.observe(update_df, names='value')

Dropdown(description='Select a champion:', options=('Aatrox', 'Ahri', 'Akali', 'Akshan', 'Alistar', 'Amumu', '…

Output()

Create a YouTube class to hold my YouTube-related functions together.
<br>Documentation can be found here: https://developers.google.com/youtube/v3/docs

In [11]:
class YoutubeInfo:
  API_KEY = 'YOUR_YOUTUBE_API_KEY'   #  Here you will need to insert your own API key if you want to run this cell
  Max_Length = 300 # Maximum number of comments to fetch per video, to avoid going over the API limit (just in case)

  # Since I am only interested in the videos from the SkinSpotlights YT channel, I can search specific videos on their channel using their ID
  CHANNEL_ID = "UC0NwzCHb8Fg89eTB5eYX17Q"


  # Helper method to search and grab video ID for specific skin
  @staticmethod
  def getVideoID(skinName):
    youtube = build('youtube','v3',developerKey=YoutubeInfo.API_KEY)

    request = youtube.search().list(
      part="snippet",
      channelId=YoutubeInfo.CHANNEL_ID,
      maxResults=1,
      q=skinName
    )
    response = request.execute()

    # Guard in case something went wrong and no results came back
    # I am only getting and using the first video found on the channel. This works thanks to YouTube's very good search function.
    if 'items' in response and response['items']:
        return response['items'][0]['id']['videoId']
    else:
        return None

  # Uses getVideoID to grab video ID, then feed it into system to grab video comments
  @staticmethod
  def getVideoComments(skinName):
    video_id = YoutubeInfo.getVideoID(skinName)

    if not video_id:
      print(f"No video found for {skinName}")
      return []

    comments = []
    youtube = build('youtube','v3',developerKey=YoutubeInfo.API_KEY)



    # commentThreads YouTube API call to grab the top-level comments from a video
    request = youtube.commentThreads().list(
        part="snippet",
        maxResults=100,
        textFormat="plainText",
        order="relevance",
        videoId=video_id
    )
    response = request.execute()

    while response:
      for comment_thread in response.get('items', []):
        comment = comment_thread["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        comments.append(comment)
      # Set maximum amount of comments to retrieve.
      if len(comments) >= YoutubeInfo.Max_Length:
          break

      # If the video has over 100 comments, then grab the next set of 100 comments
      if "nextPageToken" in response:
        request = youtube.commentThreads().list(
          part="snippet",
          maxResults=100,
          textFormat="plainText",
          order="relevance",
          videoId=video_id,
          pageToken = response["nextPageToken"]
        )
        response = request.execute()
      else:
        break

    return comments
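The paging logic in `getVideoComments` follows a common pattern: request a page, collect its items, and continue while the response carries a `nextPageToken` and the cap has not been reached. The sketch below isolates that pattern with fake pages so it runs without an API key. The names `collect_paged`, `fetch_page`, and `FAKE_PAGES` are hypothetical, and trimming the result to exactly `max_items` is an addition that the class above does not perform.

```python
def collect_paged(fetch_page, max_items):
    """Follow nextPageToken until pages run out or max_items is reached."""
    items = []
    token = None  # None means "first page"
    while True:
        page = fetch_page(token)
        items.extend(page.get("items", []))
        token = page.get("nextPageToken")
        if token is None or len(items) >= max_items:
            break
    return items[:max_items]


# Fake responses (an assumption) keyed by the page token that requests them.
FAKE_PAGES = {
    None: {"items": ["c1", "c2"], "nextPageToken": "p2"},
    "p2": {"items": ["c3", "c4"], "nextPageToken": "p3"},
    "p3": {"items": ["c5"]},  # last page: no nextPageToken
}


def fake_fetch_page(token):
    """Stand-in for one commentThreads request."""
    return FAKE_PAGES[token]


capped = collect_paged(fake_fetch_page, max_items=3)
everything = collect_paged(fake_fetch_page, max_items=10)
print(capped)
print(everything)
```

Passing the page fetcher in as a function keeps the loop testable; in the notebook, that function would wrap the `commentThreads().list(...).execute()` call.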

Driving function to run our youtube calls for all the skins of a specific champion.

In [12]:
# Function to store and return the comments from a video
def getComments(champion_df):
  _comments_dict = {}

  for index, row in champion_df.iterrows():
    champion = row['Champion']
    skin = row['Skin']
    skin_name = f"{skin} {champion}"
    comments = YoutubeInfo.getVideoComments(skin_name)

    if champion not in _comments_dict:
      _comments_dict[champion] = {}

    _comments_dict[champion][skin] = comments

  return _comments_dict
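`getComments` returns a nested dictionary of the form `{champion: {skin: [comments]}}`. The sketch below shows that shape with placeholder comment strings (made up, not real YouTube comments) and one way to flatten it into per-skin counts. The helper name `count_comments` is hypothetical.

```python
def count_comments(comments_by_champion):
    """Flatten {champion: {skin: [comments]}} into (champion, skin, count) rows."""
    return [
        (champion, skin, len(comments))
        for champion, skins in comments_by_champion.items()
        for skin, comments in skins.items()
    ]


# Build the nested structure; setdefault creates the inner dict on first use.
sample = {}
sample.setdefault("Kindred", {})["Shadowfire"] = [
    "placeholder comment 1",
    "placeholder comment 2",
]
sample.setdefault("Kindred", {})["Porcelain"] = []  # a skin with no comments found

counts = count_comments(sample)
print(counts)
```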

Create a champion_df to be used to grab comments from the YouTube API.
<br>Create a comments_df to hold the number of comments extracted for each skin
<br>Create a result_df to display the votes alongside the comment count for each skin

In [13]:
champion_df = skin_vote_df[skin_vote_df['Champion'] == selected_champion].sort_values(by='Votes', ascending=False).reset_index(drop=True)
champion_df.head(5)

Unnamed: 0,Champion,Skin,Votes
0,Kindred,Spirit Blossom,432
1,Kindred,Porcelain,40
2,Kindred,Classic,21
3,Kindred,Shadowfire,21
4,Kindred,Super Galaxy,21


In [14]:
comments_dict = getComments(champion_df)

In [15]:
comments_df = pd.DataFrame([(champion, skin, len(comments)) for champion, skins in comments_dict.items() for skin, comments in skins.items()],
                            columns=['Champion', 'Skin', 'Total Comments'])

result_df = pd.merge(champion_df, comments_df, on=['Champion', 'Skin'], how='left').fillna(0)

In [16]:
result_df

Unnamed: 0,Champion,Skin,Votes,Total Comments
0,Kindred,Spirit Blossom,432,228
1,Kindred,Porcelain,40,148
2,Kindred,Classic,21,250
3,Kindred,Shadowfire,21,76
4,Kindred,Super Galaxy,21,162


Create and initialize the Vader Sentiment Analyzer.
<br>Documentation can be found here: https://www.nltk.org/_modules/nltk/sentiment/vader.html

In [17]:
sentAnalyzer = SentimentIntensityAnalyzer()

In [18]:
class SentimentAnalyzer:

  @staticmethod
  def analyzeComments(comments):
    sent_analyzer = SentimentIntensityAnalyzer()

    compound_scores = []
    positive_count = 0
    negative_count = 0
    neutral_count = 0

    for comment in comments:
      scores = sent_analyzer.polarity_scores(comment)
      compound_scores.append(scores['compound'])

      if scores['compound'] >= 0.05:
        positive_count += 1
      elif scores['compound'] <= -0.05:
        negative_count += 1
      else:
        neutral_count += 1

    mean_compound = sum(compound_scores) / len(compound_scores) if compound_scores else None

    return mean_compound, positive_count, negative_count, neutral_count
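The bucketing rule used in `analyzeComments` can be checked on its own, without loading the VADER lexicon. The sketch below applies the same ±0.05 thresholds to a list of compound scores. The function names `label_compound` and `summarize` are hypothetical, and `example_scores` are made-up values chosen to hit each branch, not scores produced by VADER.

```python
POSITIVE_THRESHOLD = 0.05   # same cut-offs as in analyzeComments
NEGATIVE_THRESHOLD = -0.05


def label_compound(score):
    """Map a compound score in [-1, 1] to a sentiment label."""
    if score >= POSITIVE_THRESHOLD:
        return "positive"
    if score <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"


def summarize(scores):
    """Return (mean score or None, label counts) for a list of compound scores."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for score in scores:
        counts[label_compound(score)] += 1
    mean = sum(scores) / len(scores) if scores else None
    return mean, counts


# Made-up scores (an assumption) covering both boundaries and the neutral band.
example_scores = [0.6, 0.05, 0.0, -0.04, -0.5]
mean_score, label_counts = summarize(example_scores)
print(mean_score, label_counts)
```

Note that a score of exactly 0.05 or -0.05 falls into the positive or negative bucket, since both comparisons are inclusive.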

# This is the start of our loop if you want to attempt the call on other champions.

Next, we collect the sentiment analysis results for each skin and combine them into one table to view.

In [24]:
# Loop From this cell if you want to see data for a different champion
final_champion_df = champion_df.copy()

for index, row in champion_df.iterrows():
    champion = row['Champion']
    skin = row['Skin']

    comments = comments_dict.get(champion, {}).get(skin, [])

    mean_compound, pos_count, neg_count, neu_count = SentimentAnalyzer.analyzeComments(comments)

    final_champion_df.at[index, 'Mean_Compound'] = mean_compound
    final_champion_df.at[index, 'Positive_Count'] = pos_count
    final_champion_df.at[index, 'Negative_Count'] = neg_count
    final_champion_df.at[index, 'Neutral_Count'] = neu_count


final_champion_df

Unnamed: 0,Champion,Skin,Votes,Mean_Compound,Positive_Count,Negative_Count,Neutral_Count
0,Lux,Elementalist,292,0.312292,197.0,50.0,53.0
1,Lux,Dark Cosmic,164,0.137201,117.0,53.0,99.0
2,Lux,Cosmic,64,0.199531,160.0,63.0,77.0
3,Lux,Prestige Battle Academia,33,0.039691,65.0,54.0,63.0
4,Lux,Space Groove,33,0.064977,112.0,87.0,93.0
5,Lux,Battle Academia,24,0.150014,59.0,25.0,48.0
6,Lux,Star Guardian,17,0.122404,70.0,32.0,85.0
7,Lux,Empyrean,16,0.09051,41.0,27.0,22.0
8,Lux,Lunar Empress,16,0.142062,137.0,67.0,96.0
9,Lux,Commando,13,0.057199,36.0,24.0,41.0


In [25]:
final_champion_df.sort_values(by='Mean_Compound', ascending=False)

Unnamed: 0,Champion,Skin,Votes,Mean_Compound,Positive_Count,Negative_Count,Neutral_Count
0,Lux,Elementalist,292,0.312292,197.0,50.0,53.0
16,Lux,Classic,4,0.274362,186.0,62.0,52.0
13,Lux,Steel Legion,9,0.249081,168.0,51.0,81.0
14,Lux,Spellthief,9,0.201867,72.0,25.0,47.0
2,Lux,Cosmic,64,0.199531,160.0,63.0,77.0
15,Lux,Sorceress,4,0.17756,39.0,16.0,34.0
12,Lux,Prestige Porcelain,10,0.168153,39.0,16.0,20.0
17,Lux,Imperial,3,0.168092,79.0,32.0,54.0
5,Lux,Battle Academia,24,0.150014,59.0,25.0,48.0
8,Lux,Lunar Empress,16,0.142062,137.0,67.0,96.0


In [27]:
selected_champion = 'Aatrox'

champion_select = widgets.Dropdown(
    options=skin_vote_df['Champion'].unique(),
    value=selected_champion,
    description='Select a champion:',
    style={'description_width': 'initial'}
)

display(champion_select)
output = widgets.Output()

display(output)

def update_df(change):
    global selected_champion
    selected_champion = change['new']
    filtered_df = skin_vote_df[skin_vote_df['Champion'] == selected_champion].sort_values(by='Votes', ascending=False).reset_index(drop=True)
    with output:
      clear_output(wait=True)
      display(filtered_df)

champion_select.observe(update_df, names='value')


Dropdown(description='Select a champion:', options=('Aatrox', 'Ahri', 'Akali', 'Akshan', 'Alistar', 'Amumu', '…

Output()

In [28]:
champion_df = skin_vote_df[skin_vote_df['Champion'] == selected_champion].sort_values(by='Votes', ascending=False).reset_index(drop=True)
champion_df.head(5)

Unnamed: 0,Champion,Skin,Votes
0,Kindred,Spirit Blossom,432
1,Kindred,Porcelain,40
2,Kindred,Classic,21
3,Kindred,Shadowfire,21
4,Kindred,Super Galaxy,21


In [None]:
comments_dict = getComments(champion_df)