<a href="https://colab.research.google.com/github/phuochoang23/Text_Analysis_Final_Project/blob/main/Final_Project_PHH.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Analyzing Public Sentiment Towards Senator Tuberville's Blockade of Military Promotions

## Introduction
Following the overturning of Roe v. Wade, military service members stationed in states where abortion is illegal lost access to essential reproductive health care. In response, on October 20, 2022, the Department of Defense (DoD) unveiled a new policy to facilitate access to non-covered reproductive health care for its service members and their families ([Department of Defense, 2023](https://www.defense.gov/News/Releases/Release/Article/3301006/dod-releases-policies-to-ensure-access-to-non-covered-reproductive-health-care/)). This policy grants service members the necessary time off, travel allowances, and support to seek reproductive health care outside the state where they are stationed. However, the policy faced opposition from Alabama Senator Tommy Tuberville.

In protest of the new DoD policy, Senator Tuberville initiated a blockade on senior military promotions across the DoD. Each year, the Senate approves over 50,000 military nominations and promotions, handled in batches for efficiency. Typically, the Senate approves an entire list at once by unanimous consent. However, any senator may instead demand roll call votes and deliberation on each name individually, a choice that significantly prolongs the approval process for these nominations.

Senator Tuberville forced roll call votes on a batch of the most senior leadership positions in the military. Initial estimates projected that over 600 military leadership positions would be unfilled by the end of 2023. Senior billets left vacant by this blockade would severely diminish national security. The hold lasted over nine months. However, as of December 6th, 2023, after facing significant bipartisan backlash from his colleagues and constituents, Senator Tuberville dropped the hold on over 430 military promotions ([Shane, 2023](https://www.navytimes.com/news/pentagon-congress/2023/12/05/tuberville-drops-holds-on-more-than-430-military-promotions/)).

Despite the lifting of Senator Tuberville's blockade, concerns linger because of the delayed promotions. These include a hurried process to formally promote 430 senior military officers in ceremonies, determining back pay for the promotions, relocating these officers to their new permanent duty stations, and, most crucially, enabling them to assume their new roles in directing policies and procedures commensurate with their positions. This list is not exhaustive, and many consequences have not been considered before, given that this blockade was the first of its kind.

### Research Question and Objective
This research paper will utilize data science methods to develop a deeper understanding of public sentiment towards leveraging the United States' national security to enforce changes in political policies. Specifically, the overall objective is to determine whether there is bipartisan support or opposition to Senator Tuberville's blockade of military promotions due to its potential to jeopardize the United States' national security. As such, to achieve this objective, the research will seek to answer the following questions:
>1. Is there a collective sentiment across politically polarized media sources regarding Senator Tuberville's blockade of military promotions?
>2. Has the sentiment changed or shifted over the last 10 months? If so, in which way?

### Data Collection
For this research, I will employ the YouTube Application Programming Interface (API) to conduct sentiment and keyword analyses on comments from news videos covering Senator Tuberville's blockade. These videos will be selected from media sources situated at opposing ends of the political spectrum. According to a Pew Research Center study of political polarization and media habits, 78% of predominantly conservative and consistently conservative web users rely on Fox News as their primary source for government and political news. In contrast, 35% of survey participants prefer CNN for their government and political news ([Mitchell, Gottfried, Kiley and Matsa, 2014](https://www.pewresearch.org/journalism/2014/10/21/political-polarization-media-habits/)).

To ensure a comprehensive analysis, two videos from each media source will be chosen for this study. One video from each outlet will represent preliminary reports prior to Senator Tuberville's blockade being lifted, while the other will be selected from a timeframe after the blockade has been dropped. This approach aims to capture potential shifts in sentiment across the two distinct periods, offering insights into the evolving public perception of the events surrounding the blockade.

From YouTube, the following videos were selected:
1. Alabama Senator Holding Up Military Promotions, Fox News, September 7th, 2023
2. Alabama Senator Tuberville Blocks Hundreds of Military Promotions, Creating National Security Concern, Fox News, December 6th, 2023
3. Three military leaders make rare joint appearance to demand action from senator, CNN, September 5th, 2023
4. Tuberville says he will release holds on military nominations three-star and below, CNN, December 5th, 2023

>Link to original videos:
>1. https://www.youtube.com/watch?v=201JSBJVaXg
>2. https://www.youtube.com/watch?v=OMwDXP0QmF8
>3. https://www.youtube.com/watch?v=RBhcxhzGltI
>4. https://www.youtube.com/watch?v=J3GS2nchryE

### Hypothesis
I hypothesize that, despite the political differences between the media sources and their viewers, an overwhelming majority of people oppose using military promotions and national security as leverage in political policy disputes. Thus, for the first research question, I expect shared negative sentiment in the comments on videos reporting the blockade of military promotions. For the second, I expect the videos published after Senator Tuberville lifted the blockade to receive a more positive response.

## Significance and Implications to Public Policy
This topic is relevant to public policy for several reasons. First, Senator Tuberville's blockade on military promotions has resulted in substantial delays in filling essential senior military leadership positions within the Department of Defense (DoD). As geopolitical tensions escalate, particularly in the Indo-Pacific region with China and amid the Palestine-Israel conflict in the Middle East, the prolonged vacancies in senior military roles pose significant threats to both current and future U.S. national security. Consequently, military readiness and security become critical concerns for any government's public policy agenda.

Moreover, this research project delves into the broader concern of the politicization of national security. The importance of maintaining bipartisan support on military matters is underscored, emphasizing the need for collaborative and non-partisan approaches to ensure the nation's security interests are adequately addressed. By examining the implications of Senator Tuberville's actions within this context, the study contributes valuable insights to the ongoing discourse on the intersection of politics and national security policy.

## Data Analysis
### Step One: Data Collection
For the first step, I saved the comments for the four videos into individual .csv files. I excluded replies on the belief that a reply's sentiment relates to the original user's comment rather than to the video itself. The number of comments per video varied significantly; in particular, it was difficult to find a Fox News video released prior to the removal of the blockade that had more than a few comments. The following are the number of comments per video imported into the .csv files.
>1. Video 1: 27 Comments
>2. Video 2: 456 Comments
>3. Video 3: 754 Comments
>4. Video 4: 456 Comments
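
To make the choice to exclude replies concrete, here is a minimal sketch of pulling only top-level comment text out of one page of a `commentThreads` response. The `sample_page` dictionary is a hypothetical, trimmed-down response written for illustration, and `top_level_comments` is a helper name introduced here rather than part of the notebook.

```python
# Hypothetical, trimmed-down page of a commentThreads.list response.
# Replies are not part of the 'snippet' of a thread, so they never appear here.
sample_page = {
    "items": [
        {
            "snippet": {
                "topLevelComment": {"snippet": {"textDisplay": "First top-level comment"}},
                "totalReplyCount": 2,
            }
        },
        {
            "snippet": {
                "topLevelComment": {"snippet": {"textDisplay": "Second top-level comment"}},
                "totalReplyCount": 0,
            }
        },
    ]
}


def top_level_comments(page):
    """Return the text of each thread's top-level comment, ignoring replies."""
    return [
        item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        for item in page.get("items", [])
    ]


texts = top_level_comments(sample_page)
print(texts)
```

The first thread reports two replies, yet only its top-level text is collected, which matches the way the collection cells read `topLevelComment` only.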



In [None]:
import os
import csv
from google.oauth2 import service_account
from googleapiclient.discovery import build

def get_authenticated_service(api_key=None):
    if api_key:
        return build('youtube', 'v3', developerKey=api_key)
    else:
        # Replace 'path/to/credentials.json' with the path to your JSON service account key file
        credentials = service_account.Credentials.from_service_account_file('path/to/credentials.json', scopes=['https://www.googleapis.com/auth/youtube.force-ssl'])
        return build('youtube', 'v3', credentials=credentials)

def get_video_comments(service, **kwargs):
    comments = []
    results = service.commentThreads().list(**kwargs).execute()

    while results:
        for item in results['items']:
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            comments.append({'comment': comment})

        # Check if there are more comments
        if 'nextPageToken' in results:
            kwargs['pageToken'] = results['nextPageToken']
            results = service.commentThreads().list(**kwargs).execute()
        else:
            break

    return comments

def main():
    # Read the API key from an environment variable instead of hard-coding it.
    # 'YOUTUBE_API_KEY' is an assumed variable name; set it before running this cell.
    api_key = os.environ.get('YOUTUBE_API_KEY')
    service = get_authenticated_service(api_key)

    video_id = '201JSBJVaXg'  # Replace with the actual video ID

    comments = get_video_comments(
        service,
        part='snippet',
        videoId=video_id,
        textFormat='plainText',
    )

    # Specify the CSV file path
    csv_file_path = '/content/drive/MyDrive/Final Project - Intro To Python/video1comments.csv'

    with open(csv_file_path, 'w', newline='', encoding='utf-8') as csvfile:
        # Create a CSV writer object
        csv_writer = csv.writer(csvfile)

        # Write header row
        csv_writer.writerow(['Comment'])

        for i, comment_data in enumerate(comments, start=1):
            comment = comment_data['comment']

            # Write the comment to CSV
            csv_writer.writerow([comment])

    print(f"Comments saved to {csv_file_path}")

if __name__ == '__main__':
    main()


Comments saved to /content/drive/MyDrive/Final Project - Intro To Python/video1comments.csv


In [None]:
import os
import csv
from google.oauth2 import service_account
from googleapiclient.discovery import build

def get_authenticated_service(api_key=None):
    if api_key:
        return build('youtube', 'v3', developerKey=api_key)
    else:
        # Replace 'path/to/credentials.json' with the path to your JSON service account key file
        credentials = service_account.Credentials.from_service_account_file('path/to/credentials.json', scopes=['https://www.googleapis.com/auth/youtube.force-ssl'])
        return build('youtube', 'v3', credentials=credentials)

def get_video_comments(service, **kwargs):
    comments = []
    results = service.commentThreads().list(**kwargs).execute()

    while results:
        for item in results['items']:
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            comments.append({'comment': comment})

        # Check if there are more comments
        if 'nextPageToken' in results:
            kwargs['pageToken'] = results['nextPageToken']
            results = service.commentThreads().list(**kwargs).execute()
        else:
            break

    return comments

def main():
    # Read the API key from an environment variable instead of hard-coding it.
    # 'YOUTUBE_API_KEY' is an assumed variable name; set it before running this cell.
    api_key = os.environ.get('YOUTUBE_API_KEY')
    service = get_authenticated_service(api_key)

    video_id = 'OMwDXP0QmF8'  # Replace with the actual video ID

    comments = get_video_comments(
        service,
        part='snippet',
        videoId=video_id,
        textFormat='plainText',
    )

    # Specify the CSV file path
    csv_file_path = '/content/drive/MyDrive/Final Project - Intro To Python/video2comments.csv'

    with open(csv_file_path, 'w', newline='', encoding='utf-8') as csvfile:
        # Create a CSV writer object
        csv_writer = csv.writer(csvfile)

        # Write header row
        csv_writer.writerow(['Comment'])

        for i, comment_data in enumerate(comments, start=1):
            comment = comment_data['comment']

            # Write the comment to CSV
            csv_writer.writerow([comment])

    print(f"Comments saved to {csv_file_path}")

if __name__ == '__main__':
    main()


Comments saved to /content/drive/MyDrive/Final Project - Intro To Python/video2comments.csv


In [None]:
import os
import csv
from google.oauth2 import service_account
from googleapiclient.discovery import build

def get_authenticated_service(api_key=None):
    if api_key:
        return build('youtube', 'v3', developerKey=api_key)
    else:
        # Replace 'path/to/credentials.json' with the path to your JSON service account key file
        credentials = service_account.Credentials.from_service_account_file('path/to/credentials.json', scopes=['https://www.googleapis.com/auth/youtube.force-ssl'])
        return build('youtube', 'v3', credentials=credentials)

def get_video_comments(service, **kwargs):
    comments = []
    results = service.commentThreads().list(**kwargs).execute()

    while results:
        for item in results['items']:
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            comments.append({'comment': comment})

        # Check if there are more comments
        if 'nextPageToken' in results:
            kwargs['pageToken'] = results['nextPageToken']
            results = service.commentThreads().list(**kwargs).execute()
        else:
            break

    return comments

def main():
    # Read the API key from an environment variable instead of hard-coding it.
    # 'YOUTUBE_API_KEY' is an assumed variable name; set it before running this cell.
    api_key = os.environ.get('YOUTUBE_API_KEY')
    service = get_authenticated_service(api_key)

    video_id = 'RBhcxhzGltI'  # Replace with the actual video ID

    comments = get_video_comments(
        service,
        part='snippet',
        videoId=video_id,
        textFormat='plainText',
    )

    # Specify the CSV file path
    csv_file_path = '/content/drive/MyDrive/Final Project - Intro To Python/video3comments.csv'

    with open(csv_file_path, 'w', newline='', encoding='utf-8') as csvfile:
        # Create a CSV writer object
        csv_writer = csv.writer(csvfile)

        # Write header row
        csv_writer.writerow(['Comment'])

        for i, comment_data in enumerate(comments, start=1):
            comment = comment_data['comment']

            # Write the comment to CSV
            csv_writer.writerow([comment])

    print(f"Comments saved to {csv_file_path}")

if __name__ == '__main__':
    main()

Comments saved to /content/drive/MyDrive/Final Project - Intro To Python/video3comments.csv


In [None]:
import os
import csv
from google.oauth2 import service_account
from googleapiclient.discovery import build

def get_authenticated_service(api_key=None):
    if api_key:
        return build('youtube', 'v3', developerKey=api_key)
    else:
        # Replace 'path/to/credentials.json' with the path to your JSON service account key file
        credentials = service_account.Credentials.from_service_account_file('path/to/credentials.json', scopes=['https://www.googleapis.com/auth/youtube.force-ssl'])
        return build('youtube', 'v3', credentials=credentials)

def get_video_comments(service, **kwargs):
    comments = []
    results = service.commentThreads().list(**kwargs).execute()

    while results:
        for item in results['items']:
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            comments.append({'comment': comment})

        # Check if there are more comments
        if 'nextPageToken' in results:
            kwargs['pageToken'] = results['nextPageToken']
            results = service.commentThreads().list(**kwargs).execute()
        else:
            break

    return comments

def main():
    # Read the API key from an environment variable instead of hard-coding it.
    # 'YOUTUBE_API_KEY' is an assumed variable name; set it before running this cell.
    api_key = os.environ.get('YOUTUBE_API_KEY')
    service = get_authenticated_service(api_key)

    video_id = 'J3GS2nchryE'  # Replace with the actual video ID

    comments = get_video_comments(
        service,
        part='snippet',
        videoId=video_id,
        textFormat='plainText',
    )

    # Specify the CSV file path
    csv_file_path = '/content/drive/MyDrive/Final Project - Intro To Python/video4comments.csv'

    with open(csv_file_path, 'w', newline='', encoding='utf-8') as csvfile:
        # Create a CSV writer object
        csv_writer = csv.writer(csvfile)

        # Write header row
        csv_writer.writerow(['Comment'])

        for i, comment_data in enumerate(comments, start=1):
            comment = comment_data['comment']

            # Write the comment to CSV
            csv_writer.writerow([comment])

    print(f"Comments saved to {csv_file_path}")

if __name__ == '__main__':
    main()

Comments saved to /content/drive/MyDrive/Final Project - Intro To Python/video4comments.csv
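
The four collection cells above differ only in the video ID and the output file name. As a rough sketch of how they could be folded into one loop, the helper below takes the fetching step as a parameter so it can be tried without network access. `save_comments`, `collect_all`, and the `fetch` parameter are names introduced here for illustration, not part of the notebook.

```python
import csv

# Video IDs taken from the links listed earlier; output names follow the
# notebook's "videoNcomments.csv" pattern.
VIDEO_IDS = ["201JSBJVaXg", "OMwDXP0QmF8", "RBhcxhzGltI", "J3GS2nchryE"]


def save_comments(comments, csv_path):
    """Write a list of {'comment': text} dicts to a one-column CSV file."""
    with open(csv_path, "w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(["Comment"])
        writer.writerows([c["comment"]] for c in comments)


def collect_all(fetch, output_dir):
    """Fetch and save comments for every video.

    `fetch` is any callable that maps a video ID to a list of
    {'comment': text} dicts, such as a wrapper around get_video_comments.
    """
    saved = {}
    for number, video_id in enumerate(VIDEO_IDS, start=1):
        csv_path = f"{output_dir}/video{number}comments.csv"
        save_comments(fetch(video_id), csv_path)
        saved[video_id] = csv_path
    return saved
```

With the notebook's own functions in scope, `fetch` could be `lambda vid: get_video_comments(service, part='snippet', videoId=vid, textFormat='plainText')`.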


For the next step, I used the NLTK library to clean the data. Each individual .csv file was cleaned by removing stopwords and lemmatizing the remaining words. The results were saved to a new "clean" .csv file. The code from this step was obtained through ChatGPT. [(ChatGPT, 2023)](https://chat.openai.com/share/d0596dc3-1924-476f-b827-841cbcd0776e)
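
To illustrate what stopword removal does to a single comment, here is a small sketch that uses a hand-picked subset of English stopwords rather than NLTK's full list, and plain whitespace splitting rather than `nltk.word_tokenize`. `TOY_STOPWORDS`, `drop_stopwords`, and the example sentence are assumptions made up for this illustration.

```python
# A small hand-picked subset of English stopwords, for illustration only;
# the notebook itself uses NLTK's full English list.
TOY_STOPWORDS = {"the", "is", "a", "to", "of", "and", "not", "this"}


def drop_stopwords(text):
    """Split on whitespace and drop tokens whose lowercase form is a stopword."""
    return " ".join(w for w in text.split() if w.lower() not in TOY_STOPWORDS)


cleaned = drop_stopwords("This is not a threat to the military")
print(cleaned)
```

Note that "not" is also in NLTK's English stopword list, so a negation can disappear during cleaning; this is worth keeping in mind when reading the sentiment scores later on.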

In [None]:
import csv
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('punkt')  # Tokenizer models required by nltk.word_tokenize

def clean_text(text):
    # Tokenization
    words = nltk.word_tokenize(text)

    # Remove stopwords
    stop_words = set(stopwords.words('english'))
    filtered_words = [word for word in words if word.lower() not in stop_words]

    # Lemmatization using WordNet
    lemmatizer = WordNetLemmatizer()
    lemmatized_words = [lemmatizer.lemmatize(word) for word in filtered_words]

    # Join the cleaned words back into a sentence
    cleaned_text = ' '.join(lemmatized_words)

    return cleaned_text

def clean_csv(input_csv_path, output_csv_path):
    with open(input_csv_path, 'r', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile)
        header = next(reader)  # Read the header row

        # Find the index of the 'Comment' column
        comment_index = header.index('Comment') if 'Comment' in header else None

        if comment_index is not None:
            comments = []

            for row in reader:
                if len(row) > comment_index:
                    comment = row[comment_index]
                    cleaned_comment = clean_text(comment)
                    comments.append([cleaned_comment])

            # Write cleaned comments to a new CSV file
            with open(output_csv_path, 'w', newline='', encoding='utf-8') as output_csv:
                writer = csv.writer(output_csv)
                writer.writerow(['Comments'])
                writer.writerows(comments)

            print(f"Cleaned comments saved to {output_csv_path}")
        else:
            print("Error: 'Comment' column not found in the CSV file.")

if __name__ == '__main__':
    input_csv_path = '/content/drive/MyDrive/Final Project - Intro To Python/video1comments.csv'  # Replace with the path to your input CSV file
    output_csv_path = '/content/drive/MyDrive/Final Project - Intro To Python/cleanvideo1comments.csv'  # Replace with the desired path for the output cleaned CSV file

    clean_csv(input_csv_path, output_csv_path)


Cleaned comments saved to /content/drive/MyDrive/Final Project - Intro To Python/cleanvideo1comments.csv


[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [None]:
import csv
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('punkt')  # Tokenizer models required by nltk.word_tokenize

def clean_text(text):
    # Tokenization
    words = nltk.word_tokenize(text)

    # Remove stopwords
    stop_words = set(stopwords.words('english'))
    filtered_words = [word for word in words if word.lower() not in stop_words]

    # Lemmatization using WordNet
    lemmatizer = WordNetLemmatizer()
    lemmatized_words = [lemmatizer.lemmatize(word) for word in filtered_words]

    # Join the cleaned words back into a sentence
    cleaned_text = ' '.join(lemmatized_words)

    return cleaned_text

def clean_csv(input_csv_path, output_csv_path):
    with open(input_csv_path, 'r', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile)
        header = next(reader)  # Read the header row

        # Find the index of the 'Comment' column
        comment_index = header.index('Comment') if 'Comment' in header else None

        if comment_index is not None:
            comments = []

            for row in reader:
                if len(row) > comment_index:
                    comment = row[comment_index]
                    cleaned_comment = clean_text(comment)
                    comments.append([cleaned_comment])

            # Write cleaned comments to a new CSV file
            with open(output_csv_path, 'w', newline='', encoding='utf-8') as output_csv:
                writer = csv.writer(output_csv)
                writer.writerow(['Comments'])
                writer.writerows(comments)

            print(f"Cleaned comments saved to {output_csv_path}")
        else:
            print("Error: 'Comment' column not found in the CSV file.")

if __name__ == '__main__':
    input_csv_path = '/content/drive/MyDrive/Final Project - Intro To Python/video2comments.csv'  # Replace with the path to your input CSV file
    output_csv_path = '/content/drive/MyDrive/Final Project - Intro To Python/cleanvideo2comments.csv'  # Replace with the desired path for the output cleaned CSV file

    clean_csv(input_csv_path, output_csv_path)


[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


Cleaned comments saved to /content/drive/MyDrive/Final Project - Intro To Python/cleanvideo2comments.csv


In [None]:
import csv
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('punkt')  # Tokenizer models required by nltk.word_tokenize

def clean_text(text):
    # Tokenization
    words = nltk.word_tokenize(text)

    # Remove stopwords
    stop_words = set(stopwords.words('english'))
    filtered_words = [word for word in words if word.lower() not in stop_words]

    # Lemmatization using WordNet
    lemmatizer = WordNetLemmatizer()
    lemmatized_words = [lemmatizer.lemmatize(word) for word in filtered_words]

    # Join the cleaned words back into a sentence
    cleaned_text = ' '.join(lemmatized_words)

    return cleaned_text

def clean_csv(input_csv_path, output_csv_path):
    with open(input_csv_path, 'r', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile)
        header = next(reader)  # Read the header row

        # Find the index of the 'Comment' column
        comment_index = header.index('Comment') if 'Comment' in header else None

        if comment_index is not None:
            comments = []

            for row in reader:
                if len(row) > comment_index:
                    comment = row[comment_index]
                    cleaned_comment = clean_text(comment)
                    comments.append([cleaned_comment])

            # Write cleaned comments to a new CSV file
            with open(output_csv_path, 'w', newline='', encoding='utf-8') as output_csv:
                writer = csv.writer(output_csv)
                writer.writerow(['Comments'])
                writer.writerows(comments)

            print(f"Cleaned comments saved to {output_csv_path}")
        else:
            print("Error: 'Comment' column not found in the CSV file.")

if __name__ == '__main__':
    input_csv_path = '/content/drive/MyDrive/Final Project - Intro To Python/video3comments.csv'  # Replace with the path to your input CSV file
    output_csv_path = '/content/drive/MyDrive/Final Project - Intro To Python/cleanvideo3comments.csv'  # Replace with the desired path for the output cleaned CSV file

    clean_csv(input_csv_path, output_csv_path)


[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


Cleaned comments saved to /content/drive/MyDrive/Final Project - Intro To Python/cleanvideo3comments.csv


In [None]:
import csv
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('punkt')  # Tokenizer models required by nltk.word_tokenize

def clean_text(text):
    # Tokenization
    words = nltk.word_tokenize(text)

    # Remove stopwords
    stop_words = set(stopwords.words('english'))
    filtered_words = [word for word in words if word.lower() not in stop_words]

    # Lemmatization using WordNet
    lemmatizer = WordNetLemmatizer()
    lemmatized_words = [lemmatizer.lemmatize(word) for word in filtered_words]

    # Join the cleaned words back into a sentence
    cleaned_text = ' '.join(lemmatized_words)

    return cleaned_text

def clean_csv(input_csv_path, output_csv_path):
    with open(input_csv_path, 'r', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile)
        header = next(reader)  # Read the header row

        # Find the index of the 'Comment' column
        comment_index = header.index('Comment') if 'Comment' in header else None

        if comment_index is not None:
            comments = []

            for row in reader:
                if len(row) > comment_index:
                    comment = row[comment_index]
                    cleaned_comment = clean_text(comment)
                    comments.append([cleaned_comment])

            # Write cleaned comments to a new CSV file
            with open(output_csv_path, 'w', newline='', encoding='utf-8') as output_csv:
                writer = csv.writer(output_csv)
                writer.writerow(['Comments'])
                writer.writerows(comments)

            print(f"Cleaned comments saved to {output_csv_path}")
        else:
            print("Error: 'Comment' column not found in the CSV file.")

if __name__ == '__main__':
    input_csv_path = '/content/drive/MyDrive/Final Project - Intro To Python/video4comments.csv'  # Replace with the path to your input CSV file
    output_csv_path = '/content/drive/MyDrive/Final Project - Intro To Python/cleanvideo4comments.csv'  # Replace with the desired path for the output cleaned CSV file

    clean_csv(input_csv_path, output_csv_path)


[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


Cleaned comments saved to /content/drive/MyDrive/Final Project - Intro To Python/cleanvideo4comments.csv


Now I commence the data analysis phase. Firstly, I will import each cleaned .csv file as a dataframe, and subsequently, I will utilize the VADER tool to conduct a sentiment analysis. Following the Sentiment Analysis lesson and utilizing assistance from ChatGPT, I introduced a sentiment_score column displaying the compound sentiment analysis score for each comment. [(ChatGPT, 2023)](https://chat.openai.com/share/b550013a-ae48-40c8-8e37-3a5f2ed5f09a) I also incorporated a new bottom row to compute the average compound sentiment score for each video.

Moreover, I meticulously examined ten comments from each video with the highest positive compound sentiment scores and the highest negative sentiment scores. This scrutiny aims to objectively assess potential inaccuracies arising from sarcasm or similar textual nuances, which have been identified as common challenges with the VADER library.
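
As a minimal sketch of how the most positive and most negative comments can be pulled out for manual review, the example below ranks a few (comment, score) pairs in plain Python. The sample pairs are copied from the Video 1 output shown later in this notebook; `extremes` is a helper name introduced here, and on a DataFrame the same idea could be expressed with `nlargest` and `nsmallest`.

```python
# Sample (comment, compound score) pairs copied from the Video 1 output.
scored = [
    ("justified !", 0.4574),
    ("TRAITOR", 0.0),
    ("Smiling traitor Tuberville need replaced ! Impeach !", 0.555),
    ("danger national security military change policy move .", -0.25),
    ("People saying Tuberville secretly paid million China weaken American military .", -0.4215),
]


def extremes(rows, n):
    """Return the n highest-scoring rows and the n lowest-scoring rows (n >= 1)."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    most_positive = ranked[:n]
    most_negative = ranked[-n:][::-1]  # lowest score first
    return most_positive, most_negative


most_positive, most_negative = extremes(scored, 2)
print(most_positive)
print(most_negative)
```

In this sample, the comment calling for impeachment ranks as the most positive, which is the kind of mismatch the manual review is meant to catch.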

In the final phase, I generated an overall count per video, categorizing sentiments into positive, neutral, and negative. This comprehensive approach ensures a robust evaluation of sentiment distribution within the dataset, accounting for both positive and negative sentiments with a critical eye on potential inaccuracies.
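
The categorization step can be sketched as follows. The cutoffs of ±0.05 on the compound score are the ones commonly suggested for VADER; the thresholds actually used in this notebook are not shown in this section, so treat them as an assumption. The scores are the ten Video 1 values displayed later in the notebook, and `label` is a helper name introduced here.

```python
from collections import Counter


def label(compound, threshold=0.05):
    """Map a VADER compound score to 'positive', 'neutral', or 'negative'.

    The default threshold of 0.05 is an assumed, conventional cutoff.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Compound scores for the ten Video 1 comments displayed in this notebook.
scores = [-0.4215, -0.6705, 0.8243, 0.34, 0.4574, 0.0, 0.0, 0.0, 0.555, -0.25]

counts = Counter(label(s) for s in scores)
print(counts)
```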

In [38]:
!pip install vaderSentiment

Collecting vaderSentiment
  Downloading vaderSentiment-3.3.2-py2.py3-none-any.whl (125 kB)
     126.0/126.0 kB 1.4 MB/s eta 0:00:00
Installing collected packages: vaderSentiment
Successfully installed vaderSentiment-3.3.2


In [None]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

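# Initialize VADER once so it can be reused for every comment
sentimentAnalyser = SentimentIntensityAnalyzer()

# Helper used by the cells below: return VADER's compound score (-1 to 1) for one comment
def calculate_sentiment(comment):
    return sentimentAnalyser.polarity_scores(comment)['compound']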

In [39]:
import pandas as pd
pd.options.display.max_colwidth = 400

In [40]:
video1comments_df = pd.read_csv('/content/drive/MyDrive/Final Project - Intro To Python/cleanvideo1comments.csv', delimiter=',', encoding='utf-8')

In [41]:
video1comments_df

Unnamed: 0,Comments
0,People saying Tuberville secretly paid million China weaken American military .
1,"Tuberville never served day life action designed hurt America , American military family help China Russia . Sad people vote imbecile like Tuberville ."
2,"Republicans nothing Fascists , Racists hate woman , black , mexican etc .. Republicans blocking hundred Military promotion want control woman 's right choose want get pregnant . GOP joke come military readiness ."
3,wonder abortion another Putincan working Russia . certainly n't one .
4,justified !
5,TRAITOR
6,Pure $ hit ya tub
7,"Google : Trump , Nationalist… . Fred Trump ( ’ dad ) , KKK . Vladimir Putin Nationalist 🤔"
8,Smiling traitor Tuberville need replaced ! Impeach !
9,danger national security military change policy move .


In [43]:
video1comments_df['sentiment_score'] = video1comments_df['Comments'].apply(calculate_sentiment)



In [44]:
video1comments_df

Unnamed: 0,Comments,sentiment_score
0,People saying Tuberville secretly paid million China weaken American military .,-0.4215
1,"Tuberville never served day life action designed hurt America , American military family help China Russia . Sad people vote imbecile like Tuberville .",-0.6705
2,"Republicans nothing Fascists , Racists hate woman , black , mexican etc .. Republicans blocking hundred Military promotion want control woman 's right choose want get pregnant . GOP joke come military readiness .",0.8243
3,wonder abortion another Putincan working Russia . certainly n't one .,0.34
4,justified !,0.4574
5,TRAITOR,0.0
6,Pure $ hit ya tub,0.0
7,"Google : Trump , Nationalist… . Fred Trump ( ’ dad ) , KKK . Vladimir Putin Nationalist 🤔",0.0
8,Smiling traitor Tuberville need replaced ! Impeach !,0.555
9,danger national security military change policy move .,-0.25


In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video1comments_df is your DataFrame with a column named 'Comments'

# Create the VADER analyzer once so it is not rebuilt for every comment
analyzer = SentimentIntensityAnalyzer()

# Function to calculate the compound sentiment score using VADER
def calculate_sentiment(comment):
    return analyzer.polarity_scores(comment)['compound']

# Apply the sentiment analysis function to the 'Comments' column
video1comments_df['sentiment_score'] = video1comments_df['Comments'].apply(calculate_sentiment)

# Calculate the average sentiment score
average_sentiment_score = video1comments_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

# Append the average row to the DataFrame
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead)
video1comments_df = pd.concat([video1comments_df, pd.DataFrame([average_row])], ignore_index=True)

# Display the updated DataFrame
print(video1comments_df)


[nltk_data] Downloading package vader_lexicon to /root/nltk_data...




In [None]:
video1comments_df

Unnamed: 0,Comments,sentiment_score
0,People saying Tuberville secretly paid million China weaken American military .,-0.4215
1,"Tuberville never served day life action designed hurt America , American military family help China Russia . Sad people vote imbecile like Tuberville .",-0.6705
2,"Republicans nothing Fascists , Racists hate woman , black , mexican etc .. Republicans blocking hundred Military promotion want control woman 's right choose want get pregnant . GOP joke come military readiness .",0.8243
3,wonder abortion another Putincan working Russia . certainly n't one .,0.34
4,justified !,0.4574
5,TRAITOR,0.0
6,Pure $ hit ya tub,0.0
7,"Google : Trump , Nationalist… . Fred Trump ( ’ dad ) , KKK . Vladimir Putin Nationalist 🤔",0.0
8,Smiling traitor Tuberville need replaced ! Impeach !,0.555
9,danger national security military change policy move .,-0.25


In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video1comments_df is your DataFrame with a column named 'Comments'

# Create the VADER analyzer once so it is not rebuilt for every comment
analyzer = SentimentIntensityAnalyzer()

# Function to calculate the compound sentiment score using VADER
def calculate_sentiment(comment):
    return analyzer.polarity_scores(comment)['compound']

# Apply the sentiment analysis function to the 'Comments' column
video1comments_df['sentiment_score'] = video1comments_df['Comments'].apply(calculate_sentiment)

# Calculate the average sentiment score
average_sentiment_score = video1comments_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

# Append the average row to the DataFrame
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead)
video1comments_df = pd.concat([video1comments_df, pd.DataFrame([average_row])], ignore_index=True)

# Sort the DataFrame by 'sentiment_score' in descending order
sorted_df = video1comments_df.sort_values(by='sentiment_score', ascending=False)

# Display the top 10 most positive sentiments
top_10_positive = sorted_df.head(10)
print("Top 10 Most Positive Sentiments:")
print(top_10_positive[['Comments', 'sentiment_score']])

# Sort the DataFrame by 'sentiment_score' in ascending order to get the most negative sentiments first
sorted_df_negative = video1comments_df.sort_values(by='sentiment_score', ascending=True)

# Display the top 10 most negative sentiments
top_10_negative = sorted_df_negative.head(10)
print("Top 10 Most Negative Sentiments:")
print(top_10_negative[['Comments', 'sentiment_score']])


[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!




In [None]:
top_10_positive

Unnamed: 0,Comments,sentiment_score
21,"’ going ? thought intelligent human being . ’ teach ancient history school anymore Greek army Spartans ? think better point make would ’ attract better people fill important position DC ? reason ’ attract better people Washington DC people working like Tuckerville ! really love country , like say , pay attention ’ sending DC fill important position federal government .",0.9803
20,"think awesome question ask Fox News constantly trying divide u could ’ want u vote together change maybe 99 % u chance get wage increase rather big corporation increasing price 200 % forcing wage ! Big money , mega wealthy , 1 % ’ motivating Fox News ? ’ big smokescreen keep u focusing real issue ! Corporate greed !",0.9114
2,"Republicans nothing Fascists , Racists hate woman , black , mexican etc .. Republicans blocking hundred Military promotion want control woman 's right choose want get pregnant . GOP joke come military readiness .",0.8243
8,Smiling traitor Tuberville need replaced ! Impeach !,0.555
23,Go home Florida Tommy . Try winning State .,0.5267
10,"America Autocracy Totalitarian . TRUTH politician America imposter . anti-American . American politician American citizen encourages gun violence , incites civil disobedience , involved aiding abetting banned political office America . powerful gang thug want government rule long bend way . DEHUMANIZING . gang thug saying know woman need healthcare service . asinine ... VAGINA shot bullet woul...",0.504
25,go another want dictator . Maybe retired Generals volunteer year pulled military Alabama close national guard station really value .,0.4576
4,justified !,0.4574
16,Tuberville talk like donkey ear donkey must donkey .,0.3612
3,wonder abortion another Putincan working Russia . certainly n't one .,0.34


In [None]:
top_10_negative

Unnamed: 0,Comments,sentiment_score
1,"Tuberville never served day life action designed hurt America , American military family help China Russia . Sad people vote imbecile like Tuberville .",-0.6705
19,Tuberville real jerk causing unnecessary problem military .,-0.6249
22,Tuberville isa traitor criminal .,-0.5267
15,Tuberville disgrace .,-0.4939
26,terrorist like putin,-0.4939
11,Tommy Traitorville .... Florida resident bringing shame upon Alabama .,-0.4767
0,People saying Tuberville secretly paid million China weaken American military .,-0.4215
18,people vote crazy !,-0.4003
12,Crazy part vet Republicans🤷its mindblowing🙄,-0.34
13,America already lost democracy . minority making law majority,-0.3182


In [None]:
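# Count comments by the sign of the compound score.
# Note: any 'Average Sentiment Score' summary row appended above is a row of
# this DataFrame, so it is included in these counts.
positive_count = len(video1comments_df[video1comments_df['sentiment_score'] > 0])
neutral_count = len(video1comments_df[video1comments_df['sentiment_score'] == 0])
negative_count = len(video1comments_df[video1comments_df['sentiment_score'] < 0])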

# Display the counts
print("Number of Positive Sentiments:", positive_count)
print("Number of Neutral Sentiments:", neutral_count)
print("Number of Negative Sentiments:", negative_count)

Number of Positive Sentiments: 11
Number of Neutral Sentiments: 6
Number of Negative Sentiments: 12
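
The counts above treat any nonzero compound score as positive or negative. The VADER documentation suggests a small neutral band instead: a compound score of at least 0.05 is positive, at most -0.05 is negative, and anything in between is neutral. Below is a minimal sketch of that convention, applied to the ten video 1 scores displayed earlier. The function name `label_compound` and its `threshold` parameter are illustrative assumptions and are not part of this notebook.

```python
from collections import Counter

def label_compound(score, threshold=0.05):
    """Label a VADER compound score using a symmetric neutral band."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"

# Compound scores of the first ten video 1 comments, as displayed earlier
scores = [-0.4215, -0.6705, 0.8243, 0.34, 0.4574, 0.0, 0.0, 0.0, 0.555, -0.25]

# Tally how many comments fall under each label
label_counts = Counter(label_compound(s) for s in scores)
print(label_counts)
```

With a neutral band, weakly scored comments are no longer forced into the positive or negative group, which may matter for short comments that contain only one mildly scored word.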


Because the number of comments varies widely across the four videos, I also drew a random sample of 20 comments from each video and ran the sentiment analysis on those samples to reduce sample bias. This script was built with assistance from ChatGPT ([ChatGPT, 2023](https://chat.openai.com/share/e16f6ddb-5203-478b-b45a-05fc2f44e520)).
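
To illustrate why equal-size samples help, here is a minimal sketch using only the Python standard library. The scores are hypothetical placeholders, not data from this project, and the helper name `sample_mean` is an assumption made for illustration. It compares a pooled average, where the larger video dominates, with an average over same-size samples, where each video carries equal weight.

```python
import random
from statistics import mean

def sample_mean(scores, n=20, seed=42):
    """Mean of a reproducible random sample of up to n scores."""
    rng = random.Random(seed)
    k = min(n, len(scores))  # never ask for more items than exist
    return mean(rng.sample(scores, k))

# Hypothetical compound scores for two videos of very different sizes
small_video = [0.5] * 30     # 30 comments, all mildly positive
large_video = [-0.5] * 450   # 450 comments, all mildly negative

# Pooling every comment lets the larger video dominate the average
pooled = mean(small_video + large_video)

# Sampling the same number of comments per video weights each video equally
balanced = mean([sample_mean(small_video), sample_mean(large_video)])

print(round(pooled, 4), round(balanced, 4))
```

The fixed `seed` plays the same role as `random_state=42` in the cells that follow: rerunning the code draws the same sample.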

In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video1comments_df is your DataFrame with a column named 'Comments'

# Create the VADER analyzer once so it is not rebuilt for every comment
analyzer = SentimentIntensityAnalyzer()

# Function to calculate the compound sentiment score using VADER
def calculate_sentiment(comment):
    return analyzer.polarity_scores(comment)['compound']

# Apply the sentiment analysis function to the 'Comments' column
video1comments_df['sentiment_score'] = video1comments_df['Comments'].apply(calculate_sentiment)

# Create a new DataFrame with a sample of 20 comments
video1sample_df = video1comments_df.sample(n=20, random_state=42).reset_index(drop=True)

# Calculate the average sentiment score for the sample
average_sentiment_score = video1sample_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

# Append the average row to the sample DataFrame
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead)
video1sample_df = pd.concat([video1sample_df, pd.DataFrame([average_row])], ignore_index=True)

# Display the updated sample DataFrame
print(video1sample_df)


[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


                                             Comments  sentiment_score
0   Smiling traitor Tuberville need replaced ! Imp...         0.555000
1   America already lost democracy . minority maki...        -0.318200
2   danger national security military change polic...        -0.250000
3   ’ going ? thought intelligent human being . ’ ...         0.980300
4   People saying Tuberville secretly paid million...        -0.421500
5   Tommy Traitorville .... Florida resident bring...        -0.476700
6   Tuberville talk like donkey ear donkey must do...         0.361200
7   's custom Alabama given size ear Tommy Tuperwa...         0.000000
8         Crazy part vet Republicans🤷its mindblowing🙄        -0.340000
9                                  Tommy watch back .         0.000000
10  Tuberville never served day life action design...        -0.670500
11                                        justified !         0.457400
12                                            TRAITOR         0.000000
13  Re



In [None]:
video2comments_df = pd.read_csv('/content/drive/MyDrive/Final Project - Intro To Python/cleanvideo2comments.csv', delimiter=',', encoding='utf-8')

In [None]:
video2comments_df

Unnamed: 0,Comments
0,demoted see feel . 's crazy give one person much power .
1,new soy military…🤮
2,block senator 's pay ? Better fact blocked also .
3,Top-Down corruption within Media/Government industrial complex . Needs totally rebuilt .
4,one guy ?
...,...
451,"Within next hour , theyll saying “ dEmOcRaTs ” fault people didnt get promoted . . know republican dont care military vet ."
452,Good woke military BS need thrown window
453,Tuberville waiting master trump 's reelection
454,"traitor paid russia , like GOP"


In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video2comments_df is your DataFrame with a column named 'Comments'

# Create the VADER analyzer once so it is not rebuilt for every comment
analyzer = SentimentIntensityAnalyzer()

# Function to calculate the compound sentiment score using VADER
def calculate_sentiment(comment):
    return analyzer.polarity_scores(comment)['compound']

# Apply the sentiment analysis function to the 'Comments' column
video2comments_df['sentiment_score'] = video2comments_df['Comments'].apply(calculate_sentiment)

# Calculate the average sentiment score
average_sentiment_score = video2comments_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

# Append the average row to the DataFrame
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead)
video2comments_df = pd.concat([video2comments_df, pd.DataFrame([average_row])], ignore_index=True)

# Sort the DataFrame by 'sentiment_score' in descending order
sorted_df = video2comments_df.sort_values(by='sentiment_score', ascending=False)

# Display the top 10 most positive sentiments
top_10_positive = sorted_df.head(10)
print("Top 10 Most Positive Sentiments:")
print(top_10_positive[['Comments', 'sentiment_score']])

# Sort the DataFrame by 'sentiment_score' in ascending order to get the most negative sentiments first
sorted_df_negative = video2comments_df.sort_values(by='sentiment_score', ascending=True)

# Display the top 10 most negative sentiments
top_10_negative = sorted_df_negative.head(10)
print("Top 10 Most Negative Sentiments:")
print(top_10_negative[['Comments', 'sentiment_score']])

[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!




In [None]:
video2comments_df

Unnamed: 0,Comments,sentiment_score
0,demoted see feel . 's crazy give one person much power .,-0.34000
1,new soy military…🤮,0.00000
2,block senator 's pay ? Better fact blocked also .,-0.36120
3,Top-Down corruption within Media/Government industrial complex . Needs totally rebuilt .,0.00000
4,one guy ?,0.00000
...,...,...
452,Good woke military BS need thrown window,0.44040
453,Tuberville waiting master trump 's reelection,0.00000
454,"traitor paid russia , like GOP",0.36120
455,time impeach Tommy traitor ..,0.00000


In [None]:
top_10_positive

Unnamed: 0,Comments,sentiment_score
175,"born 50 's , one thing kept secure America Military . honor love Vets , honor POW 's honor currently serve . Thank service making sacrifice . 🇺🇲",0.9565
258,'' u help u always expose evil always `` GREAT JOB ... THANX 4 MAKING Tee LIONS NAMED LEO music worldwide . LOVE ... ! ! ! .... MUCH LOVE. ! !,0.9389
179,THANK GOD GREAT PATRIOT ❤❤❤❤,0.8979
245,"military always strong , full people willing defend interest value REAL American People , n't fucktards like one demanding Tuberville stop blocking . always ready sacrifice individual want focus need nation security need threat foreign domestic . stop allowing Socialists decide get promotion n't . Tuberville proved madness political theatre , `` Democrats '' real threat national security . mea...",0.896
260,love focused national security danger actually ’ tell block 400 promotion promotion given pentagon new policy promoting NCO officer based diversity like black NCO officer . ’ surprised new would say abortion nothing promotion would medcom civilian side thing . body promoting color skin,0.8834
408,Wow good job fox ! issue around almost year ! ! military retire ’ stay rank losing top pick important military position ! ’ definite issue !,0.8741
92,"feel hold earned promotion . However , think tax payer oay fir state abortion reproductive care either . fall individual . Maybe inquiry regard secual behavior personal responsibility need take place . serve country control sex life ? Hiw trusted life others even care unborn child ?",0.8708
178,Wow ! ! Really ! ! Wow ! ! .,0.8679
370,"'m Republican , military wife . believe 're stationed state country lack adequate medical care , military responsible helping receive proper care .",0.8591
157,"take , cabana boy getting sand castle knocked ? ? ? ...... Awesome ! 2 men rank colonel left military , 'd amazed",0.8455


In [None]:
top_10_negative

Unnamed: 0,Comments,sentiment_score
439,"Updated Recap : Report Nancy Barr : Sally Knorr 's Hospice House Isaiah House Rochester NY . Founded one year Sally Knorr admitted . Sally Knorr burglarized Isaiah House . Receptionist infiltration Isaiah House : provided false information . underworld plotting Amy Knorr , patterned Sally Knorr Isaiah House . statement Aaron Khodorkovsky English underworld Leisure Vale Assisted Living currentl...",-0.9972
68,military gone WOKE WEAK ! ! leader back stabbed soldier defending country ! ! abandoned field forced vaccination indoctrination racism fake gender denied faith OFFICER PROMOTED UNEQUIVOCALLY PROVE DEFEND CONSTITUTION ENEMIES ABROAD WITHIN POLITICIANS DESTROYING COUNTRY ! ! ! !,-0.9793
276,"attack Lawrence Richard , 10 hr ago ￼ Fox News Follow Emergency response official said least 85 people confirmed dead `` mistaken '' army drone attack religious gathering northwest Nigeria . victim killed Sunday night drone `` targeting terrorist bandit '' Kaduna state ’ Tudun Biri village , according government security official . observing Muslim holiday .",-0.9712
80,"'s disgrace . abomination ! uneducated lowlife . honor . courage . commitment . Balls . 's another terrorist holding government hostage . 's LOSER . could NEVER make military . even coast guard . pathetic little man . 's going get karma . likely testicular cancer . , need ball ! POS ! blame every trashy GED moron voted . hey , football coach challenging military 😑",-0.9662
83,"let get straight military paying abortion lol country damn joke wasting money un-deployable soldier 's , life death situation abortion Healthcare killing another human sick twisted reasoning say `` human '' `` body choice '' well remember everybody else planet position thing kept ending trash piece fact mom made choice keep ungrateful as raise beter worse got chance life somebody loved enough ...",-0.9531
63,Wtf idiot . get abortion . battlefield training ’ pregnant idiot . get position ? ’ guy football coach beforehand ? Wtf,-0.9528
23,"n't national security already completely compromised ? quote Hillary Clinton . difference make ? Let 's try act like military n't already co-opted communist . Let 's try act like ever used . president leftist delegate one sold nation evil . governor 's sick unborn baby murdered . Clearly realize governor agenda . However , act though action kind threat country completely disingenuous . action ...",-0.9494
294,"Ya know , woman last gatekeeper womb . matter physical condition station life , want baby , allowing conception one ? Unless raped , otherwise forced , unwanted pregnancy land , whose responsibility avoid unwanted pregnancy first place ? wear seatbelt prevent injury , take vaccine prevent illness , use condom say prevent STDs , `` follow science '' protect others , woman ca n't bothered expect...",-0.9451
41,😢😢😢the LONG TERM DAMAGE zealot caused America ’ military readiness ludicrous . also hurt servicemen/womens . rely promotion . many qualified member lose political stunt ? ! ? !,-0.9369
315,"Senator Tabitha Tuberville R-Al life Florida done irreversible damage republican party/cult 's already collapsing due overturning Roe v Wade , threatening terminate constitution , threatening eliminate social security medicare , denying global warming . Tabitha Tuberville need resign go retire Florida life .",-0.93


In [None]:
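# Count comments by the sign of the compound score.
# Note: any 'Average Sentiment Score' summary row appended above is a row of
# this DataFrame, so it is included in these counts.
positive_count = len(video2comments_df[video2comments_df['sentiment_score'] > 0])
neutral_count = len(video2comments_df[video2comments_df['sentiment_score'] == 0])
negative_count = len(video2comments_df[video2comments_df['sentiment_score'] < 0])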

# Display the counts
print("Number of Positive Sentiments:", positive_count)
print("Number of Neutral Sentiments:", neutral_count)
print("Number of Negative Sentiments:", negative_count)

Number of Positive Sentiments: 150
Number of Neutral Sentiments: 110
Number of Negative Sentiments: 197
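
Because video 1 and video 2 have very different numbers of comments, raw counts are hard to compare directly. A minimal sketch of converting the printed counts into percentages follows. The helper name `sentiment_shares` is an illustrative assumption, and the counts are taken at face value from the outputs above.

```python
def sentiment_shares(positive, neutral, negative):
    """Convert raw label counts into percentages of the total."""
    total = positive + neutral + negative
    return {
        "positive": round(100 * positive / total, 1),
        "neutral": round(100 * neutral / total, 1),
        "negative": round(100 * negative / total, 1),
    }

# Counts printed by the notebook for each video
video1_shares = sentiment_shares(11, 6, 12)
video2_shares = sentiment_shares(150, 110, 197)

print("Video 1:", video1_shares)
print("Video 2:", video2_shares)
```

On these counts, negative comments make up roughly 41% of video 1 and 43% of video 2, so the two videos look more alike in proportion than their raw totals suggest.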


In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video2comments_df is your DataFrame with a column named 'Comments'

# Create the VADER analyzer once so it is not rebuilt for every comment
analyzer = SentimentIntensityAnalyzer()

# Function to calculate the compound sentiment score using VADER
def calculate_sentiment(comment):
    return analyzer.polarity_scores(comment)['compound']

# Apply the sentiment analysis function to the 'Comments' column
video2comments_df['sentiment_score'] = video2comments_df['Comments'].apply(calculate_sentiment)

# Create a new DataFrame with a sample of 20 comments
video2sample_df = video2comments_df.sample(n=20, random_state=42).reset_index(drop=True)

# Calculate the average sentiment score for the sample
average_sentiment_score = video2sample_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

# Append the average row to the sample DataFrame
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead)
video2sample_df = pd.concat([video2sample_df, pd.DataFrame([average_row])], ignore_index=True)

# Display the updated sample DataFrame
print(video2sample_df)

[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


                                             Comments  sentiment_score
0   absolutely right block . abortion TAXPAYERS EX...         -0.78930
1   Tuber-vile flubberville ! Horrible unpatriotic...         -0.58480
2                                    Lmfao QOP again😂          0.54230
3   Good enough letting try turn spandex pant brig...         -0.55740
4   ca n't pay military staff already paying Ukrai...          0.07620
5                                   Scumbag forever !         -0.66960
6                       😂😂😂😂😂😂😂 . understand reason ?          0.00000
7   make senator think block anything military mil...         -0.44040
8   lifted hold promotion . 400 brought passed one...          0.42150
9                         trash senator . Time vote .          0.00000
10                                            F ( * *          0.00000
11  Republicans especially trump hate soldier , ye...         -0.80590
12               Never let Republican tell care troop          0.49390
13  's



In [None]:
video3comments_df = pd.read_csv('/content/drive/MyDrive/Final Project - Intro To Python/cleanvideo3comments.csv', delimiter=',', encoding='utf-8')

In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video3comments_df is your DataFrame with a column named 'Comments'

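# Create the VADER analyzer once so it is not rebuilt for every comment
analyzer = SentimentIntensityAnalyzer()

# Function to calculate the compound sentiment score using VADER
# (str() guards against non-string values in the 'Comments' column)
def calculate_sentiment(comment):
    return analyzer.polarity_scores(str(comment))['compound']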

# Replace NaN values with an empty string in the 'Comments' column
# (assign the result back rather than using inplace=True on a column selection)
video3comments_df['Comments'] = video3comments_df['Comments'].fillna('')

# Apply the sentiment analysis function to the 'Comments' column
video3comments_df['sentiment_score'] = video3comments_df['Comments'].apply(calculate_sentiment)

# Calculate the average sentiment score
average_sentiment_score = video3comments_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

# Append the average row to the DataFrame
# (DataFrame.append was removed in pandas 2.0, so use pd.concat instead)
video3comments_df = pd.concat([video3comments_df, pd.DataFrame([average_row])], ignore_index=True)

# Sort the DataFrame by 'sentiment_score' in descending order
sorted_df = video3comments_df.sort_values(by='sentiment_score', ascending=False)

# Display the top 10 most positive sentiments
top_10_positive = sorted_df.head(10)
print("Top 10 Most Positive Sentiments:")
print(top_10_positive[['Comments', 'sentiment_score']])

# Sort the DataFrame by 'sentiment_score' in ascending order to get the most negative sentiments first
sorted_df_negative = video3comments_df.sort_values(by='sentiment_score', ascending=True)

# Display the top 10 most negative sentiments
top_10_negative = sorted_df_negative.head(10)
print("Top 10 Most Negative Sentiments:")
print(top_10_negative[['Comments', 'sentiment_score']])






[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


Top 10 Most Positive Sentiments:
                                              Comments  sentiment_score
514  Ukraine 's slow advance providing FREE DEMOCRA...           0.9922
63   Mr. Tuberville bound oath give military reques...           0.9912
348  Lol 😂 defending American interest Iraq ? Pray ...           0.9760
326  God interview 's quite obvious see decline val...           0.9548
259  GOP long hold ideal used . 's precedence . Hen...           0.9482
149  Maybe born 1990 give le room speak ; help obse...           0.9217
614  please change rule alow jerkass like tubervill...           0.9201
475  Republican Party , party law order . . party n...           0.9186
245  Republicans , Russia China LOVE . love X presi...           0.9141
366  ’ kind sad ’ enough peer pressure Congress Tub...           0.9108
Top 10 Most Negative Sentiments:
                                              Comments  sentiment_score
426  Remember Tuberville also voted impeaching 45 h...          -0.983



In [None]:
video3comments_df

Unnamed: 0,Comments,sentiment_score
0,sympathise wholeheartedly spouse Secretary Arm...,-0.510600
1,Anybody running congress senate President psyc...,-0.541100
2,like black nationalist . ring . jackasz ? !,0.419900
3,"one race , race",0.000000
4,Put form military every race mankind .,0.000000
...,...,...
750,stand back stand ....... 's proud boy,0.476700
751,"Tuesday night Ashley shower night , Joey love ...",0.636900
752,toddler came two spirit gender queer .,0.177900
753,Poor Joe Hunter going thing,-0.476700


In [None]:
top_10_positive

Unnamed: 0,Comments,sentiment_score
514,"Ukraine 's slow advance providing FREE DEMOCRATIC country world incredibly great service . Ukraine fighting Free World . Ukraine best put Russia place DEFENDING FREE WORLD PUTIN 'S ATTEMPT IRRADICATE FREEDOM DEMOCRACY . West , NATO , EU USA need provide support Ukraine Ukraine need support Democracy Freedom . Ukraine taking human infrastructure loss protect World 's freedom . NATO lost 1 soldi...",0.9922
63,"Mr. Tuberville bound oath give military request time regardless position Senator . Every citizen held oath . Mr. Tuberville want remain US citizen ask military assist regard request immediately . Mr. Tuberville 's oath citizenship . _______________________________________________________________________ Naturalization Oath Allegiance United States America Oath '' hereby declare , oath , absolu...",0.9912
348,Lol 😂 defending American interest Iraq ? Pray ? n't America interest restricted border like country world ? Imagine China Russia North Korea interest USA determined best served sending military inside USA bombing market place sport stadium ?,0.976
326,God interview 's quite obvious see decline value military come . people place call leader would never dare let anyone child grandchild join armed force United States right worst shape ever issue paying someone 's abortion point 's whole idea value leader military like three value military personnel watching Russians Chinese discipline value follow pray God never go battle kind conventional war...,0.9548
259,GOP long hold ideal used . 's precedence . Hence love claim Lincoln Rep truth ideal party flipped . n't anyone else see happening ? ? ? caused first time ? information hard find .,0.9482
149,"Maybe born 1990 give le room speak ; help observe decline political dignity United States . Senator Tuberville declining speak medium outlet baffling . Regardless disagree one side , dignity respect constituent placed office . would actually like hear response regarding issue . response could probably change mind could hear perspective . Instead comfortable coward .",0.9217
614,"please change rule alow jerkass like tuberville interfer serious matter like , well hope military pay back kind comming election .",0.9201
475,"Republican Party , party law order . . party national security . . party support troop . . conservative party . . position MAGA , Republican Party , weakest Speaker history disgraced felon leader extremist position . already cost u standing world financial market . going force government shutdown . minority rule . Vote blue .",0.9186
245,"Republicans , Russia China LOVE . love X president tried screw election , screwing readiness . hope kid grandkids answer school book .",0.9141
366,’ kind sad ’ enough peer pressure Congress Tuberville ’ little friend ’ work help understand basic . ’ time dump Tuberville . demonstrates elected official could care le people defending nation . military face familial issue like everyone else . Apparently Tuberville lack compassion support protect country,0.9108


In [None]:
top_10_negative

Unnamed: 0,Comments,sentiment_score
426,Remember Tuberville also voted impeaching 45 hater democracy . Let butt-scratcher racist die hell old Alabama sends new black congressional representative Washington . old south die slow death atrophy ineptness . sun rising new south anyway . sun rising . service men woman deserve better . return service throw aside beg respect . wrong u nation ? ca n't support military way would expect people...,-0.9834
122,"DEMOCRATS WOULD COME GET HALFWAY DECENT POLICIES EVERYBODY REPUBLICANS MAY WORK DEMOCRATS ...... HELL DEMOCRATS CA N'T EVEN GO BORDER SEE COULD SEE IMMIGRANTS COMING COUNTRY ...... GOTTEN BAD GOING HARD GET ....... , DEMOCRATS CANT EVEN GO BORDER ...... HARRIS WENT ONE TIME TWO TACOS WENT BACK ...... ATLEAST TRUMP PRESIDENT ONE THIRD IMMIGRANTS COMING ACROSS ..... GOT BAD NEW YORK , MAYOR DOES...",-0.9831
316,radical right wanting clean deep state- `` diversity abortion demand '' lefty stuff- extreme right crazy mini trump bullies- country squish crazy back rock came from- crazy family member speaking public bizarre fool are- also serf scare away ally let enemy see weak become nation- pick AR 15 shoot one another school mall work together defend nation-what mess Nation .,-0.9709
420,Wsy go republican smh . Real patriot btw ya dismay democracy voting two narcissistic clown think ’ royal Desantis trump mention criminal destroying nato niw . really red state representative smh shit disgrace,-0.9648
240,"disgrace know happening , broadcast whomever might listening also disgrace . 'm tired listening `` American people '' stand complain literally nothing . pas buck , n't armed force duty protect enemy foreign domestic ? Wake force , domestic , admitted , taunted country . stand country going Washington , get thrown jail ! ! force reading responding `` people 's demand ? '' stand behind beside fr...",-0.9597
305,Imagine cared much accountability 13 service member NEEDLESSLY lost life Kabul airport coward star collar denied air strike would killed terrorist carried attack .,-0.9595
517,"dumbest Senator Senate , quite feat , power also scare people . 's clout amongst people actively root downfall country , rebuild image want . Hateful , petty , mean , isolated , dog eat dog . n't love America , hate terrifies minority resort childish BS like 's , desperation , cowardice , lie .",-0.9507
256,"Common sense , anyone ? idiocracy . veteran ask hell charge . asshats worry sick vet using herb crap play . Little Tommy 's drilling hole lifeboat . WTF .",-0.948
418,"`` Whores power , slut money pathetically fearful . `` , John Heilemann really ca n't get much sleazier “ today 's Republican party whore house . ”",-0.9468
516,terrorist senator arrested FBI act treason charged federal prosecutor maximum possible punishment .,-0.9403


In [None]:
positive_count = len(video3comments_df[video3comments_df['sentiment_score'] > 0])
neutral_count = len(video3comments_df[video3comments_df['sentiment_score'] == 0])
negative_count = len(video3comments_df[video3comments_df['sentiment_score'] < 0])

# Display the counts
print("Number of Positive Sentiments:", positive_count)
print("Number of Neutral Sentiments:", neutral_count)
print("Number of Negative Sentiments:", negative_count)

Number of Positive Sentiments: 233
Number of Neutral Sentiments: 170
Number of Negative Sentiments: 352


In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video3comments_df is your DataFrame with a column named 'Comments'

# Drop rows with missing or NaN values in the 'Comments' column
video3comments_df = video3comments_df.dropna(subset=['Comments'])

# Function to calculate sentiment score using VADER
def calculate_sentiment(comment):
    analyzer = SentimentIntensityAnalyzer()
    sentiment_score = analyzer.polarity_scores(str(comment))['compound']
    return sentiment_score

# Apply the sentiment analysis function to the 'Comments' column
video3comments_df['sentiment_score'] = video3comments_df['Comments'].apply(calculate_sentiment)

# Create a new DataFrame with a sample of 20 comments
video3sample_df = video3comments_df.sample(n=20, random_state=42).reset_index(drop=True)

# Calculate the average sentiment score for the sample
average_sentiment_score = video3sample_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

# Append the average row to the sample DataFrame
# (DataFrame.append was removed in pandas 2.0; pd.concat is the replacement)
video3sample_df = pd.concat([video3sample_df, pd.DataFrame([average_row])], ignore_index=True)

# Display the updated sample DataFrame
print(video3sample_df)


[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


                                             Comments  sentiment_score
0                          Tuberville anti American .        -0.318200
1                               Tubberville traitor .         0.000000
2                     one shlub trying feel important         0.202300
3         word others party . intentional whole GOP !         0.457400
4   Time start moving military base asset Alabama ...         0.612400
5   Senator Tuberville 's behavior illustrative cu...         0.680800
6   'S AMAZING . one party , .. let 's call Repugn...         0.748000
7   simply amazing one senator given much power . ...        -0.025800
8              toddler came two spirit gender queer .         0.177900
9   Tommy Domestic Republican Terrorist disguised ...        -0.795900
10  really want know Senate Majority Leader Schume...        -0.710200
11  2023 LEADERS STUCK 1950 TIESE FIGHTING COMMUNI...         0.704600
12  fair . republican never allow anyone fill spot...         0.368200
13    



In [None]:
video4comments_df = pd.read_csv('/content/drive/MyDrive/Final Project - Intro To Python/cleanvideo4comments.csv', delimiter=',', encoding='utf-8')

In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video4comments_df is your DataFrame with a column named 'Comments'

# Function to calculate sentiment score using VADER
def calculate_sentiment(comment):
    analyzer = SentimentIntensityAnalyzer()
    sentiment_score = analyzer.polarity_scores(str(comment))['compound']
    return sentiment_score

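# Replace NaN values with an empty string in the 'Comments' column
# (assign the result back instead of calling fillna with inplace=True on a column)
video4comments_df['Comments'] = video4comments_df['Comments'].fillna('')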

# Apply the sentiment analysis function to the 'Comments' column
video4comments_df['sentiment_score'] = video4comments_df['Comments'].apply(calculate_sentiment)

# Calculate the average sentiment score
average_sentiment_score = video4comments_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

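# Append the average row to the DataFrame
# (DataFrame.append was removed in pandas 2.0; pd.concat is the replacement).
# Note: this summary row becomes part of video4comments_df, so later cells that
# count or tokenize the DataFrame include it.
video4comments_df = pd.concat([video4comments_df, pd.DataFrame([average_row])], ignore_index=True)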

# Sort the DataFrame by 'sentiment_score' in descending order
sorted_df = video4comments_df.sort_values(by='sentiment_score', ascending=False)

# Display the top 10 most positive sentiments
top_10_positive = sorted_df.head(10)
print("Top 10 Most Positive Sentiments:")
print(top_10_positive[['Comments', 'sentiment_score']])

# Sort the DataFrame by 'sentiment_score' in ascending order to get the most negative sentiments first
sorted_df_negative = video4comments_df.sort_values(by='sentiment_score', ascending=True)

# Display the top 10 most negative sentiments
top_10_negative = sorted_df_negative.head(10)
print("Top 10 Most Negative Sentiments:")
print(top_10_negative[['Comments', 'sentiment_score']])


[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


Top 10 Most Positive Sentiments:
                                                                                                                                                                                                                                                                                                                                                                                                            Comments  \
10   41 4-star general officer US military . withholding 12 . 's greater 1/4 . 4-stars ? Combatant Commanders Commanders highest leadership level service . Tuberville plant . 's pretty obvious hold position Trump Republican Senate put missing puzzle piece play win 2024 -- military leader loyal Trump Constitution . take oath fealty Trump . Trump need take military order succeed dictator . think 's p...   
301  Nobody surprised . Tuberville GOP wanted able wait election Trump got re-elected could install crony pick every position , , really , need top pos



In [None]:
video4comments_df

Unnamed: 0,Comments,sentiment_score
0,. Reason . . Holding . 4 . Star . Back . . Person . . Black,0.0000
1,Tubbervile get away holding military fine pay sue next person try pull thing face fine,0.2960
2,4 star Brain Thrust n't forget,0.1695
3,HOPE VOTERS DISTRICT REALIZE STUPID IGNORANT MAN REALLY !,-0.4389
4,. . Told . . . 're . Going . . Close . Ft. Rucker . . Transfer . Redstone . . Houston . . . . Change . . Diaper,0.0000
...,...,...
976,Ef idiot,-0.5106
977,’ mad man like MaGa,-0.1779
978,😮😮😮😮😮😮,0.0000
979,😮😊,0.0000


In [None]:
top_10_positive

Unnamed: 0,Comments,sentiment_score
10,41 4-star general officer US military . withholding 12 . 's greater 1/4 . 4-stars ? Combatant Commanders Commanders highest leadership level service . Tuberville plant . 's pretty obvious hold position Trump Republican Senate put missing puzzle piece play win 2024 -- military leader loyal Trump Constitution . take oath fealty Trump . Trump need take military order succeed dictator . think 's p...,0.9741
301,"Nobody surprised . Tuberville GOP wanted able wait election Trump got re-elected could install crony pick every position , , really , need top position . Thus , going allow lower rank filled still holding top rank . ? every successful coup every country needed military backing mean support TOP leadership position . Ca n't happen US say ? look Jan 6 . Look Trump 's acting SECDEF kept National G...",0.9636
402,"must amazing feeling security , career well people life hand . powerful man . hope constituent proud",0.9628
553,"Tuberville play ball tell young men play game football . greatest accomplishment life 's job experience position US Senate . Football great game game n't get played , n't put national security jeopardy . zero military service hold psychical education degree . know play football gold . else ... ? Well , wrap resume . anyone 's vetting mind would qualified position United States Senator ? got st...",0.9589
189,"non-sense occured 1800 , guy would cuff awaiting sentence would likely see demise . , era booklet like `` common sense '' printed people knew meant . Good luck 23 , gene pool run dry , give ... Tommy ... brilliant . Wait see , scientist get re-elected nation highly educated . 'm pretty sure could 200 hour telethon never even make dent .",0.9536
344,"China begging Tuberville continue withhold military promotion . China offer Tuberville better package Supreme Court justice , Swiss account million deposit property foreign country gold bar safe . 'll easier China invade Taiwan .",0.9393
60,"Somebody said best , ’ holding long election . ’ holding hoping Trump win appoint loyalist . Wake folk !",0.9273
918,Amazon 's AMA52K n't another project ; 's like watching symphony technology innovation coming together perfect harmony .,0.9042
254,"* Donald J. Trump POTUS , low inflation , gas price $ 2 gallon , secured southern border , U.S. Petroleum Reserves filled , U.S. # 1 oil producer world , U.S. n't war , ISIS reign terror came end , U.S. military strengthened , every working American received income tax cut , 100 job killing regulation ended , everyone able find job wanted one , crime except large city controlled Dems , moved U...",0.8962
642,"interesting US system allows small minority congress senate wield something akin `` veto '' -powers - 'd think POTUS kind power , , live learn guess . Hint : get rid FPTP voting system please - favor .",0.8779


In [None]:
top_10_negative

Unnamed: 0,Comments,sentiment_score
427,Tuberville wanted weaken military America get attacked blame Biden . Worst part idiot Alabama still elect POS southern wealthy racist white man still fighting civil war .,-0.9735
150,"Tub head say speaking general said `` problem whatsoever battlefield '' go holding promotion ? Name general . n't exist . another MAGA lie . Millions personnel bad morale , top general heart attack 2 job ? problem ? ! lying * s . lousy football coach interfering America 's national defense .",-0.964
414,"Yedioth Ahronoth , authority one released Israeli female prisoner : “ sat tunnel afraid , Hamas would kill u , Israel would kill u , would say u : Hamas killed . ”",-0.959
54,"navy veteran , along several serviceman got severely depressed way , Biden administration pulled Afghanistan service , way people held accountable military chain command , disgusting repulsive withdrawal Afghanistan . service member family died Afghanistan , blame Biden administration solely . repulsive disgusting . way Democrats defend Joe Biden .",-0.9584
432,Tommy scumbagVille Tommy really trying hurt men woman uniform ? actually helped enemy like CCP Vladimir Putin whole lot Tommy ashamed . Republican Senators assuming immediately stopped men woman uniform month ago . one reason ca n't bring vote republican anymore . guy keep fighting wrong team . keep fighting people like Putin ! men woman uniform keep slapping face Time Time really stink .,-0.9556
667,one racist/fascist/mysogynistic idiot able block military promotion number 1 military/country world . Shame senate rule Republican scum enabling behind scene . guy alone many fascist/Christian nationalist billionaire funded organization GOPers behind . reason giving Dems/Schumer/WH n't back 's making look bad term Ukraine Israel war .,-0.9432
888,person never served military uniform ’ never able call shot military ’ plain stupid pathetic disgusting unhealthy unheard period 😮,-0.9313
25,tuberville planning torelease 3 star general promotion ? IMPEACHED JAILED yoir horrific betrayal country . !,-0.925
250,"selfish idiot , country brink war one man stupid thinking authority Military .",-0.9246
21,"Mr Esper put IDF soldier risk go house house . known HAMAS booby trapped home along tunnel . take much much longer . willing put risk ? IDF taking risk , telling fight . think know fight .",-0.9169


In [None]:
positive_count = len(video4comments_df[video4comments_df['sentiment_score'] > 0])
neutral_count = len(video4comments_df[video4comments_df['sentiment_score'] == 0])
negative_count = len(video4comments_df[video4comments_df['sentiment_score'] < 0])

# Display the counts
print("Number of Positive Sentiments:", positive_count)
print("Number of Neutral Sentiments:", neutral_count)
print("Number of Negative Sentiments:", negative_count)

Number of Positive Sentiments: 313
Number of Neutral Sentiments: 253
Number of Negative Sentiments: 415


In [None]:
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Download the VADER lexicon
nltk.download('vader_lexicon')

# Assuming video4comments_df is your DataFrame with a column named 'Comments'

# Drop rows with missing or NaN values in the 'Comments' column
video4comments_df = video4comments_df.dropna(subset=['Comments'])

# Function to calculate sentiment score using VADER
def calculate_sentiment(comment):
    analyzer = SentimentIntensityAnalyzer()
    sentiment_score = analyzer.polarity_scores(str(comment))['compound']
    return sentiment_score

# Apply the sentiment analysis function to the 'Comments' column
video4comments_df['sentiment_score'] = video4comments_df['Comments'].apply(calculate_sentiment)

# Create a new DataFrame with a sample of 20 comments
video4sample_df = video4comments_df.sample(n=20, random_state=42).reset_index(drop=True)

# Calculate the average sentiment score for the sample
average_sentiment_score = video4sample_df['sentiment_score'].mean()

# Create a final row with the average sentiment score
average_row = {'Comments': 'Average Sentiment Score', 'sentiment_score': average_sentiment_score}

# Append the average row to the sample DataFrame
# (DataFrame.append was removed in pandas 2.0; pd.concat is the replacement)
video4sample_df = pd.concat([video4sample_df, pd.DataFrame([average_row])], ignore_index=True)

# Display the updated sample DataFrame
print(video4sample_df)


[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


                                             Comments  sentiment_score
0   kind nightmare happen ? ? Holding MILITARY PRO...        -0.690100
1   Tuberville holding back senior appointment mil...         0.718400
2   US senator censured ? : http : //www.senate.go...         0.000000
3   sure missed memo , Amazon 's AMA52K become hot...         0.271400
4   nothing abortion right-wing power grab militar...         0.855500
5   Coach Taterhead leaving spot open Trumps ' Rev...        -0.570700
6   mad Donald Trump tried overthrow government , ...        -0.790600
7                     Biden criminal human trafficker        -0.526700
8                  guess prune juice finally kicked 😮         0.000000
9                                 VOTE clown trader !         0.000000
10  real reason GOPs took white house Senate could...         0.361200
11  Rank matter principle . Tuberville n't want Tr...         0.596400
12                     've 15 minute fame . go home .         0.440400
13  se



###Sentiment Analysis Results
Video 1: The average compound sentiment score from all comments was 0.029578. Of the 27 comments, 11 received a positive sentiment score, 6 were neutral, and 12 received a negative sentiment. From the random sample of 20 comments, the average compound sentiment score was 0.030235.

Video 2: The average compound sentiment score from all comments was -0.06903. Of the 457 comments, 150 received a positive sentiment score, 110 were neutral, and 197 received a negative sentiment. From the random sample of 20 comments, the average compound sentiment score was -0.26991.

Video 3: The average compound sentiment score from all comments was -0.091031. Of the 754 comments, 233 received a positive sentiment score, 170 were neutral, and 352 received a negative sentiment. From the random sample of 20 comments, the average compound sentiment score was -0.118925.

Video 4: The average compound sentiment score from all comments was -0.0718. Of the 981 comments, 313 received a positive sentiment score, 253 were neutral, and 415 received a negative sentiment. From the random sample of 20 comments, the average compound sentiment score was -0.015905.
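The per-video figures above can be computed without adding a summary row to the comment table, which keeps that row out of later counts and word-frequency steps. The following is a minimal sketch, assuming a DataFrame with a `sentiment_score` column as in the cells above. The helper name `summarize_sentiment` and the toy scores are hypothetical and are not taken from the project's data.

```python
import pandas as pd

def summarize_sentiment(df: pd.DataFrame, column: str = "sentiment_score") -> dict:
    """Return the mean compound score and positive/neutral/negative counts."""
    scores = df[column].dropna()
    return {
        "n_comments": int(len(scores)),
        "mean_score": float(scores.mean()),
        "positive": int((scores > 0).sum()),
        "neutral": int((scores == 0).sum()),
        "negative": int((scores < 0).sum()),
    }

# Toy data for illustration only; not taken from the project's CSV files.
toy_df = pd.DataFrame({
    "Comments": ["a", "b", "c", "d"],
    "sentiment_score": [0.5, 0.0, -0.25, -0.75],
})
summary = summarize_sentiment(toy_df)
print(summary)
```

Because the summary lives in a separate dictionary, the three counts always add up to the number of comments.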


###Evaluation of Top Scores

Across three of the four videos, commenters shared a negative sentiment toward Senator Tuberville's blockade of military promotions. Video 1 was the only video with a positive average sentiment score, both for all comments and for the random sample. What sets this video apart from its counterparts is the small number of comments available (27), which may have biased the results.
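To give a rough sense of how much a small comment count can matter, the sketch below compares an approximate 95% interval around the mean for 27 scores and for 981 scores. It is an illustration under stated assumptions: the scores are synthetic (drawn uniformly from -1 to 1, not the project's data), the helper `mean_with_interval` is hypothetical, and the normal approximation is only rough for bounded scores and small samples.

```python
import math
import random
import statistics

def mean_with_interval(scores, z=1.96):
    """Return (mean, low, high) using a normal-approximation interval."""
    n = len(scores)
    mean = statistics.fmean(scores)
    half_width = z * statistics.stdev(scores) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width

# Synthetic compound scores for illustration only.
rng = random.Random(0)
synthetic = [rng.uniform(-1.0, 1.0) for _ in range(981)]
small = synthetic[:27]

for label, scores in [("n=27", small), ("n=981", synthetic)]:
    mean, low, high = mean_with_interval(scores)
    print(f"{label}: mean={mean:.3f}, interval=({low:.3f}, {high:.3f})")
```

With so few comments the interval is wide enough to include both positive and negative means, so the sign of a small average such as Video 1's should be read with caution.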

In reviewing the comments that received the highest positive compound sentiment scores, I found scores that did not match the actual text. The following are three of the top positive scores in Video 1, shown with the original (unedited, uncleaned) comments:
>1. "What's going on here? I thought were intelligent human beings. Don't they teach ancient history in the schools anymore what about the Greek army and the Spartans? I think a better point to make would be why can't we attract better people to fill up important positions in DC? The reason we can't attract better people to Washington DC is because we have people working there now like Tuckerville! If you really love a country, like you say, you do pay attention to who you're sending to DC to fill important positions in a federal government." Score: .9803
>2. "I think an awesome question to ask is why does Fox News constantly trying to divide us could it be that they don't want us to vote together for change and maybe the 99% of us have a chance to get a wage increase rather than the big corporations increasing their prices by 200% and forcing our wages down! Big money, mega wealthy, 1% what's motivating Fox News? It's a big smokescreen to keep us from focusing on the real issues! Corporate greed!" Score: .9114
>3. "Republicans are nothing more than Fascists, Racists who hate women, blacks, mexicans etc..  Republicans are blocking hundreds of Military promotions all because they want to control a woman's right to choose if they want to get pregnant. GOP is a joke when it comes to military readiness." Score: .8243

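Based on my reading, all three comments are negative in sentiment and were scored incorrectly. The likely cause is that VADER is a lexicon- and rule-based tool: positive-sounding words such as "intelligent," "better," "love," and "awesome" can raise the compound score even when the comment as a whole is critical, because the tool does not account for sarcasm or overall context.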
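The effect can be illustrated with a deliberately simplified word-level scorer. This is a toy sketch, not VADER: the lexicon entries and weights are made up, and VADER applies additional rules (for negation, punctuation, and capitalization, among others) that are omitted here. The function name `toy_lexicon_score` is hypothetical.

```python
# Toy valence lexicon; the words and weights are made up for illustration
# and are NOT the VADER lexicon.
TOY_LEXICON = {
    "intelligent": 2.0, "better": 1.5, "love": 3.0, "important": 1.0,
    "awesome": 3.0, "jerk": -2.5, "traitor": -3.0, "criminal": -2.5,
}

def toy_lexicon_score(text: str) -> float:
    """Sum the valence of known words, ignoring who or what they describe."""
    tokens = [t.strip(".,!?\"'").lower() for t in text.split()]
    return sum(TOY_LEXICON.get(t, 0.0) for t in tokens)

# A critical question that happens to use positive-sounding words.
critical = "Why can't we attract better people to fill important positions?"
# A plainly hostile statement.
hostile = "He is a traitor and a criminal."

print(toy_lexicon_score(critical))
print(toy_lexicon_score(hostile))
```

The critical question receives a positive total because only individual words are scored, while the hostile statement is scored negative. This mirrors the pattern seen in the reviewed comments.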

The following are three of the top negative scores in Video 1, shown with the original (unedited, uncleaned) comments:
>1. "Tuberville never served a day in his life and his actions are designed to hurt America, American military families and help China and Russia. Sad that people vote for imbeciles like Tuberville." Score: -.6705
>2. "Tuberville is being a real jerk and causing unnecessary problems for our military." Score: -.6249
>3. "Tuberville isa traitor and a criminal." Score: -.5267

For the top negative compound scores, VADER correctly classified each comment as negative. In this sample, VADER was more reliable at detecting negative sentiment than at confirming positive sentiment.
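One way to make this comparison explicit is to record a human label next to each compound score and compute how often the sign of the score agrees. The sketch below uses the six Video 1 comments quoted above, all of which were judged negative on manual review. The helper names `sign_label` and `agreement_by_human_label` are hypothetical, and six hand-picked extremes are not a representative sample.

```python
from collections import defaultdict

def sign_label(score: float) -> str:
    """Map a compound score to a label using the same cutoffs as the count cells."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def agreement_by_human_label(pairs):
    """pairs: iterable of (human_label, compound_score).
    Returns {human_label: fraction of scores whose sign matches the label}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for human_label, score in pairs:
        totals[human_label] += 1
        if sign_label(score) == human_label:
            hits[human_label] += 1
    return {label: hits[label] / totals[label] for label in totals}

# Human label and compound score for the six comments quoted in this section.
reviewed = [
    ("negative", 0.9803), ("negative", 0.9114), ("negative", 0.8243),
    ("negative", -0.6705), ("negative", -0.6249), ("negative", -0.5267),
]
agreement = agreement_by_human_label(reviewed)
print(agreement)
```

Extending the `reviewed` list with a larger, randomly drawn set of hand-labeled comments would give a fairer estimate of how well the scores track human judgment.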




###Top 15 frequently mentioned words
In this analysis, I identify the most frequently used words in each video, after stop-word removal, and display the frequency of each word. The cells below print the top 15 words per video. This approach allows me to see whether recurring words across the videos convey a positive or negative perspective. By examining the frequency of specific words, I aim to uncover linguistic patterns that may indicate sentiments associated with the videos.
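The four cells that follow repeat the same steps for each video, so the logic can be sketched as a single reusable function. This is a simplified illustration, not the notebook's method: it swaps NLTK's tokenizer and English stop-word list for a regular expression and a small hand-written stop-word set so that it runs without downloads. The names `top_words` and `STOP_WORDS` are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word set; the notebook's cells use NLTK's English list.
STOP_WORDS = {"the", "a", "an", "is", "and", "to", "of", "for", "in", "his", "our"}

def top_words(comments, n=10, stop_words=STOP_WORDS):
    """Return the n most common lowercase alphabetic tokens across comments."""
    counts = Counter()
    for comment in comments:
        for token in re.findall(r"[a-z]+", str(comment).lower()):
            if token not in stop_words:
                counts[token] += 1
    return counts.most_common(n)

# Two comments quoted earlier in this report, used here as sample input.
sample_comments = [
    "Tuberville is being a real jerk and causing unnecessary problems for our military.",
    "Tuberville isa traitor and a criminal.",
]
print(top_words(sample_comments, n=3))
```

Passing only genuine comments to such a function, rather than a DataFrame that also holds an appended summary row, keeps words like "average" and "score" out of the frequency table.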

Video 1 Top 15 Frequently Mentioned Words:

In [None]:
import pandas as pd
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from collections import Counter

# Download NLTK resources
nltk.download('punkt')
nltk.download('stopwords')

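# Assuming 'Comments' is the column in video1comments_df.
# Note: if an 'Average Sentiment Score' summary row was appended to this
# DataFrame earlier, its words ("average", "sentiment", "score") are counted too.
comments_column = video1comments_df['Comments']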

# Combine all comments into a single string
all_comments_text = ' '.join(comments_column.astype(str))

# Tokenize the text into words
words = word_tokenize(all_comments_text)

# Remove stop words
stop_words = set(stopwords.words('english'))
filtered_words = [word.lower() for word in words if word.isalpha() and word.lower() not in stop_words]

# Calculate word frequencies
word_frequencies = Counter(filtered_words)

# Display the top 15 most frequently used words
top_15_words = word_frequencies.most_common(15)
print("Top 15 Most Frequently Used Words:")
for word, frequency in top_15_words:
    print(f"{word}: {frequency}")

Top 15 Most Frequently Used Words:
tuberville: 8
military: 8
people: 6
like: 6
america: 5
want: 5
american: 4
woman: 4
tommy: 4
donkey: 4
u: 4
average: 4
sentiment: 4
score: 4
vote: 3


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Video 2 Top 15 Frequently Mentioned Words:

In [None]:
import pandas as pd
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from collections import Counter

# Download NLTK resources
nltk.download('punkt')
nltk.download('stopwords')

# Assuming 'Comments' is the column in video2comments_df
comments_column = video2comments_df['Comments']

# Combine all comments into a single string
all_comments_text = ' '.join(comments_column.astype(str))

# Tokenize the text into words
words = word_tokenize(all_comments_text)

# Remove stop words
stop_words = set(stopwords.words('english'))
filtered_words = [word.lower() for word in words if word.isalpha() and word.lower() not in stop_words]

# Calculate word frequencies
word_frequencies = Counter(filtered_words)

# Display the top 15 most frequently used words
top_15_words = word_frequencies.most_common(15)
print("Top 15 Most Frequently Used Words:")
for word, frequency in top_15_words:
    print(f"{word}: {frequency}")

Top 15 Most Frequently Used Words:
military: 127
people: 51
need: 48
get: 47
promotion: 42
one: 38
abortion: 38
tuberville: 38
country: 37
care: 29
like: 28
security: 28
republican: 26
life: 26
good: 25


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Video 3 Top 15 Frequently Mentioned Words:

In [None]:
import pandas as pd
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from collections import Counter

# Download NLTK resources
nltk.download('punkt')
nltk.download('stopwords')

# Assuming 'Comments' is the column in video3comments_df
comments_column = video3comments_df['Comments']

# Combine all comments into a single string
all_comments_text = ' '.join(comments_column.astype(str))

# Tokenize the text into words
words = word_tokenize(all_comments_text)

# Remove stop words
stop_words = set(stopwords.words('english'))
filtered_words = [word.lower() for word in words if word.isalpha() and word.lower() not in stop_words]

# Calculate word frequencies
word_frequencies = Counter(filtered_words)

# Display the top 15 most frequently used words
top_15_words = word_frequencies.most_common(15)
print("Top 15 Most Frequently Used Words:")
for word, frequency in top_15_words:
    print(f"{word}: {frequency}")

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Top 15 Most Frequently Used Words:
military: 253
tuberville: 198
one: 101
people: 92
senator: 91
get: 75
like: 72
need: 72
country: 64
vote: 64
republican: 63
alabama: 62
gop: 50
would: 46
party: 46


Video 4 Top 15 Frequently Mentioned Words:

In [None]:
import pandas as pd
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from collections import Counter

# Download NLTK resources
nltk.download('punkt')
nltk.download('stopwords')

# Assuming 'Comments' is the column in your DataFrame (e.g., video4comments_df)
comments_column = video4comments_df['Comments']

# Combine all comments into a single string
all_comments_text = ' '.join(comments_column.astype(str))

# Tokenize the text into words
words = word_tokenize(all_comments_text)

# Remove stop words
stop_words = set(stopwords.words('english'))
filtered_words = [word.lower() for word in words if word.isalpha() and word.lower() not in stop_words]

# Calculate word frequencies
word_frequencies = Counter(filtered_words)

# Display the top 15 most frequently used words
top_15_words = word_frequencies.most_common(15)
print("Top 15 Most Frequently Used Words:")
for word, frequency in top_15_words:
    print(f"{word}: {frequency}")

Top 15 Most Frequently Used Words:
military: 221
tuberville: 201
one: 105
trump: 94
need: 86
people: 84
tommy: 81
get: 79
vote: 74
like: 72
promotion: 64
hold: 59
power: 59
country: 56
time: 56


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


##Conclusion

The analysis of public sentiment towards Senator Tuberville's blockade of military promotions provides valuable insights into the complex interplay between politics, national security, and public opinion. The research, conducted through sentiment analysis of YouTube comments on politically polarized media sources, aimed to understand the evolving perspectives over a 10-month period.

The sentiment analysis results revealed a predominantly negative sentiment across three of the four selected videos. However, Video 1 exhibited a positive average sentiment score, challenging the overall trend. This deviation was attributed to the limited number of comments in Video 1, raising concerns about the impact of sample size on sentiment analysis outcomes.

Furthermore, an in-depth examination of comments with the highest positive sentiment scores uncovered inaccuracies in the sentiment analysis tool, particularly in instances of sarcasm and nuanced language. This highlights a limitation in relying solely on automated sentiment analysis tools, emphasizing the importance of human review to ensure the accuracy of results.

Despite the lifting of Senator Tuberville's blockade, concerns persist about the delayed promotions and the subsequent challenges in addressing the backlog. The research underscores the broader implications of using military promotions as a political tool, emphasizing the need for bipartisan support in matters of national security.

In conclusion, the study provides a nuanced understanding of public sentiment, revealing both shared concerns and variations in perspectives. The findings contribute to the ongoing discourse on the intersection of politics and national security policy, urging policymakers to consider the broader implications of their actions on military readiness and security. Additionally, the research highlights the importance of refining sentiment analysis methodologies to accurately capture the complexities of public opinion in the digital age.