<a href="https://colab.research.google.com/github/Namvi3t/DataProjects/blob/main/Sentiment_Analysis_on_Elon_Musk.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sentiment Analysis on Elon Musk

### Project Description and Background:

A sentiment analysis is the act of computationally recognizing and classifying opinions stated in a text, particularly to ascertain if the writer has a favorable, negative, or neutral viewpoint on a certain subject, item, etc. This will detect any positive, negative, or neutral sentiment in text. For this example, I will be doing Elon Musk. Elon Musk has a been a topic of controversy when he decided to buy twitter and that caused an uprise on Twitter. It got me thinking about doing a sentiment analysis on Elon's Twitter Page to see what his page is like now since he bought Twitter.

# Important Note:
 - You need to have **ELEVATED** Access Twitter Developers account or else you will not be able to gather tweets.
 - Have a .txt file that will include api key, api secret key, access token, and secret access token. This is needed to pass the verification or else the code will not work.
 - Each Code block needs to be run in order of the code block you see.

###### By Samuel Do

In [1]:
#Import Files for sentiment analysis
import tweepy #Needed tor Tweepy
import re #Needed for Tweepy
from textblob import TextBlob #Process Textual data
from wordcloud import WordCloud #Help create the world cloud
import pandas as pd #Help use data structures and data analysis tools
import numpy as np #Help perform mathimatical arrays
import matplotlib.pyplot as plt #Help plot graphs
plt.style.use('dark_background') #What type of style/looks of the graph. Dark mode helps my eyes
from google.colab import files

# save your twitter developer account API key and secret and access key and secret in a txt file
# make sure there are no extra white space in your txt file
# upload this text file to Google Colab:

txtfile = files.upload() 
keys =  txtfile.get('twkeys.txt').splitlines()

# test your keys and secrets are correct or not:
apiKey = keys[0]
apiKeySecret = keys[1]
accessToken = keys[2]
accessTokenSecret = keys[3]
auth = tweepy.OAuthHandler(apiKey, apiKeySecret)
auth.set_access_token(accessToken, accessTokenSecret)
api = tweepy.API(auth)
try:
  api.verify_credentials()
  print("verification successful!")  
except: 
  print("authentication error")  # if keys are NOT correct, you should see error

Saving twkeys.txt to twkeys.txt
verification successful!


# Gather the data

I have gathered a sample data of 200 tweets to help identify a pattern and trend of a larger dataset by using a subset of a population. Once I got the data, I put it into a Dataframe using Pandas. I showed the first 100 rows of the tweets and then I also showed the last 100 rows of the tweets. 

In [11]:
# Get 200 tweets from Elon Musk Twitter Page
posts = api.user_timeline(screen_name ="elonmusk", count=200, lang = "en", tweet_mode="extended")

# Print last 200 tweets from Elon Musk
print("Here are 200 recent tweets: \n")
i = 1
for tweet in posts[0:200]: #Print the 200 tweets 
  print(str(i) +')'+ tweet.full_text + "\n")
  i+=1

Here are 200 recent tweets: 

1)@ACLU Kudos to ACLU for this non-partisan support of free speech!

2)@TheBabylonBee Only a matter of time!

3)@BillyM2k Kangaroos love kickboxing!

4)@Lukewearechange !!

5)@mtaibbi De facto attack on First Amendment

6)@MostlyPeacefull Strange daze

7)@ScottAdamsSays Please run. That would be awesome.

8)@RubinReport Accurate thread

9)@SenSchumer May I suggest a DM chat 🙏 https://t.co/wmwBxEFit1

10)@cb_doge @Tesla @mayemusk And then was at Twitter HQ past midnight. Very long day.

11)RT @SpaceX: Launch and catch tower destacked Ship 24 from Booster 7 on the orbital pad today ahead of the Booster’s static fire test https:…

12)@micsolana lol

13)@dogeofficialceo Will dig in

14)Congrats Tesla California factory team on all-time record production! https://t.co/1aF53hgWgM

15)@EvaFoxU Tweetception

16)@BillyM2k 🎯

17)RT @Tesla: Q4 2022 Earnings Call https://t.co/JNL5ovciRJ

18)@PeterDiamandis 🙏

19)@TimRunsHisMouth 🤣

20)@RNCResearch 🤨

21)@SenSchumer @R

In [9]:
#Create a dataframe 
df = pd.DataFrame( [tweet.full_text for tweet in posts], columns =['Tweets'])

#Show the first 100 rows of data
df.head(100)


Unnamed: 0,Tweets
0,@Lukewearechange !!
1,@mtaibbi De facto attack on First Amendment
2,@MostlyPeacefull Strange daze
3,@ScottAdamsSays Please run. That would be awes...
4,@RubinReport Accurate thread
...,...
95,"@ScottAdamsSays And my cousin, who is young &a..."
96,@unusual_whales @BillyM2k Yes
97,@ScottAdamsSays I had major side effects from ...
98,@CommunityNotes Extremely important


In [10]:
#Create a dataframe 
df = pd.DataFrame( [tweet.full_text for tweet in posts], columns =['Tweets'])

#Show the last 100 rows of data
df.tail(100)

Unnamed: 0,Tweets
100,@BillyM2k @joshzepps Sigh … I hope Sam becomes...
101,@slashdot Cool
102,@LayahHeilpern Interesting &amp; entertaining ...
103,@WorldAndScience 🤣 yeah
104,@stats_feed Wow
...,...
195,@TaraBull808 Brief synchronization lag between...
196,@physorg_com @USC @PNASNews Maybe because it’s...
197,@EWoodhouse7 Interesting
198,@glennbeck @ShellenbergerMD Citizen journalism...


# Text Cleaning

In [5]:
#Clean the text here
def cleanText(text):
  text = re.sub(r'@[A-Za-z0-9]+', '', text) #Remove the '@' symbol or the @mentions
  text = re.sub(r'#', '', text) #Remove '#' symbol which is the hash tag
  text = re.sub(r'RT[\s]+', '', text) #Remove Retweets
  text = re.sub(r'https?:\/\/S+', '', text) #Remove hyperlink from the tweets

  return text
  
#Call in the cleanText method
df['Tweets']= df['Tweets'].apply(cleanText)

#Show the cleaned text
df

Unnamed: 0,Tweets
0,!!
1,De facto attack on First Amendment
2,Strange daze
3,Please run. That would be awesome.
4,Accurate thread
...,...
195,Brief synchronization lag between our Atlanta...
196,_com Maybe because it’s online &amp; social ...
197,Interesting
198,Citizen journalism is vital to the future of...


# Determine Subjectivity

In [None]:
# Create a Function to get the subjectivity
def getSubjectivity(text):
  return TextBlob(text).sentiment.subjectivity

# Function to determine if the positive or negative by using polarity
def getPolarity(text):
  return TextBlob(text).sentiment.polarity

# Create 2 columns
df['Subjectivity'] = df['Tweets'].apply(getSubjectivity)
df['Polarity'] = df['Tweets'].apply(getPolarity)

#Show dataframe with the new columns
df

In [None]:
#The World Cloud
allWords = ' '.join( [twts for twts in df['Tweets']])
wordCloud = WordCloud(width = 600, height = 300, random_state = 22, max_font_size = 120).generate(allWords)

#Customize the wordCloud and displaying it
plt.imshow(wordCloud, interpolation = "bilinear")
# Do not show the axies since it makes the image look bad
plt.axis('off')
plt.show()

In [None]:
#Create a funciton for negative, positive, and neutral analysis
def getAnalysis(score):
  #Determine the polarity  
  if score < 0:
    return 'Negative'
  elif score == 0:
    return 'Neutral'
  else:
    return 'Positive'

#Add another column displaying Analysis
df['Analysis'] = df['Polarity'].apply(getAnalysis)

# Display the dataframe
df

In [None]:
# Print all positive tweets
k=1 #Iterate the list of positive tweets
sortedDF = df.sort_values(by=['Polarity']) #Sort the dataframe into polarity
#Display all positive tweets based on the count from previous code
for i in range(0, sortedDF.shape[0]):
  if (sortedDF['Analysis'][i]== 'Positive'):
    print(str(k) + ')' +sortedDF['Tweets'][i])
    print()
    k+=1

In [None]:
# Print all negative tweets
k=1 #Iterate the list of negative tweets
sortedDF = df.sort_values(by=['Polarity'], ascending = 'False') 
#Display all negative tweets based on the count from previous code
for i in range(0, sortedDF.shape[0]):
  if (sortedDF['Analysis'][i]== 'Negative'):
    print(str(k) + ')' +sortedDF['Tweets'][i])
    print()
    k+=1

In [None]:
# Print all neutral tweets
k=1 #Iterate the list of negative tweets
sortedDF = df.sort_values(by=['Polarity']) 
#Display all neutral tweets based on the count from previous code
for i in range(0, sortedDF.shape[0]):
  if (sortedDF['Analysis'][i]== 'Neutral'):
    print(str(k) + ')' +sortedDF['Tweets'][i])
    print()
    k+=1

In [None]:
# Plot polarity and subjectivity using Scatter Plot
#create the Scatter Plot size
plt.figure(figsize=(8,6))
#configure the scatter plot and display the scatter plot
for i in range(0, df.shape[0]):
  plt.scatter(df['Polarity'][i], df['Subjectivity'][i], color = 'Red')
plt.title('Sentiment Analysis')
plt.xlabel('Polarity')
plt.ylabel('Subjectivity')
plt.show()

# Calculations and Pie Chart

In [None]:
# Calcualte the percentages of positive tweets
ptweets = df[df.Analysis == 'Positive']
ptweets = ptweets['Tweets']

round((ptweets.shape[0] / df.shape[0]) * 100, 1) #Calculate the percentage

In [None]:
# Calcualte  the percentages of negative tweets
ntweets = df[df.Analysis == 'Negative']
ntweets = ntweets['Tweets']

round((ntweets.shape[0] / df.shape[0]) * 100, 1) #Calculate the percentage

In [None]:
# Calcualte  the percentages of neutral tweets
neutweets = df[df.Analysis == 'Neutral']
neutweets = neutweets['Tweets']

round((neutweets.shape[0] / df.shape[0]) * 100, 1) #Calculate the percentage

In [None]:
#Show value counts
df['Analysis'].value_counts()

#Plot and visualzie the counts using a pie chart
plt.title('Sentiment Analysis')
plt.xlabel('Sentiment')
plt.ylabel('Counts')
df['Analysis'].value_counts().plot(kind='pie', autopct='%1.1f%%', textprops={'color':"r"})
plt.show()

# Resources

In order to complete the Sentiment Analysis on Elon Musk's page, it required libraries in order to do my analysis.

#### Documentation:
*   https://docs.tweepy.org/en/stable/ (Tweepy)
*   https://pandas.pydata.org/docs/ (Pandas)
*   https://numpy.org/doc/ (Numpy)
*   https://matplotlib.org/stable/index.html (Matplotlib)
*   https://textblob.readthedocs.io/en/dev/ (Text Blob)
*   https://python-course.eu/applications-python/python-wordcloud-tutorial.php (Word Cloud)

Help guided this Sentiment Analysis by understanding how to display and the syntax.

### Code Skeleton:
*   https://www.geeksforgeeks.org/twitter-sentiment-analysis-using-python/ 

Help set up Sentiment Analysis. I used this as a guide in helping me make this sentiment analysis. I further implemented this by doing more analysis to it as you can see above.











