*Social Media Analytics Workshop - Telkom University*


---




# Hands-On Practice: Collecting and Analyzing Twitter Data

To collect Twitter data in Python, we can use Tweepy, one of the most popular Python packages for accessing the Twitter API. You can read the full documentation [here](https://tweepy.readthedocs.io/en/latest/). In this practice, we will collect tweets that match a specific keyword and run sentiment analysis on the collected tweets.

*In this practice we will use a prebuilt sentiment analysis model, namely the VADER sentiment analyzer from NLTK. Unfortunately, this model only supports English.*

In [0]:
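# Install Library
# Tweepy is pinned below 4.0 because this notebook calls api.search, which Tweepy 4 renamed to search_tweets
!pip install "tweepy<4"
# VADER is imported from NLTK in this notebook, so install nltk
!pip install nltk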

In [0]:
# Import Library
import tweepy
import pandas as pd
import matplotlib.pyplot as plt
from nltk.sentiment.vader import SentimentIntensityAnalyzer 

In [0]:
# Fill in the API keys
consumer_key = 'Enter Consumer Key'
consumer_secret = 'Enter Consumer Secret Key'
access_token = 'Enter Access Token'
access_token_secret = 'Enter Access Token Secret'
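Hard-coding keys in a notebook makes it easy to share them by accident. As an alternative, here is a minimal sketch that reads the four credentials from environment variables. The variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_credentials` are assumptions made for this illustration; they are not defined by Tweepy or Twitter.

```python
import os

# Hypothetical environment variable names chosen for this sketch
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

With the variables set, `creds = load_credentials()` returns a dict whose values can be assigned to `consumer_key`, `consumer_secret`, `access_token`, and `access_token_secret`.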

In [0]:
# Authenticate with the Twitter API
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

In [0]:
# Find tweets by keyword
# The standard search endpoint returns at most 100 tweets per request, so count is set to 100
tweets = api.search(q='startup', count=100, lang='en')

In [0]:
tweets

In [0]:
# Put the collected tweet texts into a DataFrame and show the first 10
data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])
display(data.head(10))
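Keyword searches often return many retweets and links, so the same text can appear several times. The following is a minimal sketch of an optional cleaning step, assuming we want to drop the `RT @user:` prefix, URLs, and mentions, and keep each cleaned text only once. The helpers `clean_tweet` and `unique_cleaned` are hypothetical names for this illustration and are not part of Tweepy or NLTK.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")
MENTION_PATTERN = re.compile(r"@\w+")
RETWEET_PREFIX = re.compile(r"^RT\s+@\w+:\s*")


def clean_tweet(text):
    """Remove the retweet prefix, URLs, and mentions, then collapse whitespace."""
    text = RETWEET_PREFIX.sub("", text)
    text = URL_PATTERN.sub("", text)
    text = MENTION_PATTERN.sub("", text)
    return " ".join(text.split())


def unique_cleaned(texts):
    """Clean every text and keep the first occurrence of each non-empty result."""
    seen = set()
    result = []
    for text in texts:
        cleaned = clean_tweet(text)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result
```

For example, `unique_cleaned(data['Tweets'])` would give a list of cleaned, de-duplicated texts that could be analyzed instead of the raw column.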

In [0]:
# Download the VADER lexicon used by the sentiment analyzer
import nltk
nltk.download('vader_lexicon')

In [0]:
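# Sentiment Analysis: score every tweet with VADER
sid = SentimentIntensityAnalyzer()
listy = []  # one score dict per tweet
for index, row in data.iterrows():
  # polarity_scores returns a dict with 'neg', 'neu', 'pos', and 'compound' scores
  ss = sid.polarity_scores(row["Tweets"])
  listy.append(ss)

se = pd.Series(listy)
data['polarity'] = se.values
display(data.head(10))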
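Each polarity dict contains a `compound` score between -1 and 1. A common convention, described in the VADER documentation, treats a compound score of 0.05 or higher as positive, -0.05 or lower as negative, and anything in between as neutral. The sketch below applies that convention; the helper names `label_from_compound` and `label_counts` are assumptions made for this example.

```python
from collections import Counter


def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to 'positive', 'negative', or 'neutral'."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def label_counts(score_dicts):
    """Count how many score dicts fall under each sentiment label."""
    return Counter(label_from_compound(s["compound"]) for s in score_dicts)
```

In this notebook, `label_counts(data['polarity'])` would return the number of tweets per label, which is another possible input for a pie chart.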

In [0]:
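# Pie Chart Visualization: average sentiment proportions across all tweets
labels = ['negative', 'neutral', 'positive']
# Expand the score dicts into columns and average each score over all collected tweets
scores = pd.DataFrame(data['polarity'].tolist())
sizes  = [scores['neg'].mean(), scores['neu'].mean(), scores['pos'].mean()]
plt.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.axis('equal')  # keep the pie circular
plt.show()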