###### *15CSE350 INFORMATION RETRIEVAL*

# Case Study: Analysing the Feelings of Twitterati 

### Part 1: Setting Up Twitter Developer Credentials
To obtain the required credentials, you need to sign up for a Twitter developer account. You can sign up for one [here](https://developer.twitter.com/).

We shall store our Twitter credentials in a JSON file.

In [3]:
import json

# create a dictionary to store your twitter credentials
twitter_cred = dict()

# Entering the consumer_key, consumer_secret, access_key and access_secret
twitter_cred['CONSUMER_KEY'] = 'YOUR_TWITTER_CONSUMER_KEY'
twitter_cred['CONSUMER_SECRET'] = 'YOUR_CONSUMER_SECRET_KEY'
twitter_cred['ACCESS_KEY'] = 'YOUR_ACCESS_KEY'
twitter_cred['ACCESS_SECRET'] = 'YOUR_ACCESS_SECRET_KEY'

# Save the credentials to a JSON file so that later cells can load them
# without hard-coding the secrets in the notebook (keep this file out of version control)

with open('twitter_credentials.json', 'w') as secret_info:
    json.dump(twitter_cred, secret_info, indent=4, sort_keys=True)
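
Before the saved file is used, it can help to confirm that all four keys are present and non-empty. The following is a minimal sketch, not part of the original notebook: the helper name `load_twitter_credentials` and the `REQUIRED_KEYS` tuple are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical names for illustration; the key names match the file written above.
REQUIRED_KEYS = ('CONSUMER_KEY', 'CONSUMER_SECRET', 'ACCESS_KEY', 'ACCESS_SECRET')


def load_twitter_credentials(path='twitter_credentials.json'):
    """Load the credentials file and fail early if a key is missing or empty."""
    with open(path) as cred_file:
        creds = json.load(cred_file)

    # Collect every key that is absent or holds an empty value.
    missing = [key for key in REQUIRED_KEYS if not creds.get(key)]
    if missing:
        raise KeyError('Missing Twitter credentials: ' + ', '.join(missing))
    return creds
```

Calling `load_twitter_credentials()` in place of the bare `json.load` would surface an incomplete file before any request is sent to Twitter.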

### Part 2: Setting Up the Twitter API Endpoint

In [4]:
import tweepy

# Twitter API credentials

with open('twitter_credentials.json') as cred_data:
    info = json.load(cred_data)

consumer_key = info['CONSUMER_KEY']
consumer_secret = info['CONSUMER_SECRET']
access_key = info['ACCESS_KEY']
access_secret = info['ACCESS_SECRET']

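# Create the API endpoint
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
# Attach the access token pair; without it the loaded access_key/access_secret are never used
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)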

### Part 3: Fetching and Cleaning the Tweets

In [4]:
import re

# Mention the maximum number of tweets that you want to be extracted.
max_tweets = int(input('Enter the maximum number of tweets that you want to extract- '))

# Mention the hashtag that you want to look out for
hashtag = input('Enter the hashtag you want to scrape- ')

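# Fetch tweets using our API.
# Note: a single search request returns at most 100 tweets, and tweepy 4.x
# renamed API.search to API.search_tweets.
tweets = api.search(q="#" + hashtag, lang="en", count=max_tweets, tweet_mode="extended")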

clean_tweets = []
tweet_text = []

for tweet in tweets:    
    t = tweet.full_text
    tt = t.lower()
    #print("\n\n'tweet in lowercase text: ", tt)
    tweet_text.append(tt)
   
    #Removing urls(http:..)
    clean_twt = re.sub(r'(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:\'".,<>?«»“”‘’]))', '', tt)

    # Remove any other punctuation or unwanted characters
    clean_twt = re.sub(r"[.,;:$&*|?'-]", '', clean_twt)

    # Remove mentions (@username); a raw string avoids the invalid-escape warning for \S
    clean_twt = re.sub(r'@\S+', '', clean_twt)

    #print("\nCleaned tweet is ", clean_twt)

    # Add to clean_tweets
    clean_tweets.append(clean_twt)

print(f'Extracted {len(tweets)} tweets tagged with #{hashtag}')

Enter the maximum number of tweets that you want to extract- 10
Enter the hashtag you want to scrape- isro
Extracted 10 tweets tagged with #isro
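
The cleaning steps in the loop can also be wrapped in a single function, which makes them easy to try on a sample string without calling the API. This is a rough sketch under stated assumptions: `clean_tweet` and the three pattern names are hypothetical, the URL pattern is a simplified stand-in for the longer regex used above, and mentions are removed before punctuation here.

```python
import re

# Simplified URL pattern (an assumption; the notebook uses a longer regex).
URL_PATTERN = re.compile(r'(?:https?://|www\.)\S+')
MENTION_PATTERN = re.compile(r'@\S+')
# Same punctuation class as in the loop above.
PUNCTUATION_PATTERN = re.compile(r"[.,;:$&*|?'-]")


def clean_tweet(text):
    """Lowercase a tweet, then strip URLs, mentions and punctuation."""
    text = text.lower()
    text = URL_PATTERN.sub('', text)
    text = MENTION_PATTERN.sub('', text)
    text = PUNCTUATION_PATTERN.sub('', text)
    # Collapse the whitespace runs left behind by the removals.
    return ' '.join(text.split())


print(clean_tweet('RT @isro: Proud of you, team! https://t.co/abc123 #ISRO'))
# prints: rt proud of you team! #isro
```

The exclamation mark survives because it is not in the punctuation class, and the hashtag is kept because `#` is not removed either.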


### Part 4: Sentiment Analysis

In [5]:
from textblob import TextBlob

# Utility function that uses TextBlob's default sentiment property to classify a tweet
def get_tweet_sentiment(tweet):
    # create TextBlob object of passed tweet text
    analysis = TextBlob(tweet)

    # polarity lies in [-1.0, 1.0]; compute it once and map it to a label
    polarity = analysis.sentiment.polarity
    if polarity > 0:
        return 'positive'
    elif polarity == 0:
        return 'neutral'
    else:
        return 'negative'
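
Because `get_tweet_sentiment` labels a tweet neutral only when the polarity is exactly zero, very weak scores are still counted as positive or negative. One possible variation, shown here as a sketch rather than as the notebook's method, is to allow a small neutral band around zero. The function name `label_from_polarity` and the `neutral_band` parameter are hypothetical.

```python
def label_from_polarity(polarity, neutral_band=0.0):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label.

    Scores within +/- neutral_band of zero are treated as neutral.
    With neutral_band=0.0 the mapping matches get_tweet_sentiment.
    """
    if polarity > neutral_band:
        return 'positive'
    if polarity < -neutral_band:
        return 'negative'
    return 'neutral'


print(label_from_polarity(0.05))                    # prints: positive
print(label_from_polarity(0.05, neutral_band=0.1))  # prints: neutral
```

It could be called as `label_from_polarity(TextBlob(tweet).sentiment.polarity, neutral_band=0.1)`; the value 0.1 is only an example and would need tuning on real data.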

In [6]:
#We shall display the sentiment of each of the retrieved tweets
for tweet in clean_tweets:
    print("tweet is: ", tweet, " sentiment is: ",get_tweet_sentiment(tweet), '\n')

tweet is:  rt        
just giving information to isro no need to spend money to…  sentiment is:  neutral 


  pabsgill

isro #isro   sentiment is:  positive 

tweet is:  this one from table calendar bought 2 yrs back so apt after #chandrayan2 success #isro #vikramlander   sentiment is:  positive 

tweet is:  rt  #isro we are proud of you thanks for all the efforts you brought the whole nation together helped us pray together…  sentiment is:  positive 

tweet is:  this is called "full paisa vasool" project 
congratulations team 
planned for 6 months india’s mars mission #mangalyaan completes 5 years only artificial satellite that could image the full disc of mars in one view frame 
#isro
#missionmangal

  sentiment is:  positive 

tweet is:  softland failure nothing to worry about exisro chief  deccan herald 
  #nair #isro #chandrayaans   sentiment is:  negative 

tweet is:  rt  sometimes we don’t land or arrive at the destination we want to the important thing is we took off and had th
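
Beyond printing each tweet, the labels can be aggregated to show the overall mood for the hashtag. The snippet below is an illustrative sketch: `summarise_sentiments` is a hypothetical helper, and `example_labels` is made-up data standing in for the output of `get_tweet_sentiment`.

```python
from collections import Counter


def summarise_sentiments(labels):
    """Return the share of each sentiment label as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical labels, standing in for get_tweet_sentiment output.
example_labels = ['positive', 'neutral', 'positive', 'negative', 'positive']
print(summarise_sentiments(example_labels))
# prints: {'positive': 60.0, 'neutral': 20.0, 'negative': 20.0}
```

With the notebook's data, the input would be `[get_tweet_sentiment(t) for t in clean_tweets]`.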

###### Simple demo of the TextBlob sentiment analysers

In [None]:
import nltk
nltk.download('movie_reviews')  # corpus that NaiveBayesAnalyzer is trained on

from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

#Pattern Analyser - default analyser
print("PatternAnalyzer() results:")
#Positive 
blob = TextBlob("This is a great place\n")
print(blob.sentiment.polarity)

#Neutral
blob = TextBlob("This is a place\n")
print(blob.sentiment.polarity)

#Negative
blob = TextBlob("This is a terrible place\n")
print(blob.sentiment.polarity)

print("\nNaiveBayesAnalyzer() results:")
# Naive Bayes Analyser - we need to explicitly state that NaiveBayesAnalyzer() is to be used.
# The analyser trains on the movie_reviews corpus on first use, so create it once and reuse it.
nb_analyzer = NaiveBayesAnalyzer()

# Positive example
blob = TextBlob("This is a great place", analyzer=nb_analyzer)
print(blob.sentiment.classification)

# Negative example
blob = TextBlob("This is a horrible place", analyzer=nb_analyzer)
print(blob.sentiment.classification)