##### FETCHING TWEETS USING TWEEPY

In [1]:
import tweepy
import pandas as pd
import numpy as np
keys=pd.read_csv(r'C:\Users\mugesh\Projects\Sentimental_Analysis\twitter_api_keys.csv')

In [2]:
# Variables that contain the credentials to access the Twitter API
access_token = keys['API Keys'][0]
access_secret = keys['API Keys'][1]
consumer_key = keys['API Keys'][2]
consumer_secret = keys['API Keys'][3]

In [3]:
# Create the authentication object
authenticate = tweepy.OAuthHandler(consumer_key,consumer_secret) 
# Set the access token and access token secret
authenticate.set_access_token(access_token,access_secret) 
# Creating the API object while passing in auth information
api = tweepy.API(authenticate, wait_on_rate_limit=True)

In [4]:
# Tweets from a specific user (count=200 requests up to 200 tweets)
ny_tweets = api.user_timeline(screen_name='nytimes', count=200)
for tweet in ny_tweets:
    print(tweet.text)

It isn’t often that one needs driving lessons to guide a robot through an art gallery, but these are strange times https://t.co/ZbyXKoDMuC
Read updates in Chinese: 新冠病毒疫情最新消息 https://t.co/EItmarz4rX
The latest on the coronavirus:
—Israel said it would soon begin easing isolation restrictions.
—Performers joined t… https://t.co/TuVzuIEPS7
Ciudad Perdida, Colombia’s “Lost City,” is stunning in its scale and complexity: an 80-acre site — parts of which d… https://t.co/fuo64cHRWv
Children bring all kinds of wacky things to bed with them. @nytparenting spoke to experts about what it can tell pa… https://t.co/1MyGAWrQgS
What does New York City’s Tiger Man — the one who in 2003 was found to be keeping a full-grown tiger in his Harlem… https://t.co/Kp6seWU9Yx
Virtual dating platforms are quickly pivoting to help social-distancing singles. “It is an entirely possible scenar… https://t.co/x23MzWG2lA
Which companies should you ask for a refund? It might be more complicated than you think. @ronlie

In [5]:
# function to extract data from tweet objects
def extract_tweet(tweet_object):
    # create empty list
    tweet_list =[]
    # loop through tweet objects
    for tweet in tweet_object:
        tweet_id = tweet.id # unique integer identifier for tweet
        text = tweet.text # utf-8 text of tweet
        created_at = tweet.created_at # utc time tweet created
        source = tweet.source # utility used to post tweet
        retweets = tweet.retweet_count # number of times this tweet was retweeted
        favorites = tweet.favorite_count # number of times this tweet was liked
        # append attributes to list
        tweet_list.append({'tweet_id':tweet_id, 
                          'text':text, 
                          'time':created_at, 
                          'source':source, 
                          'retweets':retweets,
                          'favorites':favorites})
    # create dataframe   
    df = pd.DataFrame(tweet_list, columns=['tweet_id','text','time','source','retweets','favorites'])
    return df


df = extract_tweet(ny_tweets)

In [6]:
df

,tweet_id,text,time,source,retweets,favorites
0,1251737354815057920,It isn’t often that one needs driving lessons ...,2020-04-19 05:00:16,SocialFlow,61,233
1,1251726366367776768,Read updates in Chinese: 新冠病毒疫情最新消息 https://t....,2020-04-19 04:16:36,Twitter Web App,18,53
2,1251726143767658497,The latest on the coronavirus:\n—Israel said i...,2020-04-19 04:15:43,SocialFlow,89,272
3,1251724703452008449,"Ciudad Perdida, Colombia’s “Lost City,” is stu...",2020-04-19 04:09:59,SocialFlow,163,556
4,1251717166606569473,Children bring all kinds of wacky things to be...,2020-04-19 03:40:02,SocialFlow,51,242
...,...,...,...,...,...,...
195,1251007502357016580,President Trump has made many promises about r...,2020-04-17 04:40:05,SocialFlow,281,637
196,1251004993152352258,"RT @danielle_ivory: ""Two workers at the home, ...",2020-04-17 04:30:07,SocialFlow,66,0
197,1251000544484904963,Read updates in Chinese: 新冠病毒疫情最新消息 https://t....,2020-04-17 04:12:26,SocialFlow,28,95
198,1251000383641616384,—Brazil's president fired his health minister ...,2020-04-17 04:11:48,SocialFlow,167,405


In [7]:
df['text']

0      It isn’t often that one needs driving lessons ...
1      Read updates in Chinese: 新冠病毒疫情最新消息 https://t....
2      The latest on the coronavirus:\n—Israel said i...
3      Ciudad Perdida, Colombia’s “Lost City,” is stu...
4      Children bring all kinds of wacky things to be...
                             ...                        
195    President Trump has made many promises about r...
196    RT @danielle_ivory: "Two workers at the home, ...
197    Read updates in Chinese: 新冠病毒疫情最新消息 https://t....
198    —Brazil's president fired his health minister ...
199    A Massachusetts man was charged with trying to...
Name: text, Length: 200, dtype: object

##### REMOVING THE USER HANDLES

In [8]:
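# Remove user handles (an @ followed by word characters); raw strings keep \w intact
df['tidy_tweet'] = df['text'].replace(to_replace=r'@\w+', value='', regex=True)
# Remove the standalone retweet marker; \b stops 'RT' inside longer words from being stripped
df['tidy_tweet'] = df['tidy_tweet'].replace(to_replace=r'\bRT\b', value='', regex=True)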
df['tidy_tweet']

0      It isn’t often that one needs driving lessons ...
1      Read updates in Chinese: 新冠病毒疫情最新消息 https://t....
2      The latest on the coronavirus:\n—Israel said i...
3      Ciudad Perdida, Colombia’s “Lost City,” is stu...
4      Children bring all kinds of wacky things to be...
                             ...                        
195    President Trump has made many promises about r...
196     : "Two workers at the home, which has 227 bed...
197    Read updates in Chinese: 新冠病毒疫情最新消息 https://t....
198    —Brazil's president fired his health minister ...
199    A Massachusetts man was charged with trying to...
Name: tidy_tweet, Length: 200, dtype: object
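
The effect of the two patterns can be checked on a single string with the standard `re` module, on the assumption that pandas applies the same regular-expression semantics when `regex=True`. This is a minimal sketch: the sample string is made up for illustration (only the `@danielle_ivory` retweet prefix comes from the output above), and `strip_handles` is a hypothetical helper name.

```python
import re

HANDLE = re.compile(r'@\w+')     # an @ followed by word characters
RETWEET = re.compile(r'\bRT\b')  # 'RT' only when it stands alone as a word

def strip_handles(text):
    """Remove @handles and the standalone retweet marker from one tweet."""
    text = HANDLE.sub('', text)
    return RETWEET.sub('', text)

# Illustrative sample, not taken from the fetched data
sample = 'RT @danielle_ivory: "Two workers at the home" via @nytimes ALERT'
cleaned = strip_handles(sample)
print(repr(cleaned))

# A bare 'RT' pattern without word boundaries also eats letters inside words
naive = re.sub('RT', '', 'ALERT')
print(repr(naive))
```

The leading ` : ` left behind by the retweet prefix is the same residue visible in row 196 above; it disappears later when non-letter characters are replaced.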

##### REMOVE THE LINKS FROM THE TWEETS

In [9]:
# Remove links from the tweets: they are not needed for sentiment analysis.
# \S+ matches the rest of the URL; the replacement is an empty string ('\0' would insert a NUL character).
df['tidy_tweet'] = df['tidy_tweet'].str.replace(r'(www\.\S+)|(https?://\S+)', '', regex=True)
df['tidy_tweet']

0      It isn’t often that one needs driving lessons ...
1                  Read updates in Chinese: 新冠病毒疫情最新消息  
2      The latest on the coronavirus:\n—Israel said i...
3      Ciudad Perdida, Colombia’s “Lost City,” is stu...
4      Children bring all kinds of wacky things to be...
                             ...                        
195    President Trump has made many promises about r...
196     : "Two workers at the home, which has 227 bed...
197                Read updates in Chinese: 新冠病毒疫情最新消息  
198    —Brazil's president fired his health minister ...
199    A Massachusetts man was charged with trying to...
Name: tidy_tweet, Length: 200, dtype: object
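
Two details of link removal are easy to get wrong: the character class after `www\.` and the replacement string. The sketch below uses the standard `re` module on one sample string, assuming pandas follows the same semantics with `regex=True`. The `www.nytimes.com` address and the helper name `strip_links` are illustrative assumptions.

```python
import re

URL = re.compile(r'(www\.\S+)|(https?://\S+)')

def strip_links(text):
    """Remove http(s):// and www. links from one tweet."""
    return URL.sub('', text)

sample = 'Read updates in Chinese: https://t.co/EItmarz4rX and www.nytimes.com'
cleaned = strip_links(sample)
print(repr(cleaned))

# Pitfall 1: in a normal Python string, '\0' is the NUL character, not "nothing"
with_nul = URL.sub('\0', sample)
print('\x00' in with_nul)

# Pitfall 2: 'www\.[\s]+' looks for whitespace after 'www.', so a real link never matches
unchanged = re.sub(r'www\.[\s]+', '', 'www.nytimes.com')
print(unchanged)
```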

##### REMOVE SPECIAL CHARACTERS, PUNCTUATION, NUMBERS

In [10]:
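# Remove special characters, numbers and punctuation: they are not used for the sentiment score here.
# regex=True is passed explicitly because newer pandas versions treat the pattern as a literal string by default.
df['tidy_tweet'] = df['tidy_tweet'].str.replace(r"[^a-zA-Z]+", " ", regex=True)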

In [11]:
df['tidy_tweet']

0      It isn t often that one needs driving lessons ...
1                               Read updates in Chinese 
2      The latest on the coronavirus Israel said it w...
3      Ciudad Perdida Colombia s Lost City is stunnin...
4      Children bring all kinds of wacky things to be...
                             ...                        
195    President Trump has made many promises about r...
196     Two workers at the home which has beds also t...
197                             Read updates in Chinese 
198     Brazil s president fired his health minister ...
199    A Massachusetts man was charged with trying to...
Name: tidy_tweet, Length: 200, dtype: object
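
What the `[^a-zA-Z]+` pattern does to contractions, digits and non-Latin text can be seen on two short strings. This is a sketch with the standard `re` module, assuming the same behaviour as the pandas call with `regex=True`; `letters_only` is a hypothetical helper name and the first sample is a shortened, made-up variant of row 0.

```python
import re

NON_LETTERS = re.compile(r'[^a-zA-Z]+')

def letters_only(text):
    """Replace every run of non-ASCII-letter characters with a single space."""
    return NON_LETTERS.sub(' ', text)

# The curly apostrophe splits "isn’t" into "isn" and "t"; the digit and "!" vanish
english = letters_only('It isn’t often that one needs 2 lessons!')
print(repr(english))

# Non-Latin text is removed entirely, as in rows 1 and 197 above
chinese = letters_only('Read updates in Chinese: 新冠病毒疫情最新消息  ')
print(repr(chinese))
```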

##### TOKENIZING AND REMOVING THE STOP WORDS

In [12]:
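from nltk.corpus import stopwords

# Lowercase each tweet, then split it into a list of tokens on whitespace
df["tidy_tweet"] = df["tidy_tweet"].str.lower()
df["tidy_tweet"] = df["tidy_tweet"].str.split()

# English stop words (requires a one-time nltk.download('stopwords')); a set makes lookups fast
stop = set(stopwords.words('english'))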

In [13]:
df['tidy_tweet']=df['tidy_tweet'].apply(lambda x: [item for item in x if item not in stop])
df['tidy_tweet']

0      [often, one, needs, driving, lessons, guide, r...
1                               [read, updates, chinese]
2      [latest, coronavirus, israel, said, would, soo...
3      [ciudad, perdida, colombia, lost, city, stunni...
4      [children, bring, kinds, wacky, things, bed, s...
                             ...                        
195    [president, trump, made, many, promises, respo...
196    [two, workers, home, beds, also, told, actual,...
197                             [read, updates, chinese]
198    [brazil, president, fired, health, minister, d...
199    [massachusetts, man, charged, trying, blow, je...
Name: tidy_tweet, Length: 200, dtype: object
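
The filtering step can be sketched without NLTK by using a small hand-picked stop list. The `STOP_WORDS` set below is a hypothetical subset chosen for this example, not the full NLTK English list, and `remove_stop_words` is an illustrative helper name.

```python
# Hypothetical subset of English stop words, for illustration only
STOP_WORDS = {'it', 'isn', 't', 'that', 'to', 'a', 'an', 'but', 'these', 'are', 'through'}

def remove_stop_words(text, stop_words=STOP_WORDS):
    """Lowercase, split on whitespace, and drop tokens found in stop_words."""
    tokens = text.lower().split()
    return [token for token in tokens if token not in stop_words]

tokens = remove_stop_words('It isn t often that one needs driving lessons')
print(tokens)
```

Note that the fragments `isn` and `t` are dropped along with the other stop words, so the negation in "isn't" does not survive this step. That may matter for sentiment scoring and is worth keeping in mind when reading the polarity values later on.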

In [14]:
def rejoin_words(row):
    my_list = row['tidy_tweet']
    joined_words = ( " ".join(my_list))
    return joined_words

df['tidy_tweet'] = df.apply(rejoin_words, axis=1)


In [15]:
df['tidy_tweet']

0      often one needs driving lessons guide robot ar...
1                                   read updates chinese
2      latest coronavirus israel said would soon begi...
3      ciudad perdida colombia lost city stunning sca...
4      children bring kinds wacky things bed spoke ex...
                             ...                        
195    president trump made many promises responding ...
196    two workers home beds also told actual death t...
197                                 read updates chinese
198    brazil president fired health minister disagre...
199    massachusetts man charged trying blow jewish a...
Name: tidy_tweet, Length: 200, dtype: object

##### LEMMATIZATION

In [16]:
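import nltk

# WordNetLemmatizer requires a one-time nltk.download('wordnet')
w_tokenizer = nltk.tokenize.WhitespaceTokenizer()
lemmatizer = nltk.stem.WordNetLemmatizer()

def lemmatize_text(text):
    # lemmatize() treats every word as a noun by default (pos='n'),
    # so plurals are reduced but verb forms such as 'driving' stay unchanged
    return [lemmatizer.lemmatize(w) for w in w_tokenizer.tokenize(text)]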

df['tidy_tweet'] = df['tidy_tweet'].apply(lemmatize_text)

In [17]:
df['tidy_tweet']

0      [often, one, need, driving, lesson, guide, rob...
1                                [read, update, chinese]
2      [latest, coronavirus, israel, said, would, soo...
3      [ciudad, perdida, colombia, lost, city, stunni...
4      [child, bring, kind, wacky, thing, bed, spoke,...
                             ...                        
195    [president, trump, made, many, promise, respon...
196    [two, worker, home, bed, also, told, actual, d...
197                              [read, update, chinese]
198    [brazil, president, fired, health, minister, d...
199    [massachusetts, man, charged, trying, blow, je...
Name: tidy_tweet, Length: 200, dtype: object

In [18]:
# Join each list of lemmas back into a single space-separated string
df['tidy_tweet'] = df['tidy_tweet'].str.join(' ')

In [19]:
df['tidy_tweet']

0      often one need driving lesson guide robot art ...
1                                    read update chinese
2      latest coronavirus israel said would soon begi...
3      ciudad perdida colombia lost city stunning sca...
4      child bring kind wacky thing bed spoke expert ...
                             ...                        
195    president trump made many promise responding c...
196    two worker home bed also told actual death tol...
197                                  read update chinese
198    brazil president fired health minister disagre...
199    massachusetts man charged trying blow jewish a...
Name: tidy_tweet, Length: 200, dtype: object

In [20]:
df.isnull().sum()

tweet_id      0
text          0
time          0
source        0
retweets      0
favorites     0
tidy_tweet    0
dtype: int64

##### CHECK THE SENTIMENT USING TEXTBLOB 

In [21]:
from textblob import TextBlob
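# TextBlob's default analyzer returns (polarity, subjectivity): polarity in [-1, 1], subjectivity in [0, 1]
df[['polarity', 'subjectivity']] = df['tidy_tweet'].apply(lambda text: pd.Series(TextBlob(text).sentiment))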

In [22]:
df[['text','polarity', 'subjectivity']]

,text,polarity,subjectivity
0,It isn’t often that one needs driving lessons ...,-0.050000,0.150000
1,Read updates in Chinese: 新冠病毒疫情最新消息 https://t....,0.000000,0.000000
2,The latest on the coronavirus:\n—Israel said i...,0.500000,0.900000
3,"Ciudad Perdida, Colombia’s “Lost City,” is stu...",0.500000,1.000000
4,Children bring all kinds of wacky things to be...,0.550000,0.950000
...,...,...,...
195,President Trump has made many promises about r...,0.500000,0.500000
196,"RT @danielle_ivory: ""Two workers at the home, ...",0.050000,0.275000
197,Read updates in Chinese: 新冠病毒疫情最新消息 https://t....,0.000000,0.000000
198,—Brazil's president fired his health minister ...,-0.388889,0.833333
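
Polarity scores are often easier to read as labels. The sketch below maps a score to `negative`, `neutral` or `positive`. The `polarity_label` name and the `0.05` cut-off are assumptions made for this example, not part of TextBlob, and the threshold should be tuned for the data at hand.

```python
def polarity_label(polarity, threshold=0.05):
    """Map a polarity score in [-1, 1] to a coarse label.

    The threshold is an illustrative assumption: scores within
    +/- threshold of zero are treated as neutral.
    """
    if polarity > threshold:
        return 'positive'
    if polarity < -threshold:
        return 'negative'
    return 'neutral'

# Polarity values copied from the table above
for score in (-0.05, 0.0, 0.5, -0.388889):
    print(score, polarity_label(score))
```

On a DataFrame the same function could be applied with `df['polarity'].apply(polarity_label)`.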


In [23]:
single_tweet=df['text'][100]
tidy_tweet1=df['tidy_tweet'][100]

In [24]:
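from textblob.sentiments import NaiveBayesAnalyzer
# NaiveBayesAnalyzer is trained on NLTK's movie_reviews corpus, which must be downloaded first
analysis = TextBlob(tidy_tweet1, analyzer=NaiveBayesAnalyzer())
print(single_tweet)
# sentiment is a named tuple (classification, p_pos, p_neg); classification is 'pos' or 'neg'
print(analysis.sentiment.classification)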

The New York Liberty chose Oregon’s Sabrina Ionescu with the No. 1 pick in the WNBA draft. The season is postponed… https://t.co/E427v9vbUd
neg
