<a href="https://colab.research.google.com/github/mratanusarkar/twitter-sentiment-analysis/blob/feature%2Ftwitter-api/Notebooks/Twitter_API.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Twitter API

The aim is to connect to Twitter via its API (v2) and pull tweets based on filters and conditions such as:
- hashtags (#)
- user IDs or mentions (@)
- keywords (string)

Using the retrieved data, sort and order the tweets based on various parameters such as:
- number of likes
- number of comments
- number of retweets
- engagement or view count

The sorted data can then be used for sentiment analysis.
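
The ordering step can be sketched without calling the API at all. The sample records below are invented for illustration. The field names `favorite_count` and `retweet_count` mirror attributes found on v1.1 tweet objects, and `rank_tweets` is a hypothetical helper, not part of Tweepy.

```python
# Invented sample records standing in for tweets pulled from the API.
sample_tweets = [
    {"text": "first", "favorite_count": 12, "retweet_count": 3},
    {"text": "second", "favorite_count": 40, "retweet_count": 1},
    {"text": "third", "favorite_count": 12, "retweet_count": 9},
]


def rank_tweets(tweets, keys=("favorite_count", "retweet_count")):
    """Return tweets sorted in descending order by the given metrics.

    Earlier keys take priority; later keys break ties.
    """
    return sorted(tweets, key=lambda t: tuple(t[k] for k in keys), reverse=True)


ranked = rank_tweets(sample_tweets)
print([t["text"] for t in ranked])  # ['second', 'third', 'first']
```

Passing a different `keys` tuple, for example `("retweet_count",)`, ranks the same records by retweets instead.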

In the future (v2), we can also dig into each tweet thread and quote tweet (retweet) to find how they link and relate to one another, and run further analyses.

## Import Packages

In [1]:
import tweepy
import configparser
import pandas as pd
from pprint import pprint

## Input secrets and keys

In [2]:
# secrets and keys
api_key = "<API_KEY>"
api_key_secret = "<API_KEY_SECRET>"
bearer_token = "<BEARER_TOKEN>"
access_token = "<ACCESS_TOKEN>"
access_token_secret = "<ACCESS_TOKEN_SECRET>"
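
Hard-coding keys in a notebook makes them easy to leak. Since `configparser` is already imported, one option is to keep the keys in an INI file and read them at runtime. This is a sketch under assumptions: the section name `twitter` and the file name `config.ini` are hypothetical, and the example parses an inline string so that it runs without a file on disk.

```python
import configparser

# Hypothetical contents of a local, git-ignored config.ini file.
EXAMPLE_INI = """
[twitter]
api_key = <API_KEY>
api_key_secret = <API_KEY_SECRET>
bearer_token = <BEARER_TOKEN>
access_token = <ACCESS_TOKEN>
access_token_secret = <ACCESS_TOKEN_SECRET>
"""

# Disable interpolation so that a '%' inside a key is read literally.
config = configparser.ConfigParser(interpolation=None)
config.read_string(EXAMPLE_INI)  # with a real file: config.read("config.ini")

secrets = dict(config["twitter"])
api_key = secrets["api_key"]
print(sorted(secrets))
```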

## Connect to Twitter API

In [3]:
# set auth handler
auth = tweepy.OAuthHandler(api_key, api_key_secret)
auth.set_access_token(access_token, access_token_secret)

# authenticate and get the API handler
# (note: tweepy.API wraps the v1.1 endpoints)
api = tweepy.API(auth)

## Pull tweets using various methods

In [4]:
# get public tweets from my timeline
limit = 1

tweets_from_public_timeline = api.home_timeline(count=limit)
# pprint(tweets_from_public_timeline[0]._json)
print(tweets_from_public_timeline[0].text)

A Review in @NatRevClinOncol summarizes the multidimensional cellular and molecular profiling technologies that hav… https://t.co/QVAh9RWj88


In [5]:
# get tweets from specific user
user = 'mratanusarkar'
limit = 1

tweets_from_user_timeline = api.user_timeline(screen_name=user, count=limit, tweet_mode='extended')
# pprint(tweets_from_user_timeline[0]._json)
print(tweets_from_user_timeline[0].full_text)

Good to see latest technology getting into chess as well!! https://t.co/8k9jQtSB1K


In [33]:
# get tweets using keywords, #hashtags or @mentions

In [6]:
# get tweets from keywords
# (note: Tweepy 4.0 renamed api.search to api.search_tweets)
keywords = "physics"
limit = 1

tweets_from_keywords = api.search(q=keywords, count=limit, tweet_mode='extended')
# pprint(tweets_from_keywords[0]._json)
print(tweets_from_keywords[0].full_text)

RT @1MASHMARTIN: What do you remember in Physics?


In [7]:
# get tweets from #hashtags
keywords = "#physics"
limit = 1

tweets_from_keywords = api.search(q=keywords, count=limit, tweet_mode='extended', include_entities=True)
# pprint(tweets_from_keywords[0]._json)
print(tweets_from_keywords[0].full_text)

❤️‍🔥❤️‍🔥❤️‍🔥💛🤎💖 
 #science #education #biology #physics #chemistry #technology https://t.co/NN1UHwxt2d


In [8]:
# get tweets from @users or @mentions
keywords = "@3blue1brown"
limit = 1

tweets_from_keywords = api.search(q=keywords, count=limit, tweet_mode='extended', include_entities=True)
# pprint(tweets_from_keywords[0]._json)
print(tweets_from_keywords[0].full_text)

RT @michele_geraci: Domanda: Come fanno le reti neuronali e Intelligenza Artificiale ad imparare? 

Risposta: Gradient Descent

Secondo vid…


In [9]:
# combination of all
query = "#math OR #mathematics AND @3blue1brown"
limit = 1

tweets_from_keywords = api.search(q=query, count=limit, tweet_mode='extended', include_entities=True)
# pprint(tweets_from_keywords[0]._json)
print(tweets_from_keywords[0].full_text)

RT @TheGaloisCxn: I’d love some outside opinion on or verification of this proof! 🧐 

Has anyone in the #Mathematics #YouTube community see…
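
Mixing `OR` and `AND` in one flat string leaves the grouping up to the search endpoint. A small helper can make the intended grouping explicit before the query is sent. This is a sketch showing one possible reading of the query above: `any_of` and `all_of` are hypothetical helpers, and the example assumes that the endpoint honors parentheses for grouping and treats a space as a logical AND, which should be checked against the search operator documentation.

```python
def any_of(*terms):
    """Join terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Join terms with spaces (assumed to mean logical AND)."""
    return " ".join(terms)


query = all_of(any_of("#math", "#mathematics"), "@3blue1brown")
print(query)  # (#math OR #mathematics) @3blue1brown
```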


In [15]:
# use a cursor to page through results beyond the per-request cap (count=100)
query = "#math OR #mathematics AND @3blue1brown OR physics"
limit = 300

tweets = tweepy.Cursor(api.search, q=query, count=100, tweet_mode='extended').items(limit)
print(tweets)

<tweepy.cursor.ItemIterator object at 0x7fd11bd33e50>
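
A single search request returns at most `count` tweets (100 in the cell above), so collecting 300 results means requesting several pages. The sketch below imitates that pattern with a fake page source. `fetch_page` and `iter_items` are hypothetical stand-ins and do not reflect how `tweepy.Cursor` is implemented internally.

```python
from itertools import islice


def fetch_page(page_number, page_size=100):
    """Pretend API call: return one page of fake tweet ids."""
    start = page_number * page_size
    return list(range(start, start + page_size))


def iter_items(page_size=100):
    """Yield items one by one, requesting a new page when needed."""
    page_number = 0
    while True:
        page = fetch_page(page_number, page_size)
        if not page:  # an empty page means no more results
            return
        yield from page
        page_number += 1


# Stop after 300 items, which takes three pages of 100.
items = list(islice(iter_items(page_size=100), 300))
print(len(items), items[0], items[-1])  # 300 0 299
```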


In [14]:
# print(list(tweets)[0].full_text)
# the iterator can be consumed only once,
# so it is better to store the items in a DataFrame
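
The one-time access noted above is standard behavior for Python iterators: once the items are consumed, a second pass yields nothing. The generator below is a stand-in for the cursor's item iterator (`fake_tweets` is hypothetical). Storing the items in a list first allows repeated access.

```python
def fake_tweets():
    """Stand-in generator yielding three fake tweet texts."""
    for i in range(3):
        yield f"tweet {i}"


stream = fake_tweets()
first_pass = list(stream)   # consumes the iterator
second_pass = list(stream)  # nothing left to read

print(first_pass)   # ['tweet 0', 'tweet 1', 'tweet 2']
print(second_pass)  # []

# Materialize once, then reuse the list as often as needed.
cached = list(fake_tweets())
print(cached[0], len(cached))  # tweet 0 3
```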

In [52]:
# print all tweets (note: this would consume the iterator)
# for i, tweet in enumerate(tweets):
#     print(i, ":", tweet.full_text)

## Save the data in a DataFrame

In [16]:
# define the column names
columns = ["Time", "User", "Tweet"]
data = []

In [17]:
for tweet in tweets:
    data.append([tweet.created_at, tweet.user.screen_name, tweet.full_text])
dataframe = pd.DataFrame(data, columns=columns)
dataframe

| | Time | User | Tweet |
|---|---|---|---|
| 0 | 2023-02-07 14:13:40 | Editordon2 | Don't struggle, we are here to help\n#book rev... |
| 1 | 2023-02-07 14:05:16 | russellmanthy | Richard Feynman’s path integral is both a powe... |
| 2 | 2023-02-07 14:04:18 | Earthworksjobs | Postdoc - Storyline scenarios of extreme event... |
| 3 | 2023-02-07 14:03:47 | YosleidiN | RT @dment37: d²=a²+b²+c² (Pythagoras in 3D): p... |
| 4 | 2023-02-07 14:02:09 | NewLeibniz | #math #physics I can't read music, but I insta... |
| ... | ... | ... | ... |
| 295 | 2023-02-06 06:07:54 | Legitwriters2 | Experts available and reliable to ace academic... |
| 296 | 2023-02-06 06:07:52 | Legitwriters2 | Experts available and reliable to ace academic... |
| 297 | 2023-02-06 06:07:28 | Legitwriters2 | Experts available and reliable to ace academic... |
| 298 | 2023-02-06 06:07:19 | Legitwriters2 | Experts available and reliable to ace academic... |


In [None]:
# export data
# dataframe.to_json("tweets.json")
# dataframe.to_csv("tweets.csv")