# Accessing Twitter with Python/Tweepy

*Tweepy* (https://www.tweepy.org/) is a Python package for accessing Twitter's API.

In order to use this Notebook, you must set up your *credentials* file as described previously. The file *credentials.py* must be in the same directory as this Notebook.

In [2]:
import tweepy
import credentials

client = tweepy.Client(bearer_token=credentials.BEARER_TOKEN,
                       consumer_key=credentials.API_KEY,
                       consumer_secret=credentials.API_KEY_SECRET)

### Search recent tweets

The *search_recent_tweets* method retrieves a sample of tweets from the last 7 days. Let's get recent English-language tweets mentioning "spring break". By default, 10 tweets are returned. Note that you are limited to retrieving 500,000 tweets a month. Note also that filters (such as *lang:en*) are embedded in the search string itself. To generate an advanced search string, you can use https://twitter.com/search-advanced.

In [3]:
results = client.search_recent_tweets('spring break lang:en')
results

Response(data=[<Tweet id=1508552199202349058 text='RT @Jessi_Taylor21: 5 days, 33 miles walked, and 2 VERY swollen ankles. What started as a spring break plan ended up being the best Babymoo…'>, <Tweet id=1508552188619939844 text='I can’t waitttttttt to sleep all of spring break😍'>, <Tweet id=1508552184396406785 text='Just a Monday reminder that you are loved and cared for ❤️ …. Oh, and to stay positive, we have 3 weeks until spring break! You got this! #theheartteacher https://t.co/BPQS60ZlKx'>, <Tweet id=1508552144617570306 text='Professors love to assign homework over spring break as if I’m not going to just vomit on a piece of paper and call it a day'>, <Tweet id=1508552133435658246 text='@MysterySolvent Checking Ritz Carlton rates for Spring Break.'>, <Tweet id=1508552130507988993 text='RT @BlareWolfPrject: Spring break YCH raffle!!\n\nGot a lil footjob sketch for this raffle\n\n-follow AND retweet to enter\n\n-winner gets a full…'>, <Tweet id=1508552125395001348 text='Surf, sand,
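Because filters live inside the search string, it can help to assemble that string in one place. The following is a minimal sketch, not part of Tweepy: *build_query* is a hypothetical helper, and it assumes the *lang:* and *-is:retweet* search operators (a leading minus sign negates an operator).

```python
def build_query(phrase, lang=None, exclude_retweets=False, exact=False):
    """Assemble a search string for search_recent_tweets (hypothetical helper)."""
    # Quoting the phrase asks for an exact match; unquoted, each word must appear.
    parts = [f'"{phrase}"' if exact else phrase]
    if lang:
        parts.append(f'lang:{lang}')   # language filter, as in the example above
    if exclude_retweets:
        parts.append('-is:retweet')    # the leading "-" negates the operator
    return ' '.join(parts)


query = build_query('spring break', lang='en', exclude_retweets=True)
print(query)   # spring break lang:en -is:retweet
```

The resulting string can be passed as the first argument to *client.search_recent_tweets*.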

### View the tweets

The actual results are stored in *results.data*, which contains a *list* of results. Information about each tweet can be accessed using the dot (.) operator. Use *tweet.text* to get the text of each tweet.

In [4]:
for tweet in results.data :
    print(tweet.text, '\n')

RT @Jessi_Taylor21: 5 days, 33 miles walked, and 2 VERY swollen ankles. What started as a spring break plan ended up being the best Babymoo… 

I can’t waitttttttt to sleep all of spring break😍 

Just a Monday reminder that you are loved and cared for ❤️ …. Oh, and to stay positive, we have 3 weeks until spring break! You got this! #theheartteacher https://t.co/BPQS60ZlKx 

Professors love to assign homework over spring break as if I’m not going to just vomit on a piece of paper and call it a day 

@MysterySolvent Checking Ritz Carlton rates for Spring Break. 

RT @BlareWolfPrject: Spring break YCH raffle!!

Got a lil footjob sketch for this raffle

-follow AND retweet to enter

-winner gets a full… 

Surf, sand, and semiautomatic weapons.

Florida officials seized a staggering 75 guns from rampaging partiers in a spring break beach town this past weekend alone — calling it enough firepower...... https://t.co/4C6fmqcUwM 

RT @Mythical: |￣￣￣￣￣￣￣￣￣￣￣￣￣￣￣|
      WE'RE ON SPRING BREAK!
    

### Default tweet information

By default each tweet has an *id*, which uniquely identifies it, and its *text*.

In [5]:
tweet1 = results.data[0]
tweet1

<Tweet id=1508552199202349058 text='RT @Jessi_Taylor21: 5 days, 33 miles walked, and 2 VERY swollen ankles. What started as a spring break plan ended up being the best Babymoo…'>

In [6]:
tweet1.id

1508552199202349058

We can look at the tweet on Twitter by specifying its id.

In [7]:
print('https://twitter.com/i/web/status/', tweet1.id, sep = '')

https://twitter.com/i/web/status/1508552199202349058


Note that this tweet does not have additional information (such as the author id), so the cell below produces no output.

In [8]:
tweet1.author_id

### Twitter objects

When calling a function that uses the Twitter API, a specific Twitter object will be returned. For this class we will focus on two objects, for *Tweets* and *Users*. 

https://developer.twitter.com/en/docs/twitter-api/data-dictionary/introduction

The function *client.search_recent_tweets* returns a *Response* whose *data* attribute holds a list of *Tweet* objects. Each object includes different *fields* that can be returned. In the case of a *Tweet* object, the default fields are the *id* and *text* only, as seen above, and documented here: https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet


### Getting additional information using *fields*

We can retrieve additional data by setting various *fields* arguments. Each argument can be set to a string or a list of strings. Again we will focus on two types:

- *tweet_fields*: additional information about the tweet
    - https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet
- *user_fields*: additional information about the user
    - https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/user

Below, we get the public_metrics (retweets, replies, likes, and quotes).

In [13]:
results = client.search_recent_tweets('spring break lang:en', 
                                      tweet_fields = 'public_metrics')
tweet1 = results.data[0]

In [14]:
tweet1.public_metrics

{'retweet_count': 0, 'reply_count': 0, 'like_count': 0, 'quote_count': 0}
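Once *public_metrics* has been requested, the counts can be used like any other dictionary values. As a rough sketch, the hypothetical helper below (*most_liked* is not part of Tweepy) ranks tweets by their *like_count*; it assumes every tweet passed in was retrieved with *tweet_fields = 'public_metrics'*.

```python
def most_liked(tweets, n=3):
    """Return the n tweets with the highest like_count (hypothetical helper)."""
    # results.data is None when a search finds nothing, so fall back to [].
    tweets = tweets or []
    return sorted(tweets,
                  key=lambda t: t.public_metrics['like_count'],
                  reverse=True)[:n]


# Usage, assuming `results` comes from the search above:
# for tweet in most_liked(results.data):
#     print(tweet.public_metrics['like_count'], tweet.text)
```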

### Getting additional information using *expansions*

If we want to retrieve fields that belong to an object other than the one returned, then we need to use the *expansions* argument. Available expansions are listed here: https://developer.twitter.com/en/docs/twitter-api/expansions

Unless told otherwise, an expansion returns only the default fields of the expanded object. In the example below, we set *expansions* to 'author_id', which returns a *user* object for the author of each tweet. The default fields of the user object are the *id*, *name*, and *username*.

In [10]:
results = client.search_recent_tweets('spring break lang:en', expansions = 'author_id')
tweet1 = results.data[0]

We now have the author id for each tweet.

In [11]:
print('https://twitter.com/i/web/status/', tweet1.id, sep = '')

https://twitter.com/i/web/status/1506648533448933383


In [12]:
tweet1.author_id

1134832240674988033

The information retrieved for the expansion can be found in *results.includes*, which is a dictionary that in this case contains the *user* information requested by the *expansion*. Note that only the *default* user information is returned (*id*, *name*, and *username*). If we wanted additional information, we could specify this using the *user_fields* argument.

In [13]:
results.includes['users']

[<User id=1134832240674988033 name=BlackToe username=CookerBlackToe>,
 <User id=1047203621224337408 name=krushgraphix💋 username=krushgraphix>,
 <User id=1312020792318795776 name=JoeShiesty username=josephavsz>,
 <User id=1436678841720791043 name=Kevin Carter username=KevinCarter_28>,
 <User id=1301889165060116481 name=KJ username=IDontSpeakWhine>,
 <User id=50618960 name=Chris Dong username=thechrisflyer>,
 <User id=1458796576311238656 name=rupaul fan tweets username=iloveaphextwin1>,
 <User id=1028697242016907265 name=Pace High School Flag Football username=PaceFlagFB>,
 <User id=4594470269 name=nadia’s evil era username=nadiabagels69>,
 <User id=7212562 name=Southwest Airlines username=SouthwestAir>]

The user information is linked to each tweet by the user id, but in general this information may not exist for every tweet. Therefore, let's create a dictionary so that we can look up a user by id. We do this using a *dictionary comprehension*.

In [14]:
users_dictionary = { u.id:u for u in results.includes['users']}

Now we can look up a user in the dictionary by their id.

In [15]:
tweet1_user = users_dictionary[tweet1.author_id]
tweet1_user

<User id=1134832240674988033 name=BlackToe username=CookerBlackToe>

In [16]:
print(tweet1_user.name, '(@', tweet1_user.username, '): ', tweet1.text, sep ='')

BlackToe(@CookerBlackToe): Wonder if $APE and $ETH gonna move the same or if $APE gonna continue doing its own thang 🤪🤪🤪. Spring break in Valhalla still on schedule ☃️ https://t.co/cjCJ3hiBlw
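Since user information may be missing for some tweets, indexing the dictionary directly can raise a *KeyError*. One possible way to guard against that is sketched below; *format_tweet* is a hypothetical helper, and the fallback text is only an illustration.

```python
def format_tweet(tweet, users_by_id):
    """Return 'name (@username): text', with a fallback for unknown authors."""
    # dict.get returns None instead of raising KeyError when the id is absent.
    user = users_by_id.get(tweet.author_id)
    if user is None:
        return f'(unknown author {tweet.author_id}): {tweet.text}'
    return f'{user.name} (@{user.username}): {tweet.text}'


# Usage, assuming `results` and `users_dictionary` from the cells above:
# for tweet in results.data:
#     print(format_tweet(tweet, users_dictionary), '\n')
```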


Let's now include the user's profile *description*, along with the default *user* information the *author_id* expansion gives us for each tweet.

In [17]:
results = client.search_recent_tweets('spring break lang:en', expansions = 'author_id', user_fields = 'description')
users_dictionary = { u.id:u for u in results.includes['users']}

In [18]:
tweet1 = results.data[0]
user1 = users_dictionary[tweet1.author_id]

print('User:', user1.username)
print('Description:', user1.description)
print()
print('Tweet:', tweet1.text)

User: alivealice9
Description: Fully human. Does not fit in a box & colors outside the lines. Rabbit Holes & Wonderland. 🌎☮️💗

Tweet: @MelissaTuckey @TompkinsHealth If you have been looking around the world &amp; closer to home - the Cornell numbers in past few weeks not included in these reports (are Ithaca College just returned from spring break included) should be no surprise. https://t.co/Rb6ueFTGmD


### Getting user information

To get user information directly, we can use one of the following: 
- *client.get_user* will get information for one user based on their *username* or *id*.
- *client.get_users* will get information for multiple users based on their *usernames* or *ids*.

The concepts involving additional *fields* and *expansions* discussed above apply here. In this case, the only available expansion is for a *pinned_tweet_id*, which will be the ID of the user's pinned tweet, if applicable.

In [19]:
user = client.get_user(id = results.data[0].author_id)
user

Response(data=<User id=802555052875333632 name=AliceWondersAtItAll username=alivealice9>, includes={}, errors=[], meta={})

By default, we get the user's id, name, and username:

In [20]:
print('User name:', user.data.username)
print('Name:', user.data.name)
print('User id:', user.data.id)
print('Link: https://twitter.com/' + user.data.username)

User name: alivealice9
Name: AliceWondersAtItAll
User id: 802555052875333632
Link: https://twitter.com/alivealice9
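For several users at once, *client.get_users* accepts a list of usernames. The sketch below wraps that call in a hypothetical helper, *lookup_users*, which takes an already-created *client* and returns a dictionary keyed by username; requesting the *description* field is just an example choice.

```python
def lookup_users(client, usernames, user_fields=('description',)):
    """Fetch several users in one request; return {username: User}.

    Hypothetical helper: `client` is assumed to be a tweepy Client.
    """
    response = client.get_users(usernames=list(usernames),
                                user_fields=list(user_fields))
    # response.data is None when no matching users are found.
    return {u.username: u for u in (response.data or [])}


# Usage, assuming `client` was created as shown earlier:
# users = lookup_users(client, ['EasternCTStateU', 'SouthwestAir'])
# for username, u in users.items():
#     print(username, '-', u.description)
```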


### Get tweets from a user's timeline

To get tweets from a user's timeline, we need the user's id. Then we can use *client.get_users_tweets* to get the 10 most recent tweets by default.

In [22]:
eastern = client.get_user(username='EasternCTStateU')
eastern.data.id

226561689

In [23]:
eastern_tweets = client.get_users_tweets(eastern.data.id)

In [24]:
for i, tweet in enumerate(eastern_tweets.data) :
    print(i+1, ': ', tweet.text, sep = '')
    print()

1: Congratulations to the admitted class of 2026! Next week, we will host a major spotlight for Psychology, with more "Major Spotlight" events to come! #MyEastern
https://t.co/lBhuDjS4fg https://t.co/zItPiDlGwp

2: https://t.co/qqliXa9hVq

3: https://t.co/XJdtlUDOiT

4: RT @EasternTheatre: Legally Blonde The Musical, follows the transformation of Elle Woods as she tackles stereotypes and scandal in pursuit…

5: RT @ECSUEnglishDept: Questions about English classes or the English major @EasternCTStateU? Get answers (and snacks) at Sigma Tau Delta's Q…

6: This Wednesday, March 23! Over 60 employers will be on campus recruiting Eastern students for jobs and internships. All majors are encouraged to attend to talk with recruiters about their opportunities!  Professional dress and resumes in hand are strongly encouraged. #MyEastern https://t.co/wx9RwpGMoW

7: Photo Submission Link: https://t.co/PD6CCVwG7Q https://t.co/3EZMp1uDf3

8: https://t.co/vrkcimkKCp

9: https://t.co/jWgPbEI9XS

10: h
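Several of the timeline entries above are retweets. If only the account's own tweets are wanted, *client.get_users_tweets* takes an *exclude* argument. The following is a small sketch under that assumption; *original_tweets* is a hypothetical helper that expects an already-created *client*.

```python
def original_tweets(client, user_id, max_results=10):
    """Return the text of a user's recent tweets, leaving out retweets.

    Hypothetical helper: `client` is assumed to be a tweepy Client.
    """
    response = client.get_users_tweets(user_id,
                                       exclude='retweets',
                                       max_results=max_results)
    # response.data is None when the timeline has no matching tweets.
    return [tweet.text for tweet in (response.data or [])]


# Usage, assuming `client` and `eastern` from the cells above:
# for text in original_tweets(client, eastern.data.id):
#     print(text, '\n')
```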

### Getting more results, and don't forget your monthly tweet cap

As mentioned above, you are limited to retrieving 500,000 tweets a month.

To see this monthly Twitter cap usage, log on to your developer account, at https://developer.twitter.com, and click on the 'Developer Portal' link on the top right. 

This page will look something like this: https://gdancik.github.io/CSC-202/data/notes/twitter.png.

However, the monthly cap applies only to tweets; user information does not count toward it, although the API's usual rate limits still apply.

Because of the monthly cap, the code above uses the default settings, which return 10 tweets per request. This can be changed by setting *max_results*, which for tweet searches should be a number between 10 and 100. It is also possible to get multiple *pages* of results. If more than 100 results are desired, you will need to use *tweepy.Paginator* (see https://docs.tweepy.org/en/latest/v2_pagination.html for examples).
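To make the idea of *pages* concrete, here is a rough sketch that pages through search results by hand, as an alternative to *tweepy.Paginator*. It assumes each response carries a *next_token* in *response.meta* while more pages remain; *search_many* is a hypothetical helper, and every tweet it fetches counts toward the monthly cap.

```python
def search_many(client, query, limit=250, page_size=100):
    """Collect up to `limit` tweets by following next_token across pages.

    Hypothetical helper: `client` is assumed to be a tweepy Client.
    """
    tweets, token = [], None
    while len(tweets) < limit:
        response = client.search_recent_tweets(query,
                                               max_results=page_size,
                                               next_token=token)
        tweets.extend(response.data or [])       # data is None on an empty page
        token = response.meta.get('next_token')  # absent on the last page
        if token is None:
            break
    # Trim the final page; note the trimmed tweets were still retrieved.
    return tweets[:limit]


# Usage, assuming `client` was created as shown earlier:
# tweets = search_many(client, 'spring break lang:en', limit=250)
# print(len(tweets))
```

In practice *tweepy.Paginator* handles this token bookkeeping for you; the loop is shown only to make the mechanism visible.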
