# Accessing Twitter with Python/Tweepy

*Tweepy* (https://www.tweepy.org/) is a Python package for accessing Twitter's API.

In order to use this Notebook, you must set up your *credentials* file as described previously. The file *credentials.py* must be in the same directory as this Notebook.

In [None]:
import tweepy
import credentials

client = tweepy.Client(bearer_token = credentials.BEARER_TOKEN,
                       consumer_key = credentials.API_KEY,
                       consumer_secret = credentials.API_KEY_SECRET)

### Search recent tweets

The *search_recent_tweets* method retrieves a sample of tweets from the last 7 days. Let's get recent English-language tweets mentioning "spring break". By default, 10 tweets are returned. Note that you are limited to retrieving 500,000 tweets a month. Note also that filters, such as *lang:en*, are embedded in the search string. To generate an advanced search string, you can use https://twitter.com/search-advanced.

In [None]:
results = client.search_recent_tweets('spring break lang:en')
results
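Because the filters are part of the search string, it can help to assemble the string in one place. The sketch below uses a hypothetical helper, *build_query*, which is not part of Tweepy; it assumes the standard search operators *lang:* and *-is:retweet*, and that wrapping the terms in double quotes asks for the exact phrase rather than each word separately.

```python
def build_query(terms, phrase=False, lang=None, exclude_retweets=False):
    """Assemble a search string with the filters embedded (hypothetical helper)."""
    # Double quotes request the exact phrase instead of each word on its own
    parts = [f'"{terms}"' if phrase else terms]
    if lang:
        parts.append(f'lang:{lang}')       # language filter, e.g. lang:en
    if exclude_retweets:
        parts.append('-is:retweet')        # leading minus negates the operator
    return ' '.join(parts)

query = build_query('spring break', lang='en')
strict_query = build_query('spring break', phrase=True, lang='en', exclude_retweets=True)
print(query)
print(strict_query)
```

The resulting string can be passed to *client.search_recent_tweets* in place of the hand-written one.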

### View the tweets

The actual results are stored in *results.data*, which contains a *list* of results. Information about each tweet can be accessed using the dot (.) operator. Use *tweet.text* to get the text of each tweet.

In [None]:
for tweet in results.data:
    print(tweet.text, '\n')
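The loop above assumes that the search found something. As far as we can tell, Tweepy sets *results.data* to *None* (not an empty list) when nothing matches, so a small guard avoids an error. The sketch below uses stand-in objects built with *SimpleNamespace* instead of real API responses, and *tweet_texts* is a hypothetical helper.

```python
from types import SimpleNamespace

# Stand-ins for Tweepy responses (made-up data, no API call is made)
found = SimpleNamespace(data=[
    SimpleNamespace(id=1, text='Spring break at last!'),
    SimpleNamespace(id=2, text='Counting down to spring break'),
])
empty = SimpleNamespace(data=None)  # assumed shape of a search with no matches

def tweet_texts(response):
    """Return the text of each tweet, or an empty list if nothing matched."""
    return [tweet.text for tweet in (response.data or [])]

print(tweet_texts(found))
print(tweet_texts(empty))
```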

### Default tweet information

By default each tweet has an *id*, which uniquely identifies it, and its *text*.

In [None]:
tweet1 = results.data[0]
tweet1

In [None]:
tweet1.id

We can look at the tweet on Twitter by specifying its id.

In [None]:
print('https://twitter.com/i/web/status/', tweet1.id, sep = '')

Note that this tweet does not include additional information, such as the author id, because we did not request it:

In [None]:
tweet1.author_id

### Twitter objects

A function that calls the Twitter API returns a specific type of Twitter object. For this class we will focus on two objects: *Tweets* and *Users*.

https://developer.twitter.com/en/docs/twitter-api/data-dictionary/introduction

The function *client.search_recent_tweets* returns *Tweet* objects, which are stored in *results.data*. Each object includes different *fields* that can be returned. In the case of a *Tweet* object, the default fields are the *id* and *text* only, as seen above, and documented here: https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet


### Getting additional information using *fields*

We can retrieve additional data by setting various *fields* arguments. Each argument can be set to a string or a list of strings. Again we will focus on two types:

- *tweet_fields*: additional information about the tweet
    - https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet
- *user_fields*: additional information about the user
    - https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/user

Below, we get the *public_metrics*, which hold the counts of retweets, replies, likes, and quotes.

In [None]:
results = client.search_recent_tweets('spring break lang:en', 
                                      tweet_fields = 'public_metrics')
tweet1 = results.data[0]

In [None]:
tweet1.public_metrics
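The *public_metrics* field is a dictionary, so individual counts can be looked up by key. As a sketch, the code below sorts tweets from most to least liked. The tweets are stand-in objects with made-up numbers, and the key names (*retweet_count*, *reply_count*, *like_count*, *quote_count*) are assumed to match what the API returns.

```python
from types import SimpleNamespace

# Stand-in tweets with made-up metrics (not real API output)
tweets = [
    SimpleNamespace(text='First tweet', public_metrics={
        'retweet_count': 2, 'reply_count': 0, 'like_count': 5, 'quote_count': 0}),
    SimpleNamespace(text='Second tweet', public_metrics={
        'retweet_count': 9, 'reply_count': 3, 'like_count': 40, 'quote_count': 1}),
    SimpleNamespace(text='Third tweet', public_metrics={
        'retweet_count': 0, 'reply_count': 1, 'like_count': 12, 'quote_count': 0}),
]

# Sort from most liked to least liked
by_likes = sorted(tweets, key=lambda t: t.public_metrics['like_count'], reverse=True)

for t in by_likes:
    print(t.public_metrics['like_count'], t.text)
```

With real results, the same *sorted* call could be applied to *results.data*.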

### Getting additional information using *expansions*

If we want to retrieve fields that belong to an object other than the one returned, then we need to use the *expansions* argument. Available expansions are listed here: https://developer.twitter.com/en/docs/twitter-api/expansions

Unless we request otherwise, only the default fields of the expanded object are returned. In the example below, we set the *expansions* to 'author_id', which returns a *user* object for each tweet. The default fields of the user object are the *id*, *name*, and *username*.

In [None]:
results = client.search_recent_tweets('spring break lang:en', expansions = 'author_id')
tweet1 = results.data[0]

We now have the author id for each tweet.

In [None]:
print('https://twitter.com/i/web/status/', tweet1.id, sep = '')

In [None]:
tweet1.author_id

The information retrieved for the expansion can be found in *results.includes*, which is a dictionary that in this case contains the *user* information requested by the *expansion*. Note that only the *default* user information is returned (*id*, *name*, and *username*). If we wanted additional information, we could specify this using the *user_fields* argument.

In [None]:
results.includes['users']

The user information is linked to each tweet by the user id. However, the list of users does not necessarily line up one-to-one with the list of tweets, and in general this information may not exist for each tweet. Therefore, let's create a dictionary so that we can look up a user by id. We do this using *dictionary comprehension*.

In [None]:
users_dictionary = { u.id:u for u in results.includes['users']}

Now we can look up a user in the dictionary by its id:

In [None]:
tweet1_user = users_dictionary[tweet1.author_id]
tweet1_user

In [None]:
print(tweet1_user.name, '(@', tweet1_user.username, '): ', tweet1.text, sep ='')
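Since the user information may not exist for every tweet, indexing the dictionary directly can raise a *KeyError*. A minimal sketch of a safer lookup uses the dictionary's *get* method, which returns *None* for a missing key. The users and tweets below are stand-in objects with made-up values, and *describe* is a hypothetical helper.

```python
from types import SimpleNamespace

# Stand-ins for results.includes['users'] and results.data (made-up values)
users = [
    SimpleNamespace(id=10, name='Ada', username='ada'),
    SimpleNamespace(id=20, name='Grace', username='grace'),
]
tweets = [
    SimpleNamespace(author_id=10, text='Packing for spring break'),
    SimpleNamespace(author_id=99, text='Author missing from the includes'),
]

users_dictionary = {u.id: u for u in users}

def describe(tweet):
    """Format a tweet with its author, tolerating a missing user."""
    user = users_dictionary.get(tweet.author_id)  # None if the id is absent
    if user is None:
        return f'(unknown user): {tweet.text}'
    return f'{user.name} (@{user.username}): {tweet.text}'

lines = [describe(t) for t in tweets]
for line in lines:
    print(line)
```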

Let's now include the user's profile *description*, along with the default *user* information the *author_id* expansion gives us for each tweet.

In [None]:
results = client.search_recent_tweets('spring break lang:en', expansions = 'author_id', user_fields = 'description')
users_dictionary = { u.id:u for u in results.includes['users']}

In [None]:
tweet1 = results.data[0]
user1 = users_dictionary[tweet1.author_id]

print('User:', user1.username)
print('Description:', user1.description)
print()
print('Tweet:', tweet1.text)

### Getting user information

To get user information directly, we can use one of the following: 
- *client.get_user* will get information for one user based on their *username* or *id*.
- *client.get_users* will get information for multiple users based on their *usernames* or *ids*.

The concepts involving additional *fields* and *expansions* discussed above apply here. In this case, the only available expansion is *pinned_tweet_id*, which refers to the user's pinned tweet, if applicable.

In [None]:
user = client.get_user(id = results.data[0].author_id)
user

By default, we get the user's id, name, and username:

In [None]:
print('User name:', user.data.username)
print('Name:', user.data.name)
print('User id:', user.data.id)
print('Link: https://twitter.com/' + user.data.username)
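When looking up many users at once with *client.get_users*, the ids may need to be split into batches. The sketch below assumes a limit of 100 ids per request (check the current API documentation for the actual limit); *chunks* is a hypothetical helper and the ids are made up.

```python
def chunks(items, size=100):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

author_ids = list(range(1, 251))  # made-up ids for illustration
batches = list(chunks(author_ids))
print([len(batch) for batch in batches])

# Each batch could then be passed along as client.get_users(ids = batch)
```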

### Get tweets from a user's timeline

To get tweets from a user's timeline, we need the user's id. Then we can use *client.get_users_tweets* to get the 10 most recent tweets by default.

In [None]:
eastern = client.get_user(username='EasternCTStateU')
eastern.data.id

In [None]:
eastern_tweets = client.get_users_tweets(eastern.data.id)

In [None]:
for i, tweet in enumerate(eastern_tweets.data):
    print(i+1, ': ', tweet.text, sep = '')
    print()

### Getting more results, and don't forget your monthly tweet cap

As mentioned above, you are limited to retrieving 500,000 tweets a month.

To see your usage against this monthly cap, log in to your developer account at https://developer.twitter.com and click the 'Developer Portal' link at the top right.

This page will look something like this: https://gdancik.github.io/CSC-202/data/notes/twitter.png.

However, retrieving user information does not count toward this monthly cap (the usual rate limits on requests still apply).

Because of the monthly cap, the code above uses the default settings to return 10 tweets at once. However, this can be modified by setting *max_results*, which for tweets should be a number between 10 and 100. It is also possible to get multiple *pages* of results. If more than 100 results are desired, you will need to use *tweepy.Paginator* (see https://docs.tweepy.org/en/latest/v2_pagination.html for examples).
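To illustrate what paging involves, here is a sketch of the general idea: each request returns one page plus a token for the next page, and we keep requesting until we have enough tweets or the token runs out. The function *fetch_page* is a hypothetical stand-in that fabricates tweets instead of calling the API, and *TOTAL_AVAILABLE* is a made-up number of matching tweets. With Tweepy itself, *tweepy.Paginator* together with its *flatten* method should handle this loop for you.

```python
TOTAL_AVAILABLE = 230  # made-up number of tweets matching the search

def fetch_page(next_token=None, max_results=100):
    """Hypothetical stand-in for one API request; returns (tweets, next_token)."""
    start = 0 if next_token is None else int(next_token)
    end = min(start + max_results, TOTAL_AVAILABLE)
    tweets = [f'tweet {i}' for i in range(start, end)]
    token = str(end) if end < TOTAL_AVAILABLE else None  # None means no more pages
    return tweets, token

def collect(limit):
    """Request pages until `limit` tweets are collected or the results run out."""
    collected, token = [], None
    while len(collected) < limit:
        page, token = fetch_page(token)
        collected.extend(page)
        if token is None:
            break
    return collected[:limit]

print(len(collect(150)))
print(len(collect(500)))
```

Note that whole pages are requested, so asking for 150 tweets here still fetches 200; with the real API, every tweet fetched would presumably count toward the monthly cap.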
