# Twitter API

You need to have a Twitter Developer Account before proceeding. If you don't have one, [check out these instructions](https://github.com/caocscar/twitter-create-developer-account).

## Getting Twitter Credentials

Twitter implements OAuth 1.0 as its authentication protocol. You'll need 4 credentials in order to use OAuth and make requests to Twitter's API.

The four credentials (API key, API secret key, access token, and access token secret) can be obtained from the Twitter developer site as follows:

- Go to https://developer.twitter.com/en/apps and create an application. 
- Enter a name/description/website for your application. You can use https://www.google.com for the website. You also need to describe how the app will be used. Agree to the Developer Terms and create your Twitter application.
- Go to the *Keys and tokens* tab. You should be able to see your **Consumer Key** and **Consumer Secret** (these are the API key and API secret key mentioned above).
- At the bottom of the page, click the button to create your own **Access Token** and **Access Token Secret**.

**Note**: You can always revoke/regenerate your credentials or delete the application.

#### Now you are ready to start using the Twitter API.

# Postman Software for Non-Programmers

Here is the [Postman homepage](https://www.getpostman.com/).  
1. Download the software (Mac, Windows or Linux) https://www.getpostman.com/downloads/. I'll be using v7.1.1 for the workshop. 
2. Run the executable to install the software.
3. Launch Postman
4. (optional) You can sign in (using Google or creating a Postman account)

Here is a blog post on how to use Postman with Twitter: https://www.dataneb.com/blog/how-to-make-calls-to-twitter-apis-using-postman-client

### Making a Twitter API Request
1. Open a new GET request tab (if one is not already open)

### Setting up Twitter Credentials

2. Click on the "Authorization" tab. Set "type" to `OAuth 1.0` and "add authorization data to" to `Request Headers`.
3. Still on the "Authorization" tab, enter the four Twitter credentials you acquired earlier.
4. Switch back to the "Params" tab.

### Making GET Request
5. Enter the request URL https://api.twitter.com/1.1/account/verify_credentials.json?skip_status=true. This endpoint returns information about your Twitter account. Hit "Send".
6. You should see it return a response with information in JSON format.
7. You can save the JSON response via the "Download" button.
8. That's it. You've just used one endpoint in the Twitter API.

Q: How did I know what URL to use to make the request above?  
A: I read the API documentation.

The Twitter API reference index can be found at https://developer.twitter.com/en/docs/api-reference-index. This has a list of all APIs available to developers. We can look up our endpoint under "Manage account settings and profile" as `GET account/verify_credentials`.

The documentation tells us the "Resource URL" https://api.twitter.com/1.1/account/verify_credentials.json as well as the parameters we have to pass to it (required or optional). It also provides an example request and example response.

### Example 2
Let's move on to another example. How do we get a list of who is following us? This is under the "Follow, search, and get users" section as `GET followers/list`. The [documentation tells us](https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference/get-followers-list) the "Resource URL" is https://api.twitter.com/1.1/followers/list.json. There are no required parameters. We get our own account by default, but we can request another account by `screen_name` or `user_id`.

**Note**: For some reason beyond me, I can't get the API for searching tweets to work with Postman. It's giving me an authentication error I can't resolve. We can get around this issue using a programming language like Python. There might be other APIs among the 100+ that have similar issues.

# Python for Programmers

We set up our credentials first. Here I am using environment variables to store my credentials. You can hard-code them for the workshop.

In [3]:
import os
consumerKey = os.getenv('consumerkey')
consumerSecret = os.getenv('consumersecret')
oauthToken = os.getenv('accesstoken')
oauthTokenSecret = os.getenv('accesstokensecret')

**Note**: To reiterate, I do NOT recommend hard-coding credentials in a working setting. In Python, you should save the credentials in environment variables and retrieve them with `os.getenv`, thus decoupling the credentials from your code.
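
As a minimal sketch of that decoupling, the helper below reads the four credentials from a mapping (by default `os.environ`) and fails loudly if any are missing. The name `load_credentials` is an assumption made for illustration; the variable names are the ones used in this notebook.

```python
import os

# Environment variable names used earlier in this notebook.
REQUIRED = ('consumerkey', 'consumersecret', 'accesstoken', 'accesstokensecret')

def load_credentials(env=None):
    """Return the four credentials as a dict, raising if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED}

# Demonstration with a fake mapping, so no real secrets are needed.
fake_env = {'consumerkey': 'ck', 'consumersecret': 'cs',
            'accesstoken': 'at', 'accesstokensecret': 'ats'}
creds = load_credentials(fake_env)
print(sorted(creds))
```

Calling `load_credentials()` with no argument would read the real environment instead of the fake mapping.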

We can make `GET/POST` requests directly to Twitter. We just need to use the `requests_oauthlib` library for the authentication part and the `requests` library for the request part. Below is an example of the same request we made using Postman.

Install the module from the terminal or command prompt with `pip install requests_oauthlib`.

In [8]:
import requests
from requests_oauthlib import OAuth1

auth = OAuth1(consumerKey, consumerSecret, oauthToken, oauthTokenSecret)
url = 'https://api.twitter.com/1.1/account/verify_credentials.json'
params = {'skip_status':'true'}
R = requests.get(url, auth=auth, params=params)
R.raise_for_status()
response = R.json()
print(response)

{'id': 393706761, 'id_str': '393706761', 'name': 'ARCatUM', 'screen_name': 'ARC_UM', 'location': 'Ann Arbor, MI', 'description': 'Advanced computing resources and services for research, teaching and learning at the University of Michigan', 'url': 'http://t.co/TM2VijomUk', 'entities': {'url': {'urls': [{'url': 'http://t.co/TM2VijomUk', 'expanded_url': 'http://arc.umich.edu', 'display_url': 'arc.umich.edu', 'indices': [0, 22]}]}, 'description': {'urls': []}}, 'protected': False, 'followers_count': 597, 'friends_count': 186, 'listed_count': 67, 'created_at': 'Tue Oct 18 23:23:44 +0000 2011', 'favourites_count': 28, 'utc_offset': None, 'time_zone': None, 'geo_enabled': True, 'verified': False, 'statuses_count': 1921, 'lang': None, 'contributors_enabled': False, 'is_translator': False, 'is_translation_enabled': False, 'profile_background_color': 'DBDBDB', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.tw

The advantage is that we can also use Python to parse the JSON response for the data we need.

In [20]:
print(f"My screen name is @{response['screen_name']}")
print(f"Description: {response['description']}")
print(response['entities']['url']['urls'][0]['expanded_url'])

My screen name is @ARC_UM
Description: Advanced computing resources and services for research, teaching and learning at the University of Michigan
http://arc.umich.edu
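
Deeply nested fields are not always present (an account without a profile URL has no `entities['url']`, for instance), so chained indexing can raise an error. The sketch below shows one way to look up a nested path with a fallback. The helper name `dig` is hypothetical, and `sample` is a trimmed copy of the response shown above.

```python
def dig(data, *path, default=None):
    """Follow keys/indexes in `path`; return `default` if any step is missing."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current

# Trimmed copy of the response printed earlier.
sample = {
    'screen_name': 'ARC_UM',
    'entities': {'url': {'urls': [{'expanded_url': 'http://arc.umich.edu'}]},
                 'description': {'urls': []}},
}

# Present path: returns the expanded URL.
print(dig(sample, 'entities', 'url', 'urls', 0, 'expanded_url'))
# Empty list: index 0 does not exist, so the default is returned.
print(dig(sample, 'entities', 'description', 'urls', 0, 'expanded_url', default='(none)'))
```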


##### But a more convenient way to access the API is to use an existing Twitter package that someone has already crafted to abstract away some of the technical details.

## Installing Python Twitter Tools Module

We will be using the Python Twitter Tools module to interact with the API. A package like this usually makes our lives easier than making direct GET requests, since it handles other details under the hood for us. There are many Twitter packages available for Python. I chose this one because it was popular and seemed simple to use (although it lacks good documentation). Documentation is at https://github.com/sixohsix/twitter.

Install the module from the terminal or command prompt with `pip install twitter`. 

In [21]:
import twitter

Next submit your credentials and create a variable containing your authentication.

In [22]:
auth = twitter.OAuth(oauthToken, oauthTokenSecret, consumerKey, consumerSecret)

# Twitter Search API

How do we search historical tweets? This is under the "Search Tweets" section as `Standard Search API`. The [documentation tells us](https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets.html) the "Resource URL" is https://api.twitter.com/1.1/search/tweets.json. See the [Twitter Search overview](https://developer.twitter.com/en/docs/tweets/search/overview) to see what the standard API is lacking compared to the **Premium** and **Enterprise** search products.
The Standard Search API searches against a __sampling__ of recent Tweets published in the past 7 days. We will see later how to get around the 7 day restriction by invoking other constraints.

**Note**: The Search API is not an exact replica of the Search feature available in Twitter mobile or web clients such as https://twitter.com/search. 

Let's start by creating a Search API handle using your authentication.

In [23]:
twtr = twitter.Twitter(auth=auth)

## Search using a query

Details on the search API can be found here https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets.html.

Create a query term and also specify the number of tweets you want (default = 15). Use the `search.tweets` method to search for tweets. A list of possible queries (i.e. standard operators) can be found at https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/standard-operators.html.
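
Several standard operators can be combined in a single query string, where a space between operators acts as AND. The sketch below assembles such a string from a few of the documented operators (`#hashtag`, `from:`, `-filter:retweets`). The helper `build_query` is a hypothetical name, not part of the `twitter` package.

```python
def build_query(terms=(), hashtags=(), from_user=None, exclude_retweets=False):
    """Combine pieces into one search string; spaces between operators mean AND."""
    parts = list(terms)
    # Accept hashtags with or without the leading '#'.
    parts += [f"#{tag.lstrip('#')}" for tag in hashtags]
    if from_user:
        parts.append(f"from:{from_user.lstrip('@')}")
    if exclude_retweets:
        parts.append('-filter:retweets')
    return ' '.join(parts)

print(build_query(hashtags=['UMTweetCon2019'], from_user='arc_um'))
print(build_query(terms=['food'], exclude_retweets=True))
```

The resulting string can be passed as the `q` argument of `search.tweets`.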

In [24]:
query = '#UMTweetCon2019' # from:arc_um
limit = 2
results = twtr.search.tweets(q=query, count=limit)
results

{'statuses': [{'created_at': 'Thu May 23 14:18:40 +0000 2019',
   'id': 1131565109376180224,
   'id_str': '1131565109376180224',
   'text': 'RT @umisrcps: Using @Twitter data, Michael Traugott of @umisrcps and his team analyzed the networked relationships of political journalists…',
   'truncated': False,
   'entities': {'hashtags': [],
    'symbols': [],
    'user_mentions': [{'screen_name': 'umisrcps',
      'name': 'Ctr for Political St',
      'id': 1484961512,
      'id_str': '1484961512',
      'indices': [3, 12]},
     {'screen_name': 'Twitter',
      'name': 'Twitter',
      'id': 783214,
      'id_str': '783214',
      'indices': [20, 28]},
     {'screen_name': 'umisrcps',
      'name': 'Ctr for Political St',
      'id': 1484961512,
      'id_str': '1484961512',
      'indices': [55, 64]}],
    'urls': []},
   'metadata': {'iso_language_code': 'en', 'result_type': 'recent'},
   'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
   

Apparently, there is a lot of data associated with a tweet of 280 characters or fewer.

## User Info

The results are stored in the `statuses` key. Here's how you would access some interesting fields about the Tweeter.

In [25]:
for tweet in results['statuses']:
    print(f"Name: {tweet['user']['name']}")
    print(f"ScreenName: {tweet['user']['screen_name']}")
    print(f"Description: {tweet['user']['description']}")
    print(f"Location: {tweet['user']['location']}")
    print(f"# Tweets: {tweet['user']['statuses_count']}")        
    print(f"# Following: {tweet['user']['friends_count']}")        
    print(f"# Followers: {tweet['user']['followers_count']}")
    print(f"# Likes: {tweet['user']['favourites_count']}")
    print(f"# Lists: {tweet['user']['listed_count']}")
    print('\n')

Name: Cardi B. Hoodson
ScreenName: StewartColes
Description: PhD Candidate at @UM_CommStudies studying how media depictions of marginalized groups affect public opinion. @drexelwestphal & @SDSU_JMS alumnus, #USMC veteran.
Location: Ann Arbor, MI
# Tweets: 12026
# Following: 1036
# Followers: 925
# Likes: 14615
# Lists: 25


Name: UM SchoolofDentistry
ScreenName: UMichDentistry
Description: MDentistry: Advancing health through education, service, research and discovery.
Location: Ann Arbor, MI
# Tweets: 1067
# Following: 256
# Followers: 1539
# Likes: 239
# Lists: 58




## Tweet Info

Here's how you would access info about the tweet itself.

In [26]:
for tweet in results['statuses']:
    print(f"Created at: {tweet['created_at']}")       
    print(f"Text: {tweet['text']}")
    print(f"Source: {tweet['source']}")
    for hashtag in tweet['entities']['hashtags']:
        print(f"Hashtag: {hashtag['text']}")
    print(f"Likes: {tweet['favorite_count']}")
    print(f"Retweets: {tweet['retweet_count']}")        
    print(f"Retweet: {tweet['retweeted']}")
    print(f"Coordinates: {tweet['coordinates']}")
    place = tweet['place']
    if place is not None:
        print(f"Place Name: {place['full_name']}")
        print(f"Place Type: {place['place_type']}")
        print(f"Place Bounding Box: {place['bounding_box']['coordinates']}")
    print('\n')

Created at: Thu May 23 14:18:40 +0000 2019
Text: RT @umisrcps: Using @Twitter data, Michael Traugott of @umisrcps and his team analyzed the networked relationships of political journalists…
Source: <a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>
Likes: 0
Retweets: 5
Retweet: False
Coordinates: None


Created at: Thu May 23 13:57:53 +0000 2019
Text: RT @umisr: Vidya Ramaswamy from @UMichDentistry talks about #studentengagement in university lectures, and using Twitter to improve engagem…
Source: <a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>
Hashtag: studentengagement
Likes: 0
Retweets: 1
Retweet: False
Coordinates: None
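
The `created_at` field is a plain string. To sort or compare tweets by time, it can be parsed into a timezone-aware `datetime`. This is a sketch that assumes the format seen in the output above; the constant and function names are chosen for illustration.

```python
from datetime import datetime

# Format of strings such as 'Thu May 23 14:18:40 +0000 2019'.
# Note: %a and %b are locale-dependent; this assumes English day/month names.
TWITTER_TIME_FORMAT = '%a %b %d %H:%M:%S %z %Y'

def parse_created_at(value):
    """Convert a tweet's created_at string to a timezone-aware datetime."""
    return datetime.strptime(value, TWITTER_TIME_FORMAT)

first = parse_created_at('Thu May 23 14:18:40 +0000 2019')
second = parse_created_at('Thu May 23 13:57:53 +0000 2019')
print(first.isoformat())
print(first - second)  # elapsed time between the two tweets
```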




## Search tweets by users located within a given radius of a GPS point

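Add the `geocode` argument to the `search.tweets` method. Its value has the form `latitude,longitude,radius`, with the radius in `km` or `mi` and no spaces.

In [27]:
query = 'food' 
limit = 5
georesults = twtr.search.tweets(q=query, count=limit, geocode="42.7,-83.3,10km")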
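
Twitter's standard search documents this parameter as `geocode`, written as `latitude,longitude,radius` with a `km` or `mi` suffix and no spaces. A small helper can build that string and catch formatting mistakes early. This is only a sketch; `format_geocode` is a hypothetical name.

```python
def format_geocode(latitude, longitude, radius, unit='km'):
    """Build a 'lat,long,radius' string, e.g. for a geocode search argument."""
    if unit not in ('km', 'mi'):
        raise ValueError("unit must be 'km' or 'mi'")
    if not -90 <= latitude <= 90 or not -180 <= longitude <= 180:
        raise ValueError('latitude/longitude out of range')
    # No spaces anywhere in the result.
    return f'{latitude},{longitude},{radius}{unit}'

print(format_geocode(42.7, -83.3, 10))
```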

Let's print out the tweet along with the datetime and coordinates information.

In [28]:
for i, tweet in enumerate(georesults['statuses']):
    print(i, tweet['text'])
    print(tweet['created_at'], tweet['coordinates'])
    print('\n')

0 We know it is a little more than 1 month away, but mark your calendars for what will be a kick 🍑 time rocking out t… https://t.co/uzLvRoCtiy
Thu May 23 14:36:36 +0000 2019 None


1 RT @BernieSanders: I say to McDonald's CEO @SteveEasterbrk: Be a leader. Set an example for the entire fast food industry to follow. Raise…
Thu May 23 14:36:36 +0000 2019 None


2 RT @petiteyesplease: ok listen. water DOES NOT have calories. All of this dry fasting bullshit is getting on my nerves bc it makes me worri…
Thu May 23 14:36:36 +0000 2019 None




Q: How do I know what methods and arguments to use?  
A: I read the API documentation.

This URL has a list of API endpoints for Twitter: https://developer.twitter.com/en/docs/api-reference-index. It is also a good jumping-off point for getting to the documentation. *Bookmark it!*

## Get a list of Followers

Use the `followers.list` method to get a list of followers for a given user. It returns 20 results by default; use the `count` argument to request up to 200.

In [29]:
followers = twtr.followers.list(screen_name='arc_um', skip_status=True, include_user_entities=False)
followers

{'users': [{'id': 818650312181313536,
   'id_str': '818650312181313536',
   'name': 'AESPORIA',
   'screen_name': 'AESPORIA',
   'location': 'United States',
   'url': 'https://soundcloud.com/AESPORIA',
   'description': 'if (artistic_purpose == artistic_identity):',
   'protected': False,
   'followers_count': 25,
   'friends_count': 314,
   'listed_count': 0,
   'created_at': 'Tue Jan 10 02:46:59 +0000 2017',
   'favourites_count': 67,
   'utc_offset': None,
   'time_zone': None,
   'geo_enabled': False,
   'verified': False,
   'statuses_count': 179,
   'lang': None,
   'contributors_enabled': False,
   'is_translator': False,
   'is_translation_enabled': False,
   'profile_background_color': '000000',
   'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
   'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
   'profile_background_tile': False,
   'profile_image_url': 'http://pbs.twimg.com/profile_images/109031

Print out the __screen name__ and the __description__ of the followers.

In [30]:
for i, user in enumerate(followers['users']):
    print(i, user['screen_name'], user['description'])

0 AESPORIA if (artistic_purpose == artistic_identity):
1 um_midas MIDAS catalyzes data science through support for faculty, research, education and training, and industry engagement
2 notinorbit New account, old me. InfoSec pro @umisr. Opinions are my own.
3 fountain79 Dreamer. Planner. Community Builder. Encourager. Singer. Dancer. Dog lover. Enthusiastically Mediocre Golfer.
4 pfschus President's Postdoctoral Fellow in Michigan NERS department. Interested in nuclear science, radiation detectors, national security, running, and pizza 🍕
5 willard125 Family man, world traveler, runner, home brewer, gamer. Torn between Colorado & Michigan.
6 CSMissThomas1 Miss Thomas 🌼 Computer Science Teacher 👩🏽‍🏫 Running, Countryside & Dog Walks with my Frenchie 🐾  Currently Second in Dept, new adventure from September 💭
7 UM_MICDE MICDE is the focal point for the wide spectrum of research in computational science and engineering at the University of Michigan.
8 TheWebConf The Web Conference Series (fo

To iterate beyond 20 (or whatever you initially asked for), use the key `next_cursor` from the initial result along with the `cursor` argument. Think of it as Page 2. Repeat as necessary (see example below for complete acquisition).
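
The cursor logic can be sketched as a generator that keeps requesting pages until `next_cursor` comes back as 0. To keep the example runnable offline, `FAKE_PAGES` stands in for the API, and `iter_cursor_pages` is a hypothetical helper name.

```python
def iter_cursor_pages(fetch_page):
    """Yield pages from a cursored endpoint until next_cursor is 0."""
    cursor = -1  # -1 asks for the first page
    while cursor != 0:
        page = fetch_page(cursor)
        yield page
        cursor = page['next_cursor']

# Stand-in for the API: three fake pages keyed by the cursor that requests them.
FAKE_PAGES = {
    -1: {'users': [{'screen_name': 'a'}, {'screen_name': 'b'}], 'next_cursor': 101},
    101: {'users': [{'screen_name': 'c'}], 'next_cursor': 202},
    202: {'users': [{'screen_name': 'd'}], 'next_cursor': 0},
}

names = [user['screen_name']
         for page in iter_cursor_pages(FAKE_PAGES.__getitem__)
         for user in page['users']]
print(names)
```

With the real client, `fetch_page` could be something like `lambda c: twtr.followers.list(screen_name='arc_um', count=200, cursor=c)`.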

In [31]:
followers = twtr.followers.list(screen_name='arc_um', cursor=followers['next_cursor'], skip_status=True, include_user_entities=False)
for i, user in enumerate(followers['users'], start=i+1):
    print(i, user['screen_name'], user['description'])

20 UM_PSC The University of Michigan's Population Studies Center, established in 1961, is an interdisciplinary community of scholars in population research and training.
21 sash_mu Biologist & writer.
22 UMIOE We excel at analyzing and shaping all types of systems to help solve local and global challenges.
23 cnfan99 
24 RDCrawford26 PhD student @UMich studying bioinfomatics. @umichsph alum.  R/C++. Interested in genomics, antibiotic resistance, evolution, IPAs, and trail running.
25 Maggieyuyaoliu 
26 MarioAndreWO Músico, professor e pesquisador.
27 f_greve Phd student UW-Madison
28 evagro Somewhere between idiot and scholar. Generally not the 1⃣, but sometimes that 🐝. I am into birds and I know how to fly in my dreams. 🦉
29 LouWassel Detroiter via birth, education, occupation and avocation.
30 angelaxocampo @umichLSA Collegiate Fellow in Political Science. @UCLA PhD. @BrownUniversity Alumna. American Politics, Latino Politics, Race, Ethnicity & Politics.
31 UMichMCIRCC The Michigan C

## Get a list of Following

Use the `friends.list` method to see who a given user is following. It returns 20 results by default; use the `count` argument to request up to 200.

In [32]:
following = twtr.friends.list(screen_name='arc_um', skip_status=True, include_user_entities=False)
for i, user in enumerate(following['users']):
    print(i, user['screen_name'], user['description'])

0 umsi The University of Michigan School of Information creates and shares knowledge so that people will use information -- with technology -- to build a better world.
1 chihiroABO Researcher @UMich,@UM_Genetics. Engineer turned Geneticist,Data Scientist. Troublemaker/Troubleshooter. Dog lover. Fave color:Blue. Opinions of my own...
2 EdwardTufte Statistician,visualizer,artist. Professor PoliSci, Statistics,CompSci Yale+Princeton 33 years. Founded Graphics Press, Hogpen Hill Tree Farms, ET Modern Gallery
3 UMichLaw The world needs lawyers. And Leaders. And Victors. This is where they are made. #LeadersAndBest   • http://fb.me/umichlaw  (Retweets not endorsements.)
4 CBSSM CBSSM focuses on clinical & research ethics, health communication & decision making, CBPR, and ELSI of genomics and much more.
5 KayteSB Law & bioethics, Assistant Professor @umichmedicine & @CBSSM. Former @bioethicsgov. @PennLaw @UPenn_MedEthics @Middlebury alumna #researchethics (views own)
6 WNicholsonPrice Innovat

To access "Page 2" of the results, pass the `cursor` argument just as in the followers example.

## Cross Reference Followers and Following

Here is some example code to cross reference the two lists to see which relationships are reciprocated.

This function appends a list of users to an existing list.

In [33]:
def append_users(f, list_users):
    for user in f['users']:
        list_users.append(user['screen_name'])
    return list_users

Grab entire list of followers.

In [34]:
followers = {}
followers['next_cursor'] = -1
lemmings = []
while (followers['next_cursor'] != 0):
    followers = twtr.followers.list(screen_name='arc_um', count=200, cursor=followers['next_cursor'])
    lemmings = append_users(followers, lemmings)
print(f'There are {len(lemmings)} followers')

There are 597 followers


In [35]:
lemmings[:10]

['AESPORIA',
 'um_midas',
 'notinorbit',
 'fountain79',
 'pfschus',
 'willard125',
 'CSMissThomas1',
 'UM_MICDE',
 'TheWebConf',
 'umichCREES']

Grab entire list of following.

In [36]:
following = {}
following['next_cursor'] = -1
leaders = []
while (following['next_cursor'] != 0):
    following = twtr.friends.list(screen_name='arc_um', count=200, cursor=following['next_cursor'])
    leaders = append_users(following, leaders)
print(f'Following {len(leaders)} accounts.')

Following 186 accounts.


Find the intersection of the two groups using set operations.

In [37]:
lemmings = set(lemmings)
leaders = set(leaders)
relationship = leaders.intersection(lemmings)
print(len(relationship), relationship)

86 {'astrocurry', 'NYUDataScience', 'MichEnergy', 'dmaletta', 'ARCTS_UM', 'mcarrt', 'ljdursi', 'JohnsenSCFPL', 'YvonneW6463', 'IllinoisIHSI', 'sgowtham', 'jeefy', 'umichTECH', 'icermsu', 'vcollab', 'SBroudeGeva', 'umisr', 'MWBigDataHub', 'nanoHUBnews', 'KirkDBorne', 'danabrunson', 'BloodFlowSim', 'CASC_HPC', 'umichgradschool', 'UM_IHPI', 'UM_MiCHAMP', 'uwescience', 'sciencegateways', 'emilymprovost', 'SOCRedu', 'peh3iii', 'UMCAIM', 'UMPublicAffairs', 'UMich_CRLT', 'MoiraCAD', 'urcmich', 'PurdueRCAC', 'chihiroABO', 'MrBobbyTables', 'schelcj', 'russellfunk', 'tfinholt', 'aulia_khamas', 'mdst_umich', 'denizbilman', 'YottabyteLLC', 'NUITResearch', 'm_cragin', 'UMLifeSciences', 'UMichResearch', 'nswigginton', 'umichmedicine', 'oliviawalch', 'francescadomin8', 'UMich', 'ncm140', 'MarsdenStanford', 'open_michigan', 'UMichiganAI', 'ConversationUS', 'tonymarkel', 'DanEklund_UMich', 'brockpalen', 'uclbdi', 'XSEDEscience', 'jesse_caps', 'jhallum', 'umsi', 'SciNode', 'UMichiganNews', 'AmberHarmon'
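
Besides the intersection, set difference answers the two one-sided questions: who follows us without being followed back, and whom we follow without being followed back. The sketch below uses small made-up sets (the `_demo` names are placeholders) rather than the real lists.

```python
# Small illustrative sets; the real ones come from the loops above.
followers_demo = {'umsi', 'um_midas', 'notinorbit', 'UMich'}
following_demo = {'umsi', 'UMich', 'EdwardTufte'}

mutual = followers_demo & following_demo              # same as .intersection()
fans_only = followers_demo - following_demo           # they follow us, we don't follow them
not_following_back = following_demo - followers_demo  # we follow them, they don't follow us

print(len(mutual), len(fans_only), len(not_following_back))
```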

## Search for Trends by Where on Earth (WOE) ID

Use the `trends.place` method to get a list of trending topics for a given location. The location is specified using the WOE ID.  
A WOE ID is a unique identifier for a place on Earth. 

Here is a dictionary of places and WOE IDs.

In [38]:
woeid = {'World':1, 'USA':23424977, 'San Francisco':2487956, 'Los Angeles':2442047,
         'Canada':23424775, 'Toronto':4118, 'Montreal':3534,
         'United Kingdom':23424975, 'Germany':23424829}

Searching for trends by WOE ID is analogous to searching for YouTube videos by Region Code.

In [39]:
woe_trends = twtr.trends.place(_id=woeid['Canada'])

In [40]:
woe_trends

[{'trends': [{'name': 'DJ Smith',
    'url': 'http://twitter.com/search?q=%22DJ+Smith%22',
    'promoted_content': None,
    'query': '%22DJ+Smith%22',
    'tweet_volume': None},
   {'name': '#GreaterTorontoDay',
    'url': 'http://twitter.com/search?q=%23GreaterTorontoDay',
    'promoted_content': None,
    'query': '%23GreaterTorontoDay',
    'tweet_volume': None},
   {'name': '#WorldTurtleDay',
    'url': 'http://twitter.com/search?q=%23WorldTurtleDay',
    'promoted_content': None,
    'query': '%23WorldTurtleDay',
    'tweet_volume': 15562},
   {'name': 'Ottawa Senators',
    'url': 'http://twitter.com/search?q=%22Ottawa+Senators%22',
    'promoted_content': None,
    'query': '%22Ottawa+Senators%22',
    'tweet_volume': None},
   {'name': '#ThursdayThoughts',
    'url': 'http://twitter.com/search?q=%23ThursdayThoughts',
    'promoted_content': None,
    'query': '%23ThursdayThoughts',
    'tweet_volume': 47399},
   {'name': '#TerminatorDarkFate',
    'url': 'http://twitter.com/se

Print out the top 10 trending topics with their tweet volumes.

In [41]:
trends = woe_trends[0]['trends']
for trend in trends[:10]:
    print(trend['name'], trend['tweet_volume'])

DJ Smith None
#GreaterTorontoDay None
#WorldTurtleDay 15562
Ottawa Senators None
#ThursdayThoughts 47399
#TerminatorDarkFate 14723
#5SOS 55644
Linda Hamilton None
Beaton Tulk None
Tillerson 58945
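
Some trends come back with a `tweet_volume` of `None`, so sorting by volume needs to skip or handle those entries. A minimal sketch, using a few rows copied from the output above:

```python
# A few trends copied from the output above.
sample_trends = [
    {'name': 'DJ Smith', 'tweet_volume': None},
    {'name': '#WorldTurtleDay', 'tweet_volume': 15562},
    {'name': '#ThursdayThoughts', 'tweet_volume': 47399},
    {'name': 'Tillerson', 'tweet_volume': 58945},
]

# Keep only trends with a reported volume, largest first.
ranked = sorted(
    (trend for trend in sample_trends if trend['tweet_volume'] is not None),
    key=lambda trend: trend['tweet_volume'],
    reverse=True,
)
for trend in ranked:
    print(trend['name'], trend['tweet_volume'])
```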


Here is a URL detailing the WOE IDs supported by Twitter.  
https://twittercommunity.com/t/what-are-the-list-of-woeids-supported-by-twitter/8493/2.

**Note**: Not all WOE IDs are supported by Twitter. Places like Ann Arbor or Michigan are not. Apparently not important enough :(

## Submit Your Own Tweet

If you want to become an *evil trolling TwitterBot*, this is the first step towards darkness (although that would probably violate the use case you signed up for). Use the `statuses.update` method to start tweeting from Python.

In [None]:
#twtr.statuses.update(status="Python BotTweeting from the Twitter API workshop. Thanks #CSCAR, #UMTweetCon2019, @ARC_UM")

Tweeting parameters can be found at  
https://developer.twitter.com/en/docs/tweets/post-and-engage/api-reference/post-statuses-update

If you've been counting, we've touched upon 6+ API endpoints out of the 100+ endpoints available.

## Rate Limits

Usage of the Twitter API is subject to rate limits, which vary by endpoint.  
Details can be found at https://developer.twitter.com/en/docs/basics/rate-limits.
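
When calling the REST API directly with `requests`, the response headers report the rate limit state through `x-rate-limit-remaining` and `x-rate-limit-reset` (a Unix timestamp). The sketch below computes how long to wait from those two headers. The function name is hypothetical and the header values are made up for the demonstration.

```python
import time

def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next call; 0 if calls remain."""
    now = time.time() if now is None else now
    # Assumed fallbacks when the headers are absent: treat as not limited.
    remaining = int(headers.get('x-rate-limit-remaining', 1))
    reset_at = int(headers.get('x-rate-limit-reset', 0))
    if remaining > 0:
        return 0
    return max(0, reset_at - now)

# Made-up header values: no calls left, window resets 60 seconds after `now`.
fake_headers = {'x-rate-limit-remaining': '0', 'x-rate-limit-reset': '1558622400'}
print(seconds_until_reset(fake_headers, now=1558622340))
```

With a real response `R` from `requests`, `R.headers` could be passed in place of `fake_headers`.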

# Twitter Streaming API

The streaming API is used to collect tweets from the future. The streaming API provides a **sample** of the available Tweets. Polling and rate limits do NOT apply to the streaming API.

See the [overview on how to Filter Realtime Tweets](https://developer.twitter.com/en/docs/tweets/filter-realtime/overview) to see what the standard API is lacking compared to the **Enterprise** version.

Details on the public data streaming API can be found here at 
https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter

This code block is here so we can quickly restart the kernel if the streaming API is complaining about `Exceeded connection limit for user`.

In [None]:
consumerKey = os.getenv('consumerkey')
consumerSecret = os.getenv('consumersecret')
oauthToken = os.getenv('accesstoken')
oauthTokenSecret = os.getenv('accesstokensecret')

import twitter
auth = twitter.OAuth(oauthToken, oauthTokenSecret, consumerKey, consumerSecret)

We need to create a new API handle for streaming using our authentication. Only one standing connection per account is allowed to a public endpoint. We'll be using the public stream API which is specified in the domain argument.

In [42]:
twtr_stream = twitter.TwitterStream(auth=auth, domain="stream.twitter.com")

**Note**: You can NOT request every future Tweet through this API. That is referred to as the Firehose. It costs a lot of `$$$$$$$$`.

## Search by Filter

Stream searches use a comma-delimited list of phrases. A phrase may consist of one or more terms separated by spaces. Term ordering is ignored and searches are not case sensitive.
 
spaces == logical ANDs (e.g. `"Alex twitter" == "alex AND twitter"`)  
commas == logical ORs (e.g. `"Alex, twitter" == "Alex OR twitter"`)

The text of the Tweet and some entity fields are considered for matches. Specifically:
- the `text` attribute of the Tweet
- `expanded_url` and `display_url` for links and media
- `text` for hashtags
- and `screen_name` for user mentions
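
The AND/OR rules above can be illustrated with a small local matcher. This is a simplified sketch, not Twitter's actual matching: it only looks at plain text, splits on word characters, and ignores URLs, hashtags, and mentions as separate fields. The function name `matches_track` is hypothetical.

```python
import re

def matches_track(track, text):
    """True if any comma-separated phrase has all of its terms present in text."""
    tokens = set(re.findall(r'\w+', text.lower()))
    for phrase in track.split(','):       # commas act as OR
        terms = phrase.lower().split()    # spaces act as AND
        if terms and all(term in tokens for term in terms):
            return True
    return False

print(matches_track('tennis, michigan', 'Go Michigan State!'))  # one phrase matches
print(matches_track('alex twitter', 'Twitter hired Alex'))      # both terms, any order
print(matches_track('alex twitter', 'Alex plays tennis'))       # second term missing
```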

Use the `statuses.filter` method to create a streaming query.

In [43]:
iterator = twtr_stream.statuses.filter(track="tennis, michigan")

Use a `for` loop to get the generator to yield future results as they come in. I'm printing the fields (where applicable) that are being searched, plus the time. The `break` statement prevents the loop from running indefinitely.

**Tip**: Use the stop button in the toolbar to interrupt the loop before it reaches its limit.

In [44]:
for i, tweet in enumerate(iterator):   
    print(f"{i} Time: {tweet['created_at']}")
    print(f"Tweet: {tweet['text']}")
    print(f"Tweet URL: https://twitter.com/{tweet['user']['screen_name']}/status/{tweet['id']}")
    if len(tweet['entities']['hashtags']) > 0:
        print(f"Hashtags: {tweet['entities']['hashtags'][0]['text']}")
    print("\n")
    if i > 3:
        break

0 Time: Thu May 23 14:38:09 +0000 2019
Tweet: RT @FaithfulSports_: Who's the real MSU? Like for Mississippi State, Retweet for Michigan State 
#MississippiState #MichiganState https://t…
Tweet URL: https://twitter.com/AnCapTrev/status/1131570015172517888
Hashtags: MississippiState


1 Time: Thu May 23 14:38:09 +0000 2019
Tweet: RT @tribelaw: The way @PeteButtigieg thinks and speaks: like a lake in spring — crystal clear all the way down https://t.co/GvtSVmq3b1
Tweet URL: https://twitter.com/AuthorJackBloom/status/1131570015495446531


2 Time: Thu May 23 14:38:12 +0000 2019
Tweet: A fantastic day out at the tennis tournament today, coming 2nd and bringing home the silver medals.  What a great e… https://t.co/vDfezEXcLu
Tweet URL: https://twitter.com/stmichaelssand/status/1131570024869769216


3 Time: Thu May 23 14:38:13 +0000 2019
Tweet: RT @FaithfulSports_: Who's the real MSU? Like for Mississippi State, Retweet for Michigan State 
#MississippiState #MichiganState https://t…
Tweet URL:

More details on the `track` request parameter can be found at https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters.

## Search by Location

Use the `locations` argument to specify a bounding box to search. The API returns tweets whose location intersects that box; the request below uses a box around New York City.
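
The bounding box string lists the southwest corner first and the northeast corner second, each as a longitude,latitude pair (longitude comes first, unlike the usual "lat, long" order). A small sketch of a helper that builds the string and checks the corner order; `format_bounding_box` is a hypothetical name.

```python
def format_bounding_box(west, south, east, north):
    """Build a 'west,south,east,north' string (longitude before latitude)."""
    if not (west < east and south < north):
        raise ValueError('expected west < east and south < north')
    return ','.join(str(value) for value in (west, south, east, north))

# Rough box around New York City, as used in this section.
nyc = format_bounding_box(-74, 40, -73, 41)
print(nyc)
```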

In [45]:
iterator = twtr_stream.statuses.filter(locations="-74,40,-73,41")
for i, tweet in enumerate(iterator, start=1):
    print('Time: {}'.format(tweet['created_at']))
    print('Tweet: {}'.format(tweet['text']))
    print('Coordinates: {}'.format(tweet['coordinates']))
    if tweet['place'] is not None:
        print('BoundingBox: {}'.format(tweet['place']['bounding_box']['coordinates']))
        print('Name: {}'.format(tweet['place']['full_name']))
        print('Type: {}'.format(tweet['place']['place_type']))
        print('Place ID: {}\n'.format(tweet['place']['id']))
    else:
        print('')
    if i > 4:
        break

Time: Thu May 23 14:38:16 +0000 2019
Tweet: #livemusic @woodenshjips @brooklynbowl #nyc #bk #brooklyn #woodenshjips #newyork #newyorkcity #psych #psychedelic @… https://t.co/YMyIeB1FHx
Coordinates: {'type': 'Point', 'coordinates': [-73.95755, 40.72206]}
BoundingBox: [[[-74.041878, 40.570842], [-74.041878, 40.739434], [-73.855673, 40.739434], [-73.855673, 40.570842]]]
Name: Brooklyn, NY
Type: city
Place ID: 011add077f4d2da3

Time: Thu May 23 14:38:18 +0000 2019
Tweet: In UN Security Council on the Protection of Civilians in Armed Conflict, 
Center for Civilians in Conflict's Borell… https://t.co/Ms0ltmT4Fb
Coordinates: None
BoundingBox: [[[-74.041878, 40.570842], [-74.041878, 40.739434], [-73.855673, 40.739434], [-73.855673, 40.570842]]]
Name: Brooklyn, NY
Type: city
Place ID: 011add077f4d2da3

Time: Thu May 23 14:38:18 +0000 2019
Tweet: Reconstruction Of Lowe's Proving Difficult, But Sales Are Up @lowes https://t.co/t3AaeVPL8y @tforbes
Coordinates: None
BoundingBox: [[[-73.888831, 40.9

Bounding boxes act like OR operators; they do not narrow the `track` results. So the following returns tweets that either mention football OR come from the NYC bounding box.

In [46]:
iterator = twtr_stream.statuses.filter(track="football", locations="-74,40,-73,41")
for i, tweet in enumerate(iterator, start=1):
    print('Time: {}'.format(tweet['created_at']))
    print('Tweet: {}'.format(tweet['text']))
    print('Coordinates: {}'.format(tweet['coordinates']))
    if tweet['place'] is not None:
        print('BoundingBox: {}'.format(tweet['place']['bounding_box']['coordinates']))
        print('Name: {}'.format(tweet['place']['full_name']))
        print('Type: {}'.format(tweet['place']['place_type']))
        print('ID: {}\n'.format(tweet['place']['id']))
    else:
        print('')
    if i > 4:
        break

Time: Thu May 23 14:38:23 +0000 2019
Tweet: RT @KasanovaKraze: When I see Rooney slander I just assume you don’t know football still 🤷🏾‍♂️ only way it makes sense
Coordinates: None

Time: Thu May 23 14:38:22 +0000 2019
Tweet: FOOTBALL: @sloughtownfc will face seven new opponents in the @TheNationalLge South next season and announce… https://t.co/qKKLxjpR6b
Coordinates: None

Time: Thu May 23 14:38:23 +0000 2019
Tweet: RT @FluminenseRaiz: A TORCIDA ACORDOUUUUUUUUU

ISSO AQUI É FLUMINENSE FOOTBALL CLUB PORRA
Coordinates: None

Time: Thu May 23 14:38:24 +0000 2019
Tweet: Practice today for anyone that's on the sectional roster will start at 330pm on the football field.
Coordinates: None
BoundingBox: [[[-86.348441, 39.631677], [-86.348441, 39.927448], [-85.937404, 39.927448], [-85.937404, 39.631677]]]
Name: Indianapolis, IN
Type: city
ID: 018929347840059e

Time: Thu May 23 14:38:24 +0000 2019
Tweet: RT @ThistleTrust: JUST 3 DAYS LEFT and a handful of Early Bird bookings are available ... A

More details on the `locations` request parameter can be found at https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters.
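
Because `track` and `locations` combine as OR, getting "football AND in NYC" requires filtering on the client side after the tweets arrive. The sketch below is one possible approach under a simplifying assumption: it only keeps tweets with an exact `coordinates` point, so tweets that carry only a `place` are dropped. The function names and the fake tweets are made up for illustration.

```python
def in_box(tweet, west, south, east, north):
    """True if the tweet has an exact point that falls inside the box."""
    point = tweet.get('coordinates')
    if not point:
        return False
    lon, lat = point['coordinates']  # GeoJSON order: longitude, latitude
    return west <= lon <= east and south <= lat <= north

def keyword_and_box(tweets, keyword, box):
    """Yield tweets whose text contains keyword AND whose point is in box."""
    keyword = keyword.lower()
    for tweet in tweets:
        if keyword in tweet['text'].lower() and in_box(tweet, *box):
            yield tweet

# Fake tweets standing in for the stream.
fake_stream = [
    {'text': 'Football in Brooklyn', 'coordinates': {'type': 'Point', 'coordinates': [-73.95, 40.72]}},
    {'text': 'FOOTBALL practice today', 'coordinates': None},
    {'text': 'Live music tonight', 'coordinates': {'type': 'Point', 'coordinates': [-73.95, 40.72]}},
]
kept = list(keyword_and_box(fake_stream, 'football', (-74, 40, -73, 41)))
print([tweet['text'] for tweet in kept])
```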

## Twitter API Status

Status of the API can be found at https://api.twitterstat.us/

## Saving the Tweets

Once you have the tweets in hand, you can save them in JSON format to a:
1. text file
2. NoSQL database (MongoDB seems to be a popular choice)

We won't cover saving in detail here because setting up a MongoDB database is non-trivial (and requires admin privileges) within this workshop.

### Text File

To save a single tweet to a text file, use `json.dumps` together with standard Python file I/O.

In [None]:
import json
with open('tweet.txt','w') as fout:
    fout.write(json.dumps(tweet, indent=2))
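
For more than one tweet, a common convention is to write one JSON object per line (often called JSON Lines), which makes it easy to append to the file and read it back tweet by tweet. This is a sketch with made-up tweets; the function names and the temporary file path are chosen for illustration.

```python
import json
import os
import tempfile

def save_tweets(tweets, path):
    """Write each tweet as one JSON object per line."""
    with open(path, 'w', encoding='utf-8') as fout:
        for tweet in tweets:
            fout.write(json.dumps(tweet, ensure_ascii=False) + '\n')

def load_tweets(path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with open(path, encoding='utf-8') as fin:
        return [json.loads(line) for line in fin if line.strip()]

# Made-up tweets, written to a temporary directory for the demonstration.
demo = [{'id': 1, 'text': 'first 🍕'}, {'id': 2, 'text': 'second\nline'}]
path = os.path.join(tempfile.mkdtemp(), 'tweets.jsonl')
save_tweets(demo, path)
print(load_tweets(path) == demo)
```

Newlines inside a tweet's text are escaped by `json.dumps`, so each tweet still occupies exactly one line in the file.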

### MongoDB

Below is a simple example of how to add a tweet to a MongoDB.

**Tip**: Make sure MongoDB is running before running this snippet.

In [None]:
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client.example
db.my_collection

Insert a single tweet

In [None]:
db.my_collection.insert_one(tweet).inserted_id

Lookup the single tweet

In [None]:
db.my_collection.find_one()

**Note**: There are additional steps besides the code shown to get MongoDB working. 

## Last Note: Library of Congress Twitter Archive

The Library of Congress and Twitter teamed up in April 2010 to archive every single public tweet. At least, that was the initial plan: the Library of Congress had a change of heart in Dec 2017 and now acquires tweets only on a very selective basis. Here's a journal article on the subject: http://firstmonday.org/article/view/5619/4653#p4. Regardless, the archive has not been made public yet.