# Twitter API

You need to have a Twitter Developer Account before proceeding. If you don't have one, [check out these instructions](https://github.com/caocscar/twitter-create-developer-account).

## Getting Twitter Credentials

Twitter implements OAuth 1.0 as its authentication protocol. You'll need 4 credentials in order to use OAuth and make requests to Twitter's API.

The four credentials (API key, API secret key, access token, and access token secret) are available from the Twitter developer site. To obtain them:

- Go to https://developer.twitter.com/en/apps and create an application. 
- Enter a name/description/website for your application. You can use https://www.google.com for the website. You also need to describe how the app will be used. Agree to the Developer Terms and create your Twitter application.
- Go to the *Keys and tokens* tab. You should be able to see your **Consumer Key** and **Consumer Secret** (these are the API key and API secret key mentioned above).
- At the bottom of the page, click the button to create your own **Access Token** and **Access Token Secret**.

**Note**: You can always revoke/regenerate your credentials or delete the application.
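
For the curious, here is a rough sketch of what a client does with the four credentials under OAuth 1.0: it signs every request with HMAC-SHA1. The function names are my own, and the credential values, nonce, and timestamp are placeholders. In practice you would let Postman or a library do this for you.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value):
    """Percent-encode a string strictly, leaving only unreserved characters."""
    return quote(str(value), safe="")


def sign_request(method, base_url, params, consumer_secret, token_secret):
    """Return the base64-encoded HMAC-SHA1 signature for one request."""
    # 1. Encode every parameter, sort them, and join them as key=value pairs.
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    param_string = "&".join(f"{k}={v}" for k, v in encoded)
    # 2. Build the signature base string: METHOD&URL&PARAMETERS.
    base_string = "&".join(
        [method.upper(), percent_encode(base_url), percent_encode(param_string)]
    )
    # 3. The signing key combines the two secrets.
    signing_key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    digest = hmac.new(signing_key.encode(), base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


# Placeholder values only; a real request needs a fresh nonce and timestamp.
params = {
    "skip_status": "true",
    "oauth_consumer_key": "my-api-key",
    "oauth_token": "my-access-token",
    "oauth_nonce": "abc123",
    "oauth_timestamp": "1558447200",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_version": "1.0",
}
signature = sign_request(
    "GET",
    "https://api.twitter.com/1.1/account/verify_credentials.json",
    params,
    consumer_secret="my-api-secret",
    token_secret="my-access-token-secret",
)
print(signature)
```

The two secrets never travel with the request. Only the signature does, which is why regenerating a secret invalidates every request signed with the old one.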

# Postman Software for Non-Programmers

Here is the [Postman homepage](https://www.getpostman.com/).  
1. Download the software (Mac, Windows or Linux) https://www.getpostman.com/downloads/. I'll be using v7.1.1 for the workshop. 
2. Run the executable to install the software.
3. Launch Postman
4. (optional) You can sign in (using Google or creating a Postman account)

Here is a blog post on how to use Postman with Twitter: https://www.dataneb.com/blog/how-to-make-calls-to-twitter-apis-using-postman-client

### Making a Twitter API Request
1. Open a new GET request tab (if one is not already open)

### Setting up Twitter Credentials

2. Click on the "Authorization" tab. Select "type" as `OAuth 1.0` and "add authorization data to" `Request Headers`.
3. Still on the "Authorization" tab, enter the four Twitter credentials you acquired earlier.
4. Switch back to the "Params" tab.

### Making GET Request
5. Enter the request URL https://api.twitter.com/1.1/account/verify_credentials.json?skip_status=true in the address bar and hit "Send". This endpoint returns information about your Twitter account.
6. You should see it return a response with information in JSON format.
7. You can save the JSON response via the "Download" button.
8. That's it. You've just used one endpoint in the Twitter API.

Q: How did I know what URL to use to make the request above?  
A: I read the API documentation.

The Twitter API reference index can be found at https://developer.twitter.com/en/docs/api-reference-index. This has a list of all APIs available to developers. We can look up our endpoint under "Manage account settings and profile" as `GET account/verify_credentials`.

The documentation tells us the "Resource URL" https://api.twitter.com/1.1/account/verify_credentials.json as well as the parameters we have to pass to it (required or optional). It also provides an example request and example response.

### Example 2
Let's move onto another example. How do we get a list of who is following us? This would be under the "Follow, search, and get users" section under `GET followers/list`. The [documentation tells us](https://developer.twitter.com/en/docs/accounts-and-users/follow-search-get-users/api-reference/get-followers-list) the "Resource URL" is 
https://api.twitter.com/1.1/followers/list.json. There are no required parameters. We get our own account by default but we can specify specific accounts by `screen_name` or `user_id`.
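
To make the link between the documentation and the request explicit, here is a small sketch of how a "Resource URL" and its parameters combine into the URL you paste into Postman. The helper name `build_url` is my own, and the `count` value is only an illustration.

```python
from urllib.parse import urlencode


def build_url(resource_url, **params):
    """Append query parameters to a resource URL."""
    if not params:
        return resource_url
    return f"{resource_url}?{urlencode(params)}"


verify_url = build_url(
    "https://api.twitter.com/1.1/account/verify_credentials.json",
    skip_status="true",
)
followers_url = build_url(
    "https://api.twitter.com/1.1/followers/list.json",
    screen_name="arc_um",
    count=200,
)
print(verify_url)
print(followers_url)
```

Postman builds the same query string for you when you fill in the "Params" tab.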

**Note**: For some reason beyond me, I can't get the tweet search API working with Postman. It's giving me an authentication error I can't resolve. We can get around this issue using a programming language like Python. There might be other endpoints among the 100+ that have similar issues.

# Python for Programmers

We set up our credentials first. Here I am using environment variables to store my credentials. You can hard-code them for the workshop.

In [1]:
import os
consumerKey = os.getenv('consumerkey')
consumerSecret = os.getenv('consumersecret')
oauthToken = os.getenv('accesstoken')
oauthTokenSecret = os.getenv('accesstokensecret')

**Note**: To reiterate, I do NOT recommend hard-coding credentials in a working setting. In Python, you should save each credential in an environment variable and retrieve it with `os.getenv`, thus decoupling the credentials from your code.
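
One thing to watch for: `os.getenv` silently returns `None` when a variable is not set, and the failure only surfaces later as an authentication error. Below is a minimal sketch of a loader that fails early instead. The function name `load_credentials` is my own, and the variable names are the ones used in this workshop.

```python
import os

# Environment variable names used in this workshop.
CREDENTIAL_VARS = ("consumerkey", "consumersecret", "accesstoken", "accesstokensecret")


def load_credentials(env=os.environ):
    """Return the four credentials as a dict, or raise if any are missing."""
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in CREDENTIAL_VARS}


# A stand-in environment so the example runs without real credentials.
fake_env = {
    "consumerkey": "k",
    "consumersecret": "s",
    "accesstoken": "t",
    "accesstokensecret": "ts",
}
creds = load_credentials(fake_env)
print(sorted(creds))
```

Calling `load_credentials()` with no argument reads from the real environment.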

## Installing Python Twitter Tools Module

We will be using the Python Twitter Tools module to interact with the API. Packages like this usually make our lives easier than making direct GET requests, since they handle authentication and other details under the hood. There are many Twitter packages available for Python. I chose this one because it is popular and seemed simple to use (although it lacks good documentation). What documentation there is can be found at https://github.com/sixohsix/twitter.

Install the module from the terminal or command prompt with `pip install twitter`. 

In [2]:
import twitter

Next submit your credentials and create a variable containing your authentication.

In [3]:
auth = twitter.OAuth(oauthToken, oauthTokenSecret, consumerKey, consumerSecret)

# Twitter Search API

How do we search historical tweets? This is under the "Search Tweets" section as `Standard Search API`. The [documentation tells us](https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets.html) the "Resource URL" is https://api.twitter.com/1.1/search/tweets.json. The [Twitter Search overview](https://developer.twitter.com/en/docs/tweets/search/overview) describes what the standard API lacks compared to the **Premium** and **Enterprise** search products.
The Standard Search API searches against a __sampling__ of recent Tweets published in the past 7 days. We will see later how to get around the 7 day restriction by invoking other constraints.

**Note**: The Search API is not an exact replica of the Search feature available in Twitter mobile or web clients such as https://twitter.com/search. 

Let's start by creating a Search API handle using your authentication.

In [4]:
twtr = twitter.Twitter(auth=auth)

## Search using a query

Details on the search API can be found here https://developer.twitter.com/en/docs/tweets/search/api-reference/get-search-tweets.html.

Create a query term and also specify the number of tweets you want (default = 15). Use the `search.tweets` method to search for tweets. A list of possible queries (i.e. standard operators) can be found at https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/standard-operators.html.
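
Standard operators are just text inside the query string, so they can be combined with ordinary string handling. Here is a small sketch using three operators from that list (`from:`, `since:`, and `-filter:retweets`). The helper `build_query` and its argument names are my own.

```python
def build_query(*terms, from_user=None, since=None, exclude_retweets=False):
    """Join search terms and a few standard operators into one query string."""
    parts = list(terms)
    if from_user:
        parts.append(f"from:{from_user}")      # only tweets sent by this account
    if since:
        parts.append(f"since:{since}")         # date as YYYY-MM-DD
    if exclude_retweets:
        parts.append("-filter:retweets")       # drop retweets from the results
    return " ".join(parts)


query = build_query("#UMTweetCon2019", from_user="arc_um", exclude_retweets=True)
print(query)
```

The resulting string can be passed as the `q` argument of `search.tweets`.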

In [6]:
query = '#UMTweetCon2019' # from:arc_um
limit = 2
results = twtr.search.tweets(q=query, count=limit)
results

{'statuses': [{'created_at': 'Mon May 20 14:09:04 +0000 2019',
   'id': 1130475533102247936,
   'id_str': '1130475533102247936',
   'text': "RT @umisr: #UMTweetCon2019 is this week! There's still time to register. Connect with other @UMich scholars in an interdisciplinary exchang…",
   'truncated': False,
   'entities': {'hashtags': [{'text': 'UMTweetCon2019', 'indices': [11, 26]}],
    'symbols': [],
    'user_mentions': [{'screen_name': 'umisr',
      'name': 'U-M ISR',
      'id': 217437553,
      'id_str': '217437553',
      'indices': [3, 9]},
     {'screen_name': 'UMich',
      'name': 'University of Michigan',
      'id': 88836132,
      'id_str': '88836132',
      'indices': [92, 98]}],
    'urls': []},
   'metadata': {'iso_language_code': 'en', 'result_type': 'recent'},
   'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
   'in_reply_to_status_id': None,
   'in_reply_to_status_id_str': None,
   'in_reply_to_user_id': None,
   'in_reply_to_user_id

Apparently, there is a lot of data associated with a single short tweet.

## User Info

The results are stored in the `statuses` key. Here's how you would access some interesting fields about the Tweeter.

In [11]:
for tweet in results['statuses']:
    print(f"Name: {tweet['user']['name']}")
    print(f"ScreenName: {tweet['user']['screen_name']}")
    print(f"Description: {tweet['user']['description']}")
    print(f"Location: {tweet['user']['location']}")
    print(f"# Tweets: {tweet['user']['statuses_count']}")        
    print(f"# Following: {tweet['user']['friends_count']}")        
    print(f"# Followers: {tweet['user']['followers_count']}")
    print(f"# Likes: {tweet['user']['favourites_count']}")
    print(f"# Lists: {tweet['user']['listed_count']}")
    print('\n')

Name: SEHresearch
ScreenName: SEHresearch
Description: The Social Environment & Health Program (SEH) at The University of Michigan's @umich Institute for Social Research @umisr #SEHresearch
Location: 
# Tweets: 42
# Following: 89
# Followers: 53
# Likes: 50
# Lists: 0


Name: UM SRC
ScreenName: UM_SRC
Description: Survey Research Center is an international leader in research involving the collection and analysis of sample surveys, administrative and other non-survey data.
Location: Ann Arbor, MI
# Tweets: 1851
# Following: 545
# Followers: 896
# Likes: 470
# Lists: 13




## Tweet Info

Here's how you would access info about the tweet itself.

In [18]:
for tweet in results['statuses']:
    print(f"Created at: {tweet['created_at']}")       
    print(f"Text: {tweet['text']}")
    print(f"Source: {tweet['source']}")
    for hashtag in tweet['entities']['hashtags']:
        print(f"Hashtag: {hashtag['text']}")
    print(f"Likes: {tweet['favorite_count']}")
    print(f"Retweets: {tweet['retweet_count']}")        
    print(f"Retweet: {tweet['retweeted']}")
    print(f"Coordinates: {tweet['coordinates']}")
    place = tweet['place']
    if place is not None:
        print(f"Place Name: {place['full_name']}")
        print(f"Place Type: {place['place_type']}")
        print(f"Place Bounding Box: {place['bounding_box']['coordinates']}")
    print('\n')

Created at: Mon May 20 14:09:04 +0000 2019
Text: RT @umisr: #UMTweetCon2019 is this week! There's still time to register. Connect with other @UMich scholars in an interdisciplinary exchang…
Source: <a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>
Hashtag: UMTweetCon2019
Likes: 0
Retweets: 2
Retweet: False
Coordinates: None


Created at: Mon May 20 13:00:40 +0000 2019
Text: RT @umisr: #UMTweetCon2019 is this week! There's still time to register. Connect with other @UMich scholars in an interdisciplinary exchang…
Source: <a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>
Hashtag: UMTweetCon2019
Likes: 0
Retweets: 2
Retweet: False
Coordinates: None
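
The `created_at` field is a plain string. If you want to sort or subtract timestamps, you can convert it to a `datetime` first. This is a small sketch with a helper name of my own; it assumes an English locale, since `%a` and `%b` parse day and month names.

```python
from datetime import datetime

# Layout of the created_at strings shown in the output above.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def parse_created_at(created_at):
    """Convert a created_at string into a timezone-aware datetime."""
    return datetime.strptime(created_at, TWITTER_TIME_FORMAT)


first = parse_created_at("Mon May 20 14:09:04 +0000 2019")
second = parse_created_at("Mon May 20 13:00:40 +0000 2019")
print(first.isoformat())
print(first - second)  # time elapsed between the two tweets
```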




## Search tweets by users located within a given radius of a GPS point

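Add the `geocode` argument to the `search.tweets` method. Its value is a `latitude,longitude,radius` string with no spaces, where the radius is given in `km` or `mi`.

In [24]:
query = 'food' 
limit = 5
georesults = twtr.search.tweets(q=query, count=limit, geocode="42.7,-83.3,10km")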

Let's print out the tweet along with the datetime and coordinates information.

In [25]:
for i, tweet in enumerate(georesults['statuses']):
    print(i, tweet['text'])
    print(tweet['created_at'], tweet['coordinates'])
    print('\n')

0 I'm HERE for fast food restaurants making fun of each other lmao https://t.co/ITFgtxHzmM
Tue May 21 14:39:41 +0000 2019 None


1 @AidaraRose I’ll give it a few more weeks.. but yea, I think about food when I can’t eat. And when I can, I get worried about eating enough.
Tue May 21 14:39:41 +0000 2019 None


2 nem tudom mennyire ismerhetitek (pár éve voltak x-faktorban) ezt a formációt, de voltak itt zenélni food truck show… https://t.co/hKE7rVrzvz
Tue May 21 14:39:41 +0000 2019 None


3 Cajun Kitchen Chaos — Steemit https://t.co/PIvKHwYSOm
Tue May 21 14:39:41 +0000 2019 None




Q: How do I know what methods and arguments to use?  
A: I read the API documentation.

This URL has a list of API endpoints for Twitter: https://developer.twitter.com/en/docs/api-reference-index. It is also a good jumping-off point for getting to the documentation. *Bookmark it!*

## Get a list of Followers

Use the `followers.list` method to get a list of followers for a given user. It returns 20 results by default. You can request up to 200 per call using the `count` argument.

In [36]:
followers = twtr.followers.list(screen_name='arc_um', skip_status=True, include_user_entities=False)
followers

{'users': [{'id': 1097880395209687040,
   'id_str': '1097880395209687040',
   'name': 'Matt - On Earth',
   'screen_name': 'notinorbit',
   'location': '',
   'url': None,
   'description': 'New account, old me. InfoSec pro @umisr. Opinions are my own.',
   'protected': False,
   'followers_count': 13,
   'friends_count': 46,
   'listed_count': 0,
   'created_at': 'Tue Feb 19 15:27:38 +0000 2019',
   'favourites_count': 549,
   'utc_offset': None,
   'time_zone': None,
   'geo_enabled': False,
   'verified': False,
   'statuses_count': 98,
   'lang': 'en',
   'contributors_enabled': False,
   'is_translator': False,
   'is_translation_enabled': False,
   'profile_background_color': 'F5F8FA',
   'profile_background_image_url': None,
   'profile_background_image_url_https': None,
   'profile_background_tile': False,
   'profile_image_url': 'http://pbs.twimg.com/profile_images/1097880644527505409/srhSVc6j_normal.jpg',
   'profile_image_url_https': 'https://pbs.twimg.com/profile_images/109

Print out the __screen name__ and the __description__ of the followers.

In [37]:
for i, user in enumerate(followers['users']):
    print(i, user['screen_name'], user['description'])

0 notinorbit New account, old me. InfoSec pro @umisr. Opinions are my own.
1 fountain79 Dreamer. Planner. Community Builder. Encourager. Singer. Dancer. Dog lover. Enthusiastically Mediocre Golfer.
2 pfschus President's Postdoctoral Fellow in Michigan NERS department. Interested in nuclear science, radiation detectors, national security, running, and pizza 🍕
3 willard125 Family man, world traveler, runner, home brewer, gamer. Torn between Colorado & Michigan.
4 CSMissThomas1 Miss Thomas 🌼 Computer Science Teacher 👩🏽‍🏫 Running, Countryside & Dog Walks with my Frenchie 🐾  Currently Second in Dept, new adventure from September 💭
5 UM_MICDE MICDE is the focal point for the wide spectrum of research in computational science and engineering at the University of Michigan.
6 TheWebConf The Web Conference Series (formerly WWW) ||  #TheWebConf
7 umichCREES Center for Russian, East European, and Eurasian Studies at the University of Michigan - including Copernicus Program in Polish Studies as #UM

To iterate beyond 20 (or whatever you initially asked for), use the key `next_cursor` from the initial result along with the `cursor` argument. Think of it as Page 2. Repeat as necessary (see example below for complete acquisition).

In [39]:
followers = twtr.followers.list(screen_name='arc_um', cursor=followers['next_cursor'], skip_status=True, include_user_entities=False)
for i, user in enumerate(followers['users'], start=i+1):
    print(i, user['screen_name'], user['description'])

40 GlobalDAKS Data Analytics & Knowledge Sharing - that is what Global DAKS stands for. We believe knowledge is a common good.
41 m_cragin Exec Director, Midwest Big Data Hub
RTs ≠ endorsement.
42 mmdelc Profesor/Investigador @TecdeMonterrey @LaSalleMX Research Affiliate @citrisnews Fellow @UCBerkeley DigitalMethods & Datalover. Tiro punterazos @apuntesderabona
43 _Sherwani__ 
44 BrendanNyhan @UMich @fordschool @umisrcps professor, @UpshotNYT contributor, @BrightLineWatch co-organizer. Before: @dartmouth / @CJR / Spinsanity / All the President's Spin
45 shilpipatla 
46 sharmavansh99 Grad Student at @UMich | Worked as an R&D Engineer | Formula1 Fanatic| LewisHamilton44|  #GoBlue #UMGrad #Wolverine
47 JohnGMcNutt I teach and do research on online social action. The views expressed  are mine alone and do not necessarily reflect the views of the University of Delaware
48 bryant1410 🇺🇾 CS PhD student @UMich advised by @radamihalcea. Passionate about Natural Language Processing #NLProc and M
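
The cursor pattern can be wrapped in a generator so the paging logic lives in one place. This is a minimal sketch, not part of the Python Twitter Tools package: `iter_users` is a name of my own, and the fake pages stand in for the API so the example runs offline.

```python
def iter_users(fetch_page):
    """Yield users from every page of a cursored endpoint.

    fetch_page is any callable that takes a cursor and returns a dict
    with 'users' and 'next_cursor' keys.
    """
    cursor = -1              # -1 asks for the first page
    while cursor != 0:       # a next_cursor of 0 means there are no more pages
        page = fetch_page(cursor)
        yield from page["users"]
        cursor = page["next_cursor"]


# Made-up pages keyed by cursor value, standing in for real API responses.
FAKE_PAGES = {
    -1: {"users": [{"screen_name": "a"}, {"screen_name": "b"}], "next_cursor": 42},
    42: {"users": [{"screen_name": "c"}], "next_cursor": 0},
}

names = [user["screen_name"] for user in iter_users(FAKE_PAGES.__getitem__)]
print(names)
```

With the real API, `fetch_page` could be `lambda c: twtr.followers.list(screen_name='arc_um', count=200, cursor=c)`.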

## Get a list of Following

Use the `friends.list` method to see who a given user is following. It returns 20 results by default. You can request up to 200 per call using the `count` argument.

In [40]:
following = twtr.friends.list(screen_name='arc_um', skip_status=True, include_user_entities=False)
for i, user in enumerate(following['users']):
    print(i, user['screen_name'], user['description'])

0 umsi The University of Michigan School of Information creates and shares knowledge so that people will use information -- with technology -- to build a better world.
1 chihiroABO Researcher @UMich,@UM_Genetics. Engineer turned Geneticist,Data Scientist. Troublemaker/Troubleshooter. Dog lover. Fave color:Blue. Opinions of my own...
2 EdwardTufte Statistician,visualizer,artist. Professor PoliSci, Statistics,CompSci Yale+Princeton 33 years. Founded Graphics Press, Hogpen Hill Tree Farms, ET Modern Gallery
3 UMichLaw The world needs lawyers. And Leaders. And Victors. This is where they are made. #LeadersAndBest   • http://fb.me/umichlaw  (Retweets not endorsements.)
4 CBSSM CBSSM focuses on clinical & research ethics, health communication & decision making, CBPR, and ELSI of genomics and much more.
5 KayteSB Law & bioethics, Assistant Professor @umichmedicine & @CBSSM. Former @bioethicsgov. @PennLaw @UPenn_MedEthics @Middlebury alumna #researchethics (views own)
6 WNicholsonPrice Innovat

To access "Page 2" of the results, use the `cursor` argument in the same way as in the followers example.

## Cross Reference Followers and Following

Here is some example code to cross reference the two lists to see which relationships are reciprocated.

This function appends a list of users to an existing list.

In [41]:
def append_users(f, list_users):
    for user in f['users']:
        list_users.append(user['screen_name'])
    return list_users

Grab entire list of followers.

In [42]:
followers = {}
followers['next_cursor'] = -1
lemmings = []
while (followers['next_cursor'] != 0):
    followers = twtr.followers.list(screen_name='arc_um', count=200, cursor=followers['next_cursor'])
    lemmings = append_users(followers, lemmings)
print(f'There are {len(lemmings)} followers')

There are 595 followers


In [43]:
lemmings[:10]

['notinorbit',
 'fountain79',
 'pfschus',
 'willard125',
 'CSMissThomas1',
 'UM_MICDE',
 'TheWebConf',
 'umichCREES',
 'occhiphaura',
 'mrmiller1972']

Grab entire list of following.

In [46]:
following = {}
following['next_cursor'] = -1
leaders = []
while (following['next_cursor'] != 0):
    following = twtr.friends.list(screen_name='arc_um', count=200, cursor=following['next_cursor'])
    leaders = append_users(following, leaders)
print(f'Following {len(leaders)} accounts.')

Following 186 accounts.


Find the intersection of the two groups using set operations.

In [45]:
lemmings = set(lemmings)
leaders = set(leaders)
relationship = leaders.intersection(lemmings)
print(len(relationship), relationship)

86 {'UMLifeSciences', 'denizbilman', 'MichEnergy', 'NYUDataScience', 'aulia_khamas', 'IllinoisIHSI', 'francescadomin8', 'astrocurry', 'ncm140', 'NUITResearch', 'dmaletta', 'umsi', 'tfinholt', 'emilymprovost', 'cjantonelli', 'BrendanNyhan', 'umichmedicine', 'urcmich', 'MrBobbyTables', 'UMichiganAI', 'BonezNQuality', 'PurdueRCAC', 'YvonneW6463', 'uwescience', 'AmberHarmon', 'UMichResearch', 'BloodFlowSim', 'KirkDBorne', 'danabrunson', 'umisr', 'peh3iii', 'uclbdi', 'sgowtham', 'SBroudeGeva', 'oliviawalch', 'DanEklund_UMich', 'jesse_caps', 'UMich_CRLT', 'SciNode', 'thanss', 'jhallum', 'JohnsenSCFPL', 'umichFFMI', 'QLogic', 'UMich', 'YottabyteLLC', 'ARCTS_UM', 'nanoHUBnews', 'PRACE_RI', 'UMengineering', 'EECSatMI', 'ljdursi', 'UMPublicAffairs', 'nswigginton', 'CASC_HPC', 'tonymarkel', 'UM_IHPI', 'umichgradschool', 'russellfunk', 'SOCRedu', 'icermsu', 'brockpalen', 'sciencegateways', 'TDataScience', 'UM_MiCHAMP', 'umichTECH', 'schelcj', 'chihiroABO', 'umichrna', 'mdst_umich', 'dankessler', '
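
The same set operations also answer the opposite questions, such as who follows us without being followed back. Here is a short sketch on made-up sample sets rather than the real lists.

```python
# Made-up samples standing in for the full follower and following lists.
followers = {"umsi", "UMich", "notinorbit"}     # accounts that follow us
following = {"umsi", "UMich", "EdwardTufte"}    # accounts we follow

mutual = followers & following               # reciprocated relationships
fans_only = followers - following            # they follow us, we don't follow them
not_followed_back = following - followers    # we follow them, they don't follow us

print(len(mutual), len(fans_only), len(not_followed_back))
```

The `&` operator is shorthand for the `intersection` method used above, and `-` is shorthand for `difference`.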

## Search for Trends by Where on Earth (WOE) ID

Use the `trends.place` method to get a list of trending topics for a given location. The location is specified using the WOE ID.  
A WOE ID is a unique identifier for a place on Earth. 

Here is a dictionary of places and WOE IDs.

In [47]:
woeid = {'World':1, 'USA':23424977, 'San Francisco':2487956, 'Los Angeles':2442047,
         'Canada':23424775, 'Toronto':4118, 'Montreal':3534,
         'United Kingdom':23424975, 'Germany':23424829}

Searching for trends by WOE ID is analogous to searching for YouTube videos by Region Code.

In [48]:
woe_trends = twtr.trends.place(_id=woeid['Canada'])

In [49]:
woe_trends

[{'trends': [{'name': 'Bookie',
    'url': 'http://twitter.com/search?q=Bookie',
    'promoted_content': None,
    'query': 'Bookie',
    'tweet_volume': None},
   {'name': '#CollisionConf',
    'url': 'http://twitter.com/search?q=%23CollisionConf',
    'promoted_content': None,
    'query': '%23CollisionConf',
    'tweet_volume': None},
   {'name': '#Collision2019',
    'url': 'http://twitter.com/search?q=%23Collision2019',
    'promoted_content': None,
    'query': '%23Collision2019',
    'tweet_volume': None},
   {'name': '#TuesdayThoughts',
    'url': 'http://twitter.com/search?q=%23TuesdayThoughts',
    'promoted_content': None,
    'query': '%23TuesdayThoughts',
    'tweet_volume': 55431},
   {'name': 'Dave Bookman',
    'url': 'http://twitter.com/search?q=%22Dave+Bookman%22',
    'promoted_content': None,
    'query': '%22Dave+Bookman%22',
    'tweet_volume': None},
   {'name': '#DowntonAbbeyFilm',
    'url': 'http://twitter.com/search?q=%23DowntonAbbeyFilm',
    'promoted_conte

Print out the top 10 trending topics with their tweet volumes.

In [50]:
trends = woe_trends[0]['trends']
for trend in trends[:10]:
    print(trend['name'], trend['tweet_volume'])

Bookie None
#CollisionConf None
#Collision2019 None
#TuesdayThoughts 55431
Dave Bookman None
#DowntonAbbeyFilm None
#TuesdayMotivation 31434
Nunavut None
1 in 4 Canadians None
Naufrage None
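
Many trends come back with a `tweet_volume` of `None`, which makes a plain sort fail. Below is a small sketch that ranks trends by volume while treating a missing volume as zero. The sample list copies a few entries from the output above.

```python
# A few entries copied from the output above.
trends = [
    {"name": "Bookie", "tweet_volume": None},
    {"name": "#TuesdayThoughts", "tweet_volume": 55431},
    {"name": "#TuesdayMotivation", "tweet_volume": 31434},
    {"name": "Nunavut", "tweet_volume": None},
]

# 'or 0' replaces None with 0 so every key is comparable.
ranked = sorted(trends, key=lambda trend: trend["tweet_volume"] or 0, reverse=True)
for trend in ranked:
    print(trend["name"], trend["tweet_volume"])
```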


Here is a URL detailing the WOE IDs supported by Twitter.  
https://twittercommunity.com/t/what-are-the-list-of-woeids-supported-by-twitter/8493/2.

**Note**: Not all WOE IDs are supported by Twitter. Places like Ann Arbor or Michigan are not. Apparently not important enough :(

## Submit Your Own Tweet

If you want to become an *evil trolling TwitterBot*, this is the first step towards darkness (although that would probably violate the use case you signed up for). Use the `statuses.update` method to start tweeting from Python.

In [None]:
#twtr.statuses.update(status="Python BotTweeting from the Twitter API workshop. Thanks #CSCAR, #UMTweetCon2019, @ARC_UM")

Tweeting parameters can be found at  
https://developer.twitter.com/en/docs/tweets/post-and-engage/api-reference/post-statuses-update

If you've been counting, we've touched upon 6+ API endpoints out of the 100+ endpoints available.

## Rate Limits

Usage of the Twitter API is subject to rate limits, which vary by endpoint.  
Details can be found here at https://developer.twitter.com/en/docs/basics/rate-limits.
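
Twitter reports the current limit in the response headers `x-rate-limit-limit`, `x-rate-limit-remaining`, and `x-rate-limit-reset` (a Unix timestamp). As a rough sketch, a script can use them to decide how long to pause. The function name is my own and the header values are made up. How you get at the headers depends on the client you use, which isn't covered here.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request (0 if none is needed)."""
    now = time.time() if now is None else now
    remaining = int(headers.get("x-rate-limit-remaining", 1))
    reset_at = int(headers.get("x-rate-limit-reset", 0))
    if remaining > 0:
        return 0                      # requests left in this window, no wait
    return max(0, reset_at - now)     # wait until the window resets


# Made-up header values for illustration.
headers = {
    "x-rate-limit-limit": "15",
    "x-rate-limit-remaining": "0",
    "x-rate-limit-reset": "1558447500",
}
wait = seconds_until_reset(headers, now=1558447200)
print(wait)
```

A loop that pages through a long follower list could call `time.sleep(wait)` before asking for the next page.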

# Twitter Streaming API

The streaming API is used to collect tweets from the future. The streaming API provides a **sample** of the available Tweets. Polling and rate limits do NOT apply to the streaming API.

See the [overview on how to Filter Realtime Tweets](https://developer.twitter.com/en/docs/tweets/filter-realtime/overview) to see what the standard API is lacking compared to the **Enterprise** version.

Details on the public data streaming API can be found here at 
https://developer.twitter.com/en/docs/tweets/filter-realtime/api-reference/post-statuses-filter

This code block is here so we can quickly restart the kernel if the streaming API is complaining about `Exceeded connection limit for user`.

In [None]:
import os
import twitter

consumerKey = os.getenv('consumerkey')
consumerSecret = os.getenv('consumersecret')
oauthToken = os.getenv('accesstoken')
oauthTokenSecret = os.getenv('accesstokensecret')

auth = twitter.OAuth(oauthToken, oauthTokenSecret, consumerKey, consumerSecret)

We need to create a new API handle for streaming using our authentication. Only one standing connection per account is allowed to a public endpoint. We'll be using the public stream API, which is specified in the `domain` argument.

In [52]:
twtr_stream = twitter.TwitterStream(auth=auth, domain="stream.twitter.com")

**Note**: You can NOT request every future Tweet through this API. That is referred to as the Firehose. It costs a lot of `$$$$$$$$`.

## Search by Filter

Stream searches are done with a comma-delimited list of phrases. A phrase may consist of one or more terms separated by spaces. Term ordering is ignored and searches are not case sensitive.
 
spaces == logical ANDs (e.g. `"Alex twitter" == "alex AND twitter"`)  
commas == logical ORs (e.g. `"Alex, twitter" == "Alex OR twitter"`)

The text of the Tweet and some entity fields are considered for matches. Specifically:
- the `text` attribute of the Tweet
- `expanded_url` and `display_url` for links and media
- `text` for hashtags
- and `screen_name` for user mentions
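
To make the AND/OR rules concrete, here is a simplified local imitation of the matching. It is only a sketch of the semantics described above, not Twitter's actual matcher: it looks at the tweet text alone and uses a naive word split. The function name is my own.

```python
import re


def matches_track(track, text):
    """Return True if text satisfies a comma-delimited track string."""
    words = set(re.findall(r"\w+", text.lower()))
    for phrase in track.split(","):          # commas separate alternatives (OR)
        terms = phrase.lower().split()       # spaces separate required terms (AND)
        if terms and all(term in words for term in terms):
            return True
    return False


print(matches_track("Alex twitter", "Twitter hired Alex"))   # both terms present
print(matches_track("Alex twitter", "Alex joined the team")) # 'twitter' is missing
print(matches_track("Alex, twitter", "Alex joined the team"))# one alternative matches
```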

Use the `statuses.filter` method to create a streaming query.

In [76]:
iterator = twtr_stream.statuses.filter(track="tennis, michigan")

Use a `for` loop to get the generator to yield future results as they come in. I'm printing the fields (where applicable) that are being searched, plus the time. The `break` statement stops the loop after five tweets so that it doesn't run indefinitely.

**Tip**: Use the stop button in the toolbar to interrupt the stream if it is taking too long.

In [77]:
for i, tweet in enumerate(iterator):   
    print(f"{i} Time: {tweet['created_at']}")
    print(f"Tweet: {tweet['text']}")
    print(f"Tweet URL: https://twitter.com/{tweet['user']['screen_name']}/status/{tweet['id']}")
    if len(tweet['entities']['hashtags']) > 0:
        print(f"Hashtags: {tweet['entities']['hashtags'][0]['text']}")
    print("\n")
    if i > 3:
        break

0 Time: Tue May 21 18:00:55 +0000 2019
Tweet: Sale of Dan Gilbert's Greektown Casino to Penn National approved by Michigan gaming regulators, transfer will happe… https://t.co/BI8XpoEJ3d
Tweet URL: https://twitter.com/howardstutz/status/1130896266492178432


1 Time: Tue May 21 18:00:55 +0000 2019
Tweet: RT @br_CBB: Heat assistant and former Wolverine Juwan Howard is expected to be named Michigan head coach, per @miaheatbeat https://t.co/yfw…
Tweet URL: https://twitter.com/PrinceOFtheFAM/status/1130896267070939137


2 Time: Tue May 21 18:00:56 +0000 2019
Tweet: RT @Madison_Keys: Venus &amp; Serena Williams were both huge inspirations for me to play tennis. What they’ve done on court is incredible. What…
Tweet URL: https://twitter.com/mootlik/status/1130896268220391424


3 Time: Tue May 21 18:00:57 +0000 2019
Tweet: RT @br_CBB: Heat assistant and former Wolverine Juwan Howard is expected to be named Michigan head coach, per @miaheatbeat https://t.co/yfw…
Tweet URL: https://twitter.com/Ge

More details on the `track` request parameter can be found at https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters.

## Search by Location

Use the `locations` argument to specify a bounding box to search. The box is written as two longitude,latitude pairs, with the southwest corner first and the northeast corner second. The API returns tweets whose location intersects the bounding box. The example below uses a bounding box around New York City.

In [79]:
iterator = twtr_stream.statuses.filter(locations="-74,40,-73,41")
for i, tweet in enumerate(iterator, start=1):
    print('Time: {}'.format(tweet['created_at']))
    print('Tweet: {}'.format(tweet['text']))
    print('Coordinates: {}'.format(tweet['coordinates']))
    if tweet['place'] is not None:
        print('BoundingBox: {}'.format(tweet['place']['bounding_box']['coordinates']))
        print('Name: {}'.format(tweet['place']['full_name']))
        print('Type: {}'.format(tweet['place']['place_type']))
        print('Place ID: {}\n'.format(tweet['place']['id']))
    else:
        print('')
    if i > 4:
        break

Time: Tue May 21 18:01:45 +0000 2019
Tweet: Lol a mi me pasa alrevessss
Coordinates: None
BoundingBox: [[[-74.026675, 40.683935], [-74.026675, 40.877483], [-73.910408, 40.877483], [-73.910408, 40.683935]]]
Name: Manhattan, NY
Type: city
Place ID: 01a9a39529b27f36

Time: Tue May 21 18:01:46 +0000 2019
Tweet: @BolandSays White guys w dreads who use natural deodorant once a month and can’t find the 1-3 or 2-4 even if you on… https://t.co/N2KJLrFgBH
Coordinates: None
BoundingBox: [[[-74.026675, 40.683935], [-74.026675, 40.877483], [-73.910408, 40.877483], [-73.910408, 40.683935]]]
Name: Manhattan, NY
Type: city
Place ID: 01a9a39529b27f36

Time: Tue May 21 18:01:46 +0000 2019
Tweet: @cfromhertz Not a bad time to look at a beaten up Lithium name $ALB
Coordinates: None
BoundingBox: [[[-73.962582, 40.541722], [-73.962582, 40.800037], [-73.699793, 40.800037], [-73.699793, 40.541722]]]
Name: Queens, NY
Type: city
Place ID: 00c39537733fa112

Time: Tue May 21 18:01:49 +0000 2019
Tweet: The devil i
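
None of the tweets above carry exact coordinates. They match because the bounding box of their `place` overlaps the box we asked for. The sketch below shows that overlap test, plus a helper for building the `locations` string. Both function names are my own, and the Ann Arbor box is a rough approximation used only for contrast.

```python
def to_locations(southwest, northeast):
    """Format two (longitude, latitude) corners as a locations string."""
    # ':g' drops trailing zeros and keeps six significant digits.
    return ",".join(f"{value:g}" for value in (*southwest, *northeast))


def boxes_intersect(a, b):
    """Each box is (west, south, east, north) in degrees."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


nyc = (-74, 40, -73, 41)
manhattan = (-74.026675, 40.683935, -73.910408, 40.877483)   # from the output above
queens = (-73.962582, 40.541722, -73.699793, 40.800037)      # from the output above
ann_arbor = (-83.8, 42.2, -83.67, 42.33)                     # approximate

print(to_locations((-74, 40), (-73, 41)))
print(boxes_intersect(nyc, manhattan), boxes_intersect(nyc, queens))
print(boxes_intersect(nyc, ann_arbor))
```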

Bounding boxes act like OR operators. They do not filter `track` parameters. So the following returns tweets that either mention football OR come from NYC.

In [80]:
iterator = twtr_stream.statuses.filter(track="football", locations="-74,40,-73,41")
for i, tweet in enumerate(iterator, start=1):
    print('Time: {}'.format(tweet['created_at']))
    print('Tweet: {}'.format(tweet['text']))
    print('Coordinates: {}'.format(tweet['coordinates']))
    if tweet['place'] is not None:
        print('BoundingBox: {}'.format(tweet['place']['bounding_box']['coordinates']))
        print('Name: {}'.format(tweet['place']['full_name']))
        print('Type: {}'.format(tweet['place']['place_type']))
        print('ID: {}\n'.format(tweet['place']['id']))
    else:
        print('')
    if i > 4:
        break

Time: Tue May 21 18:02:51 +0000 2019
Tweet: @davidtabrown Big Lebowski
Coordinates: None
BoundingBox: [[[-79.76259, 40.477383], [-79.76259, 45.015851], [-71.777492, 45.015851], [-71.777492, 40.477383]]]
Name: New York, USA
Type: admin
ID: 94965b2c45386f87

Time: Tue May 21 18:02:51 +0000 2019
Tweet: RT @City_Chief: Javier Tebas (La Liga President): 

“Clubs should get their income out of revenue they generate. The problem with City and…
Coordinates: None

Time: Tue May 21 18:02:51 +0000 2019
Tweet: RT @Guillaumemp: 🚨En exclusivité pour @Eurosport_FR, j’ai évoqué avec Roberto Mancini l’arrivée de Sylvinho à l’@OL. Et il est convaincu qu…
Coordinates: None

Time: Tue May 21 18:02:51 +0000 2019
Tweet: @piersmorgan @UEFA It comes to something when @piersmorgan is correct. As a football fan it is wrong any player is… https://t.co/dZz8pLID9S
Coordinates: None

Time: Tue May 21 18:02:51 +0000 2019
Tweet: RT @LidodaduvhaFans: Throwback Tuesday, we choosing @Masandawana the 🇿🇦 football bosses.


More details on the `locations` request parameter can be found at https://developer.twitter.com/en/docs/tweets/filter-realtime/guides/basic-stream-parameters.
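
Since `track` and `locations` combine as OR, one way to get AND behaviour is to stream by location only and filter the text yourself. Here is a minimal sketch of that idea. The helper name is my own, the sample tweets are stand-ins (the second one is made up), and the substring test is cruder than Twitter's own matching.

```python
from itertools import islice


def mentioning(tweets, keyword):
    """Lazily keep only the tweets whose text contains the keyword."""
    keyword = keyword.lower()
    # .get() guards against stream messages that have no 'text' field.
    return (tweet for tweet in tweets if keyword in tweet.get("text", "").lower())


# Stand-ins for tweets arriving from a locations-only stream.
sample_stream = [
    {"text": "Lol a mi me pasa alrevessss"},
    {"text": "Football season cannot come soon enough"},
    {"text": "Not a bad time to look at a beaten up Lithium name $ALB"},
    {"timeout": True},
]

matches = list(islice(mentioning(sample_stream, "football"), 5))
for tweet in matches:
    print(tweet["text"])
```

With the real API, `sample_stream` would be replaced by `twtr_stream.statuses.filter(locations="-74,40,-73,41")`, and `islice` plays the role of the `break` in the loops above.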

## Twitter API Status

Status of the API can be found at https://api.twitterstat.us/

## Saving the Tweets

Once you have the tweets in hand, you can save them in JSON format to a:
1. text file
2. NoSQL database (MongoDB seems to be a popular choice)

We won't cover saving in detail here because it is non-trivial to set up a MongoDB database (and it requires admin privileges) within this workshop.

### Text File

To save a single tweet to a text file, use the `dumps` function from the `json` module together with standard Python file I/O.

In [None]:
import json
with open('tweet.txt','w') as fout:
    fout.write(json.dumps(tweet, indent=2))
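
For more than one tweet, a common convention is to write one JSON object per line (often called JSON Lines), so the file can be read back one tweet at a time. The sketch below assumes that format. The two tweets are minimal stand-ins, and the file goes into a temporary directory so the example cleans up easily.

```python
import json
import os
import tempfile

# Minimal stand-ins for real tweet dictionaries.
tweets = [{"id": 1, "text": "first"}, {"id": 2, "text": "second"}]

path = os.path.join(tempfile.mkdtemp(), "tweets.jsonl")

# Write one tweet per line (no indent, so each object stays on a single line).
with open(path, "w", encoding="utf-8") as fout:
    for tweet in tweets:
        fout.write(json.dumps(tweet) + "\n")

# Read the tweets back, one line at a time.
with open(path, encoding="utf-8") as fin:
    loaded = [json.loads(line) for line in fin]

print(len(loaded))
```

Opening the file with mode `'a'` instead of `'w'` lets you keep appending tweets as they arrive from the streaming API.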

### MongoDB

Below is a simple example of how to add a tweet to a MongoDB collection.

**Tip**: Make sure MongoDB is running before running this snippet.

In [None]:
import pymongo
client = pymongo.MongoClient("localhost", 27017)
db = client.example
db.my_collection

Insert a single tweet

In [None]:
db.my_collection.insert_one(tweet).inserted_id

Lookup the single tweet

In [None]:
db.my_collection.find_one()

**Note**: There are additional steps besides the code shown to get MongoDB working. 

## Last Note: Library of Congress Twitter Archive

The Library of Congress and Twitter teamed up in April 2010 to archive every single public tweet. That was the initial plan, at least: the Library of Congress had a change of heart in December 2017 and now acquires tweets only on a very selective basis. Regardless, the archive has not been made public yet. Here's a journal article on the subject: http://firstmonday.org/article/view/5619/4653#p4