# Introduction
In this module, we will discuss issues related to accessing data via web services. Modern websites and applications use Application Programming Interfaces (APIs) to retrieve and store data in a web environment such as a browser. Most of these transactions happen behind the scenes, but because they occur in a web environment, they are accessible through any traditional internet channel. In this module, we will look at web services in the context of Twitter's API. Though much of this module is specific to Twitter, the process is common to many other web services.

# Web Services
For most web services, you must register with the service before you can access its data. Additionally, in Python, many web services have community-developed libraries that simplify the process of connecting to the service and downloading data from it.
This document is organized in a way that acknowledges these facts. First, I illustrate how to create a Twitter developer account. Then, I show how to use a community-built library (Tweepy) to connect to Twitter’s API.
## Creating a Twitter Developer Account
Start the application at https://developer.twitter.com/en/apply-for-access.
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-01.png?1" width="50%" /><br>
Log in with your Twitter account and apply for developer access.
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-02.png?1" width="50%" /><br>
Request academic access (student). 
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-03.png?1" width="50%" /><br>
When asked to describe your reasons for wanting developer access, use my answers as a template for your own answers. Be sure to indicate that we will not be writing applications that tweet, retweet or like content and that we will only be using data for aggregate purposes.
**My answers…**
1. I am using Twitter's API to learn about python programming. I plan to analyze tweets to learn about data collection and semantic analysis. I will learn how to connect to Twitter's API, download tweets and extract and aggregate information. My solution will not involve tweeting, retweeting or liking. No individual tweets will be displayed. All data will be presented in aggregate.
2. I will use Twitter's API to learn how to provide summary statistics. I do not intend to share the information and these exercises are for instructional purposes only.
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-04.png?1" width="50%" /><br>
Confirm your selections and select ‘Looks Good!’.
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-05.png?1" width="50%" /><br>
Agree to the terms and conditions and click ‘Submit Application’.
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-06.png?1" width="50%" /><br>
You should be granted access very quickly, within the hour at the latest. Check your email.

Log in to your developer account at https://developer.twitter.com.
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-07.png?1" width="50%" /><br>
Click ‘Dashboard’ and create a new app.
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-08.png?1" width="50%" /><br>
**My answers…**
1. IS 352 Demo - Fall 2020
2. This is a student application to demo connecting to Twitter’s API.
3. Website URL: https://www.loyola.edu/sellinger-business/academics/departments/information-systems-law-and-operations
4. I am using Twitter's API to learn about python programming. I plan to analyze tweets to learn about data collection and semantic analysis. I will learn how to connect to Twitter's API, download tweets and extract and aggregate information. My solution will not involve tweeting, retweeting or liking. No individual tweets will be displayed. All data will be presented in aggregate.
<br><img src="http://thislondonhouse.com/Jupyter/Images/12-Twitter-09.png?1" width="50%" /> <br>
After you create your app, you will need to click “Keys and tokens” to access the API keys that your Python app will use to authenticate with Twitter’s servers. If you do not have an access token and access token secret, you will need to generate them. Copy the keys and paste them into your Python file (see below for instructions).


## Connecting to Twitter’s API via Python
In general, there are two common approaches to connecting to an API. The first is to use HTTP requests to pull in data from API resources, and the second is to use wrapper libraries to facilitate pulling data from the API. We will look at both in the sections below.

<blockquote class="twitter-tweet"><p lang="en" dir="ltr">‘Men of common sense do not allow much for coincidences in making the ordinary calculations of life’ (The Signalman). <a href="https://t.co/jk1GQmNM7h">pic.twitter.com/jk1GQmNM7h</a></p>&mdash; Charles Dickens (@DickensSays) <a href="https://twitter.com/DickensSays/status/1165316261414416384?ref_src=twsrc%5Etfw">August 24, 2019</a></blockquote> <script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>

### HTTP Requests
Most APIs simply allow external users access to source data via web pages. These web pages use URLs and query string parameters to request specific data. To facilitate these connections, we will use the Requests library (more info here: https://requests.kennethreitz.org/en/master/).
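
As a rough illustration of how a resource URL and query string parameters combine, the sketch below builds a request URL using only the standard library. The helper name `build_url` and the short field list are assumptions made for this example. Requests applies a similar encoding when you pass it a `params` dictionary.

```python
from urllib.parse import urlencode

def build_url(base_url, resource, parameters):
    """Join a base URL, a resource path, and URL-encoded query parameters."""
    query = urlencode(parameters)
    # Only append "?" when there is at least one parameter
    return f"{base_url}{resource}?{query}" if query else f"{base_url}{resource}"

url = build_url(
    "https://api.twitter.com/2/",
    "tweets/1165316261414416384",
    {"tweet.fields": "created_at,public_metrics"},
)
print(url)
```

Note that the comma in the field list is percent-encoded in the resulting URL; the server decodes it back before reading the parameter.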

In [None]:
!pip install Requests

This approach requires that we directly access API resources, so we need a reference guide that outlines the available resources and the parameters that each method requires and/or allows. For Twitter, this reference is here: https://developer.twitter.com/en/docs/api-reference-index.

Also, you will need your consumer/access keys/secrets to connect to the API. You will find them here: https://developer.twitter.com/en/apps/
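
One possible way to keep keys out of a notebook you share is to read them from an environment variable. This is a minimal sketch, not part of the original workflow: the function name `load_bearer_token` and the variable name `TWITTER_BEARER_TOKEN` are assumptions chosen for illustration.

```python
import os

def load_bearer_token(env_var="TWITTER_BEARER_TOKEN"):
    """Return the token stored in the given environment variable."""
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a readable message instead of sending an empty token
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return token

# Example (assumes the variable was set before Jupyter was started):
# bearer_token = load_bearer_token()
```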

In [None]:
import requests
import urllib.parse
import pprint

In [None]:
def getEndpoint(resource, parameters):
    # set the base url
    base_url = 'https://api.twitter.com/'
    # set the api version
    api_version = "2/"
    # set bearer token
    bearer_token = "AAAAAAAAAAAAAAAAAAAAAFzuQQEAAAAAERrEoDqu4kZEZxdZ%2FFlvVwfGDvE%3D1w9la04dvMw0vjCVmcqxyFvos5NWnyzMwdbTJQqzCAdattBPrI"

    # build resource URL
    resource_url = base_url + api_version + resource

    # build headers for authorization
    headers = {
        'Authorization': 'Bearer ' + bearer_token
    }

    # verify resource url
    # print("Getting Endpoint: " + resource_url + "?" + urllib.parse.urlencode(parameters))

    # request data from resource url
    response = requests.get(resource_url, headers=headers, params=parameters)
    # print(response)
    # format response as a python dictionary
    response_data = response.json()
    # return response dictionary to main application
    return response_data


In [None]:
tweetFields = 'attachments,author_id,context_annotations,conversation_id,created_at,entities,geo,id,in_reply_to_user_id,lang,possibly_sensitive,public_metrics,referenced_tweets,reply_settings,source,text,withheld'
userFields = 'created_at,description,entities,id,location,name,pinned_tweet_id,profile_image_url,protected,public_metrics,url,username,verified,withheld'

In [None]:
# use tweet_id rather than id, which would shadow Python's built-in id()
tweet_id = 1165316261414416384
resource = f"tweets/{tweet_id}"
parameters = {
    'tweet.fields': tweetFields
}

tweetData = getEndpoint(resource, parameters)

In [None]:
print(type(tweetData))
pprint.pprint(tweetData)

In [None]:
print(tweetData['data']['text'])

In [None]:
tweet = tweetData['data']
pprint.pprint(tweet)

In [None]:
tweetText = tweet['text']

In [None]:
print("Tweet Text: " + tweetText)

**Notice, in the embedded tweet above, that the tweet id is listed in the URL for the tweet. In this case, the URL for the tweet is https://twitter.com/DickensSays/status/1165316261414416384 and the tweet id is simply the number at the end of the URL.**

Also, notice the data returned from Twitter is in the form of a dictionary. This means that you would use standard dictionary notation to reference different values in the returned data.
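
Some keys may be absent from a given response (for instance, a tweet without links may have no `entities` entry), in which case plain bracket notation raises a `KeyError`. The sketch below shows one defensive pattern. The `sample_tweet` dictionary is made-up data shaped like the `data` portion of a response, and `get_nested` is a hypothetical helper, not something the API or Requests provides.

```python
# Made-up sample shaped like the 'data' portion of a tweet lookup response
sample_tweet = {
    "id": "1",
    "text": "Hello, world",
    "public_metrics": {"retweet_count": 3, "like_count": 5},
}

def get_nested(data, *keys, default=None):
    """Walk nested dictionaries, returning default if any key is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

likes = get_nested(sample_tweet, "public_metrics", "like_count")
urls = get_nested(sample_tweet, "entities", "urls", default=[])
print(likes, urls)
```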

In [None]:
print(tweet['created_at'])

In [None]:
print(tweet['public_metrics']['retweet_count'])

In [None]:
print(tweet['public_metrics']['like_count'])

In [None]:
print(tweet['entities'])

In [None]:
print(tweet['entities']['urls'][0]['expanded_url'])

In [None]:
print(tweet['author_id'])
authorId = tweet['author_id']

From here, you may want to use the user's id to pull in more data about that user. To do that, you would need to set a new resource URL and update the search parameters accordingly. The cells below look up a user by username; the same pattern applies to other user resources.
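
To look a user up by id rather than by username, the request could be assembled as in the sketch below. It assumes the v2 user lookup resource `users/{id}`; the helper name `user_lookup_request` and the placeholder id are hypothetical, and the network call itself is left commented out.

```python
def user_lookup_request(author_id, user_fields):
    """Return the (resource, parameters) pair for a user lookup by id."""
    resource = f"users/{author_id}"
    parameters = {"user.fields": user_fields}
    return resource, parameters

# Placeholder id for illustration; in this notebook you would pass authorId
resource, parameters = user_lookup_request(
    "1234567890", "name,username,public_metrics"
)
print(resource)
print(parameters)

# userData = getEndpoint(resource, parameters)  # function defined earlier
```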

In [None]:
userName = "IS251_London"
resource = f"users/by/username/{userName}"
parameters = {
    'user.fields': userFields,
}

userData = getEndpoint(resource, parameters)

In [None]:
pprint.pprint(userData)

In [None]:
user = userData['data']

In [None]:
print(user['public_metrics']['tweet_count'])

In [None]:
print(user['public_metrics']['followers_count'])

### Tweepy Wrapper
The second approach is to use a third-party wrapper that will facilitate API requests. These wrappers are written and maintained by groups who are unaffiliated with the source API, so a wrapper may be outdated, or may not work at all, if its maintainers do not update it to account for changes in the API. Despite this, wrappers are often easier to use and provide useful features that may be difficult to implement using HTTP requests.

In [None]:
import tweepy

In [None]:
apiKey = "irX92jCuceq7GaTJqYSscstir"
apiSecret = "6YuUxt3JOzG0KzQdl4nmiflqSTBJt77WyDYa0MzctsBQDE4uJB"

auth = tweepy.OAuthHandler(apiKey, apiSecret)
api = tweepy.API(auth, wait_on_rate_limit=True, retry_count=10, retry_delay=5)

In [None]:
status_id = 1165316261414416384
status = api.get_status(status_id)

In [None]:
print(str(status.text))

In the example above, I import the Tweepy library and set my consumer key/secret (named apiKey and apiSecret in the code). I then use Tweepy’s OAuthHandler class to set up authentication with Twitter’s servers. Next, I create an API connection that I will use to query Twitter’s servers. Finally, I use the .get_status() method to download tweet number 1165316261414416384.
 
An obvious question is: "How did I know the text of the tweet was accessible by printing status.text?" The answer to that question lies in how Tweepy interfaces with the Twitter API. For more information on this exchange, go here: http://docs.tweepy.org/en/v3.6.0/. In short, Tweepy works by accepting the <a href="https://www.json.org/">JSON</a> response from Twitter, and then converting that into an object with properties that mirror those outlined in the Twitter API. So, when you ask Tweepy to get a status from Twitter, Tweepy returns a <a href="https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object">tweet object</a>.
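
The general idea of turning a JSON response into an object with attributes can be sketched with the standard library alone. This is only an illustration of the concept, not Tweepy's actual implementation, and the JSON text below is made up for the example.

```python
import json
from types import SimpleNamespace

# Made-up JSON text shaped loosely like a tweet payload
raw = '{"text": "Hello", "retweet_count": 2, "user": {"screen_name": "example"}}'

# object_hook is called for each decoded JSON object, innermost first,
# so nested dictionaries also become attribute-style objects
status_like = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))

print(status_like.text)
print(status_like.user.screen_name)
```

With this conversion, `status_like.user.screen_name` plays the same role that `data['user']['screen_name']` would play with a plain dictionary.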

In [None]:
print(type(status))
pprint.pprint(status._json)

The dictionary printed above (status._json) is the raw data that Twitter's API returns. Tweepy uses this data to build a status object.

In [None]:
print(status.created_at)

In [None]:
print(status.retweet_count)

In [None]:
print(status.favorite_count)

In [None]:
print(status.entities)

In [None]:
print(status.entities['urls'][0]['expanded_url'])

In [None]:
print(status.user.id)
print(status.user.screen_name)

Similarly, when you ask Tweepy to get a user from Twitter, Tweepy returns a <a href="https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/user-object">user object</a>.

In [None]:
user = api.get_user(screen_name="IS251_London")
pprint.pprint(user._json)

In [None]:
print(user.statuses_count)

In [None]:
print(user.favourites_count)

# Exercise
Pick a Twitter account and calculate a measure of its influence (influence = # of friends / # of followers).

In [None]:
# Step 1...

# Step 2...