<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px" />

# Introduction to Web Services and APIs

_Author: Dave Yerrington (SF)_

One of my first projects as a data scientist was to write a program to download images based on information obtained from an API. My boss told me "you should just have to make requests to a simple REST API endpoint." I nodded along, but I had no idea what half of those words meant.

After this lesson, you will have some idea what those words mean and will know enough to get started interacting with APIs in a secure way.

In [1]:
import os

import pandas as pd
import requests

# Introduction to APIs

## Data API Example

An API allows you to use code to interact with a service.

For instance, Apple offers a freely accessible iTunes Search API.

You can access a resource through this API by pasting this URL into your browser:

    https://itunes.apple.com/search?term=beyonce&entity=musicVideo

You can also use the Python library `requests` to retrieve this resource.

In [2]:
# Write code to retrieve the resource
# /scrub/
response = requests.get("https://itunes.apple.com/search?term=beyonce&entity=musicVideo")
search_response = response.json()
search_response

{'resultCount': 50,
 'results': [{'wrapperType': 'track',
   'kind': 'music-video',
   'artistId': 1419227,
   'collectionId': 939779719,
   'trackId': 939779783,
   'artistName': 'Beyoncé',
   'collectionName': 'BEYONCÉ (More Only) - EP',
   'trackName': '1+1',
   'collectionCensoredName': 'BEYONCÉ (More Only) - EP',
   'trackCensoredName': '1+1 (Live from Mrs. Carter Show World Tour)',
   'artistViewUrl': 'https://itunes.apple.com/us/artist/beyonc%C3%A9/1419227?uo=4',
   'collectionViewUrl': 'https://itunes.apple.com/us/music-video/1-1-live-from-mrs-carter-show-world-tour/939779783?uo=4',
   'trackViewUrl': 'https://itunes.apple.com/us/music-video/1-1-live-from-mrs-carter-show-world-tour/939779783?uo=4',
   'previewUrl': 'https://video-ssl.itunes.apple.com/apple-assets-us-std-000001/Video128/v4/b1/77/09/b177099e-8896-f8e8-fbd0-32a07043a198/mzvf_4178862957106535709.640x480.h264lc.U.p.m4v',
   'artworkUrl30': 'https://is1-ssl.mzstatic.com/image/thumb/Video3/v4/e5/ab/42/e5ab42d9-ce31-16

https://itunes.apple.com/search is an example of a **web API endpoint**. By typing this URL followed by `?` and then a sequence of ampersand-separated query terms, you can retrieve information from an iTunes database that Apple maintains.
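
The query-string pattern described above can be sketched with Python's standard library. This is only an illustration of how such a URL is assembled, not something the iTunes API requires: `urlencode` joins the query terms with `&` and percent-encodes characters that are not URL-safe. The second search term is made up for the example.

```python
from urllib.parse import urlencode

endpoint = "https://itunes.apple.com/search"

# Each query term is a key/value pair; urlencode joins them with "&".
query = {"term": "beyonce", "entity": "musicVideo"}
url = f"{endpoint}?{urlencode(query)}"
print(url)

# Characters that are not URL-safe (such as "é" or a space) are percent-encoded.
encoded = urlencode({"term": "beyoncé live"})
print(encoded)
```

Building the query from a dictionary avoids hand-typing `?` and `&` and keeps special characters from breaking the URL.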

## API Client Libraries

The URL "https://itunes.apple.com/search?term=beyonce&entity=musicVideo" isn't much fun to read or write. An **API client library** allows you to interact with the service by writing more expressive code in a language such as Python.

For instance, an API client library for the iTunes API might provide a function like this.

In [3]:
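BASE_URL = "https://itunes.apple.com"  # no trailing slash; one is added below

def get_music_videos_by_artist(artist):
    """Search the iTunes API for music videos matching `artist`."""
    endpoint = 'search'
    url = f'{BASE_URL}/{endpoint}'
    # Passing the query terms as `params` lets `requests` build and
    # URL-encode the query string (?term=...&entity=musicVideo) for us.
    response = requests.get(url, params={'term': artist, 'entity': 'musicVideo'})
    return response.json()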

Then you could retrieve that resource by writing code like this:

In [4]:
get_music_videos_by_artist('beyonce')

{'resultCount': 50,
 'results': [{'wrapperType': 'track',
   'kind': 'music-video',
   'artistId': 1419227,
   'collectionId': 939779719,
   'trackId': 939779783,
   'artistName': 'Beyoncé',
   'collectionName': 'BEYONCÉ (More Only) - EP',
   'trackName': '1+1',
   'collectionCensoredName': 'BEYONCÉ (More Only) - EP',
   'trackCensoredName': '1+1 (Live from Mrs. Carter Show World Tour)',
   'artistViewUrl': 'https://itunes.apple.com/us/artist/beyonc%C3%A9/1419227?uo=4',
   'collectionViewUrl': 'https://itunes.apple.com/us/music-video/1-1-live-from-mrs-carter-show-world-tour/939779783?uo=4',
   'trackViewUrl': 'https://itunes.apple.com/us/music-video/1-1-live-from-mrs-carter-show-world-tour/939779783?uo=4',
   'previewUrl': 'https://video-ssl.itunes.apple.com/apple-assets-us-std-000001/Video128/v4/b1/77/09/b177099e-8896-f8e8-fbd0-32a07043a198/mzvf_4178862957106535709.640x480.h264lc.U.p.m4v',
   'artworkUrl30': 'https://is1-ssl.mzstatic.com/image/thumb/Video3/v4/e5/ab/42/e5ab42d9-ce31-16

## Model API Example

We just saw an example of an API that gives you access to data. There are also APIs that give you access to predictions from trained machine learning models.

For instance, Google has a "Cloud Vision API" that allows you to post an image and get back a list of labels that apply to that image, as predicted by a trained machine learning model. [Try it here](https://cloud.google.com/vision/docs/drag-and-drop).

Developers who need this kind of image recognition capability can pay to send images to the underlying API and retrieve responses through their application code.

## How APIs Help

**Data APIs:**
- *As a consumer:* You need data to do data science. Many organizations provide access to their data through an API, so knowing how to interact with APIs increases the range of data science projects that you can tackle. In addition, it allows you to get just the data you need rather than an entire set of files or an entire database.
- *As a producer:* Exposing a valuable data set through an API allows you to provide access to that data set in a controlled way; for instance, it allows you to charge for each query above a certain limit.

**Model APIs:**
- *As a consumer:* Accessing a model API allows you to use a model someone else has already trained rather than building it yourself.
- *As a producer:* Building a model API allows you to sell access to a model or make it available to your organization's applications without having to write application code yourself.

Each API has its own structure, authentication methods, and client libraries. As a result, no single lesson can teach you everything you need to know to use any arbitrary API effectively. What we will aim to do is to teach you enough to get you started with data APIs. A well-maintained data API will have documentation that will get you the rest of the way with that specific service.

# Notes on Web Scraping

- Using code to obtain data from a website without going through an external API is called **web scraping**.
- Many websites provide rules for web scraping in a `robots.txt` file. ([Example](http://www.espn.com/robots.txt))
- Scraping a website is fine if no API is available, you follow any rules in a `robots.txt` file, and the amount of traffic you generate is not too large.
- A website may block your IP address if you violate these principles.
- This lesson's README provides links to web scraping resources.
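
As a rough sketch of how the `robots.txt` rules mentioned above can be checked from code, Python's standard library includes `urllib.robotparser`. The rules, the site `www.example.com`, and the user-agent name `my-scraper` below are all made up for illustration; they are not taken from any real website.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, made up for this example.
rules = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Ask whether a (hypothetical) scraper may fetch each page.
allowed_public = parser.can_fetch("my-scraper", "https://www.example.com/public/scores.html")
allowed_private = parser.can_fetch("my-scraper", "https://www.example.com/private/data.html")

# Number of seconds the site asks scrapers to wait between requests.
delay = parser.crawl_delay("my-scraper")

print(allowed_public, allowed_private, delay)
```

To check a live site instead of a hard-coded string, `RobotFileParser` also provides `set_url` and `read` methods that download the file for you.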

# Famous APIs

## Facebook Graph API

Facebook provides an API for interacting with their service in the following ways:

- View your posts
- View websites, people, posts, pages that you've liked
- View activity on apps from you and your friends
  - Movies watched
  - Music listened
  - Games played
- View places traveled / check-ins

**Potential Project Ideas**

- Determining Latent Characteristics
- Friends Activity
- Political Classification
- Text Mining
- Friend Classifier
- Trending Topics
- Recommenders
- Feature Importances
- Taste Profiling
- Hipster Detector
- Sub-group Identification
- Checkin-Prediction
- Relationship Forecasting
- Relationship Classification
- Sentiment Analysis
- Popularity Projection
- Personal Analytics
- Friend Similarity Prediction
- N-Gram Analysis
- Topic Modeling

<a id='yelp'></a>
##  Yelp API

Yelp provides a way for developers to access:

- Reviews
 - Services
 - Restaurants / Bars / Cafes
 - Businesses
- Business meta-data

**Potential Project Ideas**

- Topic Modeling
- Text Mining
- Sentiment Analysis
- Funny / Cool / Interesting Classification
- Music Genre Classification
- Parking Index Classification
- Characteristics Profiling
- Hipster Index
- Ideal Activities
- Friend Recommender
- Venue Recommender
- Sports Bar Classification
- Where is the best [whatever] in [neighborhood]

<a id='echonest'></a>
## Echonest

Echonest consolidates access to many entertainment service APIs in one place.  It has a huge list of features and connected services including:

- Spotify
- Pandora
- Rdio
- Gracenote
- SoundHound
- Shazam

**Some Echonest features:**

- Music waveform identification (like Shazam, SoundHound music ID)
- Playlist recommendations
- Detailed artist, album, and track lookup
 - Bio / Origins / Contemporaries / Noteworthy Accomplishments
 - Official Twitter / website / social media links
 - BPM / Mood / Popularity / Genre(s) 
 - Images / Videos / Media
- Detailed movie, actor, product lookup
- Concert Schedules and ticket metadata

# API Structure

## REST Architecture

Many APIs follow a set of design principles called REST. Here are the most important of those principles for our purposes:

- Each resource consists of text in XML or (more often) JSON format.
- Each resource has exactly one "address" (URL).
- Each resource contains links to related resources.
- Requests use a standard set of verbs:
    - POST -- create a resource
    - GET -- retrieve a resource
    - PATCH -- update a resource
    - DELETE -- delete a resource
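
To make the verbs concrete, here is a small sketch that builds (but does not send) one request per verb using the standard library. The URL `https://api.example.com/videos` and the JSON payloads are hypothetical and exist only to illustrate the pattern; a real API's documentation will say which verbs each endpoint accepts.

```python
from urllib.request import Request

# Hypothetical resource URLs, used only to illustrate the verbs.
collection_url = "https://api.example.com/videos"
video_url = f"{collection_url}/42"

# One request object per action; nothing is sent over the network here.
requests_by_action = {
    "create": Request(collection_url, data=b'{"title": "1+1"}', method="POST"),
    "retrieve": Request(video_url, method="GET"),
    "update": Request(video_url, data=b'{"title": "1+1 (Live)"}', method="PATCH"),
    "delete": Request(video_url, method="DELETE"),
}

for action, request in requests_by_action.items():
    print(f"{action:>8}: {request.get_method()} {request.full_url}")
```

Notice that the same URL identifies the resource in three of the four cases; only the verb changes. The `requests` library offers matching functions (`requests.post`, `requests.get`, `requests.patch`, `requests.delete`) that build and send such requests in one step.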

## URLs

A URL specifies where a resource can be found on the web.

![URL](../assets/images/http1-url-structure.png)
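
As a quick way to see the parts of a URL for yourself, the sketch below splits the iTunes search URL from earlier in this lesson using the standard library. The component names are the ones Python uses; a diagram may label the same parts slightly differently.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://itunes.apple.com/search?term=beyonce&entity=musicVideo"
parts = urlsplit(url)

print(parts.scheme)  # protocol
print(parts.netloc)  # host (domain name)
print(parts.path)    # path to the endpoint
print(parts.query)   # raw query string

# parse_qs turns the query string into a dict mapping each key to a list of values.
query_terms = parse_qs(parts.query)
print(query_terms)
```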

## Response Structure

A response to an API request contains a status code, the requested resource, and some additional metadata.

### Status Codes

Status codes provide information about the status of a request, such as whether it succeeded or failed:

- 100-199: Informational
- 200-299: Success
- 300-399: Redirect
- 400-499: Client error
    - 400: Bad request
    - 403: Forbidden
    - 404: Not Found
- 500-599: Server error
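
The ranges above can be turned into a small lookup. This is a minimal sketch: the helper name `status_category` is made up for this lesson, while `HTTPStatus` comes from Python's standard library and supplies the standard phrase for each code.

```python
from http import HTTPStatus

def status_category(code):
    """Return the broad category for an HTTP status code."""
    categories = {
        1: "Informational",
        2: "Success",
        3: "Redirect",
        4: "Client error",
        5: "Server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"{code} is not a valid HTTP status code")
    # The first digit of the code determines its category.
    return categories[code // 100]

for code in (200, 301, 403, 404, 500):
    print(code, HTTPStatus(code).phrase, "->", status_category(code))
```

With the `requests` library, the code for a response is available as `response.status_code`, so a call like `status_category(response.status_code)` would classify a real response.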

### JSON

The most common format for the body of an API response is JSON.

"JSON" is short for _JavaScript Object Notation_. The name reflects the fact that JSON syntax is derived from the syntax for object literals in JavaScript, a programming language that is widely used for web programming. JSON also looks very similar to Python code: it is just a nested collection of structures similar to Python dictionaries ("objects" in JavaScript) and lists ("arrays" in JavaScript).

**JSON Components:**

|Description | JavaScript name | Python counterpart |
|----|-----|----|
|Collection of key/value pairs | object | dict |
|Ordered collection of values | array | list |

**JSON Example** The JSON data below comes from Google's Geocode API, which allows you to get geographical information for an address.

In [5]:
google_response = {
  "results": [
    {
      "address_components": [
        {
          "long_name": "225",
          "short_name": "225",
          "types": [
            "street_number"
          ]
        },
        {
          "long_name": "Bush Street",
          "short_name": "Bush St",
          "types": [
            "route"
          ]
        },
        {
          "long_name": "Financial District",
          "short_name": "Financial District",
          "types": [
            "neighborhood",
            "political"
          ]
        },
        {
          "long_name": "San Francisco",
          "short_name": "SF",
          "types": [
            "locality",
            "political"
          ]
        },
        {
          "long_name": "San Francisco County",
          "short_name": "San Francisco County",
          "types": [
            "administrative_area_level_2",
            "political"
          ]
        },
        {
          "long_name": "California",
          "short_name": "CA",
          "types": [
            "administrative_area_level_1",
            "political"
          ]
        },
        {
          "long_name": "United States",
          "short_name": "US",
          "types": [
            "country",
            "political"
          ]
        },
        {
          "long_name": "94104",
          "short_name": "94104",
          "types": [
            "postal_code"
          ]
        }
      ],
      "formatted_address": "225 Bush St, San Francisco, CA 94104, USA",
      "geometry": {
        "location": {
          "lat": 37.7908343,
          "lng": -122.4015725
        },
        "location_type": "ROOFTOP",
        "viewport": {
          "northeast": {
            "lat": 37.7921832802915,
            "lng": -122.4002235197085
          },
          "southwest": {
            "lat": 37.7894853197085,
            "lng": -122.4029214802915
          }
        }
      },
      "place_id": "ChIJZXDI4YmAhYAReSK9_qXi2Oo",
      "types": [
        "street_address"
      ]
    }
  ],
  "status": "OK"
}

**Exercise (8 mins., pair programming)**

For these exercises, go to https://codebeautify.org/jsonviewer, paste the JSON provided below into the JSON Input box, and click "Beautify."

```javascript
[{"source":"Twitter for iPhone","text":"If the E.U. wants to further increase their already massive tariffs and barriers on U.S. companies doing business there, we will simply apply a Tax on their Cars which freely pour into the U.S. They make it impossible for our cars (and more) to sell there. Big trade imbalance!","created_at":"Sat Mar 03 17:53:50 +0000 2018","retweet_count":22641,"favorite_count":87398,"is_retweet":false,"id_str":"969994273121820672"},{"source":"Twitter for iPhone","text":"The United States has an $800 Billion Dollar Yearly Trade Deficit because of our “very stupid” trade deals and policies. Our jobs and wealth are being given to other countries that have taken advantage of us for years. They laugh at what fools our leaders have been. No more!","created_at":"Sat Mar 03 17:43:26 +0000 2018","retweet_count":22563,"favorite_count":90540,"is_retweet":false,"id_str":"969991653393039361"}]
```

Identify the structure of the JSON:

- What type of structure is it at the top level?

/scrub/

array

- How many items does that structure contain?

/scrub/

2

- What type of structure is each of those items?

/scrub/

object

- What are the keys in each of those structures?

/scrub/

source, text, created_at, retweet_count, favorite_count, is_retweet, id_str

- **BONUS:** Write Python code to extract the latitude and longitude from `google_response`.

*Tip:* Work incrementally, digging into the JSON one level at a time.

In [6]:
# /scrub/
google_response['results'][0]['geometry']['location']

{'lat': 37.7908343, 'lng': -122.4015725}

$\blacksquare$
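
The tip about working incrementally can be sketched as a loop that describes one level of the structure before stepping into the next. The dictionary below is a trimmed stand-in for `google_response` that keeps only the path of interest, and the helper `describe` is a made-up name for this example.

```python
# Trimmed stand-in for `google_response`, keeping only the path of interest.
sample = {
    "results": [
        {"geometry": {"location": {"lat": 37.7908343, "lng": -122.4015725}}}
    ],
    "status": "OK",
}

def describe(value):
    """Summarize one level of a parsed JSON structure."""
    if isinstance(value, dict):
        return f"dict with keys {sorted(value)}"
    if isinstance(value, list):
        return f"list with {len(value)} item(s)"
    return f"{type(value).__name__}: {value!r}"

# Dig in one level at a time, looking at each level before going deeper.
level = sample
for step in ("results", 0, "geometry", "location"):
    print(describe(level))
    level = level[step]

print(describe(level))
lat, lng = level["lat"], level["lng"]
```

Looking at the keys (for a dict) or the length (for a list) at each level tells you which key or index to use next.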

**Note:** Although JSON looks a lot like Python, its syntax differs from that of Python lists and dictionaries in a few ways:

- The JSON counterpart of Python strings must be written with double quotes, whereas Python strings can be written with either double or single quotes.
- The JSON counterpart of Python's `None` object is `null`.
- JSON uses `true` and `false` where Python uses `True` and `False`.
- Unlike Python, JSON does not allow trailing commas at the end of an object or array definition.
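
Python's built-in `json` module handles these differences when converting between JSON text and Python objects. The sketch below uses a small made-up record (the `place` key is hypothetical) to show the conversions in both directions, and shows that Python-style syntax is rejected as invalid JSON.

```python
import json

# Made-up JSON text for illustration.
text = '{"source": "Twitter for iPhone", "is_retweet": false, "place": null}'

# JSON text -> Python objects: false becomes False and null becomes None.
record = json.loads(text)
print(record)

# Python objects -> JSON text: False becomes false and None becomes null.
round_trip = json.dumps({"is_retweet": False, "place": None})
print(round_trip)

# A trailing comma and single-quoted strings are valid Python but invalid JSON.
invalid_examples = ["[1, 2, 3,]", "{'source': 'web'}"]
rejected = []
for candidate in invalid_examples:
    try:
        json.loads(candidate)
    except json.JSONDecodeError as error:
        rejected.append(candidate)
        print(f"Rejected {candidate!r}: {error.msg}")
```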

<a name="guided-practice"></a>
# Guided practice: Pulling data from APIs

Let's use the Star Wars API (SWAPI) to retrieve data about Obi-Wan Kenobi. (This API is perhaps a little frivolous, but it's a good one to practice with because it doesn't require registering or setting up authentication.)

Referencing the [SWAPI Documentation](http://swapi.co/documentation), let's create a query to search for characters with "obi" in their name.

/scrub/

https://swapi.co/api/people/?format=json&search=obi

```javascript
{
    "count": 1,   
    "next": null,   
    "previous": null,   
    "results": [  
        {  
            "name": "Obi-Wan Kenobi",   
            "height": "182",   
            "mass": "77",   
            "hair_color": "auburn, white",   
            "skin_color": "fair",   
            "eye_color": "blue-gray",   
            "birth_year": "57BBY",   
            "gender": "male",   
            "homeworld": "http://swapi.co/api/planets/20/",   
            "films": [    
                "http://swapi.co/api/films/2/",   
                "http://swapi.co/api/films/5/",   
                "http://swapi.co/api/films/4/",   
                "http://swapi.co/api/films/6/",   
                "http://swapi.co/api/films/3/",   
                "http://swapi.co/api/films/1/"  
            ],   
            "species": ["http://swapi.co/api/species/1/"],     
            "vehicles": ["http://swapi.co/api/vehicles/38/"],   
            "starships": [  
                "http://swapi.co/api/starships/48/",     
                "http://swapi.co/api/starships/59/",   
                "http://swapi.co/api/starships/64/",   
                "http://swapi.co/api/starships/65/",   
                "http://swapi.co/api/starships/74/"  
            ],   
            "created": "2014-12-10T16:16:29.192000Z",     
            "edited": "2014-12-20T21:17:50.325000Z",     
            "url": "http://swapi.co/api/people/10/"    
        }  
    ]  
}  
```
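
Notice that many of the values in this response are URLs for related resources, in line with the REST principle that resources link to one another. As a sketch of how those links could be collected, the code below parses a trimmed copy of the response above; the helper name `linked_urls` is made up for this example.

```python
import json

# Trimmed copy of the search response shown above.
body = """
{
    "count": 1,
    "results": [
        {
            "name": "Obi-Wan Kenobi",
            "homeworld": "http://swapi.co/api/planets/20/",
            "species": ["http://swapi.co/api/species/1/"],
            "vehicles": ["http://swapi.co/api/vehicles/38/"]
        }
    ]
}
"""

search_results = json.loads(body)
character = search_results["results"][0]

def linked_urls(resource):
    """Collect every field whose value is a URL or a list of URLs."""
    links = {}
    for key, value in resource.items():
        values = value if isinstance(value, list) else [value]
        urls = [v for v in values if isinstance(v, str) and v.startswith("http")]
        if urls:
            links[key] = urls
    return links

links = linked_urls(character)
print(links)
```

Each collected URL could then be passed to `requests.get` to retrieve the related planet, species, or vehicle.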

<a id='submit'></a>
**Exercise (12 mins., in pairs)**.

- Use the `requests` library to retrieve a response from SWAPI at the URL above. Then print its JSON contents.

In [7]:
# /scrub/
response = requests.get("http://swapi.co/api/people/10/")
response.json()

{'name': 'Obi-Wan Kenobi',
 'height': '182',
 'mass': '77',
 'hair_color': 'auburn, white',
 'skin_color': 'fair',
 'eye_color': 'blue-gray',
 'birth_year': '57BBY',
 'gender': 'male',
 'homeworld': 'https://swapi.co/api/planets/20/',
 'films': ['https://swapi.co/api/films/2/',
  'https://swapi.co/api/films/5/',
  'https://swapi.co/api/films/4/',
  'https://swapi.co/api/films/6/',
  'https://swapi.co/api/films/3/',
  'https://swapi.co/api/films/1/'],
 'species': ['https://swapi.co/api/species/1/'],
 'vehicles': ['https://swapi.co/api/vehicles/38/'],
 'starships': ['https://swapi.co/api/starships/48/',
  'https://swapi.co/api/starships/59/',
  'https://swapi.co/api/starships/64/',
  'https://swapi.co/api/starships/65/',
  'https://swapi.co/api/starships/74/'],
 'created': '2014-12-10T16:16:29.192000Z',
 'edited': '2014-12-20T21:17:50.325000Z',
 'url': 'https://swapi.co/api/people/10/'}

- Print the response object you got back from `requests.get` above. What information is it giving you in square brackets?

In [8]:
response

<Response [200]>

/scrub/

It is giving the status code -- in this case, "200", indicating success.

- Use `requests` to retrieve three additional responses from SWAPI, and print their contents.

- Run the cell below to install the SWAPI Python client.

In [9]:
!conda install -y -c anaconda ujson
!pip install swapi


- Run the cell below to use the SWAPI Python client to get data on the character Luke and the planet Tatooine.

In [10]:
import swapi

luke = swapi.get_person(1)
tatooine = swapi.get_planet(1)

- The `swapi` Python client not only makes it easier to retrieve resources by constructing the relevant URLs for you but also returns structured Python objects instead of JSON blobs. Run the cells below to see what methods and attributes the object `luke` has.

In [11]:
# See documentation for the `People` class, of which `luke` is an instance
help(luke)

Help on People in module swapi.models object:

class People(BaseModel)
 |  People(raw_data)
 |  
 |  Representing a single person
 |  
 |  Method resolution order:
 |      People
 |      BaseModel
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, raw_data)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  get_films(self)
 |  
 |  get_homeworld(self)
 |  
 |  get_species(self)
 |  
 |  get_starships(self)
 |  
 |  get_vehicles(self)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from BaseModel:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



In [12]:
# See the names of all methods and attributes of `luke`.
# The list itself does not say which names are methods and
# which are attributes, although package authors often follow
# the convention of using verb phrases for method names
# and not for attribute names. To check a particular name,
# you can use `callable(getattr(luke, name))`.
# "Dunder" methods (with double underscores at the beginning
# and end of their names) play special roles behind the
# scenes and are not meant to be called directly.
dir(luke)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'birth_year',
 'created',
 'edited',
 'eye_color',
 'films',
 'gender',
 'get_films',
 'get_homeworld',
 'get_species',
 'get_starships',
 'get_vehicles',
 'hair_color',
 'height',
 'homeworld',
 'mass',
 'name',
 'skin_color',
 'species',
 'starships',
 'url',
 'vehicles']
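
One way to separate methods from attributes in a listing like this is to test each name with the built-in `callable`. The sketch below uses a hypothetical `Person` class as a stand-in for the `luke` object so that it runs without the `swapi` package installed; the same two list comprehensions should work on `luke` itself.

```python
class Person:
    """Hypothetical stand-in for the `luke` object, made up for this example."""

    def __init__(self, name, starships):
        self.name = name
        self.starships = starships

    def get_starships(self):
        return list(self.starships)

person = Person("Luke Skywalker", ["https://swapi.co/api/starships/12/"])

# Skip names that start with an underscore (including the dunder methods).
public_names = [name for name in dir(person) if not name.startswith("_")]

# Methods are callable; plain data attributes are not.
methods = [name for name in public_names if callable(getattr(person, name))]
attributes = [name for name in public_names if not callable(getattr(person, name))]

print(methods)
print(attributes)
```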

- Print at least three attributes of the `luke` object. (If you get something that looks like `<bound method XYZ>`, then you are accessing a method rather than an attribute.)

In [13]:
# /scrub/

print(luke.starships)
print(luke.get_starships)
print(luke.name)

['https://swapi.co/api/starships/12/', 'https://swapi.co/api/starships/22/']
<bound method People.get_starships of <Person - Luke Skywalker>>
Luke Skywalker


- Call at least one of `luke`'s methods.

In [14]:
# /scrub/

luke.get_starships()

<StarshipQuerySet - 2>

- Use the same approach to find out what attributes and methods the object `tatooine` has. Print at least one of its attributes, and call at least one of its methods.

In [15]:
#  /scrub/
help(tatooine)

Help on Planet in module swapi.models object:

class Planet(BaseModel)
 |  Planet(raw_data)
 |  
 |  Method resolution order:
 |      Planet
 |      BaseModel
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __init__(self, raw_data)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __repr__(self)
 |      Return repr(self).
 |  
 |  get_films(self)
 |  
 |  get_residents(self)
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from BaseModel:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)



In [16]:
# /scrub/
dir(tatooine)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'climate',
 'created',
 'diameter',
 'edited',
 'films',
 'get_films',
 'get_residents',
 'gravity',
 'name',
 'orbital_period',
 'population',
 'residents',
 'rotation_period',
 'surface_water',
 'terrain',
 'url']

In [17]:
# /scrub/
tatooine.get_films()

<FilmQuerySet - 5>

In [18]:
# /scrub/
tatooine.population

'200000'

- **BONUS:** Query SWAPI from the command line using the tool `curl`. This is a third way to access an API resource, in addition to entering it in your browser's address bar or running Python code. Try running this command in your terminal:

    `curl https://swapi.co/api/people/1/`

/scrub/

```javascript   
{
	"name": "Luke Skywalker",
	"height": "172",
	"mass": "77",
	"hair_color": "blond",
	"skin_color": "fair",
	"eye_color": "blue",
	"birth_year": "19BBY",
	"gender": "male",
	"homeworld": "https://swapi.co/api/planets/1/",
	"films": [
		"https://swapi.co/api/films/2/",
		"https://swapi.co/api/films/6/",
		"https://swapi.co/api/films/3/",
		"https://swapi.co/api/films/1/",
		"https://swapi.co/api/films/7/"
	],
	"species": [
		"https://swapi.co/api/species/1/"
	],
	"vehicles": [
		"https://swapi.co/api/vehicles/14/",
		"https://swapi.co/api/vehicles/30/"
	],
	"starships": [
		"https://swapi.co/api/starships/12/",
		"https://swapi.co/api/starships/22/"
	],
	"created": "2014-12-09T13:50:51.644000Z",
	"edited": "2014-12-20T21:17:56.891000Z",
	"url": "https://swapi.co/api/people/1/"
}
```

- **BONUS:** Use the SWAPI Python client to get additional data.

$\blacksquare$

**IMPORTANT CLEANUP STEP:** Installing the swapi client library might have involved downgrading `matplotlib` in a way that will cause problems later. Run this cell to reinstall the most recent version of `matplotlib`:

In [19]:
!pip install --upgrade six
!conda upgrade -y matplotlib

Collecting six
  Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
madeye 0.1.0 requires keras, which is not installed.
madeye 0.1.0 requires mlflow, which is not installed.
madeye 0.1.0 requires tensorflow, which is not installed.
swapi 0.1.3 has requirement six==1.8.0, but you'll have six 1.12.0 which is incompatible.
spacy 2.0.16 has requirement requests<3.0.0,>=2.13.0, but you'll have requests 2.5.0 which is incompatible.
spacy 2.0.16 has requirement ujson>=1.35, but you'll have ujson 1.33 which is incompatible.
Installing collected packages: six
  Found existing installation: six 1.8.0
    Uninstalling six-1.8.0:
      Successfully uninstalled six-1.8.0
Successfully installed six-1.12.0
You are using pip version 18.1, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' comm

# API Authentication

The owners of public APIs typically require users to register and set up API keys or access tokens in order to make more than a few requests. Requiring keys or tokens allows them to track how many requests each user is making. Often the owner of a public API provides a free service tier that allows maybe a few hundred requests per day (e.g. for personal users) and paid service tiers that allow more requests (e.g. for developers of apps that rely on the API).

## Example: Using a Bit.ly Access Token

Let's follow the instructions [here](https://bitly.com/a/sign_in?rd=/a/oauth_apps) to set up an access token for Bitly.

**Note:** If you log in through Google, you will need to create a new password, leaving the existing-password field blank.

We then have the ability to make requests like the following, which provides a shortlink to https://fivethirtyeight.com/:

```
https://api-ssl.bitly.com/v3/shorten?access_token=<insert_access_token_here>&longUrl=http%3A%2F%2Ffivethirtyeight.com
```
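
The request URL above can be assembled in Python rather than typed by hand. This is a minimal sketch using the standard library; `FAKE_TOKEN` is a placeholder rather than a real Bitly token, and the code only builds the URL without sending it.

```python
from urllib.parse import urlencode

endpoint = "https://api-ssl.bitly.com/v3/shorten"
fake_token = "FAKE_TOKEN"  # placeholder, not a real Bitly token

# urlencode percent-encodes the long URL, so "://" becomes "%3A%2F%2F".
query = urlencode({"access_token": fake_token, "longUrl": "http://fivethirtyeight.com"})
shorten_url = f"{endpoint}?{query}"
print(shorten_url)
```

The long URL has to be percent-encoded because it is itself a value inside another URL's query string.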

Check out the [documentation](https://dev.bitly.com/) for more information about what you can do with Bitly's APIs.

## Handling Keys and Tokens Securely

Keys and tokens are at least moderately sensitive. Someone who has your key or token could make requests as if they were you, possibly incurring charges or causing you to hit usage limits.

A good way to handle access keys is to store them as **environment variables.** That way, no one can access them without logging into your account on your personal device.

**Demo.**

**Warning:** These instructions may need to be modified on Windows.

- The `export` terminal command can be used to assign a value to a variable in your environment. For instance, `export FAKE_ACCESS_KEY=abc` creates an environment variable called "FAKE_ACCESS_KEY" with the value `abc`.

- The value of an environment variable can be retrieved with `$`. For instance, entering `$FAKE_ACCESS_KEY` in the Terminal will cause it to attempt to run the command `abc`. This command will generate an error, because `abc` is not a known Terminal command. To print the value of "FAKE_ACCESS_KEY", you can run `echo $FAKE_ACCESS_KEY`.

- Environment variables go away at the end of a session, so running `echo $FAKE_ACCESS_KEY` in a new Terminal window will not work because `FAKE_ACCESS_KEY` is not defined in that window's session.

- There are special hidden files that run when you start a new Terminal session. For the Mac Terminal, these files include `~/.bashrc` and `~/.bash_profile`. To create an environment variable that is available in all of your future terminal sessions, put the relevant `export` command in one of those files.

In [20]:
# Access an environment variable with Python 
# /scrub/
print(os.getenv('FAKE_ACCESS_KEY'))

None


In [21]:
# Insert environment variable into a URL
# This is how you can use a token or key that is stored as an environment
# variable inside a Python program.
# /scrub/
fake_token = os.getenv('FAKE_ACCESS_KEY')
fake_url = f'http://www.mywebsite.com/data?access_token={fake_token}'
fake_url

'http://www.mywebsite.com/data?access_token=None'
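
The output above ends in `access_token=None` because `FAKE_ACCESS_KEY` was not set in the session that launched the notebook, and `os.getenv` quietly returns `None` for a missing variable. One possible safeguard, sketched below with a made-up helper name `require_env`, is to fail loudly instead of building a URL with a missing token.

```python
import os

def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# Set a fake value here only so that this example runs on its own; normally
# the variable would come from `export` or from `~/.bash_profile`.
os.environ["FAKE_ACCESS_KEY"] = "abc"

fake_url = f"http://www.mywebsite.com/data?access_token={require_env('FAKE_ACCESS_KEY')}"
print(fake_url)
```

Failing early makes a missing key obvious right away, rather than leaving you to debug a rejected request later.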

**Exercise (8 mins., in pairs, take turns driving so that you both complete the steps on your machine).**

- Add a line to the file `~/.bash_profile` that creates a variable `MY_ACCESS_KEY_TEST` with the value `asdfghjkl`. Then open a new Terminal window and confirm that this variable has the correct value.

- In that same Terminal window, enter the command `ipython` to start a Python REPL session. Write Python code to insert MY_ACCESS_KEY_TEST into the fake url 'http://www.coolapi.com/stuff?my_key=<insert MY_ACCESS_KEY_TEST here\>'. Then enter the command `exit` to leave the REPL session.

- Remove the line you added to `~/.bash_profile` in the first step.

## Key Pairs

More secure APIs provide a pair of keys: a public key and a secret key.

The public key is a little like a website username in that you don't need to worry too much about tracking it and keeping it secure. You can often retrieve it by logging into the website associated with the relevant API service.

The secret key is more like a website password: you should be careful to keep track of it and to keep it hidden from others. The website associated with the relevant API service will typically reveal it to you only once, when it is initially created. It is particularly important not to push your secret key to GitHub. Don't EVER include it in a Git commit, because getting it out of your commit history once it is in there is difficult.

<a name="ind-practice2"></a>
**Exercise (~30 mins, pair programming)**

Now it's time to start exploring the world of data APIs semi-independently. Go to http://www.pythonforbeginners.com/api/list-of-python-apis or https://github.com/realpython/list-of-python-api-wrappers.
  
- Choose an API.
- Find the API documentation and follow the instructions for getting access (e.g. setting up API keys). 

This step may push you out of your comfort zone. Don't be afraid to try things. Google and Stack Overflow are your friends. Ask questions when you get stuck. You can't go too far wrong as long as you DON'T PROVIDE CREDIT CARD INFORMATION (unless you know what you are doing) and DON'T SHARE YOUR KEYS/TOKENS and DON'T ADD THEM TO GIT.

- Install the relevant Python client library, if available.
- Create a Jupyter notebook showing how to extract data from the API and listing a few ways you could use the data it provides.
- Prepare to share what you found with the rest of the class.
- **BONUS:** Build a model using data from the API.