<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Introduction to Web Services and APIs

_Authors: Dave Yerrington (SF)_

---
![](assets/opening.png)

### Learning Objectives
*By the end of this lesson, you will be able to:*
- Identify all of the HTTP verbs and their uses.
- Describe application programming interfaces (APIs) and how to make calls and consume API data.
- Access public APIs and get information back.
- Read and write data in JSON format.
- Use the `requests` library.

### Lesson Guide
- [Introduction to APIs](#intro)
- [What is an API?](#what-is-api)
- [Famous APIs](#famous)
    - [Facebook](#facebook)
    - [Yelp](#yelp)
    - [Echonest](#echonest)
- [Web APIs](#web-apis)
- [Separation of Concerns](#mvc)
- [HTTP](#http)
- [Web Applications](#web-app)
- [Demo: HTTP](#demo-http)
- [Independent Practice: HTTP](#ind-http)
- [HTTP Request](#http-request)
    - [HTTP Request Methods](#request-methods)
    - [HTTP Request Structure](#request-structure)
- [HTTP Response](#http-response)
    - [Response Types Overview](#response-types)
- [JSON](#json)
- [Independent Practice: Validating JSON](#ind-practice)
- [Guided Practice: Pulling Data From APIs](#guided-practice)
    - [Example 1: Movies](#ex1-movies)
    - [Submit Queries to the API](#submit)
- [OAuth](#oauth)
- [Independent Practice: Python APIs](#ind-practice2)
- [Closing Questions](#closing-questions)

## Has Anyone Used an API, or Is Anyone Currently Using One?
<br>
_Warning Signs_:
<img src="assets/warning.png" style="float: left; width: 300px; margin: -5px 50px">
- _Random data found all over the computer_.
- _Withdrawals from friends and family_.
- _Constantly iterating over nested JSON_.
- _Lack of sleep_.
- _Talks about JSON as if it were a real person_.


<a name="intro"></a>
## Introduction to Application Programming Interfaces (APIs)

---

In previous lessons, we learned about building processes that scrape content from websites. In this lesson, we'll be diving into the world of APIs and taking a tour of one of the most accessible sources of data on the internet.

You'll learn:
- What's meant by "API."
- Common use cases.
- How to read API documentation.
- General development workflow with APIs.


<a id='what-is-api'></a>
## What is an API?

---

An application programming interface (API) is a set of routines, protocols, and tools for building software applications. It specifies how software components should interact.

APIs are a way for developers to abstract access to the data, devices, and other resources they provide.

Some examples include:

- Connectivity to a variety of databases.
- Python modules that can turn LED lights on and off.
- Applications that run natively on Windows, OS X, or Linux.
- Libraries that post content on Twitter, Facebook, Yelp, or LinkedIn.
- Web services for accessing currency or stock prices.

More abstract examples:
- Adding your own functions to NumPy.
- Extending Python with C code.
- Testing frameworks.

In the context of data science, APIs are a common way to interact with data hosted by third parties, and they most commonly take the form of **web service APIs**.

<a id='famous'></a>
<a id='facebook'></a>

### Famous APIs: Facebook

Facebook provides an API for interacting with its service. At a glance, you can:

- View your posts.
- View websites, people, posts, and pages that you've liked.
- View activity on apps from you and your friends.
  - Movies watched.
  - Music listened to.
  - Games played.
- View places traveled/check ins.
- Maintain or build relationships.

#### Potential Project Ideas

|   |   |   |   |
|---|---|---|---|
| Determining Latent Characteristics | Friends Activity | Political Classification | Text Mining |
| Friend Classifier | Trending Topics | Recommenders | Feature Importances |
| Taste Profiling | Hipster Detector | Sub-Group Identification | Check-In Prediction |
| Relationship Forecasting | Relationship Classification | Sentiment Analysis | Popularity Projection |
| Personal Analytics | Friend Similarity Prediction | N-Gram Analysis | Topic Modeling |

<a id='yelp'></a>
### Famous APIs: Yelp

Yelp provides a way for developers to access:

- Reviews.
  - Services.
  - Restaurants, bars, and cafes.
  - Businesses.
- Business metadata.

#### Potential Project Ideas

|   |   |   |   |
|---|---|---|---|
| Topic Modeling | Text Mining | Sentiment Analysis | Funny/Cool/Interesting Classification | 
| Music Genre Classification | Parking Index Classification | Characteristics Profiling | Hipster Index |
| Ideal Activities | Friend Recommender | Venue Recommender | Sports Bar Classification |
| "Where is the best [whatever] in [neighborhood]?" | | | |

<a id='echonest'></a>
### Famous APIs: Echonest

Echonest consolidates access to many entertainment service APIs in one place. It has a huge list of features and connected services, including:

- Spotify
- Pandora
- Rdio
- Gracenote
- SoundHound
- Shazam

Some Echonest features include:

- Music waveform identification (like Shazam or SoundHound's music ID).
- Playlist recommendations.
- Detailed artist, album, and track lookup.
  - Artist biographies, origins, contemporaries, and noteworthy accomplishments.
  - Official Twitter, website, and social media links.
  - BPM, mood, popularity, and genre(s).
  - Images, videos, and media.
- Detailed movie, actor, and product lookup.
- Concert schedules and ticket metadata.

<a id='web-apis'></a>
## Web APIs

---

![](assets/notify.png)

The prevalence of web APIs has increased with the rise of JavaScript and the advent of web programming techniques that allow the transmission of small pieces of data without having to refresh the entire page.

With the growth of highly interactive websites, made possible by AJAX programming techniques in JavaScript, many languages have adopted shared standards for communicating data to and from web servers, for two big reasons:
- Ease of integration.
- Consistent standards.


<a name="mvc"></a>
## Separation of Concerns

---

In order to talk about APIs, we first need to introduce the _separation of concerns_. In computer science, _separation of concerns_ (SOC) is a design principle for separating a computer program into distinct sections, such that each section addresses a separate concern. A concern is a set of information that affects the code of a computer program.

In particular, when building a web application, it's best practice to separate the website logic from the data models. This not only allows for cleaner code but also makes it easier to change our layouts and interactions. Separation of concerns becomes even more important when working with outside data.

<img src="assets/MVC-Process.png" style="width: 200px;"> 

> _MVC: Model view controller is a famous SOC paradigm in programming._ 

API calls are really just a fancy term for making _HTTP requests_ (in the context of web APIs) to a server and sending/receiving structured data from that endpoint (URL). We are still communicating with URLs — however, instead of receiving markup like we do with HTML pages, we receive data.

[Representational state transfer (REST)](https://en.wikipedia.org/wiki/Representational_state_transfer) is the most common architecture style for passing information to and from these API endpoints.

Before we start consuming these services, it's important to understand the fundamentals of the underlying communication layer: **HTTP**.


<a id='http'></a>
## Hypertext Transfer Protocol (HTTP)

---

HTTP is a protocol — a system of rules — that determines how web pages (see: "hypertext") get sent (see: "transfer") from one place to another. Among other things, it defines the format of the messages passed between HTTP clients and HTTP servers.

Because the web is a service, it works through a combination of clients that _make_ requests and servers that _receive_ requests.


### The HTTP Client

HTTP clients make or generate HTTP requests. Some types of clients include:

* Browsers — Chrome, Firefox, and Safari.
* Command line programs — [curl](http://curl.haxx.se/docs/) and [wget](http://www.gnu.org/software/wget/manual/wget.html).
* Application code — Python requests, Scrapy, and Mechanize.

HTTP clients also receive the HTTP responses that a web server (a.k.a. HTTP server) sends back, and they process the data those responses contain.

### HTTP and Web Servers

All _web servers_ receive _HTTP requests_ and generate _HTTP responses_. Often web servers are just the middleman, passing HTTP requests and responses between the client and web application. Two of the most popular _HTTP or web servers_ are [Apache](http://httpd.apache.org/) and [Nginx](http://nginx.com/), but there are lots of different [web servers](http://en.wikipedia.org/wiki/Comparison_of_web_server_software) out there.


### Check: Where Do You Think a Web Application Lives?

Client, server, the cloud, mobile device, in your car, or on your Bluetooth-connected toaster with LCD display?

<a id='web-app'></a>
## Web Applications

---

Web applications are programs that run on a web server, process the HTTP requests the server receives, and generate HTTP responses.

![HTTP Request and Response](assets/request-response.png)

Lost? Here's the play by play:

1) A client sends an HTTP request to an HTTP server running on a remote machine.  
  * The _hostname_ given in the URL indicates which server will receive the request.  
2) The HTTP server processes the HTTP request. This may entail passing the request to a web application, which creates an HTTP response.
3) The response gets sent back to the client.
4) The client processes the response.
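
The four steps above can be sketched end to end on a single machine. This is only an illustration, assuming Python's built-in `http.server` and `http.client` modules stand in for a real web server and browser; the `HelloHandler` class, the `/hello` path, and the response text are made up for the example.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """A tiny stand-in for a web application (hypothetical example)."""

    def do_GET(self):
        # Step 2: the server processes the request and builds a response.
        body = f"You asked for {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging.
        pass


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# Step 1: the client sends an HTTP request to the server named by the hostname.
connection = HTTPConnection("127.0.0.1", port, timeout=5)
connection.request("GET", "/hello")

# Steps 3 and 4: the response comes back and the client processes it.
response = connection.getresponse()
status = response.status
text = response.read().decode("utf-8")

connection.close()
server.shutdown()
server.server_close()
print(status, text)
```

Here the "remote machine" is just the local loopback address, but the exchange follows the same request and response cycle a browser goes through.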

How does the server know what the request is asking for? This is specified by the URL, a special kind of path that specifies where a resource can be found on the web.

![URL](./assets/http1-url-structure.png)
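
To make the parts of a URL concrete, here is a small sketch that splits one apart with Python's standard `urllib.parse` module. The URL is the SWAPI address used later in this lesson; the variable names are just for illustration.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://swapi.co/api/people/?format=json&search=obi"
parts = urlsplit(url)

print(parts.scheme)    # the protocol to use
print(parts.hostname)  # which server receives the request
print(parts.path)      # which resource on that server
print(parts.query)     # extra parameters for the request

# The query string is itself a list of name=value pairs.
params = parse_qs(parts.query)
print(params)
```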

> **Check:** Can anyone define a client and a server?

<a name="demo-http"></a>
## Demo: HTTP

---

Let's explore HTTP resources. We'll start by looking at HTTP requests and responses using the Chrome Inspector.

![HTTP Request and Response](./assets/http_request_response.jpeg "HTTP Request and Response")

* In Chrome, open up Chrome Inspector (*command + option + 'i', or ctrl + click and select "Inspect Element"*).
* Select the Network tab. It should look something like this:

![Chrome Inspector](./assets/chrome_inspector.png)

* Next, go to the URL https://generalassemb.ly/.

You should be able to see a few HTTP requests and responses in the Network tab. For each request you'll see a **path**, **method**, **status**, **type**, and **size**, along with information about how long it took to get each of these resources.
  * Most of this information comes from the HTTP request and response.
  * Some HTTP requests are for CSS, JavaScript, and images that are referenced by the HTML.
  * Select `generalassemb.ly` in the path column on the far left.
  * Select the Headers tab. **Headers** are metadata properties of an HTTP request or response, separate from the body of the message.

<a id='ind-http'></a>

## Independent Practice: HTTP

---


**With a partner, go to your favorite (work-appropriate) website(s), inspect the protocol from the Chrome Inspector tool (cmd-opt-j), and identify:**

- Requests sent by your client.
- Responses sent by the server.
- The URL.

#### Research and Explain
- Cache-control.
- Age.
- Content-encoding.
- Expires.
- "GET" and "POST" requests.
- Query string parameters.

#### Bonus
What are cookies?
<img src="assets/cookies.png" style="width: 100px">

<a id='http-request'></a>
## HTTP Request

---

The first word in the request line, _GET_, is the **HTTP request's method**.

![HTTP Request](./assets/http_request.jpeg "HTTP Request")

<a id='request-methods'></a>
### HTTP Request Methods:

* **`GET`** => Retrieve a resource.  
* **`POST`** => Create a resource.  
* **`PATCH`** (_or **`PUT`**, but **`PATCH`** is recommended_) => Update an existing resource. **`PATCH`** sends only the fields that change, while **`PUT`** replaces the whole resource.  
* **`DELETE`** => Delete a resource.  
* **`HEAD`** => Retrieve the headers for a resource.

Of these, **`GET`** and **`POST`** are the most widely used.
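
As a rough sketch of how the method travels with a request, the snippet below builds (but does not send) one request per verb using the standard library's `urllib.request.Request`. The `api.example.com` URLs and the book payloads are hypothetical and exist only for illustration.

```python
from urllib.request import Request

# Hypothetical resource URLs used only for illustration.
collection_url = "https://api.example.com/books"
item_url = "https://api.example.com/books/42"

requests_by_purpose = {
    "retrieve": Request(item_url, method="GET"),
    "create": Request(collection_url, data=b'{"title": "Dune"}', method="POST"),
    "update": Request(item_url, data=b'{"title": "Dune Messiah"}', method="PATCH"),
    "delete": Request(item_url, method="DELETE"),
    "headers only": Request(item_url, method="HEAD"),
}

# Nothing is sent over the network; we only inspect each request object.
for purpose, request in requests_by_purpose.items():
    print(f"{request.get_method():6} {request.full_url}  ({purpose})")
```

Note that the same URL can mean different things depending on the method: `GET` on the item reads it, while `DELETE` on the same URL removes it.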

<a id='request-structure'></a>
### HTTP Request Structure

```
[http request method] [URL] [http version]  
[list of headers]

[request body]
```

*Notice that the request headers are separated from the request body by a blank line.*
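
The structure above can be assembled by hand to see where each piece goes. This is a minimal sketch rather than how real clients are written; `build_request` is a hypothetical helper, and the header values are abbreviated.

```python
def build_request(method, target, headers, body=""):
    """Assemble the text of an HTTP/1.1 request from its parts."""
    request_line = f"{method} {target} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # HTTP lines end with CRLF; an empty line marks the end of the headers.
    return "\r\n".join([request_line, *header_lines, "", body])


raw = build_request("GET", "/", {"Host": "vermonster.com", "Accept": "text/html"})
print(raw)
```

A `GET` request has no body, so the text simply ends after the blank line that closes the headers.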

#### HTTP Request Method Example (No Body)

    GET http://vermonster.com HTTP/1.1  
    Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8  
    Accept-Encoding:gzip,deflate,sdch
    Accept-Language:en-US,en;q=0.8  
    Connection:keep-alive  
    Host:vermonster.com  
    User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1659.2 Safari/537.36

<a id='http-response'></a>
## HTTP Response

---

![HTTP Response](./assets/http_response.jpeg "HTTP Response")

When a client sends a request, the server sends back a response; the standard format for this response is:

```
[http version] [status] [reason]  
[list of headers]

[response body] # Typically HTML, JSON, ...  
```
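
To see that format in action, the sketch below splits a raw response into its status line, headers, and body. The response text is a made-up example, and real clients such as `requests` do this parsing for you.

```python
# A made-up raw response, following the format shown above.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json\r\n"
    "Content-Length: 12\r\n"
    "\r\n"
    '{"count": 1}'
)

# Everything before the first blank line is the status line plus headers.
head, _, body = raw_response.partition("\r\n\r\n")
status_line, *header_lines = head.split("\r\n")
version, status, reason = status_line.split(" ", 2)
headers = dict(line.split(": ", 1) for line in header_lines)

print(version, status, reason)
print(headers)
print(body)
```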

<a id='response-types'></a>
### Response Types Overview

> Check these out when you have time — at least be aware that there's an expected pattern to these codes.

**[Status codes](http://en.wikipedia.org/wiki/List_of_HTTP_status_codes)** have standard meanings. Here are a few:

|Code|Reason|
|:---|:-----|
|200|OK|
|301|Moved Permanently|
|302|Found (originally "Moved Temporarily")|
|307|Temporary Redirect|
|400|Bad Request|
|403|Forbidden|
|404|Not Found|
|500|Internal Server Error|
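
The pattern behind the codes is that the first digit identifies the family of the response. Here is a small sketch using Python's built-in `http.HTTPStatus`; the `describe` helper and the family labels are illustrative names, not part of any library.

```python
from http import HTTPStatus

# The first digit of a status code identifies its family.
FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return the code, its standard reason phrase, and its family."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase} ({FAMILIES[status.value // 100]})"


for code in (200, 301, 404, 500):
    print(describe(code))
```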

<a name="json"></a>
## JSON

---

JSON is short for _JavaScript object notation_ and is a way to store information in an organized, easy-to-access manner. In a nutshell, it gives us a human-readable collection of data that we can access in a logical way.

**JSON is built on two structures:**
* A collection of name/value pairs. In various languages, this is realized as an object, record, structure, dictionary, hash table, keyed list, or associative array.
* An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

These are universal data structures. Virtually all modern programming languages support them in one form or another. It makes sense that a data format that's interchangeable with programming languages is also based on these structures.

### JSON Objects

An object is an unordered set of name/value pairs, much like a Python dictionary. An object begins with a left brace (`{`) and ends with a right brace (`}`). Each name is followed by a colon (`:`) and the name/value pairs are separated by a comma (`,`).

The syntax is as follows:

```
{ string : value, .......}
```
Like:
```
{"count": 1, ...}
```
_Seems a lot like a Python dictionary!_
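
The resemblance is close enough that Python's built-in `json` module converts between the two directly. The sketch below reads a JSON string into Python objects and writes it back out; the sample text is a shortened, made-up record.

```python
import json

text = '{"count": 1, "results": [{"name": "Obi-Wan Kenobi", "height": "182"}], "next": null}'

# Reading: JSON text -> Python objects (object -> dict, array -> list, null -> None).
data = json.loads(text)
print(type(data).__name__)
print(data["results"][0]["name"])
print(data["next"])

# Writing: Python objects -> JSON text.
data["count"] = 2
round_tripped = json.dumps(data, indent=2)
print(round_tripped)
```

One difference to keep in mind: JSON names must be double-quoted strings, whereas Python dictionary keys can be other hashable types.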

<a name="ind-practice"></a>
## Independent Practice: Validating JSON 

---

JSON is simple to use if it's correctly structured. One of the resources for validating JSON and checking if the syntax is correct is [JSON Viewer](http://codebeautify.org/jsonviewer).

For this exercise, copy the [JSON data from the code folder](./code/test.json) and insert it in the web app above. Then, click "Validate."

If you see "Valid JSON," click "Beautify" and you'll see a more readable version. If you do not see the message "Valid JSON," it means that there's a syntax error.

* First, correct any errors.
* Then, work in pairs to identify the structure of the JSON.

    - What is a root element?
    - Are there any arrays?
    - How many objects are there?
    - What are the attributes of an object?
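
The same kind of check can be done in Python. This is a minimal sketch, assuming the built-in `json` module is an acceptable validator for your data; `check_json` is a hypothetical helper name.

```python
import json


def check_json(text):
    """Report whether text parses as JSON, and where it fails if not."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return f"Invalid JSON at line {error.lineno}, column {error.colno}: {error.msg}"
    return "Valid JSON"


print(check_json('{"count": 1}'))
print(check_json("{'count': 1}"))   # single-quoted names are not allowed
print(check_json('{"count": 1,}'))  # trailing commas are not allowed
```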

<a name="guided-practice"></a>
## Guided Practice: Pulling Data From APIs

---

Recall that APIs are methods and data formats that tell people how to talk to a system. Next we'll walk through a couple of examples.

<a id='ex1-movies'></a>
### Example 1: Movies

The Internet Movie Database (IMDb) is a large collection of data about movies. It can be browsed at http://www.imdb.com/.

What if we wanted to programmatically access the data in the database? Unless we're employees of IMDb.com, we probably don't have direct access to its internal database, so we can't perform SQL queries on the data.

We could use scraping to retrieve data from the web page, and in some cases, we'll have to do exactly that.

> *Note: Check the "Terms of Service" before you scrape a website; you could be infringing on its terms.*

In other cases, the website offers a way to programmatically access data from its database. That's an API.

In the case of movies, this is offered by http://www.omdbapi.com/.

_Unfortunately, OMDb is no longer a free-to-use API. Instead, we'll practice interacting with the Star Wars API (SWAPI)_.


**Let's try an example to retrieve the data about "Obi-Wan Kenobi."**

Referencing the [SWAPI documentation](http://swapi.co/documentation), let's create a query to search for characters with "obi" in their names.

**In a browser, paste:**

    https://swapi.co/api/people/?format=json&search=obi


The browser should display a JSON response like this:

    {
        "count": 1,
        "next": null,
        "previous": null,
        "results": [
            {
                "name": "Obi-Wan Kenobi",
                "height": "182",
                "mass": "77",
                "hair_color": "auburn, white",
                "skin_color": "fair",
                "eye_color": "blue-gray",
                "birth_year": "57BBY",
                "gender": "male",
                "homeworld": "http://swapi.co/api/planets/20/",
                "films": [
                    "http://swapi.co/api/films/2/",
                    "http://swapi.co/api/films/5/",
                    "http://swapi.co/api/films/4/",
                    "http://swapi.co/api/films/6/",
                    "http://swapi.co/api/films/3/",
                    "http://swapi.co/api/films/1/"
                ],
                "species": ["http://swapi.co/api/species/1/"],
                "vehicles": ["http://swapi.co/api/vehicles/38/"],
                "starships": [
                    "http://swapi.co/api/starships/48/",
                    "http://swapi.co/api/starships/59/",
                    "http://swapi.co/api/starships/64/",
                    "http://swapi.co/api/starships/65/",
                    "http://swapi.co/api/starships/74/"
                ],
                "created": "2014-12-10T16:16:29.192000Z",
                "edited": "2014-12-20T21:17:50.325000Z",
                "url": "http://swapi.co/api/people/10/"
            }
        ]
    }

**What just happened?**

We requested a URL, which responded with JSON.
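
Once the JSON is parsed, getting at the data is a matter of indexing into nested dictionaries and lists. The sketch below works on a shortened copy of the Obi-Wan response pasted in as a string, so it runs without a network connection; the variable names are illustrative.

```python
import json

# A shortened copy of the search response for "obi".
response_text = """
{
    "count": 1,
    "next": null,
    "previous": null,
    "results": [
        {
            "name": "Obi-Wan Kenobi",
            "height": "182",
            "mass": "77",
            "homeworld": "http://swapi.co/api/planets/20/",
            "films": [
                "http://swapi.co/api/films/2/",
                "http://swapi.co/api/films/5/",
                "http://swapi.co/api/films/4/",
                "http://swapi.co/api/films/6/",
                "http://swapi.co/api/films/3/",
                "http://swapi.co/api/films/1/"
            ]
        }
    ]
}
"""

payload = json.loads(response_text)

# "results" is a list of matches; each match is a dictionary.
person = payload["results"][0]

# This API returns numbers as strings, so convert before doing arithmetic.
height_cm = int(person["height"])

# Related resources are given as URLs; the ID is the last path segment.
film_ids = sorted(int(url.rstrip("/").split("/")[-1]) for url in person["films"])

print(person["name"], height_cm, film_ids)
```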

SWAPI has a GUI-based response as well, which is the default:

https://swapi.co/api/people/?search=obi

Along with a Wookiee-flavored one:

https://swapi.co/api/people/?format=wookiee&search=obi
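
The part of each URL after the `?` is the query string, and it can be built programmatically instead of typed by hand. Here is a minimal sketch using the standard library's `urlencode`; `build_url` is a hypothetical helper. (With the `requests` library, passing a dictionary as the `params` argument of `requests.get` performs the same encoding.)

```python
from urllib.parse import urlencode

base_url = "https://swapi.co/api/people/"


def build_url(**params):
    """Append URL-encoded query parameters to the base URL."""
    return f"{base_url}?{urlencode(params)}" if params else base_url


print(build_url(format="json", search="obi"))
print(build_url(format="wookiee", search="obi"))

# Special characters such as spaces are encoded automatically.
print(build_url(search="Obi-Wan Kenobi"))
```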

<a id='submit'></a>
### Try submitting a couple more queries to familiarize yourself with the API.

- You can also query an API from the command line using the tool `curl`. Quote the URL so that your shell doesn't try to interpret the `?`. Try typing:

    `curl "http://swapi.co/api/people/13/?format=json"`


```javascript   
{"name":"Chewbacca",
 "height":"228",
 "mass":"112",
 "hair_color":"brown",
 "skin_color":"unknown",
 "eye_color":"blue",
 "birth_year":"200BBY",
 "gender":"male",
 "homeworld":"http://swapi.co/api/planets/14/",
 "films": ["http://swapi.co/api/films/2/","http://swapi.co/api/films/6/","http://swapi.co/api/films/3/","http://swapi.co/api/films/1/","http://swapi.co/api/films/7/"],
 "species":["http://swapi.co/api/species/3/"],
 "vehicles":["http://swapi.co/api/vehicles/19/"],
 "starships":["http://swapi.co/api/starships/10/","http://swapi.co/api/starships/22/"],
 "created":"2014-12-10T16:42:45.066000Z",
 "edited":"2014-12-20T21:17:50.332000Z",
 "url":"http://swapi.co/api/people/13/"}
```

We can make the same kind of request from Python with the `requests` library and load the result into a pandas DataFrame:

    # Request Obi-Wan Kenobi's record from SWAPI and load it into a DataFrame.
    import pandas as pd
    import requests

    result = requests.get("http://swapi.co/api/people/10/")
    df = pd.DataFrame([result.json()])
    df

| | birth_year | created | edited | eye_color | films | gender | hair_color | height | homeworld | mass | name | skin_color | species | starships | url | vehicles |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 57BBY | 2014-12-10T16:16:29.192000Z | 2014-12-20T21:17:50.325000Z | blue-gray | [https://swapi.co/api/films/2/, https://swapi.... | male | auburn, white | 182 | https://swapi.co/api/planets/20/ | 77 | Obi-Wan Kenobi | fair | [https://swapi.co/api/species/1/] | [https://swapi.co/api/starships/48/, https://s... | https://swapi.co/api/people/10/ | [https://swapi.co/api/vehicles/38/] |


### Example 2: CoinDesk

CoinDesk offers price information about Bitcoin and other cryptocurrencies.

https://api.coindesk.com/v1/bpi/currentprice.json

    # Request the current Bitcoin price index from CoinDesk.
    result = requests.get("https://api.coindesk.com/v1/bpi/currentprice.json")
    output = result.json()
    output

The parsed response is a nested Python dictionary:

    {'time': {'updated': 'Oct 9, 2018 01:07:00 UTC',
      'updatedISO': '2018-10-09T01:07:00+00:00',
      'updateduk': 'Oct 9, 2018 at 02:07 BST'},
     'disclaimer': 'This data was produced from the CoinDesk Bitcoin Price Index (USD). Non-USD currency data converted using hourly conversion rate from openexchangerates.org',
     'chartName': 'Bitcoin',
     'bpi': {'USD': {'code': 'USD',
       'symbol': '&#36;',
       'rate': '6,621.7200',
       'description': 'United States Dollar',
       'rate_float': 6621.72},
      'GBP': {'code': 'GBP',
       'symbol': '&pound;',
       'rate': '5,059.8019',
       'description': 'British Pound Sterling',
       'rate_float': 5059.8019},
      'EUR': {'code': 'EUR',
       'symbol': '&euro;',
       'rate': '5,763.5650',
       'description': 'Euro',
       'rate_float': 5763.565}}}

We can pull out the rate for each currency by indexing into the nested dictionaries:

    for curr in ['USD', 'GBP', 'EUR']:
        print(curr, output['bpi'][curr]['rate'])

Output:

    USD 6,621.7200
    GBP 5,059.8019
    EUR 5,763.5650
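
Notice that `rate` is a formatted string while `rate_float` is a number. The sketch below, which uses a trimmed copy of the CoinDesk values shown above rather than a live request, illustrates why that distinction matters before doing arithmetic; `parse_rate` is a hypothetical helper.

```python
# A trimmed copy of the 'bpi' section of the CoinDesk response shown above.
bpi = {
    "USD": {"rate": "6,621.7200", "rate_float": 6621.72},
    "GBP": {"rate": "5,059.8019", "rate_float": 5059.8019},
    "EUR": {"rate": "5,763.5650", "rate_float": 5763.565},
}


def parse_rate(rate):
    """Convert a formatted rate such as '6,621.7200' to a float."""
    return float(rate.replace(",", ""))


for code, entry in bpi.items():
    # The string needs cleaning before it can be used as a number;
    # the API also supplies the same value ready-made in 'rate_float'.
    parsed = parse_rate(entry["rate"])
    print(code, parsed, parsed == entry["rate_float"])
```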



### Other Public APIs

https://github.com/toddmotto/public-apis

<a id='oauth'></a>
## OAuth

---

OAuth is an authorization protocol that lets third-party applications access a user's data without exposing the user's password. (Many websites let you log in with Facebook, Google+, or Twitter under this protocol.)

Basically, there are three parties involved: **OAuth client, OAuth provider, and owner**.

- OAuth client: The application that wants to access your data.
- OAuth provider: The service that holds the account, e.g., Facebook, Twitter, or LinkedIn.
- Owner: The user with the Facebook, Twitter, etc. account.

Many APIs are free to access, but you first need to register as a developer and obtain an authorization key. Often, the key is accompanied by a temporary token that must be renewed after some time. This helps prevent abuse of the server's resources.

You can read more about it here: http://oauth.net/2/.
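
As a rough illustration of what happens once a token has been issued, many OAuth 2.0 APIs expect the client to send it in an `Authorization` header using the "Bearer" scheme. The sketch below builds such a request without sending it; the token value and the `api.example.com` URL are placeholders, and each provider's documentation describes the exact requirements.

```python
from urllib.request import Request

# Placeholder values: a real token is issued by the provider
# after the account owner approves access.
access_token = "example-access-token"
api_url = "https://api.example.com/v1/me"

# The token travels in a header, so the user's password is never sent.
request = Request(api_url, headers={"Authorization": f"Bearer {access_token}"})

print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```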


### OAuth Example: Client Work With Instagram

![](assets/oAuth1.png)

![](assets/oAuth2.png)

![](assets/oAuth3.png)

<a name="ind-practice2"></a>
## Independent Practice: Python APIs

---

**Form pairs and do the following:**  

Go to http://www.pythonforbeginners.com/api/list-of-python-apis or https://github.com/realpython/list-of-python-api-wrappers.
  
- Choose one API: What data are available with your chosen API?
- Install the Python module (if available for API) and try to extract data.
- Discuss: How could you leverage that API? How could you use the data?

<a id='closing-questions'></a>
## Closing Questions

---

![](assets/closing.png)

### What's the easiest thing to understand about APIs?

### What's the most challenging thing to understand about APIs?

### How does this contrast with scraping?

### How would you explain APIs to someone who didn't know anything about them?

### Do you have any new ideas for capstone data?