### Connecting Pandas to Data

 - Databases
 - Accessing APIs

#### Basic Process Happens in 3 Steps:

 - import database connector
 - establish connection to database
 - load a query into a pandas dataframe

#### Big Benefit of Using Database Connections:

A database connection saves you from having to load an entire file into memory. Queries, joins, and similar operations run on the database server, and only the results come back to pandas, which makes it easier to work with larger datasets.
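To make the benefit concrete, here is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for a remote server such as MySQL. The `orders` table and its rows are invented for illustration. The aggregation runs inside the database engine, so only the small summary reaches Python:

```python
import sqlite3

# In-memory SQLite database standing in for a remote database server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10.0), ("bob", 5.0), ("ann", 2.5), ("bob", 1.0), ("cat", 7.0)],
)

# The GROUP BY runs in the database; Python receives one row per customer,
# not every row of the table.
summary = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()

print(summary)
```

With a real server the same idea applies: write the filtering and aggregation in SQL and pull back only what you need.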

**Run This Line:** `conda install -c anaconda mysql-connector-python`

**Go to this website:** https://rfam.readthedocs.io/en/latest/database.html

In [3]:
import pandas as pd
import mysql.connector as sql

In [None]:
connector = sql.connect(user=username_here, password=password_here, port=port_number, database=database_name, host=host_address)

### Your Turn:  

Using the information given on the rfam website, go ahead and establish a connection in a manner analogous to the previous slide.

If you **don't** get an error message, then you did it correctly.

In [5]:
connector = sql.connect(user='rfamro', password=None, port=4497, database='Rfam', host='mysql-rfam-public.ebi.ac.uk')

`pd.read_sql_query`: the most common pandas function for reading the result of a SQL query into a DataFrame.

 - Takes two main arguments:
   - your SQL query
   - your database connection object
   
`df = pd.read_sql_query('Your_SQL_query_here', connector)`
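The three steps can be tried end to end without a network connection. The sketch below swaps in `sqlite3` and an in-memory database for the MySQL server, and the `taxonomy_demo` table is a made-up miniature of the real `taxonomy` table. It also uses the optional `params` argument of `pd.read_sql_query` to pass a value into the query. Note that the placeholder style depends on the driver: `sqlite3` uses `?`, while `mysql.connector` uses `%s`.

```python
import sqlite3
import pandas as pd

# Steps 1 and 2: import a connector and open a connection.
# sqlite3 with an in-memory database stands in for the MySQL server (assumption).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE taxonomy_demo (ncbi_id INTEGER, species TEXT)")
connection.executemany(
    "INSERT INTO taxonomy_demo VALUES (?, ?)",
    [
        (7, "Azorhizobium caulinodans"),
        (9, "Buchnera aphidicola"),
        (11, "[Cellvibrio] gilvus"),
    ],
)

# Step 3: run a query and load the result into a DataFrame.
# `params` fills the `?` placeholder, so the SQL is not built by string concatenation.
demo_df = pd.read_sql_query(
    "SELECT ncbi_id, species FROM taxonomy_demo WHERE ncbi_id < ? ORDER BY ncbi_id",
    connection,
    params=(10,),
)
connection.close()

print(demo_df)
```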

In [6]:
df = pd.read_sql_query('SELECT * FROM taxonomy', connector)

In [7]:
df.head()

| |ncbi_id|species|tax_string|tree_display_name|align_display_name|
|:---|:---|:---|:---|:---|:---|
|0|7|Azorhizobium caulinodans|Bacteria; Proteobacteria; Alphaproteobacteria;...|Azorhizobium_caulinodans|Azorhizobium_caulinodans[7]|
|1|9|Buchnera aphidicola|Bacteria; Proteobacteria; Gammaproteobacteria;...|Buchnera_aphidicola|Buchnera_aphidicola[9]|
|2|11|[Cellvibrio] gilvus|Bacteria; Actinobacteria; Actinobacteridae; Ac...|[Cellvibrio]_gilvus|[Cellvibrio]_gilvus[11]|
|3|14|Dictyoglomus thermophilum|Bacteria; Dictyoglomi; Dictyoglomales; Dictyog...|Dictyoglomus_thermophilum|Dictyoglomus_thermophilum[14]|
|4|17|Methylophilus methylotrophus|Bacteria; Proteobacteria; Betaproteobacteria; ...|Methylophilus_methylotrophus|Methylophilus_methylotr..[17]|


<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px">

# Introduction to Web Services and APIs

_Author: Dave Yerrington (SF)_

---
![](assets/opening.png)

<a id='what-is-api'></a>
## What is an API?

---

An **API (Application Programming Interface)** is a set of routines, protocols, and tools for building software applications. It specifies how software components should interact.

In programming, we also use many interfaces. The interface we use to a programming library is its function calls. For example:

- **Matplotlib `plot()`**: We don't care how the `plot()` function works. All that we expect is that calling this function (with appropriate parameters) will plot the data and return a graphic. In fact, how the function works can completely change from version to version. 

#### Web APIs

- **How does this extend to the web?** A Web API is a list of function calls that are made to remote servers. The function call is sent by encoding it as a URL (technically, as an HTTP request -- we'll discuss these later!). The function call typically returns a string of text (e.g. JSON).
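As a rough illustration of "encoding a function call as a URL", the sketch below builds a query URL from an endpoint and keyword arguments using the standard library. `build_api_url` is a hypothetical helper written for this example, and the endpoint is the Star Wars API URL used later in this lesson.

```python
from urllib.parse import urlencode


def build_api_url(base_url, **params):
    """Encode a 'function call' (endpoint plus arguments) as a URL."""
    query = urlencode(params)
    return f"{base_url}?{query}" if query else base_url


# Think of this as calling people(format="json", search="obi") on a remote server.
url = build_api_url("https://swapi.co/api/people/", format="json", search="obi")
print(url)
```

The endpoint plays the role of the function name, and the query string carries the arguments.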

<a name="api-calls"></a>
## Making API calls

---

"API call" is really a fancy term for making an _HTTP request_ (in the context of web APIs) to a server and sending or receiving structured data at an endpoint (URL). We are still communicating with URLs; however, instead of receiving a string of HTML, we receive data.

[Representational state transfer (REST)](https://spring.io/understanding/REST) is the most common architecture style for passing information to and from these API endpoints.

<a id='http'></a>
## HTTP

---

**HTTP (HyperText Transfer Protocol)** is a protocol - a system of rules - that determines how web pages (see:'hypertext') get sent (see:'transferred') from one place to another. Among other things, it defines the format of the messages passed between HTTP clients and HTTP servers.

Since the web is a service, it works through a combination of **clients** (that _make_ requests) and **servers** (that _receive_ requests).


### HTTP and web servers

All _web servers_ receive _HTTP requests_ and generate _HTTP responses_. Often web servers are just the middleman, passing HTTP requests and responses between the client and web application. Two of the most popular _HTTP or Web servers_ are [Apache](http://httpd.apache.org/) and [Nginx](http://nginx.com/), but there are many different [web servers](https://www.tutorialspoint.com/internet_technologies/web_servers.htm) out there.

<a id='web-app'></a>
## Web applications

---

Web applications are programs that run on a web server. They process the HTTP requests that the server receives and generate HTTP responses.
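To see the request and response cycle in miniature, the sketch below starts a tiny web application on your own machine with Python's built-in `http.server` and then queries it with `http.client`. The handler name `HelloHandler` and the `/hello` path are made up for this demo; a production application would sit behind a real web server such as Apache or Nginx.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """A tiny 'web application': turns each GET request into a response."""

    def do_GET(self):
        body = f"You asked for {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client side: send a request and read the response.
client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
client.request("GET", "/hello")
response = client.getresponse()
status = response.status
text = response.read().decode("utf-8")
client.close()

server.shutdown()
server.server_close()
print(status, text)
```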

![HTTP Request and Response](assets/request-response.png)

How does the server know what the request is asking for? This is specified by the URL, a special kind of path that specifies where a resource can be found on the web.

![URL](./assets/http1-url-structure.png)
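The parts of a URL can also be inspected in code. This short sketch uses `urllib.parse` from the standard library to split the Google Geocode URL that appears later in this lesson into its components:

```python
from urllib.parse import parse_qs, urlsplit

url = "https://maps.googleapis.com/maps/api/geocode/json?address=225+Bush+St+San+Francisco+CA"
parts = urlsplit(url)

print("scheme:", parts.scheme)  # the protocol
print("host:  ", parts.netloc)  # which server to contact
print("path:  ", parts.path)    # which resource on that server
print("query: ", parts.query)   # the raw query string

# parse_qs decodes the query string into a dict of lists ('+' becomes a space).
query = parse_qs(parts.query)
print(query)
```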

<a name="demo-http"></a>
## Demo: HTTP

* In Chrome, open up Chrome Inspector (*command + option + 'i', or ctrl + click and select 'inspect element'*).
* Select the Network tab. It should look something like this:

![Chrome Inspector](./assets/chrome_inspector.png)

* Next, go to the URL https://generalassemb.ly/

<a id='ind-http'></a>

## Independent practice: HTTP

---


### Go to your favorite website(s) (safe for work) and inspect the protocol in the Chrome network inspection tool (cmd-opt-i)

#### Research and Explain:
 - 200 response status
 - 202 response status
 - 204 response status
 - 302 response status
 - 404 response status

<a id='response-types'></a>
### Response types overview

|Code|Reason|
|:---|:-----|
|200| OK|
|301| Moved Permanently|
|302| Found (formerly "Moved Temporarily")|
|307| Temporary Redirect|
|400| Bad Request|
|403| Forbidden|
|404| Not Found|
|500| Internal Server Error|
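Python ships these codes in the standard library as `http.HTTPStatus`, which is handy for looking up a reason phrase. The sketch below also groups each code by its first digit; `describe` is a helper name made up for this example.

```python
from http import HTTPStatus

# The first digit of a status code identifies its family.
FAMILIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return the code, its standard reason phrase, and its family."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase} ({FAMILIES[status.value // 100]})"


for code in (200, 301, 302, 404, 500):
    print(describe(code))
```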

<a id='http-request'></a>
## HTTP Request

---

First, notice that an HTTP/1.x request is ultimately plain text, so it is human-readable! This is convenient, but it makes requests larger than a binary encoding would.

The first word in the request line, _GET_, is the **HTTP method**.

![HTTP Request](./assets/http_request.jpeg "HTTP Request")
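Because an HTTP/1.x request is plain text, it can be taken apart with ordinary string operations. The sketch below builds a small request by hand and splits it into the request line and headers. The path `/index.html` and the `Accept` header are illustrative values, not taken from the screenshot above.

```python
# A hand-written HTTP/1.1 request. Lines end with CRLF, and a blank line
# separates the headers from the (here empty) body.
raw_request = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: generalassemb.ly\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)

head, _, body = raw_request.partition("\r\n\r\n")
request_line, *header_lines = head.split("\r\n")

# The request line has three space-separated parts.
method, path, version = request_line.split(" ")
headers = dict(line.split(": ", 1) for line in header_lines)

print(method, path, version)
print(headers)
```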

<a id='request-methods'></a>
### HTTP Request methods:

* **`GET`** => Retrieve a resource.  
* **`POST`** => Create a resource.  
* **`PATCH`** (_or **`PUT`**, which replaces the whole resource; **`PATCH`** is recommended for partial updates_) => Update an existing resource.  
* **`DELETE`** => Delete a resource.  
* **`HEAD`** => Retrieve the headers for a resource.

Of these, **`GET`** and **`POST`** are the most widely used.
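The method is just one field of the request. As a small sketch, the standard library's `urllib.request.Request` lets you construct requests with different methods and inspect them without sending anything over the network. The URL below is a placeholder, not a real API.

```python
from urllib.request import Request

URL = "https://example.com/api/items/1"  # placeholder endpoint (assumption)

read = Request(URL)                                      # no body: defaults to GET
create = Request(URL, data=b'{"name": "demo"}')          # body, no method: defaults to POST
update = Request(URL, data=b'{"name": "new"}', method="PATCH")
remove = Request(URL, method="DELETE")
headers_only = Request(URL, method="HEAD")

# Nothing has been sent; we only inspect which method each request would use.
methods = [r.get_method() for r in (read, create, update, remove, headers_only)]
print(methods)
```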

<a name="json"></a>
## JSON

---

JSON is short for _JavaScript Object Notation_. It is a way to store information in an organized, easy-to-access manner: a human-readable collection of data whose values we can look up by name or by position.

It is very important to realize that JSON is just plain text that you can edit in a text editor. It is called JSON because a JSON string is valid JavaScript code. Because of this, JSON has become very popular with JavaScript programmers for talking with APIs.

**JSON is built on two structures:**
* A collection of name/value pairs. In various languages, this is realized as an object, record, structure, dictionary, hash table, keyed list, or associative array.
* An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence.

These are universal data structures. Virtually all modern programming languages support them in one form or another. It makes sense that a data format that is interchangeable with programming languages also be based on these structures.
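In Python, the standard `json` module maps these two structures onto `dict` and `list`. A short sketch, using an abbreviated, made-up record:

```python
import json

# JSON text: an object (name/value pairs) containing an array (ordered list).
text = '{"name": "Obi-Wan Kenobi", "height": "182", "films": ["film-1", "film-2"], "next": null}'

record = json.loads(text)

print(type(record).__name__)           # JSON object becomes a Python dict
print(type(record["films"]).__name__)  # JSON array becomes a Python list
print(record["next"])                  # JSON null becomes Python None

# json.dumps goes the other way, from Python objects back to JSON text.
round_trip = json.dumps(record)
print(round_trip)
```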

<a name="ind-practice"></a>
## Validating JSON 

---

JSON is very simple to use if it is correctly structured. One resource for validating JSON and checking its syntax is [JSON Viewer](http://codebeautify.org/jsonviewer).

For this exercise, copy the [JSON data from the code folder](./code/test.json) and insert it in the web app above. Then, click "Beautify".

### Core Concept Check

 - What does this remind you of?
 - How is it different from what we'd normally see in Python?
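The same kind of validation can be done in Python. The sketch below wraps `json.loads` in a hypothetical helper, `is_valid_json`, and tries it on one valid string and two near misses: a Python-style dictionary printout and a trailing comma.

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as JSON, otherwise report why not."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        print(f"Invalid JSON: {error.msg} (line {error.lineno}, column {error.colno})")
        return False
    return True


good = '{"count": 1, "next": null}'
python_style = "{'count': 1, 'next': None}"  # single quotes and None are Python, not JSON
trailing_comma = '{"count": 1,}'             # JSON does not allow a trailing comma

results = [is_valid_json(s) for s in (good, python_style, trailing_comma)]
print(results)
```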

<a name="guided-practice"></a>
## Guided practice: Pulling data from APIs

---

Recall that an API defines the methods and data formats used to "talk" to a system. We will walk through a couple of examples.

### The Requests Library

 - Extremely useful tool for accessing data from the web
 - Initiates a `GET` request very easily over HTTP
 - Allows you to easily access information from a web page
 - Handy for retrieving data from APIs or scraping the web

In [8]:
import requests

URL = 'https://generalassemb.ly'

req = requests.get(URL)

In [9]:
# this is the Response object returned by our request
req

<Response [200]>

In [11]:
# we can check its status code
req.status_code

200

In [13]:
# and the encoding used in the content of the response
req.encoding

'utf-8'

In [14]:
# we can also get the content of the response as text
req.text



In [16]:
# this response does NOT contain JSON (it is an HTML page), so .json() raises an error
req.json()

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

<a id='ex1-star-wars'></a>
### Further Example: Star Wars

The Star Wars API (SWAPI) is a large collection of data about Star Wars. It can be browsed at the address: http://swapi.co/.

**Let's try, for example, to retrieve the data about Obi-Wan Kenobi.**

Referencing the [SWAPI Documentation](http://swapi.co/documentation), let's create a query to search for characters with "obi" in their name.

**In a browser, paste:**

    https://swapi.co/api/people/?format=json&search=obi


In [17]:
# Request example for the SWAPI example
import pandas as pd
import requests

result = requests.get("https://swapi.co/api/people/?format=json&search=obi")
result.json()

{'count': 1,
 'next': None,
 'previous': None,
 'results': [{'name': 'Obi-Wan Kenobi',
   'height': '182',
   'mass': '77',
   'hair_color': 'auburn, white',
   'skin_color': 'fair',
   'eye_color': 'blue-gray',
   'birth_year': '57BBY',
   'gender': 'male',
   'homeworld': 'https://swapi.co/api/planets/20/',
   'films': ['https://swapi.co/api/films/2/',
    'https://swapi.co/api/films/5/',
    'https://swapi.co/api/films/4/',
    'https://swapi.co/api/films/6/',
    'https://swapi.co/api/films/3/',
    'https://swapi.co/api/films/1/'],
   'species': ['https://swapi.co/api/species/1/'],
   'vehicles': ['https://swapi.co/api/vehicles/38/'],
   'starships': ['https://swapi.co/api/starships/48/',
    'https://swapi.co/api/starships/59/',
    'https://swapi.co/api/starships/64/',
    'https://swapi.co/api/starships/65/',
    'https://swapi.co/api/starships/74/'],
   'created': '2014-12-10T16:16:29.192000Z',
   'edited': '2014-12-20T21:17:50.325000Z',
   'url': 'https://swapi.co/api/people/

In [31]:
# this gives us access to the appropriate data that we need
result.json()['results']

[{'name': 'Obi-Wan Kenobi',
  'height': '182',
  'mass': '77',
  'hair_color': 'auburn, white',
  'skin_color': 'fair',
  'eye_color': 'blue-gray',
  'birth_year': '57BBY',
  'gender': 'male',
  'homeworld': 'https://swapi.co/api/planets/20/',
  'films': ['https://swapi.co/api/films/2/',
   'https://swapi.co/api/films/5/',
   'https://swapi.co/api/films/4/',
   'https://swapi.co/api/films/6/',
   'https://swapi.co/api/films/3/',
   'https://swapi.co/api/films/1/'],
  'species': ['https://swapi.co/api/species/1/'],
  'vehicles': ['https://swapi.co/api/vehicles/38/'],
  'starships': ['https://swapi.co/api/starships/48/',
   'https://swapi.co/api/starships/59/',
   'https://swapi.co/api/starships/64/',
   'https://swapi.co/api/starships/65/',
   'https://swapi.co/api/starships/74/'],
  'created': '2014-12-10T16:16:29.192000Z',
  'edited': '2014-12-20T21:17:50.325000Z',
  'url': 'https://swapi.co/api/people/10/'}]

In [32]:
df = pd.DataFrame(result.json()['results'])
df.head()

| |birth_year|created|edited|eye_color|films|gender|hair_color|height|homeworld|mass|name|skin_color|species|starships|url|vehicles|
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
|0|57BBY|2014-12-10T16:16:29.192000Z|2014-12-20T21:17:50.325000Z|blue-gray|[https://swapi.co/api/films/2/, https://swapi....|male|auburn, white|182|https://swapi.co/api/planets/20/|77|Obi-Wan Kenobi|fair|[https://swapi.co/api/species/1/]|[https://swapi.co/api/starships/48/, https://s...|https://swapi.co/api/people/10/|[https://swapi.co/api/vehicles/38/]|
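The `count`, `next`, and `previous` fields in the response suggest that larger result sets are split across pages. Assuming `next` holds the URL of the following page (or `None` on the last one), a loop like the one below could collect every record. To keep the sketch runnable offline, `fetch_json` reads from a hypothetical `FAKE_PAGES` dictionary with placeholder URLs; in practice it would be something like `requests.get(url).json()`.

```python
# Hypothetical paginated responses keyed by URL, shaped like the SWAPI output
# (`count`, `next`, `previous`, `results`). URLs and names are placeholders.
FAKE_PAGES = {
    "https://api.example/people/?page=1": {
        "count": 3,
        "next": "https://api.example/people/?page=2",
        "previous": None,
        "results": [{"name": "A"}, {"name": "B"}],
    },
    "https://api.example/people/?page=2": {
        "count": 3,
        "next": None,
        "previous": "https://api.example/people/?page=1",
        "results": [{"name": "C"}],
    },
}


def fetch_json(url):
    """Stand-in for requests.get(url).json(); reads FAKE_PAGES, not the network."""
    return FAKE_PAGES[url]


def fetch_all_results(first_url, fetch=fetch_json):
    """Follow `next` links until there are none, collecting every record."""
    records, url = [], first_url
    while url is not None:
        page = fetch(url)
        records.extend(page["results"])
        url = page["next"]
    return records


people = fetch_all_results("https://api.example/people/?page=1")
print(people)
```

The combined list can then be passed to `pd.DataFrame(people)` in the same way as a single page of results.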


<a id='ex2-geocode'></a>
### Example 2: Google Geocode API

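Google offers an API for querying its geographic databases. One of the many features the Google Maps API provides is geocoding: getting latitude and longitude coordinates from an address. Current versions of the API require an API key passed as a `key` query parameter, so the keyless URL below may return an error status instead of the output shown.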

**Try pasting the following line in your browser:**

    https://maps.googleapis.com/maps/api/geocode/json?address=225+Bush+St+San+Francisco+CA

In [2]:
# Request the resource from google maps
result = requests.get("https://maps.googleapis.com/maps/api/geocode/json?address=225+Bush+St+San+Francisco+CA")
google_result = result.json()

# Loop through results and display the lat, lng values for the geocoded address
for item in google_result['results']:
    print(item['geometry']['location'])

{'lat': 37.790841, 'lng': -122.4012802}
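When pulling coordinates out of a response like this, it helps to check that the request succeeded before indexing into `results`. The sketch below assumes the response carries a top-level `status` field with values such as `"OK"` or `"REQUEST_DENIED"`. The sample dictionaries are hand-written stand-ins for `result.json()`, and `extract_coordinates` is a hypothetical helper.

```python
def extract_coordinates(geocode_response):
    """Return a list of (lat, lng) tuples, or an empty list if the request failed."""
    if geocode_response.get("status") != "OK":
        return []
    return [
        (item["geometry"]["location"]["lat"], item["geometry"]["location"]["lng"])
        for item in geocode_response.get("results", [])
    ]


# Hand-written stand-ins for result.json() (assumed response shape).
sample_ok = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 37.790841, "lng": -122.4012802}}}],
}
sample_denied = {"status": "REQUEST_DENIED", "results": []}

coords_ok = extract_coordinates(sample_ok)
coords_denied = extract_coordinates(sample_denied)
print(coords_ok)
print(coords_denied)
```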


<a id='oauth'></a>
## OAuth

---

**OAuth (Open Authorization)** is an authorization protocol that lets a third-party application access a user's data without exposing the user's password (e.g., "Login with Facebook", Google+, or Twitter on many websites). 

For example, before your application can access a user's Facebook data, the user must give your application consent first. As you can imagine, this is a more complex process than before since it involves the user.

There are three parties involved: the **OAuth Client**, the **OAuth Provider**, and the **Owner**.

- OAuth Client: the application that wants access to your data
- OAuth Provider: the service that holds the account (e.g., Facebook, Twitter)
- Owner: the user who holds the Facebook or Twitter account

There are various levels of OAuth access. For security, some APIs (such as the Yelp API) require OAuth access despite there not being any conventional users. These sites require a less complicated OAuth procedure.
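Whichever procedure is used, the usual end result in OAuth 2.0 is an access token that the client attaches to each API request, commonly in an `Authorization: Bearer ...` header. The sketch below only builds such a request with the standard library and does not send it. The token and URL are placeholders, and how the token is obtained depends on the provider.

```python
from urllib.request import Request

ACCESS_TOKEN = "example-token-123"         # placeholder, not a real credential
API_URL = "https://api.example.com/v1/me"  # hypothetical protected endpoint

# Attach the token as a Bearer credential; nothing is sent over the network here.
request = Request(API_URL, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})

print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```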

### Implementing OAuth in Python

There are a number of OAuth libraries available in Python. If you are using a popular website such as Facebook, it is recommended to use their official Python library to handle the OAuth. If not, it is possible to code your own OAuth procedure. Because OAuth requires back-and-forth with your web application, discussing implementation details is outside of the scope of this lesson.

You can read more about OAuth here: http://oauth.net/2/

### OAuth Example: Client Working with Instagram

![](assets/oAuth1.png)

![](assets/oAuth2.png)

![](assets/oAuth3.png)

<a name="ind-practice2"></a>
## Independent Practice: Python APIs

---

**Form pairs and do the following:** 

Go to http://www.pythonforbeginners.com/api/list-of-python-apis  
or   
https://github.com/realpython/list-of-python-api-wrappers
  
  
- Choose an API: what data is available with your chosen API?
- Install the Python module (if one is available for the API) and try to extract data.
- Discuss: How could you leverage that API? How could you use the data?