# Week 6 Discussion

## Infographic

* [Can't Trust Nutrition](http://fivethirtyeight.com/features/you-cant-trust-what-you-read-about-nutrition/)

## Links

## Notes

### Hypertext Transfer Protocol

The hypertext transfer protocol (HTTP) is a set of rules for communication on the web.

For example, your web browser (Firefox, Chrome, Edge, ...) uses HTTP every time you visit a web page. The browser makes a _request_ to the server for the page, and if nothing goes wrong, the server responds with the page.
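
To make the request/response exchange concrete, here is a minimal sketch of what a GET request looks like as plain text, and how the first line of a server's response can be read. The helper names `build_get_request` and `parse_status_line` and the host `example.com` are illustrative assumptions, not part of any library.

```python
def build_get_request(host, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # the Host header is required in HTTP/1.1
        "Connection: close",
        "",                      # an empty line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts."""
    version, code, reason = line.strip().split(" ", 2)
    return version, int(code), reason


print(build_get_request("example.com"))
print(parse_status_line("HTTP/1.1 200 OK"))
```

In practice a browser or a library such as `requests` builds and parses these messages for you.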

Several [different kinds of HTTP requests](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods) are possible. Think of these as the different "verbs" you can use when communicating in HTTP.

The `requests` package has functions for making HTTP requests from Python and is [well-documented](http://docs.python-requests.org/en/master/). Even [the Python documentation for urllib](https://docs.python.org/2/library/urllib.html) recommends using `requests` (rather than `urllib`) for most tasks.

In [26]:
import requests

response = requests.get("https://jsharpna.github.io/")

In [2]:
response.status_code

200

In [3]:
response.raise_for_status() # throw an error/exception if status code is 4xx or 5xx
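
Status codes are grouped by their first digit, which is why `raise_for_status()` treats 4xx and 5xx as errors. The sketch below uses the standard library's `http.HTTPStatus` to look up the usual phrase for a code; `describe_status` is a hypothetical helper, and it assumes the code is one that `HTTPStatus` knows about (unknown codes raise `ValueError`).

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short description of a standard HTTP status code."""
    phrase = HTTPStatus(code).phrase  # e.g. "OK" or "Not Found"
    if 200 <= code < 300:
        kind = "success"
    elif 300 <= code < 400:
        kind = "redirection"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "informational"
    return f"{code} {phrase} ({kind})"


for code in (200, 404, 503):
    print(describe_status(code))
```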

In [5]:
response.text

# Lots of < usually means HTML/XML
# Lots of [, { usually means JSON

'<!DOCTYPE html>\n<!-- This page was generated by GitHub Pages using the Cayman theme by Jason Long. -->\n<html lang="en-us">\n  <head>\n    <meta charset="UTF-8">\n    <title>Sharpnack Lab</title>\n    <meta name="viewport" content="width=device-width, initial-scale=1">\n    <link rel="stylesheet" type="text/css" href="stylesheets/normalize.css" media="screen">\n    <link href=\'https://fonts.googleapis.com/css?family=Open+Sans:400,700\' rel=\'stylesheet\' type=\'text/css\'>\n    <link rel="stylesheet" type="text/css" href="stylesheets/stylesheet.css" media="screen">\n    <link rel="stylesheet" type="text/css" href="stylesheets/github-light.css" media="screen">\n        <style type="text/css">\n      .page-header {\n      background-image:url("stylesheets/waterwise.jpg");\n      color: #fff;\n      text-shadow: 0px 0px 4px #ccccff;\n      }\n    </style>\n  </head>\n  <body>\n    <section class="page-header">\n      <h1 class="project-name">UC Davis Sharpnack Lab</h1>\n      <h2 class
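
The rule of thumb in the comments above can be turned into a rough check. This is only a heuristic sketch, and `guess_format` is a hypothetical helper. When you have the full response, its `Content-Type` header (available as `response.headers["Content-Type"]`) is usually a more reliable signal.

```python
import json


def guess_format(text):
    """Guess whether a response body looks like HTML/XML or JSON."""
    stripped = text.lstrip()  # ignore leading whitespace and newlines
    if stripped.startswith("<"):
        return "html/xml"
    if stripped.startswith(("{", "[")):
        try:
            json.loads(stripped)  # confirm that it really parses as JSON
        except ValueError:
            return "unknown"
        return "json"
    return "unknown"


print(guess_format('<!DOCTYPE html>\n<html lang="en-us">'))
print(guess_format('\n\n\n{"resultCount": 0, "results": []}'))
```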

### Web APIs

An _application programming interface_ (API) is a set of functions and data structures for communicating with other software. For instance, whenever you use a Python package, you're using the API created by the package's developers.

Web sites sometimes provide an API so that programmers can access content without web scraping. In a web API, each function has a different URL. Sometimes these functions/URLs are called _endpoints_. Arguments can be passed to the function by appending a query string to the URL: a `?` followed by `PARAMETER=VALUE` pairs separated by `&`.
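
As a small illustration of how arguments end up in the URL, the sketch below builds a query string with the standard library. The endpoint `https://api.example.com/search` and the parameters `term` and `limit` are made up for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base_url = "https://api.example.com/search"       # hypothetical endpoint
params = {"term": "rocky balboa", "limit": 5}     # hypothetical arguments

# urlencode escapes special characters (the space becomes "+")
url = base_url + "?" + urlencode(params)
print(url)

# Going the other way: recover the arguments from the URL
print(parse_qs(urlsplit(url).query))
```

The `params` argument of `requests.get` does this encoding for you, so you rarely need to build the query string by hand.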

The most popular kind of web API is a _representational state transfer_ (REST) API. In a REST API, the server handles each function call independently of the others, so every request must include all the information the server needs to answer it.
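
One way to picture this independence is a handler whose answer depends only on the parameters of the current request. Everything below (the catalog, the `offset` and `limit` parameters, and `handle_request`) is hypothetical: a toy sketch, not a real server.

```python
CATALOG = [f"item-{i}" for i in range(10)]  # hypothetical data held by the server


def handle_request(params):
    """Answer one request using only the parameters sent with it."""
    offset = int(params.get("offset", 0))
    limit = int(params.get("limit", 3))
    page = CATALOG[offset:offset + limit]
    return {"resultCount": len(page), "results": page}


# The server does not remember which page was requested last,
# so the client states the position it wants every time.
first = handle_request({"offset": 0, "limit": 3})
second = handle_request({"offset": 3, "limit": 3})
print(first["results"])
print(second["results"])
```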

We can call a function in a web API by making an HTTP request.

### Example: iTunes Web API

Apple provides a web API for getting information about media on iTunes, with documentation [here](https://affiliate.itunes.apple.com/resources/documentation/itunes-store-web-service-search-api/). There is only one function in the API:

```
https://itunes.apple.com/search
```

According to the documentation, a GET request to this URL searches iTunes for content that matches a search term, such as an artist's name.

Let's write a Python function that acts as a wrapper for the search function.

In [6]:
def search_itunes(term, country):
    response = requests.get("https://itunes.apple.com/search", params = {
        "term": term,
        "country": country
    })
    response.raise_for_status()
    
    # ...
    
    return response

In [24]:
response = search_itunes("rocky", "us") # term and country are assumed from the output shown
response.text

'\n\n\n{\n "resultCount":50,\n "results": [\n{"wrapperType":"track", "kind":"song", "artistId":921917, "collectionId":991183728, "trackId":991184319, "artistName":"Bill Conti", "collectionName":"Rocky (Original Motion Picture Score)", "trackName":"Gonna Fly Now", "collectionCensoredName":"Rocky (Original Motion Picture Score)", "trackCensoredName":"Gonna Fly Now (Theme From \\"Rocky\\")", "artistViewUrl":"https://itunes.apple.com/us/artist/bill-conti/921917?uo=4", "collectionViewUrl":"https://itunes.apple.com/us/album/gonna-fly-now-theme-from-rocky/991183728?i=991184319&uo=4", "trackViewUrl":"https://itunes.apple.com/us/album/gonna-fly-now-theme-from-rocky/991183728?i=991184319&uo=4", \n"previewUrl":"https://audio-ssl.itunes.apple.com/apple-assets-us-std-000001/Music7/v4/78/f5/20/78f52024-5931-93be-6e0e-ed65e2088ce9/mzaf_3023930814535994088.plus.aac.p.m4a", "artworkUrl30":"http://is2.mzstatic.com/image/thumb/Music5/v4/8b/09/fb/8b09fb7f-de9b-1ee8-45c1-f7ace985f360/source/30x30bb.jpg", "

In [11]:
x = response.json()

In [15]:
x.keys()

dict_keys(['resultCount', 'results'])

In [19]:
{"hello": 41, "goodbye": 32}.keys()

dict_keys(['hello', 'goodbye'])
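
Calling `response.json()` parses the body of the response as JSON, much like calling `json.loads` on `response.text`. The sketch below does the same thing offline on a hand-trimmed sample with the same shape as the output above (one result, a few fields only).

```python
import json

# A trimmed sample shaped like the iTunes response (not the full output)
sample_text = """
{
 "resultCount": 1,
 "results": [
  {"wrapperType": "track", "kind": "song",
   "artistName": "Bill Conti", "trackName": "Gonna Fly Now"}
 ]
}
"""

data = json.loads(sample_text)  # JSON objects become dicts, arrays become lists
print(data.keys())

for item in data["results"]:
    print(item["artistName"], "-", item["trackName"])
```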

Most of the time, it's a good strategy to "wrap" each web API function with a Python function. This lets you use the web API as if it were just another Python module.
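
As one possible way to finish such a wrapper, the sketch below returns the parsed list of results instead of the raw response. The name `search_itunes_results`, the `timeout` value, and the `http_get` argument are assumptions added for illustration. `http_get` exists only so the function can be tried without a network connection, and by default the function uses `requests.get`.

```python
def search_itunes_results(term, country="us", http_get=None):
    """Search iTunes and return the list of result records."""
    if http_get is None:
        import requests  # third-party package, imported only when needed
        http_get = requests.get

    response = http_get(
        "https://itunes.apple.com/search",
        params={"term": term, "country": country},
        timeout=10,  # assumed value: give up if the server takes too long
    )
    response.raise_for_status()  # raise an exception on 4xx or 5xx
    return response.json()["results"]


# Typical use (makes a real request, so it is left as a comment):
# results = search_itunes_results("rocky", "us")
```

Returning plain Python data keeps the HTTP details inside the wrapper, so the rest of your code never touches the response object.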

If you're using the web API to get data, the next step is to convert the data to a Pandas data frame:

In [23]:
import pandas as pd

pd.DataFrame(x["results"])

,artistId,artistName,artistViewUrl,artworkUrl100,artworkUrl30,artworkUrl60,collectionArtistId,collectionArtistName,collectionArtistViewUrl,collectionCensoredName,...,trackHdPrice,trackHdRentalPrice,trackId,trackName,trackNumber,trackPrice,trackRentalPrice,trackTimeMillis,trackViewUrl,wrapperType
0,921917.0,Bill Conti,https://itunes.apple.com/us/artist/bill-conti/...,http://is2.mzstatic.com/image/thumb/Music5/v4/...,http://is2.mzstatic.com/image/thumb/Music5/v4/...,http://is2.mzstatic.com/image/thumb/Music5/v4/...,,,,Rocky (Original Motion Picture Score),...,,,991184319,Gonna Fly Now,1.0,1.29,,167533,https://itunes.apple.com/us/album/gonna-fly-no...,track
1,481488005.0,A$AP Rocky,https://itunes.apple.com/us/artist/a%24ap-rock...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,,,,LONG.LIVE.A$AP (Deluxe Version),...,,,581997309,"F**kin' Problems (feat. Drake, 2 Chainz & Kend...",7.0,1.29,,233787,https://itunes.apple.com/us/album/f-kin-proble...,track
2,3713599.0,Survivor,https://itunes.apple.com/us/artist/survivor/37...,http://is3.mzstatic.com/image/thumb/Music/v4/d...,http://is3.mzstatic.com/image/thumb/Music/v4/d...,http://is3.mzstatic.com/image/thumb/Music/v4/d...,885401.0,Various Artists,,Rocky IV (Original Motion Picture Soundtrack),...,,,207445080,Eye of the Tiger,4.0,1.29,,245640,https://itunes.apple.com/us/album/eye-of-the-t...,track
3,481488005.0,A$AP Rocky,https://itunes.apple.com/us/artist/a%24ap-rock...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,,,,LONG.LIVE.A$AP (Deluxe Version),...,,,581997310,Wild for the Night (feat. Skrillex & Birdy Nam...,8.0,1.29,,212640,https://itunes.apple.com/us/album/wild-for-the...,track
4,,Akiva Schaffer & Jorma Taccone,,http://is5.mzstatic.com/image/thumb/Video20/v4...,http://is5.mzstatic.com/image/thumb/Video20/v4...,http://is5.mzstatic.com/image/thumb/Video20/v4...,,,,,...,14.99,,1113248686,Popstar: Never Stop Never Stopping,,9.99,,5190495,https://itunes.apple.com/us/movie/popstar-neve...,track
5,481488005.0,A$AP Rocky,https://itunes.apple.com/us/artist/a%24ap-rock...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,,,,LONG.LIVE.A$AP (Deluxe Version),...,,,581997270,Goldie,2.0,1.29,,192067,https://itunes.apple.com/us/album/goldie/58199...,track
6,,John G. Avildsen,,http://is4.mzstatic.com/image/thumb/Video69/v4...,http://is4.mzstatic.com/image/thumb/Video69/v4...,http://is4.mzstatic.com/image/thumb/Video69/v4...,84755883.0,,https://itunes.apple.com/us/artist/mgm/8475588...,Rocky Heavyweight Collection,...,14.99,3.99,219214592,Rocky,1.0,14.99,3.99,7179928,https://itunes.apple.com/us/movie/rocky/id2192...,track
7,,Rick Famuyiwa,,http://is3.mzstatic.com/image/thumb/Video2/v4/...,http://is3.mzstatic.com/image/thumb/Video2/v4/...,http://is3.mzstatic.com/image/thumb/Video2/v4/...,,,,,...,14.99,3.99,999573394,Dope,,9.99,3.99,6193280,https://itunes.apple.com/us/movie/dope/id99957...,track
8,481488005.0,A$AP Rocky,https://itunes.apple.com/us/artist/a%24ap-rock...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,http://is2.mzstatic.com/image/thumb/Music/v4/8...,,,,LONG.LIVE.A$AP (Deluxe Version),...,,,581997146,Long Live a$AP,1.0,1.29,,289587,https://itunes.apple.com/us/album/long-live-a%...,track
9,481488005.0,A$AP Rocky,https://itunes.apple.com/us/artist/a%24ap-rock...,http://is2.mzstatic.com/image/thumb/Music1/v4/...,http://is2.mzstatic.com/image/thumb/Music1/v4/...,http://is2.mzstatic.com/image/thumb/Music1/v4/...,,,,AT.LONG.LAST.A$AP,...,,,994727382,Everyday (feat. Rod Stewart x Miguel x Mark Ro...,17.0,1.29,,260991,https://itunes.apple.com/us/album/everyday-fea...,track
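
Not every record has every field (the movie rows above have no `artistId`, for example), and the full table has many columns. A minimal sketch of one way to keep only the fields you care about before building the data frame is shown below. `select_fields` is a hypothetical helper, and the two records are trimmed by hand from the output above.

```python
def select_fields(records, fields):
    """Keep only the chosen fields, using None where a record lacks one."""
    return [{field: record.get(field) for field in fields} for record in records]


records = [
    {"wrapperType": "track", "artistId": 921917, "artistName": "Bill Conti",
     "trackName": "Gonna Fly Now", "trackPrice": 1.29},
    {"wrapperType": "track", "artistName": "John G. Avildsen",
     "trackName": "Rocky", "trackPrice": 14.99},  # a movie: no artistId
]

rows = select_fields(records, ["artistId", "artistName", "trackName", "trackPrice"])
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame` then gives a table with just those four columns, with a missing value where a record had no `artistId`.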


### Being Polite

Slow down requests (in loops):

In [None]:
import time

for i in range(1000):
    response = requests.get("https://jsharpna.github.io/")
    time.sleep(0.5) # wait for 0.5 sec
    
# Generally you shouldn't make more than 20-30 requests / sec (based on Google's rules)
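
The same idea can be packaged as a reusable helper. This is a sketch under stated assumptions: `fetch_politely` is a hypothetical name, `fetch` is any function that takes a URL (for example `requests.get`), and the `sleep` argument exists only so the pause can be replaced when trying the code out.

```python
import time


def fetch_politely(urls, fetch, delay=0.5, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive calls."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # wait between requests, but not before the first one
        results.append(fetch(url))
    return results


# Demonstration without any network access: a stand-in fetch function
# and a stand-in sleep that records the pauses instead of waiting.
pauses = []
pages = fetch_politely(["a", "b", "c"], fetch=str.upper, delay=0.5, sleep=pauses.append)
print(pages)
print(pauses)
```

With real requests you would call something like `fetch_politely(urls, requests.get)` and leave `sleep` at its default.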

Avoid requesting things you've already requested:

In [25]:
import requests_cache # pip install requests_cache

requests_cache.install_cache("mycache")
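
To show the idea behind caching, here is a minimal in-memory sketch that remembers each result by URL. It is not how `requests_cache` is implemented. `make_cached` and `slow_fetch` are hypothetical names, and unlike a cache stored on disk, this one is lost when the Python session ends.

```python
def make_cached(fetch):
    """Wrap fetch so that each URL is only fetched once."""
    cache = {}

    def cached_fetch(url):
        if url not in cache:
            cache[url] = fetch(url)  # first time: do the real work
        return cache[url]            # later calls reuse the stored result

    return cached_fetch


call_count = 0


def slow_fetch(url):
    """Stand-in for a real request; counts how often it is called."""
    global call_count
    call_count += 1
    return f"contents of {url}"


fetch = make_cached(slow_fetch)
fetch("https://example.com/a")
fetch("https://example.com/a")  # served from the cache
fetch("https://example.com/b")
print(call_count)
```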