# Interacting with APIs to import data from the web


## Introduction to APIs and JSONs

- APIs
    - Application Programming Interface 
    - Protocols and routines
        - Building and interacting with software applications
- JSONs
    - JavaScript Object Notation
    - Human readable
    - JSONs consist of key-value pairs.
    - The JSON file format arose out of a growing need for real-time server-to-browser communication.
    - The function `json.load()` reads JSON from an open file object; a top-level JSON object is loaded into Python as a dictionary.

- Loading JSONs in Python
        In [1]: import json
        In [2]: with open('snakes.json', 'r') as json_file:
                    json_data = json.load(json_file)
        In [3]: type(json_data)
        Out[3]: dict
- Exploring JSONs in Python
        In [4]: for key, value in json_data.items():
                    print(key + ':', value)
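
As a minimal sketch of the points above, the snippet below parses a small JSON string with `json.loads()`, the string counterpart of `json.load()`. The sample text and its keys are made up for illustration and are not the contents of `snakes.json`.

```python
import json

# Hypothetical JSON text; NOT the real contents of snakes.json
sample_text = '{"name": "python", "length_m": 3.5, "venomous": false}'

# json.loads() parses a string; json.load() does the same for an open file
sample_data = json.loads(sample_text)

# A top-level JSON object becomes a Python dict
print(type(sample_data))

# JSON types map to Python types (false -> False, number -> float)
for key, value in sample_data.items():
    print(key + ':', value)
```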

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import jupyterthemes.jtplot as jtplot
%matplotlib inline
jtplot.style(theme='onedork')

### Loading and exploring a JSON

In [2]:
import json
# Load JSON: json_data
with open("exercise/a_movie.json") as json_file:
    json_data = json.load(json_file)
print(type(json_data))
# Print each key-value pair in json_data
for k,v in json_data.items():
    print(k + ': ', v)

<class 'dict'>
Title:  The Social Network
Year:  2010
Rated:  PG-13
Released:  01 Oct 2010
Runtime:  120 min
Genre:  Biography, Drama
Director:  David Fincher
Writer:  Aaron Sorkin (screenplay), Ben Mezrich (book)
Actors:  Jesse Eisenberg, Rooney Mara, Bryan Barter, Dustin Fitzsimons
Plot:  As Harvard student Mark Zuckerberg creates the social networking site that would become known as Facebook, he is sued by the twins who claimed he stole their idea, and by the co-founder who was later squeezed out of the business.
Language:  English, French
Country:  USA
Awards:  Won 3 Oscars. Another 165 wins & 168 nominations.
Poster:  https://m.media-amazon.com/images/M/MV5BOGUyZDUxZjEtMmIzMC00MzlmLTg4MGItZWJmMzBhZjE0Mjc1XkEyXkFqcGdeQXVyMTMxODk2OTU@._V1_SX300.jpg
Ratings:  [{'Source': 'Internet Movie Database', 'Value': '7.7/10'}, {'Source': 'Rotten Tomatoes', 'Value': '96%'}, {'Source': 'Metacritic', 'Value': '95/100'}]
Metascore:  95
imdbRating:  7.7
imdbVotes:  574,061
imdbID:  tt1285016
Type: 
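
The `Ratings` value in the output above is a list of dictionaries rather than a plain string, so it can be explored one level further. The sketch below uses a hand-copied subset of that output instead of the file itself; the name `ratings_by_source` is introduced here only for illustration.

```python
# Hand-copied subset of the dictionary printed above
movie = {
    "Title": "The Social Network",
    "Ratings": [
        {"Source": "Internet Movie Database", "Value": "7.7/10"},
        {"Source": "Rotten Tomatoes", "Value": "96%"},
        {"Source": "Metacritic", "Value": "95/100"},
    ],
}

# Each element of the list is itself a dictionary
for rating in movie["Ratings"]:
    print(rating["Source"] + ':', rating["Value"])

# Build a source -> value lookup for convenient access
ratings_by_source = {r["Source"]: r["Value"] for r in movie["Ratings"]}
print(ratings_by_source["Metacritic"])
```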

## APIs and interacting with the world wide web

- What is an API?
    - API is short for Application Programming Interface.
    - An API is a set of protocols and routines for building and interacting with software applications.
    - Put simply, it is a bunch of code that allows two software programs to communicate with each other.
    - It is common to pull data from APIs in the JSON file format.


- Connecting to an API in Python
        In [1]: import requests
        In [2]: url = 'http://www.omdbapi.com/?t=hackers'
        In [3]: r = requests.get(url)
        In [4]: json_data = r.json()    # decode the JSON response body into a dictionary
        In [5]: for key, value in json_data.items():
                     print(key + ':', value)
- What was that URL?
    - http - making an HTTP request
    - www.omdbapi.com - querying the OMDB API
        -  ?t=hackers
            - The part beginning with '?' is the query string.
                - It holds the parts of a URL that do not necessarily fit into a conventional hierarchical path structure.
            - Here it asks for data on a movie with title (t) ‘Hackers’.
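
The breakdown above can be checked with the standard library. This is a small sketch using `urllib.parse`, which is not part of the original notes; it only splits the URL and sends no request.

```python
from urllib.parse import urlsplit, parse_qs

url = 'http://www.omdbapi.com/?t=hackers'

# Split the URL into its components without sending anything
parts = urlsplit(url)
print(parts.scheme)   # the protocol used for the request
print(parts.netloc)   # the host being queried
print(parts.query)    # the query string, without the leading '?'

# Decode the query string into a dict mapping each key to a list of values
query_params = parse_qs(parts.query)
print(query_params)
```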
            

### API requests

In [3]:
# Import requests package
import requests

# Assign URL to variable: url
url = 'http://www.omdbapi.com/?apikey=72bc447a&t=the+social+network'

# Package the request, send the request and catch the response: r
r = requests.get(url)

# Print the text of the response
print(r.text)


{"Title":"The Social Network","Year":"2010","Rated":"PG-13","Released":"01 Oct 2010","Runtime":"120 min","Genre":"Biography, Drama","Director":"David Fincher","Writer":"Aaron Sorkin (screenplay), Ben Mezrich (book)","Actors":"Jesse Eisenberg, Rooney Mara, Bryan Barter, Dustin Fitzsimons","Plot":"As Harvard student Mark Zuckerberg creates the social networking site that would become known as Facebook, he is sued by the twins who claimed he stole their idea, and by the co-founder who was later squeezed out of the business.","Language":"English, French","Country":"USA","Awards":"Won 3 Oscars. Another 165 wins & 168 nominations.","Poster":"https://m.media-amazon.com/images/M/MV5BOGUyZDUxZjEtMmIzMC00MzlmLTg4MGItZWJmMzBhZjE0Mjc1XkEyXkFqcGdeQXVyMTMxODk2OTU@._V1_SX300.jpg","Ratings":[{"Source":"Internet Movie Database","Value":"7.7/10"},{"Source":"Rotten Tomatoes","Value":"96%"},{"Source":"Metacritic","Value":"95/100"}],"Metascore":"95","imdbRating":"7.7","imdbVotes":"574,061","imdbID":"tt1285
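
The query string in the URL above was typed by hand, with `+` standing in for the spaces in the title. A minimal sketch of building the same kind of string from a dictionary with the standard library follows. `YOUR_API_KEY` is a placeholder, not a working key, and no request is sent.

```python
from urllib.parse import urlencode

base_url = 'http://www.omdbapi.com/'

# Placeholder key for illustration only; substitute your own
params = {'apikey': 'YOUR_API_KEY', 't': 'the social network'}

# urlencode() joins the pairs with '&' and encodes spaces as '+'
query_string = urlencode(params)
full_url = base_url + '?' + query_string
print(full_url)
```

`requests.get()` can also do this encoding itself when the dictionary is passed through its `params` argument.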

### JSON–from the web to Python

In [4]:
# Import packages
import requests
import json
# Assign URL to variable: url
url = 'http://www.omdbapi.com/?apikey=72bc447a&t=social+network'

# Package the request, send the request and catch the response: r
r = requests.get(url)
print(type(r))
# Decode the JSON data into a dictionary: json_data
json_data = r.json()
print(type(json_data))
# Print each key-value pair in json_data
for k in json_data.keys():
    print(k + ': ', json_data[k])
# Save the dictionary to a local JSON file
with open("exercise/a_movie.json", 'w') as json_file:
    json.dump(json_data, json_file)

<class 'requests.models.Response'>
<class 'dict'>
Title:  The Social Network
Year:  2010
Rated:  PG-13
Released:  01 Oct 2010
Runtime:  120 min
Genre:  Biography, Drama
Director:  David Fincher
Writer:  Aaron Sorkin (screenplay), Ben Mezrich (book)
Actors:  Jesse Eisenberg, Rooney Mara, Bryan Barter, Dustin Fitzsimons
Plot:  As Harvard student Mark Zuckerberg creates the social networking site that would become known as Facebook, he is sued by the twins who claimed he stole their idea, and by the co-founder who was later squeezed out of the business.
Language:  English, French
Country:  USA
Awards:  Won 3 Oscars. Another 165 wins & 168 nominations.
Poster:  https://m.media-amazon.com/images/M/MV5BOGUyZDUxZjEtMmIzMC00MzlmLTg4MGItZWJmMzBhZjE0Mjc1XkEyXkFqcGdeQXVyMTMxODk2OTU@._V1_SX300.jpg
Ratings:  [{'Source': 'Internet Movie Database', 'Value': '7.7/10'}, {'Source': 'Rotten Tomatoes', 'Value': '96%'}, {'Source': 'Metacritic', 'Value': '95/100'}]
Metascore:  95
imdbRating:  7.7
imdbVotes:
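
The cell above writes the dictionary with `json.dump()`, and the earlier section reads it back with `json.load()`. A small sketch of that round trip, assuming a two-key stand-in dictionary and a temporary directory instead of the `exercise/` folder:

```python
import json
import os
import tempfile

# Stand-in for the decoded API response
movie = {"Title": "The Social Network", "Year": "2010"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "a_movie.json")

    # Serialize the dictionary to a JSON file
    with open(path, "w") as json_file:
        json.dump(movie, json_file)

    # Read the file back into a new dictionary
    with open(path) as json_file:
        restored = json.load(json_file)

# The restored dictionary compares equal to the original
print(restored == movie)
```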

### Checking out the Wikipedia API

In [5]:
# Import package
import requests

# Assign URL to variable: url
url = 'https://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=json&exintro=&titles=pizza'

# Package the request, send the request and catch the response: r
r = requests.get(url)

# Decode the JSON data into a dictionary: json_data
json_data = r.json()

# Print the Wikipedia page title (the 'extract' key holds the intro text)
pizza_title = json_data['query']['pages']['24768']['title']
print(pizza_title)
#print(json_data)

Pizza
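
The cell above hard-codes the page ID `'24768'`. When the ID is not known in advance, one option is to take the first entry under `pages`. The sketch below uses a hand-written stand-in for the decoded response, reduced to a few fields, so it runs without a network call.

```python
# Hand-written stand-in for json_data; the real response carries more fields
sample_response = {
    "query": {
        "pages": {
            "24768": {"pageid": 24768, "ns": 0, "title": "Pizza"}
        }
    }
}

pages = sample_response["query"]["pages"]

# Take the first (here, the only) page without naming its ID
first_page = next(iter(pages.values()))
page_title = first_page["title"]
print(page_title)
```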
