In [120]:
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

# Data Acquisition, Web Scraping and Web APIs *

# Table of Contents
* [Data Acquisition, Web Scraping and Web APIs *](#Lecture-5---Data-Acquisition,-Web-Scraping-and-Web-APIs-*)
	* &nbsp;
		* [Content](#Content)
		* [Learning Outcomes](#Learning-Outcomes)
* [Data Acquisition](#Data-Acquisition)
* [1. Web scraping](#1.-Web-scraping)
	* [HTML](#HTML)
		* &nbsp;
			* [What is HTML?](#What-is-HTML?)
	* [Intro to Web Scraping](#Intro-to-Web-Scraping)
		* [--- WARNING ---](#----WARNING----)
	* [2. Web APIs](#2.-Web-APIs)
		* [REST](#REST)
		* [JSON](#JSON)
		* [Forming an API query](#Forming-an-API-query)
	* [Current International Space Station Details](#Current-International-Space-Station-Details)
	* [Dedicated API Wrapper Modules](#Dedicated-API-Wrapper-Modules)
	* [API Repositories and Market Places](#API-Repositories-and-Market-Places)


---
* Some material on web scraping and usage of APIs adapted from Kevin Markham's data science courses at https://github.com/justmarkham

### Content

1. Data gathering via web scraping
2. HTML basics
3. Data gathering via web APIs
4. JSON file format

### Learning Outcomes

At the end of this lecture, you should be able to:

* list the different dynamic sources of data
* explain what HTML is and its basic structure
* make HTTP requests using python
* traverse the HTML document tree
* perform web scraping at an introductory level
* describe and process the JSON file format
* perform rudimentary data acquisition using Web APIs



---

# Data Acquisition

So far, we have looked at how we can acquire data from pre-prepared Excel and text files in the CSV format. We also saw how we can use the pandas clipboard facility to paste data and build data frames. 

We also experienced that much of the data does not come in tidy formats that are prepared and ready for data analysis. For this we learned a number of techniques that help us to wrangle and tidy our data into shape. 

Now we are going to look at two additional sources of data that are dynamic and will require the combination of all the techniques we learned previously, such as wrangling, merging, aggregation, as well as some new skills. 

It is becoming common these days that data is acquired from multiple sources and merged into a single dataset. The data sources that are increasingly becoming the backbone of many analytics and information systems are web based.

This section considers how data can be read (scraped) from web pages (HTML documents), and how data can be retrieved from web servers using their application program interfaces (APIs).

# 1. Web scraping

Often when we need to acquire data, web pages are a great resource to turn to. Many websites make data available on their web pages for viewing in a browser, but do not make it conveniently downloadable as an easily machine-readable format like JSON, CSV, or XML. Because of this, we sometimes need to employ web scraping techniques.

The term "web scraping" refers to an application or script that processes HTML pages. This is done in order to extract data embedded in HTML for manipulation. 

Web scraping applications in effect simulate a person viewing a website with a browser.

Our task then becomes writing scripts that can traverse the structure of HTML documents and locate the particular piece of data we need.

## HTML

#### What is HTML?

HTML is a markup language (not a programming language) for describing web documents (web pages).

    HTML stands for Hyper Text Markup Language
    A markup language is a set of markup tags
    HTML documents are described by HTML tags
    Each HTML tag describes different document content

HTML pages consist of elements. Elements are marked up by tags, and the tags may have attributes inside them which describe how the content should be rendered by web browsers. The initial tag specifies the type of the document so that the browsers render the content correctly.
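
To make the ideas of elements, tags and attributes concrete, here is a minimal sketch that lists every opening tag and its attributes in a small HTML string. It uses Python's built-in `html.parser` module rather than BeautifulSoup; the class name `TagLister` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collect (tag, attributes) pairs for every opening tag encountered."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples
        self.tags.append((tag, dict(attrs)))


# Made-up sample markup: a paragraph element containing a link element
sample = '<p id="intro">Hello <a href="https://example.com">world</a></p>'

lister = TagLister()
lister.feed(sample)
for tag, attributes in lister.tags:
    print(tag, attributes)
```

Each printed line shows one element's tag name followed by a dictionary of its attributes, which is the same information we will later ask BeautifulSoup to search on.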

Please refer to http://www.w3schools.com/html/html_intro.asp for an introduction to HTML.

The examples below will show how we can perform web scraping on HTML pages using a Python package called `BeautifulSoup`. 

BeautifulSoup is an HTML/XML parser for Python that turns markup text into a parse tree, which can then be traversed more easily.

In [None]:
from IPython.display import HTML, IFrame
IFrame("http://www.crummy.com/software/BeautifulSoup/bs4/doc/", width=1100, height=500)



BeautifulSoup provides a simplified, idiomatic way of navigating, searching, and modifying the parse tree generated from HTML and XML.

More info on BeautifulSoup http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html

Good examples of how this is done can be found in : http://www.gregreda.com/2013/03/03/web-scraping-101-with-python/ and http://blog.miguelgrinberg.com/post/easy-web-scraping-with-python

## Intro to Web Scraping

We are going to begin with a toy example, using the simple HTML page created below:

In [None]:
# imports
import requests                 # How Python gets the webpages
from bs4 import BeautifulSoup   # Creates structured, searchable object
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import json
#import prettyprint as pp


In [None]:
from pylab import rcParams

rcParams['figure.figsize'] = 15, 10
rcParams['font.size'] = 20
rcParams['axes.facecolor'] = 'white'

%matplotlib inline

In [None]:
# First, let's read the toy webpage as a string - this is what happens initially when you scrape any webpage
html_doc = """
<!doctype html>
<html lang="en">
<head>
  <title>Teo's Webpage</title>
</head>

<body>
  <h1>Teo's Webpage</h1>
  <p id="intro">My name is Teo.  I find web scraping interesting.</p>
  <p id="background">I live in Auckland and completed my PhD at Massey University in Computer Science, while studying the field of machine learning.</p>
  <p id="current">I currently work as a lecturer in Information Technology.</p>
  
  <h3>My Interests</h3>
  <ul>
      <li id="my favorite">Data Science and Machine Learning</li>
      <li class="hobby">Tennis</li>
      <li class="hobby">Reading</li>
      <li class="hobby">Travelling</li>
      <li class="hobby">Running</li>
  </ul>
</body>
</html>
"""
type(html_doc)

In [None]:
# Beautiful soup allows us to create structure from the html elements, and to traverse it
page = BeautifulSoup(html_doc, "lxml")
print(type(page))
page

In [None]:
# The most useful methods in a Beautiful Soup object are "find" and "findAll" (spelled "find_all" in current Beautiful Soup).
# "find" takes several parameters, the most important are "name" and "attrs".
# name will help us find the type of an element
# Let's target "name".
page.find(name='body') # Finds the 'body' tag and everything inside of it.

In [None]:
body = page.find(name='body')
type(body) #element.Tag

The above result tells us that the 'body' element was found in the HTML page, and it tells us what object type it is. When the find fails, this is what we get:

In [None]:
body = page.find(name='bodyyy')
type(body) # NoneType: find returns None when nothing matches

We can see its content below

In [None]:
body = page.find(name='body')
body.contents

We can recursively search for other elements inside the returned result as well:

In [None]:
h1 = body.find(name='h1') # Find the 'h1' element inside of the 'body' tag
h1

In [None]:
h1.text

In [None]:
h1.contents

Notice how we can access the entire element or just the content. 

Now let's find the 'p' elements:

In [None]:
p = page.find(name='p')
# This only finds one.  This is where 'findAll' comes in.
p

We can also do a search of all instances of an element:

In [None]:
all_p = page.findAll(name='p')
print(all_p)
type(all_p) # Result sets are a lot like Python lists

Access specific element with index:

In [None]:
print(all_p[0])
print(all_p[1])

In [None]:
# Iterable like a list
for one_p in all_p:
    print(one_p.text) # Print text

Access specific attribute of a tag:

In [None]:
all_p[0] # Specific element

In [None]:
all_p[0]['id'] # Specific attribute value of a specific element

Now let's look at 'attrs'. Beautiful soup also allows us to locate elements with specific attributes:

In [None]:
page.find(name='p', attrs={"id":"intro"})

In [None]:
page.find(name='p', attrs={"id":"background"})

In [None]:
result = page.find(name='p', attrs={"id":"current"})
result.text

Again, we can also search for all instances of an element that have a given class name:

In [None]:
page.findAll("li", "hobby")

**Exercise:** Extract the 'h3' element from Teo's webpage.

**Exercise:** Extract Teo's hobbies from the html_doc.  Print out the text of the hobby. 

**Exercise:** Extract Teo's hobby that has the id "my favorite".

## Example 

We will illustrate this concept further using an economics website, https://tradingeconomics.com/ (which does not forbid web scraping), where we will attempt to scrape data describing various economic indices.



We will try and pull the exchange rate from the Euro to the USD.

In order to find where the price is situated in the HTML document, we must look at the document's source code. By right clicking on a page in a browser, an option should be displayed allowing you to view the source.

We must inspect the source so that we can find the element that houses this value. We can then use Python's BeautifulSoup package to **read and traverse through the HTML element tree** in order to extract the data that we want.

There are three basic steps to scraping a single page:

    1. Get (request) the page
    2. Parse the page content (read and interpret the document structure)
    3. Search through the content of interest


Below is an example of a script that accesses the page and displays the EUR/USD exchange rate:


In [None]:
#we first need to make some extra imports
import json
from time import sleep
from datetime import datetime


**STEP 1: GET** Access the page and read it into the beautiful soup object

In [None]:
url = "https://tradingeconomics.com/"
response = requests.get(url) 
response

### --- WARNING --- 

ALWAYS FIRST MAKE SURE THAT THE RESPONSE IS 200 - OTHERWISE YOU MIGHT HAVE AN ERROR, IN WHICH CASE YOU'D BE BEST TO STOP AND NOT TRY TO PROCESS THE DOCUMENT, SINCE THERE WILL BE NOTHING TO PROCESS
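
One way to enforce this check in code is sketched below. The helper name `ensure_ok` is hypothetical, and the two stand-in objects only mimic the `status_code` and `content` attributes of a real `requests` response so that the example runs without a network connection.

```python
from types import SimpleNamespace


def ensure_ok(response):
    """Return the response body, or raise if the status code is not 200."""
    if response.status_code != 200:
        raise RuntimeError(f"Request failed with status {response.status_code}")
    return response.content


# Stand-in objects that mimic the two attributes used above
good = SimpleNamespace(status_code=200, content=b"<html></html>")
bad = SimpleNamespace(status_code=404, content=b"")

print(ensure_ok(good))
try:
    ensure_ok(bad)
except RuntimeError as err:
    print(err)
```

With a real response from `requests.get`, the same helper could be called directly, or you could call `response.raise_for_status()`, which raises an exception for 4xx and 5xx status codes.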

In [None]:
page = response.content

In [None]:
page[:10000]

**STEP 2: PARSE** Create a BeautifulSoup object that reads and parses the HTML page into a format that we can search and traverse.

In [None]:
scraping = BeautifulSoup(page, "lxml") 
scraping

Now we can search for a given tag, id or class name.

**STEP 3: SEARCH** Search through the page for the element (a 'tr', i.e. table row, tag) whose 'data-symbol' attribute has the value 'EURUSD:CUR':

In [None]:
scraping = BeautifulSoup(page, "lxml")  # same parser as in step 2
element1 = scraping.find( attrs={'data-symbol' : 'EURUSD:CUR'})
element1

The result needs more filtering in order to get to the target value. 

In [None]:
element2 = element1.find( attrs={'id' : 'p'})
element2

We have now arrived at the element holding the value and we need to extract the text and cast it to a float:

In [None]:
float(element2.text)

There are different ways of homing in on the data that we need, but one option could be to perform another search:

In [None]:
float(scraping.findAll( attrs={'id' : 'p'})[0].text)

As it turns out, several elements in the document share this `id` value, so we take the first match from the result set.
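
Casting scraped text straight to a float fails when the text contains thousands separators or stray whitespace. The sketch below shows one way to clean the text first; the helper name `to_float` is hypothetical, and it assumes that commas are thousands separators and the dot is the decimal mark, which will not hold for every website.

```python
def to_float(text):
    """Convert scraped text such as ' 1,234.50 ' to a float.

    Assumes ',' is a thousands separator and '.' is the decimal mark.
    """
    cleaned = text.strip().replace(",", "")
    return float(cleaned)


print(to_float(" 1,234.50\n"))
print(to_float("1.0842"))
```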

**Exercise:** From the same website, navigate to the appropriate webpage and make the changes required to scrape the USD to Singapore dollar exchange rate:

In [None]:
#step 1


In [None]:
#step 2


In [None]:
# step 3


We can also read in entire HTML tables into dataframe objects:

Here is an example of how to read in the ASCII table from a Wikipedia page: https://en.wikipedia.org/wiki/ASCII

In [None]:
#step 1
url = "https://en.wikipedia.org/wiki/ASCII"
response = requests.get(url)
response

In [None]:
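from io import StringIO  # recent pandas versions expect a file-like object rather than a raw HTML string

scraping_html_table_EQ = BeautifulSoup(response.content, "lxml") 
scraping_html_table_EQ = scraping_html_table_EQ.find_all("table", "wikitable")
df = pd.read_html(StringIO(str(scraping_html_table_EQ)))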
first_table_df = df[0]
first_table_df.columns = first_table_df.iloc[0]
first_table_df = first_table_df.iloc[2:]
first_table_df

**Exercise**: Read the second table from the same webpage into a data frame object:

**Exercise**: Return to the Tradingeconomics website and scrape all the Australia and NZ stock exchange figures. Tidy the dataframe and plot the YTD values as a bar graph:

## 2. Web APIs

Web servers serve out web pages in the HTML format as they are requested by users. Web servers are also capable of providing data that is not formatted in HTML. 

These web servers provide public (and private) APIs through which users can interact, construct queries that the web servers understand, and receive data from them. 

Depending on who owns them, web servers will have different APIs. They usually provide developer help pages that demonstrate how they work and how queries can be constructed using HTTP which the servers understand.

Many websites have public APIs providing data feeds via JSON or some other common format. We will consider only **JSON**, as it is becoming a standard and its syntax is, conveniently, virtually identical to that of Python's dictionaries and lists. 

Increasingly though, in order to access these APIs we must register for API Keys. They are **credentials**. Some of them are free and simply require that an account be created with a given website, while others must be purchased and have limits on the amount of data that can be pulled.
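
Because API keys are credentials, it is good practice to keep them out of the notebook itself. The sketch below reads a key from an environment variable and adds it to the query parameters. The variable name `EXAMPLE_API_KEY`, the parameter name `api_key` and the helper `build_params` are all hypothetical; each real API documents its own way of passing the key.

```python
import os


def build_params(query, env_var="EXAMPLE_API_KEY"):
    """Return a copy of the query parameters with the API key added."""
    api_key = os.environ.get(env_var)
    if api_key is None:
        raise KeyError(f"Set the {env_var} environment variable first")
    params = dict(query)          # copy, so the caller's dict is not modified
    params["api_key"] = api_key   # hypothetical parameter name
    return params


# Placeholder value so the demo runs; a real key would be set outside the code
os.environ["EXAMPLE_API_KEY"] = "not-a-real-key"
print(build_params({"city": "Auckland"}))
```

The resulting dictionary could then be passed to `requests.get` through its `params` argument.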

There are a number of ways to access these APIs. **REST** (Representational State Transfer) is becoming the most common mechanism and often uses **JSON** as the format for transmitting data. 

### JSON

JSON (short for JavaScript Object Notation) has become one of the standard formats
for sending data by HTTP request between web servers and browsers and other applications. 

It is a much more flexible data format than a tabular text form like CSV. 

Here is an example:

In [None]:
# Triple-quoted strings let us write a multi-line string that contains double quotes without escaping them.
obj = """
{"name": "Massey University",
"campuses_NZ": ["Albany", "Palmerston North", "Wellington"],
"campuses_international": null,
"colleges": [{"name": "Sciences", "degrees": 10, "majors": 30},
{"name": "Business", "degrees": 8, "majors": 25}]
}
"""
obj


JSON is very nearly valid Python code with the exception of its null value `null` and
some other nuances (such as disallowing trailing commas at the end of lists). The basic
types are objects (dicts), arrays (lists), strings, numbers, booleans, and nulls. 

**All of the keys in an object must be strings**. There are several Python libraries for reading and
writing JSON data. We will use `json` here as it is built into the Python standard library. 
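
The differences described above can be checked with a few short calls to the standard library's `json` module (its `loads` and `dumps` functions are covered next). The sample strings are made up for illustration.

```python
import json

# JSON true/false/null become Python True/False/None
parsed = json.loads('{"active": true, "campuses": null, "count": 3}')
print(parsed)

# A trailing comma is fine in a Python list but is rejected as JSON
try:
    json.loads('[1, 2, 3,]')
except json.JSONDecodeError as err:
    print("Invalid JSON:", err.msg)

# Keys must be strings, so a non-string key is converted when serialising
print(json.dumps({1: "one"}))
```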

To convert (deserialize) a JSON string from above to an equivalent Python object (`dict`), use `json.loads`:

In [None]:
result = json.loads(obj)
result

`json.dumps` on the other hand converts a Python object back to JSON:

In [None]:
as_json = json.dumps(result)
as_json

How you convert a JSON object or list of objects to a DataFrame or some other data
structure for analysis will be up to you. Conveniently, you can pass a list of JSON objects
to the DataFrame constructor and select a subset of the data fields:

In [None]:
massey_colleges = pd.DataFrame(result['colleges'], columns=['name', 'degrees'])
massey_colleges

We can convert a data frame back to a JSON string with the following:

In [None]:
massey_colleges.to_json()

### REST

**REST is a lightweight mechanism that is protocol independent, but often sits on top of the HTTP protocol**, which enables applications to exchange data with servers. 

A combination of HTTP requests, together with valid REST queries can easily be constructed from Python. One easy-to-use method is through the `requests` package (http://docs.python-requests.org).

Previously, using Web Services and SOAP would result in queries like:

Using REST, such clumsy queries can be transformed into simple HTTP requests of a format (1) like:

Or alternatively, passing arguments using format (2) as follows:

There are slight differences in what you can expect from the two formats. Format 1 (**path segment parameter**) will return a 404 error when the parameter value does not correspond to an existing resource. 

Format 2 uses **optional parameters**. Instead of an error, this format will return an empty list when the parameter is not found in the query result.
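
As a rough illustration of the two formats, the sketch below builds one URL of each kind with the standard library. The base address `https://api.example.com`, the resource name `countries` and both helper names are hypothetical and do not refer to a real service.

```python
from urllib.parse import quote, urlencode

BASE = "https://api.example.com"  # hypothetical API root


def path_segment_url(resource, value):
    """Format 1: the parameter value is part of the URL path."""
    return f"{BASE}/{resource}/{quote(str(value), safe='')}"


def query_parameter_url(resource, **params):
    """Format 2: optional parameters follow the '?' as name=value pairs."""
    return f"{BASE}/{resource}?{urlencode(params)}"


print(path_segment_url("countries", "New Zealand"))
print(query_parameter_url("countries", name="New Zealand", limit=5))
```

When using `requests`, format 2 does not need to be assembled by hand: passing a dictionary through the `params` argument of `requests.get` produces the same kind of query string.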

## Examples of Forming API Queries

### Data Science Toolkit (http://dstk.britecorepro.com/) 

The Data Science Toolkit provides free APIs for accessing a range of services.

 


#### Google-style Geocoder 

Interfaces with Google's geocoding API. Provides the latitude and longitude of an address. 

In [None]:
#Albany Library,30 Kell Dr,Albany,Auckland,New Zealand
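# Passing the query through params lets requests handle the URL encoding of spaces and commas
response = requests.get("http://dstk.britecorepro.com/maps/api/geocode/json",
                        params={"sensor": "false",
                                "address": "Albany Library,30 Kell Dr,Albany, Auckland, New Zealand"})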
print(response)
json.loads(response.content)

**Exercise**: Extract the latitude and longitude of your current residential address and check it on Google maps.

#### IP Address to Coordinates

This API takes either a single numerical IP address, a comma-separated list, or a JSON-encoded array of addresses, and returns a JSON object with a key for every IP.

In [None]:
response = requests.get("http://dstk.britecorepro.com/ip2coordinates/130.12.1.34")
print(response)
json.loads(response.content)

**Exercise**: Find out your current IP address (https://www.whatsmyip.org/ ), then extract its latitude and longitude and check them on Google maps.

### CoinDesk

CoinDesk (https://old.coindesk.com/coindesk-api) provides a simple API to make its Bitcoin Price Index (BPI) data programmatically available to others. 



In [None]:
response = requests.get("https://api.coindesk.com/v1/bpi/currentprice.json")
print(response)
json.loads(response.content)



**Exercise**: Modify the query above in order to extract the current price of Bitcoin in Singapore dollars (SGD).

**Exercise**: Modify the query above in order to extract the historic value of Bitcoin in NZ dollars (NZD) from 2013 to today. Extract the values from the JSON object and plot them.

### <s>OpenRates</s> 

<s> OpenRates (http://www.openrates.io/) delivers up-to-date exchange rate data for 32 world currencies in JSON format. All currency data is sourced from the European Central Bank. The OpenRates API also offers historical exchange rates back to 1999. </s> 

In [None]:
#base currency is the Euro
response = requests.get("http://api.openrates.io/latest")
print(response)
json.loads(response.content)


**Exercise**: Modify the above query in order to extract the latest exchange rates showing how many NZ dollars (NZD) one Singapore dollar (SGD) buys. 

**Exercise**: Make API queries that will enable you to quantify the percentage change in the number of Australian dollars a NZ dollar could buy between today and exactly one year ago.

### <s> RestCountries</s> 

<s> REST Countries (https://restcountries.eu/) provides high level information about a comprehensive set of countries. </s> 

In [None]:
response = requests.get("https://restcountries.eu/rest/v2/all")
print(response)
json.loads(response.content)


An API call to RestCountries can return a list of countries that a target country shares a border with, as well as the GINI coefficient that signifies income inequalities.


**Exercise**: Make a set of API queries which return a list of countries with which Venezuela shares a border, then plot the GINI coefficient for all these countries including Venezuela. 

### Current International Space Station Details

http://open-notify.org/Open-Notify-API/

Examples below are taken from: https://www.dataquest.io/blog/python-api-tutorial/

Below is an example of querying the ISS in order to find out the location of the space station.

In [None]:
# Make a get request to get the latest position of the international space station from the opennotify api.
response = requests.get("http://api.open-notify.org/iss-now.json")

# Print the status code of the response.
print(response.status_code)

In [None]:
response.content

The example API query below returns a list of upcoming ISS passes for a particular location formatted as JSON.

In [None]:
import pprint as pp

In [None]:
# Set up the parameters we want to pass to the API.
# This is the latitude and longitude of New York City.
parameters = {"lat": 40.71, "lon": -74}

# Make a get request with the parameters.
response = requests.get("http://api.open-notify.org/iss-pass.json", params=parameters)

# Print the content of the response (the data the server returned)
pp.pprint(json.loads(response.content))

In [None]:
# This gets the same data as the command above
response = requests.get("http://api.open-notify.org/iss-pass.json?lat=40.71&lon=-74")
pp.pprint(json.loads(response.content))

**Exercise:** Find the latitude and longitude for Singapore and query the API for ISS as to when the predicted flybys over Singapore will be.

**Exercise:** Iterate through the results of the above query and convert the flyover time from Epoch Time to local time expressed in a readable format.

Using this API, we can also find out programmatically how many astronauts are currently in space and who they are: 

In [None]:
# Get the response from the API endpoint.
response = requests.get("http://api.open-notify.org/astros.json")
data = response.json()

# Number of people currently in space (this changes over time).
print(data["number"])
pp.pprint(data)

### GeoNet API

https://www.geonet.org.nz/

> GeoNet is the result of a partnership between the Earthquake Commission (EQC), GNS Science, and Land Information New Zealand (LINZ). The GeoNet project was established in 2001 to build and operate a modern geological hazard monitoring system in New Zealand. It comprises a network of geophysical instruments, automated software applications and skilled staff to detect, analyse and respond to earthquakes, volcanic activity, large landslides, tsunami and the slow deformation that precedes large earthquakes.


GeoNet has an API from which latest seismic activity in the NZ region can be accessed: https://api.geonet.org.nz/


Here is an example of how to query their API and extract all recent seismic activity that was at or above a given intensity on the Modified Mercalli Intensity (MMI) scale (https://en.wikipedia.org/wiki/Mercalli_intensity_scale).

See https://api.geonet.org.nz/quake?MMI=3 for an example of the returned JSON object format.

In [None]:
response = requests.get("https://api.geonet.org.nz/quake?MMI=2")
#print(response.content)
res = json.loads(response.content)
pp.pprint(res)

We can now search through the JSON object and extract all the seismic activity in the 'Wellington' region for example and print out the magnitude of the quake.
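
The filtering step described above can be sketched offline as a small reusable function. The helper name `quakes_near` is hypothetical, and the sample records contain made-up values that only mirror the `features`, `properties`, `locality` and `magnitude` keys used in this notebook.

```python
def quakes_near(geojson, place):
    """Return (locality, magnitude) pairs whose locality mentions `place`."""
    matches = []
    for feature in geojson["features"]:
        props = feature["properties"]
        if place in props["locality"]:
            matches.append((props["locality"], props["magnitude"]))
    return matches


# Made-up sample in the same shape as the API response
sample = {
    "features": [
        {"properties": {"locality": "10 km north of Wellington", "magnitude": 2.1}},
        {"properties": {"locality": "25 km east of Gisborne", "magnitude": 3.4}},
    ]
}

print(quakes_near(sample, "Wellington"))
```

Wrapping the search in a function avoids repeating the same loop for each region of interest.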


In [None]:
for feature in res['features']:
    props = feature['properties']
    if 'Wellington' in props['locality']:
        print(props['locality'], ' - Richter Scale Magnitude: ', props['magnitude'])

In [None]:
for feature in res['features']:
    props = feature['properties']
    if 'Seddon' in props['locality']:
        print(props['locality'], ' - Richter Scale Magnitude: ', props['magnitude'])

In [None]:
for feature in res['features']:
    props = feature['properties']
    if 'Gisborne' in props['locality']:
        print(props['locality'], ' - Richter Scale Magnitude: ', props['magnitude'])

**Exercise:** Modify the above code in order to generate a new query to return all seismic activity that was at or above 2 MMI. Then print all the results for activity in the vicinity of Gisborne, listing both the date of the quake and the depth.

**Exercise:** https://api.geonet.org.nz/quake/stats returns the stats of all seismic activity in NZ over the last year. Generate the query that returns this JSON object and then plot the daily seismic activity in NZ over the last year.

###  World Bank API

World Bank APIs provide access to various types of data and databases:

    The Indicators API provides programmatic access to time series development data and metadata. Most of the articles in this section are devoted to the Indicators API.

    The Data Catalog API provides information about the thousands of development-relevant datasets available through the World Bank Data Catalog. 

    The Projects API provides access to World Bank operations data, i.e., active, pipeline and closed projects implemented in countries and around the world. 

    The Finances API provides programmatic access to World Bank financial data (loans, credits, financial statements, etc) delivered on the World Bank Finances platform.

    The Climate Data API provides access to historical and modelled climate data from the Climate Knowledge Portal. 


> Source: https://datahelpdesk.worldbank.org/knowledgebase/topics/125589-developer-information

In [None]:
# World Bank API - GDP example

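indicator = 'NY.GDP.PCAP.CD'  # GDP per capita (current US$)
url = "http://api.worldbank.org/v2/countries/all/indicators/%s" % indicator
# Query parameters are passed separately so that requests builds the query string
params = {"date": "2000:2018", "format": "json", "per_page": 5000}
response = requests.get(url, params=params)
print(response.url)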
print(response)
result = response.content
result = json.loads(result)
result

The json object can then be converted into a dataframe :

In [None]:
worldbank_df = pd.DataFrame.from_dict(result[1])
worldbank_df

We need to extract the country from the dict object in the country column:

In [None]:
# Each entry in the country column is a dict; keep only its 'value' field (the country name)
worldbank_df['country'] = worldbank_df['country'].apply(lambda x: x['value'])
worldbank_df
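
The same flattening can also be done on the plain Python records before a dataframe is built. The sketch below is one way to do it; the helper name `flatten_country` is hypothetical, and the sample records use placeholder numbers (not real GDP figures) in an assumed shape where `country` is a dict with `id` and `value` keys.

```python
def flatten_country(records):
    """Replace each record's nested country dict with the country name."""
    flat = []
    for record in records:
        row = dict(record)                           # shallow copy of the record
        row["country"] = record["country"]["value"]  # keep only the name
        flat.append(row)
    return flat


# Placeholder records; the numbers are not real data
sample = [
    {"country": {"id": "NZ", "value": "New Zealand"}, "date": "2018", "value": 1.0},
    {"country": {"id": "SG", "value": "Singapore"}, "date": "2018", "value": 2.0},
]

for row in flatten_country(sample):
    print(row)
```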

In [None]:
worldbank_df.country.unique()

We can next select and rename some of the columns into more meaningful names:

In [None]:
worldbank_df = worldbank_df[['country', 'countryiso3code', 'date', 'value']]
worldbank_df.columns = ['country', 'countryiso3code', 'date', 'GDP_per_capita']
worldbank_df

**Exercise:** Plot the GDP per capita for New Zealand from the above dataframe.

**Exercise:** Perform the same steps for Singapore and plot the data against that of New Zealand.

**Exercise:** Generate a new API query that extracts the net migration numbers for NZ, Singapore and Australia since the year 2000, then plot them together on the same graph.

https://api.worldbank.org/v2/sources/40/indicators

###  IMF API

The International Monetary Fund (IMF) provides an API for a comprehensive set of financial and economic indicators, whose details can be explored here: https://datahelpdesk.worldbank.org/knowledgebase/topics/125589-developer-information and http://datahelp.imf.org/knowledgebase/articles/838041-sdmx-2-0-restful-web-service


There are some examples of how to extract data from this API in the following links:

https://briandew.wordpress.com/2016/05/01/machine-reading-imf-data-data-retrieval-with-python/

https://briandew.wordpress.com/2016/08/10/using-the-imf-data-api-data-retrieval-with-python/ 

https://www.bd-econ.com/imfapi1.html


The endpoint for the IMF API service is http://dataservices.imf.org/REST/SDMX_JSON.svc/

A number of different databases can then be appended to the endpoint; however, the documentation is somewhat obscure in terms of usage instructions. 


In [None]:
# This is a request for quarterly (frequency: Q) import price index data (indicator: PMP_IX) for NZ (reference area: NZ),
# from the International Financial Statistics (IFS) series.

url = 'http://dataservices.imf.org/REST/SDMX_JSON.svc/' #endpoint
database = 'CompactData/IFS/Q.NZ.PMP_IX' # database and indicator


In [None]:
response = requests.get(url + database)
print(response)
result = response.content
result = json.loads(result)
result


In [None]:
result.keys()

In [None]:
result['CompactData'].keys()

In [None]:
result['CompactData']['DataSet'].keys()

In [None]:
result['CompactData']['DataSet']['Series'].keys()

In [None]:
result['CompactData']['DataSet']['Series']['Obs']

In [None]:
pd.DataFrame.from_dict(result['CompactData']['DataSet']['Series']['Obs'])

**Exercise:** Clean up the dataframe and plot the annual percentage change in the NZ Export Price Index Inflation Rate.

**Exercise:** Repeat the same analysis and plot for Singapore.

## Dedicated API Wrapper Modules

Well established companies will sometimes write and make available modules in various programming languages that form a wrapper around their REST APIs and an easier interface for communicating with their servers.

Yahoo and Spotify are examples of companies that provide a Python module. Some of these custom-made APIs are free to access, while others require an account to be created first. Premium content can only be pulled from their servers using a paid Premium account.

Let's look at a custom wrapper for Craigslist:

In [None]:
#!pip install python-craigslist

In [None]:
from craigslist import CraigslistJobs, CraigslistForSale
CraigslistJobs.show_filters()
print("=========================")
CraigslistForSale.show_filters(category='cta')

Find a software developer job in New York (codes have been worked out by studying the craigslist website URLs):

In [None]:
from craigslist import CraigslistJobs
cl_j = CraigslistJobs(site='newyork', category='sof',
                      filters={ 'employment_type': ['full-time', 'part-time']})

for result in cl_j.get_results():
    print(result)



Find free food events in New York:

In [None]:
from craigslist import CraigslistEvents
cl_e = CraigslistEvents(site='newyork', filters={'free': True, 'food': True})

for result in cl_e.get_results(sort_by='newest', limit=5):
    print(result)


## API Repositories and Market Places

A large number of other API repositories can be found under these links:

https://any-api.com/


A summary of some useful APIs can be found here http://www.computersciencezone.org/50-most-useful-apis-for-developers/ 

http://www.programmableweb.com/apis/directory
