# Retrieving data from the Kiva API

We will be using [Kiva](https://www.kiva.org/) data for the entirety of this course. Kiva makes its data publicly available through its [API](http://build.kiva.org/).

Not sure what an API is? It stands for Application Programming Interface. Codecademy has a fantastic short course (link [here](https://www.codecademy.com/en/tracks/placekitten)) that introduces you to APIs and lets you pull images of kittens from a website by the end of the session.

Below, we import the packages we need in order to retrieve data from the API.

In [2]:
from urllib.request import urlopen, Request
import json
import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; replaces pandas.io.json.json_normalize
import requests as r

There is great documentation on Kiva's API [here](http://build.kiva.org/api). The documentation explains what parameters (conditions) we need to pass in our request to Kiva's database in order to get the data we want.

We are trying to retrieve all Kiva data from Kenya, so we will use two main parameters: we set country_code=KE (KE is the two-letter [ISO](https://en.wikipedia.org/wiki/ISO_3166-2) code for Kenya) and we increase the results per page to 500 (this is the maximum Kiva's API appears to allow). You can see the HTML results of the API call by pasting the URL below into your browser. HTML is a format that is easy to read in the browser.

http://api.kivaws.org/v1/loans/search/?country_code=KE&per_page=500

1) Go ahead and play with the URL in order to retrieve different data. For example, how would you retrieve data from South Africa (ZA)?

2) How would you retrieve only 200 results?

Answers:

1) http://api.kivaws.org/v1/loans/search/?country_code=ZA&per_page=500

2) http://api.kivaws.org/v1/loans/search/?country_code=ZA&per_page=200
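
The same URLs can also be assembled in Python rather than edited by hand. The sketch below uses `urlencode` from the standard library to turn the parameters into a query string. The helper name `build_search_url` is made up for this example and is not part of Kiva's API.

```python
from urllib.parse import urlencode

# Base endpoint taken from the URLs above.
BASE_URL = 'http://api.kivaws.org/v1/loans/search/'

def build_search_url(country_code, per_page=500):
    # build_search_url is a hypothetical helper for illustration only.
    query = urlencode({'country_code': country_code, 'per_page': per_page})
    return BASE_URL + '?' + query

print(build_search_url('ZA'))
print(build_search_url('ZA', per_page=200))
```

The two printed lines should match the two answers above.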



We want to request this data from the API and store it in a format that is more intuitive to us: a dataframe. Let's get started. The code below retrieves the first 500 results so that we can convert them into a pandas dataframe. You will get to know a lot more about dataframes over the next few classes.

In [3]:
d = r.get('http://api.kivaws.org/v1/loans/search.json?country_code=KE&per_page=500')

Notice that in the request above we specify JSON as the type of text we want returned. This is easier to handle and to change into a pandas dataframe. You can paste the link into your browser to understand the difference between [JSON](https://en.wikipedia.org/wiki/JSON) and [HTML](https://en.wikipedia.org/wiki/HTML).

Running d.headers below shows the headers of the response to our request. They include the time of the request and the type of content returned ('Content-Type': 'application/json; charset=UTF-8', which confirms that we received JSON text), in addition to other details.

In [4]:
d.headers

{'Date': 'Tue, 02 May 2017 17:31:33 GMT', 'Server': 'Apache/2.4.7 (Ubuntu)', 'Access-Control-Allow-Origin': '*', 'Expires': 'Tue, 03 Jul 2001 06:00:00 GMT', 'Last-Modified': 'Tue, 02 May 2017 17:31:33 GMT', 'Cache-Control': 'private, no-store, no-cache, must-revalidate, max-age=0, post-check=0, pre-check=0, proxy-revalidate, no-transform', 'Pragma': 'no-cache', 'X-RateLimit-Overall-Limit': '60', 'X-RateLimit-Overall-Remaining': '60', 'Vary': 'Accept-Encoding', 'Content-Encoding': 'gzip', 'Content-Length': '27083', 'Content-Type': 'application/json; charset=UTF-8'}
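
The two X-RateLimit headers in the output above look like a request quota. That reading is an assumption, since the headers are not explained here. Under that assumption, a minimal sketch for checking how many requests remain could look like this. `remaining_requests` is a hypothetical helper name, and the sample dictionary copies a few values from the output above.

```python
def remaining_requests(headers):
    # Header values arrive as strings, so convert before using them as numbers.
    # Returns None when the header is absent.
    value = headers.get('X-RateLimit-Overall-Remaining')
    return int(value) if value is not None else None

# A few header values copied from the output above.
sample_headers = {
    'X-RateLimit-Overall-Limit': '60',
    'X-RateLimit-Overall-Remaining': '60',
    'Content-Type': 'application/json; charset=UTF-8',
}
print(remaining_requests(sample_headers))
```

With a live response, d.headers can be passed in place of the sample dictionary.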

By running the command d.json() below, we can get an idea of what our data looks like before we change it into a pandas dataframe.

In [5]:
d.json()

{'loans': [{'activity': 'Home Energy',
   'bonus_credit_eligibility': False,
   'borrower_count': 1,
   'description': {'languages': ['en']},
   'funded_amount': 50,
   'id': 1286455,
   'image': {'id': 2509135, 'template_id': 1},
   'lender_count': 1,
   'loan_amount': 50,
   'location': {'country': 'Kenya',
    'country_code': 'KE',
    'geo': {'level': 'town', 'pairs': '1.016667 35', 'type': 'point'},
    'town': 'Kitale'},
   'name': 'Caroline',
   'partner_id': 156,
   'planned_expiration_date': '2017-06-01T17:10:06Z',
   'posted_date': '2017-05-02T17:10:07Z',
   'sector': 'Personal Use',
   'status': 'funded',
   'tags': [],
   'themes': ['Green', 'Earth Day Campaign'],
   'use': 'to buy an eco-friendly stove.'},
  {'activity': 'Home Energy',
   'bonus_credit_eligibility': False,
   'borrower_count': 1,
   'description': {'languages': ['en']},
   'funded_amount': 50,
   'id': 1286461,
   'image': {'id': 2509143, 'template_id': 1},
   'lender_count': 2,
   'loan_amount': 50,
   'l

In [6]:
data = json.loads(d.text)

In [7]:
loans = json_normalize(data['loans'])

In [8]:
loans.head(3)

| | activity | basket_amount | bonus_credit_eligibility | borrower_count | description.languages | funded_amount | id | image.id | image.template_id | lender_count | ... | location.town | name | partner_id | planned_expiration_date | posted_date | sector | status | tags | themes | use |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Home Energy | | False | 1 | [en] | 50 | 1286455 | 2509135 | 1 | 1 | ... | Kitale | Caroline | 156 | 2017-06-01T17:10:06Z | 2017-05-02T17:10:07Z | Personal Use | funded | [] | [Green, Earth Day Campaign] | to buy an eco-friendly stove. |
| 1 | Home Energy | | False | 1 | [en] | 50 | 1286461 | 2509143 | 1 | 2 | ... | Molo | Naomi | 156 | 2017-06-01T17:10:07Z | 2017-05-02T17:10:07Z | Personal Use | funded | [] | [Green, Earth Day Campaign] | to buy an eco-friendly stove. |
| 2 | Home Energy | 0.0 | False | 1 | [en] | 0 | 1286449 | 2509125 | 1 | 0 | ... | Kenyenya | Rodah | 156 | 2017-06-01T17:00:05Z | 2017-05-02T17:00:05Z | Personal Use | fundraising | [{'name': '#Eco-friendly'}, {'name': '#Technol... | [Green, Earth Day Campaign] | to buy a solar lantern. |
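
Column names such as location.town and image.id come from nested dictionaries in the JSON: json_normalize joins the nested keys with a dot. The pure-Python sketch below imitates that flattening on part of the first record shown earlier. `flatten` is a hypothetical helper written for illustration, not the pandas implementation.

```python
def flatten(record, prefix=''):
    # Recursively copy nested dictionaries into one flat dictionary,
    # joining parent and child keys with a dot.
    flat = {}
    for key, value in record.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + '.'))
        else:
            flat[name] = value
    return flat

# Part of the first loan record from the d.json() output.
loan = {
    'id': 1286455,
    'image': {'id': 2509135, 'template_id': 1},
    'location': {'country': 'Kenya', 'country_code': 'KE', 'town': 'Kitale'},
}
print(flatten(loan))
```

Each nested key ends up as its own entry, which is why one loan becomes one wide row in the dataframe.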


We have now extracted the first 500 rows of loans from the API. Next, we need to extract this data more systematically for all Kenyan loan results. Kiva has a parameter called page but does not allow requesting a range of pages, so we will have to write a Python loop that goes through each page of results and adds it to our dataset.

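In [None]:
def extract_data(pages, country_iso_code, dataset):
    # Assumed completion of the unfinished cells: request result pages
    # 1..pages for one country and append each page's loans to dataset (a list).
    # Assumes the page parameter starts counting at 1.
    url = 'http://api.kivaws.org/v1/loans/search.json'
    for page in range(1, pages + 1):
        params = {'country_code': country_iso_code, 'per_page': 500, 'page': page}
        d = r.get(url, params=params)
        dataset.extend(d.json()['loans'])
    return dataset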
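
Since extract_data takes a number of pages, it helps to know how many pages a search needs. Below is a minimal sketch of that arithmetic, assuming the total number of matching loans is already known. `pages_needed` is a hypothetical helper, and the totals used here are made-up figures, not real Kiva counts.

```python
import math

def pages_needed(total_results, per_page=500):
    # Round up so that a final, partly filled page is still requested.
    return math.ceil(total_results / per_page)

print(pages_needed(1250))  # 3
print(pages_needed(1000))  # 2
```

Rounding up matters because 1,250 results at 500 per page fill two pages completely and leave 250 results on a third.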