<a href="https://colab.research.google.com/github/tinkercademy/ml-notebooks/blob/main/04_APIs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# APIs

Before we jump into APIs, we need to talk about one more structure in Python that we haven't covered yet: <b>dictionaries</b>.

## Dictionaries

Dictionaries consist of key:value pairs. We can look up a key, and find its value.

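The Python structures we've used so far (lists, strings, etc.) are all ordered from left to right, so we use indexing to access data stored at different positions. Dictionaries work differently: instead of looking up data by position, we look it up by key. This is called mapping. (Since Python 3.7, dictionaries do remember the order in which items were added, but we still access values by key rather than by index.)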
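
As a minimal sketch of looking things up by key, the dictionary below maps region names to PM2.5 readings. The names and numbers are made up purely for illustration.

```python
# A dictionary maps keys to values (the readings here are made-up numbers)
pm25 = {"north": 12, "south": 9, "east": 15}

# Look up a value by its key, rather than by a position
south_reading = pm25["south"]
print(south_reading)

# .get() returns a default instead of raising a KeyError if the key is missing
west_reading = pm25.get("west", "no data")
print(west_reading)

# Check whether a key exists
has_north = "north" in pm25
print(has_north)
```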

<p><b>Interacting with Dictionaries</b></p>
Just creating dictionaries isn't enough. We want to be able to dynamically interact with dictionaries and update their values. Just like strings, lists and dataframes, there are a number of ways we can interact with and manipulate dictionaries.
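
Here is a short sketch of some common ways to change and inspect a dictionary. The `student` dictionary and its values are invented for this example.

```python
# Start with a small dictionary (values are made up for illustration)
student = {"name": "Mei", "age": 15}

# Add a new key:value pair
student["school"] = "Tinker High"

# Update an existing value
student["age"] = 16

# Remove a pair and get its value back
removed = student.pop("school")

# Loop over keys and values together
for key, value in student.items():
    print(key, "->", value)

# Collect just the keys or just the values as lists
keys = list(student.keys())
values = list(student.values())
```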

# APIs

#### Challenge:
We want to create a visualisation using live weather data. For this, we’ll use Particulate Matter 2.5 readings from data.gov.sg.

When we were using data.gov.sg data to plot using matplotlib, we downloaded a CSV file. But this requires manual interaction. Every time we want updated info, we have to manually go to the website and download the most recent CSV. Is there a way to automate this?

#### Intro to APIs

If I wanted to ask you what the weather was, I might say "Hey, what's the weather today?" or "How’s the weather?" or "Is it hot today?" and you’d know I’m asking about the weather.

But computers talk to one another in a slightly different way: by exchanging data in nicely-formatted data packets. These have to be properly formatted so that computers can recognise them, just like your Python code has to follow Python syntax. As a result, we need to be very specific when sending requests to computers. It also means that they're going to send responses in very specific formats.

(See slides for more info!)






Some notes on APIs:

* Computers that are online--servers--can choose what information they want to provide.
* Servers can require registration and user access keys, to identify and protect against unwanted behaviour.
* The information can be requested through a series of "calls", e.g. "hey, can I have the weather today in Singapore?"
* The information provided should be in a format recognisable to both ends, e.g. "31 degrees Celsius, sunny conditions" or "31, sunny".
* Often, the request is through a URL, e.g. `getweather.com/request/singapore/today`
* Often, the reply is in pre-formatted XML or JSON, e.g. `{"temperature": 31, "condition": "sunny"}`
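
To make the request and reply idea concrete, here is a rough sketch in Python. The address `getweather.com` is the made-up example from the notes above, not a real service, and the parameter names are also invented, so the code only builds the URL and does not send anything. The reply text is hand-written to show what a JSON reply might look like.

```python
import json
from urllib.parse import urlencode

# Build a request URL with query parameters.
# The address and parameter names are hypothetical, for illustration only.
base_url = "https://getweather.com/request"
params = {"city": "singapore", "day": "today"}
request_url = base_url + "?" + urlencode(params)
print(request_url)

# A hand-written example of what a JSON reply might look like
reply_text = '{"temperature": 31, "condition": "sunny"}'

# json.loads turns the JSON text into a Python dictionary
reply = json.loads(reply_text)
print(reply["temperature"], reply["condition"])
```

Notice that the parsed reply is just a dictionary, which is why we needed dictionaries before talking about APIs.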

This request-reply mechanism is known as an API, an **Application Programming Interface**. Lots and lots of services provide APIs, so that people can make things with them, e.g.

* [Google Maps APIs](https://developers.google.com/maps/) for people to make property price mashups
* [Singapore Government APIs](https://www.data.gov.sg) for public information the government wants to share
* [Facebook APIs](https://developers.facebook.com/) so you can build off people's social networks
* [Chuck Norris API](https://github.com/chucknorris-io/chuck-api) so... umm. We don't know why this exists.


#### Using the data.gov.sg API

In [None]:
# Import the modules
import urllib.request
import json

# Create a request
url = "https://api.data.gov.sg/v1/environment/pm25?date=2018-01-24"
request = urllib.request.Request(url)

# Send the request and store the response
response = urllib.request.urlopen(request)

# Read the raw body of the response (this gives us a bytes object)
data = response.read()

# Decode the bytes object into a string
json_string = data.decode('utf-8')

# Convert the JSON data to a Python dictionary
parsed_json = json.loads(json_string)

# Print
parsed_json
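
The parsed reply is a dictionary that contains lists and more dictionaries inside it. The sketch below shows one way to dig into that kind of nested structure. It uses a small hand-written sample instead of the live reply, and the key names (`items`, `readings`, `pm25_one_hourly`) are assumptions about what the API returns, so compare them against your own `parsed_json` before relying on them.

```python
# A hand-written sample shaped like the reply we assume the API returns.
# The structure and the numbers are assumptions; check them against parsed_json.
sample = {
    "items": [
        {
            "timestamp": "2018-01-24T01:00:00+08:00",
            "readings": {
                "pm25_one_hourly": {"west": 9, "east": 12, "central": 10, "south": 8, "north": 11}
            },
        },
        {
            "timestamp": "2018-01-24T02:00:00+08:00",
            "readings": {
                "pm25_one_hourly": {"west": 7, "east": 13, "central": 9, "south": 6, "north": 10}
            },
        },
    ]
}

# See which keys are available at the top level
print(list(sample.keys()))

# Drill down one level at a time: list index first, then dictionary keys
first_item = sample["items"][0]
first_east = first_item["readings"]["pm25_one_hourly"]["east"]
print(first_item["timestamp"], first_east)

# Loop over every item to collect one region's readings
east_readings = [item["readings"]["pm25_one_hourly"]["east"] for item in sample["items"]]
print(east_readings)
```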

#### Try it out:
Can you find the average of the PM2.5 hourly readings in the South on 31 Dec 2017?

In [None]:
#Hint:
#    Step 1: Get all the south readings for 31 Dec
#    Step 2: Average them

### <font color="red">Bonus Exercise:</font>

Can you convert the JSON data into a dataframe so that it's easier to work with? Let's assume we only care about the timestamps and the PM2.5 readings for the 5 regions at each timestamp.


## Putting It All Together - APIs, Plot.ly, and Mapbox

1. Request the latest Particulate Matter 2.5 readings from data.gov.sg. Store the returned data in a Python dictionary. (Hint: leave the date out of the request to get the current data.)

2. Plot the longitude and latitude coordinates of the five locations on a map of Singapore using Plotly and Mapbox. (Your code should read the longitude and latitude coordinates directly from the response data; coordinates should not be hardcoded.)

3. Add hover text that displays:
    - the location name (north, east, central, etc.)
    - the current PM2.5 reading for that location.
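
As a starting point for steps 2 and 3, the sketch below pulls coordinates and hover text out of a reply without hardcoding them. It runs on a small hand-written sample rather than the live data: the key names (`region_metadata`, `label_location`, `pm25_one_hourly`) are assumptions about the reply format, and the coordinates and readings are made up. It does not do the plotting itself.

```python
# Hand-written sample; the key names below are assumptions about the reply,
# and the coordinates and readings are made-up values.
sample = {
    "region_metadata": [
        {"name": "north", "label_location": {"latitude": 1.41803, "longitude": 103.82}},
        {"name": "south", "label_location": {"latitude": 1.29587, "longitude": 103.82}},
    ],
    "items": [
        {"readings": {"pm25_one_hourly": {"north": 11, "south": 8}}}
    ],
}

# Take the most recent set of readings
latest = sample["items"][-1]["readings"]["pm25_one_hourly"]

# Build parallel lists: one entry per region
names, lats, lons, hover_text = [], [], [], []
for region in sample["region_metadata"]:
    name = region["name"]
    names.append(name)
    lats.append(region["label_location"]["latitude"])
    lons.append(region["label_location"]["longitude"])
    hover_text.append(f"{name}: PM2.5 = {latest[name]}")

print(hover_text)
```

Lists like `lats`, `lons` and `hover_text` could then be handed to a Plotly map trace as its latitude, longitude and text values.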


## Stock APIs

This exercise will go through the following:

* Reading financial data from online sources
* Basic dataframe and series manipulation
* Basic plotting
* Exporting to Excel

In [None]:
import numpy as np
import pandas as pd
import datetime
import matplotlib.pyplot as plt
%matplotlib inline


## DataReader

The **`pandas_datareader`** package, which was originally part of pandas and is now installed separately, provides a **DataReader** function that makes it easy to read data (e.g. historical stock price data) from the following online sources and to save it as DataFrames:

* Yahoo Finance _(only allows user downloads, not automated requests)_
* Google Finance _(discontinued as of Sept 2017)_
* St. Louis Fed (FRED)
* World Bank
* Enigma
* Quandl
* Kenneth French’s data library
* OECD
* Eurostat
* Thrift Savings Plan
* Nasdaq Trader symbol definitions
* Alpha Vantage

More information, including tutorials to access each of these sites, can be found at the [pandas documentation page](https://pandas.pydata.org/docs/).

The downsides, however, are that:
* many of these sites have changed and no longer offer data (for example, Yahoo and Google both recently shut down their APIs)
* sites return data in different formats, with varying levels of completeness
* it requires you to install the pandas-datareader module

There are actually a lot of different modules that individuals have written and posted online. Oftentimes they're looking for an easy way to grab stock data for their own purposes, and they share the resulting code on GitHub. These code snippets can be really useful! But they might not be maintained, and they could be tailored to a specific problem that isn't relevant to you.

Because pandas-datareader requires an extra install and means learning additional functions, we're not going to use that method (but know that it exists if you're interested in it!). Instead, we're going to use a free stock info service called Alpha Vantage (https://www.alphavantage.co/). You can get a free API key which allows you to access their stock info. They return data either as a downloadable CSV or in JSON format (just like data.gov.sg!).

Example usage can be found at: https://www.alphavantage.co/documentation/

Here, we'll start by reading Microsoft stock prices for the past two weeks.
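
Here is a rough sketch of what that could look like. It builds the request URL but, so that it runs without an API key, it parses a small hand-written sample instead of a live reply. The `TIME_SERIES_DAILY` function name and the key names in the sample (`"Time Series (Daily)"`, `"4. close"`) are assumptions based on the Alpha Vantage documentation, `YOUR_API_KEY` is a placeholder, and the prices are made up.

```python
import json
from urllib.parse import urlencode

# Build the request URL. "YOUR_API_KEY" is a placeholder for your own key.
params = {"function": "TIME_SERIES_DAILY", "symbol": "MSFT", "apikey": "YOUR_API_KEY"}
url = "https://www.alphavantage.co/query?" + urlencode(params)
print(url)

# Hand-written sample of the assumed reply format (prices are made up)
sample_text = """
{
  "Meta Data": {"2. Symbol": "MSFT"},
  "Time Series (Daily)": {
    "2018-01-24": {"1. open": "92.55", "4. close": "91.82", "5. volume": "33277500"},
    "2018-01-23": {"1. open": "91.90", "4. close": "91.90", "5. volume": "23412800"},
    "2018-01-22": {"1. open": "90.00", "4. close": "91.61", "5. volume": "23601600"}
  }
}
"""
parsed = json.loads(sample_text)
daily = parsed["Time Series (Daily)"]

# Dates in YYYY-MM-DD format sort correctly as plain strings
dates = sorted(daily.keys())

# Keep (up to) the 10 most recent trading days, roughly two weeks
recent_dates = dates[-10:]

# The prices arrive as strings, so convert them to floats
closes = {date: float(daily[date]["4. close"]) for date in recent_dates}
print(closes)
```

With a real key, the same parsing steps would apply to the text returned by `urllib.request.urlopen(url)`, just as in the data.gov.sg example.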

<hr>

In [None]:
#Optional Exercises:

### <font color="red">Exercise 1: Get Google Data from Google</font>

Create a variable called `goog` to store Google's stock price data. Slice the data so that it only covers the time period from 1 Jan 2016 to 1 June 2017.


### <font color="red">Exercise 2</font>

For your `goog` data between 1 Jan 2016 and 1 June 2017:

* Find the average trading volume
* Find the days which exceed twice its average trading volume (should be 14)

### <font color="red">Exercise 3</font>

For the Google data set:

* Make a new column called "Prev Close", and shift the close data by 1 downwards, i.e. today's "Prev Close" is what was in yesterday's "Close". There's a `.shift()` function you can use for this.
* Add a new column showing the % difference between the current day's open and the previous day's close.
* Group the % difference by integer percentages, and show the counts of these.

For this last one, take a look at the `groupby` command.

### <font color="red">Exercise 4</font>

* Plot the closing prices for AAPL, MSFT, and GOOG in the same graph, normalising for their different scales in a sensible manner.

### <font color="red">Exercise 5: Make your own stock analysis generator</font>

For this final exercise, write code that generates some interesting insight from a basket of stocks and automatically creates an Excel file for you. Also, have it export a couple of graphs that you can plonk into your PowerPoint presentations!