### Accessing Data with APIs

**OBJECTIVES**

- More with `groupby` and `.agg`
- Data Access via API


### `.groupby` 

The split-apply-combine paradigm we have explored is an incredibly powerful and useful tool.  In addition to performing a single aggregate operation, we can apply multiple built-in or custom aggregate functions at once.

In [161]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

#### The Data

The dataset `salesdaily.csv` contains daily pharmaceutical sales data and the columns are described as follows:

```
M01AB - Anti-inflammatory and antirheumatic products, non-steroids, Acetic acid derivatives and related substances
M01AE - Anti-inflammatory and antirheumatic products, non-steroids, Propionic acid derivatives
N02BA - Other analgesics and antipyretics, Salicylic acid and derivatives
N02BE/B - Other analgesics and antipyretics, Pyrazolones and Anilides
N05B - Psycholeptics drugs, Anxiolytic drugs
N05C - Psycholeptics drugs, Hypnotics and sedatives drugs
R03 - Drugs for obstructive airway diseases
R06 - Antihistamines for systemic use
```

Load in the data and be sure to set a `datetime` index. 
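
As a rough sketch of the loading step, the snippet below reads a tiny in-memory CSV instead of `salesdaily.csv`. The `date` column name and the values are assumptions made for illustration; check the actual file for the name of its date column.

```python
import io
import pandas as pd

# Tiny in-memory stand-in for salesdaily.csv; the `date` column name is hypothetical.
csv_text = """date,M01AB,N05B
2024-01-01,5.0,10.0
2024-01-02,3.5,8.0
2024-01-03,4.0,9.5
"""

# parse_dates converts the column to datetimes; index_col makes it the index.
sales = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

print(type(sales.index).__name__)  # DatetimeIndex
sales.info()
```

With the real file, you would pass its path in place of the `io.StringIO` object.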

In [142]:
#read in the data


In [145]:
#look at the info


**EXAMPLE**

How do the daily average sales of `M01AB` compare to those of `N05B`?  Construct a horizontal bar chart.

In [146]:
#groupby and then plot -- kind = 'barh'


#### Multiple Summaries with `agg`

Rather than a single aggregate function, we can apply multiple aggregate functions with the `.agg` method, passing either the functions themselves or their common names as strings.  You may also use a custom function.
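
A minimal sketch of this idea on synthetic data (the values and the `value_range` helper are made up for illustration, not taken from the sales file):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the sales data: two weeks of daily values (made up).
idx = pd.date_range("2024-01-01", periods=14, freq="D")
sales = pd.DataFrame(
    {"M01AB": np.arange(14, dtype=float), "N05B": np.arange(14, dtype=float) * 2},
    index=idx,
)

def value_range(x):
    """Custom aggregate: spread between the largest and smallest value."""
    return x.max() - x.min()

# Group by weekday name, then mix built-in names (strings) with a custom function.
summary = sales.groupby(sales.index.day_name())["M01AB"].agg(
    ["mean", "median", value_range]
)
print(summary)
```

Each function becomes a column in the result, and a custom function's column takes the function's name.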

In [147]:
#mean and standard deviation


In [104]:
#custom function to find the range of the data
def lowhi(x):
    
    return x.max() - x.min()

In [134]:
#mean, median, range


#### Sparkline Example

In [106]:
#pip install sparklines

In [107]:
import sparklines

In [148]:
#sparklines function


In [140]:
#apply with mean and standard deviation


#### `.resample`

Similar to `groupby`, with a `datetime` index you can group on units of time in your data and aggregate across them.  The `.resample` method works much like `groupby`: it splits the data into time chunks, and then you apply an aggregate method.
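
Here is a small sketch with a synthetic daily series (the values are made up). It uses the `"QS"` alias, which bins by calendar quarter and labels each bin with the quarter's start date.

```python
import pandas as pd

# Synthetic daily series for the first half of 2024; each value is the month number.
idx = pd.date_range("2024-01-01", "2024-06-30", freq="D")
daily = pd.Series(idx.month.to_numpy(), index=idx, dtype=float)

# "QS" bins by calendar quarter and labels each bin with the quarter's start date.
quarterly = daily.resample("QS").mean()
print(quarterly)
```

Swapping `.mean()` for `.agg([...])` gives several summaries per quarter, just as with `groupby`.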

In [141]:
# resample by quarter and examine the mean


## Data Input via APIs

This section is about using an API (application programming interface). The basic idea is that an API allows direct access to a database, or parts of it, without having to download everything.

The `pandas-datareader` documentation is here:

https://pandas-datareader.readthedocs.io/en/latest/index.html

This documentation is good too:

http://pandas-datareader.readthedocs.io/en/latest/remote_data.html

In [1]:
import os
import pandas as pd
import matplotlib.pyplot as plt
import datetime 

In [22]:
# pip install pandas-datareader

**Quick Example**

Extract five years of 10-year constant maturity yields on U.S. government bonds.

In [151]:
import pandas_datareader as pdr
matyld = pdr.get_data_fred('GS10')

In [153]:
#look at top 5 rows


In [26]:
#look at info


**Exercise** Can you find the US unemployment rate on FRED? Use the data reader to load it, and create a plot of unemployment from 2005 onward. Challenge: can you create a histogram of unemployment rates?

#### Other Data Sources and Functionality

The documentation [here](https://pandas-datareader.readthedocs.io/en/latest/remote_data.html) shows other sources of data from the datareader.  

In [154]:
# extract aapl data from yahoo


In [155]:
# plot percent change month to month


**Exercise** Grab Lululemon's data. Plot the closing value and the volume since the beginning of the year.

### Accessing Data Without a Library

With `pandas_datareader`, we have a Python library that handles the interaction with the data source for us.  Sometimes no such library is available, and you will need to work with the API directly.  Let's try this out by looking up some information about cats.  Here is the documentation [link](https://developers.thecatapi.com/view-account/ylX4blBYT9FaoVd6OhvR?report=bOoHBz-8t). 

![](images/catapi.png)

In [162]:
import requests

**Response**

- Random Cat: https://api.thecatapi.com/v1/images/search
- 10 Bengal Cats: https://api.thecatapi.com/v1/images/search?limit=10&breed_ids=beng&api_key=REPLACE_ME
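
The query string in the Bengal URL does not have to be typed by hand. As a sketch, the snippet below lets `requests` assemble it from a dictionary of parameters. It only prepares the request, so nothing is sent over the network, and `REPLACE_ME` remains a placeholder for a real key.

```python
import requests

base_url = "https://api.thecatapi.com/v1/images/search"

# Query parameters copied from the Bengal example; REPLACE_ME stands in for a real key.
params = {"limit": 10, "breed_ids": "beng", "api_key": "REPLACE_ME"}

# Prepare (but do not send) the request to see the URL that requests would build.
prepared = requests.Request("GET", base_url, params=params).prepare()
print(prepared.url)

# Sending it would be: response = requests.get(base_url, params=params)
```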

In [178]:
#url for a random cat


In [179]:
#request of the url


In [180]:
#look at response code


In [181]:
#text of request


In [182]:
#examine the json


In [183]:
#repeat for bengal cats
#url

#make request

#turn into json


In [199]:
#extract the links to images
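
One possible way to pull out the links, sketched on a hard-coded sample rather than a live response. The sample assumes the API returns a list of objects that each carry a `url` field; the ids and links here are made up.

```python
import json

# Hypothetical sample of what response.text might look like; ids and URLs are made up.
sample_text = """[
  {"id": "abc", "url": "https://example.com/cat1.jpg", "width": 800, "height": 600},
  {"id": "def", "url": "https://example.com/cat2.jpg", "width": 640, "height": 480}
]"""

cats = json.loads(sample_text)  # equivalent to response.json() on a live response

# Each item is a dict; collect the value stored under the "url" key.
image_urls = [cat["url"] for cat in cats]
print(image_urls)
```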


**Problem**: 

![](https://dog.ceo/img/dog-api-logo.svg)

Head over to the Dog API [here](https://dog.ceo/dog-api/).  

1. Make a request that returns a list of all dog breeds.

2. Make a request that returns a random image of a dog and extract the url.  

### ALPHA VANTAGE

This is a more detailed example than the `pandas_datareader` one.  The Alpha Vantage API provides many Forex and crypto feeds as well as economic and technical indicators.  To use it, you will need an API key, so head over [here](https://www.alphavantage.co/#page-top) and let's sign up for one. 
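
One way to keep the key out of the notebook is to read it from an environment variable. This is only a sketch: the variable name `ALPHAVANTAGE_API_KEY` is a hypothetical choice, and `REPLACE_ME` is a fallback placeholder, not a working key.

```python
import os

# ALPHAVANTAGE_API_KEY is a hypothetical variable name; set it in your shell first.
api_key = os.environ.get("ALPHAVANTAGE_API_KEY", "REPLACE_ME")

# The key then travels with the other query parameters.
params = {"function": "TIME_SERIES_DAILY", "symbol": "AAPL", "apikey": api_key}
print(sorted(params))
```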

![](images/alpha.png)

In [171]:
base_url = 'https://www.alphavantage.co/query'
req = requests.get(
    base_url,
    params={
        "function": "TIME_SERIES_DAILY",
        "symbol": "AAPL",
        "apikey": "REPLACE_ME"  # substitute your own Alpha Vantage API key
    }
)

In [184]:
#examine the response
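
Once the JSON is in hand, a nested dictionary keyed by date converts naturally to a `DataFrame`. The sketch below works on a hard-coded payload. Its key names (`"Time Series (Daily)"`, `"4. close"`, and so on) and prices are assumptions about what `req.json()` returns, so inspect the real response before relying on them.

```python
import pandas as pd

# Hypothetical payload mimicking req.json(); key names and prices are assumptions.
payload = {
    "Meta Data": {"2. Symbol": "AAPL"},
    "Time Series (Daily)": {
        "2024-01-03": {"1. open": "184.2", "4. close": "184.3", "5. volume": "100"},
        "2024-01-02": {"1. open": "187.1", "4. close": "185.6", "5. volume": "200"},
    },
}

# Outer keys are dates, so orient="index" turns each date into a row.
daily = pd.DataFrame.from_dict(payload["Time Series (Daily)"], orient="index")
daily.index = pd.to_datetime(daily.index)   # strings -> datetime index
daily = daily.astype(float).sort_index()    # numeric values, oldest date first
print(daily)
```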


In [198]:
#extract the headline


**Exercise**

Use the `TIME_SERIES_DAILY` endpoint to extract data for `AAPL`. 

### Different Endpoints

Let's explore some news about Apple.  The documentation on the news & sentiment endpoint is [here](https://www.alphavantage.co/documentation/#news-sentiment). 
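
Switching endpoints mostly means changing the query parameters. As a sketch, the snippet below builds the request URL without sending it and then pulls headlines out of a hard-coded sample. The `NEWS_SENTIMENT` function name and `tickers` parameter are taken from the linked documentation as I understand it, while the `feed` and `title` keys in the sample are assumptions to verify against a real response.

```python
import requests

base_url = "https://www.alphavantage.co/query"

# REPLACE_ME is a placeholder key; function and tickers select the news endpoint.
params = {"function": "NEWS_SENTIMENT", "tickers": "AAPL", "apikey": "REPLACE_ME"}
prepared = requests.Request("GET", base_url, params=params).prepare()
print(prepared.url)

# Hypothetical slice of the JSON reply; the "feed"/"title" keys are assumptions.
sample = {"feed": [{"title": "Example headline one"}, {"title": "Example headline two"}]}
headlines = [article["title"] for article in sample["feed"]]
print(headlines)
```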

In [191]:
base_url = 'https://www.alphavantage.co/query'

In [200]:
#news about AAPL


**PROBLEM**: Extract weekly data for bitcoin (`BTC`) from the Cryptocurrency endpoint in Alpha Vantage.


#### API Wrappers

Often, someone has already written a library that wraps the API.  For example, there is a Python wrapper for the Alpha Vantage API:

- https://github.com/RomelTorres/alpha_vantage

Let's head over, install the library, and retrieve some intraday returns.

#### Summary

Great job!  You now have additional tools for accessing data from a variety of sources.  Your homework this week will involve extracting further information from the APIs and visualizing it with `seaborn` and `matplotlib`.