<h2 style="color: blue;">API Data Wrangling</h2>
<p>This notebook wrangles financial data from the Quandl API. The API token has been removed. The code is written to be as reusable as possible while remaining specific to one dataset.</p>

In [4]:
import requests
from collections import defaultdict
import re

In [5]:
# Insert your API key here. As long as it is valid, the rest of the notebook runs unchanged.
API = ''
# The key is sent as the api_key query parameter.
url = 'https://www.quandl.com/api/v3/datasets/FSE/AFX_X/data.json?api_key={!s}'.format(API)
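
<p>As a minimal sketch of how the request URL could be assembled with the standard library rather than by string formatting: the helper name <code>build_url</code> and the placeholder key are hypothetical, and the <code>start_date</code> and <code>end_date</code> filters are assumed to be accepted by the endpoint. No request is sent.</p>

```python
from urllib.parse import urlencode

BASE_URL = 'https://www.quandl.com/api/v3/datasets/FSE/AFX_X/data.json'


def build_url(api_key, **params):
    """Return the dataset URL with the key and any extra filters as query parameters.

    build_url is a hypothetical helper for illustration only.
    """
    query = {'api_key': api_key}
    query.update(params)
    # urlencode escapes the values and joins the pairs with '&'.
    return '{}?{}'.format(BASE_URL, urlencode(query))


# 'YOUR_KEY' is a placeholder, not a working key.
# start_date and end_date are assumed filter names.
print(build_url('YOUR_KEY', start_date='2018-01-01', end_date='2018-12-31'))
```

<p>Building the query string from a dictionary keeps the key and any filters in one place and avoids hand-written <code>?</code> and <code>&amp;</code> separators.</p>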

<h3 style="color: blue;">Extracting 2018 Data</h3>
<p>This function does the data wrangling. It takes an API key and one or more years, and queries the Quandl AFX_X dataset.
    It returns a dictionary that maps each requested year to the list of rows from that year.</p>

In [34]:
def financial_data_by_year(api, *args):
    """Return a defaultdict mapping each year in args to the rows of data from that year."""
    # The key is sent as the api_key query parameter.
    url = 'https://www.quandl.com/api/v3/datasets/FSE/AFX_X/data.json?api_key={!s}'.format(api)
    jsondict = defaultdict(list)
    data = requests.get(url).json()['dataset_data']
    for arg in args:
        # Each row begins with its date, so match the year at the start of row[0].
        pattern = u'{!s}'.format(arg)
        string = re.compile(pattern)
        for row in data['data']:
            if string.match(row[0]):
                jsondict[arg].append(row)
    return jsondict

<p>This call returns the rows from 2018, stored under the key 2018.</p>

In [35]:
result=financial_data_by_year(API,2018)
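
<p>The year filter can be tried without a network call. The sketch below groups a few made-up rows by year using a date prefix instead of a regular expression. The helper name <code>group_rows_by_year</code> and the sample rows are hypothetical, and the rows are assumed to start with a <code>YYYY-MM-DD</code> date string followed by the open price.</p>

```python
from collections import defaultdict

# Made-up rows in [date, open] form; these are not real market data.
sample_rows = [
    ['2018-01-03', 52.0],
    ['2018-01-02', 51.0],
    ['2017-12-29', 50.0],
]


def group_rows_by_year(rows, *years):
    """Group rows under each requested year, matching on the date prefix."""
    grouped = defaultdict(list)
    for year in years:
        prefix = '{}-'.format(year)
        for row in rows:
            if row[0].startswith(prefix):
                grouped[year].append(row)
    return grouped


by_year = group_rows_by_year(sample_rows, 2017, 2018)
print(len(by_year[2018]), len(by_year[2017]))  # 2 1
```

<p>Including the trailing hyphen in the prefix means a year can only match the start of a full date, not an arbitrary string that happens to begin with the same digits.</p>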

<h3 style="color: blue;">Converting JSON to a Dict</h3>
<p>This section turns the JSON dataset into a Python dictionary keyed by year, covering 2000 through 2018.</p>

In [45]:
def json2dict(api):
    """Wrapper for financial_data_by_year. Passes the years 2000-2018 and returns the resulting defaultdict."""
    keys = range(2000, 2019)
    return financial_data_by_year(api, *keys)

<h3 style="color: blue;">Query Function</h3>
<p>The first cell below calls json2dict once and stores the result, so the function does not need to be called again for each query.
The query function that follows is generalized to answer multiple queries.</p>

In [50]:
dict_data=json2dict(API)

In [95]:
def financial_query(api, query, *years):
    """Answer a query over the rows stored in dict_data for the given years.

    'high' or 'low' returns (highest, lowest) opening price; the open price is
    the 2nd element of each row. 'change' returns the largest difference between
    row[2] and row[3]. Rows with missing values are skipped.
    The api argument is unused here; it is kept so existing calls still work.
    """
    query = query.lower().strip()
    if query in ('high', 'low'):
        high, low = None, None
        for year in years:
            for row in dict_data[year]:
                if row[1] is None:
                    continue
                # Start from the first valid price so that low is not stuck at 0.
                if high is None or row[1] > high:
                    high = row[1]
                if low is None or row[1] < low:
                    low = row[1]
        return high, low
    elif query == 'change':
        diff = 0
        for year in years:
            for row in dict_data[year]:
                if row[2] is None or row[3] is None:
                    continue
                change = abs(row[2] - row[3])
                if diff < change:
                    diff = change
        return diff
    raise ValueError("query must be 'high', 'low' or 'change'")
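
<p>The two queries can be illustrated on a few made-up rows. This is a sketch rather than the notebook's own implementation: the helper names <code>open_range</code> and <code>largest_spread</code> are hypothetical, and the third and fourth elements of each row are assumed to be the daily high and low. Missing values are filtered out before <code>max</code> and <code>min</code> are taken, which avoids seeding a running minimum with 0 (a value that positive prices would never go below).</p>

```python
# Made-up rows in [date, open, high, low] form; None marks a missing value.
# The column order after the open price is an assumption.
sample_rows = [
    ['2017-01-02', 35.0, 36.0, 34.5],
    ['2017-01-03', None, 37.0, 35.0],
    ['2017-01-04', 36.5, None, 35.5],
    ['2017-01-05', 34.0, 38.0, 33.0],
]


def open_range(rows):
    """Return (highest, lowest) opening price, ignoring rows with no open price."""
    opens = [row[1] for row in rows if row[1] is not None]
    return max(opens), min(opens)


def largest_spread(rows):
    """Return the largest difference between row[2] and row[3], ignoring incomplete rows."""
    spreads = [abs(row[2] - row[3]) for row in rows
               if row[2] is not None and row[3] is not None]
    return max(spreads)


print(open_range(sample_rows))      # (36.5, 34.0)
print(largest_spread(sample_rows))  # 5.0
```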

<h3 style="color: blue;">Finding Highest and Lowest Opening Prices</h3>
<p>Here the highest and lowest opening prices for 2017 are calculated.</p>

In [96]:
high,low = financial_query(API,'high',2017)

<h3 style="color: blue;">Finding Largest Fluctuation</h3>
<p>Here the greatest single-day difference between the high and low prices in 2017 is calculated.</p>

In [100]:
diff= financial_query(API,'change',2017)
