# Labs: Download Pricing Data from Quandl

In this lab, you are going to:

1. View a demonstration of downloading data using the `quandl` module.
2. Recreate the same functionality by pulling the data directly from a URL.

By recreating this task, we hope to familiarize you with the process of downloading data via (most) APIs.

## 1. Importing Modules

In a Jupyter notebook, we usually put all module imports and global settings in the first code cell. This cell can be run multiple times without changing the state of your notebook, so if a later cell needs a module or setting added or removed, you can simply edit and re-run the first cell without having to re-run the cells that follow it.

### Setting The Configuration

Create a copy of `config.cfg.default` and rename it to `config.cfg`. Afterward, update its content to include your Quandl API key.

**Why is this needed?**

We do not want to put sensitive data, such as an API key, in code that may become publicly available. One way to deal with this is to keep the sensitive values in a configuration file and exclude that file from the Git repository.
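
As a minimal sketch of this pattern, assuming the file has an `[App]` section with a `QUANDL_API_KEY` entry (the section and key names that the first code cell reads), the snippet below writes a placeholder configuration to a temporary directory and reads it back with `configparser`. The helper name `load_api_key`, the value `YOUR_API_KEY_HERE`, and the temporary path are illustrative assumptions, not part of the lab files.

```python
import configparser
import os
import tempfile

# Placeholder content; a real config.cfg would hold your own key.
SAMPLE_CONFIG = """\
[App]
QUANDL_API_KEY = YOUR_API_KEY_HERE
"""


def load_api_key(path):
    """Return the QUANDL_API_KEY stored in the [App] section of `path`."""
    config = configparser.ConfigParser()
    # read() skips files it cannot open and returns the list of files it parsed,
    # so an empty result means the config file is missing.
    if not config.read(path):
        raise FileNotFoundError("Config file not found: {}".format(path))
    return config['App']['QUANDL_API_KEY']


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'config.cfg')
    with open(sample_path, 'w') as f:
        f.write(SAMPLE_CONFIG)
    api_key = load_api_key(sample_path)

print(api_key)
```

Listing `config.cfg` in `.gitignore` keeps the real key out of the repository, while `config.cfg.default` stays tracked as a template.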

In [1]:
# os module contains functions to work with local directories.
import os
import requests

# For writing the data
import csv

import configparser
import quandl

# For presenting the data nicely
from prettytable import PrettyTable

config = configparser.ConfigParser()
config.read('config.cfg')

API_PATH = "https://www.quandl.com/api/v3/datasets/EOD/{sym}?start_date={sd}&end_date={ed}&api_key={key}"
start_date = '2017-11-28'
end_date = '2017-12-28'
QUANDL_API_KEY = config['App']['QUANDL_API_KEY']

# Stock symbol
symbol = 'HD'

# Path of the downloaded csv file. This code means the name will be `HD.csv`
filepath = "{}.csv".format(symbol)

## 2. Download Data via `quandl` Module

First, let's try to download data directly using the `quandl` module. This is the preferred method if you are processing Quandl data in Python.

In [2]:
# Todo: Use quandl module to download data and store it in a pandas DataFrame object `df` 
#      (you will learn about pandas DataFrame in the next lesson).
quandl.ApiConfig.api_key = QUANDL_API_KEY
df = quandl.get('EOD/HD', start_date=start_date, end_date=end_date)

In [3]:
df

| Date | Open | High | Low | Close | Volume | Dividend | Split | Adj_Open | Adj_High | Adj_Low | Adj_Close | Adj_Volume |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2017-11-28 | 174.89 | 176.77 | 172.9911 | 176.57 | 6048963.0 | 0.0 | 1.0 | 165.715597 | 167.496976 | 163.91631 | 167.307467 | 6048963.0 |
| 2017-11-29 | 176.57 | 178.43 | 175.68 | 177.25 | 4901810.0 | 0.89 | 1.0 | 168.147544 | 169.918822 | 167.299998 | 168.795108 | 4901810.0 |
| 2017-11-30 | 178.07 | 180.67 | 177.41 | 179.82 | 9179321.0 | 0.0 | 1.0 | 169.575994 | 172.051973 | 168.947476 | 171.242518 | 9179321.0 |
| 2017-12-01 | 180.32 | 180.6 | 176.7 | 180.42 | 4682482.0 | 0.0 | 1.0 | 171.718668 | 171.985312 | 168.271343 | 171.813898 | 4682482.0 |
| 2017-12-04 | 183.19 | 186.31 | 183.19 | 184.9 | 6192507.0 | 0.0 | 1.0 | 174.451768 | 177.422943 | 174.451768 | 176.0802 | 6192507.0 |
| 2017-12-05 | 184.79 | 184.91 | 182.2589 | 182.85 | 6347228.0 | 0.0 | 1.0 | 175.975447 | 176.089723 | 173.565082 | 174.127986 | 6347228.0 |
| 2017-12-06 | 180.25 | 182.15 | 178.68 | 180.8 | 6891093.0 | 0.0 | 1.0 | 171.652007 | 173.461376 | 170.156896 | 172.175772 | 6891093.0 |
| 2017-12-07 | 180.05 | 182.58 | 179.77 | 182.0 | 5484611.0 | 0.0 | 1.0 | 171.461547 | 173.870865 | 171.194903 | 173.318531 | 5484611.0 |
| 2017-12-08 | 182.5 | 183.9 | 182.16 | 183.41 | 5091538.0 | 0.0 | 1.0 | 173.794681 | 175.1279 | 173.470899 | 174.661274 | 5091538.0 |
| 2017-12-11 | 182.9 | 182.96 | 181.1201 | 182.25 | 6038891.0 | 0.0 | 1.0 | 174.175601 | 174.232739 | 172.480603 | 173.556606 | 6038891.0 |


The output above shows your dataset directly as a nicely formatted table.

In the following section, you will perform the same operation by manually pulling the data via a GET request to a URL.

## 3. Download Data via URL

The process has the following steps:
1. Prepare a URL from `API_PATH` by replacing the symbol, dates, and key placeholders with the global settings created in the first code cell above.
2. Use the `requests` module to send a GET request to that URL.
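
The first step can be tried without touching the network. In the sketch below, `build_url` is a hypothetical helper name and `DEMO_KEY` is a placeholder rather than a real key; the template is the `API_PATH` from the first code cell. The URL is split apart again only to check that every placeholder was filled in.

```python
from urllib.parse import urlsplit, parse_qs

API_PATH = ("https://www.quandl.com/api/v3/datasets/EOD/{sym}"
            "?start_date={sd}&end_date={ed}&api_key={key}")


def build_url(symbol, start_date, end_date, api_key):
    """Fill the API_PATH template with the request settings."""
    return API_PATH.format(sym=symbol, sd=start_date, ed=end_date, key=api_key)


# DEMO_KEY is a placeholder, not a working API key.
url = build_url('HD', '2017-11-28', '2017-12-28', 'DEMO_KEY')

# Split the URL into its path and query parameters for inspection.
parts = urlsplit(url)
query = parse_qs(parts.query)

print(parts.path)
print(query['start_date'][0], query['end_date'][0])
```

String formatting is enough here because the symbol and dates contain no characters that need escaping. For values that might, `requests.get` also accepts a `params` dictionary and URL-encodes it for you.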

In [4]:
# Todo: Pull data from the API URL and present the data in a table.

# Fill the placeholders in API_PATH with the settings defined above.
url = API_PATH.format(sd=start_date, ed=end_date, sym=symbol,
                      key=QUANDL_API_KEY)

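# Send a GET request to the URL and stop early if the server reports an HTTP error.
response = requests.get(url)
response.raise_for_status()

# The JSON payload keeps the rows and the column names under the 'dataset' key.
content = response.json()
data = content['dataset']['data']
column_names = content['dataset']['column_names']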

In [5]:
# We use PrettyTable to nicely display the table:
table = PrettyTable(column_names)
for row in data:
    table.add_row(row)
print(table)

+------------+---------+----------+----------+--------+-----------+----------+-------+-----------------+-----------------+-----------------+-----------------+------------+
|    Date    |   Open  |   High   |   Low    | Close  |   Volume  | Dividend | Split |     Adj_Open    |     Adj_High    |     Adj_Low     |    Adj_Close    | Adj_Volume |
+------------+---------+----------+----------+--------+-----------+----------+-------+-----------------+-----------------+-----------------+-----------------+------------+
| 2017-12-28 |  190.91 |  190.98  |  189.64  | 189.78 | 3175631.0 |   0.0    |  1.0  | 181.80352078562 | 181.87018175914 | 180.59410026601 | 180.72742221306 | 3175631.0  |
| 2017-12-27 |  190.6  |  191.49  |  190.01  | 190.19 | 5912613.0 |   0.0    |  1.0  | 181.50830790288 | 182.35585456623 | 180.94645112606 | 181.11786505797 | 5912613.0  |
| 2017-12-26 |  188.53 |  190.42  |  188.34  | 190.36 | 2969182.0 |   0.0    |  1.0  | 179.53704768589 | 181.33689397097 | 179.35611075776 |

## 4. Store The Data in a CSV File

For the last step, you are going to store the data in a CSV file so that you can reuse it without a network connection. As the analysis gets more complex, you will need to combine multiple data sources, so it is important to have the data stored locally for instant access.
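
One detail worth knowing before reading the file back: the `csv` module returns every field as a string, so numeric columns need converting if you want to compute with them. The sketch below round-trips a few columns of two rows from the section 3 output through a temporary file; the temporary path and the reduced column set are for illustration only.

```python
import csv
import os
import tempfile

column_names = ['Date', 'Open', 'High', 'Low', 'Close', 'Volume']
data = [
    ['2017-12-28', 190.91, 190.98, 189.64, 189.78, 3175631.0],
    ['2017-12-27', 190.6, 191.49, 190.01, 190.19, 5912613.0],
]

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'HD.csv')

    # newline='' lets the csv module control line endings itself.
    with open(sample_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(column_names)
        writer.writerows(data)

    with open(sample_path, 'r', newline='') as f:
        rows = list(csv.reader(f))

header, string_rows = rows[0], rows[1:]

# Every field comes back as a string; convert the numeric columns explicitly.
typed_rows = [[row[0]] + [float(value) for value in row[1:]]
              for row in string_rows]

print(type(string_rows[0][1]).__name__)
print(typed_rows == data)
```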

In [6]:
# Todo: Store in a CSV whose filename is in variable `filepath`.

# newline='' lets the csv module handle line endings (avoids blank rows on Windows).
with open(filepath, 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(column_names)
    writer.writerows(data)

In [7]:
# When the CSV file is correct, the following output should match the output
# of PrettyTable in section 3 above.

saved_data = []
with open(filepath, 'r', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        saved_data.append(row)

table = PrettyTable(saved_data[0])
for row in saved_data[1:]:
    table.add_row(row)
print(table)

+------------+---------+----------+----------+--------+-----------+----------+-------+-----------------+-----------------+-----------------+-----------------+------------+
|    Date    |   Open  |   High   |   Low    | Close  |   Volume  | Dividend | Split |     Adj_Open    |     Adj_High    |     Adj_Low     |    Adj_Close    | Adj_Volume |
+------------+---------+----------+----------+--------+-----------+----------+-------+-----------------+-----------------+-----------------+-----------------+------------+
| 2017-12-28 |  190.91 |  190.98  |  189.64  | 189.78 | 3175631.0 |   0.0    |  1.0  | 181.80352078562 | 181.87018175914 | 180.59410026601 | 180.72742221306 | 3175631.0  |
| 2017-12-27 |  190.6  |  191.49  |  190.01  | 190.19 | 5912613.0 |   0.0    |  1.0  | 181.50830790288 | 182.35585456623 | 180.94645112606 | 181.11786505797 | 5912613.0  |
| 2017-12-26 |  188.53 |  190.42  |  188.34  | 190.36 | 2969182.0 |   0.0    |  1.0  | 179.53704768589 | 181.33689397097 | 179.35611075776 |