# Weather Underground API Script

The purpose of this notebook is to illustrate two scripts that acquire data from Weather Underground.

The first script downloads data as-is in JSON format, given a `start date`, an `end date` and a `station id`. The second parses the data out of the JSON files and stores it in a format accepted by our current database schema.

## API Request Script

To use the API, we need to register for a key. The key does NOT give unlimited access: about one year's worth of data can be requested per day. If you exceed the request limit, access times out until the following day. The exception to this is their `raindrop` system; check the API documentation for more details. In the script version of this notebook, the key is passed as a parameter when running the script rather than loaded from a file.

In [6]:
with open('keys/api.txt', 'r') as f:
    key = f.readline().rstrip()
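
For the script version, where the key is passed as a parameter, the command-line handling might look like the sketch below. The function name `parse_args` and the argument names (`key`, `start_date`, `end_date`, `station_id`, `--save-path`) are assumptions made for illustration; the actual script may name or order them differently.

```python
import argparse

def parse_args(argv=None):
    """Parse command-line parameters (argument names are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Download raw Weather Underground readings.")
    parser.add_argument("key", help="API key")
    parser.add_argument("start_date", help="first day, inclusive (YYYYMMDD)")
    parser.add_argument("end_date", help="last day, inclusive (YYYYMMDD)")
    parser.add_argument("station_id", help="station ID, e.g. KHIKAPOL23")
    parser.add_argument("--save-path", default="data/",
                        help="folder for the downloaded JSON files")
    # argv=None makes argparse read sys.argv; a list can be passed for testing
    return parser.parse_args(argv)

# Example with an explicit argument list instead of the real command line
args = parse_args(["<api_key>", "20170101", "20170103", "KHIKAPOL23"])
print(args.station_id, args.save_path)
```

Passing the key on the command line keeps it out of the source file, which is presumably why the script version does not read it from `keys/api.txt`.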

Next, we specify: 

 * `start date` (inclusive): YYYYMMDD
 * `end date` (inclusive): YYYYMMDD
 * `station ID`: the ID shown on the station's page on the Weather Underground website. For example, for https://www.wunderground.com/cgi-bin/findweather/getForecast?query=pws:KHIKAPOL23 the ID is **KHIKAPOL23**.
 * `save_path`: path to the folder where the downloaded readings will be stored, one file per day of readings, each named after the date of that day.

In [62]:
# These values are passed as parameters when running the script
start_date = '20170101'
end_date = '20170103'
station_id = 'KHIKAPOL23'
save_path = "data/"
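
Because both dates must follow the `YYYYMMDD` format, it can help to validate them before any request is sent. The helper below is a sketch and not part of the original script; `validate_date_range` is a hypothetical name.

```python
import datetime

def validate_date_range(start_date, end_date):
    """Return (start, end) as datetime.date objects, or raise ValueError.

    Hypothetical helper: checks the YYYYMMDD format and the date ordering.
    """
    for value in (start_date, end_date):
        # strptime alone is lenient about digit counts, so check the shape first
        if len(value) != 8 or not value.isdigit():
            raise ValueError("expected YYYYMMDD, got %r" % value)
    # strptime raises ValueError for impossible dates such as 20170230
    start = datetime.datetime.strptime(start_date, "%Y%m%d").date()
    end = datetime.datetime.strptime(end_date, "%Y%m%d").date()
    if end < start:
        raise ValueError("end date %s is before start date %s" % (end_date, start_date))
    return start, end

first_day, last_day = validate_date_range("20170101", "20170103")
print((last_day - first_day).days + 1)  # number of daily requests needed
```

Since each request covers a single day, the printed count is also the number of requests the range will consume from the daily allowance.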

The script then uses the API key, start date, end date, station ID and save path to:

 * Enumerate every date in the range, inclusive (`create_list_of_dates`).
 * Build the request URL for one date at a time, since each request returns one day of readings; this is a constraint of WU's API (`create_request_url`).
 * Store the data as-is in an individual JSON file per day (`download_one_day_readings`). Preserving the raw data lets us extract, even years later, any additional information that the second script did not store in the database at this point in time.

In [52]:
import datetime
def create_list_of_dates(start_date,end_date):
    '''Creates a list of dates based on start and end date, inclusive for both dates. Each request uses one date at a time.'''
    # Both dates are expected as YYYYMMDD strings, e.g. "20170101"
    start = datetime.datetime.strptime(start_date, "%Y%m%d")
    end = datetime.datetime.strptime(end_date, "%Y%m%d")
    dates_list = [start + datetime.timedelta(days=x) for x in range(0, (end-start).days+1)]
    return dates_list

def create_request_url(date,station_id,key):
    '''Creates the URL to obtain the data. This is just a formatting function, it does not send the request itself.'''
    api_url = 'http://api.wunderground.com/api/'
    key = key+'/'
    date = 'history_' + date + '/'
    station = 'q/pws:'+station_id+'.json'
    return (api_url+key+date+station)

import urllib.request
import shutil
def download_one_day_readings(date,url,save_path):
    '''Saves to specified path a file with the date as name downloaded from the formatted url.'''
    with urllib.request.urlopen(url) as response, open(save_path+date+'.json', 'wb') as out_file:
        shutil.copyfileobj(response, out_file)
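
As written, `download_one_day_readings` assumes that `save_path` already exists and ends with a path separator. A slightly more defensive variant is sketched below; `download_to_folder` is a hypothetical name, and creating the folder on demand is a design choice that the original script does not make.

```python
import os
import shutil
import urllib.request

def download_to_folder(date, url, save_path):
    """Download url into <save_path>/<date>.json and return the file path.

    Hypothetical variant of download_one_day_readings.
    """
    os.makedirs(save_path, exist_ok=True)  # create the folder if it is missing
    target = os.path.join(save_path, date + ".json")  # no trailing slash needed
    with urllib.request.urlopen(url) as response, open(target, "wb") as out_file:
        shutil.copyfileobj(response, out_file)  # stream the body to disk unchanged
    return target
```

Returning the path makes it easy to log which file was written for each date.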

    

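Calling these functions in sequence retrieves the desired data in raw format. In the cell below the download call is commented out, so only the request URLs are printed: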

In [64]:
dates_list = create_list_of_dates(start_date,end_date)
for date in dates_list:
    date = date.strftime("%Y%m%d")
    url = create_request_url(date,station_id,key='<api_key>')
    print(url)
    #download_one_day_readings(date,url,save_path)

http://api.wunderground.com/api/<api_key>/history_20170101/q/pws:KHIKAPOL23.json
http://api.wunderground.com/api/<api_key>/history_20170102/q/pws:KHIKAPOL23.json
http://api.wunderground.com/api/<api_key>/history_20170103/q/pws:KHIKAPOL23.json


## Weather Underground JSON to ERDL Database Script