### Get Historical Weather

London's 32 boroughs and the City of London are grouped into the following 5 main areas:
- Inner (Inner London, including City of London)
- O1 (Outer London area 1)
- O2 (Outer London area 2)
- O3 (Outer London area 3)
- O4 (Outer London area 4)

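The Dark Sky API has a limit of 1,000 calls per day. Each call returns one day of hourly data for one location, so 3 years of weather data for the 5 areas above takes about 3 × 365 × 5 = 5,475 calls. To avoid exceeding the limit, the script has to be run in batches over 6 days.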
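
The call budget can be checked with a short sketch. It assumes one call per area per day and, because the text only says "3 years", an illustrative period of 2014 to 2016. `plan_batches` is a hypothetical helper and is not part of the original script.

```python
from datetime import date, timedelta

DAILY_CALL_LIMIT = 1000  # daily limit quoted above
N_AREAS = 5              # Inner, O1, O2, O3, O4


def plan_batches(start, end, n_areas=N_AREAS, limit=DAILY_CALL_LIMIT):
    """Split [start, end] into consecutive date ranges that each fit in one day's quota."""
    days_per_batch = limit // n_areas  # one call per area per calendar day
    batches = []
    batch_start = start
    while batch_start <= end:
        batch_end = min(batch_start + timedelta(days=days_per_batch - 1), end)
        batches.append((batch_start, batch_end))
        batch_start = batch_end + timedelta(days=1)
    return batches


# Assumed period for illustration only (the source does not name the 3 years).
batches = plan_batches(date(2014, 1, 1), date(2016, 12, 31))
total_calls = sum(((b_end - b_start).days + 1) * N_AREAS for b_start, b_end in batches)

for b_start, b_end in batches:
    print(b_start, "to", b_end)
print("batches:", len(batches), "total calls:", total_calls)
```

For the assumed 2014 to 2016 period this gives 6 batches, which matches the 6 days mentioned above.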

How to map accident data to weather data:
- Combine the accident and non-accident samples
- Look up the 'Area' for each 'Local Authority' code (from the Local_Authority_DistrictLondon_gps.csv file)
- Convert the accident (or non-accident) date and time to UNIX time
- Use the 'Area' and UNIX time to look up the weather data
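
The mapping steps above can be sketched as follows. This is a minimal illustration on made-up rows, not the project's actual join. The column names `Date` and `Time`, the `dd/mm/yyyy` date format, the `Europe/London` timezone, and the assumption that hourly weather rows are keyed by the UNIX time at the start of each hour are all assumptions; the temperature values are invented.

```python
import pandas as pd

# Hypothetical accident / non-accident samples (column names are assumptions).
samples = pd.DataFrame({
    "Local Authority": [1, 2],
    "Date": ["01/07/2014", "01/07/2014"],
    "Time": ["08:35", "17:10"],
})

# Borough-to-area lookup, shaped like Local_Authority_DistrictLondon_gps.csv.
lookup = pd.DataFrame({"code": [1, 2], "Area": ["Inner", "Inner"]})

# Hypothetical hourly weather rows keyed by Area and UNIX time (seconds).
weather = pd.DataFrame({
    "Area": ["Inner", "Inner"],
    "time": [1404198000, 1404230400],
    "temperature": [60.1, 71.3],
})

# Step 1: Local Authority code -> Area.
samples = samples.merge(lookup, left_on="Local Authority", right_on="code", how="left")

# Step 2: local date and time -> UNIX seconds, floored to the start of the hour.
local = pd.to_datetime(samples["Date"] + " " + samples["Time"], format="%d/%m/%Y %H:%M")
utc = local.dt.tz_localize("Europe/London").dt.tz_convert("UTC")
unix = (utc - pd.Timestamp("1970-01-01", tz="UTC")) // pd.Timedelta(seconds=1)
samples["time"] = unix - unix % 3600

# Step 3: Area + UNIX time -> weather record.
merged = samples.merge(weather, on=["Area", "time"], how="left")
print(merged[["Local Authority", "Area", "time", "temperature"]])
```

A left join keeps every sample even when no weather row matches, so gaps show up as missing values rather than dropped rows.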

In [2]:
import os
path = r"d:\cyk\gatech\CSE6242\ProjectCSE6242\ProjectUKdata"  # raw string: backslashes are not treated as escapes
#path = r"d:\gatech\CSE6242\Project\data"
os.chdir(path)

### Import Libraries

In [3]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import json
import requests
from datetime import datetime, timedelta, date

key = "xxxx"   # NOTE: replace with your own Dark Sky API key


This file maps each London borough to the 'Area' used when calling the API.

In [4]:
LocDis_df = pd.read_csv(r"contextCSVs\Local_Authority_DistrictLondon_gps.csv")  # raw string for the Windows path
#LocDis_df = pd.read_csv(r"contextCSV\localauthlondon.csv")
LocDis_df.head(10)

Unnamed: 0,code,label,Area,lat,long
0,1,Westminster,Inner,51.51279,-0.09184
1,2,Camden,Inner,51.51279,-0.09184
2,3,Islington,Inner,51.51279,-0.09184
3,4,Hackney,Inner,51.51279,-0.09184
4,5,Tower Hamlets,Inner,51.51279,-0.09184
5,6,Greenwich,Inner,51.51279,-0.09184
6,7,Lewisham,Inner,51.51279,-0.09184
7,8,Southwark,Inner,51.51279,-0.09184
8,9,Lambeth,Inner,51.51279,-0.09184
9,10,Wandsworth,Inner,51.51279,-0.09184


Summarise the above file into the 5 London areas, each with its reference GPS coordinates.

In [6]:
# Get only the 5 London areas and reference GPS for getting weather data.
# sort_values returns a new DataFrame, so LocDis_df itself is not modified.
area_df = LocDis_df.sort_values("Area")
area_df = area_df.drop_duplicates(subset=["Area", "lat", "long"], keep="first")
area_df = area_df[["Area", "lat", "long"]]

area_df

Unnamed: 0,Area,lat,long
0,Inner,51.51279,-0.09184
12,O1,51.654236,-0.080747
13,O2,51.573696,0.124922
22,O3,51.398475,-0.094401
24,O4,51.541067,-0.420668


In [31]:
weather_column = ['Area','lat','long','time','cloudCover','dewPoint','humidity','precipType','precipIntensity','precipProbability','pressure','icon','temperature','uvIndex','visibility','windGust','windSpeed']

# Function to yield every date from date1 to date2 (inclusive)
def daterange(date1, date2):
    for n in range((date2 - date1).days + 1):
        yield date1 + timedelta(n)


start_dt = date(2014, 7, 1)
end_dt = date(2014, 12, 31)
#end_dt = date(2014, 1, 1)
rows = []  # collect hourly records in a list; DataFrame.append was removed in pandas 2.0
for dt in daterange(start_dt, end_dt):
    tm = dt.strftime("%Y-%m-%d") + "T" + "00:00:01"  # get 1 day data at a time
#    for index, row in area_df.iloc[[0]].iterrows():
    print("date:" + str(dt))
    for index, row in area_df.iterrows():
        area = row['Area']
        lat = row["lat"]
        long = row["long"]
        URL = "https://api.darksky.net/forecast/" + key + "/" + str(lat) + "," + str(long) + "," + tm + "?exclude=currently,minutely,flags"
        response = requests.get(URL)
        response.raise_for_status()  # stop on HTTP errors instead of parsing an error body
        data = response.json()
        # Loop over the hourly data and tag each record with its area and GPS
        for elem in data["hourly"]["data"]:
            elem.update({'Area': area, 'lat': lat, 'long': long})
            rows.append(elem)

# Build the DataFrame once: the expected columns come first, followed by any extra fields returned
df_weather = pd.DataFrame(rows)
extra_columns = [c for c in df_weather.columns if c not in weather_column]
df_weather = df_weather.reindex(columns=weather_column + extra_columns)

str1 = "weather" + str(start_dt) + "_" + str(end_dt) + ".csv"
df_weather.to_csv(str1)

date:2014-07-01
date:2014-07-02
date:2014-07-03
date:2014-07-04
date:2014-07-05
date:2014-07-06
date:2014-07-07
date:2014-07-08
date:2014-07-09
date:2014-07-10
date:2014-07-11
date:2014-07-12
date:2014-07-13
date:2014-07-14
date:2014-07-15
date:2014-07-16
date:2014-07-17
date:2014-07-18
date:2014-07-19
date:2014-07-20
date:2014-07-21
date:2014-07-22
date:2014-07-23
date:2014-07-24
date:2014-07-25
date:2014-07-26
date:2014-07-27
date:2014-07-28
date:2014-07-29
date:2014-07-30
date:2014-07-31
date:2014-08-01
date:2014-08-02
date:2014-08-03
date:2014-08-04
date:2014-08-05
date:2014-08-06
date:2014-08-07
date:2014-08-08
date:2014-08-09
date:2014-08-10
date:2014-08-11
date:2014-08-12
date:2014-08-13
date:2014-08-14
date:2014-08-15
date:2014-08-16
date:2014-08-17
date:2014-08-18
date:2014-08-19
date:2014-08-20
date:2014-08-21
date:2014-08-22
date:2014-08-23
date:2014-08-24
date:2014-08-25
date:2014-08-26
date:2014-08-27
date:2014-08-28
date:2014-08-29
date:2014-08-30
date:2014-08-31
date:201