# Weather API 


In this notebook we import the weather data, preprocess it if needed, and save it as a new DataFrame.


### Historical Data 

To train and test the model we use historical weather data covering the same period as the provided energy consumption data.

For that, we'll use the API provided by [AT-Wetter](http://at-wetter.tk/index.php?men=api). We use it instead of the OpenWeather API because retrieving more than a week of historical data from OpenWeather (we need a year) would require several queries.


#### API 

This API doesn't need a key or any other authentication. It's located on the AT-Server at `http://at-wetter.tk/api/v1/`.
The instructions for using it are on their [website](http://at-wetter.tk/index.php?men=api).

#### Example 

To query **all the available stations** we need a call following the format: 
`/api/v1/[field]/[YYYY-MM-DD]/[day count]` 

A query call to `http://at-wetter.tk/api/v1/t/2014-08-01/2` returns temperature data for every available station for 2014-08-01 and 2 days into the past.



To query just **one specific station** we use:

`/api/v1/station/[station]/[field]|[all]/[YYYY-MM-DD]/[day count]`

This returns `[field]` (or every field, with `all`) for the station `[station]`, covering `[day count]` days into the past from the date `[YYYY-MM-DD]`.
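
The two URL patterns above can be assembled with a small helper. This is a minimal sketch: the function name `build_url` and its parameters are hypothetical and not part of the AT-Wetter API; only the base URL and the path layout come from the formats quoted above. Station `11240` is the Graz station used later in this notebook.

```python
import datetime as dt

BASE_URL = "http://at-wetter.tk/api/v1"

def build_url(field, date, day_count, station=None):
    """Build a query URL for all stations, or for one station if given (hypothetical helper)."""
    parts = [BASE_URL]
    if station is not None:
        # Single-station pattern: /station/[station]/[field]|[all]/...
        parts += ["station", str(station)]
    # strftime zero-pads month and day, matching [YYYY-MM-DD]
    parts += [field, date.strftime("%Y-%m-%d"), str(day_count)]
    return "/".join(parts)

print(build_url("t", dt.date(2014, 8, 1), 2))
print(build_url("all", dt.date(2014, 8, 1), 2, station="11240"))
```

Nothing is sent over the network here; the resulting string can be passed to `rq.request("GET", ...)` as in the cells below.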



### Calling the API


In [2]:
import requests as rq
import json
import datetime as dt

* See the available stations 

In [None]:
URL ="http://at-wetter.tk/api/v1/stations"
response = rq.request("GET", URL)
print(response.text)

* Calculate the day count from the Building Energy data set

We could do it manually, but in order to automate all the steps we'll calculate it with a Python script.

In [None]:
import pandas as pd 
DataFrame = pd.read_excel("./DataSets/Building_energy.xlsx", header=1)
DataFrame.head()

#Calculate the Delta Time 

start_date = dt.datetime.strptime(DataFrame['Time'][0], "%Y-%m-%d %H:%M:%S")
end_date = dt.datetime.strptime(DataFrame['Time'][len(DataFrame)-1], "%Y-%m-%d %H:%M:%S")
day_count = end_date - start_date
print(day_count.days)


As the `[day count]` field of the API query doesn't allow hours, if there are any hours of difference between the start day and the end day, we add an extra day. Later, when matching the data frames, we'll delete the extra hours of the extra day.

In [None]:
dayCount=0 
if day_count.seconds //3600 > 0: 
    dayCount = day_count.days + 1
else: 
    dayCount = day_count.days
print(dayCount)
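
The same rounding can also be written as a single ceiling division. A minimal sketch, assuming the start and end are `datetime` objects; `ceil_day_count` is a hypothetical helper name and the dates are made up. Unlike the cell above, it also rounds up when the remainder is shorter than one hour.

```python
import datetime as dt
import math

def ceil_day_count(start, end):
    """Number of whole days needed to cover the span from start to end."""
    # Dividing two timedeltas gives a float number of days
    return math.ceil((end - start) / dt.timedelta(days=1))

# 2 days and 5 hours apart, so 3 days are needed
print(ceil_day_count(dt.datetime(2020, 1, 1, 0, 0), dt.datetime(2020, 1, 3, 5, 0)))  # 3
```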

* Download the temperature data

In [None]:
station = '11240'  # GRAZ
field = 'all'  # all fields, not only the temperature
# dayCount comes from the previous cell (day count rounded up to whole days)

URL = "http://at-wetter.tk/api/v1/station/"
URL += station
URL += "/" + field
URL += "/" + end_date.strftime("%Y-%m-%d")  # zero-padded to match [YYYY-MM-DD]
URL += "/" + str(dayCount)
response = rq.request("GET", URL)

print(URL)
#print(response.text)
type(response.text)


In [None]:
#response.json(encoding="utf-8")
#print(response.text)

* Saving the data into a Data Frame 

Since the data can't be downloaded in JSON format, we first split the whole text into lines, and then each line into elements.

In [None]:
lines = response.text.splitlines()
ColumnNames = [ (col.replace("'",'')) for col in lines[0].split(";") ]
WeatDataFrame = pd.DataFrame(columns=ColumnNames, index=range(len(lines)-2)) # -1 for the column-name row and -1 again because the index starts from 0

The index of the data frame starts from 0, but the first line of `response.text` is the column-name row. So we use `idx` for the DataFrame index and `idx+1` for `lines`.

In [None]:
for idx in range(len(lines)-2):  
    WeatDataFrame.iloc[idx] = [ (col.replace("'",'')) for col in lines[idx+1].split(";") ]
WeatDataFrame
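
As an alternative to filling the frame row by row, pandas can parse the semicolon-separated text directly. A minimal sketch, assuming the response is plain `;`-separated text with values wrapped in single quotes, as the cells above imply. The `sample` string is made up for illustration (only the `timestamp`, `datum` and `zeit` column names appear in this notebook), and `dtype=str` keeps every value as text, like the loop above does.

```python
import io
import pandas as pd

# Made-up sample in the assumed response layout; in the notebook this would be response.text
sample = (
    "'timestamp';'station';'datum';'zeit';'t'\n"
    "'1406851200';'11240';'2014-08-01';'00:00';'17.3'\n"
    "'1406854800';'11240';'2014-08-01';'01:00';'16.9'\n"
)

# quotechar="'" strips the single quotes, sep=";" splits the columns
weather = pd.read_csv(io.StringIO(sample), sep=";", quotechar="'", dtype=str)
print(weather)
```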

* Changing the datum and zeit columns into a new one with datetime type objects

In [None]:
WeatDataFrame.info()

In [None]:
WeatDataFrame["datum"][:]

In [None]:
import datetime as dt  # same alias as in the imports cell, so earlier cells keep working

dates = []
for idx in range(len(WeatDataFrame["datum"])):
    str_date = str(WeatDataFrame["datum"][idx]) + " " + str(WeatDataFrame["zeit"][idx])
    # use a separate name so the dt alias is not overwritten inside the loop
    parsed = dt.datetime.strptime(str_date, "%Y-%m-%d %H:%M")
    dates.append(parsed)

print(len(dates), len(WeatDataFrame.index))
WeatDataFrame.insert(0,"Time",dates)

In [None]:
WeatDataFrame
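
The loop above can also be replaced with a vectorized conversion. A minimal sketch, assuming `datum` and `zeit` hold strings such as `2014-08-01` and `00:00` (the format string used above); the small frame `demo` is made up for illustration.

```python
import pandas as pd

# Made-up frame with the two text columns used in this notebook
demo = pd.DataFrame({"datum": ["2014-08-01", "2014-08-01"], "zeit": ["00:00", "01:00"]})

# Concatenate the two text columns and parse them in one call
time_column = pd.to_datetime(demo["datum"] + " " + demo["zeit"], format="%Y-%m-%d %H:%M")
demo.insert(0, "Time", time_column)
print(demo.dtypes)
```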

* Deleting the timestamp, datum and zeit columns

In [None]:
WeatDataFrame.drop('datum', inplace = True, axis=1) 
WeatDataFrame.drop('zeit', inplace = True, axis=1) 
WeatDataFrame.drop('timestamp', inplace = True, axis=1) 


Save the data frame into a csv file 

In [5]:
WeatDataFrame.to_csv("./DataSets/Raw_Weather_DataSet.csv",index=False)


### Forecast 
#### API 

We'll use Python to request the dataset from the OpenWeather database. The API request call should be in this format:

`http://history.openweathermap.org/data/2.5/history/city?q={city ID},{country code}&type=hour&start={start}&end={end}&appid={API key}`
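
The request URL can be assembled from its parts before it is sent. A minimal sketch using only the standard library: the helper name `history_url`, the example city and country values, and the placeholder API key are hypothetical, and passing `start` and `end` as Unix timestamps in UTC is an assumption that should be checked against the OpenWeather documentation.

```python
import datetime as dt
from urllib.parse import urlencode

HISTORY_URL = "http://history.openweathermap.org/data/2.5/history/city"

def history_url(city, country_code, start, end, api_key):
    """Fill the query-string template shown above (hypothetical helper)."""
    params = {
        "q": f"{city},{country_code}",
        "type": "hour",
        "start": int(start.timestamp()),  # assumed: Unix time, UTC
        "end": int(end.timestamp()),
        "appid": api_key,
    }
    # safe="," keeps the comma between city and country code unescaped
    return HISTORY_URL + "?" + urlencode(params, safe=",")

start = dt.datetime(2021, 1, 4, tzinfo=dt.timezone.utc)
end = start + dt.timedelta(days=1)
print(history_url("Graz", "AT", start, end, "YOUR_API_KEY"))
```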

#### Usage 

Once the user enters the day to predict, we request that day's forecast. If it is available, we arrange all the necessary inputs for that day (temperature, holiday, weekday, etc.) and feed them into the model.

The user will enter the month, day, and hour. We then check the necessary data for that day (whether it's a Monday, a holiday, etc.), build the input data array, and feed it into the model.
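
The input-building step described above might look like the following. This is only a sketch: the function `build_model_input`, the feature order, the fixed year, and the small holiday set are all hypothetical and would need to match the inputs the model was actually trained on.

```python
import datetime as dt

# Illustrative holiday set only; a real list would cover the whole year
HOLIDAYS = {dt.date(2021, 1, 1), dt.date(2021, 1, 6)}

def build_model_input(month, day, hour, temperature, year=2021):
    """Arrange the user input and derived calendar features into one list."""
    date = dt.date(year, month, day)
    weekday = date.weekday()            # Monday is 0, Sunday is 6
    is_holiday = int(date in HOLIDAYS)  # 1 if the date is in the holiday set
    return [month, day, hour, weekday, is_holiday, temperature]

# 2021-01-06 is a Wednesday and is in the illustrative holiday set
print(build_model_input(1, 6, 14, -2.5))  # [1, 6, 14, 2, 1, -2.5]
```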