# Weather API 


In this notebook we import the weather data, preprocess it where needed, and save the result as a new data frame.


### Historical Data 

To train and test the model we use the historical weather data that matches the period of the energy consumption data we were given.

For that, we'll use the API provided by [AT-Wetter](http://at-wetter.tk/index.php?men=api). We use it instead of the OpenWeather API because OpenWeather requires several queries to retrieve more than a week of historical data, and we need about a year.


#### API 

This API doesn't need a key or any other authentication. It's located on the AT-Server at `http://at-wetter.tk/api/v1/`.
The instructions for using it are on their [website](http://at-wetter.tk/index.php?men=api).

#### Example 

To query **all the available stations** we need a call following the format: 
`/api/v1/[field]/[YYYY-MM-DD]/[day count]` 

A call to `http://at-wetter.tk/api/v1/t/2014-08-01/2` returns temperature data for every available station, starting at 2014-08-01 and covering 2 days into the past.



To query just **one specific station** we use:

`/api/v1/station/[station]/[field]|[all]/[YYYY-MM-DD]/[day count]`

This returns `[field]` (or every field, with `all`) for the station `[station]`, starting at the date `[YYYY-MM-DD]` and covering `[day count]` days into the past.
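The two URL formats above can be assembled with a small helper. This is only a sketch: the function names `all_stations_url` and `station_url` are hypothetical, and the only thing taken from the API description is the path layout.

```python
from datetime import date

# Base path as given in the API description above
BASE_URL = "http://at-wetter.tk/api/v1"


def all_stations_url(field: str, day: date, day_count: int) -> str:
    """URL for one field across every station (hypothetical helper)."""
    return f"{BASE_URL}/{field}/{day.isoformat()}/{day_count}"


def station_url(station: str, field: str, day: date, day_count: int) -> str:
    """URL for one station; field may be a single field name or 'all'."""
    return f"{BASE_URL}/station/{station}/{field}/{day.isoformat()}/{day_count}"


print(all_stations_url("t", date(2014, 8, 1), 2))
print(station_url("11240", "all", date(2021, 11, 3), 281))
```

`date.isoformat()` always produces a zero-padded `YYYY-MM-DD` string, which matches the format the path expects.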



### Calling the API


In [1]:
import requests as rq
import json
import datetime as dt

* See the available stations 

In [2]:
URL ="http://at-wetter.tk/api/v1/stations"
response = rq.request("GET", URL)
print(response.text)


'11157';'Aigen im Ennstal';'640';'m'
'11244';'Bad Gleichenberg';'280';'m'
'11101';'Bregenz';'424';'m'
'11190';'Eisenstadt';'184';'m'
'11';'Feuerkogel';'1618';'m'
'11155';'Feuerkogel';'1618';'m'
'11240';'Graz/Flughafen';'340';'m'
'11121';'Innsbruck';'579';'m'
'11331';'Klagenfurt/Flughafen';'447';'m'
'11012';'Kremsmünster';'383';'m'
'11130';'Kufstein';'495';'m'
'11204';'Lienz';'659';'m'
'11010';'Linz/Hörsching';'298';'m'
'11171';'Mariazell';'866';'m'
'11126';'Patscherkofel';'2247';'m'
'11022';'Retz';'320';'m'
'11150';'Salzburg';'430';'m'
'11343';'Sonnblick';'3105';'m'
'11389';'St. Pölten';'270';'m'
'11265';'Villacher Alpe';'2140';'m'
'11035';'Wien/Hohe Warte';'203';'m'
'11036';'Wien/Schwechat';'183';'m'
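The station list is plain text with semicolon-separated, single-quoted fields, so it can be turned into a lookup table with the standard `csv` module. A minimal sketch, assuming every line has the four fields seen in the output above (id, name, elevation, unit); `parse_stations` is a hypothetical name.

```python
import csv
import io


def parse_stations(text: str) -> dict:
    """Map station id -> name and elevation from the raw station list."""
    reader = csv.reader(io.StringIO(text), delimiter=";", quotechar="'")
    stations = {}
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed lines
        station_id, name, elevation, _unit = row
        stations[station_id] = {"name": name, "elevation": int(elevation)}
    return stations


# Two lines copied from the output above
sample = "'11240';'Graz/Flughafen';'340';'m'\n'11150';'Salzburg';'430';'m'\n"
stations = parse_stations(sample)
print(stations["11240"]["name"])
```

In the notebook this would be called as `parse_stations(response.text)`.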



* Calculate the day count from the Building Energy data set

We could do it manually, but to automate all the steps we'll calculate it with a Python script.

In [3]:
import pandas as pd

# Load the energy data; the column names are in the second row of the sheet
DataFrame = pd.read_excel("./DataSets/Building_energy.xlsx", header=1)

# Time span between the first and the last record
start_date = dt.datetime.strptime(DataFrame['Time'].iloc[0], "%Y-%m-%d %H:%M:%S")
end_date = dt.datetime.strptime(DataFrame['Time'].iloc[-1], "%Y-%m-%d %H:%M:%S")
day_count = end_date - start_date
print(day_count.days)


280


The `[day count]` field of the API query only accepts whole days, so if the difference between the start and the end timestamps includes any extra hours, we add one more day. Later, when matching the two data frames, we'll delete the surplus hours of that extra day.

In [4]:
# Round up to a whole day when at least one extra hour remains
if day_count.seconds // 3600 > 0:
    dayCount = day_count.days + 1
else:
    dayCount = day_count.days
print(dayCount)

281
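The same rounding can be written as a ceiling division on the `timedelta`. This is a sketch with one difference from the cell above: it rounds up for any remainder, including one shorter than an hour. The function name and the two timestamps are illustrative, not taken from the energy data set.

```python
import math
from datetime import datetime, timedelta


def days_to_request(start: datetime, end: datetime) -> int:
    """Number of whole days needed to cover the span from start to end."""
    return math.ceil((end - start) / timedelta(days=1))


# Illustrative timestamps: 280 days and 21 hours apart
start = datetime(2021, 1, 27, 0, 0)
end = datetime(2021, 11, 3, 21, 0)
print(days_to_request(start, end))  # 281
```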


* Download the temperature data

In [5]:
station = '11240'  # Graz/Flughafen
field = 'all'      # every available field, not only temperature

URL = "http://at-wetter.tk/api/v1/station/"
URL += station 
URL += "/" + field
URL += "/" + str(end_date.year) + "-" + str(end_date.month) + "-" + str(end_date.day)
URL += "/" + str(dayCount)
response = rq.request("GET", URL)

print(URL)
#print(response.text)
type(response.text)


http://at-wetter.tk/api/v1/station/11240/all/2021-11-3/281


str

In [48]:
#response.json(encoding="utf-8")
#print(response.text)

* Saving the data into a data frame

Since the data can't be downloaded in JSON format, we first split the whole text into lines, and then each line into its elements.

In [49]:
lines = response.text.splitlines()
# Column names come from the first line, with the single quotes stripped
ColumnNames = [col.replace("'", '') for col in lines[0].split(";")]
# len(lines)-2 rows: the header line is excluded and the final line is left out as well
WeatDataFrame = pd.DataFrame(columns=ColumnNames, index=range(len(lines)-2))

The index of the data frame starts at 0, but the first line of `response.text` holds the column names. Row `idx` of the data frame is therefore filled from `lines[idx+1]`.

In [50]:
for idx in range(len(lines)-2):  
    WeatDataFrame.iloc[idx] = [ (col.replace("'",'')) for col in lines[idx+1].split(";") ]
WeatDataFrame

Unnamed: 0,station,name,hoehe,datum,zeit,t,tp,rf,wr,wg,wsr,wsg,regen,ldred,ldstat,sonne,timestamp
0,11240,Graz/Flughafen,340,2021-01-26,00:00,-3.1,-4.0,93,220,1.8,,7.6,0.0,1010.3,965.2,0,2021-01-26 00:10:10
1,11240,Graz/Flughafen,340,2021-01-26,01:00,-3.3,-3.9,96,360,3.6,,5.4,0.0,1010.9,965.8,0,2021-01-26 01:10:09
2,11240,Graz/Flughafen,340,2021-01-26,02:00,-2.8,-3.5,95,260,5.4,,7.6,0.0,1011.1,966.0,0,2021-01-26 02:10:08
3,11240,Graz/Flughafen,340,2021-01-26,03:00,-3.8,-4.2,97,180,5.4,,7.6,0.0,1011.4,966.2,0,2021-01-26 03:10:08
4,11240,Graz/Flughafen,340,2021-01-26,04:00,-4.7,-5.3,96,280,1.8,,9.4,0.0,1012.0,966.6,0,2021-01-26 04:10:09
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6693,11240,Graz/Flughafen,340,2021-11-03,18:00,7.9,7.3,96,160,5.4,,9.4,0.2,1005.3,962.2,0,2021-11-03 18:10:10
6694,11240,Graz/Flughafen,340,2021-11-03,19:00,7.6,7.0,96,230,5.4,,9.4,0.5,1004.7,961.6,0,2021-11-03 19:10:08
6695,11240,Graz/Flughafen,340,2021-11-03,20:00,7.6,7.0,96,0,1.8,,9.4,0.9,1004.2,961.1,0,2021-11-03 20:10:12
6696,11240,Graz/Flughafen,340,2021-11-03,21:00,7.9,7.5,97,80,3.6,,7.6,1.0,1003.8,960.8,0,2021-11-03 21:10:08
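The manual split above can also be done in one call with `pandas.read_csv`, treating `;` as the separator and `'` as the quote character. A minimal sketch, assuming the response is laid out as the cells above imply (a quoted header line followed by quoted data lines); `parse_weather_text` is a hypothetical name and the sample text is a shortened, made-up response.

```python
import io

import pandas as pd


def parse_weather_text(text: str) -> pd.DataFrame:
    """Parse the semicolon-separated, single-quoted API text into a DataFrame."""
    return pd.read_csv(
        io.StringIO(text),
        sep=";",
        quotechar="'",
        dtype=str,              # keep every value as text, like the manual loop does
        keep_default_na=False,  # empty fields stay empty strings
    )


# Shortened sample in the assumed response layout
sample = (
    "'station';'name';'t';'wsr'\n"
    "'11240';'Graz/Flughafen';'-3.1';''\n"
    "'11240';'Graz/Flughafen';'-3.3';''\n"
)
print(parse_weather_text(sample))
```

Unlike the loop above, this keeps every data line, including the last one.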


* Merging the `datum` and `zeit` columns into a new column of datetime type

In [51]:
WeatDataFrame.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6698 entries, 0 to 6697
Data columns (total 17 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   station    6698 non-null   object
 1   name       6698 non-null   object
 2   hoehe      6698 non-null   object
 3   datum      6698 non-null   object
 4   zeit       6698 non-null   object
 5   t          6698 non-null   object
 6   tp         6698 non-null   object
 7   rf         6698 non-null   object
 8   wr         6698 non-null   object
 9   wg         6698 non-null   object
 10  wsr        6698 non-null   object
 11  wsg        6698 non-null   object
 12  regen      6698 non-null   object
 13  ldred      6698 non-null   object
 14  ldstat     6698 non-null   object
 15  sonne      6698 non-null   object
 16  timestamp  6698 non-null   object
dtypes: object(17)
memory usage: 889.7+ KB
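The `info()` output shows every column stored as `object`, meaning the measurements are still strings. If numeric values are needed later, they could be converted with `pandas.to_numeric`. This is a sketch only: the choice of columns to convert is an assumption based on the values shown in the table, and `to_numeric_columns` is a hypothetical helper.

```python
import pandas as pd

# Columns that look numeric in the table above (an assumption)
NUMERIC_COLUMNS = ["hoehe", "t", "tp", "rf", "wr", "wg", "wsr", "wsg",
                   "regen", "ldred", "ldstat", "sonne"]


def to_numeric_columns(df: pd.DataFrame, columns=NUMERIC_COLUMNS) -> pd.DataFrame:
    """Return a copy with the listed columns converted to numbers."""
    converted = df.copy()
    for col in columns:
        if col in converted.columns:
            # errors="coerce" turns empty or unparsable strings into NaN
            converted[col] = pd.to_numeric(converted[col], errors="coerce")
    return converted


sample = pd.DataFrame({"name": ["Graz/Flughafen"], "t": ["-3.1"], "wsr": [""]})
print(to_numeric_columns(sample).dtypes)
```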


In [52]:
WeatDataFrame["datum"][:]

0       2021-01-26
1       2021-01-26
2       2021-01-26
3       2021-01-26
4       2021-01-26
           ...    
6693    2021-11-03
6694    2021-11-03
6695    2021-11-03
6696    2021-11-03
6697    2021-11-03
Name: datum, Length: 6698, dtype: object

In [53]:
# Build one datetime per row from the "datum" (date) and "zeit" (time) strings
dates = pd.to_datetime(
    WeatDataFrame["datum"].astype(str) + " " + WeatDataFrame["zeit"].astype(str),
    format="%Y-%m-%d %H:%M",
)

print(len(dates), len(WeatDataFrame.index))
WeatDataFrame.insert(0, "Time", dates)

6698 6698


In [54]:
WeatDataFrame

Unnamed: 0,Time,station,name,hoehe,datum,zeit,t,tp,rf,wr,wg,wsr,wsg,regen,ldred,ldstat,sonne,timestamp
0,2021-01-26 00:00:00,11240,Graz/Flughafen,340,2021-01-26,00:00,-3.1,-4.0,93,220,1.8,,7.6,0.0,1010.3,965.2,0,2021-01-26 00:10:10
1,2021-01-26 01:00:00,11240,Graz/Flughafen,340,2021-01-26,01:00,-3.3,-3.9,96,360,3.6,,5.4,0.0,1010.9,965.8,0,2021-01-26 01:10:09
2,2021-01-26 02:00:00,11240,Graz/Flughafen,340,2021-01-26,02:00,-2.8,-3.5,95,260,5.4,,7.6,0.0,1011.1,966.0,0,2021-01-26 02:10:08
3,2021-01-26 03:00:00,11240,Graz/Flughafen,340,2021-01-26,03:00,-3.8,-4.2,97,180,5.4,,7.6,0.0,1011.4,966.2,0,2021-01-26 03:10:08
4,2021-01-26 04:00:00,11240,Graz/Flughafen,340,2021-01-26,04:00,-4.7,-5.3,96,280,1.8,,9.4,0.0,1012.0,966.6,0,2021-01-26 04:10:09
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6693,2021-11-03 18:00:00,11240,Graz/Flughafen,340,2021-11-03,18:00,7.9,7.3,96,160,5.4,,9.4,0.2,1005.3,962.2,0,2021-11-03 18:10:10
6694,2021-11-03 19:00:00,11240,Graz/Flughafen,340,2021-11-03,19:00,7.6,7.0,96,230,5.4,,9.4,0.5,1004.7,961.6,0,2021-11-03 19:10:08
6695,2021-11-03 20:00:00,11240,Graz/Flughafen,340,2021-11-03,20:00,7.6,7.0,96,0,1.8,,9.4,0.9,1004.2,961.1,0,2021-11-03 20:10:12
6696,2021-11-03 21:00:00,11240,Graz/Flughafen,340,2021-11-03,21:00,7.9,7.5,97,80,3.6,,7.6,1.0,1003.8,960.8,0,2021-11-03 21:10:08
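Once the `Time` column exists, the surplus hours of the extra day requested earlier can be dropped by keeping only the rows inside the energy data's time range. A minimal sketch; `trim_to_range` is a hypothetical helper and the sample frame is made up.

```python
import pandas as pd


def trim_to_range(weather: pd.DataFrame, start, end, time_col: str = "Time") -> pd.DataFrame:
    """Keep only the rows whose timestamp lies in [start, end]."""
    mask = (weather[time_col] >= start) & (weather[time_col] <= end)
    return weather.loc[mask].reset_index(drop=True)


# Made-up sample: 48 hourly rows starting at 2021-01-26 00:00
times = pd.Timestamp("2021-01-26") + pd.to_timedelta(list(range(48)), unit="h")
sample = pd.DataFrame({"Time": times, "t": range(48)})

trimmed = trim_to_range(sample, pd.Timestamp("2021-01-26 05:00"), pd.Timestamp("2021-01-27 03:00"))
print(len(trimmed))  # 23
```

In the notebook, `start` and `end` would be the `start_date` and `end_date` computed from the energy data.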


* Deleting the `timestamp`, `datum` and `zeit` columns

In [55]:
# Drop the columns that are now redundant with "Time"
WeatDataFrame.drop(columns=['datum', 'zeit', 'timestamp'], inplace=True)


* Saving the data frame to a CSV file

In [57]:
WeatDataFrame.to_csv("./DataSets/tempDataSet.csv")
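When the CSV is read back, the `Time` column comes in as text unless it is parsed explicitly, and the row index written by `to_csv` shows up as an extra unnamed column. The sketch below shows one way to round-trip the file; `save_weather` and `load_weather` are hypothetical helpers, and writing with `index=False` is a choice that differs from the cell above.

```python
import pandas as pd


def save_weather(df: pd.DataFrame, path: str) -> None:
    """Write the frame without the row index."""
    df.to_csv(path, index=False)


def load_weather(path: str) -> pd.DataFrame:
    """Read the frame back and parse the Time column as datetimes."""
    return pd.read_csv(path, parse_dates=["Time"])
```

For example, `load_weather("./DataSets/tempDataSet.csv")` would return the frame with `Time` as a datetime column, provided the file was written by `save_weather`.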

### Forecast 
#### API 

We'll use Python to request the data set from the OpenWeather database. The API request call should be in this format:

``` http://history.openweathermap.org/data/2.5/history/city?q={city ID},{country code}&type=hour&start={start}&end={end}&appid={API key}```
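The request URL can be assembled from its parts with the standard library. This is a sketch under assumptions: `build_history_url` is a hypothetical helper, `start` and `end` are assumed to be Unix timestamps in UTC, and the API key shown is a placeholder.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Endpoint taken from the format string above
HISTORY_URL = "http://history.openweathermap.org/data/2.5/history/city"


def build_history_url(city: str, country_code: str, start: datetime,
                      end: datetime, api_key: str) -> str:
    """Fill the query parameters of the call format shown above."""
    params = {
        "q": f"{city},{country_code}",
        "type": "hour",
        "start": int(start.timestamp()),  # assumption: Unix time, UTC
        "end": int(end.timestamp()),
        "appid": api_key,
    }
    return f"{HISTORY_URL}?{urlencode(params)}"


url = build_history_url(
    "Graz", "AT",
    datetime(2021, 11, 1, tzinfo=timezone.utc),
    datetime(2021, 11, 2, tzinfo=timezone.utc),
    "YOUR_API_KEY",  # placeholder, not a real key
)
print(url)
```

`urlencode` percent-encodes the comma in `q`, which a server decodes back to `Graz,AT`.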

#### Usage 

Once the user enters the day to predict, we request the forecast for that day. If a forecast is available, we gather all the necessary inputs for that day (temperature, holiday, weekday, etc.) and feed them into the model.

The user will enter the month, day, and hour. We then look up the necessary data for that day (whether it's a Monday, a holiday, etc.), build the input array, and pass it to the model.
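The input array described above could be built roughly as follows. Everything here is an assumption for illustration: the function name, the feature order, the explicit `year` argument, and the way holidays are supplied as a set of dates. The real model may expect a different layout.

```python
from datetime import date, datetime


def build_model_input(year: int, month: int, day: int, hour: int,
                      temperature: float, holidays: set) -> list:
    """Assemble one input row: [month, day, hour, weekday, is_holiday, temperature]."""
    moment = datetime(year, month, day, hour)
    weekday = moment.weekday()                  # Monday is 0, Sunday is 6
    is_holiday = int(moment.date() in holidays)
    return [month, day, hour, weekday, is_holiday, temperature]


# Illustrative call: 1 November 2021 was a Monday, marked here as a holiday
holidays = {date(2021, 11, 1)}
print(build_model_input(2021, 11, 1, 14, 7.9, holidays))
# [11, 1, 14, 0, 1, 7.9]
```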