# Model based on ANN

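This notebook explores an artificial neural network (ANN) trained on data at 1-hour resolution.

The model is trained on all data before 2017 and then used to predict days in 2017.
Training an ANN is too slow to retrain it from scratch for each day of 2017, so a single trained model is reused for all predictions.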

In [1]:
import datetime
import calendar
import time
import json
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Embedding, SimpleRNN, LSTM
import matplotlib.pyplot as plt
from matplotlib import rcParams
rcParams['figure.figsize'] = 12, 4

Using TensorFlow backend.


# Load project

Load the rainfall and flow data from the files and clean them by:
  * Resampling to 1 hour
  * Slicing to the common range (rainfall data is only available up to 2017-12-01)
  * Filling NaNs

In [2]:
PROJECT_FOLDER = '../../datasets/radon-medium/'

# Flow is a rate, so each hourly value is the mean of the readings in that hour.
flow = pd.read_csv(PROJECT_FOLDER + 'flow1.csv', parse_dates=['time'])
flow = flow.set_index('time').flow
flow = flow.resample('1H').mean()

# Rainfall is an accumulated amount, so each hourly value is the sum.
rainfall = pd.read_csv(PROJECT_FOLDER + 'rainfall1.csv', parse_dates=['time'])
rainfall = rainfall.set_index('time').rainfall
rainfall = rainfall.resample('1H').sum()

# Align both series on the time index; missing values in either column become 0.
data_frame = pd.concat([flow, rainfall], axis=1).fillna(0)
data_frame['hour'] = data_frame.index.hour
# Keep only the range covered by both series.
data_frame = data_frame['2015-01-01': '2017-12-01']
print(data_frame.head())
print(data_frame.tail())

                          flow  rainfall  hour
time                                          
2015-01-01 00:00:00  76.796188       0.0     0
2015-01-01 01:00:00  71.892892       0.0     1
2015-01-01 02:00:00  63.906876       0.0     2
2015-01-01 03:00:00  60.286973       0.0     3
2015-01-01 04:00:00  57.049687       0.0     4
                          flow  rainfall  hour
time                                          
2017-12-01 19:00:00  79.317773       0.0    19
2017-12-01 20:00:00  79.206970       0.0    20
2017-12-01 21:00:00  79.096870       0.0    21
2017-12-01 22:00:00  80.328204       0.0    22
2017-12-01 23:00:00  77.861051       0.0    23
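
Because the "Fill NaNs" step replaces every gap with 0, it can be useful to know how many values are affected before filling. The snippet below is a minimal sketch on made-up data, not part of the original notebook; `missing_summary` and the `demo` frame are hypothetical names introduced only for illustration.

```python
import numpy as np
import pandas as pd


def missing_summary(frame):
    """Return the number of NaN values in each column of the frame."""
    return frame.isna().sum()


# Made-up hourly data with a few gaps (the values are illustrative only).
index = pd.date_range('2015-01-01', periods=6, freq='h')
demo = pd.DataFrame(
    {
        'flow': [76.8, np.nan, 63.9, np.nan, 57.0, 52.9],
        'rainfall': [0.0, 0.0, np.nan, 0.2, 0.0, 0.0],
    },
    index=index,
)

# Count the gaps first, then fill them the same way the notebook does.
summary = missing_summary(demo)
filled = demo.fillna(0)
print(summary)
```

Running the same check on the concatenated frame before `.fillna(0)` would show how much of the `flow` column ends up as artificial zeros.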


# Prepare dataset

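For the ANN we will create the following dataset, where `previous_flow` is the flow one hour earlier (the `last_flow` column in the code) and `precipitation` is the `rainfall` column: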
```
(hour, previous_flow, precipitation, target_flow)
```

In [11]:
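# Features: hour of day, rainfall in the current hour, and the previous hour's flow.
# .copy() avoids pandas' SettingWithCopyWarning when the new column is added.
X = data_frame[['hour', 'rainfall']].copy()
X['last_flow'] = data_frame.flow.shift()
# The first row has no previous flow, so drop it from both X and y.
X = X[1:]
print(X.head())
y = data_frame.flow[1:]
print(y.head())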

                     hour  rainfall  last_flow
time                                          
2015-01-01 01:00:00     1       0.0  76.796188
2015-01-01 02:00:00     2       0.0  71.892892
2015-01-01 03:00:00     3       0.0  63.906876
2015-01-01 04:00:00     4       0.0  60.286973
2015-01-01 05:00:00     5       0.0  57.049687
time
2015-01-01 01:00:00    71.892892
2015-01-01 02:00:00    63.906876
2015-01-01 03:00:00    60.286973
2015-01-01 04:00:00    57.049687
2015-01-01 05:00:00    52.906274
Freq: H, Name: flow, dtype: float64
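
The plan stated at the top of the notebook (train on everything before 2017, predict days in 2017) amounts to a chronological split of `X` and `y`. The following is a minimal sketch on synthetic data, assuming the same column layout as above; `split_by_date` is a hypothetical helper and not something the notebook defines.

```python
import numpy as np
import pandas as pd


def split_by_date(X, y, first_test_timestamp):
    """Chronological split: rows before the timestamp train, the rest test."""
    cutoff = pd.Timestamp(first_test_timestamp)
    train_mask = X.index < cutoff
    return X[train_mask], y[train_mask], X[~train_mask], y[~train_mask]


# Synthetic stand-in for the hourly data: two days around the 2016/2017 boundary.
index = pd.date_range('2016-12-31 00:00', periods=48, freq='h')
flow = pd.Series(np.linspace(50.0, 97.0, 48), index=index, name='flow')

# Same layout as the notebook: hour, rainfall, previous hour's flow.
X_demo = pd.DataFrame(
    {'hour': index.hour, 'rainfall': 0.0, 'last_flow': flow.shift()},
    index=index,
).iloc[1:]
y_demo = flow.iloc[1:]

X_train, y_train, X_test, y_test = split_by_date(X_demo, y_demo, '2017-01-01')
print(len(X_train), len(X_test))
```

Splitting by timestamp rather than shuffling keeps all 2017 observations out of training, which matches how the model is meant to be used.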
