Time-series forecasting from IoT Home Automation Data.

Tuomas Eerola - 2019

Data source: https://github.com/eerolat/home-automation-data-logger

# Run either of the following cells.

This will connect you to one of two data sources:


1.   Some test data from the Internet; or
2.   the actual sensor data.


Run the following cell to use test data from the Internet. 

In [0]:
import pandas as pd

# Public example dataset: daily minimum temperatures.
log_url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/daily-min-temperatures.csv"

# read_csv fetches the URL itself, so no separate urlopen call is needed.
series_example = pd.read_csv(log_url, header=0, parse_dates=[0])
series_example.columns = ['Date Time', 'Temp']

# Keep one reading per timestamp, index by time, and upsample to an hourly grid.
# The gaps between the daily readings are filled by linear interpolation.
series = series_example.drop_duplicates(subset='Date Time', keep='last')
series = series.set_index('Date Time')
series = series.resample('H').interpolate()

print("Data loading ready.")

Run the following cell to use actual sensor data. 

In [0]:
import pandas as pd

log_url = "http://eerola.dy.fi/temp/temperature.log"

# The log is space-separated; the first two fields (date and time) are parsed into one timestamp.
series_own = pd.read_csv(log_url, sep=" ", parse_dates=[[0, 1]])
series_own.columns = ['Date Time', 'SourceInfo1', 'SourceInfo2', 'MeasurementInfo1', 'Temp', 'MeasurementInfo2', 'Measurement2']
#series_own.insert(6, "Target", "NaN")

# Only the timestamp and the temperature reading are needed.
dropcolumns = ['SourceInfo1', 'SourceInfo2', 'MeasurementInfo1', 'MeasurementInfo2', 'Measurement2']
series_own = series_own.drop(columns=dropcolumns)

# Keep the latest reading per timestamp, index by time, and resample to an hourly grid.
# Empty hours are back-filled with the next available reading.
series = series_own.drop_duplicates(subset='Date Time', keep='last')
series = series.set_index('Date Time')
series = series.resample('H').bfill()

print("Data loading ready.")
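
The two loading cells fill the hourly grid differently: the test data is interpolated, while the sensor data is back-filled. The following is a minimal sketch of that difference on a made-up three-row table; the timestamps and values are hypothetical, and the rule `"60min"` is used here as an equivalent spelling of the hourly rule.

```python
import pandas as pd

# Hypothetical toy readings: one duplicated timestamp and a three-hour gap.
toy = pd.DataFrame(
    {
        "Date Time": pd.to_datetime(
            ["2019-01-01 00:00", "2019-01-01 00:00", "2019-01-01 03:00"]
        ),
        "Temp": [10.0, 12.0, 18.0],
    }
)

# Keep the most recent reading for each timestamp, as in the loading cells.
toy = toy.drop_duplicates(subset="Date Time", keep="last").set_index("Date Time")

# Resample to an hourly grid and fill the new, empty slots in two ways.
interpolated = toy.resample("60min").interpolate()  # straight line between readings
backfilled = toy.resample("60min").bfill()          # copy the next reading backwards

comparison = pd.DataFrame(
    {"interpolate": interpolated["Temp"], "bfill": backfilled["Temp"]}
)
print(comparison)
```

In this toy case interpolation yields 12, 14, 16, 18 across the four hours, whereas back-filling yields 12, 18, 18, 18. Which behaviour suits real sensor data better depends on how the readings were logged.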

# Visualize the data to see what we've got.

In [0]:
from matplotlib import pyplot


series.plot()
pyplot.show()

# Let's run a forecast.

In [0]:
forecast_days = 5

Let's fit the forecast model...

In [0]:
import datetime
from statsmodels.tsa.arima.model import ARIMA

# ARIMA order (p, d, q): 4 autoregressive lags, no differencing, 2 moving-average terms.
model = ARIMA(series, order=(4,0,2))
model_fit = model.fit()

...and run the forecast.

In [0]:
# Forecast from the last observed timestamp to forecast_days days after it.
start_index = series.index[-1]
end_index = start_index + datetime.timedelta(days=forecast_days)
forecast = model_fit.predict(start=start_index, end=end_index)
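
To get a feel for how many points such a forecast covers, the sketch below builds the hourly timestamps between a start and an end index. The start timestamp is a hypothetical stand-in for `series.index[-1]`, and the sketch assumes the prediction range includes both endpoints.

```python
import datetime

import pandas as pd

forecast_days = 5  # same value as in the notebook

# Hypothetical last timestamp of an hourly series.
start_index = pd.Timestamp("2019-06-01 12:00")
end_index = start_index + datetime.timedelta(days=forecast_days)

# Hourly timestamps from start to end, both endpoints included.
horizon = pd.date_range(start=start_index, end=end_index, freq="60min")

# With hourly data this is forecast_days * 24 steps plus the starting point.
print(len(horizon))
print(horizon[0], "to", horizon[-1])
```

Note that the first point coincides with the last observation, so the forecast and the measured series overlap at exactly one timestamp.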

# Analysing the data

Visualize the last 7 days of data together with the forecast.

In [0]:
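from matplotlib import pyplot

# Select the last 7 days by timestamp (DataFrame.last is deprecated in recent pandas).
last_week = series[series.index > series.index[-1] - pd.Timedelta(days=7)]

# Draw the measurements and the forecast on the same axes.
ax = last_week.plot()
forecast.plot(ax=ax)
pyplot.show()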

Compare the last day of data with the first day of the forecast.

In [0]:
# Last day of measurements, selected by timestamp.
print(series[series.index > series.index[-1] - pd.Timedelta(days=1)])

In [0]:
# First day of the forecast, selected by timestamp.
print(forecast[forecast.index < forecast.index[0] + pd.Timedelta(days=1)])
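
Two separate printouts are hard to compare by eye. One possible approach, sketched below, is to shift the last day of measurements forward by 24 hours so that it lines up with the forecast hour by hour. The `observed` and `forecast` series here are synthetic stand-ins, not the notebook's data, and the column names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the last day of `series` and the first day of `forecast`.
observed_index = pd.date_range("2019-06-01 00:00", periods=24, freq="60min")
observed = pd.Series(np.linspace(15.0, 20.0, 24), index=observed_index, name="Temp")

forecast_index = pd.date_range("2019-06-02 00:00", periods=24, freq="60min")
forecast = pd.Series(np.full(24, 18.0), index=forecast_index, name="Forecast")

# Shift the observations forward by one day so both series share timestamps.
shifted = observed.copy()
shifted.index = shifted.index + pd.Timedelta(days=1)

# Side-by-side table with the hour-by-hour difference.
comparison = pd.DataFrame({"Yesterday": shifted, "Forecast": forecast})
comparison["Difference"] = comparison["Forecast"] - comparison["Yesterday"]

print(comparison.head())
print("Mean absolute difference:", comparison["Difference"].abs().mean())
```

A large mean absolute difference does not by itself mean the forecast is poor, since the weather may simply have changed; it only shows how far the forecast departs from a "same as yesterday" guess.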