# Assignment 1: Time Series Forecast With Python (Seasonal ARIMA)

**Lecturer**: Vincent Claes<br>
**Authors:** Bryan Honof, Jeffrey Gorissen<br>
**Start Date:** 19/10/2018
    
**Objective:** Visualize and predict the future temperatures via ARIMA

**Description:** This notebook acts as an index to all the other notebooks

## Table of contents <a name="table-of-contents"></a>

1. [Table of notebooks](#table-notebooks)
2. [Introduction](#introduction)
3. [Collecting the data](#collecting-the-data)
4. [Uploading the data](#uploading-the-data)
5. [Working with the data](#working-with-the-data)
6. [Predicting the future](#predicting-the-future)
7. [Conclusion](#conclusion)

## 1. Table of notebooks <a name="table-notebooks"></a>

A table of all the notebooks used in this project can be found here.

1. [Index notebook (current notebook)](./1_entry_notebook.ipynb)
2. [Getting the data](./2_get_data.ipynb)
3. [Some Exploratory Data Analysis (EDA)](./3_exploratory_data_analysis.ipynb)
4. [Selecting our model](./4_model_selection.ipynb)
5. [Fitting the data and predicting the future!](./5_fitting_and_predicting.ipynb)

## 2. Introduction <a name="introduction"></a>

This notebook will give a brief overview of what the assignment was about, how we handled it and a small conclusion.

The goal of the assignment was to predict the temperature values one hour and one day ahead of time. This was done by first collecting temperature data from a chosen location with the [CC3200 LaunchPad](http://www.ti.com/tool/CC3200-LAUNCHXL) development board.


The data the LaunchPad collected was stored in a database hosted on [Heroku](https://heroku.com). An API to upload the data was hosted there as well.

Once enough data was collected, it was possible to build these notebooks to actually predict the temperature. This was done with the [Python](https://www.python.org/) programming language in combination with some [modules](./requirements.txt) and [Anaconda](https://www.anaconda.com/).

The future temperatures are then predicted with a Seasonal ARIMAX model.
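
For reference, this is the textbook form of a seasonal ARIMA model of order $(p,d,q)(P,D,Q)_s$, written with the backshift operator $B$ (where $B\,y_t = y_{t-1}$). It is the general definition, not the specific model fitted in this project. The orders and the season length used here are not stated in this notebook.

$$
\Phi_P(B^s)\,\phi_p(B)\,(1-B)^d\,(1-B^s)^D\,y_t = \Theta_Q(B^s)\,\theta_q(B)\,\varepsilon_t
$$

Here $y_t$ is the temperature at time step $t$ and $\varepsilon_t$ is white noise. $\phi_p$ and $\theta_q$ are the non-seasonal autoregressive and moving-average polynomials, $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts, and $s$ is the number of time steps in one season. The "X" in ARIMAX refers to optional exogenous regressors that can be added to this equation.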

## 3. Collecting the data <a name="collecting-the-data"></a>

As said before, the data is collected with the [CC3200 LaunchPad](http://www.ti.com/tool/CC3200-LAUNCHXL) development board. It has an on-board temperature sensor ([TMP006AIYZFR](http://www.ti.com/ww/eu/sensampbook/tmp006.pdf)) that can be used to, well... measure temperature!

The temperature is measured and uploaded to the server at intervals of ~15 minutes.
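
Since the goal is to forecast one hour and one day ahead, the sampling interval determines how many steps the model has to predict. The small sketch below does that arithmetic. It assumes an exact 15 minute interval, while the real interval is only approximately that, and the helper name `steps_ahead` is made up for illustration.

```python
from datetime import timedelta

# Nominal sampling interval (assumption: exactly 15 minutes).
SAMPLE_INTERVAL = timedelta(minutes=15)


def steps_ahead(horizon, interval=SAMPLE_INTERVAL):
    """Return how many whole sampling steps fit in the forecast horizon."""
    return horizon // interval


steps_per_hour = steps_ahead(timedelta(hours=1))
steps_per_day = steps_ahead(timedelta(days=1))
print(steps_per_hour, steps_per_day)  # prints: 4 96
```

So a one hour forecast is 4 steps ahead and a one day forecast is 96 steps ahead, under the assumption above.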

The firmware was written entirely in [Code Composer Studio](http://www.ti.com/tool/CCSTUDIO) so we could get access to a lot of useful included libraries and APIs. We used TI-RTOS because it has software timers, and we were a little too lazy to figure out how to get the hardware timers to count beyond 5 minutes, so we took the easy way out. Some improvements could definitely happen in this department.

It is worth mentioning that the collected temperature values are recorded in degrees Celsius and not Fahrenheit. The location where the data was recorded is also saved.

## 4. Uploading the data <a name="uploading-the-data"></a>

Uploading the actual data to the Heroku server was pretty simple. The SimpleLink API provided with Code Composer Studio makes it easy to:
* Connect to an AP
* Create HTTP requests like POST, PUT and DELETE
* Disconnect from that same AP
* ...

Using this API, the board was able to connect to an AP every time it had a new temperature value ready, create a valid HTTP header and send it out together with the data. After sending the data, it disconnected from the AP until a new value was ready to be uploaded.
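
To give an idea of what such an upload looks like on the wire, here is a minimal Python sketch that builds (but does not send) an equivalent POST request. This is not the firmware, which runs on the board and uses the SimpleLink API. The endpoint URL and the `place` field are assumptions made for illustration. Only the `value` field name is taken from the data described later in this notebook.

```python
import json
import urllib.request

# Hypothetical endpoint; the real API URL is not given in this notebook.
API_URL = "https://example.invalid/api/temperature"


def build_upload_request(value_celsius, place):
    """Build a POST request carrying one temperature reading as JSON."""
    body = json.dumps({"value": value_celsius, "place": place}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_upload_request(21.5, "lab")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would actually send it. That step is left out here so the sketch runs without a network connection.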

To actually receive the data, an API was created on [Heroku](https://heroku.com) with Python and [Flask](http://flask.pocoo.org/). Multiple routes were created with Flask so the development board can POST, PUT and DELETE data. On Heroku we also made it possible to send the data to a PostgreSQL database; with the help of some Python modules this was no problem at all.
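
A receiving route could look roughly like the sketch below. It is not the project's actual API. The route path, the `place` field and the response format are assumptions, and a plain Python list stands in for the PostgreSQL table so the example runs without a database.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory stand-in for the PostgreSQL table (assumption, illustration only).
readings = []


@app.route("/api/temperature", methods=["POST"])  # hypothetical route path
def add_reading():
    """Validate one uploaded reading and store it."""
    payload = request.get_json(silent=True)
    if not payload or "value" not in payload:
        return jsonify(error="missing 'value'"), 400
    try:
        value = float(payload["value"])
    except (TypeError, ValueError):
        return jsonify(error="'value' must be numeric"), 400

    reading = {
        "id": len(readings) + 1,
        "value": value,
        "place": payload.get("place"),
    }
    readings.append(reading)
    return jsonify(reading), 201
```

Flask's built-in test client (`app.test_client()`) can be used to try the route without deploying anything.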

## 5. Working with the data <a name="working-with-the-data"></a>

After some time has passed and the database is filled with data, it is time to work with that data.

The first thing to do was to "download" the data. This is done with two Python modules, pandas and SQLAlchemy.
Pandas is used for a lot of things, but in this case it acts as a sort of web scraper so we can get the data directly from the API. SQLAlchemy is used as an alternative to the web scraping technique of pandas: it connects directly to the database and gets the data via an SQL query.
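
Both routes can be sketched as follows. The real API URL and database are not shown in this notebook, so stand-ins are used: a JSON string wrapped in `io.StringIO` plays the role of the API response, and an in-memory SQLite connection plays the role of the PostgreSQL database. The table name `temperature` and the sample rows are made up for illustration.

```python
import io
import sqlite3

import pandas as pd

# Made-up sample rows; every field is a string, as in the API response.
rows = [
    ("2018-11-10 12:00:00", "21.5"),
    ("2018-11-10 12:15:00", "21.7"),
]

# Route 1: read JSON as the API would return it.
# With the real API, the URL would be passed to read_json instead.
api_response = io.StringIO(
    '[{"creation_date": "2018-11-10 12:00:00", "value": "21.5"},'
    ' {"creation_date": "2018-11-10 12:15:00", "value": "21.7"}]'
)
df_api = pd.read_json(api_response)

# Route 2: query the database directly.
# SQLite is only a stand-in; with SQLAlchemy, an engine created by
# sqlalchemy.create_engine(...) would be passed instead of `conn`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temperature (creation_date TEXT, value TEXT)")
conn.executemany("INSERT INTO temperature VALUES (?, ?)", rows)
df_db = pd.read_sql_query(
    "SELECT creation_date, value FROM temperature ORDER BY creation_date",
    conn,
)
conn.close()

print(df_api.shape, df_db.shape)
```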

Once the data is "downloaded" a very brief look at the data is taken and a plot is created to give a visual representation.

After this we save the data into a ```.csv``` file. This is done because we are dealing with multiple notebooks and the data has to be transferred between them. We could of course reconnect to the database or scrape the web again in every notebook, but this takes time. Reading in a ```.csv``` file is faster and has the added benefit that we can work offline.
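
A minimal sketch of that round trip is shown below. The file name `temperature.csv` and the sample values are assumptions, and a temporary directory is used so the example cleans up after itself.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up sample data for illustration.
df = pd.DataFrame(
    {
        "creation_date": pd.to_datetime(
            ["2018-11-10 12:00:00", "2018-11-10 12:15:00"]
        ),
        "value": [21.5, 21.7],
    }
)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "temperature.csv"  # hypothetical file name

    # One notebook writes the data once...
    df.to_csv(csv_path, index=False)

    # ...and another notebook reads it back without touching the network.
    restored = pd.read_csv(csv_path, parse_dates=["creation_date"])

print(restored.dtypes)
```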

The data also arrives as a JSON object in which every value is a string. This means the ```creation_date``` column has to be converted to a ```DatetimeIndex``` and the ```value``` column to ```float```.
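
The conversion can be sketched like this. The column names `creation_date` and `value` come from the data. The timestamp format and the sample values are assumptions for illustration.

```python
import pandas as pd

# Raw data as it arrives: every field is a string (sample values are made up).
raw = pd.DataFrame(
    {
        "creation_date": ["2018-11-10 12:15:00", "2018-11-10 12:00:00"],
        "value": ["21.75", "21.5"],
    }
)

df = raw.copy()
df["creation_date"] = pd.to_datetime(df["creation_date"])  # str -> datetime64
df["value"] = df["value"].astype(float)                    # str -> float64

# Use the timestamps as the index and put them in chronological order.
df = df.set_index("creation_date").sort_index()

print(type(df.index).__name__, df["value"].dtype)
```

Sorting the index is not strictly part of the conversion, but time series tools generally expect chronological order.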

When all that is done, a closer look at the data is taken. The conclusion here is that no data set is perfect, and this one definitely has some ugly sides. For example, at a certain point in time the AP the development board was supposed to connect to lost all connection to the internet provider, so the development board had nowhere to send the data to. This caused a huge gap to appear in the data.
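
One way to locate such a gap is to look at the time difference between consecutive readings and flag anything much larger than the sampling interval. The sketch below uses a small synthetic series, not the project's data, and the 5 minute tolerance is an arbitrary assumption.

```python
import pandas as pd

# Synthetic readings with a hole between 12:30 and 18:00 (illustration only).
index = pd.to_datetime(
    [
        "2018-11-10 12:00",
        "2018-11-10 12:15",
        "2018-11-10 12:30",
        "2018-11-10 18:00",
        "2018-11-10 18:15",
    ]
)
series = pd.Series([21.5, 21.6, 21.6, 19.8, 19.9], index=index)

expected = pd.Timedelta(minutes=15)  # nominal sampling interval
tolerance = pd.Timedelta(minutes=5)  # assumption: allowed jitter

# Time elapsed since the previous reading, indexed by the later timestamp.
deltas = series.index.to_series().diff()

# Keep only the steps that are clearly longer than one sampling interval.
gaps = deltas[deltas > expected + tolerance]
print(gaps)
```

Each entry in `gaps` is labelled with the first reading after a gap, and its value is the length of the gap.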

