# Multivariate time-series forecasting for Aquaponics

### Abstract
This notebook implements physics-guided recurrent neural networks (PG-RNN) to forecast relevant variables in an aquaponics system. The main advantage of this architecture is that theory-based knowledge is built into the deep learning model by formulating training as a constrained optimization problem. The resulting PG-RNN model will be used in a model-based reinforcement learning framework.

Target variables to forecast:
* pH
* Dissolved oxygen
* Vegetable weight

Import libraries: 
* PyTorch for deep learning model design
* TensorBoard for model training visualization
* Pandas, NumPy, Matplotlib for scientific computing


In [14]:
# Import libraries
import os
import sys

# PyTorch
import torch
import torch.nn as nn
import torch.nn.functional as F

# PyTorch data loading and TensorBoard logging utilities
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# Scientific computing
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

In [15]:
# Expose GPUs
# os.environ['KMP_DUPLICATE_LIB_OK']='True'
# os.environ["CUDA_VISIBLE_DEVICES"] = "2" 

# Setup plotting parameters
mpl.rcParams['figure.figsize'] = (14, 8)
mpl.rcParams['axes.grid'] = True
font = {'size'   : 18}
mpl.rc('font', **font)

# Random seeds (NumPy and PyTorch) for reproducibility
np.random.seed(10)
torch.manual_seed(10)

Code checklist
* Data
    * Data acquisition
    * Data pre-processing
        * Outliers removal
        * Noise filtering
        * Missing data filling
    * Data architecture - Time-series
* Model
    * Model design:
        * Baseline model: 
            * Recurrent Neural Network
            * Feed-forward Neural Network
        * Physics-guided Neural Network
            * How to include constraints in the model? 
    * Experiment design:
        * Training/evaluation routine
        * Multi-GPU training
* Experiments
    * Training
    * Validation
    * Results visualization - TensorBoard


### Data Pre-processing
* Pre-processing routine to remove outliers, fill missing data, and denoise the signals recorded from the aquaponics system.
* The data come from an aquaponics system and were recorded between October 15th and December 4th (fall/winter).
* The variables to analyze are:
    * Water flow between tanks
    * Dissolved oxygen
    * pH in sump tank
    * Water supplied
    * Water temperature
    * Water level
    * Motor pump output
    * Fish food
    * CO2
    * Light
    * Vegetables weight

#### Read dataset

In [11]:
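def read_file(textfile):
    """Load the tab-separated, compressed aquaponics dataset into a DataFrame."""
    df = pd.read_csv(textfile, compression='zip', sep="\t")
    # Drop the first column of the file in place
    df.drop(df.columns[0], axis=1, inplace=True)
    print('Dataset ready')
    return df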

# Format dd_mm_yyyy
date_start  = '04_10_2021'
date_end    = '15_12_2021'

file_name = 'dataset_aquaponics_{}_{}.cvs.gz'.format(date_start, date_end)
df = read_file(file_name)

dataset_aquaponics_04_10_2021_15_12_2021.cvs.gz


#### Drop corrupted data

In [None]:
drop_signal = [3, 12, 13, 14, 15, 22, 23, 24, 25, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 42, 43, 44, 45, 46, 47, 48, 49, 56, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 93, 94, 95, 97, 98, 99, 100, 101, 104, 105, 106, 111, 112, 113, 114]
# Drop the columns listed in drop_signal; `columns=` already selects the column axis
df.drop(columns=drop_signal, inplace=True)

#### Remove outliers

In [None]:
# TODO
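
As a possible starting point for this step, the sketch below masks outliers with Tukey's interquartile-range (IQR) rule, applied independently to each column. It is not the notebook's final method: the helper name `mask_outliers_iqr`, the factor `k = 1.5`, the column name `pH`, and the synthetic data are all assumptions made for illustration. Flagged samples become `NaN` so that a later filling step can handle them.

```python
import numpy as np
import pandas as pd


def mask_outliers_iqr(df: pd.DataFrame, k: float = 1.5) -> pd.DataFrame:
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with NaN, per column (hypothetical helper)."""
    q1 = df.quantile(0.25)
    q3 = df.quantile(0.75)
    iqr = q3 - q1
    # Compare every column against its own lower and upper fence
    inside = df.ge(q1 - k * iqr, axis="columns") & df.le(q3 + k * iqr, axis="columns")
    # Keep values inside the fences, set everything else to NaN
    return df.where(inside)


# Synthetic pH-like signal with one injected spike (illustrative data only)
t = np.linspace(0, 6, 50)
demo = pd.DataFrame({"pH": 7.0 + 0.01 * np.sin(t)})
demo.loc[25, "pH"] = 12.0
cleaned = mask_outliers_iqr(demo)
```

A global rule like this ignores slow drifts in the signal; a rolling-window variant may be preferable for long recordings.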

#### Denoise signals

In [None]:
# TODO
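
One simple option for this step is a centered moving average. The sketch below is only an assumption about how denoising could be done here: the helper name `smooth_signals`, the window length, the column name, and the synthetic noisy signal are illustrative, and other filters (median, Savitzky-Golay, low-pass) may suit the sensors better.

```python
import numpy as np
import pandas as pd


def smooth_signals(df: pd.DataFrame, window: int = 9) -> pd.DataFrame:
    """Centered moving average per column (hypothetical helper)."""
    # min_periods=1 shortens the window at the edges instead of producing NaN
    return df.rolling(window, center=True, min_periods=1).mean()


# Synthetic noisy signal (illustrative data only)
rng = np.random.default_rng(0)
noisy = pd.DataFrame({"dissolved_oxygen": 8.0 + rng.normal(0.0, 0.2, size=200)})
smoothed = smooth_signals(noisy)
```

A longer window removes more noise but also flattens fast changes, so the window length should be chosen against the sampling rate of each sensor.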

#### Fill missing values

In [None]:
# TODO
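
A minimal sketch of gap filling, assuming that linear interpolation is acceptable for these slowly varying signals: interior gaps are interpolated, and gaps at the start or end of the record are filled with the nearest valid value. The helper name `fill_missing` and the column name are hypothetical.

```python
import numpy as np
import pandas as pd


def fill_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Interpolate interior gaps linearly, then pad the edges (hypothetical helper)."""
    # limit_area="inside" restricts interpolation to NaNs surrounded by valid values
    filled = df.interpolate(method="linear", limit_area="inside")
    # Trailing gaps take the last valid value, leading gaps the first valid value
    return filled.ffill().bfill()


# Tiny example with a leading, an interior, and a trailing gap (illustrative data only)
gappy = pd.DataFrame({"water_temperature": [np.nan, 20.0, np.nan, 22.0, np.nan]})
filled = fill_missing(gappy)
```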

### Model design

#### Feed-forward Neural Network

In [None]:
# TODO
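
A feed-forward baseline could look like the sketch below: it flattens a window of past measurements and predicts the target variables one step ahead. The class name, the layer sizes, and the shapes (11 input signals, a 24-step window, 3 targets) are assumptions chosen to match the variable lists above, not settings taken from the dataset.

```python
import torch
import torch.nn as nn


class FeedForwardForecaster(nn.Module):
    """MLP baseline: flattened input window -> next-step targets (sketch)."""

    def __init__(self, n_features: int, window: int, n_targets: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),  # (batch, window, features) -> (batch, window * features)
            nn.Linear(window * n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_targets),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# Hypothetical sizes: 11 input signals, 24-step window, 3 targets
ff_model = FeedForwardForecaster(n_features=11, window=24, n_targets=3)
ff_out = ff_model(torch.randn(8, 24, 11))  # batch of 8 random windows
```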

#### LSTM Model

In [None]:
# TODO
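
The recurrent baseline might be sketched as follows: an LSTM reads the input window and a linear head maps the last hidden output to the target variables. The class name, hidden size, and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn


class LSTMForecaster(nn.Module):
    """LSTM encoder followed by a linear head (sketch)."""

    def __init__(self, n_features: int, n_targets: int, hidden: int = 64, num_layers: int = 1):
        super().__init__()
        # batch_first=True means inputs are shaped (batch, window, features)
        self.lstm = nn.LSTM(n_features, hidden, num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(hidden, n_targets)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)          # out: (batch, window, hidden)
        return self.head(out[:, -1])   # use the last time step -> (batch, n_targets)


# Hypothetical sizes: 11 input signals, 24-step window, 3 targets
lstm_model = LSTMForecaster(n_features=11, n_targets=3)
lstm_out = lstm_model(torch.randn(8, 24, 11))
```

Replacing `nn.LSTM` with `nn.GRU` in this class gives a GRU variant with the same input and output shapes.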

#### GRU Model

In [None]:
# TODO

#### PG-LSTM Model

In [None]:
# TODO
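
One common way to add theory-based knowledge as a constraint is a penalty term in the loss. The sketch below combines a mean-squared-error data term with a penalty on predictions that leave an admissible range. The function name, the penalty weight, and the bounds are placeholders for illustration only; the actual physical constraints for this system would have to come from the aquaponics model.

```python
import torch
import torch.nn.functional as F


def physics_guided_loss(pred, target, lower, upper, weight=1.0):
    """MSE data loss plus a penalty for predictions outside [lower, upper] (sketch)."""
    data_loss = F.mse_loss(pred, target)
    # ReLU is zero inside the admissible range and grows linearly outside it
    violation = F.relu(lower - pred) + F.relu(pred - upper)
    return data_loss + weight * violation.mean()


# Placeholder ranges for (pH, dissolved oxygen, vegetable weight); not values from the dataset
lower = torch.tensor([0.0, 0.0, 0.0])
upper = torch.tensor([14.0, 20.0, 1000.0])

target = torch.tensor([[7.0, 8.0, 100.0]])
feasible = torch.tensor([[7.0, 8.0, 100.0]])
infeasible = torch.tensor([[15.0, 8.0, 100.0]])  # pH above its upper bound

loss_ok = physics_guided_loss(feasible, target, lower, upper)
loss_bad = physics_guided_loss(infeasible, target, lower, upper)
```

Because the penalty is differentiable almost everywhere, it can be used with any of the recurrent models in place of a plain MSE loss; `weight` trades off data fit against constraint satisfaction.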

#### PG-GRU Model

In [None]:
# TODO

### Training routine

In [None]:
# TODO
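
A minimal sketch of a training routine, assuming one-step-ahead forecasting from sliding windows. The helper `make_windows`, the synthetic data, the stand-in model, and all hyperparameters are illustrative assumptions, not the notebook's final setup.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def make_windows(series: torch.Tensor, window: int):
    """Split a (time, features) tensor into input windows and next-step targets."""
    n = series.shape[0] - window
    x = torch.stack([series[i:i + window] for i in range(n)])  # (n, window, features)
    y = series[window:]                                        # (n, features)
    return x, y


torch.manual_seed(0)
series = torch.randn(200, 4)  # synthetic stand-in for the pre-processed signals
x, y = make_windows(series, window=24)
loader = DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

# Small stand-in model; any forecaster with matching shapes could be used instead
model = nn.Sequential(nn.Flatten(), nn.Linear(24 * 4, 32), nn.ReLU(), nn.Linear(32, 4))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

history = []  # mean training loss per epoch
for epoch in range(5):
    model.train()
    epoch_loss = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item() * len(xb)  # weight by batch size
    history.append(epoch_loss / len(loader.dataset))
```

Each entry of `history` could also be logged with `SummaryWriter.add_scalar` to follow training in TensorBoard.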