# ANN CALSIM 3 style

This notebook builds the ANN structure described in:
```
Artificial Neural Network for Sacramento–San Joaquin Delta Flow–Salinity Relationship for CalSim 3.0
Nimal C. Jayasundara, M.ASCE; Sanjaya A. Seneviratne; Erik Reyes; and Francis I. Chung
```

The input structure consists of 8 inputs and their antecedent conditions, expressed as daily or moving-average values.
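As a rough illustration of what "antecedent conditions" can look like, the sketch below expands one daily series into lagged daily values followed by successive, non-overlapping moving averages. The helper `antecedent_features` is hypothetical, and the choice of 8 daily values plus 10 averages over 11-day windows is an assumption made here for illustration (it matches the `(8 + 10)` factor in the feature count used later); the actual features come from the CSV files loaded below.

```python
import numpy as np
import pandas as pd


def antecedent_features(series, ndays=8, nwindows=10, window=11):
    """Hypothetical helper: daily lags plus successive block averages."""
    cols = {}
    # Current day (lag 0) and the ndays - 1 preceding daily values
    for lag in range(ndays):
        cols[f"{series.name}_d{lag}"] = series.shift(lag)
    # Trailing mean over `window` days, ending on each date
    rolled = series.rolling(window).mean()
    # Successive non-overlapping averages, starting just before the daily lags
    for k in range(nwindows):
        cols[f"{series.name}_ma{k}"] = rolled.shift(ndays + k * window)
    # Drop the leading rows that lack a full history
    return pd.DataFrame(cols).dropna()


# Made-up data: a simple ramp so the values are easy to check by eye
flow = pd.Series(np.arange(200.0),
                 index=pd.date_range("2000-01-01", periods=200),
                 name="flow")
feats = antecedent_features(flow)
```

Each input series contributes `ndays + nwindows` columns, and the first `ndays + nwindows * window - 1` days are dropped because they have no complete history.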

## Imports, including the local annutils module

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers.experimental.preprocessing import Normalization
from tensorflow.keras import layers
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import r2_score
import hvplot.pandas
import panel as pn
import holoviews as hv
hv.extension('bokeh')

In [None]:
import os
path_data = '.'

In [None]:
dfin_on = pd.read_csv(os.path.join(path_data, 'smscg_input_on.csv'), index_col=0, parse_dates=True)
dfin_off = pd.read_csv(os.path.join(path_data, 'smscg_input_off.csv'), index_col=0, parse_dates=True)

dfout_on = pd.read_csv(os.path.join(path_data, 'smscg_output_on.csv'), index_col=0, parse_dates=True)
dfout_off = pd.read_csv(os.path.join(path_data, 'smscg_output_off.csv'), index_col=0, parse_dates=True)

# Import input (features) and output (labels) data from CSV files
To see how these files are built, see [how to process DSS files to create CSV files](./read_calsim_and_collate_inputs.ipynb).

## TensorBoard Setup
A log directory keeps the training logs.

TensorBoard runs as a separate process and is best started from the command line. Open a command window, activate this environment (i.e. keras), and go to the current directory. Then run:
```
tensorboard --logdir=./tf_training_logs/ --port=6006
```

In [None]:
# %load_ext tensorboard
# %tensorboard --logdir=./tf_training_logs/ --port=6006
root_logdir = os.path.join(os.curdir, "tf_training_logs")
tensorboard_cb = keras.callbacks.TensorBoard(root_logdir)
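If you want each training run to appear separately in TensorBoard, one common approach is to log every run to its own timestamped subdirectory. The following is a minimal sketch; `get_run_logdir` is a hypothetical helper and is not part of this notebook, `annutils`, or Keras.

```python
import os
import time


def get_run_logdir(root_logdir):
    """Hypothetical helper: return a timestamped subdirectory of root_logdir."""
    run_id = time.strftime("run_%Y_%m_%d-%H_%M_%S")
    return os.path.join(root_logdir, run_id)


root_logdir = os.path.join(os.curdir, "tf_training_logs")
run_logdir = get_run_logdir(root_logdir)
```

The resulting path could then be passed to `keras.callbacks.TensorBoard` in place of `root_logdir`.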

# Calibration and Validation Periods
Calibration runs from 1940 to 2015 and validation from 1923 to 1939, as per the CalSim 3 ANN paper.

The output locations are the names of the columns in the output (labels) CSV files. For each location, an ANN is trained on all the specified data sets.
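The periods are expressed as `slice` objects of year strings. On a pandas `DatetimeIndex`, such a slice selects whole years and includes both end points. The sketch below shows this on made-up daily data; the date range and column name are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up daily index; the real data comes from the CSV files
idx = pd.date_range("1922-10-01", "2015-09-30", freq="D")
df = pd.DataFrame({"x": np.arange(len(idx))}, index=idx)

calib = df.loc[slice("1940", "2015")]  # 1940-01-01 through the end of the data
valid = df.loc[slice("1923", "1939")]  # 1923-01-01 through 1939-12-31
```

Because the two slices cover disjoint years, no row ends up in both the calibration and the validation set.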

In [None]:
output_locations = ['CVP_INTAKE', 'MIDR_INTAKE', 'OLDR_CCF', 'ROLD024',
                    'RSAC081', 'RSAC092', 'RSAN007', 'RSAN018', 'SLMZU003', 'SLMZU011', 'VICT_INTAKE']
calib_slice = slice('1940', '2015')
valid_slice = slice('1923', '1939')

In [None]:
# Define Sequential model with 3 layers
NFEATURES = 126  # (8 + 10)*7


def build_model(nhidden1=8, nhidden2=2, act_func='sigmoid'):
    model = keras.Sequential(
        [
            layers.Input(shape=(NFEATURES,)),  # shape must be a tuple
            layers.Dense(nhidden1, activation=act_func),
            layers.Dense(nhidden2, activation=act_func),
            layers.Dense(1, activation=keras.activations.linear)
        ])
    model.compile(optimizer=keras.optimizers.Adam(
        learning_rate=0.001), loss="mse")
    #model.compile(optimizer=keras.optimizers.RMSprop(), loss="mse")
    return model
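As a sanity check on the network size, the number of trainable parameters of a stack of fully connected layers can be worked out by hand: each layer has `inputs * units` weights plus `units` biases. The helper `dense_param_count` below is hypothetical and only does this arithmetic for the default 126-8-2-1 layout.

```python
def dense_param_count(n_inputs, layer_units):
    """Hypothetical helper: weights plus biases for stacked dense layers."""
    total = 0
    for units in layer_units:
        total += n_inputs * units + units  # kernel entries + bias entries
        n_inputs = units                   # this layer feeds the next one
    return total


n_params = dense_param_count(126, [8, 2, 1])
print(n_params)  # 1016 + 18 + 3 = 1037
```

This figure should match the total reported by `model.summary()` for the default arguments of `build_model`.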

In [None]:
import annutils

In [None]:
for location in output_locations:
    output_location = '%s_EC' % location
    # create tuple of calibration and validation sets and the xscaler and yscaler on the combined inputs
    (xallc, yallc), (xallv, yallv), xscaler, yscaler = \
        annutils.create_training_sets([dfin_on, dfin_off],
                                      [dfout_on[[output_location]],
                                       dfout_off[[output_location]]],
                                      calib_slice=calib_slice,
                                      valid_slice=valid_slice)
    model = build_model(8, 2, act_func='sigmoid')
    model.summary()  # prints the summary itself and returns None
    history = model.fit(
        xallc,
        yallc,
        epochs=1000,
        batch_size=128,
        validation_data=(xallv, yallv),
        callbacks=[
            keras.callbacks.EarlyStopping(
                monitor="val_loss", patience=50, mode="min", restore_best_weights=True),
            tensorboard_cb
        ],
    )
    # pd.DataFrame(history.history).hvplot(logy=True) # if you want to view the graph for calibration/validation training
    annutils.save_model(location, model, xscaler, yscaler)
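The `EarlyStopping` callback above stops training once the validation loss has not improved for `patience` epochs, and `restore_best_weights=True` rolls the model back to the best epoch. The plain-Python sketch below mimics that stopping rule on a list of made-up losses. It is an illustration of the idea, not the Keras implementation, and `early_stopping_epoch` is a hypothetical name.

```python
def early_stopping_epoch(val_losses, patience=50):
    """Hypothetical helper: return (stop_epoch, best_epoch), both 0-based."""
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: remember it and reset the counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience was never exhausted: training ran through every epoch
    return len(val_losses) - 1, best_epoch


stop_epoch, best_epoch = early_stopping_epoch([1.0, 0.5, 0.6, 0.7, 0.8], patience=2)
```

With these losses and a patience of 2, training stops at epoch 3 and the weights from epoch 1 are the ones kept.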

# Show the performance on the data sets visually

Change `location` to one of the locations for which an ANN was trained, then run the cells below to see performance on one or more of the data sets.

In [None]:
location = 'RSAN018'
output_location = '%s_EC' % location
print('Location: ', location)
annmodel = annutils.load_model(location)

In [None]:
annutils.show_performance(annmodel.model, dfin_on,
                          dfout_on[output_location], annmodel.xscaler, annmodel.yscaler)

In [None]:
annutils.show_performance(annmodel.model, dfin_off,
                          dfout_off[output_location], annmodel.xscaler, annmodel.yscaler)

# Display weights and x and y scaling parameters


In [None]:
annmodel.model.get_weights()
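For a `Sequential` model made of `Dense` layers, `get_weights()` returns the kernel and bias of each layer in order, with kernels shaped `(inputs, units)`. The sketch below shows how such a list could be used to reproduce the forward pass with NumPy alone, assuming the sigmoid/sigmoid/linear layout of `build_model`. The weights here are made up; `forward` and `sigmoid` are hypothetical helpers, and the input is assumed to be already scaled.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def forward(x, weights):
    """Hypothetical helper: forward pass for a 3-layer dense network."""
    w1, b1, w2, b2, w3, b3 = weights
    h1 = sigmoid(x @ w1 + b1)   # first hidden layer
    h2 = sigmoid(h1 @ w2 + b2)  # second hidden layer
    return h2 @ w3 + b3         # linear output layer


# Made-up weights with the same shapes as the default model (126-8-2-1)
rng = np.random.default_rng(0)
weights = [rng.normal(size=(126, 8)), np.zeros(8),
           rng.normal(size=(8, 2)), np.zeros(2),
           rng.normal(size=(2, 1)), np.zeros(1)]
x = rng.uniform(size=(5, 126))  # five scaled input rows
y = forward(x, weights)         # one scaled output per row
```

The output is still in scaled units; it would need to be mapped back with the y scaler to obtain physical values.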

In [None]:
annmodel.xscaler.data_min_, annmodel.xscaler.data_max_

In [None]:
annmodel.xscaler.feature_range

In [None]:
annmodel.xscaler.min_

In [None]:
annmodel.xscaler.scale_
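Assuming the scalers are scikit-learn `MinMaxScaler` objects (the attributes shown above match that class), the parameters relate as follows: `scale_ = (range_max - range_min) / (data_max_ - data_min_)`, `min_ = range_min - data_min_ * scale_`, and the transform is `X * scale_ + min_`. The sketch below reproduces these formulas with NumPy on made-up data, which is enough to apply the scaling outside Python.

```python
import numpy as np

# Made-up data: three samples, two features
X = np.array([[0.0, 10.0],
              [5.0, 20.0],
              [10.0, 40.0]])
feature_range = (0.0, 1.0)

data_min = X.min(axis=0)
data_max = X.max(axis=0)
scale = (feature_range[1] - feature_range[0]) / (data_max - data_min)
min_ = feature_range[0] - data_min * scale

X_scaled = X * scale + min_        # forward transform
X_back = (X_scaled - min_) / scale  # inverse transform
```

The same inverse formula, applied with the y scaler's parameters, converts a scaled network output back to the original units.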