# HIDRA model training

In [1]:
# Hide all GPUs from TensorFlow so that training runs on the CPU.
# Comment out the next line to enable GPU support (requires a compatible CUDA version).
%env CUDA_VISIBLE_DEVICES -1

env: CUDA_VISIBLE_DEVICES=-1


In [2]:
import numpy as np
import h5py
import tensorflow as tf

In [3]:
from hidra import HIDRA, compile_model
from hidra.data import DataNormalization, hdf5_dataset

**Step 1**: Prepare the training (and validation) data and load it as a TensorFlow Dataset whose samples have the following structure:
```
(atmospheric_data, sea_level_data), labels
```

In our setup, the samples are stored in an HDF5 file with the following fields:

| Field name | Shape | Description |
|---|----------------------|---|
| `weather` | $$ N  \times \frac{ T_{max} + T_{min} }{4} \times h \times w \times 4  $$ | Atmospheric input tensors, subsampled to a 4-hour temporal resolution. |
| `ssh`, `tide` & `delta` | $$N \times T_{min} \times 1$$ | Input sea level tensors (full signal, tidal component and residual component). |
| `lbl_ssh`, `lbl_tide` & `lbl_delta` | $$N \times T_{max} \times 1$$ | Target (label) sea level tensors (full signal, tidal component and residual component). |
| `dates` (optional) | $$N \times T_{max}$$ | Timestamps of the prediction times of the labeled data (`lbl_*`). |

$N$ denotes the number of samples in the dataset. Refer to the provided sample data file for additional information about the structure of the data.
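
As a quick sanity check on the table, the expected shape of each field can be computed from the dimensions. The sketch below is only illustrative: `expected_shapes` is a hypothetical helper, not part of the `hidra` package, and the values $T_{min} = 24$, $T_{max} = 72$, $h = 29$, $w = 37$ are inferred from the dataset shapes printed further down in this notebook, while $N = 1000$ is arbitrary. It also assumes that $T_{max} + T_{min}$ is divisible by 4.

```python
def expected_shapes(n, t_min, t_max, h, w):
    """Return the expected array shape of each HDF5 field (hypothetical helper)."""
    # Assumption: the weather time axis is an integer, so the sum must divide by 4.
    if (t_min + t_max) % 4 != 0:
        raise ValueError("t_min + t_max must be divisible by 4")
    t_weather = (t_min + t_max) // 4

    shapes = {"weather": (n, t_weather, h, w, 4)}
    for name in ("ssh", "tide", "delta"):
        shapes[name] = (n, t_min, 1)           # input sea level tensors
        shapes["lbl_" + name] = (n, t_max, 1)  # target sea level tensors
    shapes["dates"] = (n, t_max)               # optional timestamps
    return shapes


# Dimensions inferred from the example dataset; n is an arbitrary placeholder.
shapes = expected_shapes(n=1000, t_min=24, t_max=72, h=29, w=37)
for field, shape in shapes.items():
    print(field, shape)
```

Comparing these shapes against the arrays in your own HDF5 file is a cheap way to catch a transposed axis or a wrong temporal window before training.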


We load the HDF5 file into a TensorFlow dataset, then use a mapping function to select and prepare the data for training.

In [4]:
# Specify which fields from the HDF5 file to load. 
# We'll use the residual and tide signals as input and predict residuals in this case.
dataset = hdf5_dataset('../data/example_data.hdf5', ['weather', 'delta', 'tide', 'lbl_delta'])

In [5]:
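# Create a mapping function that prepares the data
def map_fn(weather, delta, tide, lbl_delta):
    # Stack the residual and tidal signals along the last axis: (T_min, 1) x 2 -> (T_min, 2)
    sea_level = tf.concat([delta, tide], axis=1)
    # Drop the trailing singleton axis of the labels: (T_max, 1) -> (T_max,)
    return (weather, sea_level), lbl_delta[..., 0]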

In [6]:
dataset = dataset.map(map_fn)
dataset

<MapDataset shapes: (((24, 29, 37, 4), (24, 2)), (72,)), types: ((tf.float32, tf.float64), tf.float64)>
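
To see where the shapes in the dataset summary above come from, the same transformation can be reproduced on a single synthetic sample with NumPy. This is a minimal sketch, not the actual pipeline: `map_fn_np` is a hypothetical NumPy counterpart of `map_fn`, and the arrays hold random placeholder values with the per-sample shapes of the example data.

```python
import numpy as np

# Synthetic single sample (no batch axis); values are random placeholders.
rng = np.random.default_rng(0)
weather = rng.normal(size=(24, 29, 37, 4)).astype(np.float32)
delta = rng.normal(size=(24, 1))      # residual input, T_min = 24
tide = rng.normal(size=(24, 1))       # tidal input, T_min = 24
lbl_delta = rng.normal(size=(72, 1))  # residual labels, T_max = 72


def map_fn_np(weather, delta, tide, lbl_delta):
    """Hypothetical NumPy counterpart of map_fn, for shape inspection only."""
    # Join residual and tide along the last axis: (24, 1) and (24, 1) -> (24, 2)
    sea_level = np.concatenate([delta, tide], axis=1)
    # Drop the trailing singleton axis of the labels: (72, 1) -> (72,)
    return (weather, sea_level), lbl_delta[..., 0]


(weather_in, sea_level), labels = map_fn_np(weather, delta, tide, lbl_delta)
print(weather_in.shape, sea_level.shape, labels.shape)
# prints: (24, 29, 37, 4) (24, 2) (72,)
```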

**Step 2**: Create and compile the HIDRA model

In [7]:
model = HIDRA(num_predictions=72)

In [8]:
# Prepares the loss function, metrics and the optimizer
model = compile_model(model)

**Step 3**: Train the model

In [10]:
model.fit(dataset.repeat().batch(32), epochs=1, steps_per_epoch=2000)



<tensorflow.python.keras.callbacks.History at 0x7ffa215cd210>
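
Because the dataset is repeated indefinitely with `.repeat()`, Keras cannot infer where an epoch ends, so `steps_per_epoch` defines the epoch length. The arithmetic below sketches how many samples one such epoch covers; `passes_over_data` is a hypothetical helper and the dataset size of 16000 samples is an assumed figure for illustration, not the size of the example file.

```python
def passes_over_data(num_samples, batch_size=32, steps_per_epoch=2000):
    """Return samples seen per epoch and the equivalent number of dataset passes."""
    samples_seen = batch_size * steps_per_epoch
    return samples_seen, samples_seen / num_samples


# num_samples=16000 is an assumed dataset size, used only for illustration.
samples_seen, passes = passes_over_data(num_samples=16000)
print(samples_seen, passes)
# prints: 64000 4.0
```

With the settings used above, one epoch therefore draws 64000 samples regardless of how many samples the HDF5 file actually contains.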

**Step 4**: Export the model weights

In [11]:
model.save_weights('../models/my_model.hdf5')