# Understanding Fully Convolutional Neural Networks

Now that we have generated the coarsened, low-resolution datasets, we can feed them as training data to our parameterized machine learning (ML) models. In this tutorial series, we will focus on one category of ML models, fully convolutional neural networks (FCNNs), though other classes of models can be employed (and have been explored within the paper), including hybrid linear and symbolic regression using genetic programming. Before we begin running these models and making predictions, we will take the time in this notebook to better understand FCNNs, including how they are initialized, how the data is prepared and features are extracted, how training sessions run, and how predictions are generated.

The code that we use for running parameterized ML models resides within this [repository](https://github.com/m2lines/pyqg_parameterization_benchmarks). We are mainly interested in the files `neural_networks.py` and `utils.py`. Starting at a high level, `neural_networks.py` contains the `FCNNParameterization` class, which we use to create parameterized FCNN models that we can train and then use to make predictions. Before we can make predictions, we need to create and train our parameterized FCNNs. The class method `train_on()` takes the dataset that the models will initially be trained on, the directory to save the models to, and the inputs and targets to train on, given as lists of strings.

In [None]:
# neural_networks.py:244
class FCNNParameterization(Parameterization):
    
# neural_networks.py:281:287
@classmethod
def train_on(cls, dataset, directory,
        inputs=['q','u','v'], 
        targets=['q_subgrid_forcing'], # See {INSERT SECTION REFERENCE} for valid target values of subgrid forcing and flux
        num_epochs=50,
        zero_mean=True,
        padding='circular', **kw): # Accepts values 'same', 'circular', or None

We can also pass additional arguments, including the number of epochs, whether the final output layers should be constrained to have zero spatial mean when predicting the subgrid forcing target, and the padding technique. This method creates two `FullyCNN` objects, one for each layer of the quasigeostrophic model on which we ran simulations.

In [None]:
# neural_networks.py:289:299
layers = range(len(dataset.lev))

models = [
    FullyCNN(
        [(feat, zi) for feat in inputs for zi in layers],
        [(feat, z) for feat in targets],
        zero_mean=zero_mean,
        padding=padding

    ) for z in layers
]
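
To make the two list comprehensions above concrete, the short sketch below reproduces the `(feature, layer)` pairs outside of the class. It assumes a two-layer dataset (so `len(dataset.lev) == 2`) and the default `inputs` and `targets`; the variable names `input_features` and `target_features_per_model` are illustrative and do not appear in the repository.

```python
# Illustrative only: rebuild the (feature, layer) pairs used to construct
# each FullyCNN, assuming a two-layer dataset (len(dataset.lev) == 2).
inputs = ['q', 'u', 'v']
targets = ['q_subgrid_forcing']
layers = range(2)  # assumption: two vertical layers

# Every model receives every input feature from every layer ...
input_features = [(feat, zi) for feat in inputs for zi in layers]

# ... but each model predicts the targets of a single layer z.
target_features_per_model = [
    [(feat, z) for feat in targets] for z in layers
]

print(input_features)
print(target_features_per_model)
```

Under these assumptions, each model sees six input channels (three features in two layers) and predicts one output channel for its own layer.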

Once the models are initialized, they are trained on the dataset that was passed in. This is done by first extracting the relevant input and target values from the training dataset. Since the dataset is passed as an `xarray.Dataset`, we must convert it into a `numpy.ndarray` to feed it directly into the `FullyCNN`.

In [None]:
# neural_networks.py:308:309
X = model.extract_inputs(dataset)
Y = model.extract_targets(dataset)

# neural_networks.py:57:66
def extract_vars(self, m, features, dtype=np.float32):
    ex = FeatureExtractor(m)

    arr = np.stack([
        np.take(ex(feat), z, axis=-3) for feat, z in features
    ], axis=-3)

    arr = arr.reshape((-1, len(features), ex.nx, ex.nx))
    arr = arr.astype(dtype)
    return arr

# utils.py:126:128
class FeatureExtractor:
    """Helper class for taking spatial derivatives and translating string
    expressions into data. Works with either pyqg.Model or xarray.Dataset."""

The functions `extract_inputs()` and `extract_targets()` above are wrapper functions around the method `extract_vars()`, which creates a `FeatureExtractor` object from the dataset. `FeatureExtractor` is a helper class that works with either a `pyqg.Model` or an `xarray.Dataset`, taking spatial derivatives and translating string expressions into data. This object is then used to extract the appropriate features from the dataset, which are reshaped from the `xarray.Dataset` format into a `numpy.ndarray` that can be passed into the model. The main function that carries this out is `extract_feature()`.
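
Before looking at `extract_feature()` itself, the stacking and reshaping inside `extract_vars()` can be illustrated with plain NumPy. The sketch below is not the repository's code: `fake_data` is a hypothetical stand-in for the `FeatureExtractor`, and the dimension order `(run, time, lev, y, x)` and the array sizes are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical stand-in for FeatureExtractor: maps a feature name to an array
# with assumed dimensions (run, time, lev, y, x). Sizes are made up.
n_run, n_time, n_lev, nx = 2, 3, 2, 4
rng = np.random.default_rng(0)
fake_data = {
    name: rng.normal(size=(n_run, n_time, n_lev, nx, nx))
    for name in ('q', 'u', 'v')
}

features = [(feat, z) for feat in ('q', 'u', 'v') for z in range(n_lev)]

# Select layer z of each feature (axis=-3 is the layer axis), then stack the
# selected fields along a new channel axis in that same position.
stacked = np.stack(
    [np.take(fake_data[feat], z, axis=-3) for feat, z in features],
    axis=-3,
)

# Merge the leading (run, time) axes into one sample axis, giving the
# (samples, channels, height, width) layout that a 2D CNN expects.
arr = stacked.reshape((-1, len(features), nx, nx)).astype(np.float32)

print(stacked.shape)  # (2, 3, 6, 4, 4)
print(arr.shape)      # (6, 6, 4, 4)
```

Each sample in `arr` is one snapshot, and each channel is one `(feature, layer)` pair in the order given by `features`.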

In [None]:
# utils.py:208:209
def extract_feature(self, feature):
    """Evaluate a string feature, e.g. laplacian(advected(curl(u,v)))."""

After the relevant features have been extracted from the inputs and targets of the training dataset, the data is prepared further by normalizing these values. Normalization helps when training our FCNNs because it puts the different features on a similar scale.

In [None]:
# neural_networks.py:310
model.fit(X, Y, num_epochs=num_epochs, **kw)

# neural_networks.py:131:139
def fit(self, inputs, targets, rescale=False, **kw):
    if rescale or not hasattr(self, 'input_scale') or self.input_scale is None:
        self.input_scale = ChannelwiseScaler(inputs)
    if rescale or not hasattr(self, 'output_scale') or self.output_scale is None:
        self.output_scale = ChannelwiseScaler(targets, zero_mean=self.is_zero_mean)
    train(self,
          self.input_scale.transform(inputs),
          self.output_scale.transform(targets),
          **kw)

The function `fit()` takes the extracted input and target values, along with additional parameters including the number of epochs to train for and whether to rescale using the passed-in input and target values. If a `ChannelwiseScaler` object does not yet exist for the inputs or the targets, or if `rescale` is set, a new scaler is created from the passed-in values. Lastly, the function calls `train()` on the transformed inputs and targets to kick off the training session now that the training data has been processed and prepared.
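
The idea behind per-channel scaling can be sketched as follows. `SketchChannelScaler` is a hypothetical class written for this tutorial, not the repository's `ChannelwiseScaler`; it assumes arrays shaped `(samples, channels, height, width)`, standardizes each channel by its own mean and standard deviation, and assumes that `zero_mean=True` means the mean is left at zero so that only the scale changes.

```python
import numpy as np

class SketchChannelScaler:
    """Hypothetical per-channel standardizer for arrays shaped (N, C, H, W)."""

    def __init__(self, x, zero_mean=False):
        # Assumption: with zero_mean=True no mean is subtracted, so a
        # zero-mean field remains zero-mean after inverse scaling.
        if zero_mean:
            self.mu = 0.0
        else:
            self.mu = x.mean(axis=(0, 2, 3), keepdims=True)
        self.sd = x.std(axis=(0, 2, 3), keepdims=True)

    def transform(self, x):
        return (x - self.mu) / self.sd

    def inverse_transform(self, z):
        return z * self.sd + self.mu

# Two channels with very different magnitudes, as can happen when
# different physical fields are combined as inputs.
rng = np.random.default_rng(0)
x = np.stack(
    [rng.normal(5.0, 1e-3, size=(8, 4, 4)),
     rng.normal(-2.0, 10.0, size=(8, 4, 4))],
    axis=1,
)  # shape (8, 2, 4, 4)

scaler = SketchChannelScaler(x)
z = scaler.transform(x)

print(z.mean(axis=(0, 2, 3)))  # each channel close to 0
print(z.std(axis=(0, 2, 3)))   # each channel close to 1
```

Keeping an `inverse_transform` alongside `transform` lets predictions made in the scaled space be mapped back to physical units.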

In [None]:
# neural_networks.py:222
def train(net, inputs, targets, num_epochs=50, batch_size=64, learning_rate=0.001, device=None):

The above function `train()` takes in