# Training process

The data set generated in `generate_data.py` can now be used to train a normalising flow neural network to perform gravity inversion. For this part of the work it is recommended to use a GPU as this will provide a significant speed-up in computation times.

To train the network, first we need to define some directories.

In [1]:
import os

data = 'data' # where our training and validation sets are located
save = 'trained_flow' # where we want to save our outputs

# Create the output directory if it does not already exist
os.makedirs(save, exist_ok=True)

### Preparing the data

Now we need to read in our generated data set and reformat it. At this stage we also define what information the network sees during training, which is set through the arguments of the `BoxDataset.make_data_for_network()` function. We need to think carefully about what to include: the data must carry everything the normalising flow needs to interpret it, and we should marginalise only over parameters we are not interested in.

For example, in this case our survey points are at fixed locations, so we do not need to provide their coordinates. Because the coordinates are not provided, shuffling the order of the survey points would confuse the network, so we set `mix_survey_order = False`. Finally, we allowed a range of possible noise realisations when creating our data set. We want to tell the network the scale of the noise on the specific gravimetry survey we are inverting, so we include `'noise_scale'` in our conditional.

In [2]:
from giflow.box import BoxDataset
import pickle as pkl

# Reading in files
trainsize = 5000
with open(os.path.join(data, 'trainset.pkl'), 'rb') as file:
    dt = pkl.load(file)
    train_data, train_conditional = dt.make_data_for_network(
        survey_coordinates_to_include = ['noise_scale'],
        model_info_to_include = [],
        add_noise = True, # This refers to survey noise.
        mix_survey_order = False
    )

# Do the same for the validation data
valsize = 500
with open(os.path.join(data, 'valset.pkl'), 'rb') as file:
    dt = pkl.load(file)
    validation_data, validation_conditional = dt.make_data_for_network(
        survey_coordinates_to_include = ['noise_scale'],
        model_info_to_include = [],
        add_noise = True,
        mix_survey_order = False,
    )
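
To make the role of `'noise_scale'` concrete, the sketch below shows one way a noisy survey and its conditional could be assembled. It is an illustration only and not giflow's implementation: the helper `make_noisy_example`, the zero-mean Gaussian noise model, and the example values are all assumptions.

```python
import random

def make_noisy_example(gravity, noise_scale, rng):
    """Hypothetical helper: add zero-mean Gaussian noise with standard
    deviation `noise_scale` to each survey value, and return the noisy
    survey together with the scale the network would be told about."""
    noisy = [g + rng.gauss(0.0, noise_scale) for g in gravity]
    # The survey order is left untouched: without coordinates in the
    # conditional, position in the list is the only location information.
    return noisy, [noise_scale]

rng = random.Random(0)
clean_survey = [1.0, 2.0, 3.0, 4.0]  # made-up gravity values
noisy_survey, scale_info = make_noisy_example(clean_survey, 0.1, rng)
conditional = noisy_survey + scale_info  # what the flow would condition on
```

Appending the scale to the conditional lets a single trained network handle surveys with different noise levels, rather than averaging over them.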


The function returns two lists of arrays, with each array holding a different piece of information. The first array in `train_data` always contains the source model parameters, and the first array in `train_conditional` contains the gravity values. These lists can be passed directly to the `Scaler`.

In [3]:
from giflow.scaler import Scaler
from sklearn.preprocessing import MinMaxScaler

# One scaler is needed for each array in the train_data list
sc_data = Scaler(scalers = [MinMaxScaler()])
sc_data.scale_data(train_data, fit = True) # Fit the scaler and store it in the class

# One scaler for each array in train_conditional (here the gravity values and the noise scale)
sc_conditional = Scaler(scalers = [MinMaxScaler(), MinMaxScaler()])
sc_conditional.scale_data(train_conditional, fit = True)

scalers = {'conditional': sc_conditional, 'data': sc_data}

Fitting scaler and compressor to data set...
Fitting scaler and compressor to data set...
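
As a rough picture of what min-max scaling does to each array, the sketch below fits per-column limits on training values, maps them to the range [0, 1], and inverts the mapping. It is written in plain Python so it runs without dependencies; the function names and example numbers are made up for illustration, and giflow's `Scaler` may differ in its details.

```python
def fit_min_max(rows):
    """Return per-column (min, max) for a list of equal-length rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]

def scale(rows, limits):
    """Map each column to [0, 1] using previously fitted limits."""
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, (lo, hi) in zip(row, limits)]
        for row in rows
    ]

def unscale(rows, limits):
    """Invert `scale`, recovering values in the original units."""
    return [
        [x * (hi - lo) + lo for x, (lo, hi) in zip(row, limits)]
        for row in rows
    ]

# Made-up source-model parameters: two samples, three parameters each
params = [[0.0, 10.0, -5.0], [4.0, 30.0, 5.0]]
limits = fit_min_max(params)   # fit on the training data only
scaled = scale(params, limits)
restored = unscale(scaled, limits)

# New data reuses the limits fitted on the training data
validation_scaled = scale([[2.0, 20.0, 0.0]], limits)
```

Keeping the fitted limits is what allows samples drawn from the trained flow to be converted back to physical units later.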


### Training the network

Now it is time to define the normalising flow.
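
For orientation, a conditional normalising flow is usually trained by maximising the likelihood given by the change-of-variables formula. The notation here is assumed for illustration and does not come from giflow: $x$ is the vector of source model parameters, $c$ is the conditional (gravity values and noise scale), $f$ is the invertible map built from the stacked transforms, and $p_z$ is a simple base distribution such as a standard normal.

$$
\log p(x \mid c) = \log p_z\big(f(x; c)\big) + \log\left|\det \frac{\partial f(x; c)}{\partial x}\right|
$$

The training loss is then typically the negative of this quantity averaged over a batch, although the exact loss used by `FlowModel` is not shown in this section.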

In [4]:
import torch
from giflow.flowmodel import FlowModel, save_flow

hyperparameters = {
        'n_inputs': 7, # the total number of parameters in the source model, including any additional information we chose to include
        'n_conditional_inputs': 65, # the total number of values in the conditional
        'n_transforms': 12,
        'n_blocks_per_transform': 2,
        'n_neurons': 64,
        # The parameters below define some settings for the training
        'batch_size': 5000,
        'batch_norm': True,
        'lr': 0.001,
        'epochs': 3000,
        'early_stopping': False # if set True, the training stops when the validation loss stops decreasing
}

# Construct the flow
flow = FlowModel(
        hyperparameters = hyperparameters,
        datasize = trainsize,
        scalers = scalers
);
flow.save_location = save
flow.data_location = data
save_flow(flow)
flow.construct();
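
The `'early_stopping'` option stops training once the validation loss stops decreasing. The exact criterion used by giflow is not shown here, so the following is only a minimal sketch of one common patience-based rule; the function `should_stop` and its `patience` and `min_delta` parameters are hypothetical.

```python
def should_stop(val_losses, patience=10, min_delta=0.0):
    """Return True if none of the last `patience` epochs improved on the
    best earlier validation loss by more than `min_delta`."""
    if len(val_losses) <= patience:
        return False  # not enough history yet
    best_before = min(val_losses[:-patience])
    recent_best = min(val_losses[-patience:])
    return recent_best > best_before - min_delta

# Made-up validation losses, one per epoch
history = []
stopped_at = None
for epoch, loss in enumerate([1.0, 0.8, 0.7, 0.72, 0.71, 0.73]):
    history.append(loss)
    if should_stop(history, patience=3):
        stopped_at = epoch
        break
```

A larger `patience` tolerates noisier validation curves at the cost of extra epochs after the best model has been reached.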

An optimiser must be defined for the training. A learning-rate scheduler can optionally be used as well to make the training more efficient.

In [5]:
optimiser = torch.optim.Adam(
    flow.flowmodel.parameters(),
    lr = flow.hyperparameters['lr']
)
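
To illustrate what a scheduler does, the sketch below computes a cosine decay of the learning rate over the training run, using the `lr` and `epochs` values from the hyperparameters above. This is the same closed form that PyTorch's `torch.optim.lr_scheduler.CosineAnnealingLR` follows within one cycle. The helper `cosine_lr` is hypothetical, and how `flow.train` steps a scheduler is not shown in this section.

```python
import math

def cosine_lr(epoch, total_epochs, lr_max, lr_min=0.0):
    """Learning rate at `epoch` for a cosine decay from lr_max to lr_min."""
    progress = min(epoch, total_epochs) / total_epochs
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))

# Values taken from the hyperparameters: lr = 0.001, epochs = 3000
lr_start = cosine_lr(0, 3000, 0.001)     # full learning rate at the start
lr_mid = cosine_lr(1500, 3000, 0.001)    # half of it midway through
lr_end = cosine_lr(3000, 3000, 0.001)    # decays to lr_min at the end
```

Large steps early on move the flow quickly towards a good region, while the small steps near the end help the loss settle.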

Now that everything is set up, we can send our data to the desired device and start training the neural network.

In [6]:
# Train on the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Make the tensor data sets
train_dataset = flow.make_tensor_dataset(
    train_data, 
    train_conditional, 
    device = device, 
    scale = True
)

validation_dataset = flow.make_tensor_dataset(
    validation_data, 
    validation_conditional, 
    device = device, 
    scale = True
)

In [None]:
flow.train(
    optimiser = optimiser,
    validation_dataset = validation_dataset,
    train_dataset = train_dataset,
    scheduler = None,
    device = device
)

All outputs, including diagnostics and loss plots, are saved in the `trained_flow` directory.