# Training End-to-End Models

## Checking GPU Availability
First, let's check whether a GPU is available. Using a GPU is highly recommended; the experiment also runs on the CPU, but much more slowly.

In [None]:
import os

import torch

from experiment import *
from data_utils import load_e2e_data


device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

print("Current device: " + str(device))


## Loading Training Data
Here, we define the paths to the images and their corresponding masks (targets).
The values for normalization are pre-computed from the training data in order to save time.
To reproduce all results from the paper, run this notebook several times with different amounts of training data. Remember that the actual number of samples is 10 times what the start/stop indices suggest, since every sample exists in 10 versions with different SNRs.

Because we wanted to compare the effect of training on an increasing number of samples, we always kept the validation data the same to allow a fair and meaningful comparison of the results. Note that, unlike the training data, the validation data is not augmented.
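
As a quick sanity check on the dataset sizes described above, the following sketch computes the effective number of samples for each training range and confirms that none of them reaches into the fixed validation range. The helper `effective_samples` and the constant `SNR_VERSIONS` are illustrative names that are not part of the notebook's code, and the sketch assumes half-open index ranges as in Python slicing; the factor of 10 is the one stated above.

```python
# Illustrative sketch; the names here are hypothetical, not part of data_utils.
SNR_VERSIONS = 10  # every sample exists in 10 versions with different SNRs


def effective_samples(start, stop, versions=SNR_VERSIONS):
    """Number of samples loaded for the index range [start, stop) (assumed half-open)."""
    return (stop - start) * versions


train_stops = [1000, 2000, 4000, 8000]  # the `stop` values used for training
val_start, val_stop = 9000, 10000       # fixed validation range

train_counts = {stop: effective_samples(0, stop) for stop in train_stops}
val_count = effective_samples(val_start, val_stop)

# No training range may reach into the validation range.
no_overlap = all(stop <= val_start for stop in train_stops)

for stop, count in train_counts.items():
    print(f"stop={stop}: {count} training samples")
print(f"validation samples: {val_count}, disjoint from training: {no_overlap}")
```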

In [None]:
train_dir = os.path.join(os.getcwd(), 'data', 'end2end', 'train')
mask_dir = os.path.join(os.getcwd(), 'data', 'masks', 'train')

# normalization values for the end2end data (pre-computed from the training data)
normalizer = {
    'norm_mean': [49],
    'norm_std': [56],
}

start = 0
stop = 1000
# alternative training set sizes:
# stop = 2000
# stop = 4000
# stop = 8000
train_dataset = load_e2e_data(train_dir, mask_dir, start, stop, normalizer, augment=True)

# this subset is always reserved as the validation set, regardless of the number of training samples
start = 9000
stop = 10000
val_dataset = load_e2e_data(train_dir, mask_dir, start, stop, normalizer)

print('Number of training samples: ' + str(len(train_dataset)))
print('Number of validation samples: ' + str(len(val_dataset)))
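
The `normalizer` values above are pre-computed, and how `load_e2e_data` applies them is not shown in this notebook. A common convention, assumed in this sketch, is to subtract the mean and divide by the standard deviation for each pixel. The function `normalize_pixel` and the sample intensities are hypothetical and only serve as an illustration.

```python
# Sketch of standard mean/std normalization. This is an assumption about what
# load_e2e_data does internally, not a copy of its implementation.
NORM_MEAN = 49.0  # 'norm_mean' from the normalizer dictionary
NORM_STD = 56.0   # 'norm_std' from the normalizer dictionary


def normalize_pixel(value, mean=NORM_MEAN, std=NORM_STD):
    """Shift a raw intensity by the mean and scale it by the standard deviation."""
    return (value - mean) / std


# Hypothetical raw intensities of a single-channel image.
raw_pixels = [0, 49, 105, 255]
normalized = [normalize_pixel(p) for p in raw_pixels]

for raw, norm in zip(raw_pixels, normalized):
    print(f"{raw:3d} -> {norm:+.3f}")
```

With these values, an intensity equal to the mean maps to zero, and an intensity one standard deviation above the mean maps to one.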

## Creating the Architecture
Here, we create the network architecture and move it to the previously determined device (CPU or CUDA).

In [None]:
# this creates the compact architecture...
model = get_cmp_thermunet().to(device)

# ...or you could use the more complex architecture
# model = get_lrg_thermunet().to(device)

## Inspecting the Architecture
We can take a look at the architecture and check, for instance, whether the number of parameters is what we expect.


In [None]:
from torchsummary import summary

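# torchsummary expects the device as the plain string 'cuda' or 'cpu',
# so derive it from where the model's parameters actually live
device = torch.device('cuda' if next(model.parameters()).is_cuda else 'cpu')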

summary(model, input_size=(1, 64, 256), device=str(device))
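
To judge whether a reported parameter count is plausible, it can help to work out the expected count of individual layers by hand. The sketch below uses the standard formula for a 2D convolution with bias. The helper `conv2d_params` and the layer sizes are hypothetical examples; they are not taken from the architectures returned by `get_cmp_thermunet` or `get_lrg_thermunet`.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Trainable parameters of a standard (non-grouped) 2D convolution."""
    k_h, k_w = kernel_size
    weights = in_channels * out_channels * k_h * k_w
    return weights + (out_channels if bias else 0)


# Hypothetical layers given as (in_channels, out_channels, kernel_size).
layers = [
    (1, 16, (3, 3)),
    (16, 32, (3, 3)),
]

per_layer = [conv2d_params(c_in, c_out, k) for c_in, c_out, k in layers]
total = sum(per_layer)

print("parameters per layer:", per_layer)
print("total:", total)
```

Comparing such hand-computed numbers with the per-layer rows of the summary is a simple way to spot a misconfigured layer.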

## Training
Finally, we can start the training procedure.
The folders for storing the results are named automatically based on a timestamp.
They are always created as subfolders of the training data folder.

If you want to observe some random output samples during training, set `visualization_lvl` to either 1 (plot validation data output samples) or 2 (plot validation and training data output samples).

In [None]:
epochs = 100
visualization_lvl = 1

train(model, train_dataset, val_dataset, epochs, visualization_lvl)

print("training finished...")
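
The timestamp-based result folders mentioned above could be generated along the following lines. This is only a sketch: the function `make_result_dir`, the `results_` prefix, and the timestamp format are assumptions, and the actual naming used by `train` may differ.

```python
import os
import tempfile
from datetime import datetime


def make_result_dir(train_dir, now=None):
    """Create and return a timestamped subfolder of train_dir (hypothetical helper)."""
    now = now or datetime.now()
    name = "results_" + now.strftime("%Y-%m-%d_%H-%M-%S")  # assumed format
    path = os.path.join(train_dir, name)
    os.makedirs(path, exist_ok=True)
    return path


# Demonstrate in a temporary directory so nothing is written to the real data folder.
with tempfile.TemporaryDirectory() as tmp:
    result_dir = make_result_dir(tmp, now=datetime(2020, 1, 31, 12, 0, 0))
    created = os.path.isdir(result_dir)
    folder_name = os.path.basename(result_dir)
    print(folder_name)
```

Using a timestamp in the folder name means that repeated runs with different amounts of training data do not overwrite each other's results.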