Copyright (c) Microsoft Corporation.

Licensed under the MIT License.

# Model training and evaluation on F3 Netherlands dataset

Seismic interpretation, also referred to as facies classification, is the task of determining the types of rock in the earth’s subsurface from seismic data. It is a standard approach for locating oil deposits precisely before drilling, thereby reducing risks and potential losses. In recent years, there has been great interest in using fully supervised deep learning models for seismic interpretation.

In this notebook, we demonstrate how to train a deep neural network for facies prediction using the F3 Netherlands dataset. The F3 block is located in the North Sea off the shores of the Netherlands. The dataset contains 6 classes (facies or lithostratigraphic units) of varying thickness, which leads to class imbalance. Processed data is available in numpy format as a `401 x 701 x 255` array. The processed F3 data is made available by [Alaudah et al. 2019](https://github.com/yalaudah/facies_classification_benchmark).

We specifically demonstrate a patch-based model approach, where we process a patch of an inline or crossline slice, instead of the entire slice.

## Environment setup

To set up the conda environment and the Jupyter notebook kernel, please follow the instructions in the top-level [README.md](../../../README.md) file.

## Notebook-specific parameters

Now let's set parameters which are required only for this notebook.

We use configuration files to specify experiment configuration, such as hyperparameters used in training and evaluation, as well as other experiment settings. 

This notebook is designed to showcase the patch-based models on the Dutch F3 dataset, hence we load the configuration files from that experiment by navigating to the `experiments` folder in the root directory. Each configuration file specifies a different computer vision model which is loaded for this notebook.

Modify the `CONFIG_FILE` variable below if you would like to run the experiment using a different configuration file from the same experiment.

For an "out-of-the-box" Docker experience, we have already pre-populated each model configuration file with the correct parameters.

In [None]:
# load an existing experiment configuration file
CONFIG_FILE = (
    "../../../experiments/interpretation/dutchf3_patch/local/configs/hrnet.yaml"
)
# number of images to score
N_EVALUATE = 20
# demo flag - by default notebook runs in demo mode and only fine-tunes the pre-trained model. Set to False for full re-training.
DEMO = True
# options are test1 or test2 - picks which Dutch F3 test set split to use
TEST_SPLIT = "test1"

import os
assert os.path.isfile(CONFIG_FILE), "Experiment config file CONFIG_FILE not found on disk"
assert isinstance(N_EVALUATE, int) and N_EVALUATE>0, "Number of images to score has to be a positive integer"
assert isinstance(DEMO, bool), "demo mode should be a boolean"
assert TEST_SPLIT in ("test1", "test2"), "TEST_SPLIT has to be either test1 or test2"

## Data download and preparation

To download and prepare the F3 data set, please follow the instructions in the top-level [README](../../../README.md) file. Once you have downloaded and prepared the data set, you will find your files in the following directory tree:

```
data
├── splits
├── test_once
│   ├── test1_labels.npy
│   ├── test1_seismic.npy
│   ├── test2_labels.npy
│   └── test2_seismic.npy
└── train
    ├── train_labels.npy
    └── train_seismic.npy
```

We recommend saving the data under `$HOME/data/dutchf3` since this notebook will use that location as the data root. Otherwise, modify the `DATASET.ROOT` field in the configuration file, described next. 

## Library imports

Let's import the required libraries. Random seeds are fixed further below, once the configuration has been loaded, to obtain reproducible results.

In [None]:
import numpy as np
import torch
import logging
import logging.config
from os import path

import random
import matplotlib.pyplot as plt

plt.rcParams.update({"font.size": 16})

import yacs.config

import cv2
from albumentations import Compose, HorizontalFlip, Normalize, PadIfNeeded, Resize
from ignite.contrib.handlers import CosineAnnealingScheduler
from ignite.handlers import ModelCheckpoint
from ignite.engine import Events
from ignite.metrics import Loss
from ignite.utils import convert_tensor
from toolz import compose
from torch.utils import data

from cv_lib.utils import load_log_configuration
from cv_lib.event_handlers import SnapshotHandler, logging_handlers
from cv_lib.event_handlers.logging_handlers import Evaluator
from cv_lib.event_handlers import tensorboard_handlers
from cv_lib.event_handlers.tensorboard_handlers import (
    create_image_writer,
    create_summary_writer,    
)
from cv_lib.segmentation import models
from cv_lib.segmentation.dutchf3.engine import (
    create_supervised_evaluator,
    create_supervised_trainer,
)

from cv_lib.segmentation.metrics import (
    pixelwise_accuracy,
    class_accuracy,
    mean_class_accuracy,
    class_iou,
    mean_iou,
)

from cv_lib.segmentation.dutchf3.utils import (
    current_datetime,
    generate_path,
    git_branch,
    git_hash,
    np_to_tb,
)

from deepseismic_interpretation.dutchf3.data import (
    get_patch_loader,
    decode_segmap,
    get_test_loader,
)

from itkwidgets import view

from utilities import (
    plot_aline,
    patch_label_2d,
    compose_processing_pipeline,
    output_processing_pipeline,
    write_section_file,
    runningScore,
    validate_config_paths,
    download_pretrained_model,
)

## Experiment configuration file

Let's load the experiment configuration!

In [None]:
with open(CONFIG_FILE, "rt") as f_read:
    config = yacs.config.load_cfg(f_read)

print(
    f"Configuration loaded. Please check that the DATASET.ROOT:{config.DATASET.ROOT} points to your data location."
)
print(
    f"To modify any of the options, please edit the configuration file {CONFIG_FILE} and reload. \n"
)

We test the notebooks with pipelines which use [papermill](https://papermill.readthedocs.io/en/latest/). If this notebook is being executed as part of such a pipeline, the variables below are overridden. If not, we simply set these variables from the static configuration file specified earlier.

Override parameters in case we use papermill:

In [None]:
# Parameters which papermill can override when this notebook runs in a test pipeline.
# By default they are taken from the configuration file loaded above.
# The number of epochs to run in training
max_epochs = config.TRAIN.END_EPOCH
max_snapshots = config.TRAIN.SNAPSHOTS
papermill = False
dataset_root = config.DATASET.ROOT
model_pretrained = config.MODEL.PRETRAINED

In [None]:
# read back the parameters from papermill to config if papermill was used to run this notebook
if papermill:
    # reduce number of images scored for testing
    N_EVALUATE=2

opts = [
    "DATASET.ROOT",
    dataset_root,
    "TRAIN.END_EPOCH",
    max_epochs,
    "TRAIN.SNAPSHOTS",
    max_snapshots,
]
if "PRETRAINED" in config.MODEL.keys():
    opts += ["MODEL.PRETRAINED", model_pretrained]

config.merge_from_list(opts)

# download pre-trained model if possible
config = download_pretrained_model(config)

# update model pretrained (in case it was changed when the pretrained model was downloaded)
model_pretrained = config.MODEL.PRETRAINED
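
The `opts` list passed to `merge_from_list` alternates dotted option names and their new values. As a rough illustration of that convention, here is a sketch that applies such a list to a plain nested dictionary. It is a stand-in for illustration only, not how yacs is implemented; `apply_overrides` and the sample values are hypothetical.

```python
def apply_overrides(cfg, opts):
    """Apply a flat [key1, value1, key2, value2, ...] list to a nested dict."""
    if len(opts) % 2 != 0:
        raise ValueError("opts must contain an even number of items")
    for dotted_key, value in zip(opts[0::2], opts[1::2]):
        *parents, leaf = dotted_key.split(".")
        node = cfg
        for name in parents:
            node = node[name]  # KeyError if the section does not exist
        if leaf not in node:
            raise KeyError(f"Unknown option: {dotted_key}")
        node[leaf] = value
    return cfg


# Hypothetical configuration values, used only for this illustration
cfg = {"DATASET": {"ROOT": "/data/dutchf3"}, "TRAIN": {"END_EPOCH": 300, "SNAPSHOTS": 5}}
apply_overrides(cfg, ["TRAIN.END_EPOCH", 1, "TRAIN.SNAPSHOTS", 1])
print(cfg["TRAIN"])
```

Rejecting unknown keys, as this sketch does, catches typos in option names early instead of silently creating new settings.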

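In demo mode, we override a few training settings so that the notebook only briefly fine-tunes the model. The final configuration which is going to be used for this notebook is printed a few cells below. Please check it carefully: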

In [None]:
if DEMO:
    opts = [
        "TRAIN.END_EPOCH",
        1,
        "TRAIN.SNAPSHOTS",
        1,
        "TRAIN.MAX_LR",
        10 ** -9,
        "TRAIN.MIN_LR",
        10 ** -9,
    ]
    config.merge_from_list(opts)

In [None]:
# Set CUDNN benchmark mode:
torch.backends.cudnn.benchmark = config.CUDNN.BENCHMARK

# Fix random seeds (including Python's random module, which is used to sample test sections):
random.seed(config.SEED)
torch.manual_seed(config.SEED)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(config.SEED)
np.random.seed(seed=config.SEED)

In [None]:
print(config)
validate_config_paths(config)

For tests, we reduce the amount of data used by the Jupyter notebook (pending Ignite 0.3.0, where we can just reduce the number of batches per epoch).

## F3 data set 

Let's visualize a few sections of the F3 data set. The processed F3 data set is stored as a 3D numpy array. Let's view slices of the data along inline and crossline directions. 

In [None]:
# Load training data and labels
train_seismic = np.load(path.join(config.DATASET.ROOT, "train/train_seismic.npy"))
train_labels = np.load(path.join(config.DATASET.ROOT, "train/train_labels.npy"))

print(f"Number of inline slices: {train_seismic.shape[0]}")
print(f"Number of crossline slices: {train_seismic.shape[1]}")
print(f"Depth dimension : {train_seismic.shape[2]}")

In [None]:
view(train_labels, slicing_planes=True)

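Let's plot an __inline__ section, taken at a fixed inline index, with crosslines along the horizontal axis.

In [None]:
idx = 100
x_in = train_seismic[idx, :, :].swapaxes(0, 1)
x_inl = train_labels[idx, :, :].swapaxes(0, 1)

plot_aline(x_in, x_inl, xlabel="crossline (relative)")

Let's plot a __crossline__ section, taken at a fixed crossline index, with inlines along the horizontal axis.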

In [None]:
x_cr = train_seismic[:, idx, :].swapaxes(0, 1)
x_crl = train_labels[:, idx, :].swapaxes(0, 1)

plot_aline(x_cr, x_crl, xlabel="inline (relative)")
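
To make the indexing above concrete, here is a small sketch on a synthetic array. It assumes the same axis order as the printout above, `(inline, crossline, depth)`; the array size and variable names are made up for illustration.

```python
import numpy as np

# Small synthetic stand-in for the seismic volume: (inline, crossline, depth)
volume = np.arange(4 * 5 * 3).reshape(4, 5, 3)

idx = 2
# Fixing the first (inline) index leaves a (crossline, depth) slice;
# swapping axes puts depth on the rows and crossline on the columns
fixed_inline = volume[idx, :, :].swapaxes(0, 1)
# Fixing the second (crossline) index leaves an (inline, depth) slice;
# swapping axes puts depth on the rows and inline on the columns
fixed_crossline = volume[:, idx, :].swapaxes(0, 1)

print(fixed_inline.shape)     # (depth, crossline)
print(fixed_crossline.shape)  # (depth, inline)
```

Swapping the axes is what makes depth run vertically in the plots, which is the conventional way to display a seismic section.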

## Model training

In [None]:
# Set up logging
load_log_configuration(config.LOG_CONFIG)
logger = logging.getLogger(__name__)
logger.debug(config.WORKERS)

scheduler_step = config.TRAIN.END_EPOCH // config.TRAIN.SNAPSHOTS

### Set up data augmentation

Let's define our data augmentation pipeline, which includes basic transformations, such as _data normalization, resizing, and padding_ if necessary. 

In [None]:
# Setup Augmentations
base_aug = Compose(
    [
        Normalize(
            mean=(config.TRAIN.MEAN,), std=(config.TRAIN.STD,), max_pixel_value=1
        ),
        PadIfNeeded(
            min_height=config.TRAIN.PATCH_SIZE,
            min_width=config.TRAIN.PATCH_SIZE,
            border_mode=0,
            always_apply=True,
            mask_value=255,
            value=0,
        ),
        Resize(
            config.TRAIN.AUGMENTATIONS.RESIZE.HEIGHT,
            config.TRAIN.AUGMENTATIONS.RESIZE.WIDTH,
            always_apply=True,
        ),
        PadIfNeeded(
            min_height=config.TRAIN.AUGMENTATIONS.PAD.HEIGHT,
            min_width=config.TRAIN.AUGMENTATIONS.PAD.WIDTH,
            border_mode=config.OPENCV_BORDER_CONSTANT,
            always_apply=True,
            mask_value=255,
        ),
    ]
)

if config.TRAIN.AUGMENTATION:
    train_aug = Compose([base_aug, HorizontalFlip(p=0.5)])
    val_aug = base_aug
else:
    raise NotImplementedError(
        "We don't support turning off data augmentation at this time"
    )
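
As a rough picture of what the normalization and padding steps do to an image and its label mask, here is a numpy-only sketch. It is an approximation written for illustration, not the albumentations implementation; the helper names and sizes are hypothetical. The mask is padded with 255 so that padded pixels can be ignored later by the loss.

```python
import numpy as np


def normalize(image, mean, std):
    """Standardize pixel values: (x - mean) / std."""
    return (image - mean) / std


def pad_to_min_size(array, min_height, min_width, fill):
    """Pad a 2D array with a constant so it is at least min_height x min_width.

    Padding is split between both sides, with any odd remainder on the bottom/right.
    """
    pad_h = max(min_height - array.shape[0], 0)
    pad_w = max(min_width - array.shape[1], 0)
    top, left = pad_h // 2, pad_w // 2
    return np.pad(
        array,
        ((top, pad_h - top), (left, pad_w - left)),
        mode="constant",
        constant_values=fill,
    )


image = np.full((3, 4), 2.0)
mask = np.ones((3, 4), dtype=np.uint8)

image = pad_to_min_size(normalize(image, mean=1.0, std=0.5), 6, 6, fill=0)
mask = pad_to_min_size(mask, 6, 6, fill=255)  # 255 marks pixels to ignore

print(image.shape, mask.shape)
print((mask == 255).sum())
```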

### Load the data

For training the model, we will use a patch-based approach. Rather than using entire sections (crosslines or inlines) of the data, we extract a large number of small patches from the sections, and use the patches as our data. This allows us to generate a larger set of images for training, and is also a more feasible approach for large seismic volumes.

We are using a custom patch data loader from our __`deepseismic_interpretation`__ library for generating and loading patches from seismic section data.
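
To illustrate the idea of cutting a section into patches, here is a minimal sketch that slides a square window over a 2D array on a regular grid. The library's loader may pad sections and treat borders differently, so treat `patch_origins` and `extract_patches` as hypothetical helpers rather than its actual behavior.

```python
import numpy as np


def patch_origins(height, width, patch_size, stride):
    """Top-left corners of all full patches that fit on a regular grid."""
    rows = range(0, height - patch_size + 1, stride)
    cols = range(0, width - patch_size + 1, stride)
    return [(r, c) for r in rows for c in cols]


def extract_patches(section, patch_size, stride):
    """Cut a 2D section into square patches of side patch_size."""
    origins = patch_origins(section.shape[0], section.shape[1], patch_size, stride)
    return [section[r : r + patch_size, c : c + patch_size] for r, c in origins]


# Synthetic 8 x 10 section, used only for illustration
section = np.arange(8 * 10).reshape(8, 10)
patches = extract_patches(section, patch_size=4, stride=2)
print(len(patches), patches[0].shape)
```

A stride smaller than the patch size makes neighboring patches overlap, which is how a single section can yield many training images.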

In [None]:
scheduler_step = config.TRAIN.END_EPOCH // config.TRAIN.SNAPSHOTS

TrainPatchLoader = get_patch_loader(config)

train_set = TrainPatchLoader(
    config.DATASET.ROOT,
    config.DATASET.NUM_CLASSES,
    split="train",
    is_transform=True,
    stride=config.TRAIN.STRIDE,
    patch_size=config.TRAIN.PATCH_SIZE,
    augmentations=train_aug,
)
n_classes = train_set.n_classes
logger.info(train_set)
val_set = TrainPatchLoader(
    config.DATASET.ROOT,
    config.DATASET.NUM_CLASSES,
    split="val",
    is_transform=True,
    stride=config.TRAIN.STRIDE,
    patch_size=config.TRAIN.PATCH_SIZE,
    augmentations=val_aug,
)

if papermill:
    val_set = data.Subset(val_set, range(3))
elif DEMO:
    val_set = data.Subset(val_set, range(config.VALIDATION.BATCH_SIZE_PER_GPU))

logger.info(val_set)

train_loader = data.DataLoader(
    train_set,
    batch_size=config.TRAIN.BATCH_SIZE_PER_GPU,
    num_workers=config.WORKERS,
    shuffle=True,
)
val_loader = data.DataLoader(
    val_set,
    batch_size=config.VALIDATION.BATCH_SIZE_PER_GPU,
    num_workers=config.WORKERS,
)

The following code defines the snapshot duration, measured in iterations (batches), over which we snapshot training models to disk. The variable `scheduler_step` defines how many epochs we have in a snapshot; multiplying it by the number of iterations per epoch (`train_len`) gives the number of iterations per snapshot.

In [None]:
# train_len is used as the number of iterations (batches) per epoch
# if we're running in test mode, just run 2 * BATCH_SIZE_PER_GPU iterations
if papermill:
    train_len = config.TRAIN.BATCH_SIZE_PER_GPU*2
# if we're running in demo mode, just run 10 * BATCH_SIZE_PER_GPU iterations to fine-tune the model
elif DEMO:
    train_len = config.TRAIN.BATCH_SIZE_PER_GPU*10
# if we're not in test or demo modes, run the entire loop
else:
    train_len = len(train_loader)

snapshot_duration = scheduler_step * train_len if not papermill else 2*len(train_loader)
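
As a worked example of this arithmetic, with made-up numbers that are not taken from any configuration file:

```python
# Hypothetical values, chosen only to illustrate the arithmetic
end_epoch = 300             # stands in for config.TRAIN.END_EPOCH
snapshots = 5               # stands in for config.TRAIN.SNAPSHOTS
iterations_per_epoch = 120  # stands in for train_len

# epochs covered by one snapshot
scheduler_step = end_epoch // snapshots
# iterations covered by one snapshot
snapshot_duration = scheduler_step * iterations_per_epoch

print(scheduler_step, snapshot_duration)  # 60 7200
```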

We also must specify a batch transformation function which allows us to selectively manipulate the data for each batch into the format which model training expects in the next step.

In [None]:
def prepare_batch(batch, device=None, non_blocking=False):
    x, y = batch
    return (
        convert_tensor(x, device=device, non_blocking=non_blocking),
        convert_tensor(y, device=device, non_blocking=non_blocking),
    )


### Set up model training

Next, let's define a model to train, an optimization algorithm, and a loss function. 

Note that the model is loaded from our __`cv_lib`__ library, using the name of the model as specified in the configuration file. To load a different model, either change the `MODEL.NAME` field in the configuration file, or create a new one corresponding to the model you wish to train.

In [None]:
# load a model
model = getattr(models, config.MODEL.NAME).get_seg_model(config)

# Send to GPU if available
device = "cpu"
if torch.cuda.is_available():
    device = "cuda"
model = model.to(device)

# SGD optimizer
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=config.TRAIN.MAX_LR,
    momentum=config.TRAIN.MOMENTUM,
    weight_decay=config.TRAIN.WEIGHT_DECAY,
)

# learning rate scheduler
scheduler = CosineAnnealingScheduler(
    optimizer, "lr", config.TRAIN.MAX_LR, config.TRAIN.MIN_LR, cycle_size=snapshot_duration
)

# weights are inversely proportional to the frequency of the classes in the training set
class_weights = torch.tensor(
    config.DATASET.CLASS_WEIGHTS, device=device, requires_grad=False
)

# loss function
criterion = torch.nn.CrossEntropyLoss(
    weight=class_weights, ignore_index=255, reduction="mean"
)
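
The class weights above are read from the configuration file. As a sketch of one common way to derive weights that are inversely proportional to class frequency, consider the following; the normalization to a sum of 1 is our assumption, and `inverse_frequency_weights` is a hypothetical helper, not how the shipped weights were necessarily computed.

```python
import numpy as np


def inverse_frequency_weights(labels, n_classes):
    """Weights proportional to 1 / class frequency, normalized to sum to 1."""
    counts = np.bincount(labels.ravel(), minlength=n_classes).astype(float)
    freq = counts / counts.sum()
    inv = np.zeros(n_classes)
    present = counts > 0  # classes that never occur keep a weight of 0
    inv[present] = 1.0 / freq[present]
    return inv / inv.sum()


# Toy label array: class 0 is common, class 2 is rare
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])
weights = inverse_frequency_weights(labels, n_classes=3)
print(weights.round(3))
```

Rare classes receive larger weights, so mistakes on thin facies layers contribute more to the loss than mistakes on thick ones.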

## Training the model

We use the [ignite](https://pytorch.org/ignite/index.html) framework to create training and validation loops in our codebase. Ignite provides an easy way to create compact training/validation loops without too much boilerplate code.

In this notebook, we demonstrate the use of ignite on the training loop only. We create a training engine `trainer` that loops multiple times over the training dataset and updates model parameters. 

In [None]:
# create training engine
trainer = create_supervised_trainer(
    model, optimizer, criterion, prepare_batch, device=device
)

# add learning rate scheduler
trainer.add_event_handler(Events.ITERATION_STARTED, scheduler)
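
To give a feel for the learning rate schedule attached above, here is a standalone sketch of cosine annealing with restarts. It mirrors the general formula only; it is not ignite's code, and the learning rates and cycle length below are made up.

```python
import math


def cosine_annealing(iteration, start_value, end_value, cycle_size):
    """Value at a given iteration of a repeating cosine cycle."""
    progress = (iteration % cycle_size) / cycle_size
    return end_value + 0.5 * (start_value - end_value) * (1 + math.cos(math.pi * progress))


# Hypothetical learning rates and cycle length
max_lr, min_lr, cycle = 0.02, 0.002, 100
for it in (0, 50, 99, 100):
    print(it, round(cosine_annealing(it, max_lr, min_lr, cycle), 5))
```

Each cycle starts at the maximum learning rate, decays towards the minimum, and then restarts. Since the cycle length in this notebook is `snapshot_duration`, each restart lines up with the start of a new snapshot period.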

#### Logging

We add various event handlers to the trainer using ignite's event system, which allows us to interact with the engine at each step of the run, for example when the trainer is started or completed, or when an epoch is started or completed.

Over the next few cells, we use event handlers to add the following to the training loop:
- log training output,
- log and schedule the learning rate, and
- periodically save the model to disk.

In [None]:
# define and create main output directory 
# output_dir = path.join(config.OUTPUT_DIR+"_nb", config.TRAIN.MODEL_DIR)
output_dir = config.OUTPUT_DIR+"_nb"
generate_path(output_dir)

# define main summary writer which logs all model summaries
summary_writer = create_summary_writer(log_dir=path.join(output_dir, config.LOG_DIR))

# add logging of training output
trainer.add_event_handler(
    Events.ITERATION_COMPLETED,
    logging_handlers.log_training_output(log_interval=config.TRAIN.BATCH_SIZE_PER_GPU),
)

# add logging of learning rate
trainer.add_event_handler(Events.EPOCH_STARTED, logging_handlers.log_lr(optimizer))

# log learning rate to tensorboard
trainer.add_event_handler(
    Events.EPOCH_STARTED,
    tensorboard_handlers.log_lr(summary_writer, optimizer, "epoch"),
)

# log training summary to tensorboard as well
trainer.add_event_handler(
    Events.ITERATION_COMPLETED,
    tensorboard_handlers.log_training_output(summary_writer),
)

We also checkpoint models and snapshot them to disk with every training epoch.

In [None]:
# add model checkpointing
checkpoint_handler = ModelCheckpoint(
    output_dir,
    "model_f3_nb",
    save_interval=1,
    n_saved=1,
    create_dir=True,
    require_empty=False,
)

trainer.add_event_handler(
    Events.EPOCH_COMPLETED, checkpoint_handler, {config.MODEL.NAME: model}
)

Next we need to score the model on the validation set while it is training. To do this we need to add helper functions to manipulate data into the required shape, just as we did earlier in this notebook to prepare each batch for training.

We also set up evaluation metrics which we want to record on the validation set.

In [None]:
# helper functions for selecting and converting model outputs
def _select_pred_and_mask(model_out_dict):
    return (model_out_dict["y_pred"].squeeze(), model_out_dict["mask"].squeeze())


def _select_max(pred_tensor):
    return pred_tensor.max(1)[1]


def _tensor_to_numpy(pred_tensor):
    return pred_tensor.squeeze().cpu().numpy()


def snapshot_function():
    return (trainer.state.iteration % snapshot_duration) == 0

evaluator = create_supervised_evaluator(
    model,
    prepare_batch,
    metrics={
        "nll": Loss(criterion, output_transform=_select_pred_and_mask),
        "pixacc": pixelwise_accuracy(
            n_classes, output_transform=_select_pred_and_mask, device=device
        ),
        "cacc": class_accuracy(n_classes, output_transform=_select_pred_and_mask),
        "mca": mean_class_accuracy(n_classes, output_transform=_select_pred_and_mask),
        "ciou": class_iou(n_classes, output_transform=_select_pred_and_mask),
        "mIoU": mean_iou(n_classes, output_transform=_select_pred_and_mask),
    },
    device=device,
)

trainer.add_event_handler(Events.EPOCH_COMPLETED, Evaluator(evaluator, val_loader))

evaluator.add_event_handler(
    Events.EPOCH_COMPLETED,
    logging_handlers.log_metrics(
        "Validation results",
        metrics_dict={
            "nll": "Avg loss :",
            "pixacc": "Pixelwise Accuracy :",
            "mca": "Avg Class Accuracy :",
            "mIoU": "Avg Class IoU :",
        },
    ),
)

evaluator.add_event_handler(
    Events.EPOCH_COMPLETED,
    tensorboard_handlers.log_metrics(
        summary_writer,
        trainer,
        "epoch",
        metrics_dict={
            "mIoU": "Validation/mIoU",
            "nll": "Validation/Loss",
            "mca": "Validation/MCA",
            "pixacc": "Validation/Pixel_Acc",
        },
    ),
)


transform_func = compose(np_to_tb, decode_segmap(n_classes=n_classes), _tensor_to_numpy)

transform_pred = compose(transform_func, _select_max)

evaluator.add_event_handler(
    Events.EPOCH_COMPLETED,
    create_image_writer(summary_writer, "Validation/Image", "image"),
)

evaluator.add_event_handler(
    Events.EPOCH_COMPLETED,
    create_image_writer(
        summary_writer, "Validation/Mask", "mask", transform_func=transform_func
    ),
)

evaluator.add_event_handler(
    Events.EPOCH_COMPLETED,
    create_image_writer(
        summary_writer, "Validation/Pred", "y_pred", transform_func=transform_pred
    ),
)

Start the training engine run.

In [None]:
trainer.run(train_loader, max_epochs=config.TRAIN.END_EPOCH, epoch_length=train_len, seed=config.SEED)

## Tensorboard
Using tensorboard to monitor runs can be quite enlightening. Just ensure that the appropriate port is open on the VM so you can access it. Below is the command for running tensorboard in your notebook. You can just as easily view it in a separate browser window by pointing the browser to the appropriate location and port.

In [None]:
if not papermill:
    %load_ext tensorboard

In [None]:
if not papermill:
    %tensorboard --logdir $output_dir --port 9001 --host 0.0.0.0

## Evaluation

We will next evaluate the performance of the model by looking at how well it predicts facies labels on samples from the test set.

We will use the following evaluation metrics:

- Pixel Accuracy (PA)
- Class Accuracy (CA)
- Mean Class Accuracy (MCA)
- Frequency Weighted intersection-over-union (FW IoU)
- Mean IoU (MIoU)
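
All of these metrics can be derived from a confusion matrix. The sketch below shows the usual definitions on a toy example; it is an illustration and not the `runningScore` implementation used later, and it assumes that every class occurs at least once in the ground truth (otherwise some divisions would be by zero).

```python
import numpy as np


def confusion_matrix(gt, pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    idx = n_classes * gt.ravel() + pred.ravel()
    return np.bincount(idx, minlength=n_classes**2).reshape(n_classes, n_classes)


def scores(cm):
    tp = np.diag(cm).astype(float)   # correctly classified pixels per class
    gt_count = cm.sum(axis=1)        # pixels per true class
    pred_count = cm.sum(axis=0)      # pixels per predicted class
    class_acc = tp / gt_count
    iou = tp / (gt_count + pred_count - tp)
    freq = gt_count / cm.sum()
    return {
        "PA": tp.sum() / cm.sum(),
        "CA": class_acc,
        "MCA": class_acc.mean(),
        "FW IoU": (freq * iou).sum(),
        "MIoU": iou.mean(),
    }


# Toy ground truth and prediction with 3 classes
gt = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
result = scores(confusion_matrix(gt, pred, n_classes=3))
print({k: np.round(v, 3) for k, v in result.items()})
```

Pixel accuracy can look good when one class dominates, which is why the class-averaged and IoU-based metrics are reported alongside it.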

You have an option here to use either the pre-trained model which we provided for you or to use the model which we just fine-tuned in this notebook. By default, we use the fine-tuned model, but you can change that in the cell below.

In [None]:
# use the model which we just fine-tuned
opts = ["TEST.MODEL_PATH", path.join(output_dir, f"model_f3_nb_{config.MODEL.NAME}_{train_len}.pth")]
# uncomment the line below to use the pre-trained model instead
# opts = ["TEST.MODEL_PATH", config.MODEL.PRETRAINED]
config.merge_from_list(opts)

In [None]:
model.load_state_dict(torch.load(config.TEST.MODEL_PATH))
model = model.to(device)

Next we load the test data and define the augmentations on it. 

In [None]:
# Augmentation
# augment entire sections with the same normalization
section_aug = Compose(
    [Normalize(mean=(config.TRAIN.MEAN,), std=(config.TRAIN.STD,), max_pixel_value=1,)]
)

# augment each patch and not the entire section which the patches are taken from
patch_aug = Compose(
    [
        Resize(
            config.TRAIN.AUGMENTATIONS.RESIZE.HEIGHT,
            config.TRAIN.AUGMENTATIONS.RESIZE.WIDTH,
            always_apply=True,
        ),
        PadIfNeeded(
            min_height=config.TRAIN.AUGMENTATIONS.PAD.HEIGHT,
            min_width=config.TRAIN.AUGMENTATIONS.PAD.WIDTH,
            border_mode=config.OPENCV_BORDER_CONSTANT,
            always_apply=True,
            mask_value=255,
        ),
    ]
)

# Process test data
pre_processing = compose_processing_pipeline(config.TRAIN.DEPTH, aug=patch_aug)
output_processing = output_processing_pipeline(config)

# Select the test split
split = TEST_SPLIT

labels = np.load(path.join(config.DATASET.ROOT, "test_once", split + "_labels.npy"))
section_file = path.join(config.DATASET.ROOT, "splits", "section_" + split + ".txt")
write_section_file(labels, section_file, config)

# Load test data
TestSectionLoader = get_test_loader(config)
test_set = TestSectionLoader(
    config.DATASET.ROOT, config.DATASET.NUM_CLASSES, split=split, is_transform=True, augmentations=section_aug
)
# needed to fix this bug in pytorch https://github.com/pytorch/pytorch/issues/973
# one of the workers will quit prematurely
torch.multiprocessing.set_sharing_strategy("file_system")
test_loader = data.DataLoader(
    test_set, batch_size=1, num_workers=config.WORKERS, shuffle=False
)

### Predict segmentation mask on the test data

For demonstration purposes and efficiency, we will predict segmentation masks on only a subset of the test data. More precisely, we will score `N_EVALUATE` images. If you would like to evaluate more images, set this variable to the desired number of images.
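
Drawing a random subset can be sketched with the standard library alone. The snippet below uses a plain list as a stand-in for the loaded test sections; the seed and sizes are hypothetical. It also guards against asking for more items than exist, since `random.sample` raises a `ValueError` in that case.

```python
import random

random.seed(0)  # hypothetical seed, only to make the draw repeatable

sections = list(range(50))  # stand-in for the loaded test sections
n_evaluate = 20

# Never ask for more items than exist
subset = random.sample(sections, min(n_evaluate, len(sections)))
print(len(subset))
```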

In [None]:
CLASS_NAMES = [
    "upper_ns",
    "middle_ns",
    "lower_ns",
    "rijnland_chalk",
    "scruff",
    "zechstein",
]

n_classes = len(CLASS_NAMES)

# keep only N_EVALUATE sections to score
test_subset = random.sample(list(test_loader), N_EVALUATE)

results = list()
running_metrics_split = runningScore(n_classes)

# testing mode
with torch.no_grad():
    model.eval()
    # loop over testing data
    for i, (images, labels) in enumerate(test_subset):
        logger.info(f"split: {split}, section: {i}")
        outputs = patch_label_2d(
            model,
            images,
            pre_processing,
            output_processing,
            config.TRAIN.PATCH_SIZE,
            config.TEST.TEST_STRIDE,
            config.VALIDATION.BATCH_SIZE_PER_GPU,
            device,
            n_classes,
        )

        pred = outputs.detach().max(1)[1].numpy()
        gt = labels.numpy()
        
        # update evaluation metrics
        running_metrics_split.update(gt, pred)
        
        # keep ground truth and result for plotting
        results.append((np.squeeze(gt), np.squeeze(pred)))
        

Let's view the obtained metrics on this subset of test images. Note that we trained our model for a small number of epochs, for demonstration purposes, so the performance results here are not meant to be representative.

Performance exceeds what is shown here when the models are trained properly. For the full report on benchmarking performance results, please refer to the [README.md](../../../README.md) file.

In [None]:
# get scores
score, _ = running_metrics_split.get_scores()

# Log split results
print(f'Pixel Acc: {score["Pixel Acc: "]:.3f}')
for cdx, class_name in enumerate(CLASS_NAMES):
    print(f'  {class_name}_accuracy {score["Class Accuracy: "][cdx]:.3f}')

print(f'Mean Class Acc: {score["Mean Class Acc: "]:.3f}')
print(f'Freq Weighted IoU: {score["Freq Weighted IoU: "]:.3f}')
print(f'Mean IoU: {score["Mean IoU: "]:0.3f}')

### Visualize predictions

Let's visualize the predictions on entire test sections. Note that the crosslines and inlines have different dimensions. However, we were able to use them jointly for network training and evaluation because we used smaller patches from the sections, whose size we can control via a hyperparameter in the experiment configuration file.

In [None]:
fig = plt.figure(figsize=(15, 50))
# only plot a few images
nplot = min(N_EVALUATE, 10)
for idx in range(nplot):
    # plot actual
    plt.subplot(nplot, 2, 2 * (idx + 1) - 1)
    plt.imshow(results[idx][0])
    # plot predicted
    plt.subplot(nplot, 2, 2 * (idx + 1))
    plt.imshow(results[idx][1])
    
f_axes = fig.axes
_ = f_axes[0].set_title("Actual")
_ = f_axes[1].set_title("Predicted")
fig.savefig("plot_predictions.png")