<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#dhSegment-demonstration" data-toc-modified-id="dhSegment-demonstration-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>dhSegment demonstration</a></span><ul class="toc-item"><li><span><a href="#Setup" data-toc-modified-id="Setup-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Setup</a></span></li><li><span><a href="#Preparing-the-data" data-toc-modified-id="Preparing-the-data-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Preparing the data</a></span></li><li><span><a href="#Training-the-model" data-toc-modified-id="Training-the-model-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Training the model</a></span></li><li><span><a href="#Inference-and-new-annotations" data-toc-modified-id="Inference-and-new-annotations-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Inference and new annotations</a></span></li></ul></li></ul></div>

<a href="https://colab.research.google.com/github/dhlab-epfl/dhSegment-torch/blob/master/demo/dhSegment_demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# dhSegment demonstration

This notebook demonstrates how to:
1. Read a VIA annotation file and produce the data needed to train dhSegment
2. Train a dhSegment model
3. Use the trained model to predict new annotations and save them in the VIA format.

## Setup

The first three cells install dhSegment in the Colab notebook, import the required modules, and load the TensorBoard extension used to monitor the training process.

In [None]:
!pip install git+https://github.com/dhlab-epfl/dhSegment-torch.git@master

In [None]:
from dh_segment_torch.config import Params

from dh_segment_torch.data import DataSplitter
from dh_segment_torch.data.annotation import AnnotationWriter

from dh_segment_torch.training import Trainer

from dh_segment_torch.inference import PredictProcess
from dh_segment_torch.post_processing import PostProcessingPipeline

import os
import torch

In [None]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

Training dhSegment effectively requires a GPU. The following cell tells you whether the GPU runtime of Google Colab is enabled and how much memory is available.

If the cell raises an error, check that you have a 'GPU' runtime in the menu > Runtime > Change Runtime Type > Hardware Accelerator. Then rerun all the cells above.

If the cell runs successfully, it prints the amount of GPU memory available. Please remember this value, as it is used to choose the batch size in the training part.

In [None]:
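# Fail early with a clear message if no GPU is visible
assert torch.cuda.device_count() >= 1, "No GPU found: enable a GPU runtime and rerun the cells above."

# total_memory is reported in bytes; divide by 1024**3 to get gigabytes
print("The GPU has %.2f GB of memory." % (torch.cuda.get_device_properties(0).total_memory / 1024**3))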

## Preparing the data

The following cell contains the parameters for loading the data and creating the necessary data for training.

Important parameters are:
- `file_path` which is the path to the annotation file
- `attrib_name` which is the name of the attribute to consider in the VIA file
- `images_dir` which is the directory containing the images (not needed when using IIIF)

Any VIA annotations can be used with this demo. We provide images of a Venetian document together with annotations of its columns and rows. They are available here: https://drive.switch.ch/index.php/s/t63roYZBEUZIl1U.

The archive should be downloaded, unzipped and either uploaded to your personal Google Drive account or uploaded directly to this notebook.

To check that the files are there, click on the folder icon in the left panel, where you can also mount your own Google Drive account.
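Before running the data preparation, it can help to verify the paths programmatically. The following is a minimal sketch using only the standard library; `find_missing_inputs` is a hypothetical helper written for this demo, not part of dhSegment, and the paths are the ones used in the parameters cell below.

```python
import os

# In Colab, Google Drive is usually mounted first with:
#   from google.colab import drive
#   drive.mount('/content/drive')

def find_missing_inputs(file_path, images_dir):
    """Return a list of problems found; an empty list means the inputs look fine."""
    problems = []
    if not os.path.isfile(file_path):
        problems.append(f"annotation file not found: {file_path}")
    if not os.path.isdir(images_dir):
        problems.append(f"images directory not found: {images_dir}")
    elif not os.listdir(images_dir):
        problems.append(f"images directory is empty: {images_dir}")
    return problems

# Paths assumed to match the parameters cell of this demo.
problems = find_missing_inputs(
    "/content/drive/My Drive/sample_catastici/via_catastici_annotated.json",
    "/content/drive/My Drive/sample_catastici",
)
print("\n".join(problems) if problems else "All inputs found.")
```

If the helper reports a problem, the most likely causes are an unmounted Drive or a different folder name after unzipping.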

In [None]:
params = {
    'data_path' : '/content/data', # Path to write the data
    'data_splitter': {'train_ratio': 0.8, 'val_ratio': 0.2, 'test_ratio': 0.0}, # splitting ratio of the data
    'annotation_reader': {
         'type': 'via2_project', # File format of the data
         'attrib_name': 'lines', # Name of the via attribute where the labels are defined
         'file_path': '/content/drive/My Drive/sample_catastici/via_catastici_annotated.json', # Path to the annotation file
         'images_dir': '/content/drive/My Drive/sample_catastici', # Path to the images directory (not necessary if using IIIF)
         'line_thickness': 4, # Thickness of the lines to draw
    },
    'color_labels': {
        'type': 'colors', # Colors are defined explicitly for each label
        'colors': ['#93be59', '#be5993'], # Colors
        'labels': ['row', 'column'] # Corresponding labels
    },
    'copy_images': True, # Whether to copy the images
    'overwrite': True, # Whether to overwrite the images
    'progress': True, # Whether to show progress
    'resizer': {'height': 1100} # Size to which the images should be resized while importing them (useful, since we resize them anyway in the processing pipeline)
}
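The `colors` and `labels` lists in `color_labels` are paired by position, so `'#93be59'` stands for `row` and `'#be5993'` for `column`. The sketch below shows how such hex strings translate to RGB triplets; `hex_to_rgb` is a hypothetical helper for illustration and is not meant to reflect how dhSegment parses colors internally.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of integers in 0..255."""
    color = color.lstrip("#")
    if len(color) != 6:
        raise ValueError(f"expected 6 hex digits, got {color!r}")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

# Same values as in the parameters cell above.
colors = ['#93be59', '#be5993']
labels = ['row', 'column']

# Pair each label with its color by position.
label_to_rgb = {label: hex_to_rgb(color) for label, color in zip(labels, colors)}
print(label_to_rgb)
# → {'row': (147, 190, 89), 'column': (190, 89, 147)}
```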

The following cell prepares the data according to the parameters defined above. It is not necessary to understand its content.

In [None]:
num_processes = params.pop("num_processes", 4)

data_path = params.pop("data_path")

os.makedirs(data_path, exist_ok=True)

relative_path = params.pop("relative_path", True)

params.setdefault("labels_dir", os.path.join(data_path, "labels"))
labels_dir = params.get("labels_dir")

params.setdefault("images_dir", os.path.join(data_path, "images"))
images_dir = params.get("images_dir")

params.setdefault(
    "color_labels_file_path", os.path.join(data_path, "color_labels.json")
)
params.setdefault("csv_path", os.path.join(data_path, "data.csv"))

data_splitter_params = params.pop("data_splitter", None)
train_csv_path = params.pop("train_csv", os.path.join(data_path, "train.csv"))
val_csv_path = params.pop("val_csv", os.path.join(data_path, "val.csv"))
test_csv_path = params.pop("test_csv", os.path.join(data_path, "test.csv"))

params.setdefault("type", "image")
image_writer = AnnotationWriter.from_params(params)
data = image_writer.write(num_processes)

if relative_path:
    data['image'] = data['image'].apply(lambda path: os.path.join("images", os.path.basename(path)))
    data['label'] = data['label'].apply(lambda path: os.path.join("labels", os.path.basename(path)))

if data_splitter_params:
    data_splitter = DataSplitter.from_params(data_splitter_params)
    data_splitter.split_data(data, train_csv_path, val_csv_path, test_csv_path)
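To picture what the `data_splitter` ratios mean, here is a rough sketch of a ratio-based split. `split_indices` is a hypothetical function, and the shuffling, rounding and sum-to-one check are assumptions of this sketch, so the actual `DataSplitter` may behave differently.

```python
import random

def split_indices(n_items, train_ratio, val_ratio, test_ratio, seed=0):
    """Shuffle item indices and cut them into train / val / test parts."""
    # Assumption of this sketch: the three ratios describe the whole dataset.
    if abs(train_ratio + val_ratio + test_ratio - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = round(n_items * train_ratio)
    n_val = round(n_items * val_ratio)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test

# With the ratios used in this demo, 10 images give 8 / 2 / 0.
train, val, test = split_indices(10, train_ratio=0.8, val_ratio=0.2, test_ratio=0.0)
print(len(train), len(val), len(test))
```

With a `test_ratio` of 0.0, as in this demo, no images are held out for testing, so the validation scores are the only quality estimate available.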

## Training the model

The following cell contains the configuration for the model and its training.

The defaults should be fine for most use cases.

The only parameter that may require tweaking is the batch size, which must be set according to the amount of memory the GPU has. If the GPU has more than 14 GB of memory, a batch size of 4 is fine; otherwise, set the batch size to 2.
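The rule of thumb for the batch size can be written as a small function. This is only a sketch of the guideline stated in this notebook; `suggest_batch_size` and its `threshold_gb` argument are hypothetical names.

```python
def suggest_batch_size(gpu_memory_gb, threshold_gb=14.0):
    """Return 4 when the GPU has more than `threshold_gb` of memory, else 2."""
    return 4 if gpu_memory_gb > threshold_gb else 2

# In the notebook, the memory could be read with
#   torch.cuda.get_device_properties(0).total_memory / 1024**3
# Plain numbers are used here so the example runs without a GPU.
for memory_gb in (11.2, 15.0):
    print(f"{memory_gb} GB -> batch size {suggest_batch_size(memory_gb)}")
```

The returned value would then be placed in the `batch_size` entry of the training parameters.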

In [None]:
params = {
        "color_labels": {"label_json_file": '/content/data/color_labels.json'}, # Color labels produced before
        "train_dataset": {
            "type": "image_csv", # Image csv dataset
            "csv_filename": "/content/data/train.csv",
            "base_dir": "/content/data",
            "repeat_dataset": 4, # Repeat the data 4 times since we have little of it
            "compose": {"transforms": [{"type": "fixed_size_resize", "output_size": 1e6}]} # Resize to a fixed size, could add other transformations.
        },
        "val_dataset": {
            "type": "image_csv", # Validation dataset
            "csv_filename": "/content/data/val.csv",
            "base_dir": "/content/data",
            "compose": {"transforms": [{"type": "fixed_size_resize", "output_size": 1e6}]}
        },
        "model": { # Model definition, original dhSegment
            "encoder": "resnet50", 
            "decoder": {
                "decoder_channels": [512, 256, 128, 64, 32],
                "max_channels": 512
            }
        },
        "metrics": [['miou', 'iou'], ['iou', {"type": 'iou', "average": None}], 'precision'], # Metrics to compute
        "optimizer": {"lr": 1e-4}, # Learning rate
        "lr_scheduler": {"type": "exponential", "gamma": 0.9995}, # Exponential decreasing learning rate
        "val_metric": "+miou", # Metric to observe to consider a model better than another, the + indicates that we want to maximize
        "early_stopping": {"patience": 4}, # Number of validation steps without increase to tolerate, stops if reached
        "model_out_dir": "./model_test", # Path to model output
        "num_epochs": 100, # Number of epochs for training
        "evaluate_every_epoch": 5, # Number of epochs between each validation of the model
        "batch_size": 4, # Batch size (to be changed if the allocated GPU has little memory)
        "num_data_workers": 0,
        "track_train_metrics": False,
        "loggers": [
           {"type": 'tensorboard', "log_dir": "./model_cadaster_test/log", "log_every": 5, "log_images_every": 10}, # Tensorboard logging
           ]
    }
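To get a feeling for the `lr_scheduler` setting, the sketch below computes the learning rate under exponential decay, where each scheduler step multiplies the rate by `gamma`. Whether the trainer steps the scheduler once per batch or once per epoch is not stated here, so the step counts are illustrative only; `exponential_lr` is a hypothetical helper.

```python
def exponential_lr(initial_lr, gamma, num_steps):
    """Learning rate after `num_steps` multiplications by `gamma`."""
    return initial_lr * gamma ** num_steps

# Values taken from the configuration above: lr=1e-4, gamma=0.9995.
for steps in (0, 1000, 5000, 10000):
    lr = exponential_lr(1e-4, 0.9995, steps)
    print(f"after {steps:>5} steps: lr = {lr:.2e}")
```

With `gamma` at 0.9995, the learning rate falls to about 61% of its initial value after 1000 steps, so the decay is gentle over short runs.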


The following cell launches a TensorBoard instance. If it fails, rerun the cell and wait until you see the orange TensorBoard bar.

In [None]:
%tensorboard --logdir ./

The following cell generates the trainer from the parameters defined above and trains the model.

The training process can then be observed in the TensorBoard window above.

Training can be interrupted once a good enough result has been obtained, as checkpoints are created automatically.

In [None]:
trainer = Trainer.from_params(params)
trainer.train()
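Training may also end before `num_epochs` is reached because of the `early_stopping` setting. The sketch below illustrates the general idea of patience-based early stopping on a metric to maximize, as with `"+miou"`. It is an assumption-laden illustration with a hypothetical `early_stop_epoch` function, not the trainer's actual code.

```python
def early_stop_epoch(val_scores, patience, evaluate_every_epoch=1):
    """Return the epoch at which training would stop, or None if it never stops.

    `val_scores` holds one score per validation run, higher being better.
    """
    best = float("-inf")
    evals_without_improvement = 0
    for eval_index, score in enumerate(val_scores, start=1):
        if score > best:
            best = score
            evals_without_improvement = 0
        else:
            evals_without_improvement += 1
            if evals_without_improvement >= patience:
                return eval_index * evaluate_every_epoch
    return None

# Made-up mIoU values, one per validation run (every 5 epochs, patience of 4).
scores = [0.50, 0.62, 0.61, 0.60, 0.62, 0.59]
print(early_stop_epoch(scores, patience=4, evaluate_every_epoch=5))
# → 30
```

In this made-up run the best score is reached at the second validation, and the four following validations bring no improvement, so training would stop at epoch 30.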

## Inference and new annotations

This part will be documented at a later time.

In [None]:

post_process_params = {
    'type': 'dag',
    'operations': {
        "h_lines": {
            "inputs": "h_probs",
            "ops": [
                {"type": "filter_gaussian", "sigma": 1.2},
                {
                    "type": "threshold_hysteresis",
                    "low_threshold": 0.1,
                    "high_threshold": 0.4,
                },
                {"type": "horizontal_lines_page", "angle_variance": 5, "vote_threshold": 100},
                {"type": "lines_filter", "dist_thresh": 40},
                "to_line",
                {'type': "assign_label", "label": "row"}
            ],
        },
        "v_lines": {
            "inputs": "v_probs",
            "ops": [
                {"type": "filter_gaussian", "sigma": 1.2},
                {
                    "type": "threshold_hysteresis",
                    "low_threshold": 0.1,
                    "high_threshold": 0.4,
                },
                {"type": "vertical_lines_page", "angle_variance": 5, "vote_threshold": 100},
                {"type": "lines_filter", "dist_thresh": 40},
                "to_line",
                {'type': "assign_label", "label": "column"}
            ],
        },
        "mask_size": {
            "inputs": "h_probs",
            "ops": "probas_to_image_size"
        },
        'labels_annotations': {
            'inputs': ['h_lines', 'v_lines'],
            "ops": ["concat_lists", 'to_labels_annotations']
        },
        'labels_annotations_normalized': {
            'inputs': ['mask_size', 'labels_annotations'],
            'ops': "normalize_labels_annotations"
        },
        'annotation': {
            'inputs': ['path', 'labels_annotations_normalized'],
            'ops': 'to_annotation'
        }
    }
}

dataset_params = {
    "type": "folder",
    "folder": "/content/drive/My Drive/sample_catastici",
    "pre_processing": {"transforms": [{"type": "fixed_size_resize", "output_size": 1e6}]}
}

model_params = {
    "model": {
        "encoder": "resnet50",
        "decoder": {"decoder_channels": [512, 256, 128, 64, 32], "max_channels": 512}
    },
    "num_classes": 3,
    "model_state_dict": "./model_test/best_checkpoint", # To be completed
    "device": "cuda:0"
}


process_params = Params({
    'data': dataset_params,
    'model': model_params,
    'post_process': post_process_params,
    'batch_size': 4,
    'num_workers': 4,
    'index_to_name': {1: 'h_probs', 2: 'v_probs'},
    'output_names': 'annotation',
    'add_path': True
})
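Both line pipelines above apply a `threshold_hysteresis` step with a low and a high threshold. As a rough illustration of the general technique (not necessarily how dhSegment implements it), the sketch below keeps every pixel above the high threshold, plus any pixel above the low threshold that is connected to one of them. The function name, the use of 8-connectivity and the toy probability map are assumptions.

```python
from collections import deque

def hysteresis_threshold(probs, low, high):
    """Return a boolean mask for a 2D list of probabilities.

    Pixels above `high` are seeds; pixels above `low` are kept only when
    they are connected (8-connectivity) to a seed through kept pixels.
    """
    rows, cols = len(probs), len(probs[0])
    mask = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Start from the strong pixels.
    for r in range(rows):
        for c in range(cols):
            if probs[r][c] > high:
                mask[r][c] = True
                queue.append((r, c))
    # Grow the mask into neighbouring weak pixels.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not mask[nr][nc] and probs[nr][nc] > low:
                    mask[nr][nc] = True
                    queue.append((nr, nc))
    return mask

# Toy probability map: the weak pixels in the last row are isolated from
# the strong pixel in the first row, so they are discarded.
probs = [
    [0.05, 0.20, 0.50, 0.20, 0.05],
    [0.05, 0.05, 0.05, 0.05, 0.05],
    [0.20, 0.30, 0.05, 0.05, 0.05],
]
for row in hysteresis_threshold(probs, low=0.1, high=0.4):
    print("".join("#" if kept else "." for kept in row))
```

Compared with a single threshold, this keeps faint parts of a line that touch a confident detection while dropping faint responses that stand alone.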

In [None]:
predict_annots = PredictProcess.from_params(process_params)
annots = predict_annots.process()

In [None]:
annotation_writer_params = Params({
    'type': 'via2',
    'attrib_name': 'type',
    'json_path': './annotations.json',
    'annotation_iterator': annots
})
writer = AnnotationWriter.from_params(annotation_writer_params)
writer.write(num_workers=8)