Skeletonization: supervised, inference
======================================

In this notebook we use the supervised module to extract body length and head width using a model trained on manually annotated data. We will use the script `scripts/skeletons/main_supervised_skeleton_inference.py` to extract skeletons from the clips. 

We first import the necessary libraries: 

In [6]:
import argparse
import os
import sys
import torch
import cv2

from datetime import datetime
from pathlib import Path
from PIL import Image
from matplotlib import pyplot as plt
from skimage.morphology import thin
from torchvision import transforms
from tqdm import tqdm

import numpy as np
import pandas as pd
import pytorch_lightning as pl
import yaml

from mzbsuite.skeletons.mzb_skeletons_pilmodel import MZBModel_skels
from mzbsuite.skeletons.mzb_skeletons_helpers import paint_image_tensor, Denormalize
from mzbsuite.utils import cfg_to_arguments, find_checkpoints

# Set the thread layer used by MKL
os.environ["MKL_THREADING_LAYER"] = "GNU"

We also need to set some run parameters for the script: 

In [7]:
# Forward slashes avoid invalid escape sequences such as "\m" and work on Windows too
ROOT_DIR = Path("D:/mzb-workflow")  # on Linux, e.g. Path("/data/shared/mzb-workflow")
MODEL = "mit-b2-v1"

arguments = {
    "config_file": ROOT_DIR / "configs/mzb_example_config.yaml",
    "input_dir": ROOT_DIR / "data/bgb/derived/blobs",
    "input_type": "external", 
    "input_model": ROOT_DIR / f"models/mzb-skeleton-models/{MODEL}", 
    "output_dir": ROOT_DIR / "results/bgb/skeletons/skeletons_supervised",
    "save_masks": ROOT_DIR / "data/bgb/skeletons/skeletons_supervised", 
    "verbose": True,
}
    
with open(str(arguments["config_file"]), "r") as f:
    cfg = yaml.load(f, Loader=yaml.FullLoader)

# cfg["trcl_gpu_ids"] = None
print(arguments)

{'config_file': WindowsPath('D:/mzb-workflow/configs/mzb_example_config.yaml'), 'input_dir': WindowsPath('D:/mzb-workflow/data/bgb/derived/blobs'), 'input_type': 'external', 'input_model': WindowsPath('D:/mzb-workflow/models/mzb-skeleton-models/mit-b2-v1'), 'output_dir': WindowsPath('D:/mzb-workflow/results/bgb/skeletons/skeletons_supervised'), 'save_masks': WindowsPath('D:/mzb-workflow/data/bgb/skeletons/skeletons_supervised'), 'verbose': True}
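
Before launching a long inference run, it can help to check that the configured paths actually exist. The following is a minimal sketch and not part of `mzbsuite`: the helper `missing_paths` and the choice of which keys must exist beforehand are assumptions made for illustration.

```python
from pathlib import Path


def missing_paths(arguments, required_keys):
    """Return the keys in `required_keys` whose path does not exist on disk.

    Hypothetical helper for illustration; it is not part of mzbsuite.
    """
    missing = []
    for key in required_keys:
        path = Path(arguments[key])
        if not path.exists():
            missing.append(key)
    return missing


# Example with a throwaway dictionary. With the notebook's `arguments`, one would
# probably check "config_file", "input_dir" and "input_model".
example_arguments = {
    "input_dir": Path("."),  # the current directory always exists
    "input_model": Path("this/path/does/not/exist"),
}
print(missing_paths(example_arguments, ["input_dir", "input_model"]))
```

Output directories are deliberately left out of the check here, on the assumption that the scripts create them when needed.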


Convert both dictionaries to argument namespaces that the scripts can parse. 

In [8]:
# Transforms configurations dicts to argparse arguments
args = cfg_to_arguments(arguments)
cfg = cfg_to_arguments(cfg)
print(str(cfg))

{'glob_random_seed': 222, 'glob_root_folder': '/home/jovyan/work/mzb-workflow/', 'glob_blobs_folder': '/home/jovyan/work/mzb-workflow/data/derived/blobs/', 'glob_local_format': 'pdf', 'model_logger': 'wandb', 'impa_image_format': 'jpg', 'impa_clip_areas': [2700, 4700, -1, -1], 'impa_area_threshold': 5000, 'impa_gaussian_blur': [21, 21], 'impa_gaussian_blur_passes': 3, 'impa_adaptive_threshold_block_size': 351, 'impa_mask_postprocess_kernel': [11, 11], 'impa_mask_postprocess_passes': 5, 'impa_bounding_box_buffer': 200, 'impa_save_clips_plus_features': True, 'lset_class_cut': 'order', 'lset_val_size': 0.1, 'trcl_learning_rate': 0.0001, 'trcl_batch_size': 8, 'trcl_weight_decay': 0, 'trcl_step_size_decay': 5, 'trcl_number_epochs': 75, 'trcl_save_topk': 1, 'trcl_num_classes': 8, 'trcl_model_pretrarch': 'convnext-small', 'trcl_num_workers': 16, 'trcl_wandb_project_name': 'mzb-classifiers', 'trcl_logger': 'wandb', 'trsk_learning_rate': 0.001, 'trsk_batch_size': 32, 'trsk_weight_decay': 0, 'tr
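
To make the conversion concrete: a helper of this kind typically wraps a dictionary so that its keys become attributes, which is what the scripts expect from an `argparse.Namespace`. The sketch below uses only the standard library. `dict_to_namespace` is a hypothetical stand-in, and the actual implementation of `cfg_to_arguments` in `mzbsuite.utils` may differ.

```python
from argparse import Namespace


def dict_to_namespace(options):
    """Wrap a flat dictionary so that its keys are accessible as attributes."""
    return Namespace(**options)


example = dict_to_namespace({"input_type": "external", "verbose": True})
print(example.input_type)  # attribute access instead of example["input_type"]
print(vars(example))       # vars() recovers the underlying dictionary
```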

We can load the code necessary to run the inference from the dedicated script, and call it with the arguments specified above. 

In [9]:
from scripts.skeletons.main_supervised_skeleton_inference import main as inference_skeleton
?inference_skeleton

Signature: inference_skeleton(args, cfg)
Docstring:
Function to run inference of skeletons (body, head) on macrozoobenthos images clips, using a trained model.

Parameters
----------
args : argparse.Namespace
    Namespace containing the arguments passed to the script. Notably:

        - input_dir: path to the directory containing the images to be classified
        - input_type: type of input data, either "val" or "external"
        - input_model: path to the directory containing the model to be used for inference
        - output_dir: path to the directory where the results will be saved
        - save_masks: path to the directory where the masks will be saved
        - config_file: path to the config file with train / inference parameters

cfg : dict
    Dictionary containing the configuration parameters.

Returns
-------
None. Saves the results in the specified folder.

Now we can call the function and run the inference on the images. 

In [10]:
inference_skeleton(args, cfg)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Predicting DataLoader 0: 100%|██████████| 13/13 [00:05<00:00,  2.35it/s]
Neural network predictions done, refining and saving skeletons...


100%|██████████| 100/100 [02:49<00:00,  1.70s/it]


This produces a `.csv` file with the predicted body length and head width for each clip (saved in `output_dir`), as well as the predicted body and head skeletons for each clip (saved in `save_masks`). 
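
A quick way to inspect the predictions is to print the header and the first few rows of the file. This is a minimal sketch using only the standard library. It makes no assumption about column names, but the file name `size_skel_supervised_model.csv` and the relative path are assumptions (the name is taken from the assessment step further down), so adjust them to your own `output_dir`. `preview_csv` is a hypothetical helper, not part of `mzbsuite`.

```python
import csv
from pathlib import Path


def preview_csv(csv_path, n_rows=5):
    """Return the header and the first `n_rows` data rows of a CSV file."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


# Assumed location of the predictions file; adjust to your own `output_dir`.
predictions_file = (
    Path("results/bgb/skeletons/skeletons_supervised")
    / "size_skel_supervised_model.csv"
)
if predictions_file.exists():
    header, rows = preview_csv(predictions_file)
    print(header)
    for row in rows:
        print(row)
else:
    print(f"No predictions file found at {predictions_file}")
```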

We can also visualise the results and assess model accuracy against a manually annotated validation set, if available. First we need to provide some additional arguments: 

In [11]:
arguments['input_dir'] = ROOT_DIR / "data/mzb_example_data/derived/blobs"
arguments['manual_annotations'] = ROOT_DIR / "data/mzb_example/skeletons/supervised_skeletons/manual_anns/manual_annotations_summary.csv"
arguments['model_annotations'] = ROOT_DIR / "results/mzb_example_data/skeletons/skeletons_supervised/size_skel_supervised_model.csv"
arguments['output_dir'] = ROOT_DIR / "results/mzb_example_data/skeletons/skeletons_supervised"
print(arguments)

{'config_file': WindowsPath('D:/mzb-workflow/configs/mzb_example_config.yaml'), 'input_dir': WindowsPath('D:/mzb-workflow/data/mzb_example_data/derived/blobs'), 'input_type': 'external', 'input_model': WindowsPath('D:/mzb-workflow/models/mzb-skeleton-models/mit-b2-v1'), 'output_dir': WindowsPath('D:/mzb-workflow/results/mzb_example_data/skeletons/skeletons_supervised'), 'save_masks': WindowsPath('D:/mzb-workflow/data/bgb/skeletons/skeletons_supervised'), 'verbose': True, 'manual_annotations': WindowsPath('D:/mzb-workflow/data/mzb_example/skeletons/supervised_skeletons/manual_anns/manual_annotations_summary.csv'), 'model_annotations': WindowsPath('D:/mzb-workflow/results/mzb_example_data/skeletons/skeletons_supervised/size_skel_supervised_model.csv')}


In [12]:
args = cfg_to_arguments(arguments)
print(str(args))

{'config_file': WindowsPath('D:/mzb-workflow/configs/mzb_example_config.yaml'), 'input_dir': WindowsPath('D:/mzb-workflow/data/mzb_example_data/derived/blobs'), 'input_type': 'external', 'input_model': WindowsPath('D:/mzb-workflow/models/mzb-skeleton-models/mit-b2-v1'), 'output_dir': WindowsPath('D:/mzb-workflow/results/mzb_example_data/skeletons/skeletons_supervised'), 'save_masks': WindowsPath('D:/mzb-workflow/data/bgb/skeletons/skeletons_supervised'), 'verbose': True, 'manual_annotations': WindowsPath('D:/mzb-workflow/data/mzb_example/skeletons/supervised_skeletons/manual_anns/manual_annotations_summary.csv'), 'model_annotations': WindowsPath('D:/mzb-workflow/results/mzb_example_data/skeletons/skeletons_supervised/size_skel_supervised_model.csv')}


In [13]:
from scripts.skeletons.main_supervised_skeleton_assessment import main as assess_skeletons
?assess_skeletons

Signature: assess_skeletons(args, cfg)
Docstring:
Main function to run an assessment of the length measurements.
Computes the absolute error between manual annotations and model predictions, and reports plots grouped by species.

Parameters
----------
args: argparse.Namespace
    Arguments parsed from the command line. Specifically:
    
        - args.input_dir: path to the directory with the model predictions
        - args.manual_annotations: path to the manual annotations
        - args.model_annotations: path to the model predictions
        - args.output_dir: path to the directory where plots accuracy report should be saved

cfg: argparse.Namespace
    configuration options.

Returns
-------
None. Plots and metrics are saved in the results directory.
File:      d:\mzb-workflow\scripts\skeletons\main_supervised_skeleton_assessment.py
Type:      function

In [None]:
assess_skeletons(args, cfg)
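
The docstring describes the metric as the absolute error between manual annotations and model predictions, grouped by species. As a rough illustration of that computation, and not the script's actual code, the sketch below averages the absolute error per group. The function name, the tuple layout and the toy values are all made up for the example.

```python
from collections import defaultdict
from statistics import mean


def mean_absolute_error_by_group(records):
    """Average |manual - predicted| per group.

    `records` is an iterable of (group, manual_value, predicted_value) tuples.
    """
    errors = defaultdict(list)
    for group, manual, predicted in records:
        errors[group].append(abs(manual - predicted))
    return {group: mean(values) for group, values in errors.items()}


# Made-up measurements, for illustration only (not results from the model).
toy_records = [
    ("taxon_a", 10.0, 9.0),
    ("taxon_a", 12.0, 15.0),
    ("taxon_b", 5.0, 5.5),
]
print(mean_absolute_error_by_group(toy_records))
```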