# **MitoEM Benchmark**

This notebook aims to produce a reproducible benchmark for the [Connectomics MitoEM tutorial](https://connectomics.readthedocs.io/en/latest/tutorials/mito.html). Both the evaluation data and a pre-trained model are provided. Because of resource limitations, inference is performed on only ten slices of data.

## (1) Install dependencies and fetch prepared data

In [None]:
%%capture
# install Connectomics and dependencies (the %%capture cell magic must be the first line of the cell)
! pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113
! git clone https://github.com/zudi-lin/pytorch_connectomics.git
%cd pytorch_connectomics/
! pip install --editable .
! pip install "numpy<1.24"

In [None]:
# fetch the evaluation data (images and labels) from Hugging Face, then clean up the archives
! mkdir -p datasets/MitoEM
! wget -q -O datasets/MitoEM/EM30-R-im.zip --show-progress "https://huggingface.co/datasets/pytc/EM30/resolve/main/EM30-R-im.zip?download=true"
! unzip -q datasets/MitoEM/EM30-R-im.zip -d datasets/MitoEM/EM30-R-im
! rm -r datasets/MitoEM/EM30-R-im/__MACOSX
! rm datasets/MitoEM/EM30-R-im.zip
! wget -q -O datasets/MitoEM/mito_val.zip --show-progress "https://huggingface.co/datasets/pytc/MitoEM/resolve/main/EM30-R-mito-train-val-v2.zip?download=true"
! unzip -q datasets/MitoEM/mito_val.zip -d datasets/MitoEM/EM30-R-val
! rm datasets/MitoEM/mito_val.zip

# fetch pre-trained model weights from Hugging Face
! mkdir -p outputs/MitoEM
! wget -q -O outputs/MitoEM/mito_u3d-bc_mitoem_300k.pth.tar --show-progress "https://huggingface.co/pytc/mito/resolve/main/mito_u3d-bc_mitoem_300k.pth.tar?download=true"
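
Because the download commands run with `-q`, a failed fetch can go unnoticed until inference fails. The following is a minimal sketch of a sanity check, assuming the directory layout used by the later cells of this notebook; `find_missing` is a hypothetical helper, not part of the Connectomics package.

```python
from pathlib import Path

# Paths taken from the configuration and evaluation cells of this notebook.
EXPECTED_PATHS = [
    "datasets/MitoEM/EM30-R-im/im",
    "datasets/MitoEM/EM30-R-val/mito-val-v2",
    "outputs/MitoEM/mito_u3d-bc_mitoem_300k.pth.tar",
]

def find_missing(paths, root="."):
    """Return the entries of `paths` that are absent or empty under `root`."""
    missing = []
    for rel in paths:
        p = Path(root) / rel
        if p.is_dir():
            ok = any(p.iterdir())  # a directory counts only if it has content
        else:
            ok = p.is_file() and p.stat().st_size > 0
        if not ok:
            missing.append(rel)
    return missing

print("missing or empty:", find_missing(EXPECTED_PATHS))
```

An empty list suggests the data and weights landed where the rest of the notebook expects them.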

## (2) Specify model inference parameters
The model configuration provided in the [Connectomics GitHub repo](https://github.com/zudi-lin/pytorch_connectomics.git) must be modified to accommodate the resource limitations of Colab. Important configuration keys for inference include:

* SYSTEM.NUM_GPUS: the number of GPUs available for inference
* DATASET.INPUT_PATH: the directory where the images are stored
* INFERENCE.IMAGE_NAME: the image file to run inference on
* INFERENCE.OUTPUT_PATH: the directory where the results are written
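
The dotted names in this list map onto nested sections of the YAML configuration: the part before the dot is the section, the part after it is the key inside that section. A small sketch of that mapping on a plain dictionary follows; `get_key` and `set_key` are hypothetical helpers written for illustration, not functions from the Connectomics package.

```python
def get_key(cfg, dotted):
    """Look up a dotted name such as 'INFERENCE.IMAGE_NAME' in a nested dict."""
    node = cfg
    for part in dotted.split("."):
        node = node[part]
    return node

def set_key(cfg, dotted, value):
    """Set a dotted name in a nested dict, creating intermediate sections as needed."""
    *parents, leaf = dotted.split(".")
    node = cfg
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value

# A toy configuration mirroring a few entries of the YAML used in this notebook.
cfg = {
    "SYSTEM": {"NUM_GPUS": 1},
    "INFERENCE": {"IMAGE_NAME": "imstack_400_410.tif"},
}
set_key(cfg, "INFERENCE.OUTPUT_PATH", "outputs/MitoEM/EM30-R-im/results/")
print(get_key(cfg, "SYSTEM.NUM_GPUS"), get_key(cfg, "INFERENCE.OUTPUT_PATH"))
```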

In [None]:
base_yaml = """SYSTEM:
  NUM_CPUS: 1
  NUM_GPUS: 1
MODEL:
  ARCHITECTURE: unet_plus_3d
  BLOCK_TYPE: residual_se
  INPUT_SIZE: [17, 225, 225]
  OUTPUT_SIZE: [17, 225, 225]
  IN_PLANES: 1
  NORM_MODE: sync_bn
  FILTERS: [32, 64, 96, 128, 160]
DATASET:
  IMAGE_NAME: ["im_train.json"]
  LABEL_NAME: ["mito_train.json"]
  INPUT_PATH: datasets/MitoEM/EM30-R-im/im
  OUTPUT_PATH: outputs/MitoEM-R/
  PAD_SIZE: [4, 64, 64]
SOLVER:
  LR_SCHEDULER_NAME: WarmupCosineLR
  BASE_LR: 0.04
  ITERATION_STEP: 1
  ITERATION_SAVE: 5000
  ITERATION_TOTAL: 150000
  SAMPLES_PER_BATCH: 2
INFERENCE:
  INPUT_SIZE: [10, 1024, 1024]
  OUTPUT_SIZE: [10, 1024, 1024]
  IMAGE_NAME: imstack_400_410.tif
  OUTPUT_PATH: outputs/MitoEM/EM30-R-im/results/
  OUTPUT_NAME: result # will automatically save to HDF5
  PAD_SIZE: [0, 0, 0]
  AUG_MODE: mean
  AUG_NUM: None
  STRIDE: [1, 513, 513]
  SAMPLES_PER_BATCH: 1"""

bc_yaml = """MODEL:
  OUT_PLANES: 2
  TARGET_OPT: ["0", "4-1-1"]
  LOSS_OPTION:
    - - WeightedBCEWithLogitsLoss
      - DiceLoss
    - - WeightedBCEWithLogitsLoss
      - DiceLoss
  LOSS_WEIGHT: [[1.0, 0.5], [1.0, 0.5]]
  WEIGHT_OPT: [["1", "0"], ["1", "0"]]
  OUTPUT_ACT: [["none", "sigmoid"], ["none", "sigmoid"]]
INFERENCE:
  OUTPUT_ACT: ["sigmoid", "sigmoid"]"""

with open('configs/MitoEM/MitoEM-R-Base.yaml', 'w') as fp:
    fp.write(base_yaml)

with open('configs/MitoEM/MitoEM-R-BC.yaml', 'w') as fp:
    fp.write(bc_yaml)
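
The INFERENCE.INPUT_SIZE and INFERENCE.STRIDE values determine how many overlapping windows the model has to process. The sketch below estimates that count for the ten-slice volume, assuming each slice is 4096 x 4096 and that the last window along each axis is placed flush with the volume border. The exact placement rule used by the library is an assumption here, and `count_windows` is a hypothetical helper.

```python
import math

def count_windows(volume_size, window_size, stride):
    """Windows per axis, assuming the final window is clamped to the volume border."""
    counts = []
    for vol, win, step in zip(volume_size, window_size, stride):
        if win > vol:
            raise ValueError("window larger than volume")
        counts.append(1 + math.ceil((vol - win) / step))
    return counts

volume = (10, 4096, 4096)   # ten slices of 4096 x 4096 pixels
window = (10, 1024, 1024)   # INFERENCE.INPUT_SIZE
stride = (1, 513, 513)      # INFERENCE.STRIDE

per_axis = count_windows(volume, window, stride)
total = math.prod(per_axis)
print(per_axis, total)
```

Under these assumptions the volume is covered by 1 x 7 x 7 = 49 windows, and neighbouring windows overlap by 1024 - 513 = 511 pixels in y and x.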


In [None]:
# stack slices 400-409 into a single TIFF volume so that INFERENCE.IMAGE_NAME in the configuration above points to existing data
from PIL import Image
import tifffile
import numpy as np

imstack = np.zeros([10, 4096, 4096])
for idx in range(400, 410):
    imstack[idx-400] = np.array(Image.open(f'datasets/MitoEM/EM30-R-im/im/im{idx:04}.png'))
tifffile.imwrite('datasets/MitoEM/EM30-R-im/im/imstack_400_410.tif', imstack)
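
`np.zeros` allocates `float64` unless a dtype is given, so the stacked volume is noticeably larger than the source slices. The following back-of-the-envelope estimate compares the two sizes; that the PNG slices are 8-bit grayscale is an assumption, not something stated in this notebook.

```python
def stack_bytes(shape, bytes_per_element):
    """Raw size in bytes of a dense array with the given shape and element size."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_element

shape = (10, 4096, 4096)
float64_bytes = stack_bytes(shape, 8)  # default dtype of np.zeros
uint8_bytes = stack_bytes(shape, 1)    # assumed 8-bit grayscale slices

print(f"float64: {float64_bytes / 2**30:.2f} GiB")
print(f"uint8:   {uint8_bytes / 2**20:.0f} MiB")
```

If the slices are indeed 8-bit, passing `dtype=np.uint8` to `np.zeros` would be one way to shrink the stack. Whether the downstream reader treats both dtypes identically has not been checked here.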

## (3) Model inference

In [None]:
! time python -u scripts/main.py --config-base configs/MitoEM/MitoEM-R-Base.yaml --config-file configs/MitoEM/MitoEM-R-BC.yaml --inference --checkpoint outputs/MitoEM/mito_u3d-bc_mitoem_300k.pth.tar
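
The command passes two configuration files, one via `--config-base` and one via `--config-file`. A minimal sketch of how such layering can work is shown below, assuming that values from the second file are merged over the first, section by section. The library's own configuration loader performs the real merge; `merge_config` is a hypothetical stand-in.

```python
import copy

def merge_config(base, override):
    """Recursively layer `override` on top of `base` without mutating either."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged

# Toy excerpts of the two YAML files written earlier in this notebook.
base = {
    "SYSTEM": {"NUM_CPUS": 1, "NUM_GPUS": 1},
    "INFERENCE": {"IMAGE_NAME": "imstack_400_410.tif", "SAMPLES_PER_BATCH": 1},
}
override = {
    "MODEL": {"OUT_PLANES": 2},
    "INFERENCE": {"OUTPUT_ACT": ["sigmoid", "sigmoid"]},
}

merged_cfg = merge_config(base, override)
print(merged_cfg["INFERENCE"])
```

In this sketch the INFERENCE section ends up with keys from both files, which is why the BC file only needs to list the entries that differ from the base.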

## (4) Evaluation
Evaluation uses code from the master branch of the [mAP_3Dvolume GitHub repo](https://github.com/ygCoconut/mAP_3Dvolume/tree/master), which the MitoEM Grand Challenge also uses for evaluation.
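
Average-precision metrics for instance segmentation are generally built on the intersection over union (IoU) between a predicted instance and a ground-truth instance. The toy sketch below computes a 3D IoU on sets of voxel coordinates. It only illustrates the underlying idea and is not the mAP_3Dvolume implementation; `iou` and `box` are hypothetical helpers.

```python
def iou(pred_voxels, gt_voxels):
    """Intersection over union of two instances given as sets of (z, y, x) voxels."""
    union = len(pred_voxels | gt_voxels)
    if union == 0:
        return 0.0
    return len(pred_voxels & gt_voxels) / union

def box(z0, z1, y0, y1, x0, x1):
    """All voxel coordinates in a half-open box, used here as a toy instance."""
    return {(z, y, x)
            for z in range(z0, z1)
            for y in range(y0, y1)
            for x in range(x0, x1)}

gt = box(0, 2, 0, 4, 0, 4)    # 32 voxels
pred = box(0, 2, 0, 4, 1, 5)  # same size, shifted by one voxel in x
score = iou(pred, gt)
print(round(score, 3))
```

The two boxes share 24 voxels out of a union of 40, giving an IoU of 0.6. With illustrative thresholds, this pair would count as a match at 0.5 but not at 0.75.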

In [None]:
# import dependencies
import glob
import h5py
import itertools
import numpy as np
import tifffile  # used below to read the validation labels
from scipy import ndimage
from connectomics.data.utils import readvol, writeh5
from connectomics.utils.process import bc_watershed

In [None]:
# perform watershed processing on the predictions (currently represented as semantic/contour segmentations) to obtain a mitochondria instance segmentation
with h5py.File("outputs/MitoEM/EM30-R-im/results/result.h5", "r") as fp:
    data = np.array(fp['vol0'])
connected = bc_watershed(data, thres1=0.85, thres2=0.6, thres3=0.8, thres_small=512)
with h5py.File("outputs/MitoEM/EM30-R-im/results/watershed.h5", "w") as fp:
    fp.create_dataset('main', data=connected.astype(np.uint16))
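
One plausible reading of the three thresholds, which should be checked against the `bc_watershed` source, is that `thres1` and `thres2` select seed voxels (high semantic probability, low contour probability) while `thres3` defines the foreground mask into which the seeds are grown. The sketch below applies that reading to a few toy probabilities in [0, 1]. It does not run a watershed, and `seed_and_foreground` is a hypothetical helper.

```python
def seed_and_foreground(semantic, contour, thres1=0.85, thres2=0.6, thres3=0.8):
    """Flag each voxel as seed and/or foreground from two probability lists."""
    seeds = [s > thres1 and c < thres2 for s, c in zip(semantic, contour)]
    foreground = [s > thres3 for s in semantic]
    return seeds, foreground

# Four toy voxels: (semantic probability, contour probability)
semantic = [0.95, 0.90, 0.82, 0.40]
contour = [0.10, 0.70, 0.20, 0.05]

seeds, foreground = seed_and_foreground(semantic, contour)
print(seeds)
print(foreground)
```

Under this reading, only the first voxel qualifies as a seed, the first three are foreground, and the last is background. Instances smaller than `thres_small` voxels would then be discarded.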

In [None]:
# prepare validation data for evaluation
files = [f"datasets/MitoEM/EM30-R-val/mito-val-v2/seg{idx:04}.tif" for idx in range(400, 410)]
data = np.array([tifffile.imread(file) for file in files])
writeh5("outputs/MitoEM/EM30-R-im/results/validation_gt.h5", data)

In [None]:
# fetch and execute validation scripts
! wget -q --show-progress https://raw.githubusercontent.com/ygCoconut/mAP_3Dvolume/master/demo.py
! wget -q --show-progress https://raw.githubusercontent.com/ygCoconut/mAP_3Dvolume/master/vol3d_eval.py
! wget -q --show-progress https://raw.githubusercontent.com/ygCoconut/mAP_3Dvolume/master/vol3d_util.py
! python demo.py -gt outputs/MitoEM/EM30-R-im/results/validation_gt.h5 -p outputs/MitoEM/EM30-R-im/results/watershed.h5

In [None]:
# show performance stats
# more data is available at pytorch_connectomics/map_output_match_fn.txt and pytorch_connectomics/map_output_match_p.txt
! cat map_output_map.txt