# **StomataPy - training tutorial**

Contact: hongyuan.zhang@usys.ethz.ch

The file structure:
```
├── StomataPy (the root directory)
    ├── Stomata_detection (data for object detection)
    ├── Stomata_segmentation (data for semantic segmentation)
    ├── train (config files for training)
    ├── Training.ipynb (the file you are running now)
```
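
Before running any cells, it can help to confirm that the working directory matches this layout. The following is a minimal sketch, assuming the entries listed above sit directly under the root directory. The helper name `missing_entries` is hypothetical and not part of StomataPy.

```python
from pathlib import Path

# Entries taken from the file structure listed above
EXPECTED = ['Stomata_detection', 'Stomata_segmentation', 'train', 'Training.ipynb']


def missing_entries(root, expected=EXPECTED):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


# Run from the StomataPy root directory; an empty list means nothing is missing
print(missing_entries('.'))
```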

# Clone the codebase from GitHub 
Skip this step if you have already cloned the repository.

In [1]:
import os
# Read the token from the environment (GITHUB_TOKEN is an assumed name) instead of hard-coding it
token = os.environ['GITHUB_TOKEN']
repo_url = f'https://{token}@github.com/Alias-z/StomataPy.git'
!git clone --recursive {repo_url}

Cloning into 'StomataPy'...
remote: Enumerating objects: 650, done.
remote: Counting objects: 100% (331/331), done.
remote: Compressing objects: 100% (220/220), done.
remote: Total 650 (delta 202), reused 222 (delta 104), pack-reused 319 (from 1)
Receiving objects: 100% (650/650), 162.87 MiB | 40.45 MiB/s, done.
Resolving deltas: 100% (368/368), done.
Submodule 'mmdetection' (https://github.com/Alias-z/mmdetection.git) registered for path 'mmdetection'
Submodule 'mmsegmentation' (https://github.com/Alias-z/mmsegmentation.git) registered for path 'mmsegmentation'
Submodule 'sahi' (https://github.com/Alias-z/sahi.git) registered for path 'sahi'
Submodule 'sam-hq' (https://github.com/Alias-z/sam-hq.git) registered for path 'sam-hq'
Cloning into '/StomataPy/StomataPy/mmdetection'...
remote: Enumerating objects: 36804, done.        
remote: Counting objects: 100% (5/5), done.        
remote: Compressing objects: 100% (3/3), done.        
remote: Total 36804 (delta 3), reused 2 (

# Install StomataPy
## Run this if you have already built the Docker image.

In [None]:
# run in terminal (cd to ./StomataPy)

pip install -v -e ./sahi -e ./mmdetection -e ./mmsegmentation

# Get DINOv2 weights

In [None]:
# run in terminal (cd to ./StomataPy)

wget -P train/checkpoints https://dl.fbaipublicfiles.com/dinov2/dinov2_vitl14/dinov2_vitl14_pretrain.pth
python mmsegmentation/tools/rein/convert_dinov2.py train/checkpoints/dinov2_vitl14_pretrain.pth train/checkpoints/dinov2_converted.pth --height 512 --width 512

# Get training dataset

In [None]:
# run in terminal (cd to ./StomataPy)

pip install -U "huggingface_hub[cli]"
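export HF_HUB_ETAG_TIMEOUT=50000  # export so that huggingface-cli sees the longer timeout
# Pass the access token via the HF_TOKEN environment variable rather than pasting it here
huggingface-cli login --token "$HF_TOKEN"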
huggingface-cli download aliasz/StomataPy400K --repo-type=dataset --local-dir ./StomataPy400K --force-download

In [11]:
# To remove hidden folders that start with '.'

import os
import shutil
dataset_root = 'StomataPy400K'
for dir_name in os.listdir(dataset_root):
    dir_path = os.path.join(dataset_root, dir_name)
    if dir_name.startswith('.') and os.path.isdir(dir_path):
        shutil.rmtree(dir_path)

# Prepare training set
## 1. Training for 'stomatal complex' instance segmentation

In [None]:
# run in terminal (cd to ./StomataPy)

python stomatapy/utils/prepare_trainset.py --dataset_root "StomataPy400K" --ensemble_by_modality --r_train 0.8 --r_test 0 --aim "object detection"
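
The `--r_train 0.8 --r_test 0` flags set the split ratios. As a rough illustration only, the sketch below assumes that whatever remains after the train and test shares (here 0.2) goes to validation. That is an assumption about `prepare_trainset.py`, not something confirmed in this tutorial. `split_by_ratio` is a hypothetical helper and not the script's actual implementation.

```python
import random


def split_by_ratio(items, r_train=0.8, r_test=0.0, seed=42):
    """Shuffle `items` and split them into train / val / test lists.

    Assumption: whatever is left after train and test becomes validation.
    """
    if not 0 <= r_train + r_test <= 1:
        raise ValueError('r_train + r_test must be between 0 and 1')
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = round(len(shuffled) * r_train)
    n_test = round(len(shuffled) * r_test)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, val, test


images = [f'img_{i}.png' for i in range(10)]
train, val, test = split_by_ratio(images)
print(len(train), len(val), len(test))  # 8 2 0
```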

In [None]:
# run in terminal (cd to ./StomataPy)

python mmdetection/tools/evensampler_json_convertor.py --root_dir "StomataPy400K_train"

In [None]:
# run in terminal (cd to ./StomataPy)

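# Optional: visualize the dataset before training
# mim run mmdet browse_dataset train/config/det_rein_dinov2_mask2former_evensample.py --output-dir viz_dataset_mmdet/

# Alternative config (without the _evensample suffix)
# mim train mmdet train/config/det_rein_dinov2_mask2former.py --gpus 6 --launcher pytorch

# If Weights & Biases logging is enabled, supply the API key via the WANDB_API_KEY environment variable instead of writing it here
mim train mmdet train/config/det_rein_dinov2_mask2former_evensample.py --gpus 6 --launcher pytorch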

## 2. Training for 'stoma' semantic segmentation

In [None]:
from stomatapy.utils.data_statistics import DataStatistics

dataset_root = 'StomataPy400K'

DataStatistics.dataset_filter(dataset_root, pavements_only=False, sc_flag=1, semantic=True, ensemble_by_modality=False)

Filtering dataset: StomataPy400K


In [None]:
# run in terminal (cd to ./StomataPy)

python stomatapy/utils/prepare_trainset.py --dataset_root "StomataPy400K_filtered" --r_train 0.8 --r_test 0 --aim "semantic segmentation"

In [None]:
# run in terminal (cd to ./StomataPy)

mim train mmsegmentation train/config/seg_rein_dinov2_mask2former.py --gpus 6 --launcher pytorch

## 3. Generate full weights after training

In [None]:
# run in terminal (cd to ./StomataPy)

python mmsegmentation/tools/rein/generate_full_weights.py \
    --backbone train/checkpoints/dinov2_converted.pth \
    --segmentor_save_path Models/"StomataPy400K_stomatal_complex_24185"/dinov2_detector.pth \
    --rein_head Models/"StomataPy400K_stomatal_complex_24185"/best_coco_segm_mAP_epoch_292.pth
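
The epoch number in the `--rein_head` file name changes from run to run. The sketch below locates the best checkpoint in a model folder, assuming the files follow the `best_..._epoch_<N>.pth` naming seen in the command above. `find_best_checkpoint` is a hypothetical helper and not part of StomataPy or mmsegmentation.

```python
import re
from pathlib import Path

# Matches names such as best_coco_segm_mAP_epoch_292.pth
PATTERN = re.compile(r'^best_.*_epoch_(\d+)\.pth$')


def find_best_checkpoint(model_dir):
    """Return the 'best_*_epoch_N.pth' file with the highest N, or None."""
    candidates = []
    for path in Path(model_dir).iterdir():
        match = PATTERN.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    # Compare on the epoch number only
    return max(candidates, key=lambda pair: pair[0])[1]
```

Calling `find_best_checkpoint('Models/StomataPy400K_stomatal_complex_24185')` would return the path to pass to `--rein_head`, provided such a file exists in that folder.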