
Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions

Introduction

This is a PyTorch implementation of the MICCAI 2022 paper "Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions".

In this paper, we distill knowledge from publicly available models trained on large generic datasets to facilitate self-supervised learning on surgical videos.

Framework visualization

Preparation

Data Preparation

  • We use the Cholec80 dataset and the M2CAI 2016 Challenge dataset.

  • Training and test data split

    Cholec80: the first 40 videos for training and the remaining 40 for testing.

    M2CAI: 27 videos for training and 14 videos for testing.

  • Data Preprocessing:

  1. Use FFmpeg to convert the videos to frames;
  2. Downsample from 25 fps to 1 fps (or directly set the output frequency to 1 fps in the previous step);
  3. Cut the black margins from each frame using the change_size() function in video2frame_cutmargin.py;
  4. Resize each frame to a resolution of 250 x 250.
Note: You can also use ``video2frame_cutmargin.py`` alone for steps 1 & 3; this yields the cut frames at the original fps. A sketch of these preprocessing steps is given after the folder layout below.
  • The data folder is organized as follows:
(root folder)
├── data
|  ├── cholec80
|  |  ├── cutMargin
|  |  |  ├── 1
|  |  |  ├── 2
|  |  |  ├── 3
|  |  |  ├── ......
|  |  |  ├── 80
|  |  ├── phase_annotations
|  |  |  ├── video01-phase.txt
|  |  |  ├── ......
|  |  |  ├── video80-phase.txt
├── code
|  ├── ......
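
The sketch below illustrates the preprocessing steps end to end. It is not the repository's exact video2frame_cutmargin.py: the file names, the black-pixel threshold, and the output paths are illustrative assumptions.

    # Illustrative preprocessing sketch -- not the repo's video2frame_cutmargin.py.
    # Assumes ffmpeg is on PATH; names, threshold, and paths are placeholders.
    import os
    import subprocess

    import cv2
    import numpy as np

    os.makedirs("raw_frames", exist_ok=True)
    os.makedirs("data/cholec80/cutMargin/1", exist_ok=True)

    # Steps 1 & 2: extract frames at 1 fps (the fps filter does the downsampling).
    subprocess.run(
        ["ffmpeg", "-i", "video01.mp4", "-vf", "fps=1", "raw_frames/%06d.png"],
        check=True,
    )

    def cut_black_margin(frame: np.ndarray, thresh: int = 10) -> np.ndarray:
        """Crop to the bounding box of non-black pixels (the idea behind change_size())."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        ys, xs = np.where(gray > thresh)
        if ys.size == 0:  # fully black frame: leave it unchanged
            return frame
        return frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    # Steps 3 & 4: cut margins and resize to 250 x 250, writing into the layout above.
    frame = cv2.imread("raw_frames/000001.png")
    frame = cv2.resize(cut_black_margin(frame), (250, 250))
    cv2.imwrite("data/cholec80/cutMargin/1/000001.png", frame)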

Setup & Training

  1. Check dependencies:

    matplotlib==3.4.3
    numpy==1.20.3
    opencv_python==4.5.3.56
    Pillow==9.2.0
    registry==0.4.2
    scikit_learn==1.1.2
    scipy==1.7.1
    termcolor==1.1.0
    torch==1.9.0
    torchvision==0.10.0
    tqdm==4.61.2
    
  2. Conduct semantic-preserving training: you first need to download the pre-trained ResNet50 model and save it to /IN_supervised.
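
A minimal sketch of one way to obtain the checkpoint, assuming the expected weights are torchvision's ImageNet-supervised ResNet50 (the file name below is a guess, not confirmed by the repo):

    import os
    import torch
    import torchvision.models as models

    os.makedirs("IN_supervised", exist_ok=True)
    # torchvision 0.10 API: pretrained=True downloads the ImageNet-supervised weights.
    model = models.resnet50(pretrained=True)
    # ASSUMPTION: the checkpoint file name/format expected by main_moco.py is a guess.
    torch.save(model.state_dict(), "IN_supervised/resnet50.pth")

With the checkpoint in place, launch training: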

    CUDA_VISIBLE_DEVICES=0,1,2,3 python main_moco.py   -a resnet50   --lr 0.010   --batch-size 128   --dist-url 'tcp://localhost:10002' --multiprocessing-distributed --world-size 1 --rank 0 --mlp --moco-t 0.2 --aug-plus --cos --method=base --sample_rate=25  --moco-k=2048 --onlyfc
    
  3. Conduct Pre-training

    CUDA_VISIBLE_DEVICES=0,1,2,3 python main_moco.py   -a resnet50   --lr 0.010   --batch-size 128   --dist-url 'tcp://localhost:10002' --multiprocessing-distributed --world-size 1 --rank 0 --mlp --moco-t 0.2 --aug-plus --cos --method=base --sample_rate=25  --moco-k=2048 --dis_weight=5 --distill=1
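
For intuition only: a generic sketch of the weighted feature-distillation term suggested by the --distill and --dis_weight flags. This is not the paper's exact loss; every name here is illustrative.

    import torch.nn.functional as F

    def distillation_loss(student_feat, teacher_feat):
        # Agreement between L2-normalized student and (frozen) teacher embeddings.
        s = F.normalize(student_feat, dim=1)
        t = F.normalize(teacher_feat.detach(), dim=1)
        return F.mse_loss(s, t)

    # Hypothetical total objective: MoCo contrastive loss plus a weighted distillation term.
    # total_loss = moco_loss + dis_weight * distillation_loss(student_feat, teacher_feat)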
    
  4. Conduct Linear fine-tuning

    CUDA_VISIBLE_DEVICES=0 python frame_feature_extractor.py --model=resnet50 --action=train --target=train_set --sample_rate=25 --best_ep=199 --start=1  --end=41 --epochs=10
    
  5. Extract features

    CUDA_VISIBLE_DEVICES=0 python frame_feature_extractor.py --model=[The path for the obtained model by step 3] --action=extract --target=train_set --sample_rate=5 --start=1  --end=41 --best_ep=4
    
  6. Training TCN

    CUDA_VISIBLE_DEVICES=0 python train.py --action=base_train --sample_rate=5 --backbone=[The path for the obtained model by step 4]
    
  7. Predict Results

    CUDA_VISIBLE_DEVICES=0 python train.py --action=base_predict --sample_rate=5 --backbone=[The path for the obtained model by step 5] --best_ep=[the best epoch in Step 5] --fps=5 
    

Evaluate the predictions

   matlab-eval/Main.m (cholec80)
   matlab-eval/Main_m2cai.m (m2cai16)

Citation

If this repository is useful for your research, please cite:

@article{ding2022free,
  title={Free Lunch for Surgical Video Understanding by Distilling Self-Supervisions},
  author={Ding, Xinpeng and Liu, Ziwei and Li, Xiaomeng},
  journal={arXiv preprint arXiv:2205.09292},
  year={2022}
}
