Code and models for the MICCAI 2023 paper: "Self-Supervised Learning for Endoscopic Video Analysis".
Self-supervised learning (SSL) has led to important breakthroughs in computer vision by allowing learning from large amounts of unlabeled data. As such, it might have a pivotal role to play in biomedicine, where annotating data requires highly specialized expertise.
In this work, we study the use of a leading SSL framework, Masked Siamese Networks (MSNs), for endoscopic video analysis such as colonoscopy and laparoscopy. To fully exploit the power of SSL, we create sizable endoscopic video datasets. Our extensive experiments show that MSN training on this data leads to state-of-the-art performance on standard public endoscopic benchmarks, such as surgical phase recognition in laparoscopy and polyp characterization in colonoscopy.
Furthermore, we show that only 50% of the annotated data is sufficient to match the performance obtained when training on the entire labeled datasets. Our work provides evidence that SSL can dramatically reduce the need for annotated data in endoscopy.
We release a series of models pre-trained with our method over a large corpus of endoscopic videos:
| Arch  | Dataset             | Downstream results  | Link |
|-------|---------------------|---------------------|------|
| ViT-S | Private Laparoscopy | Cholec80 F1: 83.4   | Link |
| ViT-B | Private Laparoscopy | Cholec80 F1: 82.6   | Link |
| ViT-L | Private Laparoscopy | Cholec80 F1: 84.0   | Link |
| ViT-S | Private Colonoscopy | PolypSet Acc: 78.5  | Link |
| ViT-B | Private Colonoscopy | PolypSet Acc: 78.2  | Link |
| ViT-L | Private Colonoscopy | PolypSet Acc: 80.4  | Link |
You may use the `requirements.txt` file to reproduce our development environment:
conda create --name <env_name> --file ./requirements.txt
We publish the data modules for the Cholec80 experiments, which can be easily adapted to the rest of the experiments in the paper. Our data pipeline is largely adapted from TF-Cholec80.
Run `prepare.py` to download and extract the public Cholec80 dataset:
python prepare.py --data_rootdir YOUR_LOCATION
The `./data/cholec80_images.py` module contains classes for loading the pre-processed datasets into a TF dataset object.
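As a rough illustration of what such a data module produces, here is a minimal sketch (not the repo's actual API; the frame paths, labels, and helper names below are hypothetical) of wrapping pre-processed Cholec80 frames in a `tf.data.Dataset`:

```python
# Minimal sketch only: the real classes live in ./data/cholec80_images.py.
# Frame paths, labels, and the 224x224 input size are assumptions for illustration.
import tensorflow as tf

IMG_SIZE = 224  # assumed input resolution for the ViT backbones


def decode_frame(path, label):
    """Read one extracted frame from disk and pair it with its phase label."""
    image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE)) / 255.0
    return image, label


def make_dataset(frame_paths, phase_labels, batch_size=64, shuffle=True):
    """Wrap lists of frame paths and integer phase labels in a tf.data pipeline."""
    ds = tf.data.Dataset.from_tensor_slices((frame_paths, phase_labels))
    if shuffle:
        ds = ds.shuffle(buffer_size=10_000)
    ds = ds.map(decode_frame, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```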
`./down_stream/main.py` is the entry point for running the downstream experiments, where a pre-trained model can be fine-tuned for the task of phase classification.
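Conceptually, this downstream step attaches a classification head to a pre-trained backbone and trains it on Cholec80 phase labels. The sketch below illustrates the idea in Keras under assumed conventions (a backbone that maps images to embedding vectors, 7 Cholec80 phases); it is not the repo's training code.

```python
# Illustrative fine-tuning setup, not ./down_stream/main.py itself.
# `backbone` is assumed to be a Keras model mapping (224, 224, 3) images to embeddings.
import tensorflow as tf

NUM_PHASES = 7  # Cholec80 annotates 7 surgical phases


def build_phase_classifier(backbone, trainable_backbone=False):
    """Put a linear phase-classification head on top of a pre-trained backbone."""
    backbone.trainable = trainable_backbone  # False = linear probing, True = full fine-tuning
    inputs = tf.keras.Input(shape=(224, 224, 3))
    features = backbone(inputs)                      # (batch, embed_dim)
    logits = tf.keras.layers.Dense(NUM_PHASES)(features)
    model = tf.keras.Model(inputs, logits)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(1e-4),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model
```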
The `./inference.py` script can be used to load a pre-trained model and extract representations from it.
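For reference, representation extraction amounts to running the backbone over a batched dataset and stacking the outputs; a hedged sketch (assuming the hypothetical `make_dataset` helper above and a Keras-style backbone) follows:

```python
# Sketch of representation extraction; ./inference.py is the actual script.
import numpy as np


def extract_representations(backbone, dataset):
    """Return (num_frames, embed_dim) embeddings and the matching labels."""
    feats, labels = [], []
    for images, y in dataset:
        feats.append(backbone(images, training=False).numpy())
        labels.append(y.numpy())
    return np.concatenate(feats), np.concatenate(labels)
```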
Please cite:
@misc{hirsch2023selfsupervised,
title={Self-Supervised Learning for Endoscopic Video Analysis},
author={Roy Hirsch and Mathilde Caron and Regev Cohen and Amir Livne and Ron Shapiro and Tomer Golany and Roman Goldenberg and Daniel Freedman and Ehud Rivlin},
year={2023},
eprint={2308.12394},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Our work is licensed under BSD 3-Clause license, as found in the LICENSE file.