Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis

This repository provides the official PyTorch implementation of the paper "Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis".

Installation

Install the required packages with the provided environment.yaml:

cd MMCRL
conda env create -f environment.yaml
conda activate MMCRL
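
After activating the environment, a quick import check can catch a broken install before a long run starts. The sketch below is not part of the repository: the helper name `check_environment` is hypothetical, and it assumes that environment.yaml installs PyTorch under the module name `torch`.

```python
import importlib.util


def check_environment(modules=("torch",)):
    """Report which of the given top-level modules are importable."""
    # find_spec returns None when a top-level module cannot be located,
    # so nothing is actually imported by this check.
    return {name: importlib.util.find_spec(name) is not None for name in modules}


def missing_modules(modules=("torch",)):
    """Return the names of modules that could not be found."""
    return [name for name, ok in check_environment(modules).items() if not ok]
```

Calling `missing_modules()` inside the activated MMCRL environment should return an empty list if the assumption about `torch` holds; any names it returns point to packages that still need installing.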

Data Preparation

We use the datasets provided by Endo-FM and are grateful for their valuable work.

Weights

Pre-training weights:

pretrain

Downstream weights:

Classification

Segmentation

Detection

Pre-training

cd MMCRL
wget -P checkpoints/ https://github.com/kahnchana/svt/releases/download/v1.0/kinetics400_vitb_ssl.pth
bash scripts/pretrain.sh
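
Since the wget command saves the Kinetics-400 checkpoint into checkpoints/, it may help to confirm the file is in place before launching the script. This is a minimal sketch, not repository code: `checkpoint_ready` is a hypothetical helper, and whether scripts/pretrain.sh reads the checkpoint from exactly this path is an assumption based on the download command.

```python
from pathlib import Path


def checkpoint_ready(path="checkpoints/kinetics400_vitb_ssl.pth", min_bytes=1):
    """Return True if the checkpoint exists as a file of at least min_bytes bytes."""
    p = Path(path)
    # An interrupted download can leave an empty file behind, so the size
    # is checked in addition to the file's existence.
    return p.is_file() and p.stat().st_size >= min_bytes
```

Run from inside the MMCRL directory, `checkpoint_ready()` returning False suggests the download should be repeated before pre-training.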

Fine-tuning

# PolypDiag (Classification)
cd MMCRL
bash scripts/eval_finetune_polypdiag.sh

# CVC (Segmentation)
cd MMCRL/TransUNet
python train.py

# KUMC (Detection)
cd MMCRL/STMT
bash script/train_stft.sh
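
The three fine-tuning commands above can also be collected behind one entry point. The following is a sketch only: the `TASKS` table and `run_task` helper are hypothetical names, the directories and commands are copied from the instructions above, and the paths assume the caller's working directory is the parent of MMCRL.

```python
import subprocess

# Working directory and command for each downstream task, copied from the
# fine-tuning instructions in this README.
TASKS = {
    "polypdiag": ("MMCRL", ["bash", "scripts/eval_finetune_polypdiag.sh"]),
    "cvc": ("MMCRL/TransUNet", ["python", "train.py"]),
    "kumc": ("MMCRL/STMT", ["bash", "script/train_stft.sh"]),
}


def run_task(name, dry_run=True):
    """Return (cwd, command) for a task; execute it only when dry_run is False."""
    cwd, cmd = TASKS[name.lower()]
    if not dry_run:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, cwd=cwd, check=True)
    return cwd, cmd
```

With the default `dry_run=True`, `run_task("cvc")` only reports what would be executed, which makes it easy to inspect the command before starting a training job.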

Acknowledgement

Our code is based on Endo-FM, DINO, TimeSformer, SVT, TransUNet, and STFT. We thank the authors for releasing their code.

Citation

@article{hu2024one,
  title={Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis},
  author={Hu, Kai and Xiao, Ye and Zhang, Yuan and Gao, Xieping},
  journal={Advances in Neural Information Processing Systems},
  year={2024}
}
