Histopathology-Domain-Specific-Pretraining

This repository contains the code and results for the paper published at "MILLanD 2023: the 2nd Workshop on Medical Image Learning with Limited and Noisy Data", held at MICCAI 2023.

Paper

Recommended Citation

Study Design

Figure 1: Do different weight initializations matter? The study is designed from the perspective of an AI user who can choose between multiple pretrained model options (domain or non-domain, supervised or self-supervised) for a given task. The best pretrained model is the one that achieves the highest accuracy on the task and is least affected by distribution shifts. This study provides a framework for comparing pretrained models and selecting the most advantageous one for the task.

How to use

Comet ML Requirement

This repository uses comet_ml to log results. You will need to create a Comet ML account and provide the API key, project name and workspace. These details can be added on line 35 of baselines/baseline.py.
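A minimal sketch of one way to supply these three values without hard-coding them, assuming they are exported as environment variables. The helper name `load_comet_settings` is hypothetical and not part of this repository, and the variable names `COMET_API_KEY`, `COMET_PROJECT_NAME` and `COMET_WORKSPACE` are assumptions chosen for illustration.

```python
import os

# Maps Experiment keyword arguments to the (assumed) environment variable names.
REQUIRED_KEYS = {
    "api_key": "COMET_API_KEY",
    "project_name": "COMET_PROJECT_NAME",
    "workspace": "COMET_WORKSPACE",
}


def load_comet_settings(env=None):
    """Collect the three Comet settings from an environment-like mapping."""
    env = os.environ if env is None else env
    # Report every missing or empty variable at once.
    missing = [var for var in REQUIRED_KEYS.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing Comet settings: " + ", ".join(missing))
    return {name: env[var] for name, var in REQUIRED_KEYS.items()}


# Possible usage (needs the comet_ml package and a valid account):
# from comet_ml import Experiment
# experiment = Experiment(**load_comet_settings())
```

The returned dictionary uses the keyword names `api_key`, `project_name` and `workspace`, so it could be unpacked directly into `comet_ml.Experiment`.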

Datasets used

CRAG :- Download Link
GLAS :- Download Link
KUMAR, CPM17, TNBC :- Download Link

Datasets should be arranged as follows:

cpm17/
├── test
│   ├── Images
│   ├── Labels
│   └── Overlay
└── train
    ├── Images
    ├── Labels
    └── Overlay

kumar
├── test_diff
│   ├── Images
│   ├── Labels
│   └── Overlay
├── test_same
│   ├── Images
│   ├── Labels
│   └── Overlay
└── train
    ├── Images
    ├── Labels
    └── Overlay

CRAG
├── annotations
│   ├── train
│   └── valid
└── images
    ├── train
    └── valid

GLAS
├── annotations
│   ├── train
│   └── valid
└── images
    ├── train
    └── valid
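Before training, it may help to confirm that the folders match the layout listed above. The following is a small sketch, not part of this repository; `EXPECTED_LAYOUT` and `missing_dirs` are hypothetical names, and TNBC is omitted because its layout is not listed here.

```python
from pathlib import Path

# Sub-folders shared by the cell segmentation datasets in the layout above.
CELL_SUBDIRS = ("Images", "Labels", "Overlay")
# Sub-folders shared by the gland segmentation datasets in the layout above.
GLAND_DIRS = [f"{kind}/{split}"
              for kind in ("annotations", "images")
              for split in ("train", "valid")]

EXPECTED_LAYOUT = {
    "cpm17": [f"{split}/{sub}"
              for split in ("test", "train") for sub in CELL_SUBDIRS],
    "kumar": [f"{split}/{sub}"
              for split in ("test_diff", "test_same", "train")
              for sub in CELL_SUBDIRS],
    "CRAG": GLAND_DIRS,
    "GLAS": GLAND_DIRS,
}


def missing_dirs(dataset_root, dataset):
    """Return the expected sub-folders that are absent under dataset_root/dataset."""
    base = Path(dataset_root) / dataset
    return [rel for rel in EXPECTED_LAYOUT[dataset] if not (base / rel).is_dir()]
```

An empty list from `missing_dirs("/path/to/datasets", "kumar")` would indicate that every expected folder is present.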

Command Lines to Train the models

Download the datasets and place them in a single folder. Pass the path to that folder on the command line with -dataset_root, or set the dataset root on line 17 of "dataset_file.py".
Set the scratch root to the path where results and models should be stored. This can be done on the command line or on line 18 of "dataset_file.py".

To Train the model with random encoder initialization

python baselines.py -nepoch 100 -patchSize 256 -batchSize 4

To Train the model with ImageNet Supervised Encoder initialization

python baselines.py -nepoch 100 -patchSize 512 -batchSize 4 -resentInit ImageNetV1
python baselines.py -nepoch 100 -patchSize 512 -batchSize 4 -resentInit ImageNetV2

To Train the model with ImageNet Self Supervised Encoder initialization

python baselines.py -nepoch 100 -patchSize 512 -batchSize 4 -resentInit SSLImage -sslType BT

To Train the model with Histopathology Self Supervised Encoder initialization

Models taken from

python baselines.py -nepoch 100 -patchSize 512 -batchSize 4 -resentInit SSLPathology -sslType BT

To Change the dataset

For gland segmentation there are two choices: "-sourcedataset glas" and "-sourcedataset crag". For cell segmentation there are three choices: "-sourcedataset cpm17", "-sourcedataset tnbc" and "-sourcedataset kumar".
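To run the same configuration across several datasets, the command lines can be generated rather than typed by hand. This is a sketch only: `build_train_command` is a hypothetical helper, not part of this repository, and it uses only the flags that appear in the commands above, spelled exactly as shown there.

```python
import shlex


def build_train_command(source_dataset, resnet_init=None, ssl_type=None,
                        nepoch=100, patch_size=512, batch_size=4):
    """Assemble a baselines.py command line from the flags documented above."""
    args = ["python", "baselines.py",
            "-nepoch", str(nepoch),
            "-patchSize", str(patch_size),
            "-batchSize", str(batch_size)]
    if resnet_init is not None:
        # Flag name copied verbatim from the README commands.
        args += ["-resentInit", resnet_init]
    if ssl_type is not None:
        args += ["-sslType", ssl_type]
    args += ["-sourcedataset", source_dataset]
    return shlex.join(args)


# Print one command per gland segmentation dataset.
for dataset in ("glas", "crag"):
    print(build_train_command(dataset, resnet_init="SSLPathology", ssl_type="BT"))
```

Leaving `resnet_init` and `ssl_type` unset omits those flags, which corresponds to the random initialization command.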

If you use this repository, please cite the following.

@inproceedings{kataria2023pretrain,
  title={To pretrain or not to pretrain? A case study of domain-specific pretraining for semantic segmentation in histopathology},
  author={Kataria, Tushar and Knudsen, Beatrice and Elhabian, Shireen},
  booktitle={Workshop on Medical Image Learning with Limited and Noisy Data},
  pages={246--256},
  year={2023},
  organization={Springer}
}

@article{kataria2023automating,
  title={Automating Ground Truth Annotations for Gland Segmentation Through Immunohistochemistry},
  author={Kataria, Tushar and Rajamani, Saradha and Ayubi, Abdul Bari and Bronner, Mary and Jedrzkiewicz, Jolanta and Knudsen, Beatrice S and Elhabian, Shireen Y},
  journal={Modern Pathology},
  volume={36},
  number={12},
  pages={100331},
  year={2023},
  publisher={Elsevier}
}
