This repository contains the code for Ink Removal in Whole Slide Images using Hallucinated Data. The trained model weights are at this link
This project identifies and removes ink markings from histopathology whole slide images to aid downstream computational analysis. The algorithm requires no annotation or manual curation of data, only clean slides, making it easy to adapt and deploy on a new set of histopathology slides.
The methodology consists of two networks:-
- Ink filter: A binary classifier with a ResNet-18 backbone
- Ink corrector: A Pix2pix module that removes ink from a patch by image-to-image translation

An overview of the methodology and its results is shown below.
opencv
dominate
visdom
trainer - https://github.com/Vishwesh4/TrainerCode
pytorch-gpu
wandb
openslide
scikit-learn
scipy
scikit-image
The project has 6 modules:-
- Ink filter module - ./train_filter
- Ink removal module (Pix2pix) - ./ink_removal
- Patch Extraction - ./modules/patch_extraction
- Image Metric Calculate - ./modules/metrics
- Registration - ./modules/register
- Deployment of methodology over new slides - ./deploy
- The model can be trained by modifying the config.yml file to specify the location of the clean slides and the set of colors to be used
- The training can be run with
python train.py -c [CONFIG FILE LOCATION]
- The code has been taken from the original repository link
- For training with your own dataset, please follow a code structure similar to ./ink_removal/data/dcisink_dataset.py or ./ink_removal/data/tiger_dataset.py. A mixture of the two datasets was used for the given model (./ink_removal/data/mixed_dataset.py)
- The model can be trained by using ./train_pix2pix.sh
- The model can be tested by using ./test_pix2pix.sh. For testing, corresponding ink and clean slides should be available
- The image metrics can be calculated by using ./run_calc_metrics.sh. The test model name has to be specified
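This README does not list which metrics ./run_calc_metrics.sh reports. As an illustration of the kind of full-reference comparison that needs paired ink and clean images, the sketch below computes PSNR (peak signal-to-noise ratio) between a restored patch and its clean counterpart. PSNR is chosen here only as a familiar example, and the function name and the flat-list pixel representation are assumptions, not the repository's code.

```python
import math
from typing import Sequence


def psnr(restored: Sequence[float], clean: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio (dB) between two equally sized flat pixel lists."""
    if len(restored) != len(clean) or not clean:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - c) ** 2 for r, c in zip(restored, clean)) / len(clean)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 4-pixel "patches" (hypothetical values, for illustration only).
clean_patch = [0, 128, 255, 64]
restored_patch = [0, 130, 250, 64]
score = psnr(restored_patch, clean_patch)
print(f"PSNR: {score:.2f} dB")
```

A higher value means the restored patch is closer to the registered clean patch. Real patches would be flattened image arrays rather than short lists.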
- The modules can be deployed using the class Ink_deploy. An example is shown in ./deploy/process.py. The module also has a script, ./deploy/construct_wsi.py, for running the algorithm over a whole slide image; however, it expects Sedeen annotations.
import torch

# INK_PATH and PIX2PIX_PATH point to the trained ink filter and Pix2pix weights
ink_deploy = Ink_deploy(filter_path=INK_PATH,
                        output_dir=None,
                        pix2pix_path=PIX2PIX_PATH,
                        device=torch.device("cpu"))
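The internals of Ink_deploy are not described in this README. As a rough, hypothetical sketch of how an ink filter and an ink corrector can be chained over patches, the snippet below runs a corrector only on patches that a filter flags as inked. The names `remove_ink`, `is_inked`, and `correct` are illustrative assumptions and not part of the repository's API; the toy callables stand in for the ResNet-18 classifier and the Pix2pix generator.

```python
from typing import Callable, List, Sequence, TypeVar

Patch = TypeVar("Patch")


def remove_ink(
    patches: Sequence[Patch],
    is_inked: Callable[[Patch], bool],
    correct: Callable[[Patch], Patch],
) -> List[Patch]:
    """Run the corrector only on patches that the filter flags as inked."""
    cleaned = []
    for patch in patches:
        # Patches the filter considers clean pass through unchanged.
        cleaned.append(correct(patch) if is_inked(patch) else patch)
    return cleaned


# Toy stand-ins (hypothetical): a "patch" is a string, ink is a trailing "*".
toy_patches = ["tissue", "tissue*", "stroma"]
result = remove_ink(
    toy_patches,
    is_inked=lambda p: p.endswith("*"),
    correct=lambda p: p.rstrip("*"),
)
print(result)
```

Gating the corrector on the filter's decision means clean tissue is never passed through the generator, so it cannot be altered by it.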
- Vishwesh Ramanathan (@Vishwesh4)
To contact the authors, raise an issue or email vishweshramanathan@mail.utoronto.ca
- The registration code ./modules/register/register.py was developed by Wenchao Han at Sunnybrook Research Institute (wenchao.han@sri.utoronto.ca)
- The pix2pix code was taken from link
- The ./modules/metrics/quality_metrics.py code was taken from link
@inproceedings{ramanathan2023ink,
title={Ink removal in whole slide images using hallucinated data},
author={Ramanathan, Vishwesh and Han, Wenchao and Bassiouny, Dina and Rakovitch, Eileen and Martel, Anne L},
booktitle={Medical Imaging 2023: Digital and Computational Pathology},
volume={12471},
pages={230--238},
year={2023},
organization={SPIE}
}