
Fully Automated Stain Quantification Framework for IHC Whole Slide Images in Breast Cancer

Tuo Yin, Frédéric Lifrange, Zoë Denis, Alex de Caluwé, Laurence Buisseret, Xavier Catteau, Clara Legros, Nick Reynaert, Jennifer Dhont

Framework Overview


Overview of the proposed H-scoring framework, consisting of three modules: (1) a tumor–stroma segmentation module (TSM), (2) a nuclei segmentation module (NSM), and (3) an H-score estimation module (HEM).
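For context, the H-score that the HEM estimates is conventionally defined as 1 × (% weakly stained tumor cells) + 2 × (% moderately stained) + 3 × (% strongly stained), yielding a value between 0 and 300. A minimal illustrative sketch of that formula (not the repository's HEM code):

def h_score(pct_weak: float, pct_moderate: float, pct_strong: float) -> float:
    # Standard H-score: percentages (0-100) of tumor cells at each
    # staining intensity, combined into a value in [0, 300].
    return 1 * pct_weak + 2 * pct_moderate + 3 * pct_strong

# Example: 20% weak, 30% moderate, 10% strong -> H-score 110.
print(h_score(20, 30, 10))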

Environment and Runtime

On two NVIDIA RTX A6000 GPUs, with the CUDA toolkit enabled and the dependencies listed in requirements.txt installed, the framework estimates the H-score of a gigapixel immunohistochemistry (IHC) whole-slide image (WSI) in approximately 2 minutes.
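Before running the pipeline, it may help to confirm that PyTorch can see both GPUs. The check below is generic and not a script shipped with this repository:

import torch

# Confirm CUDA is available and list the visible GPUs.
assert torch.cuda.is_available(), "CUDA toolkit not found"
print(f"GPUs detected: {torch.cuda.device_count()}")
for i in range(torch.cuda.device_count()):
    print(f"  cuda:{i} -> {torch.cuda.get_device_name(i)}")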

Framework Training

1. Dataset preparation

Organize your training data in the dataset folder as follows:

dataset/                       # All images are in patches of 512 × 512 pixels  
│── TSM                        # Tumor-stroma segmentation dataset
│   │── train                  # Training set (for training and validation)
│   │   │── patches_H          # H-stains of IHC patches
│   │   │── patches_ann_gray   # Annotations (pixel values 0–3 correspond to four classes)
│
│── NSM                        # Nuclei segmentation dataset
│   │── train
│   │   │── patches            # IHC patches
│   │   │── patches_H          # H-stains of IHC patches
│   │   │── nuclei_mask        # Annotations (pixel values 255 or 0 = nuclei or others)
│
│── HEM                        # H-score estimation dataset
│   │── train
│   │   │── patches_HEM        # Tumor cell mask & stroma nuclei mask (255 or 0 = target or others)
│   │   │── patch_labels.xlsx  # Columns: patch, patch_NH, patch_H, patch_N
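Before launching training, you can sanity-check that the expected folders and label columns are in place. The helper below is illustrative and not part of the repository; it assumes the layout shown above:

import os
import pandas as pd

def check_dataset(root: str) -> None:
    # Verify the dataset layout described above.
    expected = [
        "TSM/train/patches_H",
        "TSM/train/patches_ann_gray",
        "NSM/train/patches",
        "NSM/train/patches_H",
        "NSM/train/nuclei_mask",
        "HEM/train/patches_HEM",
    ]
    for d in expected:
        path = os.path.join(root, "dataset", d)
        assert os.path.isdir(path), f"missing folder: {path}"
    # patch_labels.xlsx must provide the four columns listed above.
    labels = pd.read_excel(os.path.join(root, "dataset", "HEM", "train", "patch_labels.xlsx"))
    missing = {"patch", "patch_NH", "patch_H", "patch_N"} - set(labels.columns)
    assert not missing, f"missing label columns: {missing}"

check_dataset("/path/to/project")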

2. Training

python main_train_models.py --root_dir "(root directory of the project)" --TSM_model "MoCo-SM" --NSM_model "Triple UNet" --HEM_model "VGG16Regression" --batch_size 8 --TSM_lr 0.0001 --NSM_lr 0.0001 --HEM_lr 0.001 --epochs 200

Note:
Choose the TSM model from MoCo-SM (best performance), MobileNetV3, UNet, UNet++, or DeepLabV3+.
Choose the HEM model from VGG16Regression (best performance), StainIntensityNet (computationally efficient), or RAM-CNN.

Framework Inference

1. Data preparation

The framework accepts WSIs in .ndpi (40× magnification; 0.25 µm per pixel) or .png (20× magnification; 0.5 µm per pixel) format.

Place WSIs in .ndpi format in the data/WSIs_ndpi folder.
Place WSIs in .png format in the data/WSIs_png folder, and comment out the following line in main_inference.py (Line 26):

convert_images_to_png_infolder(root_dir)
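For reference, such a conversion typically reads each .ndpi slide at pyramid level 1 (a 2× downsample of the 40× base level, i.e. roughly 20× / 0.5 µm per pixel) and writes a .png. The sketch below uses OpenSlide and is an assumption about the step, not the repository's actual convert_images_to_png_infolder:

import os
import openslide  # pip install openslide-python

def ndpi_to_png(ndpi_dir: str, png_dir: str) -> None:
    # Convert 40x .ndpi WSIs to 20x .png images (illustrative sketch;
    # for very large slides, a tiled read is preferable in practice).
    os.makedirs(png_dir, exist_ok=True)
    for name in os.listdir(ndpi_dir):
        if not name.endswith(".ndpi"):
            continue
        slide = openslide.OpenSlide(os.path.join(ndpi_dir, name))
        region = slide.read_region((0, 0), 1, slide.level_dimensions[1])
        region.convert("RGB").save(os.path.join(png_dir, name.replace(".ndpi", ".png")))
        slide.close()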

2. Inference

Either use:
(1) your own trained models (parameters are saved to the checkpoint folder during training), or
(2) the models trained on our internal dataset ([Download model weights]).

python main_inference.py --root_dir "/home/yin/pycharm/github_code" --TSM_model "MoCo-SM" --NSM_model "Triple UNet" --HEM_model "VGG16Regression" --Aggregation "Endtoend"

Note:
Choose the TSM and HEM models in the same way as during training.
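If you want to inspect trained weights outside main_inference.py, they load with a standard torch.load call. The filename below is an assumption; use whatever main_train_models.py wrote to your checkpoint folder:

import torch

# The checkpoint filename is an assumption; adjust to your own files.
state_dict = torch.load("checkpoint/TSM_MoCo-SM.pth", map_location="cpu")
print(f"{len(state_dict)} parameter tensors loaded")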

Key Results

In internal validation, the framework achieved a Spearman's rank correlation of ρ = 0.84 (95% confidence interval [CI]: 0.77–0.89) across 100 expert-annotated WSIs, outperforming a state-of-the-art method (ρ = 0.78, 95% CI: 0.68–0.85) and matching the inter-observer agreement between two expert pathologists (ρ = 0.84, 95% CI: 0.63–0.94).
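To reproduce this kind of agreement analysis on your own data, Spearman's ρ with a percentile-bootstrap 95% CI can be computed as below. This is a generic sketch with toy data, not the paper's evaluation script:

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Toy stand-ins for per-WSI H-scores; replace with real values.
pathologist = rng.uniform(0, 300, size=100)
framework = pathologist + rng.normal(0, 40, size=100)

rho, _ = spearmanr(pathologist, framework)

# Percentile bootstrap for the 95% confidence interval.
boot = []
for _ in range(2000):
    idx = rng.integers(0, len(pathologist), len(pathologist))
    boot.append(spearmanr(pathologist[idx], framework[idx])[0])
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"rho = {rho:.2f} (95% CI: {lo:.2f}-{hi:.2f})")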


Comparison of H-scores (a) and H-score rankings (b) of WSIs from the internal test set (MHC-I) as assessed by two pathologists and estimated by the proposed framework (green) and a state-of-the-art (SOTA) method (blue).

Usage and License Notices

All users are responsible for reviewing the output of the developed framework to determine whether it meets their needs, and for validating and evaluating the framework before any clinical use.

Citation

If you find the framework useful for your research and applications, please cite it using this BibTeX:

@article{yin2026fully,
  title={Fully Automated Stain Quantification Framework for IHC Whole Slide Images in Breast Cancer},
  author={Yin, Tuo and Lifrange, Fr{\'e}d{\'e}ric and Denis, Zo{\"e} and de Caluw{\'e}, Alex and Buisseret, Laurence and Catteau, Xavier and Legros, Clara and Reynaert, Nick and Dhont, Jennifer},
  journal={Technology in Cancer Research \& Treatment},
  volume={25},
  pages={15330338251407734},
  year={2026},
  publisher={SAGE Publications Sage CA: Los Angeles, CA}
}

About

This is the official PyTorch implementation of the paper titled "Fully Automated Stain Quantification Framework for IHC Whole Slide Images in Breast Cancer."
