Comprehensive analysis of lung cancer pathology images

Co-first authors: Shidan Wang, QBRC & Alyssa Chen

Contact: sdw95927@gmail.com

Scripts for the paper "Comprehensive analysis of lung cancer pathology images to discover tumor shape and boundary features that predict survival outcome" (https://www.nature.com/articles/s41598-018-27707-4).

Flowchart

Citation

Wang, S., Chen, A., Yang, L., Cai, L., Xie, Y., Fujimoto, J., ... & Xiao, G. (2018). Comprehensive analysis of lung cancer pathology images to discover tumor shape features that predict survival outcome. bioRxiv, 274332.

This repository includes

Requirements

  • Python 2

  • keras==2.0.5

  • tensorflow==1.2.1

  • Other commonly used Python libraries

  • R

  • survival==2.41-3

The pipeline

0) Annotate slides

Aperio ImageScope is used to annotate the pathology slides (.svs files) and generate the corresponding .xml files. "Tumor" and "normal" regions are outlined on each slide, and the training set image patches are extracted from these regions.
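As an illustration of how the annotation files might be read, the sketch below parses region outlines from an ImageScope-style .xml export with the Python standard library. The element names (Annotation, Region, Vertex) and the X/Y attributes follow the layout commonly seen in ImageScope exports, but storing the label in a Name attribute, the sample document, and the function name read_regions are assumptions made for this example, not taken from the repository's scripts.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed annotation file with a single outlined region.
SAMPLE_XML = """
<Annotations>
  <Annotation Id="1" Name="tumor">
    <Regions>
      <Region Id="1">
        <Vertices>
          <Vertex X="100" Y="200"/>
          <Vertex X="400" Y="200"/>
          <Vertex X="400" Y="500"/>
          <Vertex X="100" Y="500"/>
        </Vertices>
      </Region>
    </Regions>
  </Annotation>
</Annotations>
"""


def read_regions(xml_text):
    """Return a list of (annotation_name, [(x, y), ...]) tuples."""
    root = ET.fromstring(xml_text.strip())
    regions = []
    for annotation in root.iter("Annotation"):
        # Assumption: the label ("tumor"/"normal") is stored in the Name attribute.
        name = annotation.get("Name", "")
        for region in annotation.iter("Region"):
            vertices = [(float(v.get("X")), float(v.get("Y")))
                        for v in region.iter("Vertex")]
            regions.append((name, vertices))
    return regions


regions = read_regions(SAMPLE_XML)
```

For a real file, the same function could be given the contents of the .xml export read from disk; the vertex coordinates are in slide pixel units.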

1) Generate patches

In total, 2475 ROI, 2139 normal, and 730 white patches were generated. More training/testing samples can be generated by running ./scripts/1_generatePatches.py. Below are sample ROI, normal, and white patches, respectively.

roi normal white
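One way to picture patch generation is to tile an annotated region with a regular grid and keep the patches whose centre lies inside the outline. The sketch below does this in plain Python with a ray-casting point-in-polygon test. The patch size, the stride, the centre-inside rule, and both function names are assumptions for illustration; 1_generatePatches.py may select patches differently.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / float(y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def patch_origins(polygon, patch_size, stride):
    """Top-left corners of grid patches whose centre falls inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    origins = []
    y = int(min(ys))
    while y + patch_size <= max(ys):
        x = int(min(xs))
        while x + patch_size <= max(xs):
            cx = x + patch_size / 2.0
            cy = y + patch_size / 2.0
            if point_in_polygon(cx, cy, polygon):
                origins.append((x, y))
            x += stride
        y += stride
    return origins


# Hypothetical square "tumor" outline, 300 x 300 pixels.
square = [(100, 200), (400, 200), (400, 500), (100, 500)]
origins = patch_origins(square, patch_size=100, stride=100)
```

Each returned origin could then be passed to a slide reader to crop the corresponding pixels from the .svs file.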

2) Train the InceptionV3 model

These image patches are used to train an InceptionV3 model by running ./scripts/2_modelInception.py. Training curve:

training curve

3) Apply the model to the whole pathology slide through a sliding window

A tumor region heatmap for a pathology image can be generated using ./script/3_getHeatmap.py:

heatmap
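The sliding-window idea can be sketched in a few lines: move a fixed-size window across the image, score each window with the classifier, and store the scores in a grid. The example below is a dependency-free toy version. The function names, the window and stride values, and the mean-intensity stand-in for the trained InceptionV3 model are all assumptions for illustration and are not taken from 3_getHeatmap.py.

```python
def sliding_window_heatmap(image, window, stride, predict):
    """Score each window of a 2-D image (list of rows) with predict(patch)."""
    height, width = len(image), len(image[0])
    heatmap = []
    for top in range(0, height - window + 1, stride):
        row_scores = []
        for left in range(0, width - window + 1, stride):
            patch = [row[left:left + window]
                     for row in image[top:top + window]]
            row_scores.append(predict(patch))
        heatmap.append(row_scores)
    return heatmap


def mean_intensity(patch):
    """Stand-in for a trained classifier: mean pixel value of the patch."""
    values = [v for row in patch for v in row]
    return sum(values) / float(len(values))


# Toy 4x4 "slide": the right half is bright (pretend tumor), the left is dark.
toy_image = [[0, 0, 1, 1] for _ in range(4)]
heatmap = sliding_window_heatmap(toy_image, window=2, stride=2,
                                 predict=mean_intensity)
```

In a real run, predict would wrap the trained model and return a tumor probability for each patch, so each heatmap cell corresponds to one window position on the slide.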

4) Extract tumor shape features

Shape features of the predicted tumor regions are extracted by ./script/4_generateSlideProps.py.
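To give a feel for what a shape feature is, the sketch below computes two simple ones, area and a pixel-count perimeter, from a binary tumor mask. This README does not list which features 4_generateSlideProps.py computes, so the choice of features, the 4-neighbour boundary rule, and the function name shape_features are assumptions for illustration only.

```python
def shape_features(mask):
    """Area and boundary-pixel count of a binary mask given as a list of rows."""
    height, width = len(mask), len(mask[0])

    def is_foreground(r, c):
        # Pixels outside the image count as background.
        return 0 <= r < height and 0 <= c < width and mask[r][c] == 1

    area = 0
    perimeter = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1:
                continue
            area += 1
            # A foreground pixel is on the boundary if any 4-neighbour is background.
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if not all(is_foreground(nr, nc) for nr, nc in neighbours):
                perimeter += 1
    ratio = perimeter / float(area) if area else 0.0
    return {"area": area, "perimeter": perimeter, "perimeter_to_area": ratio}


# Hypothetical 3x3 tumor blob inside a 5x5 mask.
toy_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
features = shape_features(toy_mask)
```

Counting boundary pixels is a crude perimeter estimate; an image-analysis library would normally provide more accurate region measurements.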

5) Survival analysis

Survival analysis is performed by ./script/5_univariateAnalysisSlides.R and 6_coxph_model.R. Prediction performance in the TCGA validation dataset:

TCGA validation