GeorgeBatch/dependency-mil

Accurate Subtyping of Lung Cancers by Modelling Class Dependencies - Accepted to ISBI 2024

This code builds upon the https://github.com/binli123/dsmil-wsi repository, so the organisation of the datasets folder is largely inherited from it. For faster computation, the csv features were converted into hdf5 and pt files, as in https://github.com/mahmoodlab/CLAM.

Creation of the Multi-label Dataset

Source files used to make the labels

Dummy label files

Columns include the label (LUAD vs LUSC) and paths to features:

  • features_csv_file_path
  • h5_file_path
  • pt_file_path

The label is encoded as an integer using the following mapping:

mapping = {
    "LUAD": 0,
    "LUSC": 1,
}
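
As a rough illustration of the dummy label file layout described here, the sketch below builds rows with the label and the three path columns and writes them as csv. The `slide_id` column, the directory layout under `features_root`, and the helper names `make_row` and `write_label_file` are assumptions made for this example; they are not taken from the repository.

```python
import csv
import io

# Label encoding from the README.
mapping = {"LUAD": 0, "LUSC": 1}

# "label" and the three path columns come from the README;
# "slide_id" is a hypothetical identifier column added for illustration.
COLUMNS = [
    "slide_id",
    "label",
    "features_csv_file_path",
    "h5_file_path",
    "pt_file_path",
]


def make_row(slide_id, subtype, features_root):
    """Build one label-file row; the directory layout is an assumption."""
    return {
        "slide_id": slide_id,
        "label": mapping[subtype],
        "features_csv_file_path": f"{features_root}/csv/{slide_id}.csv",
        "h5_file_path": f"{features_root}/h5/{slide_id}.h5",
        "pt_file_path": f"{features_root}/pt/{slide_id}.pt",
    }


def write_label_file(rows, handle):
    """Write the rows as csv with a header line to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Made-up slide identifiers and paths, for illustration only.
rows = [
    make_row("slide_a", "LUAD", "features/example"),
    make_row("slide_b", "LUSC", "features/example"),
]
buffer = io.StringIO()
write_label_file(rows, buffer)
print(buffer.getvalue())
```

Keeping all three paths in one row lets downstream code pick whichever feature format it needs without rebuilding file names.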

DHMC has only LUAD slides, so all entries in the label field are 0.

TCGA has both LUAD and LUSC slides, so entries in the label field include 0 and 1.
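
A quick sanity check of these label values could look like the sketch below. It assumes each dummy label file is a csv with a `label` column; the helper names and the in-memory file contents are hypothetical and only stand in for the real files.

```python
import csv
import io

# Expected label values per dataset, as stated above (LUAD = 0, LUSC = 1).
EXPECTED_LABELS = {"DHMC": {0}, "TCGA": {0, 1}}


def observed_labels(handle):
    """Return the set of integer values found in the 'label' column."""
    return {int(row["label"]) for row in csv.DictReader(handle)}


def check_label_file(dataset, handle):
    """Raise ValueError if the file holds labels not expected for the dataset."""
    unexpected = observed_labels(handle) - EXPECTED_LABELS[dataset]
    if unexpected:
        raise ValueError(f"{dataset}: unexpected labels {sorted(unexpected)}")


# Tiny in-memory stand-ins for the real files (contents are made up).
dhmc_csv = "label,pt_file_path\n0,a.pt\n0,b.pt\n"
tcga_csv = "label,pt_file_path\n0,c.pt\n1,d.pt\n"

check_label_file("DHMC", io.StringIO(dhmc_csv))
check_label_file("TCGA", io.StringIO(tcga_csv))
```

The check only flags values outside the expected set; it does not require that a TCGA file contains both classes.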

Run the creation code

Run the labels creation code notebook. The code will create the files in labels/experiment-label-files/.

Note that the combined dataset for training/validation is not the same as in the paper because the in-house DART dataset is not publicly available. The test set, however, is the same as in the paper and is fully available for both the 8-label task and the 5-label task.

Acknowledgements

George Batchkala is supported by Fergus Gleeson and the EPSRC Center for Doctoral Training in Health Data Science (EP/S02428X/1). The work was done as part of the DART Lung Health Program (UKRI grant 40255).

The computational aspects of this research were supported by the Wellcome Trust Core Award Grant Number 203141/Z/16/Z and the NIHR Oxford BRC. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health.