This is a Python implementation of the paper:
@Article{graphbased,
author = {Mario Parreno-Centeno and Guidantonio Malagoli Tagliazucchi and Maria Secrier},
title = {A deep learning and graph-based approach to characterise the immunological landscape and spatial architecture of colon cancer tissue},
journal = {arXiv preprint},
year = {2022},
}
- Introduction
- Predicting the Immunescore level and presence of cells
- Exploring our database
- Graph analysis
- Acknowledgements
- Contact
Tumour immunity is key for the prognosis and treatment of colon adenocarcinoma, but its characterisation remains cumbersome and expensive, requiring sequencing or other complex assays. Detecting tumour-infiltrating lymphocytes in haematoxylin and eosin (H&E) slides of cancer tissue would provide a cost-effective alternative to support clinicians in treatment decisions, but inter- and intra-observer variability can arise even amongst experienced pathologists when assessing a sample. Furthermore, the compounded effect of other cells in the tumour microenvironment is challenging to quantify but could yield useful additional biomarkers.
To predict the level of Immunescore and presence of cells, we used a model consisting of two parts: a convolutional neural network (CNN) feature extractor followed by a non-linear classifier. We based the feature extractor backbone on the InceptionV3 architecture.
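As a rough illustration of this two-part design, the sketch below wires an InceptionV3 backbone to a small non-linear head. It is not the code in train.py: the use of Keras, the function name `build_classifier`, the hidden layer size, the 299x299 input size and the ImageNet initialisation are all assumptions made for the example.

```python
def build_classifier(num_classes=2, hidden_units=256, image_size=299, weights="imagenet"):
    """Sketch: InceptionV3 feature extractor followed by a non-linear classifier.

    All default values are illustrative assumptions, not the paper's settings.
    """
    # Imported lazily so this file can be loaded without TensorFlow installed.
    import tensorflow as tf

    # Backbone without its original classification layer; global average
    # pooling reduces the last feature map to one feature vector per image.
    backbone = tf.keras.applications.InceptionV3(
        include_top=False,
        weights=weights,
        input_shape=(image_size, image_size, 3),
        pooling="avg",
    )

    # Inputs are expected to be scaled already, for example with
    # tf.keras.applications.inception_v3.preprocess_input.
    inputs = tf.keras.Input(shape=(image_size, image_size, 3))
    features = backbone(inputs)

    # Non-linear head: one hidden ReLU layer, then a softmax over the classes.
    hidden = tf.keras.layers.Dense(hidden_units, activation="relu")(features)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(hidden)
    return tf.keras.Model(inputs, outputs)
```

With two output classes this matches a dichotomised high/low setting; pass `weights=None` to skip downloading the pretrained weights.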
To train the model, first arrange the dichotomised dataset in a folder structure such as:
├── train
│   ├── high level class
│   └── low level class
└── test
    ├── high level class
    └── low level class
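The layout in the tree above can be checked or created with a short script before training. This is a minimal sketch: the helper names `missing_folders` and `create_layout` are hypothetical, and the class folder names are copied from the tree as placeholders that you would rename to match your data.

```python
from pathlib import Path

SPLITS = ("train", "test")
# Placeholder class names taken from the tree; rename them to match your data.
CLASSES = ("high level class", "low level class")


def missing_folders(root):
    """Return the expected split/class folders that do not exist under root."""
    root = Path(root)
    return [
        root / split / cls
        for split in SPLITS
        for cls in CLASSES
        if not (root / split / cls).is_dir()
    ]


def create_layout(root):
    """Create any missing split/class folders under root."""
    for folder in missing_folders(root):
        folder.mkdir(parents=True, exist_ok=True)
```

Calling `missing_folders("path/to/dataset")` returns an empty list when the dataset folder already has the expected structure.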
Then set the path to your folder structure in the train.py file and start the training by running:
python train.py
It will generate two graphs showing the learning curves and accuracy performance in the same folder as train.py. In addition, it will save the model after each epoch in a folder called weights.
You can explore our database following the next steps:
- Download a copy of our database here. It is a zip file that includes all the necessary files.
- You will need a version of the Neo4j software running on your computer. We have used the open-source version, Neo4j Community Edition.
- Ensure the system service is not running on your machine. In Linux:
sudo systemctl stop neo4j.service
- Unzip the graph.db.zip file into the appropriate database folder. By default in Linux: /data/database
- Restart the Neo4j system service:
sudo systemctl start neo4j.service
sudo systemctl enable neo4j.service
- Ensure Neo4j is running:
sudo systemctl status neo4j.service
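The unzip step in the list above can also be scripted. The sketch below uses only the Python standard library; the helper name `extract_database` is hypothetical, and both the archive location and the target folder are values you need to adapt to your own download and Neo4j installation.

```python
import zipfile
from pathlib import Path


def extract_database(archive, target_dir):
    """Unpack the database archive into target_dir and return the archive's entry names."""
    target_dir = Path(target_dir)
    # Create the target folder if it does not exist yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return zf.namelist()
```

For example, `extract_database("graph.db.zip", "/data/database")` unpacks the archive into the folder named in the steps above. Run it while the Neo4j service is stopped, and with sufficient permissions to write to the database folder.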
You can see a live demo in:
SupplementaryVideo1.mp4
Once you have an operational copy of our database, you can replicate the queries we conducted in the paper for our analysis by following the Jupyter notebook.
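Before running the paper's queries, it can help to confirm that the database is reachable from Python. The sketch below is not one of the paper's queries: it assumes the official `neo4j` Python driver is installed, and it runs a generic, schema-agnostic Cypher query that counts nodes per label. The function name `count_nodes_by_label` and the connection details are assumptions.

```python
# Schema-agnostic Cypher query: counts nodes for each combination of labels.
LABEL_COUNT_QUERY = (
    "MATCH (n) RETURN labels(n) AS labels, count(*) AS total ORDER BY total DESC"
)


def count_nodes_by_label(uri, user, password):
    """Return (labels, total) pairs for every label combination in the graph."""
    # Imported lazily so this file loads even when the driver is not installed.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            result = session.run(LABEL_COUNT_QUERY)
            return [(record["labels"], record["total"]) for record in result]
    finally:
        # Always release the connection pool, even if the query fails.
        driver.close()
```

A local installation usually listens on `bolt://localhost:7687`, so a call would look like `count_nodes_by_label("bolt://localhost:7687", "neo4j", "<your password>")`.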
- MPC was supported by an Academy of Medical Sciences Springboard Award (SBF004\1042).
- GMT was supported by a Wellcome Seed Award in Science (215296/Z/19/Z).
- MS was supported by a UKRI Future Leaders Fellowship (MR/T042184/1) and work in MS’s lab was supported by a BBSRC equipment grant (BB/R01356X/1) and a Wellcome Institutional Strategic Support Fund (204841/Z/16/Z).
Correspondence should be addressed to @msecrier. Feel free to contact me!