Cervical cancer (CC) is the fourth most common malignant tumor among women worldwide. Here, we propose a robust deep convolutional neural network model for cervical cancer screening.
- References:
- Xueguang Li, Mingyue Du et al. Deep convolutional neural networks for cervical cancer screening and diagnosis using active learning strategy (submitted)
- Install Labelme (https://github.com/wkentaro/labelme)
- Install and configure TensorFlow 1.14; tf_config.ipynb shows how to set up tensorflow_gpu-1.14 on a PC.
- The configuration was tested on PC (Windows 10) and Ubuntu (10.10) workstations. CPU: i7-960 @ 3.20GHz quad-core. Memory: 16GB. Graphics cards: GeForce GTX 1080 Ti 11GB and GTX 2080 Ti 11GB.
- Clone this GitHub repository to your local machine.
- Download or collect the testing data (see below).
- 500_tct_labeled_images (827M): 500 manually labelled TCT images (200X magnification), each containing at least one cancer cell
- 400 TCT whole slide images (WSI) (~800000 images) are available from the corresponding authors on reasonable request
- yang-211-model-1.tsv The cell type prediction results of T1 model for 211 cancer patients
- yin-189-model-T1.tsv The cell type prediction results of T1 model for 189 normal patients
- yang-211-model-A3.tsv The cell type prediction results of A3 model for 211 cancer patients
- yin-189-model-A3.tsv The cell type prediction results of A3 model for 189 normal patients
- coco model (245M) COCO model that can be used as the initial model
- T1_Model (171M) from the conventional training method (450/50)
- A1_Model (171M) from the first iteration of active learning
- A2_Model (171M) from the second iteration of active learning
- A3_Model (171M) from the third iteration of active learning
- hpv.py Main script to launch RCNN model training and prediction
- PR-Curve.py Script to calculate the precision and draw the PR Curves
- train.cfg Config file for training model
- predict.cfg Config file for predicting
- crossvalidation.py Python code to run 10x cross-validation for all ML models (Random forest, Logistic regression, SVM)
- patient_classifier.ipynb Python notebook to make ML models (Random forest, Logistic regression, SVM)
- xgbools.ipynb Python notebook to make XGBoost model.
- Download 500_tct_labeled_images to a directory (e.g. ./cc_tct_labeled_500_v1)
- Modify the config files train.cfg or predict.cfg as needed
python hpv.py train --config ./config/train.cfg
python hpv.py detect --config ./config/predict.cfg
- Split the training dataset into several (e.g. 3) directories
- Make a train.cfg and predict.cfg for each iteration
# Use the initial model (e.g. the coco model) for the first training step
python hpv.py train --config ./config/train_1.cfg
# Use the latest model from the first training step to run prediction
python hpv.py detect --config ./config/predict_1.cfg
# Use labelme to correct the predicted results manually
python hpv.py train --config ./config/train_2.cfg
# Use the latest model from the second training step to run prediction
python hpv.py detect --config ./config/predict_2.cfg
# Use labelme to correct the predicted results manually
....
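The dataset split that precedes the iterations above could be done with a small helper like the one below. This is only a sketch and is not part of this repository: the function name `split_dataset`, the `part_N` directory names, the list of image extensions, and the assumption that each image has a Labelme annotation saved next to it under the same file name with a `.json` extension are all hypothetical.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_dir, dst_root, n_parts=3, seed=0,
                  image_exts=(".jpg", ".jpeg", ".png")):
    """Copy images into n_parts sub-directories of roughly equal size.

    Assumption: a Labelme annotation, if present, sits next to the image
    and shares its file name with a .json extension.
    """
    src = Path(src_dir)
    images = sorted(p for p in src.iterdir() if p.suffix.lower() in image_exts)
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(images)

    parts = []
    for i in range(n_parts):
        part_dir = Path(dst_root) / f"part_{i + 1}"  # hypothetical naming
        part_dir.mkdir(parents=True, exist_ok=True)
        # Round-robin slicing keeps the part sizes within one image of each other.
        for img in images[i::n_parts]:
            shutil.copy2(img, part_dir / img.name)
            ann = img.with_suffix(".json")
            if ann.exists():
                shutil.copy2(ann, part_dir / ann.name)
        parts.append(part_dir)
    return parts
```

For example, `split_dataset("./cc_tct_labeled_500_v1", "./al_parts", n_parts=3)` would create `./al_parts/part_1` to `./al_parts/part_3`; the training and prediction paths in each iteration's config file would then need to point at the matching directory.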
- Run PR-Curve.py to calculate the precision of the model using test data
- Run prediction using TCT whole slide images (WSI) to get cell classification results such as yang-211-model-1.tsv
- Run patient_classifier.ipynb and xgbools.ipynb to do patient classification.
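For orientation, the points of a precision-recall curve can be derived from scored detections as sketched below. This is not the code in PR-Curve.py, and the function name `precision_recall_points` is hypothetical. The sketch assumes each detection has already been matched against the ground truth (for example with an overlap threshold) and marked as a true or false positive.

```python
def precision_recall_points(detections, n_ground_truth):
    """Return (recall, precision) pairs, one per detection, ranked by confidence.

    detections: iterable of (confidence, is_true_positive) pairs.
    n_ground_truth: total number of labelled objects in the test data.
    """
    if n_ground_truth <= 0:
        raise ValueError("n_ground_truth must be positive")

    # Walk down the detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    points = []
    tp = fp = 0
    for _score, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recall = tp / n_ground_truth
        precision = tp / (tp + fp)
        points.append((recall, precision))
    return points


if __name__ == "__main__":
    example = [(0.9, True), (0.8, False), (0.7, True), (0.6, True)]
    for recall, precision in precision_recall_points(example, n_ground_truth=4):
        print(f"recall={recall:.2f} precision={precision:.2f}")
```

Lowering the confidence cut-off moves along the curve: recall can only rise, while precision falls each time a false positive is admitted.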
Copyright (c) 2020 Quanyuan He Ph.D.
Contact: Dr. Quanyuan He, Dr. Junhua Zhou
Released under GPLv3. See license for details.
This software is supplied 'as is' without any warranty or guarantee of support. The developers are not responsible for its use, misuse, or functionality. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability arising from, out of, or in connection with this software.