WARNING: This is an experimental algorithm for diagnosing eye disease. It is not intended for practical or clinical use at this stage. The author is not responsible for any misdiagnoses as a result of misuse of this software.
This algorithm uses a novel technique I call CNN Detachment Ensembling, which is detailed in this paper. Essentially, a CNN is trained, then the ANN part on the end is discarded. The convolutional layers are used as a feature extraction phase and a number of other models are ensembled using the features extracted by the conv layers as training data. Basically, the NN in CNN is replaced with an ensemble of 10 different ML models for a minimal increase in computational power required, which boosted test set accuracy here from 70.8% to 78.6%.
Install conda
, then:
conda install python=3.7
conda create --name PythonGPU
conda activate PythonGPU
conda install -c anaconda numpy jupyter matplotlib seaborn pandas scikit-learn tensorflow-gpu cuda-toolkit=9.0 py-xgboost-gpu
pip3 install opencv-python #OR conda install -c menpo opencv
For Jupyter notebook (not necessary):
conda install notebook ipykernel
ipython kernel install --user --name=PythonGPU
jupyter notebook
Then in the notebook editor click kernel->change kernel->PythonGPU.
APTOS Kaggle competition:
- train_images (data/new), test_images, train.csv (data/new.csv), test.csv
Google Diabetic Retinopathy competition (resized by tanlovesmath)
- resized_train_cropped (data/old), trainLabels_cropped.csv (data/old.csv)
All data is preprocessed and stored in data/proc/{size \in {224,299}}/{class \in {0,1,2,3,4}}
Download data, preprocess, augment, train models, train stacked models, etc. This section may be slightly out of date.
setup.sh
to download data and generate filesystempython3 processing/preprocess.py
to preprocess itpython3 processing/corruption-checker.py
and delete all corrupt imagespython3 processing/augment.py
to augment it and generate datasetspython3 models/densenet.py
to train densenet121python3 models/mobilenet.py
to train mobilenetpython3 models/extraction/*.py
to extract features using all CNNspython3 models/ensemble/model-generation.py
to train other models stacked on CNN filterspython3 models/ensemble/model-selection.py
to choose best modelspython3 models/ensemble/model-evaluation.py
to compute predictions for train and test datapython3 models/ensemble/average.py
to compute ensemble outputspython3 models/ensemble/results-eda.py
to generate confusion matrices