Dermatoscopic Image Classification: Exploring the classification of cancerous & pre-cancerous lesions.

discourses/derma

Associated Repositories:

  • dermatology: An image data repository of the original and augmented images.
  • augmentation: This repository's package creates augmentations of the original images.

Tools

  • reader: In progress. The Docker image of this repository will be used to run containers that download data sets, and de-archive them if necessary, into a volume for use by this repository, the augmentation repository, and any other.

Associated Colab Notebook:



Derma

Note, a link to a Colab interface is upcoming. Colab offers access to GPU machines; the times per epoch are superb, hence prototyping is continuing within Colab.



Notes In Progress

This repository uses the wonderful continuous integration & delivery tool GitHub Actions. Hence, a variety of tests are conducted continuously. The badges below highlight the state of each repository branch with respect to its GitHub Actions workflows.

[Badges: branch state for develop and master; codebuild state for develop]

Brief Start Notes

Sometimes the models are run on an AWS machine via Docker images; the images are hosted under greyhypotheses @ Docker Hub.

An instance, i.e., container, of the image greyhypotheses/derma:importing serves dermatoscopic images to the deep learning model/s; the importing image will be replaced with reader.

# Import greyhypotheses/derma:importing from Docker Hub.
sudo docker pull greyhypotheses/derma:importing

# Running docker package greyhypotheses/derma:importing
sudo docker run -v ~/images:/app/images greyhypotheses/derma:importing

The feature extraction deep learning model

# Import greyhypotheses/derma:FeatureExtractionDL from Docker Hub.
sudo docker pull greyhypotheses/derma:FeatureExtractionDL

# Run the FeatureExtractionDL model. It requires one string argument; the string
# must be the URL of a YAML file of hyperparameters, e.g.,
# https://raw.githubusercontent.com/discourses/derma/develop/resources/hyperparameters/pattern.yml
# Below, $1 stands for that URL.
sudo docker run -v ~/images:/app/images -v ~/checkpoints:/app/checkpoints \
    greyhypotheses/derma:FeatureExtractionDL src/main.py $1
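The snippet below is a minimal sketch, not the repository's src/main.py, of how a program might check and read the single URL argument described above. The function names is_yaml_url and load_hyperparameters are hypothetical; the sketch assumes the hyperparameters file is plain YAML that PyYAML's safe_load can parse.

```python
import urllib.parse
import urllib.request


def is_yaml_url(text: str) -> bool:
    """True if text looks like an http(s) URL whose path ends in .yml or .yaml."""
    parts = urllib.parse.urlparse(text)
    return (parts.scheme in ("http", "https")
            and bool(parts.netloc)
            and parts.path.lower().endswith((".yml", ".yaml")))


def load_hyperparameters(url: str):
    """Download the YAML document at url and return the parsed content."""
    if not is_yaml_url(url):
        raise ValueError(f"Not a URL of a YAML file: {url}")

    # PyYAML is imported here so that the URL check above works without it.
    import yaml

    with urllib.request.urlopen(url, timeout=30) as response:
        return yaml.safe_load(response.read().decode("utf-8"))
```

Rejecting a malformed argument before any download keeps the failure close to its cause, i.e., the command line, rather than inside the model set-up.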

Technical Notes

  • Local operating system: Windows 7
  • Cloud test machine: GitHub Actions Ubuntu

Locally, the python environment was created via venv

>> python -m venv env

This virtual environment can be deleted via the command rm -r env (Cygwin). The environment is activated via

>> env\Scripts\activate.bat

within a Windows operating system; deactivated via the command env\Scripts\deactivate.bat. The command

>> env\Scripts\pip list

is used to list the set of directly & indirectly installed packages. Always remember to upgrade pip before populating the environment

>> python -m pip install --upgrade pip==21.3.1

The requirements document lists the directly installed packages and their versions, and a few indirectly installed packages. Thus far, the TensorFlow version used by this package/repository is TensorFlow 2.7.0

>> env\Scripts\pip install --upgrade tensorflow==2.7.0

The TensorFlow installation step also installs numpy & requests; the remaining packages are installed via

pip install --upgrade pandas
pip install --upgrade scikit-learn
pip install --upgrade pytest coverage pytest-cov pylint flake8
pip install --upgrade PyYAML
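After the installation steps, the installed versions can also be confirmed from within Python. The snippet below is a small sketch that uses only the standard library; the helper names python_version and installed_version are hypothetical, and importlib.metadata requires Python 3.8 or later.

```python
import sys
from importlib import metadata


def python_version() -> str:
    """The major.minor.micro version of the running interpreter."""
    return ".".join(str(part) for part in sys.version_info[:3])


def installed_version(package: str):
    """The installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("python", python_version())
for name in ("tensorflow", "pandas", "scikit-learn", "PyYAML"):
    # Prints None next to any package that is missing from the environment.
    print(name, installed_version(name))
```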

The Python version can be checked via python --version. Finally, the requirements document was/is created via

env\Scripts\pip freeze -r docs/filter.txt > requirements.txt

It is then edited by hand. Within it, the packages above the line ## The following requirements were added by pip freeze: are the directly installed packages; the packages below that line are the indirectly installed packages.
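The split described above can be sketched in a few lines of Python. This is an illustration only: split_requirements is a hypothetical helper, and the sample text and its version numbers, other than the TensorFlow pin, are made up for the example.

```python
MARKER = "## The following requirements were added by pip freeze:"


def split_requirements(text: str):
    """Return (direct, indirect) requirement lines, split at the pip freeze marker."""
    direct, indirect = [], []
    target = direct
    for line in text.splitlines():
        line = line.strip()
        if line == MARKER:
            # Everything after the marker was added by pip freeze.
            target = indirect
            continue
        if line and not line.startswith("#"):
            target.append(line)
    return direct, indirect


# Illustrative content, not the repository's requirements.txt
sample = "\n".join([
    "tensorflow==2.7.0",
    "pandas==1.3.4",
    MARKER,
    "numpy==1.21.4",
])
direct, indirect = split_requirements(sample)
print("direct:", direct)
print("indirect:", indirect)
```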



Automatic Identification of Skin Lesion Types

Via Dermoscopic Images of Cancerous/Pre-cancerous Skin Lesions


Problem Statement

The World Health Organisation lists cancer as the second leading cause of death globally; the 2018 death estimate is 9.6 million. And, early diagnosis or effective assessment is usually critical to effective treatment and survival. One common tool for early diagnosis, cancer precursor investigations, and/or tumour assessment is medical imaging. For example, magnetic resonance imaging for brain tumours, chest radiographs for investigating symptoms suggestive of lung cancer, mammography for breast cancer, etc. A challenge, as the mammography paper illustrates, is accurate interpretation of medical images.

This project is focused on image classification for cancer diagnostics; specifically, it is focused on the International Skin Imaging Collaboration’s dermoscopic images of skin lesions. The aim is the

Automatic classification of dermoscopic images according to 9 diagnostic classes: Melanoma, Melanocytic Nevus, Basal Cell Carcinoma, Actinic Keratosis, Benign Keratosis (solar lentigo / seborrheic keratosis / lichen planus-like keratosis), Dermatofibroma, Vascular Lesion, Squamous Cell Carcinoma, Unknown



Rationale

This project has been chosen as a precursor to applying Bayesian deep learning, amongst other Bayesian techniques, to diagnostic, prognostic, and pathogenetic challenges in medicine. Uncertainty is an inherent aspect of medical and health diagnostics, but deep learning methods that consider uncertainty, Bayesian deep learning methods being a key example, are rarely used due to the scalability challenges of such methods.

The first objectives of this project are to

  • Apply deep learning, amongst other methods, to the stated problem within an engineering design/prototype that is not constrained by scalability.

  • Investigate and apply interpretability options.

Note: non-Bayesian deep convolutional neural networks have been applied to skin cancer images.



The Data

As noted above, this project's modelling challenge is focused on the International Skin Imaging Collaboration’s (ISIC's) dermoscopic images of skin lesions. It is specifically using a subset of the images of the ISIC 2019 Challenge, i.e.,


| file | description | size |
|---|---|---|
| ISIC_2019_Training_Input.zip | 25,331 JPEG images of skin lesions | ~9GB |
| ISIC_2019_Training_Metadata.csv | 25,331 metadata entries of age, sex, general anatomic site, and common lesion identifier | 1.15MB |
| ISIC_2019_Training_GroundTruth.csv | 25,331 entries of gold standard lesion diagnoses | 1.23MB |

To ensure availability these three data files are also stored in a GitHub repository. The images are either the same as those hosted by the ISIC Archive API or down-sampled versions. Future modelling projects might involve re-visiting the original images of the ISIC Archive API. The API is documented at ISIC Archive API Documentation. The data set outlined below might be used if the ground truths are released in time.
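As an illustration of how the ground truth file relates to the nine diagnostic classes, the sketch below maps each image to a single class label. It assumes that ISIC_2019_Training_GroundTruth.csv has an image column followed by one indicator column per class, named MEL, NV, BCC, AK, BKL, DF, VASC, SCC, UNK; these names, the helper read_labels, and the sample rows are assumptions for the example, hence verify them against the downloaded file.

```python
import csv
import io

# Assumed indicator columns, in the order of the nine diagnostic classes.
CLASSES = ["MEL", "NV", "BCC", "AK", "BKL", "DF", "VASC", "SCC", "UNK"]


def read_labels(handle) -> dict:
    """Map each image identifier to the class whose indicator column is 1."""
    labels = {}
    for row in csv.DictReader(handle):
        active = [name for name in CLASSES if float(row[name]) == 1.0]
        if len(active) != 1:
            raise ValueError(
                f"{row['image']}: expected exactly one class, found {active}")
        labels[row["image"]] = active[0]
    return labels


# Two made-up rows in the assumed layout; not real ISIC records.
sample = io.StringIO(
    "image,MEL,NV,BCC,AK,BKL,DF,VASC,SCC,UNK\n"
    "example_0000001,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0\n"
    "example_0000002,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0\n"
)
print(read_labels(sample))
```

With the real file, the same function could be applied to an open file handle, e.g., open("ISIC_2019_Training_GroundTruth.csv", newline="").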



Preliminary Analysis of Metadata

A preliminary analysis of the metadata is hosted in the notebook preliminary.ipynb.



Copyright and Attribution

Details: https://challenge2019.isic-archive.com/data.html

The images and metadata of the "ISIC 2019: Training" data used herein are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC-BY-NC). The copyright holders are:


References

  1. P. Tschandl, C. Rosendahl, H. Kittler: The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions, Scientific Data, Volume 5, Article Number: 180161, 2018, doi:10.1038/sdata.2018.161
  2. Noel C. F. Codella, David Gutman, M. Emre Celebi, Brian Helba, Michael A. Marchetti, Stephen W. Dusza, Aadi Kalloo, Konstantinos Liopyris, Nabin Mishra, Harald Kittler, Allan Halpern: Skin Lesion Analysis Toward Melanoma Detection: A Challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), Hosted by the International Skin Imaging Collaboration (ISIC), 2018, arXiv:1710.05006
  3. Noel Codella, Veronica Rotemberg, Philipp Tschandl, M. Emre Celebi, Stephen Dusza, David Gutman, Brian Helba, Aadi Kalloo, Konstantinos Liopyris, Michael A. Marchetti, Harald Kittler, Allan Halpern: Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC), 2019, arXiv:1902.03368
  4. Marc Combalia, Noel C. F. Codella, Veronica Rotemberg, Brian Helba, Veronica Vilaplana, Ofer Reiter, Cristina Carrera, Alicia Barreiro, Allan C. Halpern, Susana Puig, Josep Malvehy: BCN20000: Dermoscopic Lesions in the Wild, 2019, arXiv:1908.02288
