Machine Learning Model for Skin Tumor Analysis and Classification.

antonioscardace/ISIC-2019-v2

Skin Tumors Classification

Personal Machine Learning Project
Antonio Scardace @ Dept of Math and Computer Science, University of Catania

Introduction

Inspired by the paper "Skin Lesion Analysis Toward Melanoma Detection 2018: A Challenge Hosted by the International Skin Imaging Collaboration (ISIC)" for the ISIC 2019 challenge, this project aims to analyse and classify skin lesions from dermoscopic images.

The dataset, sourced from Kaggle, comprises 25331 dermoscopic images, which I classify across 8 diagnostic categories:

  • Melanocytic nevus (NV) (50.83%)
  • Melanoma (MEL) (17.85%)
  • Basal cell carcinoma (BCC) (13.12%)
  • Benign keratosis (solar lentigo / seborrheic keratosis / lichen planus-like keratosis) (BKL) (10.36%)
  • Actinic keratosis (AK) (3.42%)
  • Squamous cell carcinoma (SCC) (2.48%)
  • Vascular lesion (VASC) (1.00%)
  • Dermatofibroma (DF) (0.94%)

Examples

Restricting the dataset to tumour images leaves 19080 images across 5 diagnostic categories. In this first version of the project, I aim to distinguish between Benign and Malignant Tumors. Here is the breakdown:

  • Melanocytic nevus (NV) - BENIGN (55.72%)
  • Dermatofibroma (DF) - BENIGN (1.23%)
  • Melanoma (MEL) - MALIGNANT (22.78%)
  • Basal cell carcinoma (BCC) - MALIGNANT (17.01%)
  • Squamous cell carcinoma (SCC) - MALIGNANT (3.26%)

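The grouping above can be sketched as a small lookup table. This is only an illustration and not necessarily how the notebooks implement it: the names `BINARY_LABEL`, `to_binary_label` and `select_tumours`, as well as the `(image_id, class_code)` record layout, are assumptions made for the example.

```python
from typing import Optional

# Benign/malignant grouping of the five tumour classes listed above.
BINARY_LABEL = {
    "NV": "BENIGN",
    "DF": "BENIGN",
    "MEL": "MALIGNANT",
    "BCC": "MALIGNANT",
    "SCC": "MALIGNANT",
}


def to_binary_label(class_code: str) -> Optional[str]:
    """Return BENIGN or MALIGNANT for a tumour class, None for any other class."""
    return BINARY_LABEL.get(class_code.upper())


def select_tumours(records):
    """Keep only tumour records and attach their binary label.

    `records` is a hypothetical iterable of (image_id, class_code) pairs.
    """
    selected = []
    for image_id, class_code in records:
        label = to_binary_label(class_code)
        if label is not None:  # BKL, AK and VASC are not in the table, so they are dropped
            selected.append((image_id, label))
    return selected
```

Classes outside the table map to `None`, so the same function both filters the dataset and relabels it.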
The dataset was split into a Training Set (85%) and a Test Set (15%), and Data Augmentation techniques were applied to the Training Set to improve model performance. The final model achieved an accuracy of 78.41% on the Test Set, with a precision of 88.57% for the most significant class (Malignant Tumor).
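For illustration, an 85/15 split could look like the sketch below. It assumes a stratified split (each class keeps roughly the same share in both sets) and a fixed random seed; the README does not say whether the actual split is stratified, and `stratified_split` is a hypothetical helper, not a function from this repository.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.15, seed=42):
    """Split (item, label) pairs into train and test lists, label by label.

    Assumption: stratifying by label is a choice made for this sketch.
    """
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append((item, label))

    rng = random.Random(seed)  # local generator, so the split is reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

Data augmentation would then be applied to `train` only, so that the Test Set stays untouched.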

Test Confusion Matrix
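As a reminder of how accuracy and precision are read from a binary confusion matrix, here is a minimal sketch with Malignant as the positive class. The function names are hypothetical, and the counts in the usage comment are toy values, not the project's results.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp: int, fp: int) -> float:
    """Share of positive (Malignant) predictions that are truly positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        return 0.0  # convention chosen for this sketch when nothing is predicted positive
    return tp / predicted_positive


# Toy counts for illustration only:
# tp=40 malignant predicted malignant, fp=10 benign predicted malignant,
# fn=20 malignant predicted benign,   tn=30 benign predicted benign.
toy_accuracy = accuracy(tp=40, fp=10, fn=20, tn=30)
toy_precision = precision(tp=40, fp=10)
```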

Getting Started

To clone the repository and run the project, you need:

  • Adequate GPU cores and RAM for computational tasks.
  • Free disk space of at least 10GB.

Then, follow these steps to get started:

   $ git clone https://github.com/antonioscardace/ISIC-2019-v2.git
   $ cd YOUR_PATH/ISIC-2019-v2/
   $ pip install -r requirements.txt
   $ mkdir models

Now, download the dataset from Kaggle and put the images in /data/images/.
You're all set! You can start working on the project using any of the available notebooks.
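Before opening a notebook, it can help to check that the images are where the project expects them. The sketch below counts the image files in a folder; `count_images` is a hypothetical helper, and both the relative path `data/images` in the usage comment and the accepted file extensions are assumptions.

```python
from pathlib import Path


def count_images(image_dir, suffixes=(".jpg", ".jpeg", ".png")):
    """Count files in `image_dir` whose extension looks like an image."""
    folder = Path(image_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"Image folder not found: {folder}")
    return sum(
        1
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in suffixes
    )


# Usage, run from the repository root (path is an assumption):
# print(count_images("data/images"))
```

If the whole dataset was copied, the count should be close to the 25331 images mentioned in the introduction.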