A deep learning system for classifying skin lesions, trained on the HAM10000 dataset. It combines transfer learning (EfficientNetB0 and ResNet50), a custom CNN, data augmentation, and model ensembling.
The system assigns each lesion image to one of the HAM10000 diagnostic categories, which cover both malignant and benign lesions, and is intended as an aid for early detection and diagnosis.
- Multi-class classification of skin lesions
- Advanced data augmentation techniques
- Transfer learning with pre-trained models
- Ensemble learning approach
- Real-time prediction capabilities
- Comprehensive model evaluation metrics
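
To make the multi-class setup concrete, the sketch below lists the seven diagnostic codes from the HAM10000 metadata and shows one way to turn a probability vector into a readable label. The alphabetical index order and the `decode_prediction` helper are assumptions for illustration; the encoding used by this project's own code may differ.

```python
# The seven diagnostic codes found in the HAM10000 metadata ("dx" column).
CLASS_NAMES = {
    "akiec": "Actinic keratoses / intraepithelial carcinoma",
    "bcc": "Basal cell carcinoma",
    "bkl": "Benign keratosis-like lesions",
    "df": "Dermatofibroma",
    "mel": "Melanoma",
    "nv": "Melanocytic nevi",
    "vasc": "Vascular lesions",
}

# Assumed index order (alphabetical); the project's label encoding may differ.
CLASS_CODES = sorted(CLASS_NAMES)
CODE_TO_INDEX = {code: index for index, code in enumerate(CLASS_CODES)}


def decode_prediction(probabilities):
    """Return (code, full name, probability) for the most likely class."""
    if len(probabilities) != len(CLASS_CODES):
        raise ValueError(f"expected {len(CLASS_CODES)} probabilities")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    code = CLASS_CODES[best]
    return code, CLASS_NAMES[code], probabilities[best]
```
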
- Python 3.8+
- TensorFlow 2.x
- Keras
- OpenCV
- NumPy
- Pandas
- Scikit-learn
- EfficientNetB0 (Base model)
- ResNet50 (Ensemble model)
- Custom CNN Architecture
- Albumentations for advanced image augmentation
- PIL for image processing
- Matplotlib for visualization
skin_cancer_detection/
├── src/
│   ├── data_loader.py              # Data loading and preprocessing
│   ├── model.py                    # Model architecture definitions
│   ├── prepare_data.py             # Data preparation and augmentation
│   ├── train.py                    # Training pipeline
│   └── utils.py                    # Utility functions
├── notebooks/
│   └── skin_cancer_analysis.ipynb  # Analysis and visualization
├── data/                           # Dataset directory (not tracked in git)
├── models/                         # Saved model checkpoints
├── requirements.txt                # Project dependencies
└── README.md                       # Project documentation
- Image resizing and normalization
- Advanced data augmentation:
  - Random rotations and flips
  - Color jittering
  - Random brightness/contrast
  - Elastic transformations
  - Cutout augmentation
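
Three of the preprocessing steps above (normalization, random flips, and cutout) can be sketched in plain NumPy as follows. This is an illustrative sketch, not the code in `src/prepare_data.py`: the function names, the flip probability of 0.5, and the default cutout size are all assumptions, and resizing is left out because it would normally be done with OpenCV or PIL.

```python
import numpy as np


def normalize(image):
    """Scale uint8 pixel values to float32 in the range [0, 1]."""
    return image.astype(np.float32) / 255.0


def random_flip(image, rng):
    """Flip horizontally and/or vertically, each with probability 0.5."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    if rng.random() < 0.5:
        image = image[::-1, :]
    return image


def cutout(image, size, rng):
    """Zero out one randomly centered square, clipped at the image border."""
    height, width = image.shape[:2]
    center_y = int(rng.integers(0, height))
    center_x = int(rng.integers(0, width))
    y0, y1 = max(0, center_y - size // 2), min(height, center_y + size // 2)
    x0, x1 = max(0, center_x - size // 2), min(width, center_x + size // 2)
    out = image.copy()  # leave the caller's array untouched
    out[y0:y1, x0:x1] = 0
    return out


def augment(image, rng, cutout_size=32):
    """Normalize, randomly flip, then apply cutout to an H x W x C image."""
    return cutout(random_flip(normalize(image), rng), cutout_size, rng)
```

Passing a seeded generator such as `np.random.default_rng(0)` as `rng` makes the augmentation reproducible between runs.
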
- Base model: EfficientNetB0 with custom head
- Ensemble model: ResNet50
- Custom CNN with:
  - Batch normalization
  - Dropout layers
  - Global average pooling
  - Dense layers with ReLU activation
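
One common way to combine several models such as those listed above is soft voting, where the per-class probabilities of each model are averaged. The sketch below assumes that approach; the README does not state how this project merges its models, and `soft_vote` and its `weights` parameter are hypothetical names.

```python
def soft_vote(model_probabilities, weights=None):
    """Average per-class probabilities from several models.

    model_probabilities: one probability vector per model, all the same length.
    weights: optional per-model weights; defaults to an unweighted mean.
    """
    if not model_probabilities:
        raise ValueError("need at least one model output")
    n_classes = len(model_probabilities[0])
    if any(len(p) != n_classes for p in model_probabilities):
        raise ValueError("all models must predict the same number of classes")
    if weights is None:
        weights = [1.0] * len(model_probabilities)
    if len(weights) != len(model_probabilities):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    return [
        sum(w * p[c] for w, p in zip(weights, model_probabilities)) / total
        for c in range(n_classes)
    ]
```

For example, `soft_vote([efficientnet_probs, resnet_probs, cnn_probs])` would return a single probability vector whose largest entry is the ensemble's predicted class.
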
- Transfer learning with fine-tuning
- Learning rate scheduling
- Early stopping
- Model checkpointing
- Cross-validation
- Mixed precision training
- Gradient clipping
- Learning rate warmup
- Model ensembling
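
As an illustration of learning rate warmup combined with scheduling, the sketch below ramps the rate up linearly and then decays it along a cosine curve. The cosine shape and every default value here are assumptions chosen for the example; the schedule actually used in `src/train.py` is not described in this README.

```python
import math


def learning_rate(step, base_lr=1e-3, warmup_steps=500,
                  total_steps=10_000, min_lr=1e-6):
    """Linear warmup to base_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        # Ramp up from base_lr / warmup_steps to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, capped at 1.0.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

A function with this shape can be called once per training step, or wrapped in a framework callback, to set the optimizer's learning rate.
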
The system achieves:
- Training accuracy: ~95%
- Validation accuracy: ~92%
- Test accuracy: ~90%
- F1-score: ~0.89
- Precision: ~0.91
- Recall: ~0.88
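
For reference, precision, recall, and F1 for a multi-class problem can be computed per class and then averaged. The sketch below uses macro averaging (an unweighted mean over classes); whether the figures above are macro or weighted averages is not stated, so treat that choice, and the `macro_scores` name, as assumptions.

```python
def macro_scores(y_true, y_pred, n_classes):
    """Return macro-averaged (precision, recall, f1) over n_classes labels."""
    precisions, recalls, f1_scores = [], [], []
    pairs = list(zip(y_true, y_pred))
    for c in range(n_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Undefined ratios (no predictions or no samples) are scored as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)
    return (
        sum(precisions) / n_classes,
        sum(recalls) / n_classes,
        sum(f1_scores) / n_classes,
    )
```

Because HAM10000 is heavily imbalanced toward melanocytic nevi, macro-averaged scores can differ noticeably from plain accuracy, which is one reason to report both.
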
- Clone the repository:

      git clone https://github.com/kailhashed/Skincancer_detection.git
      cd Skincancer_detection

- Create and activate virtual environment:

      python -m venv venv
      source venv/bin/activate # On Windows: venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Download the HAM10000 dataset and place it in the `data/` directory
- Prepare the data:

      python src/prepare_data.py

- Train the model:

      python src/train.py

- Use the trained model for predictions:

      from src.model import load_model
      from src.utils import preprocess_image
      model = load_model('models/best_model.h5')
      prediction = model.predict(preprocess_image('path_to_image.jpg'))

Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- HAM10000 dataset creators and contributors
- TensorFlow and Keras teams
- Open source community for various tools and libraries