To run this project locally, you need to have Python installed on your system.
- Python: Version 3.7 or higher is recommended.
- Python Libraries: Install the required libraries using pip. It's highly recommended to use a virtual environment.

      pip install Flask tensorflow tensorflow-keras numpy opencv-python werkzeug Pillow matplotlib seaborn scikit-learn missingno albumentations imbalanced-learn

  - Flask: To build the web application.
  - tensorflow, tensorflow-keras: For building, training, and loading the deep learning model.
  - numpy: For numerical operations.
  - opencv-python (cv2): For image processing.
  - werkzeug: Used by Flask for file uploads (secure_filename).
  - Pillow: For image handling (used by Keras preprocessing and PIL).
  - matplotlib, seaborn, missingno: For data exploration and visualization in the notebook.
  - scikit-learn: For data splitting, preprocessing (encoding, scaling), and evaluation metrics.
  - albumentations: For image augmentation (used in the notebook).
  - imbalanced-learn: For handling class imbalance (e.g., RandomOverSampler used in the notebook).
- Dataset: The HAM10000 dataset is required. It includes HAM10000_metadata.csv, hmnist_28_28_RGB.csv, and two folders containing image files (HAM10000_images_part_1, HAM10000_images_part_2). The dataset is commonly available on platforms like Kaggle. For local use, place HAM10000_metadata.csv, hmnist_28_28_RGB.csv, HAM10000_images_part_1, and HAM10000_images_part_2 in the project's root directory, matching the paths expected by the notebook.
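Before running the notebook, it can help to confirm that the dataset is where it is expected. The following is a minimal sketch, assuming the four names listed above sit directly in the project root; the function name missing_dataset_items is hypothetical and not part of the repository.

```python
from pathlib import Path

# Names taken from the dataset description above.
REQUIRED_FILES = ["HAM10000_metadata.csv", "hmnist_28_28_RGB.csv"]
REQUIRED_DIRS = ["HAM10000_images_part_1", "HAM10000_images_part_2"]


def missing_dataset_items(root="."):
    """Return the required dataset files/folders that are absent under root."""
    root = Path(root)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    missing += [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    return missing


if __name__ == "__main__":
    missing = missing_dataset_items()
    if missing:
        print("Missing dataset items:", ", ".join(missing))
    else:
        print("All HAM10000 files and folders found.")
```

Run it from the project's root directory; an empty result means the layout matches the description above.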
- Clone the repository:

      git clone https://github.com/fatimxox/Skin_Cancer_Classification_Web_Using_DeepLearning.git
      cd Skin_Cancer_Classification_Web_Using_DeepLearning

- Set up a virtual environment (recommended):

      python -m venv venv
      # On Windows
      .\venv\Scripts\activate
      # On macOS/Linux
      source venv/bin/activate

- Install dependencies: With your virtual environment activated, install the required libraries:

      # If you create a requirements.txt
      pip install -r requirements.txt
      # OR manually install if you don't have requirements.txt
      pip install Flask tensorflow tensorflow-keras numpy opencv-python werkzeug Pillow matplotlib seaborn scikit-learn missingno albumentations imbalanced-learn

  (You can generate a requirements.txt file after installing dependencies using pip freeze > requirements.txt.)
- Obtain the dataset: Download the HAM10000 dataset and place its files and folders in the project's root directory, following the structure expected by the notebook.
- Run the Notebook: Execute all cells in skin-cancer-classification.ipynb. The notebook loads the data, performs EDA and preprocessing (handling missing values, encoding and scaling the tabular data, and demonstrating image processing and augmentation), builds and trains the hybrid neural network model, evaluates its performance, and saves the trained model as best_model.keras and/or truetrue_model.h5. Make sure the model file loaded by app.py (model.h5) matches the one saved by the notebook (truetrue_model.h5 or best_model.keras). You might need to rename truetrue_model.h5 to model.h5 or adjust the loading path in app.py.
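One way to soften the file-name mismatch described in the last step is to look for whichever model file is actually present. This is only a sketch: the three file names come from the text above, while the search order and the helper name find_model_file are assumptions, not code from app.py.

```python
from pathlib import Path

# File names mentioned above; the search order is an assumption.
CANDIDATE_MODEL_FILES = ["model.h5", "truetrue_model.h5", "best_model.keras"]


def find_model_file(root=".", candidates=CANDIDATE_MODEL_FILES):
    """Return the first candidate model file that exists under root."""
    for name in candidates:
        path = Path(root) / name
        if path.is_file():
            return path
    raise FileNotFoundError(
        f"No model file found in {root!s}; tried: {', '.join(candidates)}"
    )
```

The returned path could then be handed to tensorflow.keras.models.load_model, so that renaming the saved file by hand is no longer necessary.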
- Data Analysis & Preprocessing (Notebook): The skin-cancer-classification.ipynb notebook explores the HAM10000 dataset, identifies missing values and class imbalances, and visualizes data distributions. It preprocesses both the tabular metadata (handling missing age values, encoding categorical features, scaling the numerical age) and the image data (demonstrating augmentation and reshaping), then splits the data into training, validation, and test sets.
- Hybrid Model Architecture: The notebook defines and trains a Keras model with two input branches: one for the image data (using Conv2D and MaxPooling2D layers followed by Flatten and Dense) and another for the tabular data (using Dense layers). These branches are concatenated, followed by additional Dense layers and a final 7-neuron output layer with softmax activation for classification.
- Model Training (Notebook): The model is compiled with the Adam optimizer and sparse categorical crossentropy loss. It is trained on the prepared training data, with validation performed on a separate set. The Early Stopping, Model Checkpoint (to save the best model), and ReduceLROnPlateau callbacks are used to control the training process. The trained model is then saved.
- Model Loading (Flask App): The app.py script loads the saved Keras model (model.h5) into memory when the Flask application starts.
- Preprocessing for Prediction (Flask App): app.py includes the functions preprocess_image and preprocess_user_inputs.
  - preprocess_image: Takes an image file object, reads it, converts it to RGB (if needed), resizes it to 28x28 (the image input size specified in the notebook's model architecture), applies normalization, and reshapes it to the model's expected image input shape (1, 28, 28, 3).
  - preprocess_user_inputs: Takes the age, sex, and localization inputs, applies scaling and encoding intended to match those used during training, and prepares them for the tabular input branch with shape (1, 4). Note: app.py scales age with StandardScaler, whereas the notebook uses MinMaxScaler; this is another inconsistency to clarify or align.
- Prediction Endpoint (/api/predict POST):
  - The Flask app receives a POST request containing the image file and form data (age, sex, localization).
  - It preprocesses the image and the user inputs using the dedicated functions.
  - It passes the preprocessed image and user inputs to the loaded Keras model's predict method (model.predict([processed_image, user_inputs])).
  - It processes the raw prediction probabilities, identifies the top 3 most likely classes, and retrieves relevant information (name, risk level, description, treatments) for each class using the get_skin_condition_info function.
  - A JSON response containing the top predictions and user inputs is returned to the frontend.
  - Error handling is implemented to manage potential failures during the process.
- Web Interface (HTML + JavaScript):
  - index.html: Provides a landing page with information about the project and skin cancer types.
  - predict.html: Contains the form for user inputs and image upload.
  - JavaScript (static/js/prediction.js) handles frontend interactions:
    - Populating location dropdowns.
    - Handling drag-and-drop/file selection for the image.
    - Displaying image preview.
    - Submitting form data (including the image) to the /api/predict endpoint using the Fetch API.
    - Displaying a loading spinner during processing.
    - Receiving the JSON response and populating the modal window with the prediction results (diagnosis, probability/progress bar, detailed analysis, recommendations).
    - Performing basic frontend validation (e.g., required fields).
  - CSS (static/styles/main.css, static/styles/predict.css) styles the web application for a modern look and feel, including the dark mode functionality handled by static/js/theme.js.
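The top-3 selection performed by the prediction endpoint can be illustrated in plain Python. This is a sketch rather than the code in app.py: the seven labels are the HAM10000 diagnosis codes, but their index order (alphabetical here) is an assumption that has to match the label encoding used during training, and top_k_predictions is a hypothetical helper name.

```python
# HAM10000 diagnosis codes. The index order is an assumption (alphabetical);
# it must match the label encoding used when the model was trained.
CLASS_LABELS = ["akiec", "bcc", "bkl", "df", "mel", "nv", "vasc"]


def top_k_predictions(probabilities, labels=CLASS_LABELS, k=3):
    """Return the k most probable classes, highest probability first."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per class label")
    ranked = sorted(zip(labels, probabilities), key=lambda pair: pair[1], reverse=True)
    return [{"label": label, "probability": float(p)} for label, p in ranked[:k]]


if __name__ == "__main__":
    # Made-up probabilities standing in for one row of model.predict(...).
    example = [0.02, 0.05, 0.10, 0.01, 0.25, 0.55, 0.02]
    for entry in top_k_predictions(example):
        print(f"{entry['label']}: {entry['probability']:.0%}")
```

Each returned entry could then be enriched with the name, risk level, description, and treatments before being serialized into the JSON response.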
The notebook skin-cancer-classification.ipynb provides detailed evaluation of the trained model, including:
- Accuracy: Reports accuracy on the training, validation, and test sets.
- Loss & Accuracy Plots: Visualizes the training history over epochs, showing how loss decreased and accuracy increased for both training and validation sets.
- Classification Report: Provides per-class metrics (Precision, Recall, F1-Score) and overall averages, indicating how well the model performs for each specific type of skin lesion.
- Confusion Matrix: A heatmap visualizing the true vs. predicted counts for each class, helping identify which classes the model confuses.
- Individual Metric Plots: Visualizes Precision, Recall, and F1-Score for each class individually, providing a clearer picture of performance across the imbalanced dataset.
These evaluations are crucial for understanding the model's strengths and weaknesses and estimating its performance on unseen data.
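To make the relationship between the confusion matrix and the per-class metrics concrete, here is a small plain-Python sketch of the arithmetic. It is an illustration only: the notebook's own evaluation code is not reproduced here, and the function name per_class_metrics is hypothetical.

```python
def per_class_metrics(confusion):
    """Compute precision, recall and F1 per class from a square confusion matrix.

    confusion[i][j] is the number of samples with true class i predicted as class j.
    """
    n = len(confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        predicted_c = sum(confusion[i][c] for i in range(n))  # column sum
        actual_c = sum(confusion[c])  # row sum
        precision = tp / predicted_c if predicted_c else 0.0
        recall = tp / actual_c if actual_c else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


if __name__ == "__main__":
    # Made-up 3-class counts, not results from the notebook.
    toy = [[50, 5, 0],
           [10, 20, 5],
           [0, 2, 8]]
    for index, row in enumerate(per_class_metrics(toy)):
        print(index, {name: round(value, 3) for name, value in row.items()})
```

On an imbalanced dataset, a rare class can show low recall even when overall accuracy is high, which is why the per-class view matters.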
- Start the Flask server: Follow the installation steps and run python app.py in your terminal. Ensure your virtual environment is activated and you are in the project's root directory.
- Open in browser: Navigate to http://127.0.0.1:5000/ (or the address specified by Flask) in your web browser.
- Navigate to Prediction Page: Click "Get Started" or navigate directly to /predict.
- Input Data: Enter the patient's age, select sex and lesion location from the dropdowns.
- Upload Image: Drag and drop an image of the skin lesion onto the designated upload area, or click the area to browse for the file. Ensure the file is an image (JPG, PNG, JPEG, GIF) and ideally below 16MB (configurable in app.py).
- Analyze: Click the "Analyze Image" button.
- View Results: A modal window will appear showing the analysis results, including the predicted diagnosis, a probability bar, a description of the condition, and recommended treatments.
- Dark Mode: Toggle the dark mode button in the navigation bar to switch themes.
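The upload constraints from the Upload Image step (accepted extensions and the 16MB limit) can be expressed as two small checks. This is a sketch under assumptions: the names ALLOWED_EXTENSIONS, MAX_UPLOAD_BYTES, allowed_file, and within_size_limit are hypothetical, and how app.py actually enforces these limits is not shown here.

```python
import os

# Limits taken from the upload step above; the names are hypothetical.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}
MAX_UPLOAD_BYTES = 16 * 1024 * 1024  # 16 MB


def allowed_file(filename):
    """True if the filename has one of the accepted image extensions."""
    _, ext = os.path.splitext(filename)
    return ext.lower().lstrip(".") in ALLOWED_EXTENSIONS


def within_size_limit(size_in_bytes, limit=MAX_UPLOAD_BYTES):
    """True if the upload is non-empty and does not exceed the limit."""
    return 0 < size_in_bytes <= limit
```

In a Flask application, a size cap like this is commonly enforced through the MAX_CONTENT_LENGTH configuration value, with the extension check applied to the uploaded file's name.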
- Align Preprocessing: Ensure the image preprocessing pipeline implemented in app.py exactly matches the final preprocessing applied to the data used for training the model in the notebook. Similarly, align the scaler used for age (MinMaxScaler vs StandardScaler).
- Display Confidence in UI: Include the confidence percentage for the predicted diagnosis in the web application's results modal.
- Multiple Image Inputs: Allow users to upload multiple images of the same lesion from different angles for potentially more robust predictions.
- User Feedback: Implement a system for users (especially medical professionals) to provide feedback on prediction accuracy to gather data for future model retraining.
- More Detailed Patient Data: Explore incorporating additional relevant patient metadata if available (e.g., family history, sun exposure habits) to potentially improve model performance.
- Explainability: Integrate techniques like Grad-CAM or LIME to visualize which parts of the image were most influential in the model's prediction.
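- Advanced Model Architecture: Experiment with transfer learning from more powerful pre-trained CNNs (e.g., VGG, ResNet, or EfficientNet variants) on the RGB images, or explore different ways to combine image and tabular features. The current 28x28x3 inputs would likely need to be upscaled, or the original higher-resolution images used instead, since pre-trained backbones generally expect larger inputs.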
- Improve UI/UX: Enhance the web interface design, add more interactive elements, and provide clearer visual feedback during loading and processing.
- Error Handling Refinements: Add more specific error messages for different types of input validation or processing failures.
- Production Deployment: Package the application for deployment on a web server using a production-ready WSGI server (e.g., Gunicorn or uWSGI).
- Dockerization: Create a Dockerfile to containerize the application.
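The Align Preprocessing item matters because the two scalers map the same age to different numbers. The sketch below reimplements min-max and standard scaling with the standard library only, using made-up training ages; it illustrates the mismatch and is not code from the notebook or app.py.

```python
from statistics import mean, pstdev


def min_max_scale(value, values):
    """Scale value to [0, 1] relative to the range of values (assumes max > min)."""
    lo, hi = min(values), max(values)
    return (value - lo) / (hi - lo)


def standard_scale(value, values):
    """Scale value to zero mean and unit variance (population standard deviation)."""
    return (value - mean(values)) / pstdev(values)


if __name__ == "__main__":
    training_ages = [20, 35, 50, 65, 80]  # made-up ages for illustration
    age = 65
    print("min-max :", min_max_scale(age, training_ages))
    print("standard:", standard_scale(age, training_ages))
```

A model trained on min-max scaled ages sees values between 0 and 1, so feeding it standard-scaled ages (which can be negative or greater than 1) at prediction time shifts the tabular input away from what it learned.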
This project is licensed under the MIT License. See the LICENSE file in the repository for details.
If you have any questions about the project, feel free to open an issue on the GitHub repository.