trunglap923/Chest-Cancer-Classification

Chest Cancer Classification

An end-to-end machine learning project that classifies chest CT scans as Normal or Adenocarcinoma Cancer (and potentially other classes). It demonstrates a complete MLOps pipeline built with TensorFlow/Keras, DVC (Data Version Control), MLflow, Flask, Docker, and GitHub Actions.

Live Demo

Check out the live application here: Chest Cancer Diagnostic Demo

Project Overview

This application takes a chest CT scan image as input and uses a VGG16-based deep learning model to predict the diagnosis, which is shown in a web interface. The project is structured to be modular, reproducible, and easy to deploy.
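
As a rough illustration of the prediction step, the sketch below turns a vector of class probabilities (such as a classifier's softmax output for one image) into a label. The label order and the name `decode_prediction` are assumptions for illustration and are not taken from the repository; the real class indices depend on how the training data was loaded.

```python
from typing import Sequence, Tuple

# Assumed label order; the real mapping depends on the training data loader.
CLASS_LABELS = ("Adenocarcinoma Cancer", "Normal")


def decode_prediction(
    probabilities: Sequence[float],
    labels: Sequence[str] = CLASS_LABELS,
) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    # Index of the largest probability (argmax without NumPy).
    best = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best], float(probabilities[best])


if __name__ == "__main__":
    # Made-up probabilities for a single image.
    label, confidence = decode_prediction([0.12, 0.88])
    print(f"{label} ({confidence:.2f})")
```

Returning the probability alongside the label lets the web interface show how confident the model is rather than only the class name.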

Project Structure

├── .github/workflows/   # CI/CD pipelines (GitHub Actions)
├── config/              # Configuration files
│   └── config.yaml      # Main config for data paths, model params
├── src/                 # Source code
│   └── cnnClassifier/   # Main package
│       ├── components/  # Core logic (Ingestion, Training, Evaluation)
│       ├── config/      # Configuration manager
│       ├── constants/   # Constant values
│       ├── entity/      # Data classes
│       ├── pipeline/    # Pipeline orchestration
│       └── utils/       # Utility functions
├── templates/           # HTML templates for Flask
├── artifacts/           # Generated artifacts (Data, Models - gitignored)
├── logs/                # Application & Training logs
├── app.py               # Flask Application Entry point
├── main.py              # Training Entry point
├── dvc.yaml             # DVC Pipeline definition
├── params.yaml          # Hyperparameters
├── requirements.txt     # Python dependencies
├── setup.py             # Package setup
├── Dockerfile           # Docker configuration
└── .dockerignore        # Docker ignore rules
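
The layout above separates raw configuration (config/config.yaml) from typed data classes (entity/) and the code that builds them (config/). A minimal sketch of that pattern follows. The class `DataIngestionConfig`, its fields, and the `data_ingestion` key are hypothetical names chosen for illustration, and a plain dictionary stands in for the parsed YAML file.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Mapping


@dataclass(frozen=True)
class DataIngestionConfig:
    """Hypothetical entity holding the settings for one pipeline stage."""

    root_dir: Path
    source_url: str
    local_data_file: Path


def build_data_ingestion_config(raw: Mapping[str, Any]) -> DataIngestionConfig:
    """Convert the parsed 'data_ingestion' section into a typed, immutable object."""
    section = raw["data_ingestion"]
    return DataIngestionConfig(
        root_dir=Path(section["root_dir"]),
        source_url=section["source_url"],
        local_data_file=Path(section["local_data_file"]),
    )


if __name__ == "__main__":
    # Stand-in for the dictionary a YAML loader would return for config.yaml.
    parsed = {
        "data_ingestion": {
            "root_dir": "artifacts/data_ingestion",
            "source_url": "https://example.com/data.zip",
            "local_data_file": "artifacts/data_ingestion/data.zip",
        }
    }
    print(build_data_ingestion_config(parsed))
```

Freezing the data class means a component cannot accidentally change its configuration halfway through a run.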

Prerequisites

  • Python 3.8+
  • Git
  • Docker (Optional, for containerization)
  • AWS Account (Optional, for deployment)

Installation

  1. Clone the repository:

    git clone https://github.com/trunglap923/Chest-Cancer-Classification.git
    cd Chest-Cancer-Classification
  2. Create and activate a virtual environment (Recommended):

    # Windows
    python -m venv .venv
    .venv\Scripts\activate
    
    # Linux/Mac
    python3 -m venv .venv
    source .venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt

Usage

Run Locally (Flask)

To start the web application on your local machine:

python app.py

Open your browser and navigate to http://localhost:8080.
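
To confirm from a script that the server is reachable, a small check such as the one below may help. It uses only the standard library; the function name `app_is_up` is an assumption for illustration, and the check only tests that the root URL answers with a success status, not that predictions work.

```python
import urllib.error
import urllib.request


def app_is_up(url: str = "http://localhost:8080", timeout: float = 3.0) -> bool:
    """Return True if the server answers the request with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("app reachable:", app_is_up())
```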

Run with Docker

  1. Build the image:
    docker build -t chest-cancer-app .
  2. Run the container:
    docker run -p 8080:8080 chest-cancer-app
    Access the app at http://localhost:8080.

Train the Model

This project uses DVC to manage the training pipeline (Ingestion -> Training -> Evaluation). To reproduce the pipeline, rerunning only the stages whose dependencies have changed:

dvc repro

Or run main.py directly to execute the stages without DVC caching:

python main.py
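
For orientation, a training entry point that runs the stages in order might look roughly like the sketch below. This is not the repository's main.py: the stage functions are empty placeholders, and `run_pipeline` is a hypothetical helper that runs each stage, logs its progress, and stops at the first failure.

```python
import logging
from typing import Callable, List, Tuple

logging.basicConfig(level=logging.INFO, format="[%(asctime)s] %(message)s")
logger = logging.getLogger("pipeline_sketch")

Stage = Tuple[str, Callable[[], None]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order, stop at the first failure, return the names completed."""
    completed: List[str] = []
    for name, stage in stages:
        logger.info(">>> stage %s started", name)
        try:
            stage()
        except Exception:
            # Log the traceback, then re-raise so the process exits non-zero.
            logger.exception("stage %s failed", name)
            raise
        logger.info(">>> stage %s completed", name)
        completed.append(name)
    return completed


# Placeholder stages; the real ones would call the project's components.
def ingest_data() -> None: ...
def train_model() -> None: ...
def evaluate_model() -> None: ...


if __name__ == "__main__":
    run_pipeline([
        ("Data Ingestion", ingest_data),
        ("Training", train_model),
        ("Evaluation", evaluate_model),
    ])
```

Unlike `dvc repro`, a script like this reruns every stage each time, which is simpler but slower when only a late stage has changed.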

CI/CD Deployment (AWS)

The project includes a GitHub Actions workflow (.github/workflows/main.yaml) to automate deployment to AWS EC2 using ECR.

Setup Steps:

  1. AWS Console:

    • Create an IAM user with the AmazonEC2ContainerRegistryFullAccess and AmazonEC2FullAccess policies attached.
    • Create an ECR Repository (e.g., chest-cancer-repo).
    • Launch an EC2 Instance (Ubuntu).
    • Install Docker on the EC2 instance.
  2. Self-Hosted Runner:

    • Go to GitHub Repo > Settings > Actions > Runners.
    • Follow instructions to install the runner on your EC2 instance.
  3. GitHub Secrets: Add the following secrets in GitHub Repo > Settings > Secrets and variables > Actions:

    • AWS_ACCESS_KEY_ID: Your IAM Access Key.
    • AWS_SECRET_ACCESS_KEY: Your IAM Secret Key.
    • AWS_REGION: e.g., us-east-1.
    • ECR_REPOSITORY_NAME: Name of your ECR repo.

Once configured, every push to the main branch will trigger the pipeline to build the Docker image, push it to ECR, and deploy it to your EC2 instance.
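
A missing secret usually shows up only as a failed deployment, so a quick pre-flight check can save a debugging round. The sketch below assumes the workflow exposes the four secrets to its steps as environment variables with the same names; that mapping and the name `missing_secrets` are assumptions for illustration, not something the repository is known to do.

```python
import os
from typing import List, Mapping

# Names taken from the secrets listed in the setup steps.
REQUIRED_SECRETS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "ECR_REPOSITORY_NAME",
)


def missing_secrets(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required names that are unset or empty in env."""
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        raise SystemExit("Missing configuration: " + ", ".join(missing))
    print("All required values are set.")
```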
