


tstran155/Multiclass-classification-of-breast-cancer-patients


Multiclass classification of breast cancer patients

Breast cancer is the most common cancer and the leading cause of cancer-related deaths in women. Cancer is typically associated with genetic abnormalities, and patients with the same disease stage and similar clinical characteristics may still differ in treatment response and overall survival. Accurate survival prediction is therefore important, both to guide treatment and to help avoid unnecessary surgical procedures.

In this repository, I built a gradient boosting classifier and a neural network model to classify breast cancer patients and predict their survival. The dataset used in this work includes 1,980 primary breast cancer samples and is available on Kaggle:

https://www.kaggle.com/datasets/raghadalharbi/breast-cancer-gene-expression-profiles-metabric

There are two notebooks in this repository, one for the gradient boosting classifier and one for the neural network. They share the same structure except for Sections 4 and 5.

  1. Prepare Problem

a) Load libraries

b) Load dataset

  2. Summarize Data

a) Descriptive statistics

b) Data visualizations

  3. Prepare Data

a) Data Cleaning

b) Feature Selection using SelectKBest

c) Split Dataset into Train and Test Sets

d) Data Transforms

  4. Evaluate Algorithms

a) Spot check algorithms (cross-validation technique)

  • Notebook 1: Standard algorithms (KNeighborsClassifier, DecisionTreeClassifier, GaussianNB, SVC) and ensemble algorithms (AdaBoostClassifier, GradientBoostingClassifier, RandomForestClassifier, ExtraTreesClassifier)

  • Notebook 2: Ensemble algorithm (XGBoost) and neural network algorithm (Keras Sequential)

b) Compare algorithms

  5. Improve Accuracy by Tuning Model's Parameters

  6. Finalize Model

a) Predictions on validation dataset (best algorithm)

b) Save model for later use

  7. Conclusions
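
To give a feel for the "Prepare Data" and "Spot check algorithms" steps, here is a minimal sketch using scikit-learn. It is an illustration only, not the code from the notebooks: the data are synthetic (generated with `make_classification` as a stand-in for the METABRIC table), and the score function `f_classif`, the value `k=10`, the 80/20 split, and the 5-fold setup are all assumptions. It also places scaling and `SelectKBest` inside a `Pipeline`, so that feature selection is refit on each cross-validation fold; the notebooks may order these steps differently.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class data (assumption: stands in for the real gene-expression table).
X, y = make_classification(
    n_samples=300, n_features=40, n_informative=8, n_classes=3, random_state=0
)

# Hold out a test set; stratify so every class appears in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# A few of the algorithms named in the outline, with default-like settings.
candidates = {
    "KNN": KNeighborsClassifier(),
    "CART": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
    "GBM": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}
for name, model in candidates.items():
    # Scaling and feature selection are fit on the training folds only.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("select", SelectKBest(score_func=f_classif, k=10)),  # k is illustrative
        ("model", model),
    ])
    scores = cross_val_score(pipe, X_train, y_train, cv=cv, scoring="accuracy")
    results[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")

# Pick the candidate with the highest mean cross-validated accuracy.
best_name = max(results, key=results.get)
print("Best in this spot check:", best_name)
```

The model chosen this way would then be tuned (Section 5) and evaluated once on the held-out `X_test` and `y_test` (Section 6).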
