Heart attack prediction using machine learning algorithms and pre-processing data

leo8198/heart-attack-prediction

Heart-attack prediction using ML

This is a binary classification problem: the label is 1 if the patient had a heart attack and 0 otherwise. The dataset is the heart attack dataset from UC Irvine and contains 303 samples.

My final model achieved 81% accuracy.

Pre-processing

  • Exploratory Data Analysis (EDA)
  • Splitting the data into train and test sets
  • Removing null values
  • Removing outliers identified with boxplots
  • Normalizing the data
  • Principal component analysis (PCA)
  • Analysing the correlation matrix
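
A minimal sketch of how the split, outlier removal, normalization and PCA steps above might fit together, assuming scikit-learn and NumPy. The data is synthetic (303 rows, 13 made-up features) and only stands in for the real dataset. The 80/20 split, the 1.5 * IQR whisker rule and the 95% variance threshold are assumptions, not values taken from this project.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 303 samples, 13 numeric features.
rng = np.random.default_rng(0)
X = rng.normal(size=(303, 13))
y = rng.integers(0, 2, size=303)

# Split first so that no statistic from the test set leaks into training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Boxplot rule: keep training rows whose features all lie within
# [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
q1, q3 = np.percentile(X_train, [25, 75], axis=0)
iqr = q3 - q1
inliers = np.all(
    (X_train >= q1 - 1.5 * iqr) & (X_train <= q3 + 1.5 * iqr), axis=1
)
X_train, y_train = X_train[inliers], y_train[inliers]

# Fit the scaler and PCA on the training data only, then reuse them on the test data.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=0.95).fit(scaler.transform(X_train))
X_train_pca = pca.transform(scaler.transform(X_train))
X_test_pca = pca.transform(scaler.transform(X_test))

print(X_train_pca.shape, X_test_pca.shape)
```

Fitting the scaler and PCA on the training rows alone keeps the test set untouched until evaluation, which is the reason for splitting before the other steps.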

Machine learning algorithms

  • Dummy classifier (weak baseline)
  • Random Forest
  • XGBoost
  • Naive Bayes
  • Logistic Regression
  • Support Vector Machines
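
The comparison above could be sketched as follows, assuming scikit-learn. The data is synthetic, so the printed scores say nothing about the real dataset, and all hyperparameters are library defaults rather than the settings used in this project. XGBoost is left out to keep the sketch to one dependency; `xgboost.XGBClassifier` follows the same fit/score interface and could be added to the dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in data; not the real heart attack dataset.
X, y = make_classification(n_samples=303, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "dummy": DummyClassifier(strategy="most_frequent"),  # weak baseline
    "random_forest": RandomForestClassifier(random_state=0),
    "naive_bayes": GaussianNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on held-out data

# Print models from best to worst accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:20s} {acc:.3f}")
```

The dummy classifier always predicts the majority class, so any model worth keeping should score clearly above it.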

Metrics

The following metrics were used to evaluate the models:

  • Accuracy
  • ROC Curve
  • F1 Score

The best algorithm was Random Forest, with 79% accuracy.
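
As a reminder of what accuracy and the F1 score measure, here is a small dependency-free sketch. The labels are made up for illustration, and the helper names (`confusion_counts`, `accuracy`, `f1_score`) are hypothetical, not functions from this repository. The ROC curve is not covered here.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Made-up labels: 1 = heart attack, 0 = no heart attack.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

print(accuracy(y_true, y_pred))  # 0.75
print(f1_score(y_true, y_pred))  # 0.75
```

Unlike accuracy, F1 ignores true negatives, so it is more informative when missed heart attacks matter more than correctly identified healthy patients.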

Fine-tuning

  • Randomized grid search
  • Cross-validation

Fine-tuning improved the accuracy to 81%. Other papers also report around 80% accuracy.
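
A randomized search with cross-validation might look like the sketch below, assuming scikit-learn's `RandomizedSearchCV` on a Random Forest. The search space, the number of iterations and the 5-fold setting are illustrative assumptions, not the values used for the 81% result, and the data is synthetic.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data; not the real heart attack dataset.
X, y = make_classification(n_samples=303, n_features=13, random_state=0)

# Hypothetical search space: distributions are sampled, lists are picked from.
param_distributions = {
    "n_estimators": randint(50, 200),
    "max_depth": [None, 3, 5, 10],
    "min_samples_leaf": randint(1, 10),
}

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,            # number of sampled parameter combinations
    cv=5,                # 5-fold cross-validation for each combination
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))  # mean cross-validated accuracy
```

With only 303 samples, a cross-validated score is a steadier guide for choosing hyperparameters than a single train/test split.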

Possible improvements

  • Bin patient data such as age into ranges instead of using raw values
  • Test removing variables with 0.40 correlation
  • Apply log(x) to the "caa" variable to bring its distribution closer to normal
  • Try other algorithms such as CatBoost
  • Tune XGBoost hyperparameters
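
Two of the ideas above, binning age and log-transforming "caa", can be sketched in plain Python. The values below are made up, the 10-year bin width is an assumption, and the helper names are hypothetical. The sketch also swaps log(x) for log(1 + x), on the assumption that "caa" is a count that can be 0, where log(x) would be undefined.

```python
import math


def age_range(age, width=10):
    """Map a raw age to a range label such as '50-59'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def log_transform(x):
    """log(1 + x): behaves like log(x) for large x but is defined at x = 0."""
    return math.log1p(x)


ages = [29, 41, 57, 63, 77]  # made-up ages
caa = [0, 1, 2, 3]           # made-up "caa" values

print([age_range(a) for a in ages])
# ['20-29', '40-49', '50-59', '60-69', '70-79']
print([round(log_transform(v), 3) for v in caa])
```

Whether either transform helps would need to be checked against the cross-validated accuracy, since tree-based models such as Random Forest are largely insensitive to monotonic rescaling of a feature.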
