Heart-attack prediction using ML

Heart attack prediction using machine learning algorithms and data pre-processing.

This is a binary classification problem: label 1 means the patient had a heart attack and label 0 means the patient did not. The dataset is the heart attack dataset from UC Irvine and contains 303 samples.

My final model achieved 81% accuracy.

Pre-processing

Exploratory Data Analysis (EDA)
Splitting the data into train and test sets
Removing null values
Removing outliers identified with boxplots
Normalizing data
PCA
Analysing the correlation matrix
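
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from study.ipynb: it assumes scikit-learn and NumPy, uses synthetic data from `make_classification` as a stand-in for the real dataset, and the 13 features, the 80/20 split, the 1.5 * IQR outlier rule and the 5 PCA components are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 303-sample dataset (assumption: the real
# feature names and values are not reproduced here).
X, y = make_classification(n_samples=303, n_features=13, n_informative=6,
                           random_state=0)

# Inject a few missing values so the null-removal step has work to do.
rng = np.random.default_rng(0)
X[rng.choice(len(X), size=5, replace=False), 0] = np.nan

# 1. Split into train and test data, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)


def drop_nulls(features, labels):
    """Remove every row that contains at least one missing value."""
    keep = ~np.isnan(features).any(axis=1)
    return features[keep], labels[keep]


# 2. Remove null values from both sets.
X_train, y_train = drop_nulls(X_train, y_train)
X_test, y_test = drop_nulls(X_test, y_test)

# 3. Remove training outliers with the 1.5 * IQR rule, the same rule a
#    boxplot uses to place its whiskers.
q1, q3 = np.percentile(X_train, [25, 75], axis=0)
iqr = q3 - q1
inlier = ((X_train >= q1 - 1.5 * iqr) &
          (X_train <= q3 + 1.5 * iqr)).all(axis=1)
X_train, y_train = X_train[inlier], y_train[inlier]

# 4. Normalize: fit the scaler on train only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# 5. PCA: cumulative share of variance explained by the first components.
pca = PCA(n_components=5).fit(X_train_s)
explained = pca.explained_variance_ratio_.cumsum()

# 6. Correlation matrix of the scaled training features.
corr = np.corrcoef(X_train_s, rowvar=False)
print(explained.round(2))
print(corr.shape)
```

Fitting the scaler and the outlier thresholds on the training set only keeps information from the test set out of the model, which is one reason to split before the other steps.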

Machine learning algorithms

Dummy classifier (weak baseline)
Random Forest
XGBoost
Naive Bayes
Logistic Regression
Support Vector Machines
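
A comparison like this one can be set up with a simple loop over estimators. The sketch below is an assumption about how such a loop might look, not the notebook's actual code: it uses synthetic data, default hyperparameters, and leaves XGBoost out so that it only depends on scikit-learn (the `XGBClassifier` from the `xgboost` package exposes the same `fit`/`score` interface and could be added to the dictionary).

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real dataset (assumption).
X, y = make_classification(n_samples=303, n_features=13, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Scale-sensitive models get a StandardScaler in front of them.
models = {
    "Dummy (weak baseline)": DummyClassifier(strategy="most_frequent"),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
}

# Fit each model and record its accuracy on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name:24s} accuracy = {scores[name]:.2f}")
```

The dummy classifier always predicts the most frequent class, so any real model should beat its score before it is worth tuning.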

Metrics

The following metrics were used to analyse the models' results:

Accuracy
ROC Curve
F1 Score

The best algorithm was Random Forest, with 79% accuracy.
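
As a small worked example of the three metrics, the sketch below computes them with `sklearn.metrics` on eight made-up patients. The labels, the predicted probabilities and the 0.5 decision threshold are hypothetical values chosen for illustration, not results from this project.

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, roc_curve

# Hypothetical true labels and predicted heart-attack probabilities.
y_true = [0, 0, 1, 1, 0, 1, 1, 0]
y_prob = [0.1, 0.4, 0.8, 0.35, 0.2, 0.9, 0.7, 0.6]

# Turn probabilities into class predictions with a 0.5 threshold.
y_pred = [int(p >= 0.5) for p in y_prob]

accuracy = accuracy_score(y_true, y_pred)   # 6 of 8 correct
f1 = f1_score(y_true, y_pred)               # precision 3/4, recall 3/4
auc = roc_auc_score(y_true, y_prob)         # 14 of 16 pairs ranked correctly

# Points of the ROC curve, one per candidate threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_prob)

print(accuracy)  # 0.75
print(f1)        # 0.75
print(auc)       # 0.875
```

Accuracy and F1 depend on the chosen threshold, while the ROC curve and its area summarize the ranking quality across all thresholds.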

Fine-tuning

Randomized grid search
Cross-validation

Fine-tuning improved the accuracy from 79% to 81%. Other papers also report around 80% accuracy.
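
A randomized search with cross-validation can be sketched with scikit-learn's `RandomizedSearchCV`. The search space, the 10 sampled candidates and the 5 folds below are assumptions for illustration, and the data is synthetic, so the printed scores say nothing about the 81% reported here.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the real dataset (assumption).
X, y = make_classification(n_samples=303, n_features=13, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Hypothetical search space: distributions are sampled, lists are drawn
# from uniformly.
param_distributions = {
    "n_estimators": randint(50, 200),
    "max_depth": [None, 3, 5, 10],
    "min_samples_leaf": randint(1, 10),
    "max_features": ["sqrt", "log2"],
}

# Try 10 random combinations, scoring each with 5-fold cross-validation
# on the training set only.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=0,
)
search.fit(X_train, y_train)

# The best candidate is refit on the full training set by default.
test_accuracy = search.score(X_test, y_test)
print(search.best_params_)
print(f"cross-validated accuracy: {search.best_score_:.2f}")
print(f"test accuracy: {test_accuracy:.2f}")
```

With only 303 samples, the cross-validated score is a more stable guide for choosing hyperparameters than a single train/test split.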

Possible improvements

Group patient data such as age into ranges instead of using raw numbers
Test removing variables with 0.40 correlation
Use log(x) for the "caa" variable to normalize the data distribution
Use other algorithms like CatBoost
Tune XGBoost hyperparameters
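
Two of these ideas can be tried with a few lines of NumPy. The values and the decade-wide age ranges below are made up for illustration. The sketch also assumes that "caa" is a count that can be 0, in which case log(x) is undefined, so it swaps in log(1 + x) via `np.log1p` instead.

```python
import numpy as np

# Hypothetical ages and bin edges (under 40, 40s, 50s, 60s, 70 and over).
age = np.array([29, 41, 54, 63, 77])
edges = np.array([40, 50, 60, 70])

# np.digitize returns the index of the range each age falls into.
age_bin = np.digitize(age, edges)
print(age_bin)  # [0 1 2 3 4]

# Hypothetical "caa" values; log1p(x) = log(1 + x) stays finite at 0.
caa = np.array([0, 0, 1, 2, 3])
caa_log = np.log1p(caa)
print(caa_log.round(3))
```

Whether either transform helps would need to be checked against the cross-validated accuracy of the tuned model.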

leo8198/heart-attack-prediction
