Stroke Prediction Model

This project trains machine learning classifiers on the healthcare-dataset-stroke-data.csv dataset to predict whether a patient is likely to have a stroke. The dataset records demographic and health attributes that may contribute to stroke occurrence.

Installation

Clone the repository:

git clone https://github.com/Elilora/stroke-prediction.git
cd stroke-prediction

Install the required libraries:

pip install numpy pandas seaborn matplotlib scikit-learn xgboost shap

Usage

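Ensure the dataset file healthcare-dataset-stroke-data.csv is in the /kaggle/input/stroke-prediction-dataset/ directory. This is the path where Kaggle notebooks mount the dataset; when running elsewhere, update the path in the script to match where the file is stored.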
Run the script to train and evaluate the models:
```
python stroke_prediction.py
```

Dataset

The dataset used for this project is healthcare-dataset-stroke-data.csv, which contains the following columns:

id
gender
age
hypertension
heart_disease
ever_married
work_type
Residence_type
avg_glucose_level
bmi
smoking_status
stroke
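
As a minimal sketch of how the file might be loaded and checked against the column list above, the snippet below reads a CSV with pandas and fails early if a column is missing. The helper name `load_stroke_data` and the two sample rows are invented for illustration; they are not taken from the project script or from the real dataset.

```python
import io

import pandas as pd

# Column names as listed in this README.
EXPECTED_COLUMNS = [
    "id", "gender", "age", "hypertension", "heart_disease", "ever_married",
    "work_type", "Residence_type", "avg_glucose_level", "bmi",
    "smoking_status", "stroke",
]


def load_stroke_data(source) -> pd.DataFrame:
    """Read the CSV (path or file-like object) and verify the expected columns.

    Hypothetical helper, not part of the project script.
    """
    df = pd.read_csv(source)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return df


# In-memory stand-in for the real file; both rows are made up.
sample_csv = io.StringIO(
    ",".join(EXPECTED_COLUMNS) + "\n"
    "1,Female,54.0,0,0,Yes,Private,Urban,95.2,27.4,never smoked,0\n"
    "2,Male,71.0,1,0,Yes,Self-employed,Rural,180.5,,formerly smoked,1\n"
)
df = load_stroke_data(sample_csv)

# The empty bmi field in the second row is read as NaN.
print(df["bmi"].isna().sum())
```

With the real file, the same helper would be called with the CSV path instead of the in-memory buffer.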

Data Preprocessing

Handle missing values:
- The bmi column contains missing values, which are imputed with the column mean.
Encode categorical variables:
- Categorical columns are encoded as integers using scikit-learn's LabelEncoder.
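
The two preprocessing steps above can be sketched roughly as follows. This is an illustration, not the project's actual code: the function name `preprocess`, the choice of which columns count as categorical, and the toy data are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder

# Assumed set of categorical columns (the README does not list them explicitly).
CATEGORICAL_COLUMNS = [
    "gender", "ever_married", "work_type", "Residence_type", "smoking_status",
]


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with bmi mean-imputed and categorical columns label-encoded."""
    out = df.copy()

    # Replace missing bmi values with the mean of the observed values.
    imputer = SimpleImputer(strategy="mean")
    out["bmi"] = imputer.fit_transform(out[["bmi"]]).ravel()

    # Map each category to an integer code, using one encoder per column.
    for col in CATEGORICAL_COLUMNS:
        out[col] = LabelEncoder().fit_transform(out[col])
    return out


# Tiny made-up frame, only to show the effect of the transformation.
toy = pd.DataFrame({
    "gender": ["Male", "Female", "Female"],
    "ever_married": ["Yes", "No", "Yes"],
    "work_type": ["Private", "Govt_job", "Private"],
    "Residence_type": ["Urban", "Rural", "Urban"],
    "smoking_status": ["smokes", "never smoked", "Unknown"],
    "bmi": [30.0, np.nan, 20.0],
})
clean = preprocess(toy)
print(clean)
```

LabelEncoder assigns codes in sorted order of the category names, so the integers carry no meaningful ordering. Tree-based models tolerate this reasonably well, but it is worth keeping in mind if other model types are added.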

Model Building

The following models were used to predict stroke occurrence:

Decision Tree Classifier
Random Forest Classifier

Steps

Split the data into training and testing sets (70% train, 30% test).
Train the models on the training set.
Evaluate the models on the testing set using accuracy and F1-score metrics.
Generate classification reports and confusion matrices.
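
The steps above can be sketched as follows. To stay self-contained, the example uses a synthetic, imbalanced dataset from `make_classification` as a stand-in for the preprocessed stroke data. The `random_state` values, the default hyperparameters, and `zero_division=0` are assumptions, not settings confirmed by the project script.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, f1_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed features and stroke labels:
# roughly 95% of samples belong to the negative class.
X, y = make_classification(n_samples=1000, n_features=10, weights=[0.95],
                           random_state=42)

# 70% train, 30% test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}

results = {}
for name, model in models.items():
    # Train on the training set, then predict on the held-out test set.
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        # F1 for the positive class; 0 if no positives are predicted.
        "f1": f1_score(y_test, y_pred, zero_division=0),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

    print(name)
    print(classification_report(y_test, y_pred, zero_division=0))
    print(results[name]["confusion_matrix"])
```

Because the data here is synthetic, the printed scores will not match the figures reported in the Results section.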

Results

Decision Tree Classifier

Accuracy: 91.06%
F1-score: 10.46%
A detailed classification report and confusion matrix are printed in the script output.

Random Forest Classifier

Accuracy: 95.50%
F1-score: 0.00%
A detailed classification report and confusion matrix are printed in the script output.
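
The gap between high accuracy and a near-zero F1-score is typical of imbalanced data. Assuming the F1-score is computed for the stroke class, a value of 0.00% means the model produced no true positives on the test set. The worked example below shows how that can coexist with high accuracy. The counts are hypothetical and are not the project's actual confusion matrix.

```python
def accuracy_and_f1(tp: int, fp: int, fn: int, tn: int) -> tuple:
    """Compute accuracy and positive-class F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    # F1 = 2*TP / (2*TP + FP + FN); treated as 0 when the denominator is 0.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical test set: 1000 patients, 50 of whom had a stroke.
# A model that predicts "no stroke" for everyone gets all 950 negatives
# right and all 50 positives wrong.
acc, f1 = accuracy_and_f1(tp=0, fp=0, fn=50, tn=950)
print(f"accuracy={acc:.2%}, f1={f1:.2%}")  # accuracy=95.00%, f1=0.00%
```

For this reason, accuracy alone is a weak indicator here, and the per-class rows of the classification report are the more informative part of the output.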

Contributing

Contributions are welcome! Please fork the repository and submit a pull request for any enhancements or bug fixes.

License

This project is licensed under the MIT License - see the LICENSE file for details.
