- The aim of the project is to analyze and predict whether a person is at risk of cardiovascular disease (CVD).
The Cardio Vascular Disease Risk Prediction Dataset is analyzed, and a model is built and trained on its features and variables.
The dataset has 19 features and 308854 entries.
General_Health
- Would you say that in general your health is?

Checkup
- About how long has it been since you last visited a doctor for a routine checkup?

Exercise
- During the past month, other than your regular job, did you participate in any physical activities or exercises such as running, calisthenics, golf, gardening, or walking for exercise?

Heart_Disease
- Respondents that reported having coronary heart disease or myocardial infarction

Skin_Cancer
- Respondents that reported having skin cancer

Other_Cancer
- Respondents that reported having any other type of cancer

Depression
- Respondents that reported having a depressive disorder (including depression, major depression, dysthymia, or minor depression)

Diabetes
- Respondents that reported having diabetes and, if so, which type of diabetes it is or was

Arthritis
- Respondents that reported having arthritis

Sex
- Respondent's gender
- Pandas
- NumPy
- Matplotlib
- scikit-learn
- SciPy
- Seaborn
- Joblib
- Flask
- Create a virtual environment using `python -m venv myenv`.
- To activate the virtual environment, use `.\myenv\Scripts\activate`.
- If an error occurs in PowerShell, use `Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass` and activate again.
- `app.py` contains the Flask app code. Run `pip install -r requirements.txt` to install the dependencies required by the Flask app.
- You may need to install additional libraries to run the Jupyter notebooks.
- Load the dataset, which contains 308854 entries and 19 features.
- Perform EDA on the dataset to gain insights into it.
- Plot each feature against the `target` feature and perform univariate and bivariate analysis.
- Analyse the dataset using correlation and draw a bar plot showing how strongly each feature is related to the `target` feature.
- Reduce the parameters and split the dataset into input and target features.
- Split the data into training and testing sets.
- Train the different models and record their accuracy, MSE, and R2 scores, including after tuning the hyper-parameters.
- Build a neural network as well and tune its parameters. The neural network gives 91.91% accuracy.
- Dump the model into a `.joblib` file and create a front-end for it.
- Create a `requirements.txt` file for the model and the website.
- Create the front-end using the Flask framework, with a user-friendly template.
- The website takes the user's input and passes it to the model in the backend. The model then returns its prediction to the user, with an accuracy of around 91.91%.
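The modelling steps above (correlation with the target, input/target split, train/test split, scoring, and dumping to `.joblib`) can be sketched as follows. This is a minimal illustration, not the project's notebook code: it uses a small synthetic table instead of the real dataset, a plain logistic regression instead of the neural network, and it assumes `Heart_Disease` is the target. The other column names and the temporary output path are hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real dataset (assumption: column names are illustrative).
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "BMI": rng.normal(28, 6, n),
    "Alcohol_Consumption": rng.integers(0, 31, n),
    "Fruit_Consumption": rng.integers(0, 121, n),
})
# Hypothetical binary target, loosely driven by BMI so there is some signal.
df["Heart_Disease"] = (df["BMI"] + rng.normal(0, 4, n) > 32).astype(int)

# Correlation of each input feature with the target (the basis for a bar plot).
correlations = df.corr()["Heart_Disease"].drop("Heart_Disease")

# Split into input and target features, then into training and testing sets.
X = df.drop(columns="Heart_Disease")
y = df["Heart_Disease"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train one simple model and score it with the metrics named above.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "mse": mean_squared_error(y_test, y_pred),
    "r2": r2_score(y_test, y_pred),
}

# Dump the fitted model to a .joblib file and load it back.
model_path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, model_path)
reloaded = joblib.load(model_path)
```

`correlations.plot(kind="bar")` would draw the correlation bar plot when Matplotlib is installed. Note that for 0/1 labels the MSE equals one minus the accuracy, so the two scores carry the same information for a classifier.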
Plots: Alcohol Consumption, Body Mass Index, Fried Potato Consumption, Fruit Consumption, Correlation.
- The neural network model shows promising performance, with 91.91% accuracy.
- Created a user-friendly front-end using the Flask framework and integrated it with the model.
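As a rough illustration of how a Flask front-end can hand user input to a model, here is a minimal sketch. It is not the project's `app.py`: the `/predict` route, the JSON payload, and the three field names are hypothetical, and a tiny model is trained in place on synthetic data where the real app would presumably load the dumped `.joblib` file with `joblib.load`.

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression

# Stand-in model trained on synthetic data (assumption). In the project this
# would be replaced by loading the dumped .joblib file with joblib.load.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(int)
model = LogisticRegression().fit(X, y)

# Hypothetical input fields, in the order the model expects them.
FEATURES = ["bmi", "alcohol", "fruit"]

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    # Read the JSON body; fall back to an empty dict if it is missing or invalid.
    payload = request.get_json(silent=True) or {}
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify({"error": f"missing fields: {missing}"}), 400
    try:
        row = [[float(payload[name]) for name in FEATURES]]
    except (TypeError, ValueError):
        return jsonify({"error": "all fields must be numeric"}), 400
    # Pass the single input row to the model and return its class label.
    prediction = int(model.predict(row)[0])
    return jsonify({"heart_disease": prediction})
```

Saved as a module, this could be served with `flask run`, or exercised without a server through `app.test_client()`. A template-based form, as described above, would read `request.form` instead of JSON.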