![image.png](heart.png)

## **Heart Failure Prediction**

The goal of this project is to analyze a dataset describing 918 patients and to predict heart failure using Python, machine learning, and data visualization tools. A machine learning model is trained on these characteristics to assess the likelihood of a heart disease event.

**Understanding the terms/characteristics in the dataset**

**ChestPainType** - TA: Typical Angina - substernal chest pain precipitated by physical exertion or emotional stress and relieved with rest or nitroglycerin; ATA: Atypical Angina, NAP: Non-Anginal Pain, ASY: Asymptomatic

**RestingBP** - resting blood pressure in mm Hg (millimeters of mercury). A normal adult reading is below 120/80 mm Hg and above 90/60 mm Hg.

**Cholesterol** - total (serum) cholesterol measured in mg/dL. Below 200 mg/dL is desirable/normal; 200-239 mg/dL is borderline high; 240 mg/dL and above is high.

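**RestingECG** - resting electrocardiogram result; Normal: normal, ST: ST-T wave abnormality, LVH: probable or definite left ventricular hypertrophy

**MaxHR** - maximum heart rate achieved, in beats per minute

**Oldpeak** - ST depression induced by exercise relative to rest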
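The cholesterol ranges listed above can be turned into a categorical label with `pd.cut`. This is only an illustrative sketch: the helper name `cholesterol_category` and the label strings are assumptions made here, not part of the notebook, and the sample values are the three cholesterol readings from the first rows of the dataset.

```python
import pandas as pd

def cholesterol_category(mg_dl: pd.Series) -> pd.Series:
    """Bin total cholesterol (mg/dL) into the three ranges described above.

    Hypothetical helper for illustration; the notebook itself does not define it.
    """
    bins = [0, 200, 240, float("inf")]
    labels = ["desirable", "borderline high", "high"]
    # right=False makes every bin closed on the left and open on the right,
    # so 199 -> desirable, 200 -> borderline high, 240 -> high
    return pd.cut(mg_dl, bins=bins, labels=labels, right=False)

# Cholesterol readings of the first three patients in heart.csv
sample = pd.Series([289, 180, 283], name="Cholesterol")
categories = cholesterol_category(sample)
print(categories.tolist())
```

A recorded value of 0 would land in the lowest bin, so a helper like this is only meaningful on rows that carry a real measurement.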




## Step I: Preprocessing

In [39]:
# Import our dependencies
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

In [30]:
# Read heart.csv into a DataFrame and preview the first three rows.
heart_df = pd.read_csv("heart.csv")
heart_df.head(3)

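| | Age | Sex | ChestPainType | RestingBP | Cholesterol | FastingBS | RestingECG | MaxHR | ExerciseAngina | Oldpeak | ST_Slope | HeartDisease |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 40 | M | ATA | 140 | 289 | 0 | Normal | 172 | N | 0.0 | Up | 0 |
| 1 | 49 | F | NAP | 160 | 180 | 0 | Normal | 156 | N | 1.0 | Flat | 1 |
| 2 | 37 | M | ATA | 130 | 283 | 0 | ST | 98 | N | 0.0 | Up | 0 |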


In [32]:
heart_df.dtypes

Age                 int64
Sex                object
ChestPainType      object
RestingBP           int64
Cholesterol         int64
FastingBS           int64
RestingECG         object
MaxHR               int64
ExerciseAngina     object
Oldpeak           float64
ST_Slope           object
HeartDisease        int64
dtype: object
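The dtypes listing shows five text-valued columns (`Sex`, `ChestPainType`, `RestingECG`, `ExerciseAngina`, `ST_Slope`), which most machine learning models cannot consume directly. One possible way to handle them, sketched here as an assumption rather than as the notebook's own method, is one-hot encoding with `pd.get_dummies`. The frame `sample_df` below is a hand-typed copy of the three rows printed by `heart_df.head(3)`.

```python
import pandas as pd

# Hand-typed copy of the three rows shown by heart_df.head(3)
sample_df = pd.DataFrame({
    "Age": [40, 49, 37],
    "Sex": ["M", "F", "M"],
    "ChestPainType": ["ATA", "NAP", "ATA"],
    "RestingBP": [140, 160, 130],
    "Cholesterol": [289, 180, 283],
    "FastingBS": [0, 0, 0],
    "RestingECG": ["Normal", "Normal", "ST"],
    "MaxHR": [172, 156, 98],
    "ExerciseAngina": ["N", "N", "N"],
    "Oldpeak": [0.0, 1.0, 0.0],
    "ST_Slope": ["Up", "Flat", "Up"],
    "HeartDisease": [0, 1, 0],
})

# Every non-numeric column is treated as categorical
categorical_cols = sample_df.select_dtypes(exclude="number").columns.tolist()
print(categorical_cols)

# One indicator column per category value; numeric columns pass through unchanged
encoded_df = pd.get_dummies(sample_df, columns=categorical_cols)
print(encoded_df.columns.tolist())
```

On the full dataset the same two calls would apply to `heart_df`; the number of indicator columns then depends on how many distinct values each categorical column actually contains.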

In [33]:
heart_df.shape

(918, 12)

In [38]:
# Count the missing values in each column. A NULL (NaN) value marks a field that was left blank when the record was created.
print("Total NULL Values in each column")
print("*********************************")
print(heart_df.isnull().sum())

Total NULL Values in each column
*********************************
Age               0
Sex               0
ChestPainType     0
RestingBP         0
Cholesterol       0
FastingBS         0
RestingECG        0
MaxHR             0
ExerciseAngina    0
Oldpeak           0
ST_Slope          0
HeartDisease      0
dtype: int64
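A clean `isnull()` report does not rule out missing measurements that were recorded as 0 instead of being left blank. A minimal sketch of how such placeholders could be counted follows; `sample_df` is a small stand-in frame, and its fourth row is made up for illustration (it is not taken from heart.csv).

```python
import pandas as pd

# First three rows come from the dataset preview; the fourth row is hypothetical
sample_df = pd.DataFrame({
    "RestingBP": [140, 160, 130, 120],
    "Cholesterol": [289, 180, 283, 0],
    "Oldpeak": [0.0, 1.0, 0.0, 1.5],
})

# isnull() sees nothing, because 0 is a valid number as far as pandas is concerned
null_counts = sample_df.isnull().sum()
print(null_counts)

# Counting exact zeros per column exposes possible placeholder values
zero_counts = (sample_df == 0).sum()
print(zero_counts)
```

Whether a zero is suspicious depends on the column: 0.0 is a legitimate `Oldpeak` reading, while a cholesterol level of 0 is not physiologically possible.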


In [52]:
# Clean the dataset by dropping rows where "Cholesterol" is 0, since a cholesterol level of 0 is not physiologically possible.
clean_df = heart_df[heart_df['Cholesterol'] != 0]
clean_df.head(3)

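| | Age | Sex | ChestPainType | RestingBP | Cholesterol | FastingBS | RestingECG | MaxHR | ExerciseAngina | Oldpeak | ST_Slope | HeartDisease |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 40 | M | ATA | 140 | 289 | 0 | Normal | 172 | N | 0.0 | Up | 0 |
| 1 | 49 | F | NAP | 160 | 180 | 0 | Normal | 156 | N | 1.0 | Flat | 1 |
| 2 | 37 | M | ATA | 130 | 283 | 0 | ST | 98 | N | 0.0 | Up | 0 |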
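Because the first three rows look the same before and after filtering, it can help to check how many rows the filter actually removed. The sketch below does that on a small stand-in frame and also shows one possible alternative, median imputation, which keeps every row. Both the fourth sample row and the imputation step are assumptions added for illustration; the notebook itself only filters.

```python
import numpy as np
import pandas as pd

# First three rows come from the dataset preview; the fourth row is hypothetical
sample_df = pd.DataFrame({
    "Age": [40, 49, 37, 65],
    "Cholesterol": [289, 180, 283, 0],
    "HeartDisease": [0, 1, 0, 1],
})

# Same filter as in the cell above; .copy() makes the result an independent frame
clean_sample = sample_df[sample_df["Cholesterol"] != 0].copy()
rows_removed = len(sample_df) - len(clean_sample)
print(f"Rows removed by the filter: {rows_removed}")

# Alternative (assumption): treat 0 as missing and fill it with the column median
imputed_sample = sample_df.copy()
imputed_sample["Cholesterol"] = imputed_sample["Cholesterol"].replace(0, np.nan)
median_chol = imputed_sample["Cholesterol"].median()  # NaN values are skipped
imputed_sample["Cholesterol"] = imputed_sample["Cholesterol"].fillna(median_chol)
print(imputed_sample)
```

Dropping rows is simpler and avoids inventing values, while imputation preserves the sample size; which trade-off is better depends on how many rows carry a zero and how they are distributed across the `HeartDisease` classes.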
