About this dataset

Age : Age of the patient
Sex : Sex of the patient
exang: exercise induced angina (1 = yes; 0 = no)
ca: number of major vessels (0-3)
cp : chest pain type

Value 1: typical angina
Value 2: atypical angina
Value 3: non-anginal pain
Value 4: asymptomatic
trtbps : resting blood pressure (in mm Hg)
chol : cholesterol in mg/dl fetched via BMI sensor
fbs : (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
rest_ecg : resting electrocardiographic results
Value 0: normal
Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV)
Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria
thalach : maximum heart rate achieved
target : 0 = less chance of heart attack; 1 = more chance of heart attack
thal : Thallium Stress Test result
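
The coded columns above are easier to read once the integers are replaced by labels. A minimal sketch for `cp`, assuming the 0 to 3 coding used in the chest pain breakdown further down in this notebook; `CP_LABELS` and `decode_cp` are hypothetical names, and the values are toy data, not rows from heart.csv.

```python
import pandas as pd

# Label mapping assumed from the 0-3 chest pain breakdown in this notebook.
# CP_LABELS and decode_cp are illustrative names, not part of the dataset.
CP_LABELS = {
    0: "typical angina",
    1: "atypical angina",
    2: "non-anginal pain",
    3: "asymptomatic",
}

def decode_cp(cp: pd.Series) -> pd.Series:
    """Replace integer chest pain codes with readable labels (unknown codes become NaN)."""
    return cp.map(CP_LABELS)

# Toy values, not rows from heart.csv
toy_cp = pd.Series([0, 2, 3, 1], name="cp")
decoded = decode_cp(toy_cp)
print(decoded.tolist())
```

On the real data this could be applied as `decode_cp(df["cp"])` before plotting, so the axis shows names instead of codes.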

In [None]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from collections import Counter

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler 
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
import xgboost as xgb
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier  
from sklearn.ensemble import RandomForestRegressor
from sklearn.naive_bayes import BernoulliNB
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix


In [None]:
df=pd.read_csv("../input/heart-attack-analysis-prediction-dataset/heart.csv")

In [None]:
df.head(5)

In [None]:
df.info()

In [None]:
df.isnull().sum()

In [None]:
# Count how many patients fall in each output class
output=Counter(df.output)
classes=list(output.keys())
values=list(output.values())

plt.pie(
        values,
        labels=classes,
        autopct='%1.2f%%',
        startangle=80)
plt.legend()
plt.title("Percentage of patient chance of heart attack")
plt.show()

About 54% of the patients are in the class with more chance of heart attack (output = 1)
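
The percentage shown in the pie chart can also be read off as a number. A small sketch, assuming a label column like `output`; `class_share` is a hypothetical helper and the series below is toy data, not the real column.

```python
import pandas as pd

def class_share(labels: pd.Series) -> pd.Series:
    """Percentage of rows in each class, rounded to 2 decimals."""
    return (labels.value_counts(normalize=True) * 100).round(2)

# Toy labels, not the real output column
toy_output = pd.Series([1, 1, 1, 0, 0], name="output")
share = class_share(toy_output)
print(share)
```

On the notebook's data the call would be `class_share(df["output"])`.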

Finding the correlation between age and output

In [None]:
g=sns.FacetGrid(df,col="output")
g.map(plt.hist,"age",bins=10)
# displot is figure-level and creates its own figure, so size it with height= instead of plt.figure
sns.displot(df.age,color='red',label='Age',kde=True,height=10)
plt.legend()

1. The chance of heart attack is highest for the age group 55 to 60

**Breakdown of chest pain**
0: typical angina 
1: atypical angina 
2: non-anginal pain 
3: asymptomatic 

In [None]:
sns.countplot(data=df,x='cp')

1. We observed that chest pain of type 0, i.e. 'typical angina', is the most common
2. 42% of patients have chest pain of type 0
3. Chest pain of type 3 is the least common

Now we find the correlation between 'Chest Pain' and chance of heart attack, i.e. 'output'
output: 1 = high chance of heart attack, 0 = less chance of heart attack


In [None]:
g=sns.FacetGrid(df,col='output')
g.map(plt.hist,'cp')

1. It can be observed that most of the patients with chest pain type 0 have a lower chance of heart attack.
2. Most of the patients with chest pain type 2 have a higher chance of heart attack.
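
The two observations above come from reading histograms; a row-normalised crosstab gives the same comparison as numbers. This is only a sketch: `cp_by_output` is a hypothetical helper and the frame below is made-up data, not heart.csv.

```python
import pandas as pd

def cp_by_output(df: pd.DataFrame) -> pd.DataFrame:
    """Share of each output class within every chest pain type (rows sum to 1)."""
    return pd.crosstab(df["cp"], df["output"], normalize="index")

# Made-up rows for illustration only
toy = pd.DataFrame({
    "cp":     [0, 0, 0, 0, 2, 2, 2, 1],
    "output": [0, 0, 0, 1, 1, 1, 0, 1],
})
table = cp_by_output(toy)
print(table)
```

Each row then shows, for one chest pain type, what fraction of patients is in output 0 and output 1.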

Now we find how the correlation between chest pain and chance of heart attack depends on age

In [None]:
g=sns.FacetGrid(df,col="output",row="cp")
g.map(plt.hist,"age",bins=10)

1. It is clear that most of the patients with chest pain type 0 have a lower chance of heart attack, and most of them are aged between 55 and 60, and
2. most of the patients with chest pain type 2 have a higher chance of heart attack, and most of them are aged between 40 and 45.

Find the correlation between sex and output

In [None]:
g=sns.FacetGrid(df,col="output",row="sex")
g.map(plt.hist,"age",bins=10)

**A fasting blood sugar level less than 100 mg/dL (5.6 mmol/L) is normal. A fasting blood sugar level from 100 to 125 mg/dL (5.6 to 6.9 mmol/L) is considered prediabetes**

As per our dataset 'heart.csv', fbs: 1 = (fasting blood sugar > 120 mg/dl) and
0 = (fasting blood sugar <= 120 mg/dl)
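
The fbs coding is a simple threshold on the blood sugar reading. heart.csv only stores the 0/1 flag, so the raw readings below are hypothetical values used to illustrate the rule; `fbs_flag` and `FBS_THRESHOLD_MG_DL` are illustrative names as well.

```python
import pandas as pd

# Threshold taken from the dataset description: fbs = 1 when blood sugar > 120 mg/dl
FBS_THRESHOLD_MG_DL = 120

def fbs_flag(blood_sugar_mg_dl: pd.Series) -> pd.Series:
    """Return 1 where fasting blood sugar exceeds the threshold, else 0."""
    return (blood_sugar_mg_dl > FBS_THRESHOLD_MG_DL).astype(int)

# Hypothetical raw readings in mg/dl (not present in heart.csv)
raw = pd.Series([95, 120, 121, 180])
flags = fbs_flag(raw)
print(flags.tolist())
```

A reading of exactly 120 mg/dl maps to 0, because the description uses a strict "greater than".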

**Breakdown of fbs**

In [None]:
sns.countplot(data=df,x='fbs')

1. It is observed that 75% of people have a fasting blood sugar level below 120 mg/dl (fbs = 0), i.e. below the threshold flagged by the dataset

In [None]:
g=sns.FacetGrid(df,col='output')
g.map(plt.hist,'fbs')

Most of the people with a high chance of heart attack have fbs 0

**Breakdown of ECG**

In [None]:
sns.countplot(data=df,x='restecg')

The counts of ECG type 0 and type 1 are almost the same, and type 2 is almost negligible in comparison

In [None]:
g=sns.FacetGrid(df,col='output')
g.map(plt.hist,'restecg')

1. In both classes of heart attack chance, the counts of ECG type 0 (normal) are balanced,
2. people with ECG type 1 mostly have the higher chance of heart attack, and
3. in both classes the count of ECG type 2 is the same

**Breakdown for exercise induced angina**

In [None]:
sns.countplot(data=df,x='exng')

We observed that patients with exercise induced angina (type 1) are about half as many as those without it (type 0)

In [None]:
g=sns.FacetGrid(df,col='output')
g.map(plt.hist,'exng')

**Breakdown for Thallium Stress Test**

In [None]:
sns.countplot(data=df,x='thall')

thall of type 2 is the most common and type 0 is the least common

Find the correlation between cholesterol and chance of heart attack (output)

In [None]:
g=sns.FacetGrid(df,col='output')
g.map(plt.hist,'chol')

1. Here we observed that most of the patients have cholesterol in the range 200-300

**Correlation between 'resting blood pressure (in mm Hg)' and output**

In [None]:
g=sns.FacetGrid(df,col='output')
g.map(plt.hist,'trtbps')

1. Here we observed that for most of the people who have a high chance of heart attack, resting blood pressure ranges from 120 to 140, and
2. in both cases most of the people have trtbps in the range 120 to 140.
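
The 120 to 140 range can be checked per class instead of by eye. A rough sketch, assuming the `trtbps` and `output` column names used above; `share_in_range` is a hypothetical helper, both bounds are treated as inclusive, and the rows are made-up.

```python
import pandas as pd

def share_in_range(df: pd.DataFrame, low: float = 120, high: float = 140) -> pd.Series:
    """Fraction of patients per output class whose trtbps lies in [low, high]."""
    in_range = df["trtbps"].between(low, high)  # inclusive on both ends
    return in_range.groupby(df["output"]).mean()

# Made-up rows for illustration only
toy = pd.DataFrame({
    "trtbps": [110, 120, 130, 139, 140, 150],
    "output": [0, 1, 1, 0, 1, 0],
})
bp_share = share_in_range(toy)
print(bp_share)
```

If the observation holds on the real data, both classes should show a large fraction for this range.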

**Correlation between thalachh (maximum heart rate achieved) and output**

In [None]:
# displot is figure-level and creates its own figure, so size it with height= instead of plt.figure
sns.displot(df.thalachh,color='blue',label='maximum heart rate achieved',kde=True,height=10)

g=sns.FacetGrid(df,col='output')
g.map(plt.hist,'thalachh')

1. Here we observed that the count of heart rates is highest between 145 and 170, and
2. for most of the people with a high chance of heart attack, the heart rate ranges from 150 to 175

In [None]:
g=sns.FacetGrid(df,col='output')
g.map(plt.hist,'caa')

In [None]:
# Features: every column except the first (age in heart.csv) and the last (output)
x = df.iloc[:, 1:-1].values
y = df.iloc[:, -1].values
x,y

In [None]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size= 0.1, random_state= 0)

In [None]:
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

In [None]:
model = SVC()
model.fit(x_train, y_train)
  
predicted = model.predict(x_test)
print("The accuracy of SVM is : ", accuracy_score(y_test, predicted)*100, "%")
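
Accuracy alone does not show which class the errors fall in. Since `confusion_matrix` is already imported at the top, a confusion matrix can be looked at alongside it. The sketch below uses toy label arrays rather than the SVM's real predictions; `y_true_toy` and `y_pred_toy` are hypothetical names.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Toy labels for illustration; in the notebook these would be y_test and predicted
y_true_toy = np.array([1, 0, 1, 1, 0])
y_pred_toy = np.array([1, 0, 0, 1, 1])

# Rows are true classes, columns are predicted classes (order: 0, 1)
cm = confusion_matrix(y_true_toy, y_pred_toy)
acc = accuracy_score(y_true_toy, y_pred_toy)

print(cm)
print("accuracy:", acc)

# The diagonal holds the correct predictions, so accuracy equals trace / total
assert np.isclose(acc, np.trace(cm) / cm.sum())
```

With the real model the call would be `confusion_matrix(y_test, predicted)`.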