**Heart Disease Analysis**

**Introduction**
<br>
This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to this date. The "goal" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4.

**Goals**<br>
Here, we will try to estimate how likely a patient is to be affected by heart disease based on sex, age, and fasting blood sugar (fbs).

**Attribute or Dataset Columns Feature Information**
1. age: age in years
2. sex: 1 = male; 0 = female
3. cp: chest pain type
4. trestbps: resting blood pressure (in mm Hg on admission to the hospital)
5. chol: serum cholesterol in mg/dl
6. fbs: fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
7. restecg: resting electrocardiographic results
8. thalach: maximum heart rate achieved
9. exang: exercise induced angina (1 = yes; 0 = no)
10. oldpeak: ST depression induced by exercise relative to rest
11. slope: the slope of the peak exercise ST segment
12. ca: number of major vessels (0-3) colored by fluoroscopy
13. thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
14. target: 1 or 0

**If you are totally new to Kaggle, I recommend taking this course.**<br>
Faster Data Science Education,
Link: https://www.kaggle.com/learn/overview

**Import Libraries**<br>
First, I import the necessary libraries.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import math
%matplotlib inline


**Read Data**<br>
Now, we load our dataset into the df variable using the **read_csv** function (because our dataset is in CSV format).

In [None]:
df = pd.read_csv('../input/heart-disease-uci/heart.csv')

**Let's take a look at our loaded dataset**

Print the first five rows of our loaded data.

In [None]:
df.head()

Print the last five rows of the loaded data.

In [None]:
df.tail()

**Let's check the size of our loaded data**

In [None]:
df.shape

(303, 14) means our loaded data has 303 rows and 14 columns.

**Let's show the column names**

In [None]:
df.columns

**Let's show loaded data info**

In [None]:
df.info()

**Let's describe our loaded data**

In [None]:
df.describe()

The statistics reported in the output above are:

1. **Count** tells us the number of non-empty rows in a feature.

2. **Mean** tells us the mean value of that feature.

3. **Std** tells us the Standard Deviation Value of that feature.

4. **Min** tells us the minimum value of that feature.

5. **25%, 50%, and 75%** are the percentiles (quartiles) of each feature.

6. **Max** tells us the maximum value of that feature.
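
To make the rows of `describe()` concrete, here is a minimal sketch on a handful of made-up ages (hypothetical values, not taken from heart.csv). It shows that the 25%, 50%, and 75% rows are the same numbers that `quantile()` returns.

```python
import pandas as pd

# Toy values (hypothetical, not from heart.csv) to show what describe() reports.
ages = pd.Series([29, 40, 50, 60, 77], name="age")

summary = ages.describe()
print(summary)

# The 25%, 50% and 75% rows match what quantile() returns for the same data.
q1, median, q3 = ages.quantile([0.25, 0.50, 0.75])
print(q1, median, q3)
```

The same check should work on any numeric column of df, for example `df.age.quantile([0.25, 0.5, 0.75])`.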

**Let's check whether there are any null values in the loaded data**

In [None]:
df.isnull().sum()

There are no null values in our loaded data.
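
Our data has nothing missing, so the output above is all zeros. As a rough illustration of what the same call reports when values are missing, here is a small sketch on a made-up frame (the values are hypothetical; only the column names are borrowed from our dataset).

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with two deliberately missing values.
toy = pd.DataFrame({
    "age": [63, np.nan, 41],
    "chol": [233, 250, np.nan],
    "sex": [1, 0, 1],
})

# isnull() marks each missing cell as True; sum() counts them per column.
missing_per_column = toy.isnull().sum()
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print("Total missing cells:", total_missing)
```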

**Visualization of loaded data**

In [None]:
plt.figure(figsize=(18,10))
sns.heatmap(df.corr(), annot = True, cmap='cool')
plt.show()
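
The heatmap shows every pairwise correlation at once. If we only care about how each feature relates to target, one possible approach is to take that single column of the correlation matrix and sort it by absolute value. The sketch below uses a made-up frame (hypothetical values), so its numbers say nothing about the real data.

```python
import pandas as pd

# Hypothetical toy rows; only the column names match our dataset.
toy = pd.DataFrame({
    "age": [63, 37, 41, 56, 57, 67],
    "thalach": [150, 187, 172, 178, 163, 108],
    "target": [1, 1, 1, 1, 0, 0],
})

# Keep the 'target' column of the correlation matrix, drop the trivial
# self-correlation, and order the features by strength of correlation.
corr_with_target = (
    toy.corr()["target"]
    .drop("target")
    .sort_values(key=lambda s: s.abs(), ascending=False)
)
print(corr_with_target)
```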

**Let's analyze the sex column of the loaded data.**

In [None]:
print(len(df.sex))

The sex column has 303 entries in total, one per patient.

In [None]:
df.sex.value_counts()

Here, 1 means male and 0 means female, so the dataset contains 207 males and 96 females.

In [None]:
# Pass the column by keyword; recent seaborn versions no longer accept it positionally.
sns.countplot(x='sex', data=df)
plt.show()

0 means female, 1 means male.

In [None]:
male = len(df[df['sex'] == 1])
female = len(df[df['sex'] == 0])
plt.figure(figsize=(7, 6))

labels = 'Male', 'Female'
sizes = [male, female]
colors = ['orange', 'gold']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.1f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

Here, 68.3% are male and 31.7% are female in the loaded dataset.
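
The same percentages can be obtained without a chart. The sketch below rebuilds only the sex column from the counts reported above (207 males, 96 females) and uses `value_counts(normalize=True)`; on the real data, `df.sex.value_counts(normalize=True)` should give the same shares.

```python
import pandas as pd

# Rebuild only the sex column from the counts reported above (1 = male, 0 = female).
sex = pd.Series([1] * 207 + [0] * 96, name="sex")

# normalize=True returns fractions instead of counts; scale to percent.
shares = (sex.value_counts(normalize=True) * 100).round(1)
print(shares)
```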

**Let's check male heart situation.**

In [None]:
male_1 = len(df[(df.sex == 1) & (df['target'] == 1)])
male_0 = len(df[(df.sex == 1) & (df['target'] == 0)])
sns.barplot(x = ['Male Target On', 'Male Target Off'], y = [male_1, male_0])
plt.xlabel('Male and Target')
plt.ylabel('Count')
plt.title('State of the Male gender')
plt.show()

**From the chart above, we can say that fewer males in this dataset have heart disease (target on) than do not (target off).**

**Let's check female heart situation.**

In [None]:
female_1 = len(df[(df.sex == 0) & (df['target'] == 1)])
female_0 = len(df[(df.sex == 0) & (df['target'] == 0)])

sns.barplot(x = ['Female Target On', 'Female Target Off'], y = [female_1, female_0])
plt.xlabel('Female and Target')
plt.ylabel('Count')
plt.title('State of the Female gender')
plt.show()

**From the chart above, we can say that more females in this dataset have heart disease (target on) than do not (target off).**

In [None]:
# Pass column names by keyword so the call works with recent seaborn versions.
sns.countplot(x='sex', hue='target', data=df)
plt.title('Male & Female Heart health condition')
plt.show()

From the above output, we can again say that, in this dataset, females are more often affected and males less often affected by heart disease.

**Let's see the ratio of affected patients by sex.**

In [None]:
male = ((len(df[(df.sex == 1) & (df['target'] == 1)])) / len(df[df['sex'] == 1])) * 100
female = ((len(df[(df.sex == 0) & (df['target'] == 1)])) / len(df[df['sex'] == 0])) * 100
plt.figure(figsize=(8, 6))

labels = 'Male', 'Female'
sizes = [male, female]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

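**From the above output, the pie chart shows 37.46% for males and 62.54% for females. The chart rescales the two per-sex rates so that they sum to 100%, so these are relative shares of the two rates, not the fraction of each sex affected by heart disease.**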
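
To see where the pie chart's numbers come from, here is a small sketch that separates the per-sex rate from the rescaled share that `plt.pie` displays. The totals (207 males, 96 females) are the counts reported earlier; the affected counts (93 and 72) are assumptions chosen because they reproduce the 37.46% / 62.54% split, not values read from the dataset.

```python
import pandas as pd

# 207 males and 96 females are the counts reported earlier. The affected
# counts (93 males, 72 females) are ASSUMED so that the pie shares match.
toy = pd.DataFrame({
    "sex": [1] * 207 + [0] * 96,
    "target": [1] * 93 + [0] * 114 + [1] * 72 + [0] * 24,
})

# Percentage of each sex with target == 1 (the mean of a 0/1 column is a proportion).
rate = toy.groupby("sex")["target"].mean() * 100

# plt.pie rescales its inputs to sum to 100, which is what the chart displays.
pie_share = rate / rate.sum() * 100

print(rate.round(2))
print(pie_share.round(2))
```

Under these assumed counts, `rate` is the number to quote when asking what fraction of each sex is affected, while `pie_share` only compares the two rates with each other.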

**Let's analyze the age.**

In [None]:
# Count patients per age without adding a helper column to df.
df.groupby('age').size()

In [None]:
sns.barplot(x=df.age.value_counts()[:10].index,y=df.age.value_counts()[:10].values)
plt.xlabel('Age')
plt.ylabel('Age counter')
plt.title('Age Analysis')
plt.show()

In [None]:
print('Min age:', min(df.age))
print('Max age:', max(df.age))
print('Mean age: ', df.age.mean())

**Human age is commonly classified into four categories: Child (0-12 years), Adolescence (13-18 years), Adult (19-59 years), and Senior Adult (60 years and above).**<br>
Since the minimum age in our data is 29, we can divide the patients into two age groups.
1. Adult
2. Senior adult
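
The age categories above can also be assigned in one step with `pd.cut`. This is only a sketch on a few made-up ages; the bin edges follow the four categories listed above, and the `age_group` name is my own choice.

```python
import pandas as pd

# Hypothetical ages, chosen to sit on both sides of the 59/60 boundary.
ages = pd.Series([29, 45, 59, 60, 77], name="age")

# Right-closed bins: (0, 12], (12, 18], (18, 59], (59, inf)
bins = [0, 12, 18, 59, float("inf")]
labels = ["Child", "Adolescence", "Adult", "Senior Adult"]

age_group = pd.cut(ages, bins=bins, labels=labels, include_lowest=True)
counts = age_group.value_counts()

print(age_group.tolist())
print(counts)
```

On the real data, something like `pd.cut(df.age, bins=bins, labels=labels)` should produce one label per patient, which can then be used for grouping.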

In [None]:
print('Total adult people: ', len(df[(df.age >= 29) & (df.age <= 59)]))
print('Total senior adult people: ', len(df[(df.age > 59)]))

**Let's find the ratio of adult and senior adult.**

In [None]:
adult0 = ((len(df[(df.age >= 29) & (df.age <= 59)])) / len(df['age'])) * 100
senior0 = ((len(df[(df.age > 59)])) / len(df['age'])) * 100
plt.figure(figsize=(8, 6))

labels = 'Adult(19-59)', 'Senior Adult(60 or above)'
sizes = [adult0, senior0]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

Here, 70.30% are adults and 29.70% are senior adults.

In [None]:
adult1 = ((len(df[(df.age >= 29) & (df.age <= 59) & (df['target'] == 1)])) / len(df['age'])) * 100
senior1 = ((len(df[(df.age > 59) & (df['target'] == 1)])) / len(df['age'])) * 100
plt.figure(figsize=(8, 6))

labels = 'Adult(29-59)', 'Senior Adult(60 or above)'
sizes = [adult1, senior1]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

**Among the patients affected by heart disease, 76.97% are adults and 23.03% are senior adults.**
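
The pie chart above splits the affected patients by age group, which is a different question from how many patients within each age group are affected. A rough sketch with `pd.crosstab` on made-up rows (hypothetical values) shows the two views side by side.

```python
import pandas as pd

# Hypothetical toy data: 6 adults (3 affected) and 4 senior adults (1 affected).
toy = pd.DataFrame({
    "age_group": ["Adult"] * 6 + ["Senior Adult"] * 4,
    "target": [1, 1, 1, 0, 0, 0, 1, 0, 0, 0],
})

# Each target column sums to 100: how the affected patients split across age groups.
by_target = pd.crosstab(toy["age_group"], toy["target"], normalize="columns") * 100

# Each age-group row sums to 100: how many patients in a group are affected.
by_group = pd.crosstab(toy["age_group"], toy["target"], normalize="index") * 100

print(by_target)
print(by_group)
```

In this toy example adults make up 75% of the affected patients, yet only 50% of adults are affected, so the two percentages should not be read interchangeably.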

**Let's analyze age and sex together.**

**Male**

In [None]:
adult2 = ((len(df[(df.sex == 1) & (df.age >= 29) & (df.age <= 59) & (df['target'] == 1)])) / len(df['age'])) * 100
senior2 = ((len(df[(df.sex == 1) & (df.age > 59) & (df['target'] == 1)])) / len(df['age'])) * 100
plt.figure(figsize=(8, 6))

labels = 'Adult Male(29-59)', 'Senior Adult Male(60 or above)'
sizes = [adult2, senior2]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

**Among the males affected by heart disease, 86.02% are adults and 13.98% are senior adults.**

**Female**

In [None]:
adult3 = ((len(df[(df.sex == 0) & (df.age >= 29) & (df.age <= 59) & (df['target'] == 1)])) / len(df['age'])) * 100
senior3 = ((len(df[(df.sex == 0) & (df.age > 59) & (df['target'] == 1)])) / len(df['age'])) * 100
plt.figure(figsize=(8, 6))

labels = 'Adult Female(29-59)', 'Senior Adult Female(60 or above)'
sizes = [adult3, senior3]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

**Among the females affected by heart disease, 65.28% are adults and 34.72% are senior adults.**

**Let's analyze FBS or fasting blood sugar.**<br>
If a person has a fasting blood sugar above 120 mg/dl, then he or she is considered to have diabetes.
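
Our dataset already stores fbs as a 0/1 flag. As an illustration of how such a flag could be derived from raw readings, here is a sketch on made-up glucose values (the `glucose_mg_dl` series is hypothetical and not part of heart.csv).

```python
import pandas as pd

# Hypothetical fasting glucose readings in mg/dl.
glucose_mg_dl = pd.Series([95, 120, 121, 180], name="fasting_glucose")

# 1 when the reading is strictly above 120 mg/dl, otherwise 0.
fbs = (glucose_mg_dl > 120).astype(int)
print(fbs.tolist())
```

Note that a reading of exactly 120 mg/dl stays at 0, because the rule is "more than 120".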

In [None]:
df['fbs']

In [None]:
print('Total people with diabetes: ', len(df[(df.fbs == 1)]))
print('Total people without diabetes: ', len(df[(df.fbs == 0)]))

**Let's find the ratio of diabetes and no diabetes.**

In [None]:
with_diabetes = (len(df[(df.fbs == 1)]) / len(df['fbs'])) * 100
without_diabetes =  (len(df[(df.fbs == 0)]) / len(df['fbs'])) * 100
plt.figure(figsize=(8, 6))

labels = 'Diabetes', 'No Diabetes'
sizes = [with_diabetes, without_diabetes]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

**In our loaded data, 14.85% of people have diabetes and 85.15% do not.**

**Let's find out another ratio.**

In [None]:
with_diabetes_on1 = (len(df[(df.fbs == 1) & (df['target'] == 1)]) / len(df['fbs'])) * 100
with_diabetes_off1 = (len(df[(df.fbs == 1) & (df['target'] == 0)]) / len(df['fbs'])) * 100
plt.figure(figsize=(8, 6))

labels = 'Have diabetes and heart problem', 'Have diabetes but no heart problem'
sizes = [with_diabetes_on1, with_diabetes_off1]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

**Here, among people with diabetes, 51.11% also have a heart problem and 48.89% do not.**
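
Because the pie chart only covers people with fbs == 1, its two slices must add up to 100%. The sketch below recomputes them with `value_counts(normalize=True)`. The counts (45 people with diabetes, 23 of them with a heart problem) are assumptions inferred from the percentages reported above (14.85% of 303 and 51.11% of 45), not values read from the file.

```python
import pandas as pd

# ASSUMED counts: 45 people with fbs == 1, of whom 23 have target == 1.
diabetic = pd.DataFrame({
    "fbs": [1] * 45,
    "target": [1] * 23 + [0] * 22,
})

# Share of diabetic patients with and without a heart problem, in percent.
shares = (diabetic["target"].value_counts(normalize=True) * 100).round(2)
print(shares)
print("Sum of shares:", round(shares.sum(), 2))
```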

**Let's analyze sex, age, fbs, heart disease together.**

**Male**

In [None]:
adult4 = ((len(df[(df.sex == 1) & (df.fbs == 1) & (df.age >= 29) & (df.age <= 59) & (df['target'] == 1)])) / len(df['age'])) * 100
senior4 = ((len(df[(df.sex == 1)  & (df.fbs == 1) & (df.age > 59) & (df['target'] == 1)])) / len(df['age'])) * 100
plt.figure(figsize=(8, 6))

labels = 'Adult Male(29-59) have both diabetes and heart problem', 'Senior Adult Male(60 or above) have both diabetes and heart problem'
sizes = [adult4, senior4]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

**Among males who have both diabetes and a heart problem, 76.47% are adults (29-59) and 23.53% are senior adults (60 or above).**

**Female**

In [None]:
adult5 = ((len(df[(df.sex == 0) & (df.fbs == 1) & (df.age >= 29) & (df.age <= 59) & (df['target'] == 1)])) / len(df['age'])) * 100
senior5 = ((len(df[(df.sex == 0)  & (df.fbs == 1) & (df.age > 59) & (df['target'] == 1)])) / len(df['age'])) * 100
plt.figure(figsize=(8, 6))

labels = 'Adult female(29-59) have both diabetes and heart problem', 'Senior adult female(60 or above) have both diabetes and heart problem'
sizes = [adult5, senior5]
colors = ['orange', 'yellow']
explode = (0, 0)

plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.2f%%', shadow=True, startangle=90)

plt.axis('equal')
plt.show()

**Among females who have both diabetes and a heart problem, 50% are adults and 50% are senior adults.**
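
The male and female cells above repeat the same filter with only the sex value changed. One possible way to get both splits in a single step is to filter once and then cross-tabulate sex against age group. The rows below are made up (hypothetical values), and the `age_group` column is a helper introduced only for this sketch.

```python
import pandas as pd

# Hypothetical toy rows using the same column names as our dataset.
toy = pd.DataFrame({
    "sex":    [1, 1, 1, 1, 0, 0, 1, 0],
    "age":    [45, 52, 58, 64, 50, 66, 61, 40],
    "fbs":    [1, 1, 1, 1, 1, 1, 0, 1],
    "target": [1, 1, 1, 1, 1, 1, 1, 0],
})

# Helper column: 'Senior Adult' for ages above 59, otherwise 'Adult'.
toy["age_group"] = (toy["age"] > 59).map({False: "Adult", True: "Senior Adult"})

# Keep only patients with both diabetes (fbs == 1) and a heart problem (target == 1).
both = toy[(toy["fbs"] == 1) & (toy["target"] == 1)]

# Each row (sex) sums to 100: the adult / senior adult split within that sex.
share = pd.crosstab(both["sex"], both["age_group"], normalize="index") * 100
print(share)
```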

**To be continued..!**