**Author:** Rizwan Rizwan\
**Submission Date:** 24-10-2023

# Data Visualization on 'Heart Attack' dataset using `Plotly`

- The dataset can be downloaded from [Kaggle](https://www.kaggle.com/datasets/rashikrahmanpritom/heart-attack-analysis-prediction-dataset).

## About the dataset

Based on the first few rows, the dataset has the following columns:

1. **age:** Age of the patient
2. **sex:** Sex of the patient (1: male, 0: female)
3. **cp:** Chest pain type
4. **trtbps:** Resting blood pressure
5. **chol:** Cholesterol level
6. **fbs:** Fasting blood sugar (> 120 mg/dl, 1: true; 0: false)
7. **restecg:** Resting electrocardiographic results
8. **thalachh:** Maximum heart rate achieved
9. **exng:** Exercise induced angina (1: yes; 0: no)
10. **oldpeak:** ST depression induced by exercise relative to rest
11. **slp:** Slope of the peak exercise ST segment
12. **caa:** Number of major vessels (0-4) colored by fluoroscopy
13. **thall:** Thalassemia (3: normal; 6: fixed defect; 7: reversible defect)
14. **output:** Heart attack (1: yes, 0: no)
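
As a quick sanity check on the column list above, the sketch below verifies that a DataFrame contains every expected column. The helper `missing_columns` and the two-row `sample` frame are hypothetical and only for illustration; the values are made up and are not taken from the Kaggle file.

```python
import pandas as pd

# Column names as listed in the "About the dataset" section.
EXPECTED_COLUMNS = [
    "age", "sex", "cp", "trtbps", "chol", "fbs", "restecg",
    "thalachh", "exng", "oldpeak", "slp", "caa", "thall", "output",
]


def missing_columns(frame: pd.DataFrame) -> list:
    """Return the expected column names that are absent from `frame`."""
    return [name for name in EXPECTED_COLUMNS if name not in frame.columns]


# Hypothetical stand-in for the loaded data (made-up values).
sample = pd.DataFrame(
    [
        [50, 1, 0, 120, 200, 0, 1, 150, 0, 1.0, 1, 0, 2, 0],
        [43, 0, 2, 135, 215, 0, 0, 165, 0, 0.5, 2, 0, 2, 1],
    ],
    columns=EXPECTED_COLUMNS,
)

absent = missing_columns(sample)
```

On the real data, the same check would be `missing_columns(df)` after the CSV has been read; an empty list means all fourteen columns are present.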

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
%matplotlib inline

In [2]:
# load the dataset 
df = pd.read_csv('./data/heart.csv')

### Plot 1. Histogram - Age Distribution

In [22]:
fig1 = px.histogram(df, x="age", title="Age Distribution", nbins=30)
fig1.show()

**Interpretation:**

This histogram shows the age distribution of the patients and indicates which age groups are most common in the dataset.
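
To put numbers on "most common age group", the histogram can be complemented with explicit bin counts. The sketch below assumes ten-year bins and uses a hypothetical `ages` series with made-up values in place of `df["age"]`.

```python
import pandas as pd

# Hypothetical ages standing in for df["age"] (made-up values).
ages = pd.Series([29, 41, 44, 52, 54, 57, 58, 63, 67, 71], name="age")

# Ten-year bins, closed on the left: [20, 30), [30, 40), ..., [70, 80).
bin_edges = list(range(20, 90, 10))
age_groups = pd.cut(ages, bins=bin_edges, right=False)

# Count per bin, kept in bin order rather than sorted by frequency.
counts = age_groups.value_counts(sort=False)

# The interval with the highest count is the most common age group.
most_common_bin = counts.idxmax()
```

The bin width is a choice: `px.histogram` picks its own bins (with `nbins` as an upper limit), so these counts will not necessarily match the plotted bars one to one.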

### Plot 2. Pie Chart - Gender Distribution 

In [23]:
fig2 = px.pie(df, names="sex", title="Gender Distribution", labels={"sex": "Gender"}, category_orders={"sex": [0, 1]})
fig2.show()


**Interpretation:**

This pie chart shows the proportion of male (1) and female (0) patients in the dataset.

### Plot 3. Scatterplot - Maximum Heart Rate vs. Age

In [24]:
fig3 = px.scatter(df, x="age", y="thalachh", title="Maximum Heart Rate vs. Age", color="output", labels={"output": "Heart Attack"})
fig3.show()

**Interpretation:**

The scatter plot shows maximum heart rate achieved against age. Coloring the points by heart attack outcome makes it possible to compare the two outcome groups across the age range.

### Plot 4. Box Plot - Cholesterol Levels by Gender 

In [26]:
fig4 = px.box(df, x="sex", y="chol", title="Cholesterol Levels by Gender", labels={"sex": "Gender", "chol": "Cholesterol Level"})
fig4.show()

**Interpretation:**

This box plot shows the median and spread of cholesterol levels for each gender, making any differences between the two groups easy to spot.
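
The quantities a box plot draws can also be computed directly, which helps when comparing the groups numerically. The sketch below uses a hypothetical `patients` frame with made-up values and pandas' default linear interpolation for quantiles, so the results may differ slightly from the quartiles Plotly draws.

```python
import pandas as pd

# Hypothetical stand-in for df[["sex", "chol"]] (made-up values).
patients = pd.DataFrame({
    "sex": [0, 0, 0, 1, 1, 1],
    "chol": [210, 250, 290, 180, 220, 240],
})

# Quartiles of cholesterol per sex, one row per group.
summary = patients.groupby("sex")["chol"].quantile([0.25, 0.5, 0.75]).unstack()
summary.columns = ["q1", "median", "q3"]

# Interquartile range: the height of the box in the plot.
summary["iqr"] = summary["q3"] - summary["q1"]
```

On the real data the same steps would be applied to `df` instead of `patients`.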

### Plot 5. Violin Plot - Resting Blood Pressure

In [27]:
fig5 = px.violin(df, y="trtbps", title="Distribution of Resting Blood Pressure", box=True)
fig5.show()

**Interpretation:** 

A violin plot gives a deeper understanding of the distribution of resting blood pressure, combining aspects of box plots and density plots.

### Plot 6. Bar Plot - Heart Attack Outcome by Age

In [28]:
fig6 = px.histogram(df, x="age", color="output", title="Heart Attack Outcome by Age", barmode="group", labels={"output": "Heart Attack"})
fig6.show()

**Interpretation:** 

This grouped bar plot provides a clear view of heart attack outcomes across different age groups, highlighting potential high-risk age groups.
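
Raw counts per age depend on how many patients fall into each age, so a share per age band can be easier to compare. The sketch below assumes ten-year bands and a hypothetical `patients` frame with made-up values; the mean of the 0/1 `output` column is the fraction of patients with `output == 1` in each band.

```python
import pandas as pd

# Hypothetical stand-in for df[["age", "output"]] (made-up values).
patients = pd.DataFrame({
    "age": [35, 38, 44, 47, 52, 55, 58, 61, 64, 68],
    "output": [1, 1, 1, 0, 1, 0, 0, 0, 1, 0],
})

# Ten-year age bands, closed on the left: [30, 40), ..., [60, 70).
age_band = pd.cut(patients["age"], bins=[30, 40, 50, 60, 70], right=False)

# Mean of a 0/1 column = share of positive outcomes; size = patients per band.
rate_by_band = patients.groupby(age_band, observed=False)["output"].agg(["mean", "size"])
```

Bands with few patients give unstable shares, which is why the `size` column is kept next to the mean.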

### Plot 7. Bar Plot - Heart Attack Outcome by Chest Pain Type

In [29]:
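# With only x given, px.bar does not count rows, so count patients per chest pain type and outcome first.
cp_counts = df.groupby(["cp", "output"]).size().reset_index(name="count")
fig7 = px.bar(cp_counts, x="cp", y="count", color="output", title="Heart Attack Outcome by Chest Pain Type", labels={"output": "Heart Attack"})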
fig7.show()

**Interpretation:**

The bar chart illustrates how varying chest pain types relate to heart attack outcomes, potentially indicating which pain types are more concerning.
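
The relationship between chest pain type and outcome can also be tabulated. The sketch below builds a contingency table with `pd.crosstab` and a row-normalized version giving the share of each outcome within a chest pain type. The `patients` frame is hypothetical and its values are made up.

```python
import pandas as pd

# Hypothetical stand-in for df[["cp", "output"]] (made-up values).
patients = pd.DataFrame({
    "cp": [0, 0, 0, 0, 1, 1, 2, 2, 2, 3],
    "output": [0, 0, 0, 1, 1, 1, 1, 1, 0, 1],
})

# Rows: chest pain type; columns: outcome; cells: number of patients.
counts = pd.crosstab(patients["cp"], patients["output"])

# Same table with each row scaled to sum to 1 (share of outcomes per type).
shares = pd.crosstab(patients["cp"], patients["output"], normalize="index")
```

Reading across a row of `shares` shows how outcomes split within one chest pain type, independent of how many patients have that type.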

### Plot 8. Scatter Plot - Number of Major Vessels vs. Age with Heart Attack Outcome

In [30]:
fig8 = px.scatter(df, x="age", y="caa", color="output", title="Number of Major Vessels vs. Age", labels={"output": "Heart Attack"})
fig8.show()

**Interpretation:**

This scatter plot shows the number of major vessels against age, with points colored by heart attack outcome so that the two outcome groups can be compared.

### Plot 9. Sunburst Plot - Thalassemia Distribution

In [31]:
fig9 = px.sunburst(df, path=['thall', 'output'], title="Thalassemia Distribution with Heart Attack Outcome")
fig9.show()

**Interpretation:**

The sunburst chart offers a hierarchical view of thalassemia types and their associated heart attack outcomes.

### Plot 10. Scatter Plot - ST Depression Induced by Exercise vs. Age

In [33]:
fig10 = px.scatter(df, x="age", y="oldpeak", title="ST Depression Induced by Exercise vs. Age", color="output", labels={"output": "Heart Attack"})
fig10.show()


**Interpretation:**

This scatter plot illustrates the relationship between age and ST depression induced by exercise. The color-coded outcomes help to discern age groups at higher risk based on this parameter.