
# Analyzing Data with Pandas and Visualizing Results with Matplotlib

### Assignment Solution  
Dataset: **heart.csv**

---

### Objectives:
- Load and explore the dataset using pandas  
- Perform basic data analysis  
- Create visualizations with matplotlib  


## Task 1: Load and Explore the Dataset

In [None]:

import pandas as pd
import matplotlib.pyplot as plt

# Load dataset
try:
    df = pd.read_csv("heart.csv")
    print("✅ Dataset loaded successfully!")
except FileNotFoundError:
    print("❌ Error: The file 'heart.csv' was not found.")
    raise  # stop here: every cell below needs df to exist

# Display first rows
df.head()


In [None]:

# Dataset info
df.info()


In [None]:

# Check for missing values
df.isnull().sum()
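
If the check above reports any missing values, they need handling before the analysis. The following is a minimal sketch of two common options, not part of the original solution. The `sample` frame and its values are made up for illustration and only stand in for `heart.csv`.

```python
import pandas as pd

# Hypothetical toy frame standing in for heart.csv; the values are made up.
sample = pd.DataFrame({
    "age": [63, 37, None, 56],
    "chol": [233.0, None, 204.0, 236.0],
})

# Count missing values per column, the same check as df.isnull().sum().
missing_counts = sample.isnull().sum()

# Option 1: fill numeric gaps with each column's median.
filled = sample.fillna(sample.median(numeric_only=True))

# Option 2: drop every row that contains a missing value.
dropped = sample.dropna()

print(missing_counts)
print(filled)
print(len(dropped))
```

Filling keeps every row but adds estimated values, while dropping keeps only fully observed rows. Which one fits depends on how many values are missing.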


## Task 2: Basic Data Analysis

In [None]:

# Basic statistics
df.describe()


In [None]:

# Grouping: average cholesterol by sex (0 = female, 1 = male)
avg_chol_by_sex = df.groupby("sex")["chol"].mean()
avg_chol_by_sex


In [None]:

# Grouping: average max heart rate by chest pain type
avg_thalach_by_cp = df.groupby("cp")["thalach"].mean()
avg_thalach_by_cp
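
A group mean says nothing about how many rows it is based on. As a hedged sketch, `agg` can report the mean and the group size together. The `toy` frame below is invented for illustration and reuses only the column names `cp` and `thalach` from the dataset.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from heart.csv.
toy = pd.DataFrame({
    "cp": [0, 0, 1, 2, 2, 2],
    "thalach": [150, 130, 170, 160, 140, 180],
})

# Mean and row count per chest pain type in a single call.
summary = toy.groupby("cp")["thalach"].agg(["mean", "count"])
print(summary)
```

On the real data, the same call on `df` would show whether any chest pain type has too few rows for its average to be trusted.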


## Task 3: Data Visualization

In [None]:

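# 1. Line chart - Average Cholesterol by Age
# Average cholesterol at each age so the line follows a sorted x-axis
chol_by_age = df.groupby("age")["chol"].mean()
plt.figure(figsize=(8,5))
plt.plot(chol_by_age.index, chol_by_age.values, 'b.-', alpha=0.7)
plt.title("Average Cholesterol by Age")
plt.xlabel("Age")
plt.ylabel("Average Cholesterol")
plt.grid(True)
plt.show()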


In [None]:

# 2. Bar chart - Average Cholesterol by Sex
avg_chol_by_sex.plot(kind="bar", color=["pink", "blue"], figsize=(6,4))
plt.title("Average Cholesterol by Sex")
plt.xlabel("Sex (0=Female, 1=Male)")
plt.ylabel("Average Cholesterol")
plt.show()


In [None]:

# 3. Histogram - Distribution of Cholesterol
plt.figure(figsize=(8,5))
plt.hist(df["chol"], bins=20, color="orange", edgecolor="black")
plt.title("Distribution of Cholesterol")
plt.xlabel("Cholesterol")
plt.ylabel("Frequency")
plt.show()


In [None]:

# 4. Scatter Plot - Age vs Max Heart Rate
plt.figure(figsize=(8,5))
plt.scatter(df["age"], df["thalach"], c=df["target"], cmap="coolwarm", alpha=0.7)
plt.title("Age vs Max Heart Rate (Colored by Target)")
plt.xlabel("Age")
plt.ylabel("Max Heart Rate (thalach)")
plt.colorbar(label="Target (1=Heart Disease, 0=No Disease)")
plt.show()
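
Because `target` takes only two values, a legend may read more naturally than a continuous colorbar. The sketch below shows one possible alternative: it plots each class separately and labels it. The `toy` frame and the `labels` mapping are assumptions made for illustration, and the label text follows the colorbar caption above.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data; only the column names come from heart.csv.
toy = pd.DataFrame({
    "age": [45, 52, 61, 38, 67, 58],
    "thalach": [172, 160, 140, 182, 125, 150],
    "target": [1, 1, 0, 1, 0, 0],
})

# Assumed display names for the two target values.
labels = {0: "No disease", 1: "Heart disease"}

fig, ax = plt.subplots(figsize=(8, 5))
# One scatter call per class so each class gets its own legend entry.
for value, group in toy.groupby("target"):
    ax.scatter(group["age"], group["thalach"], label=labels[int(value)], alpha=0.7)

ax.set_title("Age vs Max Heart Rate (by Target)")
ax.set_xlabel("Age")
ax.set_ylabel("Max Heart Rate (thalach)")
ax.legend(title="Target")

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```

In a notebook the figure is shown when the cell finishes. In a script, a call to `plt.show()` would be needed.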


✅ **Analysis and Visualization Complete!**