
# 📊 Analyzing Data with Pandas and Visualizing Results with Matplotlib

This notebook demonstrates how to:
- Load and explore a dataset using **pandas**
- Perform basic data analysis (descriptive statistics and groupings)
- Visualize results using **matplotlib** and **seaborn**
- Handle errors and clean data where necessary

We will use the classic **Iris dataset** for this analysis.


In [None]:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris



## 🔹 Task 1: Load and Explore the Dataset
We begin by loading the **Iris dataset**, inspecting the first few rows,
checking the column types, and counting missing values in each column.


In [None]:

# Load the Iris dataset
iris = load_iris(as_frame=True)
df = iris.frame

# Display first rows
df.head()
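
The introduction lists error handling as a goal, so here is a minimal sketch of one way to fail early if the loaded frame does not look as expected. The helper name `validate_iris` and the `EXPECTED_COLUMNS` list are assumptions made for this illustration; they are not part of pandas or scikit-learn.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Column names that load_iris(as_frame=True).frame is expected to provide
EXPECTED_COLUMNS = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
    "target",
]


def validate_iris(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: raise ValueError if the frame is empty or lacks columns."""
    missing = [col for col in EXPECTED_COLUMNS if col not in frame.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    if frame.empty:
        raise ValueError("dataset contains no rows")
    return frame


try:
    df_checked = validate_iris(load_iris(as_frame=True).frame)
except ValueError as err:
    print(f"Dataset check failed: {err}")
else:
    print(df_checked.shape)
```

The same pattern would apply to a frame read from a file, where the `try` block could also catch `FileNotFoundError`.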


In [None]:

# Dataset info and missing values check
df.info()
df.isnull().sum()
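
If the check above reported missing values, two common options are dropping the affected rows or filling the gaps. The sketch below illustrates both on a small made-up sample (the values are invented for illustration and are not real Iris rows); which option is appropriate depends on the analysis.

```python
import numpy as np
import pandas as pd

# Small made-up sample with two gaps, for illustration only
sample = pd.DataFrame(
    {
        "sepal length (cm)": [5.0, np.nan, 6.0, 7.0],
        "petal length (cm)": [1.5, 4.5, np.nan, 6.0],
        "target": [0, 1, 1, 2],
    }
)

# Option 1: drop every row that contains at least one missing value
dropped = sample.dropna()

# Option 2: fill gaps in the measurement columns with each column's median
measure_cols = ["sepal length (cm)", "petal length (cm)"]
filled = sample.copy()
filled[measure_cols] = filled[measure_cols].fillna(filled[measure_cols].median())

print(len(dropped))                 # rows left after dropping
print(filled.isnull().sum().sum())  # missing values left after filling
```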



## 🔹 Task 2: Basic Data Analysis
We compute descriptive statistics, group the data by species, 
and identify patterns in the dataset.


In [None]:

# Descriptive statistics
df.describe()


In [None]:

# Group by species and compute mean values.
# "target" holds integer codes: 0 = setosa, 1 = versicolor, 2 = virginica
df.groupby("target").mean()
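
Grouping by the integer codes works, but the result is easier to read with species names. As a sketch, the codes can be mapped to `iris.target_names` and several statistics computed at once with `agg`; the column name `species` and the variables `named` and `summary` are chosen here for illustration.

```python
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
named = iris.frame.copy()

# Map the integer codes in "target" to the names in iris.target_names
named["species"] = named["target"].map(dict(enumerate(iris.target_names)))

# Mean and standard deviation of every measurement, per species
summary = named.drop(columns="target").groupby("species").agg(["mean", "std"])
print(summary.round(2))
```

The result has two column levels (measurement, statistic), so a single value can be read with, for example, `summary.loc["setosa", ("petal length (cm)", "mean")]`.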



## 🔹 Task 3: Data Visualization
We will create four different plots to better understand the dataset:
1. Line Chart
2. Bar Chart
3. Histogram
4. Scatter Plot


In [None]:

# Line Chart: Sepal Length over index
plt.figure(figsize=(6,4))
plt.plot(df.index, df["sepal length (cm)"], label="Sepal Length")
plt.title("Line Chart of Sepal Length over Index")
plt.xlabel("Index")
plt.ylabel("Sepal Length (cm)")
plt.legend()
plt.show()


In [None]:

# Bar Chart: Average Petal Length per Species
plt.figure(figsize=(6,4))
df.groupby("target")["petal length (cm)"].mean().plot(kind="bar", color="skyblue")
plt.title("Average Petal Length per Species")
plt.xlabel("Species (target)")
plt.ylabel("Petal Length (cm)")
plt.show()


In [None]:

# Histogram: Distribution of Sepal Length
plt.figure(figsize=(6,4))
plt.hist(df["sepal length (cm)"], bins=15, color="green", alpha=0.7)
plt.title("Histogram of Sepal Length")
plt.xlabel("Sepal Length (cm)")
plt.ylabel("Frequency")
plt.show()
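
A single histogram hides how the three species contribute to the overall shape. One possible refinement, sketched below, overlays one histogram per species on shared bin edges so the bars are directly comparable; the variable names are assumptions for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
data = iris.frame

# Shared bin edges: 16 edges give the same 15 bins for every species
values = data["sepal length (cm)"]
bins = np.linspace(values.min(), values.max(), 16)

fig, ax = plt.subplots(figsize=(6, 4))
for code, name in enumerate(iris.target_names):
    subset = data.loc[data["target"] == code, "sepal length (cm)"]
    ax.hist(subset, bins=bins, alpha=0.6, label=name)

ax.set_title("Sepal Length by Species")
ax.set_xlabel("Sepal Length (cm)")
ax.set_ylabel("Frequency")
ax.legend(title="Species")
plt.show()
```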


In [None]:

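# Scatter Plot: Sepal Length vs Petal Length, coloured by species
plt.figure(figsize=(6,4))
scatter = plt.scatter(df["sepal length (cm)"], df["petal length (cm)"], c=df["target"], cmap="viridis")
plt.title("Scatter Plot: Sepal Length vs Petal Length")
plt.xlabel("Sepal Length (cm)")
plt.ylabel("Petal Length (cm)")
# The target is categorical, so use a discrete legend with species names
# instead of a continuous colorbar
handles, _ = scatter.legend_elements()
plt.legend(handles, list(iris.target_names), title="Species")
plt.show()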
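
The notebook imports **seaborn** alongside matplotlib. As a sketch of how the same scatter plot might look with seaborn, `sns.scatterplot` can colour points by a named column through its `hue` parameter; the `species` column and the `plot_df` variable are introduced here for illustration.

```python
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
plot_df = iris.frame.copy()

# Readable species names for the legend
plot_df["species"] = plot_df["target"].map(dict(enumerate(iris.target_names)))

fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(
    data=plot_df,
    x="sepal length (cm)",
    y="petal length (cm)",
    hue="species",
    ax=ax,
)
ax.set_title("Sepal Length vs Petal Length by Species (seaborn)")
plt.show()
```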



## ✅ Conclusion
- Setosa has the smallest petals on average and Virginica the largest, in both length and width.
- Sepal length ranges from 4.3 cm to 7.9 cm overall, and the ranges of the three species overlap.
- In the scatter plot, Setosa forms a clearly separate cluster, while Versicolor and Virginica sit close together and partly overlap.
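
These observations can be checked against the data rather than read off the plots. The following is a minimal sketch of such a check; the variable names `check` and `petal_means` are assumptions for this example.

```python
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
check = iris.frame.copy()
check["species"] = check["target"].map(dict(enumerate(iris.target_names)))

# Mean petal length and width per species
petal_means = check.groupby("species")[["petal length (cm)", "petal width (cm)"]].mean()
print(petal_means.idxmin())  # species with the smallest mean in each petal column
print(petal_means.idxmax())  # species with the largest mean in each petal column

# Overall and per-species minimum and maximum of sepal length
print(check["sepal length (cm)"].agg(["min", "max"]))
print(check.groupby("species")["sepal length (cm)"].agg(["min", "max"]))
```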

This project demonstrates the use of **Pandas** for analysis and **Matplotlib/Seaborn** for visualization.
