# Analyzing Data with Pandas and Visualizing Results with Matplotlib
This notebook loads the Iris dataset, explores it, performs basic analysis, and creates visualizations using Pandas, Matplotlib, and Seaborn.

In [None]:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris


## Task 1: Load and Explore the Dataset

In [None]:

try:
    iris = load_iris(as_frame=True)
    df = iris.frame
    print("Dataset loaded successfully!")
except Exception as e:
    print("Error loading dataset:", e)
    raise  # stop here: every cell below needs df

# Show first rows
print("\nFirst 5 rows:")
display(df.head())

# Dataset info
print("\nDataset info:")
df.info()  # info() prints its report directly and returns None

# Missing values
print("\nMissing values per column:")
print(df.isnull().sum())

# Drop missing values if any
df = df.dropna()
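
The `dropna()` call above removes rows silently, so it is hard to tell whether anything was actually dropped. A minimal sketch of a helper that reports the count is shown here; the name `drop_missing` is hypothetical and not part of the original notebook.

```python
import pandas as pd


def drop_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows containing NaN and report how many were removed.

    Hypothetical helper: it returns a new frame and leaves the input unchanged.
    """
    cleaned = df.dropna()
    removed = len(df) - len(cleaned)
    print(f"Dropped {removed} of {len(df)} rows with missing values.")
    return cleaned
```

In this notebook it could stand in for the last line of the cell, as `df = drop_missing(df)`.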


## Task 2: Basic Data Analysis

In [None]:

# Summary statistics
print("\nSummary statistics:")
display(df.describe())

# Grouping by species
grouped = df.groupby("target").mean()
print("\nMean values by species (target IDs):")
display(grouped)

# Map species numbers to names
df["species"] = df["target"].map(dict(enumerate(iris.target_names)))
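
The grouped table above is indexed by numeric target ID, which is harder to read than species names. One possible variant, sketched below, maps the names first and then groups by them. The function name `mean_by_species` is an assumption made for illustration.

```python
import pandas as pd


def mean_by_species(df: pd.DataFrame, target_names) -> pd.DataFrame:
    """Return per-species column means, indexed by species name.

    Hypothetical helper: expects a numeric 'target' column whose values
    are positions in target_names.
    """
    names = dict(enumerate(target_names))
    labeled = df.assign(species=df["target"].map(names))
    # Drop the numeric ID so it is not averaged along with the measurements
    return labeled.drop(columns="target").groupby("species").mean()
```

With the objects defined in this notebook, the call would be `mean_by_species(df, iris.target_names)`.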


## Task 3: Data Visualization

In [None]:

# 1. Line Chart - Cumulative Sepal Length
plt.figure(figsize=(8,5))
plt.plot(df.index, df["sepal length (cm)"].cumsum(), label="Cumulative Sepal Length")
plt.title("Line Chart - Cumulative Sepal Length")
plt.xlabel("Sample Index")
plt.ylabel("Cumulative Sepal Length (cm)")
plt.legend()
plt.show()


In [None]:

# 2. Bar Chart - Avg Petal Length per Species
plt.figure(figsize=(8,5))
df.groupby("species")["petal length (cm)"].mean().plot(kind="bar", color="skyblue")
plt.title("Bar Chart - Avg Petal Length per Species")
plt.xlabel("Species")
plt.ylabel("Average Petal Length (cm)")
plt.show()


In [None]:

# 3. Histogram - Sepal Width Distribution
plt.figure(figsize=(8,5))
plt.hist(df["sepal width (cm)"], bins=15, color="orange", edgecolor="black")
plt.title("Histogram - Sepal Width Distribution")
plt.xlabel("Sepal Width (cm)")
plt.ylabel("Frequency")
plt.show()


In [None]:

# 4. Scatter Plot - Sepal vs Petal Length
plt.figure(figsize=(8,5))
sns.scatterplot(x="sepal length (cm)", y="petal length (cm)", hue="species", data=df)
plt.title("Scatter Plot - Sepal vs Petal Length")
plt.xlabel("Sepal Length (cm)")
plt.ylabel("Petal Length (cm)")
plt.legend(title="Species")
plt.show()


## Findings / Observations
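1. Iris setosa has much shorter petals than versicolor and virginica (mean petal length of about 1.5 cm, versus about 4.3 cm and 5.6 cm).
2. Sepal length and petal length are strongly positively correlated across the whole dataset (Pearson r of about 0.87).
3. The histogram shows that most sepal widths fall between 2.5 and 3.5 cm.
4. Cumulative sepal length rises steadily, which is expected because every measurement is positive. The slope steepens slightly toward the later samples because the rows are ordered by species and virginica has the longest sepals on average.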
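
The observations above are read off the plots, so it may help to back them with numbers. The sketch below computes the relevant quantities. The function name `check_findings` is hypothetical, and the column names are assumed to match the ones used earlier in this notebook.

```python
import pandas as pd


def check_findings(df: pd.DataFrame) -> dict:
    """Compute numbers that support (or contradict) the observations."""
    # Pearson correlation between sepal length and petal length
    r = df["sepal length (cm)"].corr(df["petal length (cm)"])
    # Fraction of samples whose sepal width lies in [2.5, 3.5] cm (inclusive)
    share = df["sepal width (cm)"].between(2.5, 3.5).mean()
    # Mean petal length per species, smallest first
    petal_means = df.groupby("species")["petal length (cm)"].mean().sort_values()
    return {"pearson_r": r, "width_share": share, "petal_means": petal_means}
```

Calling `check_findings(df)` after Task 2 (once the `species` column exists) would return the three values in a dictionary.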