# 📊 Data Analysis and Visualization Project
This notebook analyzes the Iris dataset in three tasks using pandas, Matplotlib, and seaborn: loading and exploring the data, basic analysis, and visualization.

## 🔍 Task 1: Load and Explore the Dataset
We load the Iris dataset from `sklearn.datasets` into a pandas DataFrame (standing in for a CSV load), check for missing values, and explore its structure.

In [None]:
import pandas as pd # type: ignore
import matplotlib.pyplot as plt # type: ignore
import seaborn as sns # type: ignore
from sklearn.datasets import load_iris # type: ignore

# set_theme is the current name; sns.set is kept only as an alias
sns.set_theme(style='whitegrid')

In [None]:
try:
    iris = load_iris()
    df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
    # Map the integer class codes (0, 1, 2) to their species names.
    df['species'] = iris.target
    df['species'] = df['species'].map(dict(zip(range(3), iris.target_names)))
    print("✅ Dataset loaded successfully.")
except Exception as e:
    print("❌ Failed to load dataset:", str(e))
    raise  # re-raise so later cells do not run against an undefined df

In [None]:
df.head()

In [None]:
df.info()  # prints its summary directly and returns None, so no print() wrapper
print("\nMissing values:\n", df.isnull().sum())

In [None]:
if df.isnull().values.any():
    df = df.dropna()
    print("Dropped rows with missing values.")
else:
    print("No missing values found.")
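Task 1 mentions a CSV load, but the cells above build the DataFrame directly from `load_iris()`. The following is a minimal sketch of the CSV path, assuming an in-memory buffer in place of a real file. The `csv_text` sample is hypothetical, and one cell is left blank on purpose so that the missing-value check has something to report.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk. The sepal width in
# the second data row is deliberately blank to produce one missing value.
csv_text = """sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),species
5.1,3.5,1.4,0.2,setosa
7.0,,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

# read_csv accepts any file-like object, so StringIO works like an open file.
sample = pd.read_csv(io.StringIO(csv_text))

# Count missing values per column, then drop the affected rows.
missing_per_column = sample.isnull().sum()
cleaned = sample.dropna()

print(sample.shape)
print(missing_per_column)
print(len(cleaned))
```

With a real file, the `io.StringIO(csv_text)` argument would be replaced by the file path. The rest of the check stays the same.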

## 📈 Task 2: Basic Data Analysis
We compute summary statistics for each feature, then group by species to compare the per-species feature means.

In [None]:
df.describe()

In [None]:
df.groupby('species').mean()
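The group-by step can report more than one statistic per feature. Below is a small self-contained sketch, assuming a handful of illustrative rows rather than the notebook's full `df`. The `sample` frame and the choice of `mean` and `max` are assumptions made for the example.

```python
import pandas as pd

# Illustrative rows only (two per species); the notebook's df is larger.
sample = pd.DataFrame({
    'petal length (cm)': [1.4, 1.4, 4.7, 4.5, 6.0, 5.1],
    'petal width (cm)': [0.2, 0.2, 1.4, 1.5, 2.5, 1.9],
    'species': ['setosa', 'setosa', 'versicolor',
                'versicolor', 'virginica', 'virginica'],
})

# One row per species, with mean and max computed for each numeric column.
# The result has two-level column labels: (feature, statistic).
summary = sample.groupby('species').agg(['mean', 'max'])
print(summary)

# Species ordered by mean petal length, largest first.
ranked = summary[('petal length (cm)', 'mean')].sort_values(ascending=False)
print(ranked.index.tolist())
```

Applied to the notebook's `df`, the same `agg` call would give the per-species spread alongside the means from the cell above.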

## 📊 Task 3: Data Visualization
The four plots below (line, bar, histogram, and scatter) each carry a title, axis labels, and a seaborn palette or custom color.

In [None]:
plt.figure(figsize=(10, 5))
plt.plot(df.index, df['sepal length (cm)'], label='Sepal Length')
plt.plot(df.index, df['petal length (cm)'], label='Petal Length')
plt.title('Sepal and Petal Length Trend Over Samples')
plt.xlabel('Sample Index')
plt.ylabel('Length (cm)')
plt.legend()
plt.tight_layout()
plt.show()

In [None]:
plt.figure(figsize=(8, 5))
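# Assumes seaborn >= 0.13, where palette without hue is deprecated and legend= is supported
sns.barplot(x='species', y='petal length (cm)', hue='species', data=df, palette='muted', legend=False)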
plt.title('Average Petal Length per Species')
plt.xlabel('Species')
plt.ylabel('Petal Length (cm)')
plt.tight_layout()
plt.show()

In [None]:
plt.figure(figsize=(8, 5))
sns.histplot(df['sepal width (cm)'], bins=15, kde=True, color='skyblue')
plt.title('Distribution of Sepal Width')
plt.xlabel('Sepal Width (cm)')
plt.ylabel('Frequency')
plt.tight_layout()
plt.show()

In [None]:
plt.figure(figsize=(8, 5))
sns.scatterplot(x='sepal length (cm)', y='petal length (cm)', hue='species', data=df, palette='deep')
plt.title('Sepal Length vs Petal Length')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Petal Length (cm)')
plt.legend(title='Species')
plt.tight_layout()
plt.show()
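The four plotting cells repeat the same title, label, and layout calls. One way to reduce that repetition is a small helper built on Matplotlib's object-oriented API, sketched here. `label_axes` is a hypothetical name, not something the notebook defines, and the plotted values are illustrative only.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def label_axes(ax, title, xlabel, ylabel):
    """Apply a title and both axis labels to one Axes, then return it."""
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    return ax


fig, ax = plt.subplots(figsize=(8, 5))
ax.plot([0, 1, 2], [1.4, 4.7, 6.0], label="Petal Length")  # illustrative values only
label_axes(ax, "Petal Length by Sample", "Sample Index", "Length (cm)")
ax.legend()
fig.tight_layout()
```

Inside the notebook the `matplotlib.use("Agg")` line would be dropped so that figures still render inline. seaborn's axes-level functions accept an `ax=` argument, so the same helper could label those plots too.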