# Task 1 — Exploring and Visualizing a Simple Dataset (Iris)

This notebook fulfills **Task 1** from the DevelopersHub Internship tasks. It uses scikit-learn's built-in Iris dataset (works offline) to:

1. Load the dataset into a pandas DataFrame
2. Show the shape, columns, head, `.info()`, and `.describe()`
3. Visualize the data with a scatter plot, histograms, and box plots (using **matplotlib only**, no seaborn)

**Skills practiced:** data loading, descriptive statistics, and basic visualization.


In [None]:
# Imports (matplotlib only for plots)
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt

# Do not set styles or colors explicitly per instructions.


In [None]:
# Load Iris dataset
iris = load_iris(as_frame=True)
# rename() returns a new DataFrame, so the frame held by the Bunch object is left unchanged
df = iris.frame.rename(columns={'target': 'species_id'})
df['species'] = df['species_id'].map(dict(enumerate(iris.target_names)))
df.head()

In [None]:
# Basic info
print('Shape:', df.shape)
print('Columns:', list(df.columns))
display(df.head())
df.info()  # info() prints its summary and returns None, so it is not wrapped in display()
display(df.describe())

In [None]:
# Scatter plot: sepal_length vs sepal_width
plt.figure()
for sp in df['species'].unique():
    sub = df[df['species'] == sp]
    plt.scatter(sub['sepal length (cm)'], sub['sepal width (cm)'], label=sp)
plt.xlabel('sepal length (cm)')
plt.ylabel('sepal width (cm)')
plt.title('Iris: Sepal Length vs Sepal Width')
plt.legend()
plt.show()

In [None]:
# Histograms for each measurement column (species_id is a class label, so it is skipped)
numeric_cols = df.select_dtypes(include=[np.number]).columns.drop('species_id')
for col in numeric_cols:
    plt.figure()
    df[col].hist(bins=20)
    plt.xlabel(col)
    plt.ylabel('Frequency')
    plt.title(f'Histogram of {col}')
    plt.show()

In [None]:
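# Box plots for each measurement column, grouped by species
species = df['species'].unique()
for col in ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']:
    plt.figure()
    data = [df[df['species'] == sp][col].values for sp in species]
    plt.boxplot(data)
    # Label the boxes with xticks: boxplot's `labels` argument is deprecated
    # since matplotlib 3.9 in favor of `tick_labels`, so this works on both.
    plt.xticks(range(1, len(species) + 1), species)
    plt.ylabel(col)
    plt.title(f'Box Plot of {col} by Species')
    plt.show()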

## Notes & Insights
- Setosa has much smaller petal lengths and widths than the other two species, so it separates cleanly from them on the petal features alone.
- Versicolor and Virginica overlap on several features, so a single threshold will not separate them perfectly; a model that combines features is likely needed.
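
The two observations above can be checked numerically. The cell below is a minimal sketch rather than part of the original task: it rebuilds the same DataFrame so it runs on its own, then compares the per-species minimum and maximum of the petal features. The helper `ranges_overlap` and the variable `petal_ranges` are names introduced here for illustration only.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the DataFrame used earlier in the notebook so this cell is self-contained.
iris = load_iris(as_frame=True)
df = iris.frame.rename(columns={'target': 'species_id'})
df['species'] = df['species_id'].map(dict(enumerate(iris.target_names)))

# Per-species minimum and maximum of the two petal features.
petal_cols = ['petal length (cm)', 'petal width (cm)']
petal_ranges = df.groupby('species')[petal_cols].agg(['min', 'max'])
print(petal_ranges)


def ranges_overlap(ranges, col, a, b):
    """Hypothetical helper: True if the [min, max] intervals of species a and b overlap on col."""
    lo = max(ranges.loc[a, (col, 'min')], ranges.loc[b, (col, 'min')])
    hi = min(ranges.loc[a, (col, 'max')], ranges.loc[b, (col, 'max')])
    return bool(lo <= hi)


# Compare the species pairs discussed in the notes.
for col in petal_cols:
    print(col)
    print('  setosa vs versicolor overlap:',
          ranges_overlap(petal_ranges, col, 'setosa', 'versicolor'))
    print('  versicolor vs virginica overlap:',
          ranges_overlap(petal_ranges, col, 'versicolor', 'virginica'))
```

A range comparison is a coarse check: it only says whether the extreme values of two species cross, not how many samples fall in the shared region. The box plots above give a better sense of how large the overlap is.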
