# 🔬 Visualization Mini-Lab: Trends and Outliers

Now it’s time to put your skills to the test!
In this mini-lab, you’ll explore a real dataset, create visualizations, and identify **trends and outliers**.

We’ll use the well-known **Palmer Penguins dataset**, which is also available as one of Seaborn’s example datasets.

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

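# Load the dataset from a local CSV file.
# Alternative: fetch Seaborn's copy instead (downloads it on first use):
# penguins = sns.load_dataset("penguins")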
penguins = pd.read_csv("../../data/penguins.csv")
penguins.head()

## 1. Explore Distributions

First, let’s look at the distribution of **body mass**.

In [None]:
sns.histplot(data=penguins, x="body_mass_g", kde=True, hue="species")
plt.title("Distribution of Body Mass by Species")
plt.show()

👉 **Task:** Do some species have higher average body mass than others?
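To check your visual impression with numbers, one option is to compute the mean per group. The sketch below is a minimal example, not part of the original lab: `mean_by_group` is a hypothetical helper name, and the `toy` frame contains made-up values rather than real penguin measurements.

```python
import pandas as pd


def mean_by_group(df: pd.DataFrame, group_col: str, value_col: str) -> pd.Series:
    """Return the mean of value_col for each group, largest first."""
    return (
        df.groupby(group_col)[value_col]
        .mean()  # missing values are skipped by default
        .sort_values(ascending=False)
    )


# Tiny made-up frame for illustration only (not real measurements).
toy = pd.DataFrame({
    "species": ["A", "A", "B", "B"],
    "body_mass_g": [3000.0, 3400.0, 5000.0, None],
})
print(mean_by_group(toy, "species", "body_mass_g"))
```

With the lab data, the equivalent call would be `mean_by_group(penguins, "species", "body_mass_g")`.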

## 2. Relationships Between Variables

Scatterplots help us see trends between two quantitative variables.

In [None]:
sns.scatterplot(data=penguins, x="flipper_length_mm", y="body_mass_g", hue="species")
plt.title("Flipper Length vs Body Mass")
plt.show()

👉 **Task:** Which species tends to have longer flippers AND higher body mass?

## 3. Box Plots for Outliers

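Box plots are great for spotting outliers: by default, any point lying more than 1.5 × IQR (interquartile range) beyond the box is drawn as an individual marker past the whiskers.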

In [None]:
sns.boxplot(data=penguins, x="species", y="bill_length_mm", palette="Set2")
plt.title("Bill Length by Species")
plt.show()

👉 **Task:** Can you spot any outliers in the bill length data?
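If you want to list the flagged points rather than read them off the plot, the 1.5 × IQR rule of thumb can be applied by hand. This is a minimal sketch under stated assumptions: `iqr_outliers` is a hypothetical helper, the `toy` values are made up, and the quartiles computed here may differ slightly from those used internally by the plotting library.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the entries lying more than k * IQR outside the quartiles."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Missing values fail both comparisons, so they are never flagged.
    return values[(values < lower) | (values > upper)]


# Made-up values for illustration only; 60.0 sits far from the rest.
toy = pd.Series([38.0, 39.0, 40.0, 41.0, 42.0, 60.0])
print(iqr_outliers(toy))
```

Because the box plot above draws one box per species, a comparable check on the lab data would apply the helper within each group, for example `penguins.groupby("species")["bill_length_mm"].apply(iqr_outliers)`.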

## 4. Correlation Heatmap

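A correlation heatmap shows the pairwise correlation (Pearson by default) between all numeric variables at once. Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate a weak one.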

In [None]:
corr = penguins.dropna().corr(numeric_only=True)
sns.heatmap(corr, annot=True, cmap="coolwarm", center=0)
plt.title("Correlation Heatmap")
plt.show()

👉 **Task:** Which variables appear to be strongly correlated?
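Reading a heatmap by eye gets harder as the number of variables grows, so it can help to list the strongest pairs directly. The sketch below is one possible approach: `strongest_pairs` is a hypothetical helper, the 0.7 cutoff is an arbitrary assumption rather than a standard, and the `toy` frame is made up for illustration.

```python
import numpy as np
import pandas as pd


def strongest_pairs(df: pd.DataFrame, threshold: float = 0.7) -> pd.Series:
    """List variable pairs whose absolute Pearson r is at least threshold."""
    corr = df.corr(numeric_only=True)
    # Keep only the cells above the diagonal so each pair appears once.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    # Masked cells are NaN and can never pass the threshold test.
    strong = pairs[pairs.abs() >= threshold]
    return strong.sort_values(key=lambda s: s.abs(), ascending=False)


# Made-up data for illustration: y is exactly 2 * x, z just alternates.
toy = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [1.0, -1.0, 1.0, -1.0],
})
print(strongest_pairs(toy))
```

On the lab data, `strongest_pairs(penguins.dropna())` would report the same pairs that stand out in the heatmap, assuming the same rows are used.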

## 5. Your Turn

- Try a different dataset (e.g., `tips`, `iris`).
- Create **at least 2 plots** that reveal a trend.
- Create **1 plot** that highlights possible outliers.
- Write a short summary of your findings.

> Remember: The goal is to **use visualizations to tell a story about the data.**

---
✅ Congratulations! You’ve completed the visualization mini-lab and Week 07.

Next module → Working with larger datasets and diving deeper into analysis.