# Python Data Visualization Exercises

## MCDA 5511: Matplotlib, Seaborn & Plotly

Practice what you've learned with hands-on visualization exercises.

**Instructions:**
- Complete the code in each cell where you see `# YOUR CODE HERE`
- Run the verification cells (provided for Exercises 1 and 5) to check your work
- Each exercise builds on concepts from the slides
- Hints are provided, but try each exercise without them first!

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import numpy as np

# Set default style (this style name requires Matplotlib 3.6 or newer)
plt.style.use('seaborn-v0_8-whitegrid')

print("Setup complete!")

---
## Exercise 1: Matplotlib Basics

Create a line plot of a sine wave, add labels and title, and save as PNG.

### 1.1 Create a sine wave line plot

Create a line plot showing one complete cycle of a sine wave (0 to 2*pi).

Requirements:
- Use `np.linspace()` to create 100 x-values from 0 to 2*pi
- Compute y = sin(x)
- Use the object-oriented interface: `fig, ax = plt.subplots()`

<details>
<summary>Hint</summary>

```python
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)
fig, ax = plt.subplots()
ax.plot(x, y)
```
</details>
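
A quick way to see what `np.linspace()` returns is to inspect the array itself. The sketch below is only an illustration (the variable `step` is not part of the exercise); it shows that both endpoints are included, so 100 points cover 99 equal intervals.

```python
import numpy as np

# np.linspace includes both endpoints by default, so 100 points
# span 99 equal intervals between 0 and 2*pi.
x = np.linspace(0, 2 * np.pi, 100)
step = x[1] - x[0]

print(len(x))                            # number of samples
print(x[0], x[-1])                       # first and last x-value
print(np.isclose(step, 2 * np.pi / 99))  # spacing is the range divided by 99
```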

In [None]:
# YOUR CODE HERE
x = ...
y = ...

fig, ax = plt.subplots()
...

plt.show()

### 1.2 Add labels and title

Enhance your plot with:
- X-axis label: "Angle (radians)"
- Y-axis label: "sin(x)"
- Title: "Sine Wave"

<details>
<summary>Hint</summary>

```python
ax.set_xlabel("Angle (radians)")
ax.set_ylabel("sin(x)")
ax.set_title("Sine Wave")
```
</details>

In [None]:
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)

# YOUR CODE HERE - add labels and title
...
...
...

plt.show()

### 1.3 Save as PNG with 300 DPI

Save your figure to a file called `sine_wave.png` with 300 DPI resolution.

<details>
<summary>Hint</summary>

```python
fig.savefig("sine_wave.png", dpi=300, bbox_inches="tight")
```
</details>
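
To get a feel for what 300 DPI means, it can help to work out the pixel size of the saved image. The sketch below assumes a 6 by 4 inch figure (an arbitrary size chosen for this illustration) and multiplies inches by dots per inch, giving 1800 by 1200 pixels.

```python
import matplotlib.pyplot as plt

DPI = 300  # resolution requested in the exercise

# figsize is given in inches; (6, 4) is an arbitrary choice for this sketch.
fig, ax = plt.subplots(figsize=(6, 4))

# Pixel dimensions = inches * dots per inch.
width_in, height_in = fig.get_size_inches()
width_px = round(width_in * DPI)
height_px = round(height_in * DPI)
print(width_px, height_px)

plt.close(fig)
```

Note that `bbox_inches="tight"` crops the figure to its contents, so a file saved with that option will usually not match this size exactly.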

In [None]:
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_xlabel("Angle (radians)")
ax.set_ylabel("sin(x)")
ax.set_title("Sine Wave")

# YOUR CODE HERE - save the figure
...

print("Figure saved!")

In [None]:
# Verification
from pathlib import Path
assert Path("sine_wave.png").exists(), "sine_wave.png was not created"
print("Exercise 1 Passed!")

---
## Exercise 2: Multiple Plots (Subplots)

Create a 2x2 grid of different plot types.

### 2.1 Create a 2x2 subplot grid

Create a figure with 4 subplots arranged in a 2x2 grid:
1. Top-left: Line plot of sine
2. Top-right: Line plot of cosine
3. Bottom-left: Bar chart of categories A, B, C, D with values 25, 40, 30, 55
4. Bottom-right: Histogram of 1000 random normal values

Call `plt.tight_layout()` after plotting so that titles and tick labels do not overlap neighboring panels.

<details>
<summary>Hint</summary>

```python
fig, axes = plt.subplots(2, 2, figsize=(10, 8))
# Access subplots with axes[row, col]
axes[0, 0].plot(...)  # top-left
axes[0, 1].plot(...)  # top-right
axes[1, 0].bar(...)   # bottom-left
axes[1, 1].hist(...)  # bottom-right
plt.tight_layout()
```
</details>
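
The `axes[row, col]` indexing works because `plt.subplots(2, 2)` returns a NumPy array of Axes objects. As a small sketch (the panel titles are placeholders, not part of the exercise), the array can also be walked in one loop:

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# With a 2x2 grid, `axes` is a NumPy array of Axes objects.
print(axes.shape)

# `axes.flat` walks the grid row by row, which is handy for
# settings shared by every panel (the titles here are placeholders).
for i, ax in enumerate(axes.flat, start=1):
    ax.set_title(f"Panel {i}")

plt.tight_layout()
```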

In [None]:
# Data for plots
x = np.linspace(0, 2 * np.pi, 100)
categories = ["A", "B", "C", "D"]
values = [25, 40, 30, 55]
random_data = np.random.randn(1000)

# YOUR CODE HERE
fig, axes = plt.subplots(...)

# Top-left: sine
...

# Top-right: cosine
...

# Bottom-left: bar chart
...

# Bottom-right: histogram
...

# Fix spacing
...

plt.show()

---
## Exercise 3: Seaborn Statistical Plots

Use Seaborn's built-in datasets and statistical plot functions.

In [None]:
# Load the penguins dataset
penguins = sns.load_dataset("penguins")
print(penguins.head())
print(f"\nShape: {penguins.shape}")

### 3.1 Create a boxplot by species

Create a boxplot showing the distribution of `body_mass_g` for each `species`.

<details>
<summary>Hint</summary>

```python
sns.boxplot(data=penguins, x="species", y="body_mass_g")
```
</details>
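
The exercise cell creates `fig, ax` before plotting. Seaborn draws on the current Axes by default, so the hint works as written, but passing `ax=ax` makes the target explicit. The sketch below uses a tiny made-up table (`toy` is hypothetical, not the real penguins data) so it runs without downloading anything.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Tiny made-up table standing in for the penguins data.
toy = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Gentoo"],
    "body_mass_g": [3700, 3800, 5000, 5200],
})

fig, ax = plt.subplots(figsize=(8, 5))

# ax=ax tells seaborn exactly which Axes to draw on;
# the function returns that same Axes object.
returned_ax = sns.boxplot(data=toy, x="species", y="body_mass_g", ax=ax)
print(returned_ax is ax)
```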

In [None]:
fig, ax = plt.subplots(figsize=(8, 5))

# YOUR CODE HERE
...

ax.set_title("Penguin Body Mass by Species")
plt.show()

### 3.2 Add a hue for sex

Enhance the boxplot by adding `hue="sex"` to compare male and female penguins within each species.

<details>
<summary>Hint</summary>

```python
sns.boxplot(data=penguins, x="species", y="body_mass_g", hue="sex")
```
</details>

In [None]:
fig, ax = plt.subplots(figsize=(10, 6))

# YOUR CODE HERE - boxplot with hue
...

ax.set_title("Penguin Body Mass by Species and Sex")
plt.show()

---
## Exercise 4: Seaborn Correlation Heatmap

Compute a correlation matrix and visualize it as a heatmap.

### 4.1 Compute correlation matrix

Compute the correlation matrix for the numeric columns in the penguins dataset.

<details>
<summary>Hint</summary>

```python
numeric_cols = penguins.select_dtypes("number")
corr = numeric_cols.corr()
```
</details>
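
If the output of `.corr()` is hard to read at first, a toy example may help. The sketch below uses made-up columns (`a`, `b`, `c`, and `label` are hypothetical) where the relationships are obvious: `b` rises with `a`, `c` falls as `a` rises, and the text column is dropped by `select_dtypes("number")`.

```python
import pandas as pd

# Made-up data: b rises with a, c falls as a rises.
toy = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8],
    "c": [4, 3, 2, 1],
    "label": ["w", "x", "y", "z"],  # non-numeric, will be dropped
})

numeric_cols = toy.select_dtypes("number")
corr = numeric_cols.corr()  # Pearson correlation by default
print(corr)
```

The `a`/`b` entry comes out as 1 and the `a`/`c` entry as -1, which is the full range a correlation coefficient can take.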

In [None]:
# YOUR CODE HERE
numeric_cols = ...
corr = ...

print(corr)

### 4.2 Create annotated heatmap with diverging colormap

Create a heatmap of the correlation matrix with:
- Annotations showing the correlation values (`annot=True`)
- A diverging colormap (`cmap="RdBu_r"`)
- Center the colormap at 0 (`center=0`)

<details>
<summary>Hint</summary>

```python
sns.heatmap(corr, annot=True, cmap="RdBu_r", center=0, fmt=".2f")
```
</details>

In [None]:
fig, ax = plt.subplots(figsize=(8, 6))

# YOUR CODE HERE
...

ax.set_title("Penguin Feature Correlations")
plt.show()

---
## Exercise 5: Plotly Interactive Scatter

Create interactive scatter plots with hover data and color encoding.

In [None]:
# Load the tips dataset
tips = px.data.tips()
print(tips.head())

### 5.1 Create scatter with hover data

Create an interactive scatter plot with:
- x = "total_bill"
- y = "tip"
- Hover data showing "size" (party size)

<details>
<summary>Hint</summary>

```python
fig = px.scatter(tips, x="total_bill", y="tip", hover_data=["size"])
fig.show()
```
</details>

In [None]:
# YOUR CODE HERE
fig = ...

fig.show()

### 5.2 Add color by category

Enhance the scatter plot by coloring points by the "day" column.

<details>
<summary>Hint</summary>

```python
fig = px.scatter(tips, x="total_bill", y="tip", color="day", hover_data=["size"])
```
</details>

In [None]:
# YOUR CODE HERE
fig = ...

fig.show()

### 5.3 Export as HTML

Save the interactive plot to an HTML file called `tips_scatter.html`.

<details>
<summary>Hint</summary>

```python
fig.write_html("tips_scatter.html")
```
</details>

In [None]:
fig = px.scatter(tips, x="total_bill", y="tip", color="day", hover_data=["size"])

# YOUR CODE HERE
...

print("HTML file saved!")

In [None]:
# Verification
from pathlib import Path  # repeated so this cell runs on its own
assert Path("tips_scatter.html").exists(), "tips_scatter.html was not created"
print("Exercise 5 Passed!")

---
## Exercise 6: Plotly Animation

Create an animated visualization using the Gapminder dataset.

In [None]:
# Load the gapminder dataset
gapminder = px.data.gapminder()
print(gapminder.head())
print(f"\nYears available: {sorted(gapminder['year'].unique())}")

### 6.1 Create animated scatter plot

Create the famous Hans Rosling-style animated bubble chart:
- x = "gdpPercap" (log scale)
- y = "lifeExp"
- size = "pop" (population)
- color = "continent"
- hover_name = "country"
- animation_frame = "year"

<details>
<summary>Hint</summary>

```python
fig = px.scatter(
    gapminder,
    x="gdpPercap",
    y="lifeExp",
    size="pop",
    color="continent",
    hover_name="country",
    animation_frame="year",
    log_x=True
)
```
</details>

In [None]:
# YOUR CODE HERE
fig = px.scatter(
    gapminder,
    x=...,
    y=...,
    size=...,
    color=...,
    hover_name=...,
    animation_frame=...,
    log_x=True
)

fig.show()

### 6.2 Fix axis ranges

Without fixed ranges, the axes are not guaranteed to fit every frame, so points can fall outside the visible area as the animation plays. Fix this by setting:
- `range_x=[100, 100000]` (GDP range)
- `range_y=[25, 90]` (life expectancy range)

<details>
<summary>Hint</summary>

Add these parameters to `px.scatter()`:
```python
range_x=[100, 100000],
range_y=[25, 90]
```
</details>

In [None]:
# YOUR CODE HERE - add range_x and range_y
fig = px.scatter(
    gapminder,
    x="gdpPercap",
    y="lifeExp",
    size="pop",
    color="continent",
    hover_name="country",
    animation_frame="year",
    log_x=True,
    range_x=...,  # Add range_x
    range_y=...,  # Add range_y
)

fig.show()

---
## Bonus: Same Plot, Three Libraries

Create the same visualization using all three libraries to compare syntax.

### Challenge: Scatter plot of tips data

Create a scatter plot of `total_bill` vs `tip` colored by `day` using:
1. Matplotlib
2. Seaborn
3. Plotly

Compare the code complexity and output quality.
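
<details>
<summary>Hint</summary>

Matplotlib has no `hue` or `color=` column shortcut, so one common pattern is to loop over the groups and call `ax.scatter()` once per group. The sketch below assumes a small made-up table (`toy` is hypothetical) in place of the real tips data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the tips data.
toy = pd.DataFrame({
    "total_bill": [10.0, 20.0, 15.0, 30.0, 25.0],
    "tip": [1.5, 3.0, 2.0, 5.0, 4.0],
    "day": ["Thur", "Fri", "Thur", "Sat", "Fri"],
})

fig, ax = plt.subplots(figsize=(8, 5))

# One scatter call per day: Matplotlib assigns the next color in its
# cycle to each call, and `label` feeds the legend.
for day, group in toy.groupby("day"):
    ax.scatter(group["total_bill"], group["tip"], label=day)

ax.set_xlabel("total_bill")
ax.set_ylabel("tip")
ax.legend(title="day")
```
</details>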

In [None]:
# Load the tips dataset through seaborn (a pandas DataFrame, like the Plotly copy)
tips_df = sns.load_dataset("tips")

In [None]:
# Matplotlib version
# YOUR CODE HERE
fig, ax = plt.subplots(figsize=(8, 5))

# Need to manually handle colors for each day
...

plt.show()

In [None]:
# Seaborn version
# YOUR CODE HERE
fig, ax = plt.subplots(figsize=(8, 5))

...

plt.show()

In [None]:
# Plotly version
# YOUR CODE HERE
fig = ...

fig.show()

---
## Congratulations!

You've completed all the visualization exercises. You now have hands-on experience with:

1. **Matplotlib Basics** - line plots, labels, saving figures
2. **Subplots** - creating multi-panel figures
3. **Seaborn Statistical Plots** - boxplots, hue encoding
4. **Seaborn Heatmaps** - correlation matrices, colormaps
5. **Plotly Interactive** - hover data, color encoding, HTML export
6. **Plotly Animation** - animated scatter plots, fixed axis ranges

**Next steps:**
- Try recreating visualizations from the library galleries
- Apply these techniques to your own datasets
- Explore more plot types in each library's documentation