# Creating Data Exploration Notebooks with Visualizations

## 📚 Learning Objectives

By completing this notebook, you will:
- Create common chart types with Matplotlib/Seaborn
- Build an interactive Plotly chart
- Apply basic styling and annotations

## 🔗 Prerequisites

- ✅ Python basics
- ✅ Jupyter Notebook basics

---

## Official Structure Reference

This notebook covers practical activities from **Course 12, Unit 2**:
- Creating data exploration notebooks with visualizations
- **Source:** `DETAILED_UNIT_DESCRIPTIONS.md`

---

## Overview

We will generate a small synthetic dataset in memory (300 rows: a normally distributed `x`, a noisy linear `y`, and a three-level `category`) and visualize it using:
- Matplotlib + Seaborn (static)
- Plotly (interactive)


In [None]:
import numpy as np
import pandas as pd

# Plotting libraries
import matplotlib.pyplot as plt
import seaborn as sns

# Make results reproducible
rng = np.random.default_rng(42)

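# Synthetic data: x is standard normal, y is linear in x plus Gaussian noise
n = 300
x = rng.normal(0, 1, size=n)
y = 2.0 * x + rng.normal(0, 0.6, size=n)
# Category labels drawn with unequal probabilities
cat = rng.choice(['A', 'B', 'C'], size=n, p=[0.4, 0.35, 0.25])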

df = pd.DataFrame({'x': x, 'y': y, 'category': cat})
df.head()
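
Before plotting, it can help to confirm that the generated data looks the way the setup above intends. The following is a minimal sketch of such a check, not part of the original unit material; it rebuilds the same DataFrame so it can run on its own, and the names `summary`, `category_share`, and `corr_xy` are illustrative choices.

```python
import numpy as np
import pandas as pd

# Rebuild the same synthetic data so this cell runs on its own
rng = np.random.default_rng(42)
n = 300
x = rng.normal(0, 1, size=n)
y = 2.0 * x + rng.normal(0, 0.6, size=n)
cat = rng.choice(['A', 'B', 'C'], size=n, p=[0.4, 0.35, 0.25])
df = pd.DataFrame({'x': x, 'y': y, 'category': cat})

# Summary statistics for the numeric columns
summary = df[['x', 'y']].describe()

# Observed share of each category (the sampling weights were 0.40 / 0.35 / 0.25)
category_share = df['category'].value_counts(normalize=True).sort_index()

# Pearson correlation between x and y; it should be strongly positive
# because y was constructed as 2 * x plus noise
corr_xy = df['x'].corr(df['y'])

print(summary.round(2))
print(category_share.round(2))
print(f'corr(x, y) = {corr_xy:.3f}')
```

If the category shares are far from the sampling weights or the correlation is weak, the generation step is worth rechecking before interpreting any chart.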


In [None]:
# Basic Seaborn scatter + regression line
sns.set_theme(style='whitegrid')
plt.figure(figsize=(7, 4))
sns.regplot(data=df, x='x', y='y', scatter_kws={'alpha': 0.35})
plt.title('y vs x with regression fit')
plt.show()

# Distribution plots: histogram with KDE (left), box plot per category (right)
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
sns.histplot(df['x'], kde=True, ax=axes[0])
axes[0].set_title('Distribution of x')

sns.boxplot(data=df, x='category', y='y', ax=axes[1])
axes[1].set_title('y by category')
plt.tight_layout()
plt.show()
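
The learning objectives mention annotations, which the plots above do not yet use. The following is one possible sketch with plain Matplotlib: it fits a line with `np.polyfit`, then uses `ax.annotate` to label the point that lies farthest from that line. Which point to highlight and the label text are assumptions made for illustration, and the data is rebuilt here so the cell runs on its own.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Rebuild the x/y columns of the synthetic data (same seed as above)
rng = np.random.default_rng(42)
n = 300
x = rng.normal(0, 1, size=n)
y = 2.0 * x + rng.normal(0, 0.6, size=n)
df = pd.DataFrame({'x': x, 'y': y})

# Least-squares line and the row with the largest absolute residual
slope, intercept = np.polyfit(df['x'].to_numpy(), df['y'].to_numpy(), deg=1)
residuals = df['y'] - (slope * df['x'] + intercept)
farthest = df.loc[residuals.abs().idxmax()]

fig, ax = plt.subplots(figsize=(7, 4))
ax.scatter(df['x'], df['y'], alpha=0.35)

# Draw the fitted line across the observed x range
xs = np.linspace(df['x'].min(), df['x'].max(), 100)
ax.plot(xs, slope * xs + intercept, color='black', linewidth=1.5,
        label=f'fit: y = {slope:.2f}x + {intercept:.2f}')

# Annotation: an arrow from the offset label to the highlighted point
ax.annotate('largest residual',
            xy=(farthest['x'], farthest['y']),
            xytext=(20, 25), textcoords='offset points',
            arrowprops={'arrowstyle': '->'})

ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_title('y vs x with an annotated point')
ax.legend()
plt.show()
```

Using `textcoords='offset points'` keeps the label at a fixed distance from the point regardless of the axis scale, which tends to be easier to tune than data coordinates.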


In [None]:
# Interactive Plotly chart (optional)
# Note: trendline='ols' also requires the statsmodels package
try:
    import plotly.express as px

    fig = px.scatter(df, x='x', y='y', color='category', trendline='ols',
                     title='Interactive scatter (Plotly)')
    fig.show()
except ImportError:
    print('Plotly or statsmodels not installed. '
          'Install with: pip install plotly statsmodels')
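
Basic styling applies to the interactive chart as well. The sketch below is one way to do it, assuming only Plotly is available (it omits the trendline): it formats the hover values, fixes the legend order, and adjusts marker size and opacity. The helper name `build_styled_scatter` is hypothetical, and the specific style values are arbitrary choices.

```python
import numpy as np
import pandas as pd

# Rebuild the same synthetic data so this cell runs on its own
rng = np.random.default_rng(42)
n = 300
x = rng.normal(0, 1, size=n)
y = 2.0 * x + rng.normal(0, 0.6, size=n)
cat = rng.choice(['A', 'B', 'C'], size=n, p=[0.4, 0.35, 0.25])
df = pd.DataFrame({'x': x, 'y': y, 'category': cat})


def build_styled_scatter(frame):
    """Hypothetical helper: return a styled Plotly figure, or None if Plotly is missing."""
    try:
        import plotly.express as px
    except ImportError:
        print('Plotly not installed. Install with: pip install plotly')
        return None

    fig = px.scatter(
        frame, x='x', y='y', color='category',
        hover_data={'x': ':.2f', 'y': ':.2f'},          # two decimals on hover
        category_orders={'category': ['A', 'B', 'C']},  # stable legend order
        title='Styled interactive scatter (Plotly)',
    )
    # Smaller, semi-transparent markers reduce overplotting
    fig.update_traces(marker={'size': 7, 'opacity': 0.6})
    fig.update_layout(template='plotly_white', legend_title_text='Category')
    return fig


fig = build_styled_scatter(df)
fig  # in Jupyter, a figure left as the last expression of a cell is rendered
```

Returning the figure instead of calling `fig.show()` inside the helper keeps the styling reusable and lets the notebook decide where to display it.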
