# Panel Midterm Project
**Student:** Cherprang Reanchaiyuth  
**Library:** Panel (HoloViz)  
**Dataset:** Wage.csv  

This project demonstrates how to build an interactive dashboard using the Panel library. It allows users to explore how wages vary with education, job class, and age.

## What is Panel?
Panel is an open-source Python library from the HoloViz ecosystem that helps data scientists turn notebooks and analyses into interactive dashboards or apps, without needing web-development skills.
It lets you combine widgets (like dropdowns and sliders) with plots, tables, and text, creating a small user interface (UI) directly inside Jupyter or as a standalone web page.

## Objectives of Panel

- **Make interactivity simple** – turn any Python analysis into a dashboard with just a few lines of code.

- **Bridge data & visualization** – connect data (pandas, NumPy) to visuals (Matplotlib, Bokeh, Plotly, hvPlot, etc.).

- **Work everywhere** – inside Jupyter notebooks, VS Code, or served as a mini web app using `panel serve`.

## Setup and Library Imports
Before building the dashboard, we import the key libraries:
- **pandas** for data manipulation  
- **panel** for UI creation  
- **hvplot** for interactive charts  

Then we enable Panel’s extension so the widgets display correctly in Jupyter.

In [5]:
import pandas as pd
import hvplot.pandas
import panel as pn
pn.extension()

## Load and Preview the Dataset
We load `Wage.csv`, which contains individual-level records of age, education, job class, and wage, among other columns. Previewing the data helps identify the columns we’ll use for filters and visualizations.

In [6]:
df = pd.read_csv("data/Wage.csv")
df.head()

| Unnamed: 0 | year | age | maritl | race | education | region | jobclass | health | health_ins | logwage | wage |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2006 | 18 | 1. Never Married | 1. White | 1. < HS Grad | 2. Middle Atlantic | 1. Industrial | 1. <=Good | 2. No | 4.318063 | 75.043154 |
| 1 | 2004 | 24 | 1. Never Married | 1. White | 4. College Grad | 2. Middle Atlantic | 2. Information | 2. >=Very Good | 2. No | 4.255273 | 70.47602 |
| 2 | 2003 | 45 | 2. Married | 1. White | 3. Some College | 2. Middle Atlantic | 1. Industrial | 1. <=Good | 1. Yes | 4.875061 | 130.982177 |
| 3 | 2003 | 43 | 2. Married | 3. Asian | 4. College Grad | 2. Middle Atlantic | 2. Information | 2. >=Very Good | 1. Yes | 5.041393 | 154.685293 |
| 4 | 2005 | 50 | 4. Divorced | 1. White | 2. HS Grad | 2. Middle Atlantic | 2. Information | 1. <=Good | 1. Yes | 4.318063 | 75.043154 |
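
The `Unnamed: 0` column in the preview is the name pandas gives a CSV column whose header field is empty, which typically happens when a DataFrame was saved together with its index. The sketch below shows that behavior and one way to avoid it with `index_col=0`. The two-row CSV text is made up for illustration and is not taken from `Wage.csv`.

```python
import io

import pandas as pd

# Hypothetical two-row CSV whose first header field is empty,
# as happens when a DataFrame is written out with its index.
csv_text = ",year,age,wage\n101,2006,18,75.04\n102,2004,24,70.48\n"

# Default read: the unnamed first column is kept as a data column.
raw = pd.read_csv(io.StringIO(csv_text))
print(list(raw.columns))

# With index_col=0 the first column becomes the row index instead.
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(list(indexed.columns))
```

Whether to keep that column or treat it as the index is a judgment call; this notebook keeps it and simply ignores it.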


## Build the User Interface (Widgets)
Widgets are interactive controls that let users filter and explore data. We’ll create a dropdown for education, radio buttons for job class, and range sliders for age and wage. These form the front-end controls of our dashboard.
### Clean Category Labels
The dataset prefixes some labels with `1.`, `2.`, etc.  
We remove those so dropdowns show clean values like **“Industrial”** and **“Information.”**
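
To make the cleaning step concrete, here is a minimal sketch of the regular expression applied to a few sample labels. The `labels` Series is a hypothetical stand-in for one of the dataset's columns, and `"Unknown"` is an invented value included to show that labels without a numeric prefix pass through unchanged.

```python
import pandas as pd

# Hypothetical labels in the same "N. Text" style as the dataset.
labels = pd.Series(["1. Industrial", "2. Information", "1. < HS Grad", "Unknown"])

# ^\s*   optional leading whitespace
# \d+\.  one or more digits followed by a literal dot
# \s*    optional whitespace after the dot
cleaned = labels.str.replace(r"^\s*\d+\.\s*", "", regex=True)

print(cleaned.tolist())
# ['Industrial', 'Information', '< HS Grad', 'Unknown']
```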
### Create the Widgets
We create one set of widgets the entire app will use:
- **Education** dropdown  
- **Job Class** radio buttons  
- **Age** range slider  
- **Wage** range slider  
- **Reset** button to clear filters

In [9]:
df2 = df.copy()
for col in ["education","jobclass","health","region","maritl","race","health_ins"]:
    if col in df2.columns and df2[col].dtype == object:
        df2[col] = df2[col].str.replace(r"^\s*\d+\.\s*", "", regex=True)

#Prepare widget option lists
edu_opts = [None] + sorted(df2["education"].dropna().unique().tolist())
job_opts = [None] + sorted(df2["jobclass"].dropna().unique().tolist())

#Create the widgets
education_w = pn.widgets.Select(name="Education", options=edu_opts, value=None)
jobclass_w  = pn.widgets.RadioButtonGroup(name="Job Class", options=job_opts, button_type="success")

#Create the sliders
age_slider  = pn.widgets.IntRangeSlider(
    name="Age Range", start=int(df2["age"].min()), end=int(df2["age"].max()),
    value=(int(df2["age"].min()), int(df2["age"].max())))
w_min, w_max = float(df2["wage"].min()), float(df2["wage"].max())
wage_slider = pn.widgets.RangeSlider(name="Wage Range", start=w_min, end=w_max, value=(w_min, w_max))

#Add a Reset button
reset_btn = pn.widgets.Button(name="Reset Filters", button_type="warning")
def _reset(_):
    education_w.value = None
    jobclass_w.value  = None
    age_slider.value  = (int(df2["age"].min()), int(df2["age"].max()))
    wage_slider.value = (w_min, w_max)
reset_btn.on_click(_reset)

#Display all widgets together
pn.Column(pn.pane.Markdown("### Filters"),
    education_w, jobclass_w, age_slider, wage_slider, reset_btn)

## Reactive Filtering (with `pn.bind`)
We define a function `_filtered(...)` and **bind** it to the widget values using `pn.bind`.  
This produces a bound function (`filtered_data`) that returns the filtered DataFrame and is re-evaluated whenever a widget changes.

In [11]:
def _filtered(edu, job, age, wage):
    d = df2.copy()
    if edu is not None:
        d = d[d["education"] == edu]
    if job is not None:
        d = d[d["jobclass"] == job]
    d = d[(d["age"] >= age[0]) & (d["age"] <= age[1])]
    d = d[(d["wage"] >= wage[0]) & (d["wage"] <= wage[1])]
    return d
#Bind the filter function to the widgets so downstream views update automatically
filtered_data = pn.bind(_filtered, education_w, jobclass_w, age_slider, wage_slider)
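
As an alternative sketch of the range checks in `_filtered`, pandas' `Series.between` expresses the same inclusive bounds more compactly. The `sample` DataFrame and the two ranges below are invented for illustration.

```python
import pandas as pd

# Hypothetical mini-frame with the two numeric columns used for filtering.
sample = pd.DataFrame({"age": [18, 24, 45, 50],
                       "wage": [75.0, 70.5, 131.0, 75.0]})

age_range, wage_range = (20, 50), (70.0, 100.0)

# Series.between is inclusive on both ends by default, like >= and <=.
mask = sample["age"].between(*age_range) & sample["wage"].between(*wage_range)

print(sample[mask])
```

Combining the conditions into one mask also avoids building an intermediate DataFrame for each filter step.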

## Summary Indicators (KPIs)
KPIs give a quick numeric snapshot of the **current filtered data**:
- **Avg Wage**
- **Median Wage**
- **Avg Age**
- **Rows (count)**

In [13]:
#Define KPI calculation function 
def _kpis(d):
    if d.empty:
        avg_w, med_w, avg_a, n = 0.0, 0.0, 0.0, 0
    else:
        avg_w = float(d["wage"].mean())
        med_w = float(d["wage"].median())
        avg_a = float(d["age"].mean())
        n     = int(len(d))
    return pn.FlexBox(
        pn.indicators.Number(name="Avg Wage", value=avg_w, format="{value:,.2f}"),
        pn.indicators.Number(name="Median Wage", value=med_w, format="{value:,.2f}"),
        pn.indicators.Number(name="Avg Age", value=avg_a, format="{value:,.2f}"),
        pn.indicators.Number(name="Rows", value=n, format="{value:,.0f}"),
        justify_content="space-around")

#Bind it to the filtered data
kpis_v = pn.bind(_kpis, filtered_data)
kpis_v
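
One way to make the KPI logic easier to check, sketched here under the assumption that computation and display are separated, is a helper that returns plain numbers. The name `summarize` and the sample data are hypothetical and do not appear in the notebook.

```python
import pandas as pd

def summarize(d):
    """Return KPI values as a plain dict; zeros when `d` has no rows."""
    if d.empty:
        return {"avg_wage": 0.0, "median_wage": 0.0, "avg_age": 0.0, "rows": 0}
    return {
        "avg_wage": float(d["wage"].mean()),
        "median_wage": float(d["wage"].median()),
        "avg_age": float(d["age"].mean()),
        "rows": len(d),
    }

# Hypothetical three-row sample.
sample = pd.DataFrame({"age": [20, 30, 40], "wage": [50.0, 70.0, 120.0]})

print(summarize(sample))
print(summarize(sample.iloc[0:0]))  # empty selection falls back to zeros
```

The resulting values could then be passed to `pn.indicators.Number` as in the cell above.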

## Visualization: Wage vs Age (colored by Education)
Interactive scatter plot of **Age** vs **Wage**, colored by **Education**.  
Use filters to see how the relationship changes across groups.

In [15]:
#Define scatter plot function
def _scatter(d):
    if d.empty:
        return pn.pane.Markdown("_No data for this selection._")
    return d.hvplot.scatter(
        x="age", y="wage", color="education",
        legend="top", height=380, width=620, alpha=0.7,
        title="Wage vs Age by Education")

#Bind it to the filtered data
scat_v = pn.bind(_scatter, filtered_data)
scat_v

## Combine Everything into a Dashboard
We arrange the **Controls** card next to a column holding the KPIs and the scatter plot, all in one layout.

In [17]:
#Create the control panel
controls = pn.Card(
    pn.pane.Markdown("### Filters"),
    education_w, jobclass_w, age_slider, wage_slider, reset_btn,
    title="Controls", collapsible=False)

#Put everything together in the dashboard
dashboard = pn.Row(controls, pn.Column(kpis_v, scat_v))
dashboard

## Reflection

Building this dashboard helped me understand:
- How to use **Panel widgets** to collect user input interactively.  
- How **`pn.bind()`** enables real-time updates between data, charts, and KPIs.  
- The importance of layout design and user experience when presenting data visually.  

**Panel** makes it possible to turn a simple Python analysis into an interactive web app inside Jupyter,
bridging the gap between coding and communication.