## VI LAB 2 

## Data collection and preparation

For this project, the data was obtained directly from the NSF Award Search portal, which is the official source used by the National Science Foundation to publish information about funded grants (referred to administratively as “awards”). This source was chosen because it provides all the attributes required by the project specification, including state, directorate, award dates, and awarded amounts, and because it allows filtering by both time period and award status.

Two datasets were collected to fully satisfy the project requirements. The first dataset contains all NSF grants awarded during the last five years (2020–2024) and serves as the baseline for analyzing current funding distribution and evolution. The second dataset contains NSF grants that were explicitly terminated during the Trump administration (2017–2021), filtered using the “terminated” award status. This separation is intentional and necessary, as the project explicitly requires analyzing both recent grants and historical cancellations from a different political period.

Due to export limitations of the NSF portal, the 2020–2024 dataset was downloaded in several smaller time ranges and then merged into a single file, so that the full five-year period is covered.
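The merge step can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes every partial export has the same columns and an `award_id` column that identifies each award. The helper name `merge_exports` and the sample rows are hypothetical.

```python
import io
import pandas as pd

def merge_exports(sources):
    """Concatenate partial exports and drop awards that appear more than once."""
    frames = [pd.read_csv(src) for src in sources]
    merged = pd.concat(frames, ignore_index=True)
    # Overlapping date ranges can return the same award twice; keep the first copy.
    return merged.drop_duplicates(subset="award_id").reset_index(drop=True)

# In-memory stand-ins for two overlapping downloads (hypothetical data).
part_a = io.StringIO("award_id,state,award_amount\n1,CA,100\n2,NY,200\n")
part_b = io.StringIO("award_id,state,award_amount\n2,NY,200\n3,TX,300\n")

merged = merge_exports([part_a, part_b])
print(len(merged))  # 3 unique awards
```

Dropping duplicates on the award identifier guards against double counting when two download ranges share a boundary date.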

## Data cleaning

All major data cleaning was performed in OpenRefine to keep the Python notebook focused on visualization rather than preprocessing. In OpenRefine, column names were standardized across datasets, unnecessary administrative fields were removed, and monetary values were converted to numeric format. A derived year attribute was created from the award start date to support temporal analysis. Additionally, a categorical flag (cancelled_trump) was introduced to clearly distinguish between baseline grants and Trump-era terminated grants.
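For readers without OpenRefine, the same cleaning steps could be approximated in pandas. The sketch below is only an assumption about what an equivalent would look like: the raw column names (`Award ID`, `Start Date`, `Awarded Amount`), the date format, and the helper `clean` are hypothetical. Only `award_id`, `award_amount`, `year`, and `cancelled_trump` come from the cleaned files used in this notebook.

```python
import pandas as pd

# Hypothetical raw rows; the real export's column names and formats may differ.
raw = pd.DataFrame({
    "Award ID": ["A1", "A2"],
    "Start Date": ["03/15/2021", "09/01/2023"],
    "Awarded Amount": ["$1,250,000.00", "$49,966.00"],
})

def clean(df, cancelled_trump):
    # Standardize column names: trimmed, lower case, underscores.
    out = df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
    out = out.rename(columns={"awarded_amount": "award_amount"})
    # Strip currency symbols and thousands separators before converting.
    out["award_amount"] = pd.to_numeric(
        out["award_amount"].str.replace(r"[$,]", "", regex=True), errors="coerce"
    )
    # Derive the year from the award start date.
    out["year"] = pd.to_datetime(out["start_date"], format="%m/%d/%Y").dt.year
    # Flag that separates baseline grants from Trump-era terminated grants.
    out["cancelled_trump"] = cancelled_trump
    return out

cleaned = clean(raw, cancelled_trump=False)
print(cleaned[["award_id", "award_amount", "year", "cancelled_trump"]])
```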

After cleaning, the datasets were exported as clean CSV files and loaded into the Python notebook. Only minimal preprocessing was performed in Python, consisting of type checks, column name normalization, and the creation of aggregated DataFrames for each visualization task.

In [1]:
import pandas as pd
import altair as alt

# Performance: required by the project (datasets > 5000 rows)
alt.data_transformers.enable("vegafusion")


DataTransformerRegistry.enable('vegafusion')

In [2]:
# load datasets

base_path = "."

df_grants = pd.read_csv(
    f"{base_path}/NSF_Grants_Last5Years_Clean.csv"
)

df_trump = pd.read_csv(
    f"{base_path}/trump17-21-csv.csv"
)


In [3]:
# Ensure correct dtypes
df_grants["year"] = df_grants["year"].astype(int)
df_grants["award_amount"] = pd.to_numeric(df_grants["award_amount"], errors="coerce")

df_trump["year"] = df_trump["year"].astype(int)
df_trump["award_amount"] = pd.to_numeric(df_trump["award_amount"], errors="coerce")


In [4]:
# Drop rows with critical missing values
df_grants = df_grants.dropna(subset=["state", "directorate", "year"])
df_trump = df_trump.dropna(subset=["directorate"])


In [5]:
year_selection = alt.selection_point(
    fields=["year"],
    bind=alt.binding_select(
        options=sorted(df_grants["year"].unique()),
        name="Year: "
    ),
    value=sorted(df_grants["year"].unique())[0]
)

state_selection = alt.selection_point(
    fields=["state"],
    bind=alt.binding_select(
        options=sorted(df_grants["state"].unique()),
        name="State: "
    )
)


# Q1

In [6]:
# Q1 aggregation: grants per state per year
q1_df = (
    df_grants
    .groupby(["state", "year"])
    .agg(
        grants_count=("award_id", "count"),
        total_amount=("award_amount", "sum")
    )
    .reset_index()
)

q1_df.head()


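|   | state | year | grants_count | total_amount |
|---|-------|------|--------------|--------------|
| 0 | AK | 2020 | 4 | 1561322.0 |
| 1 | AK | 2021 | 1 | 49966.0 |
| 2 | AK | 2022 | 7 | 769845.0 |
| 3 | AL | 2020 | 11 | 1927697.0 |
| 4 | AL | 2021 | 4 | 1384493.0 |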


In [7]:
state_click = alt.selection_point(fields=["state"], empty="all")

q1_bars = (
    alt.Chart(q1_df)
    .mark_bar()
    .encode(
        x=alt.X("state:N", sort="-y", title="State"),
        y=alt.Y("grants_count:Q", title="Number of grants"),
        color=alt.condition(
            state_click,
            alt.Color("grants_count:Q", scale=alt.Scale(scheme="blues"), title="Grants count"),
            alt.value("lightgray")
        ),
        tooltip=[
            alt.Tooltip("state:N", title="State"),
            alt.Tooltip("grants_count:Q", title="Grants"),
            alt.Tooltip("total_amount:Q", title="Total amount ($)", format=",.0f")
        ]
    )
    .add_params(year_selection, state_click)
    .transform_filter(year_selection)
    .properties(width=750, height=380, title="Q1 — Grants by State (select a year + click a state)")
)


In [8]:
q1_state_trend = (
    alt.Chart(q1_df)
    .mark_line(point=True)
    .encode(
        x=alt.X("year:O", title="Year"),
        y=alt.Y("grants_count:Q", title="Grants"),
        tooltip=[
            alt.Tooltip("state:N"),
            alt.Tooltip("year:O"),
            alt.Tooltip("grants_count:Q", title="Grants"),
            alt.Tooltip("total_amount:Q", title="Total amount ($)", format=",.0f"),
        ],
    )
    .transform_filter(state_click)
    .properties(width=750, height=180, title="Selected state — grants over time")
)


In [9]:
(q1_bars & q1_state_trend)


A bar chart was chosen because it supports ranking and comparison across states. The year dropdown avoids the clutter that small multiples would introduce, click-to-highlight supports drill-down into a single state, and the linked time series gives temporal context and supports exploration.

# Q2

In [10]:
# Q2 aggregation: grants per directorate per year
q2_df = (
    df_grants
    .groupby(["directorate", "year"])
    .agg(
        grants_count=("award_id", "count"),
        total_amount=("award_amount", "sum")
    )
    .reset_index()
)

q2_df.head()


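|   | directorate | year | grants_count | total_amount |
|---|-------------|------|--------------|--------------|
| 0 | AGS | 2020 | 16 | 3300718.0 |
| 1 | AGS | 2022 | 16 | 1375227.0 |
| 2 | AGS | 2023 | 6 | 871122.0 |
| 3 | AGS | 2024 | 7 | 1192193.0 |
| 4 | AST | 2020 | 9 | 2094148.0 |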


In [11]:
dir_click = alt.selection_point(fields=["directorate"], empty="all")


In [12]:
q2_overview = (
    alt.Chart(q2_df)
    .mark_bar()
    .encode(
        y=alt.Y("directorate:N", sort="-x", title="Directorate"),
        x=alt.X("grants_count:Q", title="Number of grants"),
        color=alt.condition(
            dir_click,
            alt.Color("grants_count:Q", scale=alt.Scale(scheme="blues"), title="Grants count"),
            alt.value("lightgray")
        ),
        tooltip=[
            alt.Tooltip("directorate:N", title="Directorate"),
            alt.Tooltip("year:O", title="Year"),
            alt.Tooltip("grants_count:Q", title="Grants"),
            alt.Tooltip("total_amount:Q", title="Total amount ($)", format=",.0f"),
        ],
    )
    .add_params(year_selection, dir_click)
    .transform_filter(year_selection)
    .properties(
        title="Q2 — Grants by Directorate (select a year + click a directorate)",
        width=750,
        height=420,
    )
)


In [13]:
q2_trend = (
    alt.Chart(q2_df)
    .mark_line(point=True)
    .encode(
        x=alt.X("year:O", title="Year"),
        y=alt.Y("grants_count:Q", title="Number of grants"),
        tooltip=[
            alt.Tooltip("directorate:N", title="Directorate"),
            alt.Tooltip("year:O", title="Year"),
            alt.Tooltip("grants_count:Q", title="Grants"),
            alt.Tooltip("total_amount:Q", title="Total amount ($)", format=",.0f"),
        ],
    )
    .transform_filter(dir_click)
    .properties(
        title="Selected directorate — grants over time",
        width=750,
        height=180,
    )
)


In [14]:
(q2_overview & q2_trend)


comments


# Q3

In [15]:
q3_cancel_df = (
    df_trump
    .groupby(["directorate"])
    .agg(cancelled_count=("award_id", "count"),
         cancelled_amount=("award_amount", "sum"))
    .reset_index()
)


In [16]:
q3_base_df = (
    df_grants
    .groupby(["directorate"])
    .agg(base_count=("award_id", "count"),
         base_amount=("award_amount", "sum"))
    .reset_index()
)


In [17]:
q3_df = (
    q3_base_df
    .merge(q3_cancel_df, on="directorate", how="outer")
    .fillna(0)
)

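# Compute cancellation rate vs baseline (count-based).
# A zero baseline is replaced with NaN (a float) so the column stays numeric
# and the fillna(0) below does not have to downcast an object column.
q3_df["cancel_rate"] = q3_df["cancelled_count"] / q3_df["base_count"].replace(0, float("nan"))
q3_df["cancel_rate"] = q3_df["cancel_rate"].fillna(0)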
q3_df.head()




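|   | directorate | base_count | base_amount | cancelled_count | cancelled_amount | cancel_rate |
|---|-------------|------------|-------------|-----------------|------------------|-------------|
| 0 | AGS | 45.0 | 6739260.0 | 42.0 | 7671510.0 | 0.933333 |
| 1 | AST | 20.0 | 3856468.0 | 17.0 | 2623050.0 | 0.85 |
| 2 | BCS | 152.0 | 12735737.0 | 134.0 | 7984978.0 | 0.881579 |
| 3 | BFA | 22.0 | 4317435.0 | 13.0 | 2297368.0 | 0.590909 |
| 4 | CBET | 195.0 | 29234639.0 | 158.0 | 16874815.0 | 0.810256 |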


In [18]:
dir_sel = alt.selection_point(fields=["directorate"], empty="all")


In [19]:
q3_scatter = (
    alt.Chart(q3_df)
    .mark_circle(opacity=0.8)
    .encode(
        x=alt.X("base_count:Q", title="Baseline grants (last 5 years)"),
        y=alt.Y("cancelled_count:Q", title="Cancelled grants (Trump era)"),
        size=alt.Size("cancelled_amount:Q", title="Cancelled amount ($)", legend=None),
        color=alt.Color("cancel_rate:Q", title="Cancellation rate", scale=alt.Scale(scheme="oranges")),
        tooltip=[
            alt.Tooltip("directorate:N", title="Directorate"),
            alt.Tooltip("base_count:Q", title="Baseline grants"),
            alt.Tooltip("cancelled_count:Q", title="Cancelled grants"),
            alt.Tooltip("cancel_rate:Q", title="Cancel rate", format=".2%"),
            alt.Tooltip("cancelled_amount:Q", title="Cancelled amount ($)", format=",.0f"),
        ],
    )
    .add_params(dir_sel)
    .properties(width=750, height=380, title="Q3 — Cancelled grants vs baseline distribution (by directorate)")
)


In [20]:
# The chart height depends on how many directorates have cancellations,
# so rank_height must be defined before the chart is built.
n_dirs = q3_df[q3_df["cancelled_count"] > 0]["directorate"].nunique()
rank_height = max(300, n_dirs * 18)

q3_bars = (
    alt.Chart(q3_df)
    .mark_bar()
    .encode(
        y=alt.Y(
            "directorate:N",
            sort="-x",
            title="Directorate",
            axis=alt.Axis(labelLimit=200)
        ),
        x=alt.X(
            "cancelled_count:Q",
            title="Cancelled grants"
        ),
        color=alt.condition(
            dir_sel,
            alt.value("#d95f02"),
            alt.value("lightgray")
        ),
        tooltip=[
            alt.Tooltip("directorate:N"),
            alt.Tooltip("cancelled_count:Q", title="Cancelled grants"),
            alt.Tooltip("cancel_rate:Q", title="Cancel rate", format=".2%")
        ]
    )
    .transform_filter(alt.datum.cancelled_count > 0)
    .add_params(dir_sel)
    .properties(
        width=750,
        height=rank_height,
        title="Cancelled grants ranking (click to focus)"
    )
)

In [None]:
q3_cancel_by_year = (
    df_trump
    .groupby(["directorate", "year"])
    .agg(cancelled_count=("award_id", "count"),
         cancelled_amount=("award_amount", "sum"))
    .reset_index()
)

q3_trend = (
    alt.Chart(q3_cancel_by_year)
    .mark_line(point=True)
    .encode(
        x=alt.X("year:O", title="Year (Trump era)"),
        y=alt.Y("cancelled_count:Q", title="Cancelled grants"),
        tooltip=[
            alt.Tooltip("directorate:N"),
            alt.Tooltip("year:O"),
            alt.Tooltip("cancelled_count:Q", title="Cancelled grants"),
            alt.Tooltip("cancelled_amount:Q", title="Cancelled amount ($)", format=",.0f")
        ]
    )
    .transform_filter(dir_sel)
    .properties(width=750, height=180, title="Selected directorate — cancellations over time")
)


In [None]:
n_dirs = q3_df[q3_df["cancelled_count"] > 0]["directorate"].nunique()
rank_height = max(300, n_dirs * 18)


In [None]:
(q3_scatter & q3_bars & q3_trend)


comments

# Q4

In [None]:
# Q4 aggregation: total funding and number of grants per year
q4_df = (
    df_grants
    .groupby("year")
    .agg(
        total_amount=("award_amount", "sum"),
        grants_count=("award_id", "count")
    )
    .reset_index()
)

q4_df


|   | year | total_amount | grants_count |
|---|------|--------------|--------------|
| 0 | 2020 | 270167049.0 | 1410 |
| 1 | 2021 | 43187690.0 | 223 |
| 2 | 2022 | 256978539.0 | 1536 |
| 3 | 2023 | 44698271.0 | 232 |
| 4 | 2024 | 40922659.0 | 240 |


In [None]:
year_click = alt.selection_point(fields=["year"], empty="all")


In [None]:
q4_funding_line = (
    alt.Chart(q4_df)
    .mark_line(point=True)
    .encode(
        x=alt.X("year:O", title="Year"),
        y=alt.Y(
            "total_amount:Q",
            title="Total funding amount ($)",
            axis=alt.Axis(format="~s")
        ),
        color=alt.condition(
            year_click,
            alt.value("#1f77b4"),
            alt.value("lightgray")
        ),
        tooltip=[
            alt.Tooltip("year:O", title="Year"),
            alt.Tooltip("total_amount:Q", title="Total amount ($)", format=",.0f"),
            alt.Tooltip("grants_count:Q", title="Number of grants")
        ]
    )
    .add_params(year_click)
    .properties(
        width=750,
        height=280,
        title="Q4 — Evolution of total NSF funding over the last 5 years"
    )
)


In [None]:
q4_grants_bar = (
    alt.Chart(q4_df)
    .mark_bar()
    .encode(
        x=alt.X("year:O", title="Year"),
        y=alt.Y("grants_count:Q", title="Number of grants"),
        color=alt.condition(
            year_click,
            alt.value("#ff7f0e"),
            alt.value("lightgray")
        ),
        tooltip=[
            alt.Tooltip("year:O"),
            alt.Tooltip("grants_count:Q", title="Number of grants"),
            alt.Tooltip("total_amount:Q", title="Total amount ($)", format=",.0f")
        ]
    )
    .add_params(year_click)
    .properties(
        width=750,
        height=220,
        title="Number of grants per year (click to highlight)"
    )
)


In [None]:
(q4_funding_line & q4_grants_bar)


This visualization still needs substantial rework.

# Q5

In [None]:
df_grants.columns = df_grants.columns.str.strip()
df_trump.columns = df_trump.columns.str.strip()


In [None]:
q5_grants = (
    df_grants
    .groupby(["state", "year"])
    .agg(
        grants_count=("award_id", "count"),
        total_amount=("award_amount", "sum")
    )
    .reset_index()
)


In [None]:
q5_trump = (
    df_trump
    .groupby(["state", "year"])
    .agg(
        cancelled_count=("award_id", "count"),
        cancelled_amount=("award_amount", "sum")
    )
    .reset_index()
)


In [None]:
q5_amount_line = (
    alt.Chart(q5_grants)
    .transform_filter(state_selection)
    .mark_line(point=True)
    .encode(
        x=alt.X("year:O", title="Year (last 5 years)"),
        y=alt.Y("total_amount:Q", title="Total funding ($)", axis=alt.Axis(format="~s")),
        tooltip=[
            alt.Tooltip("state:N"),
            alt.Tooltip("year:O"),
            alt.Tooltip("total_amount:Q", title="Total funding ($)", format=",.0f"),
            alt.Tooltip("grants_count:Q", title="Number of grants")
        ]
    )
    .add_params(state_selection)
    .properties(width=750, height=260, title="Q5 — Selected state: total funding over time (2020–2024)")
)


In [None]:
q5_count_bar = (
    alt.Chart(q5_grants)
    .transform_filter(state_selection)
    .mark_bar()
    .encode(
        x=alt.X("year:O", title="Year (last 5 years)"),
        y=alt.Y("grants_count:Q", title="Number of grants"),
        tooltip=[
            alt.Tooltip("state:N"),
            alt.Tooltip("year:O"),
            alt.Tooltip("grants_count:Q", title="Number of grants"),
            alt.Tooltip("total_amount:Q", title="Total funding ($)", format=",.0f"),
        ]
    )
    .add_params(state_selection)
    .properties(width=750, height=200, title="Selected state: number of grants per year (2020–2024)")
)


In [None]:
q5_cancelled = (
    alt.Chart(q5_trump)
    .transform_filter(state_selection)
    .mark_bar()
    .encode(
        x=alt.X("year:O", title="Year (Trump era)"),
        y=alt.Y("cancelled_count:Q", title="Cancelled grants"),
        tooltip=[
            alt.Tooltip("state:N"),
            alt.Tooltip("year:O"),
            alt.Tooltip("cancelled_count:Q", title="Cancelled grants"),
            alt.Tooltip("cancelled_amount:Q", title="Cancelled amount ($)", format=",.0f")
        ]
    )
    .add_params(state_selection)
    .properties(width=750, height=200, title="Trump era (2017–2021): cancelled grants for selected state")
)


In [None]:
(q5_amount_line & q5_count_bar & q5_cancelled)


comments

# Q6

For Question 6, state population was selected as an additional attribute not previously used in the analysis. Population is a meaningful contextual variable that enables deeper exploration beyond absolute grant counts or total funding amounts. By relating funding to population size, users can investigate whether certain states receive disproportionately high or low levels of funding relative to their population, revealing patterns that are not visible through raw totals alone.

Population supports per capita comparisons, outlier detection, and interactive investigation of funding relative to population size across states and years. It also integrates naturally with the state-based aggregations used in earlier questions.
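To make the per capita comparison and the outlier idea concrete, here is a small sketch with made-up figures. The state codes and amounts are invented. The 1.5 × IQR rule is just one common way to flag outliers, chosen here for illustration; it is not part of the notebook's analysis.

```python
import pandas as pd

# Made-up figures for illustration only.
demo = pd.DataFrame({
    "state": ["AA", "BB", "CC", "DD", "EE"],
    "total_amount": [1_000_000, 2_000_000, 1_500_000, 9_000_000, 1_200_000],
    "population": [1_000_000, 2_000_000, 1_000_000, 1_000_000, 1_000_000],
})

# Same ratio the notebook computes later: dollars of funding per resident.
demo["funding_per_capita"] = demo["total_amount"] / demo["population"]

# Flag values outside 1.5 * IQR of the middle half of the distribution.
q1, q3 = demo["funding_per_capita"].quantile([0.25, 0.75])
iqr = q3 - q1
demo["is_outlier"] = ~demo["funding_per_capita"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

outliers = demo.loc[demo["is_outlier"], "state"].tolist()
print(outliers)  # ['DD']
```

In this toy data, `BB` has the second-largest total but the same per capita value as `AA`, while `DD` stands out only once population is taken into account.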

In [54]:
import pandas as pd
import altair as alt

df_pop_raw = pd.read_csv("estimated_population.csv")
df_abbr_raw = pd.read_csv("state_abbreviations.csv")

# Clean column names
df_pop_raw.columns = df_pop_raw.columns.str.strip()
df_abbr_raw.columns = df_abbr_raw.columns.str.strip()

In [55]:
# Ensure we have a 'state' column (full names like Alabama, Alaska, ...)
if "state" not in df_pop_raw.columns:
    raise ValueError(f"estimated_population.csv must have a 'state' column. Found: {list(df_pop_raw.columns)}")

pop_cols = [c for c in df_pop_raw.columns if c.lower().startswith("pop_")]
if not pop_cols:
    raise ValueError(f"Could not find pop_YYYY columns. Found: {list(df_pop_raw.columns)}")

df_pop_long = df_pop_raw.melt(
    id_vars=["state"],
    value_vars=pop_cols,
    var_name="year",
    value_name="population"
)

# Convert year from 'pop_2020' -> 2020
df_pop_long["year"] = df_pop_long["year"].str.replace("pop_", "", regex=False).astype(int)

# Convert population to numeric
df_pop_long["population"] = pd.to_numeric(df_pop_long["population"], errors="coerce")

# Keep only 2020-2024 (safety)
df_pop_long = df_pop_long[df_pop_long["year"].between(2020, 2024)]

# Standardize state name
df_pop_long = df_pop_long.rename(columns={"state": "state_name"})
df_pop_long["state_name"] = df_pop_long["state_name"].astype(str).str.strip()

In [56]:
df_abbr = df_abbr_raw.copy()

# Detect likely columns for state name and abbreviation
name_candidates = [c for c in df_abbr.columns if "name" in c.lower() or ("state" in c.lower() and "abbr" not in c.lower())]
abbr_candidates = [c for c in df_abbr.columns if "abbr" in c.lower() or "code" in c.lower()]

if not name_candidates or not abbr_candidates:
    raise ValueError(
        "state_abbreviations.csv must contain columns for full state name and abbreviation.\n"
        f"Columns found: {list(df_abbr.columns)}"
    )

name_col = name_candidates[0]
abbr_col = abbr_candidates[0]

df_abbr = df_abbr.rename(columns={name_col: "state_name", abbr_col: "state"})
df_abbr["state_name"] = df_abbr["state_name"].astype(str).str.strip()
df_abbr["state"] = df_abbr["state"].astype(str).str.strip()

# Normalize case (helps joins)
df_abbr["state_name_key"] = df_abbr["state_name"].str.lower()
df_pop_long["state_name_key"] = df_pop_long["state_name"].str.lower()

# Join to add 2-letter codes
df_pop_long = df_pop_long.merge(
    df_abbr[["state_name_key", "state"]],
    on="state_name_key",
    how="left"
)

# Debug unmapped names
unmapped = df_pop_long[df_pop_long["state"].isna()]["state_name"].dropna().unique()
print("Unmapped population state names (should be empty):", unmapped[:20], " ... total:", len(unmapped))

# Keep only mapped rows + required cols
df_pop_long = df_pop_long.dropna(subset=["state", "population"])
df_pop_long = df_pop_long[["state", "year", "population"]].copy()

print("Population long shape:", df_pop_long.shape)
print("Population states:", df_pop_long["state"].nunique(), "Years:", sorted(df_pop_long["year"].unique()))

Unmapped population state names (should be empty): []  ... total: 0
Population long shape: (255, 3)
Population states: 51 Years: [2020, 2021, 2022, 2023, 2024]


In [57]:
# sanity: ensure expected columns exist
required_cols = {"state", "year", "award_amount", "award_id"}
missing = required_cols - set(df_grants.columns)
if missing:
    raise ValueError(f"df_grants missing required columns: {missing}. Found: {list(df_grants.columns)}")

# Ensure year numeric
df_grants["year"] = pd.to_numeric(df_grants["year"], errors="coerce").astype("Int64")

q6_grants = (
    df_grants
    .dropna(subset=["state", "year", "award_amount"])
    .groupby(["state", "year"])
    .agg(
        total_amount=("award_amount", "sum"),
        grants_count=("award_id", "count")
    )
    .reset_index()
)

print("NSF aggregated shape:", q6_grants.shape)
print("NSF states:", q6_grants["state"].nunique(), "Years:", sorted(q6_grants["year"].unique()))


NSF aggregated shape: (231, 4)
NSF states: 52 Years: [2020, 2021, 2022, 2023, 2024]


In [58]:
q6_df = q6_grants.merge(df_pop_long, on=["state", "year"], how="inner")
q6_df["funding_per_capita"] = q6_df["total_amount"] / q6_df["population"]

print("Merged q6_df shape:", q6_df.shape)
print("Merged states:", q6_df["state"].nunique(), "Years:", sorted(q6_df["year"].unique()))

# If still empty, show mismatch hints
if q6_df.empty:
    print("\nq6_df is EMPTY. Debug hints:")
    print("Sample NSF states:", sorted(q6_grants["state"].unique())[:15])
    print("Sample POP states:", sorted(df_pop_long["state"].unique())[:15])
    print("Sample NSF years:", sorted(q6_grants["year"].unique()))
    print("Sample POP years:", sorted(df_pop_long["year"].unique()))
    raise ValueError("Merge produced empty q6_df. See debug hints above.")

Merged q6_df shape: (222, 6)
Merged states: 50 Years: [2020, 2021, 2022, 2023, 2024]
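The printed counts show that the two tables do not share exactly the same state codes: 52 codes in the grants aggregation, 51 in the population table, and 50 after the inner join. One way to see which codes drop out is an outer merge with `indicator=True`. The sketch below uses small made-up frames, and the state codes in them are illustrative rather than taken from the real data. In the notebook the same call could be applied to `q6_grants` and `df_pop_long`.

```python
import pandas as pd

# Made-up stand-ins for the aggregated grants and the population table.
grants = pd.DataFrame({
    "state": ["CA", "PR", "GU"],
    "year": [2024, 2024, 2024],
    "total_amount": [10.0, 2.0, 1.0],
})
pop = pd.DataFrame({
    "state": ["CA", "PR", "DC"],
    "year": [2024, 2024, 2024],
    "population": [100, 10, 5],
})

# The _merge column records whether each key was found on the left, right, or both.
check = grants.merge(pop, on=["state", "year"], how="outer", indicator=True)
only_grants = sorted(check.loc[check["_merge"] == "left_only", "state"].unique())
only_pop = sorted(check.loc[check["_merge"] == "right_only", "state"].unique())

print("Grants without population data:", only_grants)
print("Population rows without grants:", only_pop)
```

Listing the unmatched keys makes it an explicit decision whether to drop those rows, as the inner join does silently, or to extend the population table.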


In [59]:
year_selection = alt.selection_point(
    fields=["year"],
    bind=alt.binding_select(options=sorted(q6_df["year"].unique()), name="Year: "),
    value=max(q6_df["year"].unique())
)

state_click = alt.selection_point(fields=["state"], empty="all")

q6_overview = (
    alt.Chart(q6_df)
    .mark_bar()
    .encode(
        x=alt.X("state:N", sort="-y", title="State"),
        y=alt.Y("funding_per_capita:Q", title="Funding per capita ($)", axis=alt.Axis(format="~s")),
        color=alt.condition(
            state_click,
            alt.Color("funding_per_capita:Q", scale=alt.Scale(scheme="purples"), title="Funding per capita"),
            alt.value("lightgray")
        ),
        tooltip=[
            alt.Tooltip("state:N", title="State"),
            alt.Tooltip("year:O", title="Year"),
            alt.Tooltip("population:Q", title="Population", format=",.0f"),
            alt.Tooltip("total_amount:Q", title="Total funding ($)", format=",.0f"),
            alt.Tooltip("funding_per_capita:Q", title="Funding per capita ($)", format=",.2f"),
            alt.Tooltip("grants_count:Q", title="Grants count"),
        ]
    )
    .add_params(year_selection, state_click)
    .transform_filter(year_selection)
    .properties(width=750, height=380, title="Q6 — NSF funding per capita by state (select year + click a state)")
)

In [60]:

q6_trend = (
    alt.Chart(q6_df)
    .mark_line(point=True)
    .encode(
        x=alt.X("year:O", title="Year"),
        y=alt.Y("funding_per_capita:Q", title="Funding per capita ($)", axis=alt.Axis(format="~s")),
        tooltip=[
            alt.Tooltip("state:N", title="State"),
            alt.Tooltip("year:O", title="Year"),
            alt.Tooltip("funding_per_capita:Q", title="Funding per capita ($)", format=",.2f"),
            alt.Tooltip("total_amount:Q", title="Total funding ($)", format=",.0f"),
            alt.Tooltip("population:Q", title="Population", format=",.0f"),
            alt.Tooltip("grants_count:Q", title="Grants count"),
        ]
    )
    .transform_filter(state_click)
    .properties(width=750, height=200, title="Selected state — funding per capita over time (2020–2024)")
)

(q6_overview & q6_trend)


comment

# Final Visualization