# Federal Budget & Policy Analysis — Exploratory Notebook

This notebook is an interactive starting point for exploring the fiscal, trade, and welfare series collected by the pipeline.

**Prerequisites:** Run the pipeline first to populate the database:
```bash
python run_pipeline.py --init-db
python run_pipeline.py --collect
```

In [None]:
import sys
from pathlib import Path

# Add project root to path
PROJECT_ROOT = Path.cwd().parent if Path.cwd().name == 'notebooks' else Path.cwd()
sys.path.insert(0, str(PROJECT_ROOT))

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from src.utils.config import load_config
from src.analysis.policy_impact import (
    load_series, load_multiple_series,
    interrupted_time_series, test_stationarity,
    percent_change_around_event, compute_real_values
)
from src.visualization.charts import (
    plot_time_series, plot_its_results, plot_policy_comparison
)

config = load_config()
policy_periods = config['analysis']['policy_periods']

%matplotlib inline
plt.style.use('seaborn-v0_8-whitegrid')
print('Setup complete.')

## 1. Federal Deficit & Debt

In [None]:
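# Load deficit and debt as % of GDP (FYFSGDA188S is negative in deficit years)
fiscal = load_multiple_series(['FYFSGDA188S', 'GFDEGDQ188S'])

if not fiscal.empty:
    # Rename inside the guard: relabeling an empty, column-less frame raises a ValueError
    fiscal.columns = ['Deficit (% GDP)', 'Debt (% GDP)']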
    fig = plot_time_series(
        fiscal,
        title='Federal Deficit & Debt as % of GDP',
        ylabel='% of GDP',
        policy_dates={
            'Reagan ERTA': '1981-08-13',
            'Clinton OBRA': '1993-08-10',
            'Bush EGTRRA': '2001-06-07',
            'TCJA': '2017-12-22',
        },
    )
    plt.show()
else:
    print('No data available. Run: python run_pipeline.py --collect-fred')
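
The two series are published at different frequencies on FRED: FYFSGDA188S is annual and GFDEGDQ188S is quarterly. Whether `load_multiple_series` already aligns them is not shown in this notebook, so the following is only a minimal sketch of one way to put both on an annual basis, assuming each series arrives with a `DatetimeIndex`. The helper `to_annual_mean` and all values are made up for illustration.

```python
import pandas as pd

# Synthetic stand-ins for the two series (values are invented, not FRED data).
deficit_annual = pd.Series(
    [-3.0, -4.0],
    index=pd.to_datetime(['2000-01-01', '2001-01-01']),
    name='Deficit (% GDP)',
)
debt_quarterly = pd.Series(
    [50.0, 52.0, 54.0, 56.0, 60.0, 60.0, 62.0, 62.0],
    index=pd.date_range('2000-01-01', periods=8, freq='QS'),
    name='Debt (% GDP)',
)


def to_annual_mean(series):
    """Collapse a series to one value per calendar year by averaging."""
    annual = series.groupby(series.index.year).mean()
    annual.index.name = 'year'
    return annual


# Join on the shared year index and drop years missing from either series.
aligned = pd.concat(
    [to_annual_mean(deficit_annual), to_annual_mean(debt_quarterly)],
    axis=1,
).dropna()
print(aligned)
```

Averaging the quarterly ratio is one choice among several; taking the last quarter of each year is another. Note also that FYFSGDA188S follows the federal fiscal year, so a calendar-year join is only approximate.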

## 2. Revenue vs. Expenditure

In [None]:
rev_exp = load_multiple_series(['FGRECPT', 'FGEXPND'])

if not rev_exp.empty:
    # Rename inside the guard: relabeling an empty, column-less frame raises a ValueError
    rev_exp.columns = ['Receipts', 'Expenditures']
    fig = plot_time_series(
        rev_exp,
        title='Federal Receipts vs. Expenditures',
        ylabel='Billions of Dollars',
    )
    plt.show()

## 3. Trade & Tariff Impact

In [None]:
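# Note: on FRED, IMPGS and EXPGS are quarterly in billions of dollars, while
# BOPGSTB is monthly in millions; rescale if the pipeline stores raw values.
trade = load_multiple_series(['IMPGS', 'EXPGS', 'BOPGSTB'])

if not trade.empty:
    # Rename inside the guard: relabeling an empty, column-less frame raises a ValueError
    trade.columns = ['Imports', 'Exports', 'Trade Balance']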
    fig = plot_time_series(
        trade,
        title='U.S. Trade: Imports, Exports & Balance',
        ylabel='Billions of Dollars',
        policy_dates={
            'Sec 232 Tariffs': '2018-03-23',
            'Sec 301 Tariffs': '2018-07-06',
        },
    )
    plt.show()
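
A simple before/after comparison can complement the chart. The notebook imports `percent_change_around_event`, but its signature is not shown here, so the sketch below uses a hypothetical stand-in, `pct_change_around`, to illustrate the idea: compare the mean of a few observations before an event date with the mean of the same number of observations from the event date onward. The data is synthetic.

```python
import pandas as pd


def pct_change_around(series, event_date, window=4):
    """Percent change between the mean of the `window` observations before
    the event and the mean of the `window` observations from the event on."""
    event = pd.Timestamp(event_date)
    before = series[series.index < event].tail(window)
    after = series[series.index >= event].head(window)
    if before.empty or after.empty:
        raise ValueError('event date leaves one side of the window empty')
    return (after.mean() - before.mean()) / abs(before.mean()) * 100.0


# Synthetic monthly data, invented for illustration only.
idx = pd.date_range('2018-01-01', periods=8, freq='MS')
imports = pd.Series(
    [100.0, 100.0, 100.0, 100.0, 110.0, 110.0, 110.0, 110.0], index=idx
)
change = pct_change_around(imports, '2018-05-01', window=4)
print(f'{change:.1f}%')
```

A raw before/after difference like this does not separate the policy from trends or seasonality already under way, which is why Section 5 uses a regression-based approach.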

## 4. Welfare Indicators

In [None]:
# Real disposable personal income (DSPIC96) and real median household income (MEHOINUSA672N)
welfare = load_multiple_series(['DSPIC96', 'MEHOINUSA672N'])

if not welfare.empty:
    for col in welfare.columns:
        fig, ax = plt.subplots(figsize=(12, 5))
        ax.plot(welfare.index, welfare[col])
        ax.set_title(col)
        ax.grid(True, alpha=0.3)
        plt.show()

## 5. Interrupted Time Series — TCJA Impact on Deficit

In [None]:
deficit = load_series('FYFSGDA188S')

if not deficit.empty:
    its = interrupted_time_series(deficit, '2017-12-22')
    print('Intervention effect:', its['intervention_effect'])
    print('Trend change:', its['trend_change'])
    print('R²:', its['r_squared'])
    print()
    print('P-values:', its['pvalues'])

    fig = plot_its_results(
        its,
        title='Interrupted Time Series: TCJA Impact on Deficit (% GDP)',
        ylabel='% of GDP',
    )
    plt.show()
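
For readers unfamiliar with the method, an interrupted time series is commonly estimated as a segmented regression: a baseline level and trend, plus a level shift and a slope change after the intervention. The sketch below fits that model with NumPy on noise-free synthetic data. It is not the project's `interrupted_time_series`, whose internals are not shown here; the function name `segmented_regression`, the returned keys, and the mapping of `intervention_effect` to the level shift are assumptions.

```python
import numpy as np


def segmented_regression(y, break_index):
    """Fit y = b0 + b1*t + b2*post + b3*(t - break_index)*post by OLS.

    b2 is the immediate level change at the break; b3 is the change in slope.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= break_index).astype(float)
    X = np.column_stack([np.ones_like(t), t, post, (t - break_index) * post])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    ss_res = float(np.sum((y - fitted) ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return {
        'baseline_level': float(coef[0]),
        'baseline_trend': float(coef[1]),
        'level_change': float(coef[2]),
        'trend_change': float(coef[3]),
        'r_squared': 1.0 - ss_res / ss_tot,
    }


# Synthetic series: slope 1 before the break, then a jump of 5 and slope 3.
t = np.arange(20)
y = np.where(t < 10, 2.0 + 1.0 * t, 2.0 + 1.0 * t + 5.0 + 2.0 * (t - 10))
result = segmented_regression(y, break_index=10)
print(result)
```

Plain OLS ignores autocorrelation, so p-values from a fit like this would tend to be optimistic on real data. With an annual series, the post-2017 segment also contains only a handful of observations, so the estimates above should be read as exploratory.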

## 6. Policy Period Comparison

In [None]:
deficit = load_series('FYFSGDA188S')

if not deficit.empty:
    fig = plot_policy_comparison(
        deficit,
        policy_periods,
        ylabel='Deficit as % of GDP',
        title='Average Federal Deficit by Policy Era',
    )
    plt.show()
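
The comparison chart presumably rests on a per-era average. The structure of `policy_periods` in the config is not shown in this notebook, so the sketch below assumes a hypothetical mapping of era name to `(start, end)` date strings and computes the mean of a series within each window. The helper name `era_means`, the era labels, and the values are all illustrative.

```python
import pandas as pd

# Hypothetical structure: era name -> (start date, end date), both inclusive.
periods = {
    'Era A': ('2000-01-01', '2001-12-31'),
    'Era B': ('2002-01-01', '2003-12-31'),
}

# Synthetic annual series, invented for illustration only.
series = pd.Series(
    [-1.0, -3.0, -4.0, -6.0],
    index=pd.to_datetime(
        ['2000-01-01', '2001-01-01', '2002-01-01', '2003-01-01']
    ),
)


def era_means(series, periods):
    """Return the mean of `series` within each named date window."""
    rows = {}
    for name, (start, end) in periods.items():
        # Label-based slicing on a sorted DatetimeIndex includes both ends.
        rows[name] = series.loc[start:end].mean()
    return pd.Series(rows, name='mean')


means = era_means(series, periods)
print(means)
```

Era averages attribute each year entirely to one period, so budgets enacted under one policy regime but executed under the next will blur the comparison.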

## Next Steps

- [ ] Add CPI-adjusted real-value analysis
- [ ] Run Chow tests for structural breaks at each policy date
- [ ] Build tariff pass-through estimation model
- [ ] Add distributional analysis by income quintile (using CBO data)
- [ ] Difference-in-differences where state-level variation is available
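
As a starting point for the first item, nominal dollar series can be converted to constant dollars by scaling each observation by the ratio of the base-period price index to that period's index. The notebook imports `compute_real_values`, but its signature is not shown here, so the sketch below uses a hypothetical helper, `deflate`, with made-up index values rather than actual CPI data.

```python
import pandas as pd


def deflate(nominal, price_index, base_period):
    """Convert nominal dollars to base-period dollars using a price index."""
    # Keep only dates present in both series so the division lines up.
    nominal, price_index = nominal.align(price_index, join='inner')
    base = price_index.loc[pd.Timestamp(base_period)]
    return nominal * (base / price_index)


idx = pd.to_datetime(['2000-01-01', '2010-01-01', '2020-01-01'])
nominal = pd.Series([100.0, 100.0, 100.0], index=idx)
cpi = pd.Series([50.0, 80.0, 100.0], index=idx)  # invented index values

real = deflate(nominal, cpi, '2020-01-01')
print(real)
```

Series that are already expressed as a share of GDP, such as FYFSGDA188S, do not need this adjustment; it applies to the dollar-denominated series in Sections 2 and 3.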