# 06_results_rollup_and_pr_2024.ipynb

## Part A — What we are doing

We create a **results dashboard** and a concise documentation file capturing:
- Statewide rebate totals (no-phase & with-phase)
- MTR summary and wage-share-in-band diagnostics
- Distribution preview (means, changes, shares)
- Off-model VAT/sales-tax input preview

We **verify** that all expected CSVs are present, print their headline numbers for a consistency spot check, then write `docs/README_VAT.md`. Optionally, we create a branch and commit; in either case we print a **PR title + body** for quick review.

---

## Part B — Inputs & expected files

We expect the following to exist:
- `outputs/vat/rebate_cost_2024.csv`
- `outputs/vat/rebate_cost_by_decile_2024.csv`
- `outputs/vat/rebate_cost_by_status_2024.csv`
- `outputs/vat/mtr_summary_2024.csv`
- `outputs/vat/wage_phaseout_shares_2024.csv`
- `outputs/vat/distribution_2024.csv`
- `outputs/vat/sales_tax_inputs_2024.csv`

If any file is missing, the notebook prints the missing output keys and a reminder to re-run the earlier steps; the printed summaries that depend on a missing file are skipped.

---

## Part C — Documentation & optional PR helper

- **Writes:** `docs/README_VAT.md` with:
  - Project overview
  - Method summary (allowance schedules, phase-out formulas)
  - Links to all outputs
  - Notes on limitations/assumptions

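- **Optional** (if `DO_GIT_ACTIONS = True`):  
  - Create a feature branch, commit the README and outputs, and print the follow-up `git push` / `gh pr create` commands.

- **Always:** print a suggested PR title/body you can paste on GitHub.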
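
The method summary that goes into the README (allowance schedule plus linear phase-out) can be illustrated with a minimal sketch. The constants are copied from the README text this notebook writes; the function name `vat_rebate` and the dictionary layout are illustrative assumptions, not the producing notebooks' actual code. Note that the wage-share diagnostic labels the MFJ band as 150–200k, while the README constants give a 100,000-wide band, so confirm the band width against the producing notebook before relying on this.

```python
# Constants copied from the README text written by this notebook.
ALLOWANCE = {
    "Single": {1: 14_580, 2: 19_720, 3: 24_860, 4: 30_000, 5: 35_140, 6: 40_280, 7: 45_420},
    "MFJ": {2: 29_160, 3: 34_300, 4: 39_440, 5: 44_580, 6: 49_720, 7: 54_860},
}
THRESHOLD = {"Single": 75_000, "MFJ": 150_000}
BAND = {"Single": 50_000, "MFJ": 100_000}


def vat_rebate(status: str, household_size: int, agi: float) -> float:
    """Rebate = A * max(0, 1 - max(0, AGI - threshold) / band)."""
    n = min(household_size, 7)            # household size is capped at 7
    base = ALLOWANCE[status][n]           # illustrative lookup; KeyError for MFJ with n=1
    excess = max(0.0, agi - THRESHOLD[status])
    scale = max(0.0, 1.0 - excess / BAND[status])
    return base * scale


print(vat_rebate("Single", 1, 60_000))   # below the threshold: 14580.0
print(vat_rebate("Single", 1, 100_000))  # halfway through the band: 7290.0
print(vat_rebate("MFJ", 4, 300_000))     # past the end of the band: 0.0
```

Because the scale is clipped at zero, the with-phase rebate can never exceed the base allowance, which is why the with-phase total should not exceed the no-phase total.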

---

## Part D — Acceptance checks & troubleshooting

**Acceptance checks**
- All expected files exist.
- Spot checks align with earlier totals (e.g., with-phase ≤ no-phase; decile sums ≈ totals; shares ≈ 100%).
- README contains working relative paths to outputs.
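
The spot checks above could be automated along these lines. This is a sketch under assumptions: `no_phaseout_total`, `phaseout_total`, `group`, and `pop_share` appear in this notebook's code and output, but the decile CSV's schema is not shown here, so its column name is passed in as a parameter, and the helper names `spot_check` and `decile_share_total` are hypothetical.

```python
import math

import pandas as pd


def spot_check(totals, by_decile, decile_col, rel_tol=1e-6):
    """Return a list of failure messages; an empty list means the checks passed."""
    problems = []
    no_phase = float(totals["no_phaseout_total"])
    phase = float(totals["phaseout_total"])
    if phase > no_phase:
        problems.append("with-phase total exceeds no-phase total")
    decile_sum = float(by_decile[decile_col].sum())
    if not math.isclose(decile_sum, phase, rel_tol=rel_tol):
        problems.append(f"decile sum {decile_sum:,.0f} differs from total {phase:,.0f}")
    return problems


def decile_share_total(dist, share_col="pop_share"):
    """Sum a percent-share column over decile rows only (top-group rows overlap them)."""
    deciles = dist[dist["group"].str.startswith("decile_")]
    return float(deciles[share_col].sum())


# Toy numbers for illustration only; these are not model results.
totals = pd.Series({"no_phaseout_total": 120.0, "phaseout_total": 100.0})
by_decile = pd.DataFrame({"rebate_phaseout": [40.0, 35.0, 25.0]})  # hypothetical column name
dist = pd.DataFrame(
    {"group": ["decile_1", "decile_2", "top_1"], "pop_share": [60.0, 40.0, 1.0]}  # "top_1" is a made-up label
)

print(spot_check(totals, by_decile, "rebate_phaseout"))  # [] when everything is consistent
print(decile_share_total(dist))                          # 100.0
```

Returning a list of messages rather than raising on the first failure lets the roll-up report every inconsistency in one pass.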

**Troubleshooting**
- **Path errors** on Windows: use absolute paths, e.g.  
  `C:\Users\Ali.Melad\Dropbox\Ali Work\Kyle\California VAT\policy_engile_cali_v2\outputs\vat\...`
- If anything is inconsistent, re-run the producing notebook (02–05) and then re-run this roll-up.
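
As an alternative to typing an absolute path by hand, the outputs folder can be located by walking upward from the current directory. This is only a sketch: `find_outputs_dir` is a hypothetical helper, and it assumes the `outputs/vat` layout used elsewhere in this notebook.

```python
import tempfile
from pathlib import Path


def find_outputs_dir(start: Path) -> Path:
    """Walk upward from `start` until a folder containing outputs/vat is found."""
    start = start.resolve()
    for candidate in [start, *start.parents]:
        target = candidate / "outputs" / "vat"
        if target.is_dir():
            return target  # absolute path, safe to pass to pandas on any OS
    raise FileNotFoundError(f"No outputs/vat folder found at or above {start}")


# Demonstration in a throwaway directory that mimics the repo layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "outputs" / "vat").mkdir(parents=True)
    (root / "notebooks").mkdir()
    found = find_outputs_dir(root / "notebooks")
    print(found.is_absolute(), found.name)
```

In the notebook itself, `find_outputs_dir(Path.cwd())` would stand in for the relative `Path("..") / "outputs" / "vat"`.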

---

## Part E — How to rerun end-to-end

1. Run `00_repo_audit_and_config_2024.ipynb` (writes `config/columns.yaml`).  
2. Run `01_data_prep_ca_2024.ipynb` (writes the panel).  
3. Run `02`–`05` to generate all outputs.  
4. Run `06` to verify outputs and write `docs/README_VAT.md`.
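
The sequence above could also be scripted. The sketch below assumes the notebooks share one folder and sort correctly by their two-digit prefix; only the `00`, `01`, and `06` filenames are known from this document, so the rest are discovered by pattern. `rerun_commands` and `rerun_all` are hypothetical helpers built on `jupyter nbconvert --execute`.

```python
import subprocess
import tempfile
from pathlib import Path


def rerun_commands(notebook_dir: Path) -> list[list[str]]:
    """One nbconvert command per numbered notebook, in filename order (00, 01, ...)."""
    notebooks = sorted(notebook_dir.glob("[0-9][0-9]_*.ipynb"))
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", str(nb)]
        for nb in notebooks
    ]


def rerun_all(notebook_dir: Path) -> None:
    """Execute each notebook in order, stopping at the first failure."""
    for cmd in rerun_commands(notebook_dir):
        subprocess.run(cmd, check=True)


# Dry run against empty placeholder files: show the order without executing anything.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    for name in ["06_results_rollup_and_pr_2024.ipynb",
                 "00_repo_audit_and_config_2024.ipynb",
                 "01_data_prep_ca_2024.ipynb"]:
        (demo / name).touch()
    for cmd in rerun_commands(demo):
        print(Path(cmd[-1]).name)
```

Keeping command construction separate from execution makes it easy to inspect the order before committing to a full run.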


In [1]:
# 06 — Results rollup & PR prep (2024 only)
import subprocess
import textwrap
from pathlib import Path

import pandas as pd

# --------- SETTINGS ---------
DO_GIT_ACTIONS = False  # set True to auto-create branch + commit
BRANCH_NAME = "feature/ca-vat-rebate-2024"
# ----------------------------

base = Path("..")
outputs = base / "outputs" / "vat"
docsdir = base / "docs"
docsdir.mkdir(parents=True, exist_ok=True)

expected_files = {
    "rebate_cost":                outputs / "rebate_cost_2024.csv",
    "rebate_by_decile":           outputs / "rebate_cost_by_decile_2024.csv",
    "rebate_by_status":           outputs / "rebate_cost_by_status_2024.csv",
    "mtr_summary":                outputs / "mtr_summary_2024.csv",
    "wage_phaseout_shares":       outputs / "wage_phaseout_shares_2024.csv",
    "distribution":               outputs / "distribution_2024.csv",
    "sales_tax_inputs":           outputs / "sales_tax_inputs_2024.csv",
}

# 1) Check files exist
missing = [k for k,p in expected_files.items() if not p.exists()]
if missing:
    print("⚠️ Missing outputs:", missing)
    print("Run Steps 01–05 first.")
else:
    print("✅ All expected output files found.")

# 2) Load and summarize
def read_csv(path: Path):
    try:
        return pd.read_csv(path)
    except Exception as e:
        # Chain the original exception so the full traceback is preserved
        raise RuntimeError(f"Failed reading {path}: {e}") from e

tables = {k: read_csv(p) for k,p in expected_files.items() if p.exists()}

# Pretty print helpers
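def money(x):
    # Format as whole dollars; fall back to the raw value if it is not numeric
    try:
        return f"${float(x):,.0f}"
    except (TypeError, ValueError):
        return str(x)

def pct(x):
    # Format a 0–1 fraction as a percentage; fall back to the raw value if not numeric
    try:
        return f"{100*float(x):.2f}%"
    except (TypeError, ValueError):
        return str(x)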

# 2a) Rebate totals
if "rebate_cost" in tables:
    rc = tables["rebate_cost"].iloc[0]
    no_phase = rc["no_phaseout_total"]
    phase    = rc["phaseout_total"]
    print("\n--- Rebate totals (2024) ---")
    print("No phase-out total:", money(no_phase))
    print("Phase-out total:   ", money(phase))
    print("Reduction from phase-out:", money(no_phase - phase))

# 2b) MTR summary
if "mtr_summary" in tables:
    mtr = tables["mtr_summary"].iloc[0]
    print("\n--- Rebate-only MTR (2024) ---")
    print("Population-weighted:", f"{mtr['population_weighted_MTR']:.4f}")
    print("Earnings-weighted:  ", f"{mtr['earnings_weighted_MTR']:.4f}")

# 2c) Wage shares in phase-out bands
if "wage_phaseout_shares" in tables:
    ws = tables["wage_phaseout_shares"].iloc[0]
    print("\n--- Wage shares in phase-out bands (2024) ---")
    print("Singles 75–125k:", pct(ws["share_wages_single_75k_125k"]))
    print("MFJ 150–200k:  ", pct(ws["share_wages_mfj_150k_200k"]))

# 2d) Distribution table preview
if "distribution" in tables:
    dist = tables["distribution"]
    print("\n--- Distribution preview (first 12 rows) ---")
    print(dist.head(12).to_string(index=False))

# 2e) Off-model inputs preview
if "sales_tax_inputs" in tables:
    sti = tables["sales_tax_inputs"]
    print("\n--- Sales-tax off-model inputs preview ---")
    print(sti.head().to_string(index=False))

# 3) Write docs/README_VAT.md
readme = textwrap.dedent(f"""\
# California VAT Rebate — 2024 Results (Household-level)

**Scope:** California households, calendar year 2024. We build a household panel from PolicyEngine’s household entity, exclude **negative AGI** households, and group by **equivalized income** (AGI / household size). *Married Households* means **MFJ-only**, inferred from household spouse presence and Head-of-Household flags; **Single Households** covers all others (Single, MFS, HOH, Widow/er).

## Policy Scenario

- **Replace**: Individual income tax, corporate income tax, general & selective sales taxes, estate & gift tax (modeled here via income-tax removal; sales-tax dynamics handled off-model).
- **Add**: Per-household VAT rebate with phase-out by AGI.

### Consumption-allowance (poverty guideline) — constants used
Singles: 1→14,580; 2→19,720; 3→24,860; 4→30,000; 5→35,140; 6→40,280; 7+→45,420  
Married (MFJ): 2→29,160; 3→34,300; 4→39,440; 5→44,580; 6→49,720; 7+→54,860

Phase-out thresholds: Single = 75,000; MFJ = 150,000  
Phase-out bands (width): Single = 50,000; MFJ = 100,000

### Rebate formula
For household size *n* (capped at 7) and status *S ∈ {{Single, MFJ}}*:
- **Base allowance A** = table value for (S, n) above.
- **excess** = max(0, AGI − threshold_S)
- **scale** = max(0, 1 − excess / band_S)
- **Rebate** = A × scale

## Key 2024 Results

- **Rebate totals**:  
  No phase-out: **{money(no_phase) if 'rebate_cost' in tables else 'n/a'}**  
  With phase-out: **{money(phase) if 'rebate_cost' in tables else 'n/a'}**  
  Reduction from phase-out: **{money(no_phase - phase) if 'rebate_cost' in tables else 'n/a'}**

- **Rebate-only MTR (+$1 wages experiment)**:  
  Population-weighted: **{format(mtr['population_weighted_MTR'], '.4f') if 'mtr_summary' in tables else 'n/a'}**  
  Earnings-weighted: **{format(mtr['earnings_weighted_MTR'], '.4f') if 'mtr_summary' in tables else 'n/a'}**

- **Share of wages in phase-out bands**:  
  Singles 75–125k: **{pct(ws['share_wages_single_75k_125k']) if 'wage_phaseout_shares' in tables else 'n/a'}**  
  MFJ 150–200k: **{pct(ws['share_wages_mfj_150k_200k']) if 'wage_phaseout_shares' in tables else 'n/a'}**

## Files

- Rebate totals: `outputs/vat/rebate_cost_2024.csv`  
- Rebate totals by decile: `outputs/vat/rebate_cost_by_decile_2024.csv`  
- Rebate totals by filing status: `outputs/vat/rebate_cost_by_status_2024.csv`  
- Rebate-only MTRs: `outputs/vat/mtr_summary_2024.csv`  
- Wage shares in phase-out bands: `outputs/vat/wage_phaseout_shares_2024.csv`  
- Distribution (baseline vs. no income tax + rebate): `outputs/vat/distribution_2024.csv`  
- Off-model VAT inputs: `outputs/vat/sales_tax_inputs_2024.csv`

## Method notes and assumptions

- Entity = **household** throughout; we exclude **AGI < 0** households.  
- **MFJ vs Single** derived from household spouse presence (`has_spouse` / `spouse_present` / `head_spouse_count`) and HOH eligibility; we treat MFJ-only as “Married.”  
- Grouping = **equivalized income** (AGI / household size) for deciles; we also report **Top 5%** and **Top 1%** by the weighted distribution.  
- Rebate constants are **hard-coded** per spec above.

""")

readme_path = docsdir / "README_VAT.md"
readme_path.write_text(readme, encoding="utf-8")
print("\n✅ Wrote", readme_path)

# 4) Optional: create branch and commit
def run(cmd, cwd=None):
    print("$", " ".join(cmd))
    res = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    if res.returncode != 0:
        print(res.stdout)
        print(res.stderr)
        raise RuntimeError(f"Command failed: {' '.join(cmd)}")
    return res.stdout.strip()

def inside_git_repo():
    try:
        run(["git", "rev-parse", "--is-inside-work-tree"])
        return True
    except Exception:
        return False

if DO_GIT_ACTIONS:
    if not inside_git_repo():
        raise RuntimeError("Not inside a git repo; open the project root in VS Code terminal and rerun with DO_GIT_ACTIONS=True.")

    # create a new branch
    try:
        run(["git", "checkout", "-b", BRANCH_NAME])
    except RuntimeError:
        # branch may exist; switch to it
        run(["git", "checkout", BRANCH_NAME])

    # stage files (docs + outputs for 2024)
    to_add = [
        str(readme_path),
        str(outputs / "rebate_cost_2024.csv"),
        str(outputs / "rebate_cost_by_decile_2024.csv"),
        str(outputs / "rebate_cost_by_status_2024.csv"),
        str(outputs / "mtr_summary_2024.csv"),
        str(outputs / "wage_phaseout_shares_2024.csv"),
        str(outputs / "distribution_2024.csv"),
        str(outputs / "sales_tax_inputs_2024.csv"),
    ]
    run(["git", "add"] + to_add)
    run(["git", "commit", "-m", "CA VAT rebate (2024): results, docs, and outputs"])

    print("\n✅ Git commit created on branch:", BRANCH_NAME)
    print("Next steps:")
    print("  git push -u origin", BRANCH_NAME)
    print("  # then open a PR with your provider or GH CLI, e.g.:")
    print("  gh pr create --title \"CA VAT rebate 2024\" --body \"See docs/README_VAT.md for details.\"")

# Also print a PR title/body you can paste manually
pr_title = "CA VAT rebate (2024): totals, MTRs, distribution, and docs"
pr_body = textwrap.dedent("""\
Implements and documents the California VAT rebate analysis for 2024 (household-level).

- Rebate totals (no phase-out and with phase-out)
- Rebate-only MTR (+$1 wages experiment) and wage shares in phase-out bands
- Distribution: baseline vs no income tax + phase-out rebate (deciles + top 5% + top 1%)
- Off-model VAT inputs by decile (AGI, wages, allowance, rebate)
- Documentation under docs/README_VAT.md

See the docs for constants, formulas, and assumptions.
""")

print("\n--- Suggested PR ---")
print("Title:", pr_title)
print("Body:\n" + pr_body)
print("\n✅ Step 06 complete.")


✅ All expected output files found.

--- Rebate totals (2024) ---
No phase-out total: $439,892,827,841
Phase-out total:    $345,335,469,784
Reduction from phase-out: $94,557,358,056

--- Rebate-only MTR (2024) ---
Population-weighted: 0.0294
Earnings-weighted:   0.0855

--- Wage shares in phase-out bands (2024) ---
Singles 75–125k: 4.63%
MFJ 150–200k:   9.91%

--- Distribution preview (first 12 rows) ---
 year     group  mean_tax_baseline  mean_tax_reform    mean_change  total_change  pop_share  share_of_total_change
 2024  decile_1       -2872.822256    -25898.613447  -23025.791190 -4.352185e+10  11.656511               5.144342
 2024  decile_2       -3991.946569    -33852.590773  -29860.644204 -4.684509e+10   9.674770               5.537152
 2024  decile_3       -2409.696325    -25475.567102  -23065.870776 -3.436186e+10   9.187187               4.061618
 2024  decile_4        1268.987260    -32546.189595  -33815.176855 -5.448043e+10   9.935842               6.439660
 2024  decile_5   