# Before vs After RL Tuning: Example Results

This notebook runs the **verify → correct** pipeline **before** (default params) and **after** (RL-tuned params) on three example claims, then plots the comparison.

## 1. Example case

We use **three claims** from the training set:
- One likely **real** (mayor/budget) → we expect higher confidence.
- One likely **fake** (moon landing faked) → we expect lower confidence.
- One **multi-sentence** (mayor + crime) → we get multiple sub-claims.

Each claim has a **ground truth** (`expected_conf`, `is_fake`) that the RL tuner uses to learn better correction parameters.

In [None]:
import sys
from pathlib import Path

ROOT = Path.cwd().resolve()
if ROOT.name != "FakeNewsVerifier":
    ROOT = ROOT.parent  # assume we're in notebooks/
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))

from src.agents.claim_extractor import extract_claims
from src.agents.verifier import verify_claims
from src.agents.corrector import correct_results, DEFAULT_PARAMS

# Example cases: 3 claims with ground truth
examples = [
    {"claim": "The mayor said the budget was approved.", "ground_truth": {"is_fake": False, "expected_conf": 0.75}},
    {"claim": "The moon landing was faked in Nashville.", "ground_truth": {"is_fake": True, "expected_conf": 0.2}},
    {"claim": "The mayor said the budget was approved. It has been reported that crime is down.", "ground_truth": {"is_fake": False, "expected_conf": 0.7}},
]

labels = ["Mayor/budget", "Moon faked", "Mayor + crime"]
print("Example cases:", labels)
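
Because the tuner depends on the ground-truth fields, it can help to sanity-check the examples before running anything. The helper below is a minimal sketch, not part of the project: `validate_examples` is a hypothetical name, and the `[0, 1]` range for `expected_conf` is an assumption based on how confidences are plotted later in this notebook.

```python
def validate_examples(examples):
    """Raise ValueError if any example lacks a usable claim or ground truth.

    Hypothetical helper: the [0, 1] range for expected_conf is an assumption.
    """
    for i, ex in enumerate(examples):
        if not str(ex.get("claim", "")).strip():
            raise ValueError(f"example {i}: 'claim' must be a non-empty string")
        gt = ex.get("ground_truth", {})
        if not isinstance(gt.get("is_fake"), bool):
            raise ValueError(f"example {i}: 'is_fake' must be a bool")
        conf = gt.get("expected_conf")
        # bool is a subclass of int, so exclude it explicitly
        is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
        if not is_number or not 0.0 <= conf <= 1.0:
            raise ValueError(f"example {i}: 'expected_conf' must be a number in [0, 1]")
    return True
```

Calling `validate_examples(examples)` right after the list is defined would surface a malformed entry before the verifier or tuner runs.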

## 2. Before results (default params)

Run **verify → correct** with **no tuning**: the corrector uses hardcoded `DEFAULT_PARAMS` (e.g. `intervention_threshold=0.55`, `causal_bias=0.4`, `causal_truth=0.85`). We record the **final confidence** for each (sub-)claim.

In [None]:
before_params = None  # uses DEFAULT_PARAMS
before_confs = []  # mean final confidence per example
before_details = []

for ex, label in zip(examples, labels):
    claims = extract_claims(ex["claim"])
    v_res, graph = verify_claims(claims)
    corrected, _ = correct_results(v_res, graph, params=before_params)
    confs = [r["confidence"] for r in corrected]
    mean_conf = sum(confs) / len(confs) if confs else 0.0
    before_confs.append(mean_conf)
    before_details.append({"label": label, "confs": confs, "mean": mean_conf})

print("Before params:", DEFAULT_PARAMS)
print("Before mean confidence per example:", before_confs)

## 3. After results (RL-tuned params)

Build a small dataset from the same examples, run `tune_correction` (a short PPO run), then run **verify → correct** with the **learned params**. If PyTorch is missing, the tuner returns the default params unchanged, so before ≈ after.
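
Since a missing PyTorch install silently makes the two runs look the same, a quick availability check helps when interpreting the results. This is a small sketch using only the standard library; `torch_available` is a hypothetical helper name, not something the project provides.

```python
import importlib.util


def torch_available():
    """Return True if the `torch` package can be found in this environment."""
    return importlib.util.find_spec("torch") is not None


if not torch_available():
    # Per the note above, the tuner is expected to fall back to the defaults here.
    print("PyTorch not found: tuned params will likely equal the default params.")
```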

In [None]:
dataset = []
for ex in examples:
    claims = extract_claims(ex["claim"])
    v_res, graph = verify_claims(claims)
    dataset.append((v_res, graph, ex["ground_truth"]))

from src.rl_tuner import tune_correction
after_params = tune_correction(dataset, num_iterations=2)
print("After (tuned) params:", after_params)

after_confs = []
for ex in examples:  # labels are not needed here
    claims = extract_claims(ex["claim"])
    v_res, graph = verify_claims(claims)
    corrected, _ = correct_results(v_res, graph, params=after_params)
    confs = [r["confidence"] for r in corrected]
    mean_conf = sum(confs) / len(confs) if confs else 0.0
    after_confs.append(mean_conf)

print("After mean confidence per example:", after_confs)

## 4. Plot: Before vs After (and expected)

For each example we plot:
- **Before**: mean final confidence with default params.
- **After**: mean final confidence with RL-tuned params.
- **Expected**: ground-truth `expected_conf` (what we want to match).

The RL tuner optimizes for **After** landing close to **Expected** while making few unnecessary interventions.
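
To make that objective concrete, here is one way such a reward could be shaped. This is an illustrative sketch only: the function name, the `intervened` flag, and the `penalty` value are assumptions, and the actual reward in `src.rl_tuner` may differ.

```python
def sketch_reward(final_conf, expected_conf, is_fake, intervened, penalty=0.1):
    """Illustrative reward, NOT the project's actual reward function.

    Higher is better: the reward is the negative gap to the expected
    confidence, minus a fixed penalty when a real claim was intervened on.
    """
    reward = -abs(final_conf - expected_conf)
    if intervened and not is_fake:
        # Assumed notion of an "unnecessary intervention": correcting a real claim.
        reward -= penalty
    return reward
```

Under this shape, a perfect match with no intervention scores 0, and every deviation or unnecessary intervention pushes the reward below 0.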

In [None]:
import matplotlib.pyplot as plt
import numpy as np

expected = [ex["ground_truth"]["expected_conf"] for ex in examples]
x = np.arange(len(labels))
w = 0.25

fig, ax = plt.subplots(figsize=(8, 5))
ax.bar(x - w, before_confs, width=w, label="Before (default)", color="C0", alpha=0.8)
ax.bar(x, after_confs, width=w, label="After (tuned)", color="C1", alpha=0.8)
ax.bar(x + w, expected, width=w, label="Expected (ground truth)", color="C2", alpha=0.6)

ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.set_ylabel("Mean confidence")
ax.set_title("Before vs After RL tuning (example cases)")
ax.legend()
ax.set_ylim(0, 1.05)
plt.tight_layout()
plt.show()
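
The bar chart shows the gap visually; a single number per run makes the comparison easier to track across experiments. The sketch below computes the mean absolute error against the expected confidences. `mean_abs_error` is a hypothetical helper name, and mean absolute error is just one reasonable choice of metric.

```python
def mean_abs_error(predicted, expected):
    """Mean absolute gap between predicted and expected confidences."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not predicted:
        return 0.0  # nothing to compare
    return sum(abs(p - e) for p, e in zip(predicted, expected)) / len(predicted)
```

In this notebook it could be called as `mean_abs_error(before_confs, expected)` and `mean_abs_error(after_confs, expected)`; a lower value for the tuned run would indicate that tuning moved confidences toward the ground truth.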

**Summary**
- **Example case**: three claims (mayor/budget, moon faked, mayor+crime) with known ground truth.
- **Before**: correction uses the fixed `intervention_threshold=0.55` plus the default `causal_bias` and `causal_truth`; final confidence can differ from expected.
- **After**: correction uses a learned threshold (and optionally a learned bias). The tuner is trained so that, on average, final confidence moves closer to expected while over-correction on real claims is penalized.
- For a clearer before/after difference, install PyTorch and use more iterations (e.g. `num_iterations` between 5 and 10).