# Quote Math Validator

This notebook validates the pricing math used by the form-builder quote engine.

The PHP `QuoteEngine` and JavaScript `quote.js` share the same deterministic formula:

```
subtotal  = (base_rate * complexity_multiplier) + addon_total
range_low  = round(subtotal * 0.9)
range_high = round(subtotal * 1.2)
```
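As a quick sanity check, the formula can be written in a few lines of Python. This is a minimal sketch: the function name `quote_range` is hypothetical, the sample rates are taken from the tables defined in section 8, and Python's built-in `round` stands in for the PHP and JavaScript rounding, which it matches whenever the product is not an exact `.5`.

```python
def quote_range(base_rate, complexity_multiplier, addon_total):
    """Return (subtotal, range_low, range_high) for one quote (hypothetical helper)."""
    subtotal = base_rate * complexity_multiplier + addon_total
    return round(subtotal), round(subtotal * 0.9), round(subtotal * 1.2)


# ecommerce (4500) at moderate complexity (1.4) with the branding addon (1800)
print(quote_range(4500, 1.4, 1800))  # (8100, 7290, 9720)
```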

**Goals**
1. Load `quote_math_validation.csv` — 2 500+ correct rows + ~360 intentionally broken rows.
2. Re-implement the formula in Python and assert it matches the CSV ground truth.
3. Train a lightweight `DecisionTreeClassifier` to detect rows where the stored numbers are wrong.
4. Expose a single `predict_quote(...)` function that returns `True` when the math checks out.

All packages used (`pandas`, `numpy`, `scikit-learn`) are available on the Kaggle free tier with no GPU required.

## 1. Import Required Libraries

In [None]:
import os
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score,
    f1_score, confusion_matrix, classification_report,
)
from sklearn.preprocessing import LabelEncoder

import sklearn

print("pandas", pd.__version__, "| numpy", np.__version__)
print("scikit-learn", sklearn.__version__)

## 2. Load and Explore the Dataset

`quote_math_validation.csv` was generated by `gen_data.py`, which:
- Enumerates every `service × complexity` pair
- Pairs each with no addons, each single addon, every 2-addon combination, every 3-addon combination, and a sample of 4- and 5-addon combinations
- Adds ~15% intentionally miscalculated rows (wrong multiplier, wrong range_low, or wrong range_high) labelled `math_correct = 0`
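`gen_data.py` is not reproduced in this notebook, so the following is only a sketch of how the exhaustive part of that grid could be enumerated, not the script's actual code. The service, complexity, and addon names are the ones used in the rate tables in section 8.

```python
from itertools import combinations, product

# Names taken from the rate tables used later in this notebook.
SERVICES = ["web_design", "web_development", "ecommerce",
            "software", "ai_web_app", "ai_native_app"]
COMPLEXITIES = ["simple", "moderate", "complex", "custom"]
ADDONS = ["seo_basic", "seo_advanced", "copywriting", "branding",
          "maintenance", "hosting_setup", "api_integration", "automation"]

# Every addon set of size 0, 1, 2, or 3 (the exhaustive part of the grid).
addon_sets = [combo for k in range(4) for combo in combinations(ADDONS, k)]

# One candidate row per service x complexity x addon set.
grid = list(product(SERVICES, COMPLEXITIES, addon_sets))

print(len(addon_sets))  # 1 + 8 + 28 + 56 = 93
print(len(grid))        # 6 * 4 * 93 = 2232
```

The sampled 4- and 5-addon combinations would come on top of these 2232 rows, which is consistent with the 2 500+ correct rows mentioned in the goals.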

In [None]:
from pathlib import Path
import subprocess, sys

KAGGLE_BASE = Path("/kaggle/input/datasets/crissymoon/quote-math-validation")
KAGGLE_CSV  = KAGGLE_BASE / "quote_math_validation.csv"
KAGGLE_GEN  = KAGGLE_BASE / "gen_data.py"
LOCAL_BASE  = Path("/kaggle/working")
LOCAL_CSV   = LOCAL_BASE / "quote_math_validation.csv"

# Priority: Kaggle input dataset -> already-generated local copy -> regenerate now
if KAGGLE_CSV.exists():
    CSV_PATH = KAGGLE_CSV
elif LOCAL_CSV.exists():
    CSV_PATH = LOCAL_CSV
else:
    # Dataset not attached: regenerate from gen_data.py into /kaggle/working.
    # Notebooks do not define __file__, so that fallback only applies to scripts.
    if KAGGLE_GEN.exists():
        gen_script = KAGGLE_GEN
    elif "__file__" in globals():
        gen_script = Path(__file__).parent / "gen_data.py"
    else:
        gen_script = None
    if gen_script and gen_script.exists():
        print("Regenerating CSV from gen_data.py ...")
        subprocess.run([sys.executable, str(gen_script)], check=True)
    else:
        raise FileNotFoundError(
            "CSV not found at Kaggle input path and gen_data.py is unavailable.\n"
            f"Expected: {KAGGLE_CSV}\n"
            "Attach the dataset 'crissymoon/quote-math-validation' to this notebook."
        )
    CSV_PATH = LOCAL_CSV

df = pd.read_csv(CSV_PATH)
print("CSV loaded from:", CSV_PATH)
print("Shape:", df.shape)
print("\nClass balance:")
print(df["math_correct"].value_counts())
print("\nError type distribution:")
print(df["error_type"].value_counts())
df.head(8)

## 3. Data Preprocessing

In [None]:
assert df.isnull().sum().sum() == 0, "Unexpected nulls -- re-run gen_data.py"

le_service    = LabelEncoder().fit(df["service_type"])
le_complexity = LabelEncoder().fit(df["complexity"])

df["service_enc"]    = le_service.transform(df["service_type"])
df["complexity_enc"] = le_complexity.transform(df["complexity"])

print("Service classes   :", list(le_service.classes_))
print("Complexity classes:", list(le_complexity.classes_))

# Exclude addon_total: it starts with "addon_" but belongs in NUM_COLS
ADDON_COLS = [c for c in df.columns if c.startswith("addon_") and c != "addon_total"]
NUM_COLS   = ["base_rate", "complexity_multiplier", "addon_total",
              "subtotal", "range_low", "range_high"]
CAT_COLS   = ["service_enc", "complexity_enc"]
FEATURES   = CAT_COLS + NUM_COLS + ADDON_COLS
TARGET     = "math_correct"

X = df[FEATURES]
y = df[TARGET]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"\nTrain: {len(X_train)}  Test: {len(X_test)}")
print(f"Train positive rate: {y_train.mean():.3f}")

## 4. Feature Engineering

The numeric columns already encode everything the formula uses, but we add three derived features that make the decision boundary trivially learnable for the tree:

| Derived feature | Formula |
|---|---|
| `expected_subtotal` | `base_rate * complexity_multiplier + addon_total` |
| `low_delta` | `range_low - round(expected_subtotal * 0.9)` |
| `high_delta` | `range_high - round(expected_subtotal * 1.2)` |

A correct row has `low_delta == 0` and `high_delta == 0`. The model learns this quickly, but the exercise is still useful for catching float-rounding edge cases or JS/PHP divergence you might introduce later.
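One concrete source of such divergence is the half-way case. Python's built-in `round` (like `np.round`) rounds exact halves to the nearest even integer, whereas JavaScript's `Math.round` rounds halves up and PHP's `round` rounds them away from zero by default. With the rate tables used in this notebook every subtotal is a multiple of 50, so `subtotal * 0.9` and `subtotal * 1.2` are whole numbers and the difference never shows. The sketch below uses a hypothetical helper, `round_half_up`, to illustrate what would happen if a future rate change produced a `.5`.

```python
import math


def round_half_up(x):
    """Round halves toward +infinity, as JavaScript's Math.round does (hypothetical helper)."""
    return math.floor(x + 0.5)


# A hypothetical subtotal of 1505 gives 1505 * 0.9 = 1354.5 before rounding.
for value in (1354.5, 1355.5, 1350.0):
    print(value, round(value), round_half_up(value))
# 1354.5 1354 1355
# 1355.5 1356 1356
# 1350.0 1350 1350
```

For positive amounts PHP's default mode agrees with `round_half_up`, so on a row like the first one only the Python side would be off by one.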

In [None]:
def add_derived(frame):
    """Add expected_subtotal, low_delta, high_delta to a copy of the frame.
    Uses .values to avoid pandas duplicate-label alignment errors.
    """
    f = frame.copy().reset_index(drop=True)
    base   = f["base_rate"].values
    mult   = f["complexity_multiplier"].values
    addons = f["addon_total"].values
    low    = f["range_low"].values
    high   = f["range_high"].values
    exp    = base * mult + addons
    f["expected_subtotal"] = exp
    f["low_delta"]         = low  - np.round(exp * 0.9)
    f["high_delta"]        = high - np.round(exp * 1.2)
    return f

X_train_fe = add_derived(X_train)
X_test_fe  = add_derived(X_test)

FEATURES_FE = FEATURES + ["expected_subtotal", "low_delta", "high_delta"]

print("Features:", FEATURES_FE)
print("Train shape:", X_train_fe[FEATURES_FE].shape)

## 5. Build the Lightweight Model

A shallow `DecisionTreeClassifier` (`max_depth=5`) is used because:
- The pricing formula is entirely deterministic — a tree of depth 2–3 should already reach near-100% accuracy.
- It is fast and interpretable, and prediction is just a short chain of threshold comparisons, so the model itself adds no floating-point arithmetic.
- It can be serialised to a tiny JSON/dict for use inside the PHP or JS layer if needed.
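To make the last point concrete, here is a minimal sketch of what a serialised tree could look like and how a consumer would walk it. The nested-dict layout and the `classify` helper are assumptions made for illustration, and the thresholds are written by hand rather than exported from the trained model.

```python
import json

# Hand-written stand-in for an exported tree: internal nodes test
# `row[feature] <= threshold`; leaves carry the predicted label.
WRONG, CORRECT = {"label": 0}, {"label": 1}
TREE = {
    "feature": "low_delta", "threshold": -0.5,
    "left": WRONG,
    "right": {
        "feature": "low_delta", "threshold": 0.5,
        "left": {
            "feature": "high_delta", "threshold": -0.5,
            "left": WRONG,
            "right": {
                "feature": "high_delta", "threshold": 0.5,
                "left": CORRECT,
                "right": WRONG,
            },
        },
        "right": WRONG,
    },
}


def classify(node, row):
    """Walk the nested dict until a leaf is reached and return its label."""
    while "label" not in node:
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]


# The structure survives a JSON round trip, so PHP or JS could load the same text.
tree = json.loads(json.dumps(TREE))
print(classify(tree, {"low_delta": 0, "high_delta": 0}))   # 1
print(classify(tree, {"low_delta": 0, "high_delta": 60}))  # 0
```

In this hand-written tree, testing one delta for zero takes two threshold splits, so checking both deltas uses a depth of four.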

In [None]:
model = DecisionTreeClassifier(
    max_depth=5,
    min_samples_leaf=2,
    random_state=42,
    class_weight="balanced",
)

print("Model:", model)

## 6. Train the Model

In [None]:
model.fit(X_train_fe[FEATURES_FE], y_train)

train_acc = accuracy_score(y_train, model.predict(X_train_fe[FEATURES_FE]))
print(f"Training accuracy : {train_acc:.4f}")
print(f"Tree depth used   : {model.get_depth()}")
print(f"Leaf nodes        : {model.get_n_leaves()}")

## 7. Evaluate the Model

In [None]:
y_pred = model.predict(X_test_fe[FEATURES_FE])

print("Test set results")
print("----------------")
print(classification_report(y_test, y_pred, target_names=["wrong (0)", "correct (1)"]))

cm = confusion_matrix(y_test, y_pred)
cm_df = pd.DataFrame(
    cm,
    index=["actual: wrong", "actual: correct"],
    columns=["pred: wrong", "pred: correct"],
)
print("Confusion matrix:")
print(cm_df)

print(f"\nAccuracy : {accuracy_score(y_test, y_pred):.4f}")
print(f"Precision: {precision_score(y_test, y_pred):.4f}")
print(f"Recall   : {recall_score(y_test, y_pred):.4f}")
print(f"F1       : {f1_score(y_test, y_pred):.4f}")

In [None]:
# Feature importance
importance = pd.Series(model.feature_importances_, index=FEATURES_FE)
importance_sorted = importance[importance > 0].sort_values(ascending=False)

print("Feature importances (non-zero only):")
print(importance_sorted.to_string())

## 8. Run Inference on New Form Inputs

`predict_quote(...)` mirrors the `QuoteEngine::calculate()` PHP signature.  
Pass the raw form values and it returns `True` if the numbers check out, `False` otherwise.

In [None]:
BASE_RATES = {
    "web_design": 1500, "web_development": 3500, "ecommerce": 4500,
    "software": 7500, "ai_web_app": 9500, "ai_native_app": 14000,
}
COMPLEXITY_MULTIPLIERS = {
    "simple": 1.0, "moderate": 1.4, "complex": 2.0, "custom": 2.8,
}
ADDON_RATES = {
    "seo_basic": 500, "seo_advanced": 1200, "copywriting": 800,
    "branding": 1800, "maintenance": 1200, "hosting_setup": 350,
    "api_integration": 1500, "automation": 2200,
}
ADDONS_LIST = list(ADDON_RATES.keys())


def _build_row(service_type, complexity, addons,
               claimed_subtotal, claimed_range_low, claimed_range_high):
    """Build a single-row DataFrame in the shape the model expects."""
    base        = BASE_RATES[service_type]
    multiplier  = COMPLEXITY_MULTIPLIERS[complexity]
    addon_total = sum(ADDON_RATES[a] for a in addons if a in ADDON_RATES)
    exp         = base * multiplier + addon_total
    row = {
        "service_enc"          : le_service.transform([service_type])[0],
        "complexity_enc"       : le_complexity.transform([complexity])[0],
        **{f"addon_{a}": (1 if a in addons else 0) for a in ADDONS_LIST},
        "base_rate"            : base,
        "complexity_multiplier": multiplier,
        "addon_total"          : addon_total,
        "subtotal"             : claimed_subtotal,
        "range_low"            : claimed_range_low,
        "range_high"           : claimed_range_high,
        "expected_subtotal"    : exp,
        "low_delta"            : claimed_range_low  - round(exp * 0.9),
        "high_delta"           : claimed_range_high - round(exp * 1.2),
    }
    return pd.DataFrame([row])[FEATURES_FE], base, multiplier, addon_total, exp


def predict_quote(
    service_type: str,
    complexity: str,
    addons: list,
    claimed_subtotal: int,
    claimed_range_low: int,
    claimed_range_high: int,
) -> bool:
    """
    Returns True if the claimed figures match the QuoteEngine formula exactly.
    Both the deterministic rule check and the ML model must agree.
    Use predict_quote_detail() for confidence score and breakdown.
    """
    return predict_quote_detail(
        service_type, complexity, addons,
        claimed_subtotal, claimed_range_low, claimed_range_high
    )["correct"]


def predict_quote_detail(
    service_type: str,
    complexity: str,
    addons: list,
    claimed_subtotal: int,
    claimed_range_low: int,
    claimed_range_high: int,
) -> dict:
    """
    Returns a dict with:
      correct        (bool)   - True only when both rule and model agree it is valid
      confidence     (float)  - model probability for the 'correct' class (0.0-1.0)
      rule_ok        (bool)   - deterministic formula check
      model_ok       (bool)   - ML model prediction
      expected       (dict)   - what the correct values should be
      deltas         (dict)   - difference between claimed and expected values
      error_flags    (list)   - which fields are wrong, empty if all correct
    """
    row_df, base, multiplier, addon_total, exp = _build_row(
        service_type, complexity, addons,
        claimed_subtotal, claimed_range_low, claimed_range_high
    )

    expected_subtotal = round(exp)
    expected_low      = round(exp * 0.9)
    expected_high     = round(exp * 1.2)

    rule_ok = (
        claimed_subtotal   == expected_subtotal and
        claimed_range_low  == expected_low      and
        claimed_range_high == expected_high
    )

    proba      = model.predict_proba(row_df)[0]  # [p_wrong, p_correct]
    confidence = float(proba[1])                 # probability of 'correct'
    model_ok   = confidence >= 0.5

    error_flags = []
    if claimed_subtotal   != expected_subtotal: error_flags.append("subtotal")
    if claimed_range_low  != expected_low:      error_flags.append("range_low")
    if claimed_range_high != expected_high:     error_flags.append("range_high")

    return {
        "correct"   : rule_ok and model_ok,
        "confidence": round(confidence, 4),
        "rule_ok"   : rule_ok,
        "model_ok"  : model_ok,
        "expected"  : {
            "subtotal"  : expected_subtotal,
            "range_low" : expected_low,
            "range_high": expected_high,
        },
        "deltas": {
            "subtotal"  : claimed_subtotal   - expected_subtotal,
            "range_low" : claimed_range_low  - expected_low,
            "range_high": claimed_range_high - expected_high,
        },
        "error_flags": error_flags,
    }
# --- demo ---
_c = predict_quote_detail("web_design", "simple", [], 1500, 1350, 1800)
_w = predict_quote_detail("web_design", "simple", [], 9999, 1350, 1800)
print(f"correct row  confidence={_c['confidence']:.2f}  correct={_c['correct']}")
print(f"wrong row    confidence={_w['confidence']:.2f}  correct={_w['correct']}  flags={_w['error_flags']}")
print("\npredict_quote() and predict_quote_detail() ready.")


## 8b. Decision Tree Structure

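Print the rules the trained tree learned. Because the derived features (`low_delta`, `high_delta`) make the boundary trivial, the tree should separate the two classes in very few splits. The cell also prints a compact model card and a sample prediction table built with `predict_quote_detail()` from section 8.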


In [None]:
from sklearn.tree import export_text

# Print the learned decision rules (truncated to top 4 levels for readability)
rules = export_text(
    model,
    feature_names=FEATURES_FE,
    max_depth=4,
    spacing=3,
)
print(rules)

# Compact model card
print("-" * 52)
print("Algorithm   : DecisionTreeClassifier")
print(f"Max depth   : {model.get_depth()} (limit was 5)")
print(f"Leaf nodes  : {model.get_n_leaves()}")
print(f"Features    : {len(FEATURES_FE)}")
print(f"Train acc   : {accuracy_score(y_train, model.predict(X_train_fe[FEATURES_FE])):.4f}")
print("-" * 52)

# Sample prediction table
samples = [
    ("web_design",    "simple",   [],              1500,  1350,  1800),   # correct
    ("web_design",    "simple",   [],              9999,  1350,  1800),   # wrong subtotal
    ("ecommerce",     "moderate", ["branding"],    8100,  7290,  9720),   # correct
    # wrong: automation costs 2200, so the formula gives 41400 / 37260 / 49680
    ("ai_native_app", "custom",   ["automation"],  41200, 37080, 49440),
    ("software",      "custom",   ["automation"],  9700,  8730,  11640),  # wrong multiplier (1.0 used)
    ("ai_web_app",    "complex",  ["api_integration", "branding"], 22300, 20070, 26760),  # correct
]

print(f"\n{'service':<16} {'complexity':<10} {'subtotal':>9} {'prediction':<12} {'confidence':>10}")
print("-" * 62)
for svc, cmp, add, sub, lo, hi in samples:
    d = predict_quote_detail(svc, cmp, add, sub, lo, hi)
    label = "CORRECT" if d["correct"] else "WRONG"
    print(f"{svc:<16} {cmp:<10} {sub:>9} {label:<12} {d['confidence']:>10.4f}")


## 9. Confidence Scoring and Extended Tests

`predict_quote_detail()` returns the model's probability score alongside the rule check, expected values, per-field deltas, and a list of which fields are wrong.

The confidence score comes from `predict_proba`: it is the model's estimated probability that the row is mathematically correct. For a decision tree this is the share of 'correct' training rows in the leaf the row falls into, so `1.00` means the leaf held only correct rows; anything below `0.50` is classified as wrong.
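Because the model is trained with `class_weight="balanced"`, the score is a weighted share rather than a raw row count: each training row counts with weight `n_samples / (n_classes * class_count)`, and `predict_proba` reports the weighted fraction of each class in the leaf a sample lands in. The sketch below reproduces that arithmetic by hand; `leaf_confidence` is a hypothetical helper, and the class totals are round numbers based on the dataset description, not the actual training split.

```python
def leaf_confidence(leaf_correct, leaf_wrong, total_correct, total_wrong):
    """Class-weighted share of 'correct' rows in one leaf (hypothetical helper)."""
    n_samples = total_correct + total_wrong
    w_correct = n_samples / (2 * total_correct)  # "balanced" weight, 2 classes
    w_wrong = n_samples / (2 * total_wrong)
    weighted_correct = leaf_correct * w_correct
    weighted_wrong = leaf_wrong * w_wrong
    return weighted_correct / (weighted_correct + weighted_wrong)


# Assumed class totals: 2500 correct rows and 360 broken rows.
print(leaf_confidence(40, 0, 2500, 360))           # pure leaf  -> 1.0
print(round(leaf_confidence(1, 1, 2500, 360), 4))  # mixed leaf -> 0.1259
```

A leaf holding one correct and one wrong row scores well below `0.50` here because the rarer "wrong" class carries the larger weight.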

In [None]:
test_cases = [
    # --- Correct cases ---
    {
        "label": "ai_web_app / complex / branding + api",
        "service_type": "ai_web_app", "complexity": "complex",
        "addons": ["branding", "api_integration"],
        "claimed_subtotal": 22300, "claimed_range_low": 20070, "claimed_range_high": 26760,
        "expect_correct": True,
    },
    {
        "label": "web_design / simple / no addons",
        "service_type": "web_design", "complexity": "simple",
        "addons": [], "claimed_subtotal": 1500,
        "claimed_range_low": 1350, "claimed_range_high": 1800,
        "expect_correct": True,
    },
    {
        "label": "ecommerce / moderate / seo_advanced + branding",
        "service_type": "ecommerce", "complexity": "moderate",
        "addons": ["seo_advanced", "branding"],
        "claimed_subtotal": 9300, "claimed_range_low": 8370, "claimed_range_high": 11160,
        "expect_correct": True,
    },
    {
        "label": "ai_native_app / custom / all addons",
        "service_type": "ai_native_app", "complexity": "custom",
        "addons": list(ADDON_RATES.keys()),
        "claimed_subtotal": 48750, "claimed_range_low": 43875, "claimed_range_high": 58500,
        "expect_correct": True,
    },
    {
        "label": "software / simple / no addons",
        "service_type": "software", "complexity": "simple",
        "addons": [], "claimed_subtotal": 7500,
        "claimed_range_low": 6750, "claimed_range_high": 9000,
        "expect_correct": True,
    },
    # --- Wrong cases ---
    {
        "label": "web_development / moderate / wrong range_high",
        "service_type": "web_development", "complexity": "moderate",
        "addons": ["seo_basic"],
        "claimed_subtotal": 5400, "claimed_range_low": 4860, "claimed_range_high": 9999,
        "expect_correct": False,
    },
    {
        "label": "web_design / simple / wrong subtotal",
        "service_type": "web_design", "complexity": "simple",
        "addons": [], "claimed_subtotal": 9999,
        "claimed_range_low": 1350, "claimed_range_high": 1800,
        "expect_correct": False,
    },
    {
        "label": "ecommerce / complex / wrong range_low",
        "service_type": "ecommerce", "complexity": "complex",
        "addons": ["hosting_setup"],
        "claimed_subtotal": 9350, "claimed_range_low": 1000, "claimed_range_high": 11220,
        "expect_correct": False,
    },
    {
        "label": "software / custom / wrong multiplier applied",
        "service_type": "software", "complexity": "custom",
        "addons": ["automation"],
        # correct: 7500*2.8+2200=23200/20880/27840; claimed uses simple (1.0) -> 9700
        "claimed_subtotal": 9700, "claimed_range_low": 8730, "claimed_range_high": 11640,
        "expect_correct": False,
    },
    {
        "label": "ai_web_app / moderate / all three fields off by 1",
        "service_type": "ai_web_app", "complexity": "moderate",
        "addons": [],
        "claimed_subtotal": 13301, "claimed_range_low": 11971, "claimed_range_high": 15961,
        "expect_correct": False,
    },
]

# Run all tests
passed = 0
failed = 0

col_w = [40, 6, 8, 10]
header = f"{'label':<{col_w[0]}} {'conf':>{col_w[1]}} {'result':<{col_w[2]}} {'pass/fail':<{col_w[3]}}"
print(header)
print("-" * sum(col_w) + "-" * 4)

for tc in test_cases:
    args = {k: v for k, v in tc.items() if k not in ("label", "expect_correct")}
    detail = predict_quote_detail(**args)

    outcome    = "CORRECT" if detail["correct"] else "WRONG"
    test_pass  = detail["correct"] == tc["expect_correct"]
    status     = "PASS" if test_pass else "FAIL"

    if test_pass:
        passed += 1
    else:
        failed += 1

    conf_str = f"{detail['confidence']:.2f}"
    print(f"{tc['label']:<{col_w[0]}} {conf_str:>{col_w[1]}} {outcome:<{col_w[2]}} {status:<{col_w[3]}}")

    if not test_pass or detail["error_flags"]:
        if detail["error_flags"]:
            print(f"  wrong fields : {detail['error_flags']}")
        if not test_pass:
            print(f"  expected     : {detail['expected']}")
            print(f"  deltas       : {detail['deltas']}")

print()
print(f"Results: {passed}/{len(test_cases)} passed, {failed} failed")

## 10. Save the Model

Serialise the trained tree, label encoders, and feature list to `/kaggle/working/` so Kaggle exposes them as downloadable output artefacts.

The bundle is a single `.pkl` file. Pickle is a Python-specific format, so it can be loaded by a Python service such as a future API layer, but not directly by the PHP backend.
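Pickle is specific to Python, so a non-Python consumer such as the PHP layer would need a language-neutral format for anything it reads directly. As a sketch, the plain rate tables from the bundle could be written to JSON alongside the pickle. The file name `quote_rates.json` and the `export_rates` helper are assumptions for illustration, and the tables below are abbreviated.

```python
import json
import tempfile
from pathlib import Path


def export_rates(out_dir, base_rates, complexity_mult, addon_rates):
    """Write the rate tables to a JSON file and return its path (hypothetical helper)."""
    path = Path(out_dir) / "quote_rates.json"
    payload = {
        "base_rates": base_rates,
        "complexity_mult": complexity_mult,
        "addon_rates": addon_rates,
    }
    path.write_text(json.dumps(payload, indent=2))
    return path


# Abbreviated tables; in the notebook these would be BASE_RATES and friends.
with tempfile.TemporaryDirectory() as tmp:
    p = export_rates(tmp, {"web_design": 1500}, {"simple": 1.0}, {"seo_basic": 500})
    print(json.loads(p.read_text())["base_rates"]["web_design"])  # 1500
```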


In [None]:
import pickle
from pathlib import Path

OUT_DIR  = Path("/kaggle/working")
MODEL_PKL = OUT_DIR / "quote_math_model.pkl"

bundle = {
    "model"          : model,
    "le_service"     : le_service,
    "le_complexity"  : le_complexity,
    "features"       : FEATURES_FE,
    "base_rates"     : BASE_RATES,
    "complexity_mult": COMPLEXITY_MULTIPLIERS,
    "addon_rates"    : ADDON_RATES,
}

with open(MODEL_PKL, "wb") as f:
    pickle.dump(bundle, f)

size_kb = MODEL_PKL.stat().st_size / 1024
print(f"Model saved  : {MODEL_PKL}")
print(f"File size    : {size_kb:.1f} KB")
print(f"Tree depth   : {model.get_depth()}")
print(f"Leaf nodes   : {model.get_n_leaves()}")
print(f"Features     : {len(FEATURES_FE)}")


## 11. Reload and Verify

Load the bundle from disk and run a quick smoke test to confirm the reloaded model still classifies known-good and known-bad rows as expected.


In [None]:
import pickle
import pandas as pd
from pathlib import Path

with open(Path("/kaggle/working/quote_math_model.pkl"), "rb") as f:
    b = pickle.load(f)

m2   = b["model"]
les  = b["le_service"]
lec  = b["le_complexity"]
feat = b["features"]
BR   = b["base_rates"]
CM   = b["complexity_mult"]
AR   = b["addon_rates"]
AL   = list(AR.keys())


def _smoke_predict(service, complexity, addons, sub, lo, hi):
    """Mirrors predict_quote_detail: rule check AND model must both agree."""
    base = BR[service]
    mult = CM[complexity]
    at   = sum(AR[a] for a in addons if a in AR)
    exp  = base * mult + at

    rule_ok = (
        sub == round(exp)       and
        lo  == round(exp * 0.9) and
        hi  == round(exp * 1.2)
    )

    row = {
        "service_enc"          : les.transform([service])[0],
        "complexity_enc"       : lec.transform([complexity])[0],
        **{f"addon_{a}": (1 if a in addons else 0) for a in AL},
        "base_rate"            : base,
        "complexity_multiplier": mult,
        "addon_total"          : at,
        "subtotal"             : sub,
        "range_low"            : lo,
        "range_high"           : hi,
        "expected_subtotal"    : exp,
        "low_delta"            : lo - round(exp * 0.9),
        "high_delta"           : hi - round(exp * 1.2),
    }
    model_ok = m2.predict(pd.DataFrame([row])[feat])[0] == 1
    return int(rule_ok and model_ok)


smoke = [
    # (service, complexity, addons, subtotal, range_low, range_high, expected_label)
    # correct rows
    ("web_design",    "simple",   [],             1500,  1350,  1800,  1),
    ("ecommerce",     "moderate", ["branding"],   8100,  7290,  9720,  1),
    ("ai_native_app", "custom",   [],             39200, 35280, 47040, 1),
    # wrong rows
    ("web_design",    "simple",   [],             9999,  1350,  1800,  0),  # wrong subtotal
    ("software",      "custom",   ["automation"], 9700,  8730,  11640, 0),  # wrong multiplier
]

all_ok = True
print(f"{'service':<16} {'complexity':<10} {'expected':>8} {'got':>5} {'ok':>4}")
print("-" * 50)
for svc, cmp, add, sub, lo, hi, expected_label in smoke:
    got = _smoke_predict(svc, cmp, add, sub, lo, hi)
    ok  = got == expected_label
    all_ok = all_ok and ok
    print(f"{svc:<16} {cmp:<10} {expected_label:>8} {got:>5} {'OK' if ok else 'FAIL':>4}")

print()
print("Reload smoke test:", "PASSED" if all_ok else "FAILED")
