### Loading the Trained Change-Type Classifier

This block loads the trained **change-type classification model** from disk.
The model is stored as a serialized dictionary (`.pkl`) that contains both the
classifier and additional configuration metadata.

- The scikit-learn estimator is stored under the `"clf"` key.
- The remaining keys hold configuration recorded at training time
  (the DETR model path and detection threshold, `top_k`, and the feature names).
- Printing `classes_` confirms the binary label set `[0, 1]`, where class `1`
  is the positive ("substantial") class.

This classifier is later used to convert cached feature vectors into
predicted labels (`y_pred`) and confidence scores (`score`).
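
As a minimal sketch of the "serialized dictionary" pattern described above, the snippet below packs a toy classifier with some metadata, writes it with `joblib`, and reads it back. The toy data, the temporary file name `toy_pack.pkl`, and the placeholder feature names are assumptions for illustration only; they are not the project's real artifacts.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data standing in for real feature vectors (hypothetical shape and labels).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(40, 3))
y_toy = (X_toy[:, 0] > 0).astype(int)

# Pack the estimator together with metadata in one dictionary.
pack = {
    "clf": LogisticRegression().fit(X_toy, y_toy),
    "feature_names": ["f0", "f1", "f2"],  # placeholder names
}

# Round-trip the dictionary through a .pkl file.
with tempfile.TemporaryDirectory() as tmp_dir:
    pack_path = Path(tmp_dir) / "toy_pack.pkl"
    joblib.dump(pack, pack_path)
    loaded = joblib.load(pack_path)

clf = loaded["clf"]
print(sorted(loaded.keys()))
print(clf.classes_)
```

Keeping the estimator and its training-time settings in one file means the evaluation code can check which settings produced the cached features before trusting the predictions.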


In [None]:
import joblib
from pathlib import Path

# Candidate locations relative to the Code folder: nested layout first, then flat.
# Path objects keep the separators portable across Windows, Linux, and macOS.
MODEL_DIR = Path("..") / "Models" / "ChangeTypeClassifier"
possible_paths = [
    MODEL_DIR / "Change_Type_Classifier" / "change_classifier_logreg.pkl",
    MODEL_DIR / "change_classifier_logreg.pkl",
]

# Pick the first candidate that exists on disk (None if neither does).
CLF_PATH = next((p for p in possible_paths if p.exists()), None)

if CLF_PATH is None:
    raise FileNotFoundError("Could not find change_classifier_logreg.pkl. Please check the Models folder.")

print(f"Loading model from: {CLF_PATH}")

model_pack = joblib.load(CLF_PATH)
print("Loaded keys:", model_pack.keys())

model_clf = model_pack["clf"]
print("model_clf type:", type(model_clf))
print("classes:", getattr(model_clf, "classes_", None))

Loaded keys: dict_keys(['clf', 'detr_model_path', 'detr_thresh', 'top_k', 'feature_names'])
✅ model_clf type: <class 'sklearn.linear_model._logistic.LogisticRegression'>
✅ classes: [0 1]


### Building the Evaluation Table (`eval_rows.csv`)

This step constructs a unified evaluation table that aligns **ground-truth labels**
from the synthetic data generation process with **model predictions**
produced by the trained change-type classifier.

Each row in the resulting CSV corresponds to **one synthetic before/after image pair**
and contains:

- **Ground truth (`y_true`)**  
  Derived from the metadata field `group`
  (`substantial` / `non-substantial`).

- **Model prediction (`y_pred`)** and **confidence score (`score`)**  
  Obtained from the trained logistic-regression classifier.

- **Change type**  
  e.g., `remove_sink`, `replace_toilet2stove`.

- **Image paths**  
  Paths to the original (before) and modified (after) floorplan images.

- **Auxiliary metadata**  
  Run ID, source image index, source stem, and synthetic generation score.
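
The way `y_true`, `score`, and `y_pred` relate can be sketched as follows. This is an illustrative example on toy data: `toy_clf` and the helper `positive_score` are hypothetical names, and looking up the probability column through `classes_` is one possible way to select the positive class, not necessarily the notebook's exact logic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy classifier standing in for the trained model (hypothetical data).
rng = np.random.default_rng(1)
X_toy = rng.normal(size=(60, 4))
y_toy = (X_toy[:, 0] + X_toy[:, 1] > 0).astype(int)
toy_clf = LogisticRegression().fit(X_toy, y_toy)


def y_true_from_group(group):
    """Map the metadata 'group' field to a binary label (1 = substantial)."""
    return 1 if str(group).strip().lower() == "substantial" else 0


def positive_score(clf, feat, positive_label=1):
    """Return P(class == positive_label) for a single feature vector."""
    X = np.asarray(feat, dtype=np.float32).reshape(1, -1)
    proba = clf.predict_proba(X)[0]
    # predict_proba columns follow the order of clf.classes_.
    col = list(clf.classes_).index(positive_label)
    return float(proba[col])


score = positive_score(toy_clf, X_toy[0])
y_pred = int(score >= 0.5)  # default decision threshold
print(score, y_pred, y_true_from_group("substantial"))
```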

To ensure correct alignment between metadata entries and cached feature vectors,
samples are matched using a unique triplet:

**(change type, source image stem, synthetic index)**
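
A minimal sketch of how such a triplet key could be derived from an "after" image path is shown below. The example path and the helper name `triplet_from_after_path` are hypothetical; the sketch assumes the change type is the name of the file's parent folder and that filenames look like `image_003_0_synthetic.jpg`.

```python
import re
from pathlib import PurePosixPath

# Filenames are assumed to look like: image_003_0_synthetic.jpg
PAT = re.compile(
    r"^(?P<stem>image_\d+)_(?P<idx>\d+)_synthetic\.(jpg|png|jpeg)$",
    re.IGNORECASE,
)


def triplet_from_after_path(after_path):
    """Return (change_type, stem, idx), or None if the path does not fit."""
    # Normalize Windows separators so the same code works on any OS.
    p = PurePosixPath(str(after_path).replace("\\", "/"))
    m = PAT.match(p.name)
    if m is None or len(p.parts) < 2:
        return None
    return (p.parent.name, m.group("stem"), int(m.group("idx")))


# Hypothetical path layout, for illustration only.
key = triplet_from_after_path(
    r".\images_after\substantial\remove_sink\image_003_0_synthetic.jpg"
)
print(key)  # ('remove_sink', 'image_003', 0)
```

Using a tuple as the dictionary key makes each lookup a constant-time operation and makes a mismatch between metadata and cache show up as an explicit miss rather than a silently misaligned row.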

The resulting file, `Results/eval_rows.csv`, serves as the **single source of truth**
for all downstream evaluation, including:

- Quantitative metrics (accuracy, precision, recall, F1)
- Confusion matrices
- Threshold analysis
- Performance visualizations


In [None]:
import json
import numpy as np
import pandas as pd
import re
import os
import joblib
from pathlib import Path

# =======================
# 1. PATHS & SETUP
# =======================
# Define root relative to "Code" folder
ROOT_DIR = Path("..")

META_PATH = ROOT_DIR / "Data" / "metadata.jsonl"
OUT_CSV   = ROOT_DIR / "Results" / "eval_rows.csv"

# Make sure Results folder exists
OUT_CSV.parent.mkdir(parents=True, exist_ok=True)

# =======================
# 2. LOAD CACHES (Auto-detect)
# =======================
# We need to find 'features_cache.pkl'. Checking both nested and flat structures.
possible_cache_paths = [
    ROOT_DIR / "Models" / "ChangeTypeClassifier" / "Change_Type_Classifier" / "features_cache.pkl",
    ROOT_DIR / "Models" / "ChangeTypeClassifier" / "features_cache.pkl"
]

feat_cache_path = None
for p in possible_cache_paths:
    if p.exists():
        feat_cache_path = p
        break

if feat_cache_path is None:
    # If not found, initialize empty dict to avoid crash (though results will be empty)
    print("Warning: 'features_cache.pkl' not found. Cache will be empty.")
    feat_cache = {}
else:
    print(f"Loading feature cache from: {feat_cache_path}")
    feat_cache = joblib.load(feat_cache_path)

# Ensure model_pack is loaded (from previous step, or reload if missing)
if 'model_pack' not in locals():
    # Try to find the model again if not in memory
    possible_model_paths = [
        ROOT_DIR / "Models" / "ChangeTypeClassifier" / "Change_Type_Classifier" / "change_classifier_logreg.pkl",
        ROOT_DIR / "Models" / "ChangeTypeClassifier" / "change_classifier_logreg.pkl"
    ]
    for p in possible_model_paths:
        if p.exists():
            print(f"Loading model from: {p}")
            model_pack = joblib.load(p)
            break

if 'model_pack' not in locals():
    raise FileNotFoundError("Model not found in memory or on disk. Please run the model loading cell first.")

# Use the real sklearn model
sk_clf = model_pack["clf"]

# =======================
# 3. PROCESSING
# =======================
def y_true_from_group(group: str) -> int:
    return 1 if str(group).strip().lower() == "substantial" else 0

# Build cache mapping by (change, stem, idx) for faster lookup
cache_by_triplet = {}
# Regex to parse filenames like: image_001_0_synthetic.jpg
pat = re.compile(r"^(?P<stem>image_\d+)_(?P<idx>\d+)_synthetic\.(jpg|png|jpeg)$", re.IGNORECASE)

print("Building cache index...")
for (before_p, after_p), feat in feat_cache.items():
    # Standardize path separators
    a = str(after_p).replace("\\", "/")
    parts = a.split("/")
    
    if len(parts) < 2:
        continue
        
    change_folder = parts[-2]
    fname = parts[-1]
    
    m = pat.match(fname)
    if not m:
        continue
        
    stem = m.group("stem")
    idx  = int(m.group("idx"))
    cache_by_triplet[(change_folder, stem, idx)] = (feat, before_p, after_p)

print(f"Indexed {len(cache_by_triplet)} items in cache.")

rows = []
missing = 0
matched = 0
total_ok = 0

if not META_PATH.exists():
    print(f"Error: Metadata file not found at {META_PATH}")
else:
    print(f"Reading metadata from: {META_PATH}")
    with META_PATH.open("r", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            rec = json.loads(line)
            if str(rec.get("status","")).lower() != "ok":
                continue

            total_ok += 1
            change = rec.get("change")
            stem   = rec.get("src_stem")
            # Handle source index safely
            try:
                idx = int(rec.get("src_index"))
            except (ValueError, TypeError):
                continue

            # Try to find this metadata record in our cache
            hit = cache_by_triplet.get((change, stem, idx))
            if hit is None:
                missing += 1
                continue

            feat, orig_b, orig_a = hit
            X = np.asarray(feat, dtype=np.float32).reshape(1, -1)

            # Predict
            # score = P(class=1)
            proba = sk_clf.predict_proba(X)[0]
            classes = list(getattr(sk_clf, "classes_", []))
            
            # Robustly get the probability for class '1' (Substantial)
            if classes == [0, 1] or classes == [0.0, 1.0]:
                score = float(proba[1])
            else:
                # Fallback: assume the last class is the positive one
                score = float(proba[-1])

            y_pred = int(score >= 0.5)

            group = rec.get("group")
            y_true = y_true_from_group(group)

            extra = rec.get("extra", {})
            synth_score = None
            if isinstance(extra, dict) and "score" in extra:
                try:
                    synth_score = float(extra["score"])
                except (ValueError, TypeError):
                    synth_score = None

            rows.append({
                "src_image": str(orig_b).replace("\\", "/"),
                "out_image": str(orig_a).replace("\\", "/"),
                "change_type": change,
                "group": group,
                "y_true": int(y_true),
                "y_pred": int(y_pred),
                "score": score,
                "synth_score": synth_score,
                "run_id": rec.get("run_id"),
                "src_index": idx,
                "src_stem": stem,
            })
            matched += 1

print("-" * 30)
print(f"Total OK records in metadata: {total_ok}")
print(f"Matched with cache: {matched}")
print(f"Missing from cache: {missing}")

if rows:
    df = pd.DataFrame(rows)
    df.to_csv(OUT_CSV, index=False)
    print(f"✅ Saved CSV to: {OUT_CSV}")
    print(df.head())
else:
    print("Warning: No rows were matched. CSV was not created.")

cache_by_triplet: 3264
OK records: 3264
Matched: 3264
Missing: 0
✅ Saved: c:\Users\adiha\Desktop\GenAi\Results\eval_rows.csv


| | src_image | out_image | change_type | group | y_true | y_pred | score | synth_score | run_id | src_index | src_stem |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ./synthetic_dataset/images_before/image_003.jpg | ./synthetic_dataset/images_after/substantial/r... | remove_stove | substantial | 1 | 1 | 0.614004 | 0.47749 | 20260113_143558 | 0 | image_003 |
| 1 | ./synthetic_dataset/images_before/image_003.jpg | ./synthetic_dataset/images_after/substantial/r... | remove_sink | substantial | 1 | 1 | 0.63273 | 0.794007 | 20260113_143558 | 0 | image_003 |
| 2 | ./synthetic_dataset/images_before/image_003.jpg | ./synthetic_dataset/images_after/substantial/r... | replace_sink2stove | substantial | 1 | 1 | 0.903136 | 0.794007 | 20260113_143558 | 0 | image_003 |
| 3 | ./synthetic_dataset/images_before/image_003.jpg | ./synthetic_dataset/images_after/non-substanti... | remove_1stdoor | non-substantial | 0 | 0 | 0.336904 | 0.986824 | 20260113_143558 | 0 | image_003 |
| 4 | ./synthetic_dataset/images_before/image_003.jpg | ./synthetic_dataset/images_after/non-substanti... | remove_2sdoor | non-substantial | 0 | 1 | 0.610585 | 0.955024 | 20260113_143558 | 0 | image_003 |


### Generating `Results/` and `Visuals/` from `eval_rows.csv`

This block takes the evaluation table (`Results/eval_rows.csv`) and produces the final
**numeric outputs** (saved under `Results/`) and **human-readable plots** (saved under `Visuals/`).

The script performs the following steps:

1. **Load evaluation data**  
   Reads `eval_rows.csv`, which contains `y_true`, `y_pred`, model confidence (`score`), and `change_type`.

2. **Overall metrics → `Results/val_metrics.json`**  
   Computes dataset-level performance:
   - accuracy, precision, recall, F1
   - full classification report (as JSON)

3. **Confusion matrix**
   - **Numeric matrix** saved to `Results/confusion_matrix.csv`
   - **Heatmap visualization** saved to `Visuals/confusion_matrix.png`

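4. **Per-change-type breakdown → `Results/per_change_type_results.csv`**  
   Computes accuracy/precision/recall/F1 **per synthetic change type**
   (e.g., `remove_sink`, `remove_closet`) to identify which changes are harder or easier.
   When all samples of a change type share one ground-truth label, binary
   precision/recall/F1 are degenerate (they are 0 for an all-negative type because
   `zero_division=0`), so accuracy is the most informative column in that case.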

5. **Class distribution plot → `Visuals/class_distribution.png`**  
   Visualizes how many samples exist per `change_type` to show dataset balance.

6. **Threshold sweep**
   - Sweeps 101 evenly spaced thresholds from 0.0 to 1.0 (step 0.01) over the classifier score (`score`),
     predicting class 1 whenever `score >= threshold`
   - Saves the full sweep table to `Results/threshold_sweep.csv`
   - Plots `threshold` vs `F1` to `Visuals/threshold_vs_f1.png`

7. **Error rate by confidence bins**
   - Groups samples by `score` into four bins: `0.0–0.4`, `0.4–0.6`, `0.6–0.8`, `0.8–1.0`
     (each bin includes its upper edge; the first bin also includes 0.0)
   - Saves summary table to `Results/error_rate_by_confidence.csv`
   - Saves bar plot to `Visuals/error_rate_by_confidence.png`
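
The threshold sweep in step 6 can be sketched on a handful of made-up scores. The arrays `y_true_toy` and `scores_toy` are invented for illustration, and picking the threshold with the highest F1 is an extra step shown here as one possible use of the sweep table, not something the notebook itself does.

```python
import numpy as np
from sklearn.metrics import f1_score

# Made-up labels and scores, for illustration only.
y_true_toy = np.array([0, 0, 0, 1, 1, 1])
scores_toy = np.array([0.105, 0.355, 0.625, 0.555, 0.705, 0.905])

# Same grid as the sweep: 101 thresholds from 0.0 to 1.0.
thresholds = np.linspace(0.0, 1.0, 101)

# Re-threshold the scores at each grid point and record F1.
f1_values = np.array([
    f1_score(y_true_toy, (scores_toy >= thr).astype(int), zero_division=0)
    for thr in thresholds
])

# np.argmax returns the first maximum, i.e. the lowest best threshold.
best = int(np.argmax(f1_values))
best_thr = float(thresholds[best])
print(f"best threshold: {best_thr:.2f}, F1: {f1_values[best]:.3f}")
```

On real data, a threshold chosen this way is tuned on the same samples it is evaluated on, so it is best treated as a diagnostic rather than a final operating point.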

After running this block, the repository contains a clean separation between:

- **Results/**: machine-readable evaluation outputs (CSV/JSON)
- **Visuals/**: figures and plots for reporting/presentations
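
For the error-rate-by-bin step, the edge behavior of `pd.cut` is easy to get wrong, so here is a small sketch on invented rows. The `toy` DataFrame is hypothetical; the bin edges and label format follow the ones described above.

```python
import pandas as pd

bins = [0.0, 0.4, 0.6, 0.8, 1.0]
labels = [f"{bins[i]:.1f}-{bins[i + 1]:.1f}" for i in range(len(bins) - 1)]

# Invented rows; y_pred corresponds to score >= 0.5.
toy = pd.DataFrame({
    "y_true": [0, 0, 1, 1, 1, 0],
    "y_pred": [0, 1, 1, 1, 1, 0],
    "score":  [0.0, 0.6, 0.61, 0.8, 1.0, 0.39],
})
toy["is_error"] = (toy["y_true"] != toy["y_pred"]).astype(int)

# Bins are right-closed: 0.6 lands in "0.4-0.6" and 0.8 in "0.6-0.8".
# include_lowest=True keeps a score of exactly 0.0 in the first bin.
toy["bin"] = pd.cut(toy["score"], bins=bins, labels=labels, include_lowest=True)

summary = (
    toy.groupby("bin", observed=False)
       .agg(n=("is_error", "size"), error_rate=("is_error", "mean"))
       .reset_index()
)
print(summary)
```

Because `score` is the probability of class 1, the lowest bin holds confident "non-substantial" predictions and the highest bin holds confident "substantial" ones; the middle bin is where the model is least certain.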


In [None]:
from pathlib import Path
import json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import os

from sklearn.metrics import (
    accuracy_score,
    precision_recall_fscore_support,
    classification_report,
    confusion_matrix,
)

# ---------------------------------------------------------
# Paths (Script runs from ROOT/Code)
# ---------------------------------------------------------
# Using relative path ".." to step out of the Code folder
ROOT_DIR = Path("..")

EVAL_CSV = ROOT_DIR / "Results" / "eval_rows.csv"
RESULTS_DIR = ROOT_DIR / "Results"
VISUALS_DIR = ROOT_DIR / "Visuals"

RESULTS_DIR.mkdir(exist_ok=True, parents=True)
VISUALS_DIR.mkdir(exist_ok=True, parents=True)

if not EVAL_CSV.exists():
    raise FileNotFoundError(f"Evaluation CSV not found at: {EVAL_CSV}")

df = pd.read_csv(EVAL_CSV)
print(f"Loaded: {EVAL_CSV} | Rows: {len(df)}")

# ---------------------------------------------------------
# 1) Overall metrics -> Results/val_metrics.json
# ---------------------------------------------------------
y_true = df["y_true"].astype(int).to_numpy()
y_pred = df["y_pred"].astype(int).to_numpy()

acc = float(accuracy_score(y_true, y_pred))
prec, rec, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="binary", zero_division=0)
report = classification_report(y_true, y_pred, output_dict=True, zero_division=0)

val_metrics = {
    "n": int(len(df)),
    "accuracy": acc,
    "precision": float(prec),
    "recall": float(rec),
    "f1": float(f1),
    "classification_report": report,
}

(RESULTS_DIR / "val_metrics.json").write_text(json.dumps(val_metrics, indent=2, ensure_ascii=False), encoding="utf-8")
print(f"Wrote: {RESULTS_DIR / 'val_metrics.json'}")

# ---------------------------------------------------------
# 2) Confusion matrix numeric -> Results/confusion_matrix.csv
#    + heatmap -> Visuals/confusion_matrix.png
# ---------------------------------------------------------
cm = confusion_matrix(y_true, y_pred, labels=[0,1])
cm_df = pd.DataFrame(cm, index=["true_0","true_1"], columns=["pred_0","pred_1"])
cm_df.to_csv(RESULTS_DIR / "confusion_matrix.csv", index=True)
print(f"Wrote: {RESULTS_DIR / 'confusion_matrix.csv'}")

fig = plt.figure(figsize=(5,4))
ax = plt.gca()
im = ax.imshow(cm)
ax.set_title("Confusion Matrix")
ax.set_xlabel("Predicted")
ax.set_ylabel("True")
ax.set_xticks([0,1]); ax.set_yticks([0,1])
ax.set_xticklabels(["0","1"]); ax.set_yticklabels(["0","1"])
for i in range(2):
    for j in range(2):
        ax.text(j, i, str(cm[i,j]), ha="center", va="center")
plt.tight_layout()
fig.savefig(VISUALS_DIR / "confusion_matrix.png", dpi=200)
plt.close(fig)
print(f"Wrote: {VISUALS_DIR / 'confusion_matrix.png'}")

# ---------------------------------------------------------
# 3) Per-change-type results -> Results/per_change_type_results.csv
# ---------------------------------------------------------
rows = []
for ct, g in df.groupby("change_type"):
    yt = g["y_true"].astype(int).to_numpy()
    yp = g["y_pred"].astype(int).to_numpy()
    a = float(accuracy_score(yt, yp))
    p, r, ff, _ = precision_recall_fscore_support(yt, yp, average="binary", zero_division=0)
    rows.append({
        "change_type": ct,
        "n": int(len(g)),
        "accuracy": a,
        "precision": float(p),
        "recall": float(r),
        "f1": float(ff),
    })

per_ct = pd.DataFrame(rows).sort_values("n", ascending=False)
per_ct.to_csv(RESULTS_DIR / "per_change_type_results.csv", index=False)
print(f"Wrote: {RESULTS_DIR / 'per_change_type_results.csv'}")

# ---------------------------------------------------------
# 4) Class distribution plot -> Visuals/class_distribution.png
# ---------------------------------------------------------
counts = df["change_type"].value_counts().sort_values(ascending=False)

fig = plt.figure(figsize=(9,4))
ax = plt.gca()
ax.bar(counts.index.astype(str), counts.values)
ax.set_title("Class Distribution (change_type)")
ax.set_xlabel("change_type")
ax.set_ylabel("count")
ax.tick_params(axis="x", rotation=45)
plt.tight_layout()
fig.savefig(VISUALS_DIR / "class_distribution.png", dpi=200)
plt.close(fig)
print(f"Wrote: {VISUALS_DIR / 'class_distribution.png'}")

# ---------------------------------------------------------
# 5) Threshold sweep -> Results/threshold_sweep.csv + Visuals/threshold_vs_f1.png
# ---------------------------------------------------------
scores = df["score"].astype(float).to_numpy()
thresholds = np.linspace(0.0, 1.0, 101)

sweep_rows = []
for thr in thresholds:
    yp_thr = (scores >= thr).astype(int)
    a = float(accuracy_score(y_true, yp_thr))
    p, r, ff, _ = precision_recall_fscore_support(y_true, yp_thr, average="binary", zero_division=0)
    sweep_rows.append({"threshold": float(thr), "accuracy": a, "precision": float(p), "recall": float(r), "f1": float(ff)})

sweep = pd.DataFrame(sweep_rows)
sweep.to_csv(RESULTS_DIR / "threshold_sweep.csv", index=False)
print(f"Wrote: {RESULTS_DIR / 'threshold_sweep.csv'}")

fig = plt.figure(figsize=(6,4))
ax = plt.gca()
ax.plot(sweep["threshold"], sweep["f1"])
ax.set_title("Threshold vs F1")
ax.set_xlabel("threshold")
ax.set_ylabel("f1")
plt.tight_layout()
fig.savefig(VISUALS_DIR / "threshold_vs_f1.png", dpi=200)
plt.close(fig)
print(f"Wrote: {VISUALS_DIR / 'threshold_vs_f1.png'}")

# ---------------------------------------------------------
# 6) Error rate by confidence bins -> Results/error_rate_by_confidence.csv + Visuals/error_rate_by_confidence.png
# ---------------------------------------------------------
bins = [0.0, 0.4, 0.6, 0.8, 1.0]
labels = [f"{bins[i]:.1f}-{bins[i+1]:.1f}" for i in range(len(bins)-1)]

tmp = df.copy()
tmp["is_error"] = (tmp["y_true"].astype(int) != tmp["y_pred"].astype(int)).astype(int)
tmp["bin"] = pd.cut(tmp["score"].astype(float), bins=bins, labels=labels, include_lowest=True)

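# observed=False keeps empty bins in the table and states the choice explicitly,
# which avoids the pandas FutureWarning about the changing default.
err = tmp.groupby("bin", dropna=False, observed=False).agg(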
    n=("is_error", "size"),
    error_rate=("is_error", "mean")
).reset_index()

err.to_csv(RESULTS_DIR / "error_rate_by_confidence.csv", index=False)
print(f"Wrote: {RESULTS_DIR / 'error_rate_by_confidence.csv'}")

fig = plt.figure(figsize=(6,4))
ax = plt.gca()
ax.bar(err["bin"].astype(str), err["error_rate"].fillna(0).values)
ax.set_title("Error Rate by Confidence Bin")
ax.set_xlabel("score bin")
ax.set_ylabel("error_rate")
plt.tight_layout()
fig.savefig(VISUALS_DIR / "error_rate_by_confidence.png", dpi=200)
plt.close(fig)
print(f"Wrote: {VISUALS_DIR / 'error_rate_by_confidence.png'}")

print("\nDONE. Results and Visuals created in Results/ and Visuals/")

Loaded: c:\Users\adiha\Desktop\GenAi\Results\eval_rows.csv rows: 3264
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Results\val_metrics.json
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Results\confusion_matrix.csv
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Visuals\confusion_matrix.png
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Results\per_change_type_results.csv
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Visuals\class_distribution.png
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Results\threshold_sweep.csv
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Visuals\threshold_vs_f1.png
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Results\error_rate_by_confidence.csv
✅ Wrote: c:\Users\adiha\Desktop\GenAi\Visuals\error_rate_by_confidence.png

🎉 DONE. Results and Visuals created in ROOT/Results and ROOT/Visuals


