# Telegram Message Risk Scoring: URL‑Masked Evaluation (TF‑IDF grid search, Link/URL tags removed)

This notebook implements and evaluates a URL-independent risk model for Telegram messages. Starting from a deduplicated subset of messages with MBFC-derived risk labels, we build three models: (1) a text model on TF-IDF features of the message text (with URLs stripped), (2) a style model on compact multi-hot vectors built from LLM-assigned tags (topic, rhetorical stance, calls to action, evidence), and (3) a stacked combination that feeds the predicted probabilities of the first two models into a meta-classifier. All three are ℓ₂-regularized logistic regressions. To avoid leakage across sources, we use a URL-masked split: the train+validation and test sets contain disjoint sets of normalized domains, and the split is picked from 50 random candidates so that both positive rates match the global positive rate as closely as possible.

The notebook then compares these models on a held‑out test set in terms of accuracy, macro‑F1, ROC‑AUC, and calibration, and inspects their behavior across different channel types (e.g., high‑credibility news vs. high‑risk fringe channels). In contrast to the v3 notebook, we perform an explicit grid search over TF‑IDF hyperparameters (n‑gram range and vocabulary size) so that the text‑only baseline is reasonably tuned when comparing against the style and combined models.
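
The split selection described above can be illustrated on toy data. The following is a minimal sketch, not the notebook's own code: it uses plain NumPy instead of `GroupShuffleSplit`, and the helper name `pick_balanced_group_split`, the placeholder domains, and the labels are all made up for illustration.

```python
import numpy as np


def pick_balanced_group_split(y, groups, n_trials=50, test_frac=0.2, seed=42):
    """Try several random group-level splits and keep the one whose train and
    test positive rates are closest to the global positive rate."""
    y = np.asarray(y)
    groups = np.asarray(groups)
    uniq = np.unique(groups)
    n_test_groups = max(1, int(round(test_frac * len(uniq))))
    p_global = y.mean()
    rng = np.random.default_rng(seed)

    best = None
    for _ in range(n_trials):
        test_groups = rng.choice(uniq, size=n_test_groups, replace=False)
        test_mask = np.isin(groups, test_groups)
        if test_mask.all():  # would leave no training rows
            continue
        score = max(
            abs(y[~test_mask].mean() - p_global),
            abs(y[test_mask].mean() - p_global),
        )
        if best is None or score < best[0]:
            best = (score, np.flatnonzero(~test_mask), np.flatnonzero(test_mask))
    return best


# Toy data: 10 hypothetical domains, 6 messages each, one label per domain.
groups = np.repeat([f"domain{i}.example" for i in range(10)], 6)
y = np.repeat([1, 0, 1, 0, 0, 1, 0, 1, 0, 0], 6)

score, train_idx, test_idx = pick_balanced_group_split(y, groups)
shared = set(groups[train_idx]) & set(groups[test_idx])
print({"balance_score": float(score), "shared_domains": len(shared)})
```

Because whole domains are assigned to one side, no domain can appear in both index sets; the balance score only decides which of the candidate splits is kept.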


In [None]:
# Imports and paths
import os
import pandas as pd
from pathlib import Path

# Path to the MBFC-linkable dataset used for Table 2 (domain-disjoint, URL-masked).
# Provide via env var, e.g. MBFC_DATA_PATH=/path/to/messages_with_risk_label_urls_removed_nonempty_no_linkurl_evidence.csv
DATA_PATH = Path(os.environ.get("MBFC_DATA_PATH", "data/messages_with_risk_label_urls_removed_nonempty_no_linkurl_evidence.csv"))
print({"data_path": str(DATA_PATH)})


In [None]:
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report, roc_auc_score, accuracy_score, f1_score, recall_score
from sklearn.model_selection import GroupShuffleSplit, train_test_split
from sklearn.preprocessing import MultiLabelBinarizer
from scipy import sparse
from scipy.special import expit


class ManualLogisticRegression:
    """Simple binary logistic regression implemented with gradient descent.

    Supports sparse or dense feature matrices and an interface similar to
    sklearn's LogisticRegression for the methods used in this notebook
    (fit, predict, predict_proba).
    """

    def __init__(
        self,
        lr: float = 0.1,
        max_iter: int = 200,
        C: float = 1.0,
        class_weight=None,
        tol: float = 1e-4,
        verbose: bool = False,
        n_jobs=None,  # kept for API compatibility; not used
    ):
        self.lr = lr
        self.max_iter = max_iter
        self.C = C
        self.class_weight = class_weight
        self.tol = tol
        self.verbose = verbose
        self.n_jobs = n_jobs

    def _prepare_X(self, X):
        if sparse.issparse(X):
            return X.tocsr()
        return np.asarray(X, dtype=float)

    def fit(self, X, y):
        X = self._prepare_X(X)
        y = np.asarray(y, dtype=float)
        n_samples, n_features = X.shape
        self.coef_ = np.zeros(n_features, dtype=float)
        self.intercept_ = 0.0
        # Track loss over iterations when tol is not None
        self.loss_history_ = []

        # Per-sample weights according to class_weight
        if self.class_weight is None:
            sample_weights = np.ones_like(y)
        elif self.class_weight == "balanced":
            classes, counts = np.unique(y, return_counts=True)
            n_classes = len(classes)
            class_weight_values = {
                cls: n_samples / (n_classes * count)
                for cls, count in zip(classes, counts)
            }
            sample_weights = np.array([class_weight_values[yi] for yi in y], dtype=float)
        elif isinstance(self.class_weight, dict):
            sample_weights = np.array([
                self.class_weight.get(yi, 1.0) for yi in y
            ], dtype=float)
        else:
            raise ValueError("Unsupported class_weight specification")

        prev_loss = None
        for i in range(self.max_iter):
            z = X.dot(self.coef_) + self.intercept_
            p = expit(z)
            residual = (p - y) * sample_weights

            # The same expression works for both sparse and dense X.
            grad_w = X.T.dot(residual) / n_samples
            # L2 regularization on weights (not bias)
            grad_w += self.coef_ / (self.C * n_samples)
            grad_b = residual.mean()

            self.coef_ -= self.lr * grad_w
            self.intercept_ -= self.lr * grad_b

            if self.tol is not None and (i % 10 == 0 or i == self.max_iter - 1):
                z = X.dot(self.coef_) + self.intercept_
                p = expit(z)
                eps = 1e-15
                loss_vec = (
                    -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
                ) * sample_weights
                loss = loss_vec.mean() + 0.5 * np.sum(self.coef_ ** 2) / (self.C * n_samples)
                # Store loss for optional diagnostics
                self.loss_history_.append(float(loss))
                if self.verbose:
                    print(f"Iter {i}: loss={loss:.6f}")
                if prev_loss is not None and abs(prev_loss - loss) < self.tol:
                    break
                prev_loss = loss

        self.classes_ = np.array([0.0, 1.0])
        return self

    def _decision_function(self, X):
        X = self._prepare_X(X)
        return X.dot(self.coef_) + self.intercept_

    def predict_proba(self, X):
        z = self._decision_function(X)
        p_pos = expit(z)
        return np.vstack([1 - p_pos, p_pos]).T

    def predict(self, X):
        proba = self.predict_proba(X)[:, 1]
        return (proba >= 0.5).astype(int)

print({"data_path": str(DATA_PATH)})
df = pd.read_csv(DATA_PATH)

# Fix legacy header quirk in the v2 MBFC CSV where the first column name was mangled.
if "source" not in df.columns:
    first_col = df.columns[0]
    df = df.rename(columns={first_col: "source"})

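# Strip URLs from message text at load time so the pipeline never sees raw URLs.
# Each alternative consumes the whole URL up to the next whitespace; matching only
# the scheme ("https://") would leave the host and path in the text.
df["message"] = df["message"].astype(str).str.replace(
    r"(?:https?://|www\.|t\.me/)\S*", " ", regex=True
).str.strip()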
# Drop rows whose message becomes empty after URL stripping.
df = df[df["message"] != ""].copy()

print({"raw_rows": len(df)})

# Use MBFC-derived binary risk label as y (1 = higher-risk / lower-credibility)
df = df.dropna(subset=["risk_label"]).copy()
df["y"] = df["risk_label"].astype(int)
print({
    "rows_with_label": len(df),
    "high_risk": int(df["y"].sum()),
    "high_cred": int((1 - df["y"]).sum()),
})

groups = df["normalized_domain"].astype(str).values
y = df["y"].values
p_global = y.mean()
print({"global_pos_rate": float(p_global)})

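# 1) URL-masked train+val vs test. GroupShuffleSplit applies test_size=0.2 to the
#    groups (domains), so the test share of rows is only approximately 20%.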
gss_outer = GroupShuffleSplit(n_splits=50, test_size=0.2, random_state=42)
best_score_outer = None
best_outer = None

for split_id, (trainval_idx, test_idx) in enumerate(gss_outer.split(df, y, groups)):
    y_trainval = y[trainval_idx]
    y_test = y[test_idx]
    p_trainval = y_trainval.mean()
    p_test = y_test.mean()
    score = max(abs(p_trainval - p_global), abs(p_test - p_global))
    if best_score_outer is None or score < best_score_outer:
        best_score_outer = score
        best_outer = (trainval_idx, test_idx, p_trainval, p_test, split_id)

trainval_idx, test_idx, p_trainval, p_test, outer_id = best_outer
df_trainval = df.iloc[trainval_idx].copy()
df_test = df.iloc[test_idx].copy()

# 2) Random train vs val within trainval (~70/10 from 80), stratified by label.
#    This inner split is row-level, so train and val may share domains.
df_train, df_val = train_test_split(
    df_trainval,
    test_size=0.125,
    random_state=43,
    stratify=df_trainval["y"],
)
p_train = float(df_train["y"].mean())
p_val = float(df_val["y"].mean())

print({
    "global_pos_rate": float(p_global),
    "outer_split_id": int(outer_id),
    "outer_balance_score": float(best_score_outer),
    "inner_split_id": 0,
    "inner_balance_score": float(max(abs(p_train - p_global), abs(p_val - p_global))),
    "train_rows": len(df_train),
    "val_rows": len(df_val),
    "test_rows": len(df_test),
    "train_pos_rate": float(p_train),
    "val_pos_rate": float(p_val),
    "test_pos_rate": float(df_test["y"].mean()),
    "train_label_counts": df_train["y"].value_counts().to_dict(),
    "val_label_counts": df_val["y"].value_counts().to_dict(),
    "test_label_counts": df_test["y"].value_counts().to_dict(),
})


THRESH_GRID = [round(t, 2) for t in np.linspace(0.05, 0.95, 19)]

def sweep_thresholds(y_true, proba, grid=THRESH_GRID):
    best = None
    for t in grid:
        pred = (proba >= t).astype(int)
        macro_f1 = f1_score(y_true, pred, average='macro')
        macro_recall = recall_score(y_true, pred, average='macro')
        rec_pos = recall_score(y_true, pred, pos_label=1)
        candidate = {
            'threshold': float(t),
            'macro_f1': macro_f1,
            'macro_recall': macro_recall,
            'recall_pos': rec_pos,
        }
        if best is None or candidate['macro_f1'] > best['macro_f1']:
            best = candidate
    return best
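
The update step in `ManualLogisticRegression.fit` should be the gradient of the class-weighted cross-entropy plus the L2 term that the same method logs as its loss. The sketch below checks that correspondence numerically with central finite differences. It is an illustration on synthetic data; the helper names `weighted_loss` and `analytic_grad` are introduced here and are not part of the notebook.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def weighted_loss(w, b, X, y, sw, C):
    """Mean weighted cross-entropy plus the L2 term ||w||^2 / (2 * C * n)."""
    n = X.shape[0]
    p = sigmoid(X @ w + b)
    eps = 1e-15
    ce = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    return (ce * sw).mean() + 0.5 * np.sum(w ** 2) / (C * n)


def analytic_grad(w, b, X, y, sw, C):
    """Gradient written in the same form as the update used in fit()."""
    n = X.shape[0]
    residual = (sigmoid(X @ w + b) - y) * sw
    grad_w = X.T @ residual / n + w / (C * n)
    grad_b = residual.mean()
    return grad_w, grad_b


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = np.array([1.0] * 12 + [0.0] * 28)  # synthetic, imbalanced labels

# "balanced" weights: n_samples / (n_classes * class_count)
counts = {c: (y == c).sum() for c in (0.0, 1.0)}
sw = np.array([len(y) / (2 * counts[yi]) for yi in y])
w, b, C = rng.normal(size=3), 0.1, 1.0

grad_w, grad_b = analytic_grad(w, b, X, y, sw, C)

# Central finite differences of the loss, one coordinate at a time.
h = 1e-6
num_grad_w = np.array([
    (weighted_loss(w + h * e, b, X, y, sw, C)
     - weighted_loss(w - h * e, b, X, y, sw, C)) / (2 * h)
    for e in np.eye(3)
])
num_grad_b = (
    weighted_loss(w, b + h, X, y, sw, C) - weighted_loss(w, b - h, X, y, sw, C)
) / (2 * h)

max_abs_diff = max(np.abs(grad_w - num_grad_w).max(), abs(grad_b - num_grad_b))
print({"max_abs_diff": float(max_abs_diff)})
```

A small difference here indicates that the logged loss and the gradient step describe the same objective, including the fact that the intercept is not regularized.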

In [None]:
## TF-IDF text only with grid search over n-grams and max_features
ngram_grid = [(1, 1), (1, 2), (1, 3)]
max_features_grid = [20000, 50000, 100000]
candidate_lrs = [0.0003, 0.001, 0.003]

best = None

for ngram_range in ngram_grid:
    for max_features in max_features_grid:
        vectorizer = TfidfVectorizer(
            ngram_range=ngram_range,
            max_features=max_features,
            min_df=2,
            strip_accents="unicode",
        )
        X_train_text = vectorizer.fit_transform(df_train['message'].astype(str))
        X_val_text = vectorizer.transform(df_val['message'].astype(str))
        X_test_text = vectorizer.transform(df_test['message'].astype(str))

        for lr in candidate_lrs:
            clf = ManualLogisticRegression(
                max_iter=1000,
                lr=lr,
                C=1.0,
                class_weight="balanced",
                tol=None,
                n_jobs=-1,
            )
            clf.fit(X_train_text, df_train['y'])
            val_proba = clf.predict_proba(X_val_text)[:, 1]
            best_thr = sweep_thresholds(df_val['y'].values, val_proba)
            candidate = {
                'ngram_range': ngram_range,
                'max_features': max_features,
                'lr': lr,
                **best_thr,
            }
            if best is None or candidate['macro_f1'] > best['macro_f1']:
                best = candidate
                best_vectorizer = vectorizer
                best_X_train_text = X_train_text
                best_X_val_text = X_val_text
                best_X_test_text = X_test_text

# Refit best text model on train only for stacking features
X_train_text = best_X_train_text
X_val_text = best_X_val_text
X_test_text = best_X_test_text

clf_text_val = ManualLogisticRegression(
    max_iter=1000,
    lr=best['lr'],
    C=1.0,
    class_weight="balanced",
    tol=None,
    n_jobs=-1,
)
clf_text_val.fit(X_train_text, df_train['y'])
val_proba_text = clf_text_val.predict_proba(X_val_text)[:, 1]

# Final text model on train+val for test predictions
X_trainval_text = sparse.vstack([X_train_text, X_val_text])
y_trainval = pd.concat([df_train['y'], df_val['y']]).values
clf_text = ManualLogisticRegression(
    max_iter=1000,
    lr=best['lr'],
    C=1.0,
    class_weight="balanced",
    tol=None,
    n_jobs=-1,
)
clf_text.fit(X_trainval_text, y_trainval)
y_proba_text = clf_text.predict_proba(X_test_text)[:, 1]
y_pred_text = (y_proba_text >= best['threshold']).astype(int)

print({
    'best_ngram_range': best['ngram_range'],
    'best_max_features': int(best['max_features']),
    'best_lr_text': best['lr'],
    'val_macro_f1': float(best['macro_f1']),
    'val_threshold': best['threshold'],
})
print('TF-IDF only ROC AUC (test):', roc_auc_score(df_test['y'], y_proba_text))
print('TF-IDF only classification report (test, val-tuned threshold):')
print(classification_report(df_test['y'], y_pred_text))
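
Since the TF-IDF model is fit directly on `df['message']`, any URL fragment that survives preprocessing can end up in the vocabulary. The sketch below contrasts a pattern that matches only the scheme with one that consumes the whole URL. The example message, its placeholder domain, and the helper `strip_urls` are made up for illustration; unlike the notebook's load step, the helper also collapses repeated whitespace.

```python
import re

# Matches only the scheme ("http://" or "https://").
SCHEME_ONLY = re.compile(r"https?://")
# Consumes the whole URL up to the next whitespace.
WHOLE_URL = re.compile(r"(?:https?://|www\.|t\.me/)\S*")


def strip_urls(text, pattern):
    """Replace pattern matches with a space and collapse whitespace."""
    return " ".join(pattern.sub(" ", text).split())


# Made-up example message; the domain and channel name are placeholders.
msg = "Read more at https://example.com/story?id=7 or t.me/somechannel today"

scheme_only_result = strip_urls(msg, SCHEME_ONLY)
whole_url_result = strip_urls(msg, WHOLE_URL)
print(scheme_only_result)
print(whole_url_result)
```

With the scheme-only pattern the host and path remain as tokens, which would let a text model recover the source domain; the whole-URL pattern leaves only the surrounding words.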

In [None]:
## Style-only features (Qwen labels, normalized)

from typing import List, Optional


# v6: remove Qwen 'Link/URL' labels from style features
DROP_LINK_URL_LABEL = True
_LINK_URL_LABEL_NORM = 'link/url'

def tokenize_multi(value: str) -> List[str]:
    """Split a comma- or plus-separated label string into atomic pieces."""
    if not isinstance(value, str):
        return []
    value = value.replace("+", ",")
    parts = [part.strip() for part in value.split(",") if part.strip()]
    if not DROP_LINK_URL_LABEL:
        return parts
    # Normalize by lowercasing and removing whitespace so 'Link / URL' matches too.
    return [p for p in parts if ''.join(p.lower().split()) != _LINK_URL_LABEL_NORM]


# Canonical buckets used for style features (should correspond to the
# dimensions shown in the style diagram: compact theme, claim, CTA, evidence).
THEME_BUCKETS = [
    "Finance/Crypto",
    "Public health & medicine",
    "Politics",
    "Lifestyle & well-being",
    "Crime & public safety",
    "Gaming/Gambling",
    "News/Information",
    "Sports",
    "Technology",
    "Conversation/Chat/Other",
    "Other theme",
]

CLAIM_BUCKETS = [
    "Verifiable factual statement",
    "Rumour / unverified report",
    "Announcement",
    "Opinion / subjective statement",
    "Misleading context / cherry-picking",
    "Promotional hype / exaggerated profit guarantee",
    "Emotional appeal / fear-mongering",
    "Scarcity/FOMO tactic",
    "Statistics",
    "Other claim type",
    "No substantive claim",
    "Fake content",
    "Speculative forecast / prediction",
    "None / assertion only",
]

CTA_BUCKETS = [
    "Visit external link / watch video",
    "Engage/Ask questions",
    "Join/Subscribe",
    "Buy / invest / donate",
    "Attend event / livestream",
    "Share / repost / like",
    "No CTA",
    "Other CTA",
]

EVID_BUCKETS = [
    "Link/URL",
    "Statistics",
    "Quotes/Testimony",
    "Chart / price graph / TA diagram",
    "Other (Evidence)",
    "None / assertion only",
]


def _norm_theme(raw: object) -> Optional[str]:
    if not isinstance(raw, str):
        return None
    t = raw.strip()
    if not t:
        return None
    tl = t.lower()

    # Direct match to canonical buckets
    if t in THEME_BUCKETS:
        return t

    # Finance / crypto / markets
    if any(k in tl for k in [
        "crypto",
        "token",
        "coin",
        "airdrop",
        "ido",
        "staking",
        "defi",
        "exchange",
        "market",
        "finance",
        "econom",
    ]):
        return "Finance/Crypto"

    # Health / medicine / public health
    if any(k in tl for k in [
        "health",
        "covid",
        "vaccine",
        "vaccination",
        "medicine",
        "medical",
        "clinical",
        "disease",
        "pandemic",
        "public health",
        "hospital",
    ]):
        return "Public health & medicine"

    # Politics, policy, war, elections
    if any(k in tl for k in [
        "politic",
        "election",
        "parliament",
        "congress",
        "senate",
        "government",
        "president",
        "minister",
        "policy",
        "war",
        "conflict",
        "ukraine",
        "russia",
    ]):
        return "Politics"

    # Crime / public safety
    if any(k in tl for k in [
        "crime",
        "criminal",
        "terror",
        "shooting",
        "police",
        "public safety",
        "fraud",
        "scam",
    ]):
        return "Crime & public safety"

    # Gaming / gambling
    if any(k in tl for k in [
        "gaming",
        "gambling",
        "casino",
        "betting",
        "lottery",
        "poker",
    ]):
        return "Gaming/Gambling"

    # Sports
    if any(k in tl for k in ["sport", "football", "soccer", "basketball", "tennis", "nba", "nfl"]):
        return "Sports"

    # Technology / science
    if any(k in tl for k in [
        "technology",
        "tech",
        "software",
        "app ",
        "platform",
        "ai ",
        " a.i.",
        "machine learning",
        "blockchain",
        "internet",
        "social media",
        "algorithm",
        "science",
        "research",
        "study",
    ]):
        return "Technology"

    # Lifestyle / culture / society
    if any(k in tl for k in [
        "lifestyle",
        "well-being",
        "wellbeing",
        "culture",
        "entertainment",
        "media",
        "celebrity",
        "social issues",
        "society",
        "family",
        "community",
    ]):
        return "Lifestyle & well-being"

    # General news / updates
    if any(k in tl for k in ["news", "headline", "breaking", "coverage", "roundup", "update"]):
        return "News/Information"

    # Conversational / chat-style content
    if any(k in tl for k in ["comment", "conversation", "chat", "q&a", "ama", "ask me anything"]):
        return "Conversation/Chat/Other"

    return "Other theme"


def _norm_claim_labels(raw: object) -> List[str]:
    labels = tokenize_multi(raw)
    out: List[str] = []
    for lbl in labels:
        base = lbl.strip()
        if not base:
            continue
        low = base.lower()

        if base in CLAIM_BUCKETS:
            out.append(base)
            continue

        if "no substantive claim" in low:
            out.append("No substantive claim")
        elif "verifiable factual statement" in low or "reportage" in low or "study" in low:
            out.append("Verifiable factual statement")
        elif "rumour" in low or "rumor" in low or "unverified" in low:
            out.append("Rumour / unverified report")
        elif "misleading context" in low or "cherry-picking" in low:
            out.append("Misleading context / cherry-picking")
        elif "promotional hype" in low or "exaggerated profit" in low:
            out.append("Promotional hype / exaggerated profit guarantee")
        elif "emotional appeal" in low or "fear-mongering" in low or "fear mongering" in low:
            out.append("Emotional appeal / fear-mongering")
        elif "scarcity" in low or "fomo" in low:
            out.append("Scarcity/FOMO tactic")
        elif "statistic" in low:
            out.append("Statistics")
        elif "fake content" in low or "fabricated" in low:
            out.append("Fake content")
        elif "predict" in low or "forecast" in low:
            out.append("Speculative forecast / prediction")
        elif "announcement" in low:
            out.append("Announcement")
        elif "opinion" in low or "interpretive" in low or "analysis" in low or "review" in low:
            out.append("Opinion / subjective statement")
        elif "none / assertion only" in low or "assertion only" in low:
            out.append("None / assertion only")
        else:
            out.append("Other claim type")

    # Deduplicate while preserving order
    seen = set()
    result: List[str] = []
    for v in out:
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result


def _norm_cta_labels(raw: object) -> List[str]:
    labels = tokenize_multi(raw)
    out: List[str] = []
    for lbl in labels:
        base = lbl.strip()
        if not base:
            continue
        low = base.lower()

        if base in CTA_BUCKETS:
            out.append(base)
            continue

        if base in {"None", "No CTA"} or "no cta" in low:
            out.append("No CTA")
        elif "engage" in low or "ask" in low or "anything" in low:
            out.append("Engage/Ask questions")
        elif "attend" in low or "event" in low or "livestream" in low or "live stream" in low:
            out.append("Attend event / livestream")
        elif "join" in low or "subscribe" in low or "follow" in low or "whitelist" in low:
            out.append("Join/Subscribe")
        elif "buy" in low or "invest" in low or "donate" in low or "stake" in low or "swap" in low:
            out.append("Buy / invest / donate")
        elif "share" in low or "repost" in low or "like" in low:
            out.append("Share / repost / like")
        elif "visit" in low or "read" in low or "watch" in low or "link" in low or "website" in low or "check" in low or "view charts" in low:
            out.append("Visit external link / watch video")
        else:
            out.append("Other CTA")

    seen = set()
    result: List[str] = []
    for v in out:
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result


def _norm_evidence_labels(raw: object) -> List[str]:
    labels = tokenize_multi(raw)
    out: List[str] = []
    for lbl in labels:
        base = lbl.strip()
        if not base:
            continue
        low = base.lower()

        if base in EVID_BUCKETS:
            if base != "Link/URL":
                out.append(base)
            continue

        if "link/url" in low or "link" in low or "url" in low:
            # v6: drop Link/URL evidence tag entirely
            continue
        elif "statistic" in low:
            out.append("Statistics")
        elif "quote" in low or "testimony" in low:
            out.append("Quotes/Testimony")
        elif "chart" in low or "graph" in low or "diagram" in low:
            out.append("Chart / price graph / TA diagram")
        elif "none / assertion only" in low or "assertion only" in low:
            out.append("None / assertion only")
        else:
            out.append("Other (Evidence)")

    seen = set()
    result: List[str] = []
    for v in out:
        if v not in seen:
            seen.add(v)
            result.append(v)
    return result


def build_style_tokens(row) -> list:
    tokens: List[str] = []

    theme = _norm_theme(row.get("theme"))
    if theme is not None:
        tokens.append(f"theme={theme}")

    for label in _norm_claim_labels(row.get("claim_types")):
        tokens.append(f"claim={label}")

    for label in _norm_cta_labels(row.get("ctas")):
        tokens.append(f"cta={label}")

    for label in _norm_evidence_labels(row.get("evidence")):
        tokens.append(f"evid={label}")

    return tokens


df_train['style_tokens'] = df_train.apply(build_style_tokens, axis=1)
df_val['style_tokens'] = df_val.apply(build_style_tokens, axis=1)
df_test['style_tokens'] = df_test.apply(build_style_tokens, axis=1)
print({'example_style_tokens': df_train['style_tokens'].iloc[0][:40]})

mlb = MultiLabelBinarizer()
X_train_style = mlb.fit_transform(df_train['style_tokens'])
X_val_style = mlb.transform(df_val['style_tokens'])
X_test_style = mlb.transform(df_test['style_tokens'])
print({'style_train_shape': X_train_style.shape, 'style_val_shape': X_val_style.shape, 'style_test_shape': X_test_style.shape, 'style_vocab_size': len(mlb.classes_)})
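
The theme normalizer relies on substring tests of the form `any(k in tl for k in [...])`, which also fire when a keyword occurs inside a longer, unrelated word. The sketch below shows two such cases and a stricter variant that requires the keyword to start at a word boundary. The helper `matches_word_start` and the example theme strings are hypothetical; whether this matters in practice depends on the theme strings the tagger actually produces.

```python
import re


def matches_substring(text, keywords):
    """Plain substring test, as in `any(k in tl for k in [...])`."""
    tl = text.lower()
    return [k for k in keywords if k in tl]


def matches_word_start(text, keywords):
    """Hypothetical stricter variant: the keyword must start at a word boundary.

    Stems such as "politic" still match "politics".
    """
    tl = text.lower()
    return [k for k in keywords if re.search(r"\b" + re.escape(k.strip()), tl)]


finance_keys = ["crypto", "token", "coin", "ido", "market"]
politics_keys = ["politic", "election", "war", "policy"]

# "ido" occurs inside "corridor", "war" inside "software".
print(matches_substring("Humanitarian corridor", finance_keys))
print(matches_substring("Software updates", politics_keys))

# The word-start variant ignores both, but still finds real matches.
print(matches_word_start("Humanitarian corridor", finance_keys))
print(matches_word_start("Software updates", politics_keys))
print(matches_word_start("War in Ukraine", politics_keys))
```

A word-start test is not a complete fix: it would still match words such as "warning" for the key "war", so very short keys may need a full-word pattern.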


In [None]:
candidate_lrs = [0.0003, 0.001, 0.003, 0.01, 0.03]
best = None

for lr in candidate_lrs:
    clf = ManualLogisticRegression(
        max_iter=3000,
        lr=lr,
        C=1.0,
        class_weight="balanced",
        tol=None,
        n_jobs=-1,
    )
    clf.fit(X_train_style, df_train['y'])
    val_proba = clf.predict_proba(X_val_style)[:, 1]
    best_thr = sweep_thresholds(df_val['y'].values, val_proba)
    candidate = {'lr': lr, **best_thr}
    if best is None or candidate['macro_f1'] > best['macro_f1']:
        best = candidate

# Refit best style model on train only for stacking features
clf_style_val = ManualLogisticRegression(
    max_iter=3000,
    lr=best['lr'],
    C=1.0,
    class_weight="balanced",
    tol=None,
    n_jobs=-1,
)
clf_style_val.fit(X_train_style, df_train['y'])
val_proba_style = clf_style_val.predict_proba(X_val_style)[:, 1]

# Final style model on train+val for test predictions
X_trainval_style = sparse.vstack([X_train_style, X_val_style])
y_trainval = pd.concat([df_train['y'], df_val['y']]).values
clf_style = ManualLogisticRegression(
    max_iter=3000,
    lr=best['lr'],
    C=1.0,
    class_weight="balanced",
    tol=None,
    n_jobs=-1,
)
clf_style.fit(X_trainval_style, y_trainval)
y_proba_style = clf_style.predict_proba(X_test_style)[:, 1]
y_pred_style = (y_proba_style >= best['threshold']).astype(int)

print({
    'best_lr_style': best['lr'],
    'val_macro_f1': best['macro_f1'],
    'val_threshold': best['threshold'],
})
print('Style-only ROC AUC (test):', roc_auc_score(df_test['y'], y_proba_style))
print('Style-only classification report (test, val-tuned threshold):')
print(classification_report(df_test['y'], y_pred_style))
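
Each model above is scored on the test set with a decision threshold that was tuned on validation data and then reused unchanged. The following is a minimal NumPy sketch of that protocol on made-up probabilities; `macro_f1` and `best_threshold` are illustrative helpers, not the notebook's `sweep_thresholds`.

```python
import numpy as np


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores for labels 0 and 1."""
    scores = []
    for cls in (0, 1):
        tp = np.sum((y_pred == cls) & (y_true == cls))
        fp = np.sum((y_pred == cls) & (y_true != cls))
        fn = np.sum((y_pred != cls) & (y_true == cls))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))


def best_threshold(y_true, proba, grid):
    """Return (threshold, macro_f1) with the highest macro-F1.

    Ties keep the lowest threshold, because max() returns the first maximum.
    """
    results = [(float(t), macro_f1(y_true, (proba >= t).astype(int))) for t in grid]
    return max(results, key=lambda r: r[1])


grid = np.round(np.linspace(0.05, 0.95, 19), 2)

# Made-up validation labels and probabilities.
y_val = np.array([0, 0, 0, 0, 1, 1, 1])
p_val = np.array([0.12, 0.22, 0.31, 0.47, 0.41, 0.72, 0.91])
thr, val_f1 = best_threshold(y_val, p_val, grid)

# The threshold is then applied as-is to (made-up) test probabilities.
y_test = np.array([0, 0, 1, 1])
p_test = np.array([0.20, 0.60, 0.30, 0.80])
test_f1 = macro_f1(y_test, (p_test >= thr).astype(int))

print({"threshold": thr, "val_macro_f1": val_f1, "test_macro_f1": test_f1})
```

The validation score is the maximum over the grid and is therefore optimistic; only the test score, computed with the frozen threshold, is an out-of-sample estimate.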


In [None]:
## Combined TF-IDF + style features (stacked ensemble)
from sklearn.linear_model import LogisticRegression

# Build meta-features from validation and test probabilities
meta_X_val = np.stack([
    val_proba_text,
    val_proba_style,
    val_proba_text * val_proba_style,
], axis=1)
meta_X_test = np.stack([
    y_proba_text,
    y_proba_style,
    y_proba_text * y_proba_style,
], axis=1)

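# Meta-model trained on validation predictions only.
# The threshold below is swept on the same validation rows the meta-model was
# fit on, so the reported val_macro_f1 is an in-sample (optimistic) estimate.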
meta_clf = LogisticRegression(
    penalty='l2',
    C=1.0,
    solver='lbfgs',
    max_iter=1000,
)
meta_clf.fit(meta_X_val, df_val['y'].values)

meta_val_proba = meta_clf.predict_proba(meta_X_val)[:, 1]
best_thr_meta = sweep_thresholds(df_val['y'].values, meta_val_proba)

y_proba_combined = meta_clf.predict_proba(meta_X_test)[:, 1]
y_pred_combined = (y_proba_combined >= best_thr_meta['threshold']).astype(int)

print({
    'meta_model': 'stacked_logreg',
    'val_macro_f1': best_thr_meta['macro_f1'],
    'val_threshold': best_thr_meta['threshold'],
})
print('TF-IDF + style ROC AUC (test):', roc_auc_score(df_test['y'], y_proba_combined))
print('TF-IDF + style classification report (test, val-tuned threshold):')
print(classification_report(df_test['y'], y_pred_combined))
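
The meta-features used by the stacked model are just three columns per message: the two base-model probabilities and their product. A small sketch with made-up probabilities shows the resulting layout; `build_meta_features` is an illustrative helper name.

```python
import numpy as np


def build_meta_features(p_text, p_style):
    """Stack the two base-model probabilities and their product column-wise."""
    p_text = np.asarray(p_text, dtype=float)
    p_style = np.asarray(p_style, dtype=float)
    return np.stack([p_text, p_style, p_text * p_style], axis=1)


# Made-up probabilities for four messages.
p_text = [0.9, 0.2, 0.6, 0.5]
p_style = [0.8, 0.1, 0.3, 0.5]

meta_X = build_meta_features(p_text, p_style)
print(meta_X.shape)  # one row per message, three columns
print(meta_X)
```

The product column is large only when both base models assign a high risk probability, which gives the linear meta-model a simple way to reward agreement.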

In [None]:
## Save artifacts: inputs, predictions, metrics, and graph
results_dir = Path("mbfc_url_masked_logreg_results_v6")
results_dir.mkdir(exist_ok=True)

# Save the subset of the input data actually used (messages with a risk label)
input_path = results_dir / "messages_with_risk_label.csv"
df.to_csv(input_path, index=False)
print({"input_csv": str(input_path)})

# Save test-set predictions for all three models
y_true = df_test["y"].values
preds_df = df_test[["source", "msg_id", "channel", "risk_label", "y"]].copy()
preds_df.rename(columns={"y": "y_true"}, inplace=True)
preds_df["tfidf_pred"] = y_pred_text
preds_df["tfidf_proba"] = y_proba_text
preds_df["style_pred"] = y_pred_style
preds_df["style_proba"] = y_proba_style
preds_df["combined_pred"] = y_pred_combined
preds_df["combined_proba"] = y_proba_combined

preds_path = results_dir / "test_predictions_all_models.csv"
preds_df.to_csv(preds_path, index=False)
print({"predictions_csv": str(preds_path)})

# Build a tidy metrics table for all models
metrics_rows = []
models = [
    ("tfidf", y_pred_text, y_proba_text),
    ("style", y_pred_style, y_proba_style),
    ("combined", y_pred_combined, y_proba_combined),
]

for name, y_pred, y_proba in models:
    report = classification_report(y_true, y_pred, output_dict=True)
    roc = roc_auc_score(y_true, y_proba)
    acc = report.get("accuracy", accuracy_score(y_true, y_pred))

    metrics_rows.append({
        "model": name,
        "label": "overall",
        "metric": "roc_auc",
        "value": roc,
    })
    metrics_rows.append({
        "model": name,
        "label": "overall",
        "metric": "accuracy",
        "value": acc,
    })

    for label_key in ["0", "1", "macro avg", "weighted avg"]:
        if label_key not in report:
            continue
        stats = report[label_key]
        for metric_name in ["precision", "recall", "f1-score", "support"]:
            metrics_rows.append({
                "model": name,
                "label": label_key,
                "metric": metric_name,
                "value": stats[metric_name],
            })

metrics_df = pd.DataFrame(metrics_rows)
metrics_path = results_dir / "metrics_summary.csv"
metrics_df.to_csv(metrics_path, index=False)
print({"metrics_csv": str(metrics_path)})

# Make a simple bar chart for accuracy and ROC AUC per model
model_order = ["tfidf", "style", "combined"]
labels = ["TF-IDF", "Style", "TF-IDF+Style"]
accs = [
    float(metrics_df[(metrics_df["model"] == m) & (metrics_df["metric"] == "accuracy")]["value"].iloc[0])
    for m in model_order
]
aucs = [
    float(metrics_df[(metrics_df["model"] == m) & (metrics_df["metric"] == "roc_auc")]["value"].iloc[0])
    for m in model_order
]

x = np.arange(len(labels))
width = 0.35
fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(x - width / 2, accs, width, label="Accuracy")
ax.bar(x + width / 2, aucs, width, label="ROC AUC")
ax.set_xticks(x)
ax.set_xticklabels(labels, rotation=15)
ax.set_ylim(0.0, 1.05)
ax.set_ylabel("Score")
ax.set_title("MBFC URL-masked logistic regression performance")
ax.legend()
fig.tight_layout()

fig_path = results_dir / "mbfc_logreg_metrics.png"
fig.savefig(fig_path, dpi=200)
plt.close(fig)
print({"metrics_figure": str(fig_path)})
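
The extended plot in the next cell attaches error bars obtained from a class-stratified bootstrap over the saved test predictions. The idea can be sketched for a single metric as follows; this is a reduced illustration on synthetic labels, and `stratified_bootstrap_std` is a hypothetical helper rather than the notebook's `_bootstrap_metric_stds`.

```python
import numpy as np


def stratified_bootstrap_std(y_true, y_pred, n_boot=1000, seed=0):
    """Std. dev. of accuracy over bootstrap resamples drawn within each class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    idx0 = np.flatnonzero(y_true == 0)
    idx1 = np.flatnonzero(y_true == 1)
    if len(idx0) == 0 or len(idx1) == 0:
        raise ValueError("Both classes must be present.")

    rng = np.random.default_rng(seed)
    accs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each class separately so the class balance stays fixed.
        sample = np.concatenate([
            rng.choice(idx0, size=len(idx0), replace=True),
            rng.choice(idx1, size=len(idx1), replace=True),
        ])
        accs[b] = np.mean(y_true[sample] == y_pred[sample])
    return float(accs.std(ddof=1))


# Synthetic test set: 60 negatives, 40 positives, 14 errors in total.
y_true = np.array([0] * 60 + [1] * 40)
y_pred = y_true.copy()
y_pred[:6] = 1     # 6 false positives
y_pred[60:68] = 0  # 8 false negatives

std_acc = stratified_bootstrap_std(y_true, y_pred)
print({"accuracy": float(np.mean(y_true == y_pred)), "bootstrap_std": std_acc})
```

Resampling within each class guarantees that every replicate contains both labels, so metrics such as ROC-AUC stay defined; the resulting spread reflects sampling variability of the test set only, not variability from retraining.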


In [None]:
## Extended metrics plot: Accuracy, ROC AUC, macro F1, macro Recall
model_order = ["tfidf", "style", "combined"]
model_labels = ["TF-IDF", "Style", "TF-IDF+Style"]

# Overall accuracy and ROC AUC (already in metrics_df)
accs = [
    float(
        metrics_df[
            (metrics_df["model"] == model_key)
            & (metrics_df["label"] == "overall")
            & (metrics_df["metric"] == "accuracy")
        ]["value"].iloc[0]
    )
    for model_key in model_order
]
aucs = [
    float(
        metrics_df[
            (metrics_df["model"] == model_key)
            & (metrics_df["label"] == "overall")
            & (metrics_df["metric"] == "roc_auc")
        ]["value"].iloc[0]
    )
    for model_key in model_order
]

# Macro-averaged F1 and Recall
macro_f1s = [
    float(
        metrics_df[
            (metrics_df["model"] == model_key)
            & (metrics_df["label"] == "macro avg")
            & (metrics_df["metric"] == "f1-score")
        ]["value"].iloc[0]
    )
    for model_key in model_order
]
macro_recalls = [
    float(
        metrics_df[
            (metrics_df["model"] == model_key)
            & (metrics_df["label"] == "macro avg")
            & (metrics_df["metric"] == "recall")
        ]["value"].iloc[0]
    )
    for model_key in model_order
]

x = np.arange(len(model_order))

# Caption note: ROC-AUC here is the binary ROC-AUC for the positive class (y = 1),
# not a macro-averaged one-vs-rest score.
pred_path = results_dir / "test_predictions_all_models.csv"

acc_stds = auc_stds = macro_f1_stds = macro_recall_stds = None
if pred_path.exists():
    preds_df = pd.read_csv(pred_path)
    y_true = preds_df["y_true"].astype(int).to_numpy()

    def _bootstrap_metric_stds(model_key, n_boot=1000, seed=0):
        pred = preds_df[f"{model_key}_pred"].astype(int).to_numpy()
        proba = preds_df[f"{model_key}_proba"].astype(float).to_numpy()

        idx0 = np.flatnonzero(y_true == 0)
        idx1 = np.flatnonzero(y_true == 1)
        n0, n1 = len(idx0), len(idx1)
        if n0 == 0 or n1 == 0:
            raise ValueError("Bootstrap requires both classes in y_true.")

        rng = np.random.default_rng(seed)
        vals = np.empty((n_boot, 4), dtype=float)
        for bootstrap_index in range(n_boot):
            sample_idx = np.concatenate(
                [
                    rng.choice(idx0, size=n0, replace=True),
                    rng.choice(idx1, size=n1, replace=True),
                ]
            )
            yb = y_true[sample_idx]
            pb = pred[sample_idx]
            probab = proba[sample_idx]
            vals[bootstrap_index, 0] = accuracy_score(yb, pb)
            vals[bootstrap_index, 1] = roc_auc_score(yb, probab)
            vals[bootstrap_index, 2] = f1_score(yb, pb, average="macro")
            vals[bootstrap_index, 3] = recall_score(yb, pb, average="macro")

        stds = vals.std(axis=0, ddof=1)
        return float(stds[0]), float(stds[1]), float(stds[2]), float(stds[3])

    acc_stds, auc_stds, macro_f1_stds, macro_recall_stds = [], [], [], []
    for model_index, model_key in enumerate(model_order):
        s_acc, s_auc, s_f1, s_rec = _bootstrap_metric_stds(model_key, n_boot=1000, seed=7 + model_index)
        acc_stds.append(s_acc)
        auc_stds.append(s_auc)
        macro_f1_stds.append(s_f1)
        macro_recall_stds.append(s_rec)

from matplotlib.lines import Line2D

with plt.rc_context(
    {
        "font.size": 9,
        "axes.labelsize": 9,
        "xtick.labelsize": 9,
        "ytick.labelsize": 9,
        "legend.fontsize": 9,
        "pdf.fonttype": 42,  # TrueType
        "ps.fonttype": 42,  # TrueType
    }
):
    fig, ax = plt.subplots(figsize=(3.3, 2.2), dpi=300)
    fig.subplots_adjust(left=0.16, right=0.995, bottom=0.22, top=0.80)

    metric_specs = [
        ("Accuracy", accs, acc_stds, "o", "white"),
        ("ROC-AUC", aucs, auc_stds, "s", "0.85"),
        ("F1 (macro)", macro_f1s, macro_f1_stds, "^", "0.65"),
        ("Recall (macro)", macro_recalls, macro_recall_stds, "D", "0.45"),
    ]
    offsets = np.array([-0.24, -0.08, 0.08, 0.24])

    for metric_index, (metric_label, metric_values, metric_errors, marker, face) in enumerate(metric_specs):
        ax.errorbar(
            x + offsets[metric_index],
            metric_values,
            yerr=metric_errors,
            fmt=marker,
            linestyle="none",
            color="black",
            ecolor="black",
            elinewidth=0.6,
            capsize=2,
            capthick=0.6,
            markerfacecolor=face,
            markeredgecolor="black",
            markeredgewidth=0.7,
            markersize=5.2,
            zorder=3,
        )

    ax.set_xticks(x)
    ax.set_xticklabels(["TF-IDF", "Style", "TF-IDF\n+Style"])
    ax.set_ylabel("Score")

    all_vals = np.array([accs, aucs, macro_f1s, macro_recalls], dtype=float).ravel()
    y_min = max(0.0, np.floor((all_vals.min() - 0.03) * 20) / 20)
    y_max = min(1.0, np.ceil((all_vals.max() + 0.03) * 20) / 20)
    ax.set_ylim(y_min, y_max)

    ax.set_axisbelow(True)
    ax.grid(axis="y", which="major", lw=0.5, alpha=0.2)

    legend_handles = [
        Line2D(
            [0],
            [0],
            marker=marker,
            linestyle="none",
            color="black",
            markerfacecolor=face,
            markeredgecolor="black",
            markeredgewidth=0.7,
            markersize=5.2,
            label=label,
        )
        for (label, _, _, marker, face) in metric_specs
    ]
    fig.legend(
        handles=legend_handles,
        loc="upper center",
        bbox_to_anchor=(0.5, 0.995),
        ncol=2,
        frameon=False,
        handlelength=1.0,
        columnspacing=1.0,
        handletextpad=0.4,
    )

    extended_fig_path = results_dir / "mbfc_logreg_metrics_extended.png"
    extended_pdf_path = results_dir / "mbfc_logreg_metrics_extended.pdf"
    fig.savefig(extended_fig_path, dpi=300, bbox_inches=None, pad_inches=0.0)
    fig.savefig(extended_pdf_path, bbox_inches=None, pad_inches=0.0)
    plt.close(fig)

print(
    {
        "extended_metrics_figure": str(extended_fig_path),
        "extended_metrics_pdf": str(extended_pdf_path),
    }
)


# Calibration metrics plot (ECE and Brier score)
from sklearn.metrics import brier_score_loss


def _expected_calibration_error(y_true, y_proba, n_bins=10):
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
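    # Clip so that probabilities exactly equal to 1.0 land in the last bin instead of being dropped.
    indices = np.clip(np.digitize(y_proba, bins) - 1, 0, n_bins - 1)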
    ece = 0.0
    n = len(y_true)
    for b in range(n_bins):
        mask = indices == b
        if not np.any(mask):
            continue
        p_bin = y_proba[mask].mean()
        y_bin = y_true[mask].mean()
        weight = mask.sum() / n
        ece += weight * abs(p_bin - y_bin)
    return float(ece)


if pred_path.exists():
    if "preds_df" not in locals():
        preds_df = pd.read_csv(pred_path)
        y_true = preds_df["y_true"].astype(int).to_numpy()

    eces = [
        _expected_calibration_error(
            y_true,
            preds_df[f"{model_key}_proba"].astype(float).to_numpy(),
            n_bins=10,
        )
        for model_key in model_order
    ]
    briers = [
        float(brier_score_loss(y_true, preds_df[f"{model_key}_proba"].astype(float).to_numpy()))
        for model_key in model_order
    ]

    def _bootstrap_calibration_stds(model_key, n_boot=1000, seed=0):
        proba = preds_df[f"{model_key}_proba"].astype(float).to_numpy()
        idx0 = np.flatnonzero(y_true == 0)
        idx1 = np.flatnonzero(y_true == 1)
        n0, n1 = len(idx0), len(idx1)
        if n0 == 0 or n1 == 0:
            raise ValueError("Bootstrap requires both classes in y_true.")

        rng = np.random.default_rng(seed)
        vals = np.empty((n_boot, 2), dtype=float)
        for bootstrap_index in range(n_boot):
            sample_idx = np.concatenate(
                [
                    rng.choice(idx0, size=n0, replace=True),
                    rng.choice(idx1, size=n1, replace=True),
                ]
            )
            yb = y_true[sample_idx]
            probab = proba[sample_idx]
            vals[bootstrap_index, 0] = _expected_calibration_error(yb, probab, n_bins=10)
            vals[bootstrap_index, 1] = brier_score_loss(yb, probab)

        stds = vals.std(axis=0, ddof=1)
        return float(stds[0]), float(stds[1])

    ece_stds, brier_stds = [], []
    for model_index, model_key in enumerate(model_order):
        s_ece, s_brier = _bootstrap_calibration_stds(model_key, n_boot=1000, seed=29 + model_index)
        ece_stds.append(s_ece)
        brier_stds.append(s_brier)

    with plt.rc_context(
        {
            "font.size": 9,
            "axes.labelsize": 9,
            "xtick.labelsize": 9,
            "ytick.labelsize": 9,
            "legend.fontsize": 9,
            "pdf.fonttype": 42,  # TrueType
            "ps.fonttype": 42,  # TrueType
        }
    ):
        fig, ax = plt.subplots(figsize=(3.3, 2.0), dpi=300)
        fig.subplots_adjust(left=0.16, right=0.995, bottom=0.22, top=0.80)

        metric_specs = [
            ("ECE", eces, ece_stds, "o", "white"),
            ("Brier", briers, brier_stds, "s", "0.75"),
        ]
        offsets = np.array([-0.10, 0.10])

        for metric_index, (metric_label, metric_values, metric_errors, marker, face) in enumerate(metric_specs):
            ax.errorbar(
                x + offsets[metric_index],
                metric_values,
                yerr=metric_errors,
                fmt=marker,
                linestyle="none",
                color="black",
                ecolor="black",
                elinewidth=0.6,
                capsize=2,
                capthick=0.6,
                markerfacecolor=face,
                markeredgecolor="black",
                markeredgewidth=0.7,
                markersize=5.2,
                zorder=3,
            )

        ax.set_xticks(x)
        ax.set_xticklabels(["TF-IDF", "Style", "TF-IDF\n+Style"])
        ax.set_ylabel("Error")

        all_vals = np.array([eces, briers], dtype=float).ravel()
        y_min = max(0.0, np.floor((all_vals.min() - 0.01) * 100) / 100)
        y_max = min(1.0, np.ceil((all_vals.max() + 0.01) * 100) / 100)
        ax.set_ylim(y_min, y_max)

        ax.set_axisbelow(True)
        ax.grid(axis="y", which="major", lw=0.5, alpha=0.2)

        legend_handles = [
            Line2D(
                [0],
                [0],
                marker=marker,
                linestyle="none",
                color="black",
                markerfacecolor=face,
                markeredgecolor="black",
                markeredgewidth=0.7,
                markersize=5.2,
                label=metric_label,
            )
            for (metric_label, _, _, marker, face) in metric_specs
        ]
        fig.legend(
            handles=legend_handles,
            loc="upper center",
            bbox_to_anchor=(0.5, 0.995),
            ncol=2,
            frameon=False,
            handlelength=1.0,
            columnspacing=1.0,
            handletextpad=0.4,
        )

        cal_fig_path = results_dir / "mbfc_logreg_calibration_metrics.png"
        cal_pdf_path = results_dir / "mbfc_logreg_calibration_metrics.pdf"
        fig.savefig(cal_fig_path, dpi=300, bbox_inches=None, pad_inches=0.0)
        fig.savefig(cal_pdf_path, bbox_inches=None, pad_inches=0.0)
        plt.close(fig)

    print(
        {
            "calibration_metrics_figure": str(cal_fig_path),
            "calibration_metrics_pdf": str(cal_pdf_path),
        }
    )


In [None]:
## Heatmap of key metrics per model
import numpy as np

heatmap_metrics = ["Accuracy", "ROC AUC", "Macro F1", "Macro recall"]
heatmap_values = np.array([accs, aucs, macro_f1s, macro_recalls])

fig, ax = plt.subplots(figsize=(6, 4), dpi=200)
im = ax.imshow(heatmap_values, vmin=0.0, vmax=1.0, cmap="viridis")

ax.set_xticks(np.arange(len(model_labels)))
ax.set_xticklabels(model_labels, rotation=15)
ax.set_yticks(np.arange(len(heatmap_metrics)))
ax.set_yticklabels(heatmap_metrics)

for i in range(heatmap_values.shape[0]):
    for j in range(heatmap_values.shape[1]):
        val = heatmap_values[i, j]
        text_color = "white" if val < 0.5 else "black"
        ax.text(j, i, f"{val:.2f}", ha="center", va="center", color=text_color, fontsize=8)

ax.set_title("MBFC URL-masked logistic regression – metric heatmap")
cbar = fig.colorbar(im, ax=ax)
cbar.set_label("Score")
fig.tight_layout()

heatmap_path = results_dir / "mbfc_logreg_metrics_heatmap.png"
fig.savefig(heatmap_path, dpi=200)
plt.close(fig)
print({"metrics_heatmap": str(heatmap_path)})


In [None]:
# Helper utilities for validation-tuned URL-masked evaluation
import re

import numpy as np
from scipy import sparse
from sklearn.metrics import f1_score, recall_score, brier_score_loss
from sklearn.preprocessing import MultiLabelBinarizer

RANDOM_STATES = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
LR_GRID = [0.001, 0.003, 0.01]
THRESH_GRID = [round(t, 2) for t in np.linspace(0.05, 0.95, 19)]
PRIMARY_METRIC = 'macro_f1'

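# Simple URL pattern, aligned with the cleaning pipeline: strip http(s), www, and t.me links.
# Each alternative consumes the whole URL up to the next whitespace, not just the scheme,
# so the domain and path do not remain in the text features.
URL_PATTERN = re.compile(r'(?:https?://\S*|www\.\S*|t\.me/\S*)', flags=re.IGNORECASE)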

def strip_urls(text: str) -> str:
    if not isinstance(text, str):
        return ""
    # Replace URLs with a space so they don't appear in the text features.
    return URL_PATTERN.sub(' ', text)

def _stack_features(a, b):
    if sparse.issparse(a):
        return sparse.vstack([a, b])
    return np.vstack([a, b])

def build_text_features(train_df, val_df, test_df):
    # URL-masked text: remove URLs from the message field before vectorization.
    def _preprocess(df):
        return df['message'].astype(str).apply(strip_urls)

    vec = TfidfVectorizer(ngram_range=(1, 2), max_features=20000, min_df=2, strip_accents='unicode')
    X_train = vec.fit_transform(_preprocess(train_df))
    X_val = vec.transform(_preprocess(val_df))
    X_test = vec.transform(_preprocess(test_df))
    return X_train, X_val, X_test

def build_style_features(train_df, val_df, test_df):
    train_tokens = train_df.apply(build_style_tokens, axis=1)
    val_tokens = val_df.apply(build_style_tokens, axis=1)
    test_tokens = test_df.apply(build_style_tokens, axis=1)
    mlb = MultiLabelBinarizer()
    X_train = mlb.fit_transform(train_tokens)
    X_val = mlb.transform(val_tokens)
    X_test = mlb.transform(test_tokens)
    return X_train, X_val, X_test

def sweep_thresholds(y_true, proba):
    rows = []
    for t in THRESH_GRID:
        pred = (proba >= t).astype(int)
        rows.append({
            'threshold': float(t),
            'macro_f1': f1_score(y_true, pred, average='macro'),
            'macro_recall': recall_score(y_true, pred, average='macro'),
            'recall_pos': recall_score(y_true, pred, pos_label=1),
        })
    best = max(rows, key=lambda r: r[PRIMARY_METRIC])
    return best, rows

def expected_calibration_error(y_true, y_proba, n_bins=10):
    """Compute Expected Calibration Error (ECE) with equal-width bins."""
    y_true = np.asarray(y_true)
    y_proba = np.asarray(y_proba)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
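    # Clip so that probabilities exactly equal to 1.0 land in the last bin instead of being dropped.
    indices = np.clip(np.digitize(y_proba, bins) - 1, 0, n_bins - 1)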
    ece = 0.0
    n = len(y_true)
    for b in range(n_bins):
        mask = indices == b
        if not np.any(mask):
            continue
        p_bin = y_proba[mask].mean()
        y_bin = y_true[mask].mean()
        weight = mask.sum() / n
        ece += weight * abs(p_bin - y_bin)
    return float(ece)

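def fit_with_val_search(X_train, y_train, X_val, y_val, X_test, y_test, lr_grid=None):
    """Select the learning rate and decision threshold on the validation set, retrain on
    train+val with the selected learning rate, and report test metrics at that threshold."""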
    lr_grid = lr_grid or LR_GRID
    best = None
    for lr in lr_grid:
        clf = ManualLogisticRegression(
            max_iter=1000, lr=lr, C=1.0, class_weight="balanced", tol=None, n_jobs=-1, verbose=False
        )
        clf.fit(X_train, y_train)
        val_proba = clf.predict_proba(X_val)[:, 1]
        best_thr, _ = sweep_thresholds(y_val, val_proba)
        candidate = {
            'lr': lr,
            'val_threshold': best_thr['threshold'],
            'val_macro_f1': best_thr['macro_f1'],
            'val_macro_recall': best_thr['macro_recall'],
            'val_recall_pos': best_thr['recall_pos'],
            'primary_score': best_thr[PRIMARY_METRIC],
        }
        if best is None or candidate['primary_score'] > best['primary_score']:
            best = candidate
    # Retrain on train+val with best lr
    X_trainval = _stack_features(X_train, X_val)
    y_trainval = np.concatenate([y_train, y_val])
    final_clf = ManualLogisticRegression(
        max_iter=1000, lr=best['lr'], C=1.0, class_weight="balanced", tol=None, n_jobs=-1, verbose=False
    )
    final_clf.fit(X_trainval, y_trainval)
    test_proba = final_clf.predict_proba(X_test)[:, 1]
    test_pred = (test_proba >= best['val_threshold']).astype(int)
    brier = brier_score_loss(y_test, test_proba)
    ece = expected_calibration_error(y_test, test_proba, n_bins=10)
    return {
        'best_lr': best['lr'],
        'threshold': best['val_threshold'],
        'test_macro_f1': f1_score(y_test, test_pred, average='macro'),
        'test_macro_recall': recall_score(y_test, test_pred, average='macro'),
        'test_recall_pos': recall_score(y_test, test_pred, pos_label=1),
        'test_roc_auc': roc_auc_score(y_test, test_proba),
        'test_accuracy': accuracy_score(y_test, test_pred),
        'test_brier': float(brier),
        'test_ece': float(ece),
    }

def train_base_for_stacking(X_train, y_train, X_val, y_val, X_test, lr_grid=None):
    """Train base ManualLogisticRegression and return val/test probabilities.

    This mirrors fit_with_val_search's hyperparameter search but is used for
    building stacked ensembles rather than for direct evaluation.
    """
    lr_grid = lr_grid or LR_GRID
    best_lr = None
    best_thr = None
    best_score = None
    best_val_proba = None
    for lr in lr_grid:
        clf = ManualLogisticRegression(
            max_iter=1000, lr=lr, C=1.0, class_weight="balanced", tol=None, n_jobs=-1, verbose=False
        )
        clf.fit(X_train, y_train)
        val_proba = clf.predict_proba(X_val)[:, 1]
        thr, _ = sweep_thresholds(y_val, val_proba)
        score = thr[PRIMARY_METRIC]
        if best_score is None or score > best_score:
            best_score = score
            best_lr = lr
            best_thr = thr
            best_val_proba = val_proba
    X_trainval = _stack_features(X_train, X_val)
    y_trainval = np.concatenate([y_train, y_val])
    final_clf = ManualLogisticRegression(
        max_iter=1000, lr=best_lr, C=1.0, class_weight="balanced", tol=None, n_jobs=-1, verbose=False
    )
    final_clf.fit(X_trainval, y_trainval)
    test_proba = final_clf.predict_proba(X_test)[:, 1]
    return {
        'lr': best_lr,
        'val_threshold': best_thr['threshold'],
        'val_macro_f1': best_thr['macro_f1'],
        'val_macro_recall': best_thr['macro_recall'],
        'val_recall_pos': best_thr['recall_pos'],
        'val_proba': best_val_proba,
        'test_proba': test_proba,
    }


In [None]:
# URL-masked, validation-tuned evaluation (no tuning on test)
results_dir = Path('mbfc_url_masked_logreg_results_v6')
results_dir.mkdir(exist_ok=True)

# For URL-masked evaluation we only keep rows
# that have a resolved normalized_domain (no NaNs in groups).
df_eval = df.dropna(subset=['normalized_domain']).copy()
print({'eval_rows': len(df_eval), 'unique_domains': int(df_eval['normalized_domain'].nunique())})
url_masked_rows = []
for split_id, seed in enumerate(RANDOM_STATES):
    gss = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=seed)
    trainval_idx, test_idx = next(gss.split(df_eval, df_eval['y'], df_eval['normalized_domain']))
    df_trainval = df_eval.iloc[trainval_idx].copy()
    df_test_split = df_eval.iloc[test_idx].copy()

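    # Note: the validation split is stratified by label but not grouped by domain, so train and
    # validation rows may share domains; only the test set is domain-disjoint.
    df_train_split, df_val_split = train_test_split(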
        df_trainval, test_size=0.125, random_state=100 + seed, stratify=df_trainval['y']
    )
    y_train = df_train_split['y'].values
    y_val = df_val_split['y'].values
    y_test = df_test_split['y'].values

    # TF-IDF
    X_train_text, X_val_text, X_test_text = build_text_features(df_train_split, df_val_split, df_test_split)
    text_metrics = fit_with_val_search(X_train_text, y_train, X_val_text, y_val, X_test_text, y_test)
    text_metrics.update({'model': 'tfidf', 'split_seed': int(seed)})
    url_masked_rows.append(text_metrics)

    # Style
    X_train_style, X_val_style, X_test_style = build_style_features(df_train_split, df_val_split, df_test_split)
    style_metrics = fit_with_val_search(X_train_style, y_train, X_val_style, y_val, X_test_style, y_test)
    style_metrics.update({'model': 'style', 'split_seed': int(seed)})
    url_masked_rows.append(style_metrics)

    # Combined via stacked ensemble on probabilities
    base_text = train_base_for_stacking(X_train_text, y_train, X_val_text, y_val, X_test_text)
    base_style = train_base_for_stacking(X_train_style, y_train, X_val_style, y_val, X_test_style)

    meta_X_val = np.stack([
        base_text['val_proba'],
        base_style['val_proba'],
        base_text['val_proba'] * base_style['val_proba'],
    ], axis=1)
    meta_X_test = np.stack([
        base_text['test_proba'],
        base_style['test_proba'],
        base_text['test_proba'] * base_style['test_proba'],
    ], axis=1)

    from sklearn.linear_model import LogisticRegression
    meta_clf = LogisticRegression(penalty='l2', C=1.0, solver='lbfgs', max_iter=1000)
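    # The meta-learner is fit on validation-set base probabilities; the threshold below is swept
    # on the same validation set, so the validation scores for the combined model are optimistic.
    meta_clf.fit(meta_X_val, y_val)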
    meta_val_proba = meta_clf.predict_proba(meta_X_val)[:, 1]
    meta_best_thr, _ = sweep_thresholds(y_val, meta_val_proba)

    test_proba_combined = meta_clf.predict_proba(meta_X_test)[:, 1]
    test_pred_combined = (test_proba_combined >= meta_best_thr['threshold']).astype(int)

    brier_combined = brier_score_loss(y_test, test_proba_combined)
    ece_combined = expected_calibration_error(y_test, test_proba_combined, n_bins=10)
    combined_metrics = {
        'best_lr': float(base_text['lr']),  # text lr for reference
        'threshold': float(meta_best_thr['threshold']),
        'test_macro_f1': f1_score(y_test, test_pred_combined, average='macro'),
        'test_macro_recall': recall_score(y_test, test_pred_combined, average='macro'),
        'test_recall_pos': recall_score(y_test, test_pred_combined, pos_label=1),
        'test_roc_auc': roc_auc_score(y_test, test_proba_combined),
        'test_accuracy': accuracy_score(y_test, test_pred_combined),
        'test_brier': float(brier_combined),
        'test_ece': float(ece_combined),
        'model': 'combined',
        'split_seed': int(seed),
    }
    url_masked_rows.append(combined_metrics)

url_masked_df = pd.DataFrame(url_masked_rows)
val_tuned_path = results_dir / 'url_masked_val_tuned_metrics.csv'
url_masked_df.to_csv(val_tuned_path, index=False)
print({'val_tuned_metrics_csv': str(val_tuned_path)})

summary = url_masked_df.groupby('model')[
    ['test_accuracy', 'test_roc_auc', 'test_macro_f1', 'test_macro_recall', 'test_recall_pos', 'test_brier', 'test_ece']
].agg(['mean', 'std']).round(4)
print('URL-masked (val-tuned) summary:')
print(summary)

In [None]:
# Plot and save URL-masked robustness summary as grouped bars per model
import matplotlib.pyplot as plt
import numpy as np

# Only plot the core metrics: accuracy, ROC AUC, macro F1
metrics_to_plot = [
    ("test_accuracy", "Accuracy"),
    ("test_roc_auc", "ROC AUC"),
    ("test_macro_f1", "Macro F1"),
]

models = ["tfidf", "style", "combined"]
model_labels = ["TF-IDF", "Style", "TF-IDF+Style"]
metric_colors = ["tab:blue", "tab:orange", "tab:green"]

x = np.arange(len(models))
width = 0.22
fig, ax = plt.subplots(figsize=(7, 4), dpi=200)

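for j, ((metric_key, metric_label), color) in enumerate(zip(metrics_to_plot, metric_colors)):
    means = [float(summary.loc[m, (metric_key, "mean")]) for m in models]
    stds = [float(summary.loc[m, (metric_key, "std")]) for m in models]
    bar_positions = x + (j - (len(metrics_to_plot) - 1) / 2) * width
    # Error bars show the standard deviation across split seeds.
    ax.bar(bar_positions, means, width, yerr=stds, capsize=2, label=metric_label, color=color, alpha=0.85)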

ax.set_xticks(x)
ax.set_xticklabels(model_labels)
# Zoom the y-axis to highlight differences between models
ax.set_ylim(0.6, 0.9)
ax.set_ylabel("Score")
ax.set_title("URL-masked logistic regression – robustness across seeds")
ax.legend(loc="best")
ax.grid(axis="y", linestyle="--", alpha=0.3)
fig.tight_layout()

robust_plot_path = results_dir / "url_masked_val_tuned_robustness.png"
fig.savefig(robust_plot_path, dpi=200)
plt.close(fig)
print({"url_masked_robustness_figure": str(robust_plot_path)})


In [None]:
# Sanity check: ManualLogisticRegression (fixed iteration budget, tol=None) vs sklearn on a small text subset
from sklearn.linear_model import LogisticRegression
subset_n = min(len(df), 7000)
df_subset = df.sample(n=subset_n, random_state=123, replace=False) if len(df) > subset_n else df.copy()
train_sub, test_sub = train_test_split(
    df_subset, test_size=0.3, random_state=321, stratify=df_subset['y']
)
vec_check = TfidfVectorizer(ngram_range=(1, 2), max_features=20000, min_df=2, strip_accents='unicode')
X_tr = vec_check.fit_transform(train_sub['message'].astype(str))
X_te = vec_check.transform(test_sub['message'].astype(str))
y_tr = train_sub['y'].values
y_te = test_sub['y'].values

manual = ManualLogisticRegression(lr=0.01, max_iter=4000, C=1.0, class_weight="balanced", tol=None)
manual.fit(X_tr, y_tr)
proba_manual = manual.predict_proba(X_te)[:, 1]

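# Match the manual model's class weighting so the two sets of probabilities are comparable.
sk = LogisticRegression(penalty='l2', C=1.0, solver='liblinear', max_iter=4000, class_weight='balanced')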
sk.fit(X_tr, y_tr)
proba_sk = sk.predict_proba(X_te)[:, 1]

mae = float(np.mean(np.abs(proba_manual - proba_sk)))
print({'proba_mae_manual_vs_sklearn': mae})
print({'roc_auc_manual': float(roc_auc_score(y_te, proba_manual)),
       'roc_auc_sklearn': float(roc_auc_score(y_te, proba_sk)),
       'acc_manual@0.5': float(accuracy_score(y_te, (proba_manual >= 0.5).astype(int))),
       'acc_sklearn@0.5': float(accuracy_score(y_te, (proba_sk >= 0.5).astype(int)))})

In [None]:
# Sanity check (variant): same comparison with early stopping (tol=1e-4), also reporting the loss history
from sklearn.linear_model import LogisticRegression
subset_n = min(len(df), 7000)
df_subset = df.sample(n=subset_n, random_state=123, replace=False) if len(df) > subset_n else df.copy()
train_sub, test_sub = train_test_split(
    df_subset, test_size=0.3, random_state=321, stratify=df_subset['y']
)
vec_check = TfidfVectorizer(ngram_range=(1, 2), max_features=20000, min_df=2, strip_accents='unicode')
X_tr = vec_check.fit_transform(train_sub['message'].astype(str))
X_te = vec_check.transform(test_sub['message'].astype(str))
y_tr = train_sub['y'].values
y_te = test_sub['y'].values

manual = ManualLogisticRegression(lr=0.01, max_iter=4000, C=1.0, class_weight="balanced", tol=1e-4)
manual.fit(X_tr, y_tr)
proba_manual = manual.predict_proba(X_te)[:, 1]

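# Match the manual model's class weighting so the two sets of probabilities are comparable.
sk = LogisticRegression(penalty='l2', C=1.0, solver='liblinear', max_iter=4000, class_weight='balanced')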
sk.fit(X_tr, y_tr)
proba_sk = sk.predict_proba(X_te)[:, 1]

mae = float(np.mean(np.abs(proba_manual - proba_sk)))
print({'proba_mae_manual_vs_sklearn': mae})
print({'roc_auc_manual': float(roc_auc_score(y_te, proba_manual)),
       'roc_auc_sklearn': float(roc_auc_score(y_te, proba_sk)),
       'acc_manual@0.5': float(accuracy_score(y_te, (proba_manual >= 0.5).astype(int))),
       'acc_sklearn@0.5': float(accuracy_score(y_te, (proba_sk >= 0.5).astype(int))),
       'manual_loss_steps': len(manual.loss_history_),
       'manual_final_loss': float(manual.loss_history_[-1])})

## Discussion
The results show that a compact, interpretable style representation is a strong URL-independent risk signal for Telegram messages. On the channel-masked benchmark of 7,164 messages, the style-only model outperforms the TF-IDF baseline in accuracy (0.852 vs. 0.796) and macro-F1 (0.846 vs. 0.796), and the combined style + TF-IDF model performs best overall (0.892 accuracy, 0.890 macro-F1). This suggests that lexical features alone do not capture the credibility-relevant information encoded in how messages talk, and that these stylistic patterns generalize to unseen channels once explicit source identifiers are removed.

There are also clear sources of error and uncertainty. Labels are inherited from MBFC domain ratings rather than from message-level fact checks, so the models learn a proxy for MBFC's reputational judgments rather than factual accuracy. Style tags are produced by an LLM, which may mislabel some topics, languages, or rhetorical forms. The evaluation subset favors channels that link to MBFC-rated domains, and the test set is dominated by a small number of large high-risk or low-risk channels. Finally, the TF-IDF model attains a high ROC-AUC but poor thresholded performance, which shows that metric choice and calibration matter: the lexical model ranks messages by risk well but is poorly calibrated and over-flags mainstream content, whereas the style-based models produce more reasonable positive rates but may be conservative on some high-risk sources.
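The gap between ranking quality and thresholded performance can be illustrated with a small synthetic sketch. It does not use the notebook's data or models: the latent scores, the upward shift of 2.0 on the logit scale, and all variable names are assumptions chosen for illustration. The shift leaves ROC-AUC unchanged, because AUC depends only on the ranking, but it pushes most scores above 0.5, so the default threshold over-flags the negative class. For brevity the threshold is swept on the same sample, whereas the notebook sweeps it on a validation set.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

rng = np.random.default_rng(0)
n_per_class = 1000
y = np.concatenate([np.zeros(n_per_class, dtype=int), np.ones(n_per_class, dtype=int)])

# Hypothetical latent scores: positives centred at +1, negatives at -1 (logit scale).
logits = np.where(y == 1, 1.0, -1.0) + rng.normal(0.0, 1.0, size=y.size)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


proba_centred = sigmoid(logits)        # scores centred around 0.5
proba_shifted = sigmoid(logits + 2.0)  # same ranking, but biased toward the positive class

# AUC is rank-based, so a monotone shift does not change it.
auc_centred = roc_auc_score(y, proba_centred)
auc_shifted = roc_auc_score(y, proba_shifted)

# The default 0.5 threshold flags most negatives as positive after the shift.
acc_default = accuracy_score(y, (proba_shifted >= 0.5).astype(int))

# Sweep thresholds on macro-F1 (same grid shape as THRESH_GRID in the notebook).
thresholds = np.linspace(0.05, 0.95, 19)
macro_f1s = [
    f1_score(y, (proba_shifted >= t).astype(int), average="macro", zero_division=0)
    for t in thresholds
]
best_t = float(thresholds[int(np.argmax(macro_f1s))])
acc_tuned = accuracy_score(y, (proba_shifted >= best_t).astype(int))

print(
    {
        "auc_centred": round(float(auc_centred), 3),
        "auc_shifted": round(float(auc_shifted), 3),
        "acc_at_0.5": round(float(acc_default), 3),
        "best_threshold": round(best_t, 2),
        "acc_at_best_threshold": round(float(acc_tuned), 3),
    }
)
```

In this toy setting the model's ranking is identical in both cases; only the operating point differs, which is why threshold-dependent metrics and calibration metrics should be reported alongside ROC-AUC.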



## Conclusion
In short, the approach is effective and practically appealing, but not conclusive. It provides a calibrated, interpretable message-level risk score that complements URL-based pipelines and still works when the domain is missing or obfuscated. However, it inherits MBFC's biases, ignores content outside the MBFC-linked message subset, and is evaluated on a single channel partition rather than on multiple random partitions. Computationally, the logistic regression stage is cheap, but the LLM-based tagging pipeline is not trivial to scale, and the approach assumes that stylistic strategies remain relatively stable over time and across communities.

Reasonable follow-up steps include: running channel-level cross-validation or multiple `GroupShuffleSplit` seeds to quantify variance; tuning each model's threshold on the validation set to make the TF-IDF comparison more rigorous; expanding the corpus beyond MBFC-linked channels and incorporating message-level fact-check labels where possible; stress-testing the style tags under adversarial or evolving tactics; and exploring richer but still interpretable features (for example, simple interaction patterns or forwarding structures) that remain URL-agnostic while capturing more Telegram-specific behavior.
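The first follow-up step, channel-level cross-validation, could be sketched as follows. This is a minimal illustration on synthetic data: the `channels` array, the random feature matrix, and the plain scikit-learn `LogisticRegression` are stand-ins (assumptions), not the notebook's real channel identifiers, features, or `ManualLogisticRegression`. `GroupKFold` keeps every channel entirely inside either the training or the test fold, so the spread of fold scores gives a rough estimate of variance across unseen channels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import GroupKFold

rng = np.random.default_rng(0)
n_samples, n_features, n_channels = 600, 8, 30

# Synthetic stand-ins for features, labels, and channel identifiers.
X = rng.normal(size=(n_samples, n_features))
y = (X[:, 0] + 0.5 * rng.normal(size=n_samples) > 0).astype(int)
channels = rng.integers(0, n_channels, size=n_samples)

fold_scores = []
folds_disjoint = []
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=channels):
    # Record that no channel appears on both sides of the split.
    folds_disjoint.append(set(channels[train_idx]).isdisjoint(channels[test_idx]))

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    fold_scores.append(f1_score(y[test_idx], pred, average="macro"))

print(
    {
        "macro_f1_mean": round(float(np.mean(fold_scores)), 3),
        "macro_f1_std": round(float(np.std(fold_scores, ddof=1)), 3),
        "all_folds_channel_disjoint": all(folds_disjoint),
    }
)
```

With real data, `channels` would be replaced by the channel (or normalized domain) column, and the per-fold scores could be reported as mean and standard deviation in the same way as the multi-seed summary.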