# Pingkit End‑to‑End Modeling Walkthrough

In this notebook we demonstrate a **minimal, fully reproducible pipeline** for training a probe on transformer embeddings with [Pingkit](https://github.com/tatonetti-lab/pingkit). We start from per‑token embeddings stored on disk, stack them into feature matrices, train an automatically sized MLP or CNN probe, and finally evaluate it on a held‑out test set.

All the knobs you might want to tweak live in the next cell—feel free to edit them before running the rest of the notebook!

## 0. Editable parameters

In [None]:

# ╒══════════════════════════════════════════════════════════════╕
# │                          PARAMETERS                          │
# ╘══════════════════════════════════════════════════════════════╛
# Embedding configuration
LAYER             = 39                # Transformer layer to use
EMB_PARTS         = "rs"              # 'rs', 'attn', 'mlp', or a list like ['rs','mlp']

# Paths
TRAIN_EMB_DIR     = "safety_train"    # Directory of per‑row embedding CSVs
TEST_EMB_DIR      = "safety_test"
TRAIN_LABEL_CSV   = "safety_train.csv"  # CSV with columns ['id', 'label']
TEST_LABEL_CSV    = "safety_test.csv"

# Training hyper‑parameters
MODEL_TYPE        = "mlp"             # 'mlp' or 'cnn'
BATCH_SIZE        = 128
N_EPOCHS          = 100
LEARNING_RATE     = 1e-3
RANDOM_STATE      = 405
DEVICE            = "cuda"            # 'cuda' or 'cpu'

# Where to store the trained model
ARTIFACT_ROOT     = f"artifacts/run_L{LAYER}"


The notebook will create directories as needed—no manual setup required.
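
Before running the rest of the notebook, it can help to confirm that the configured inputs are actually present. The following is a minimal sketch using only the standard library. `missing_inputs` is a hypothetical helper, not part of Pingkit, and its arguments simply mirror the parameters above.

```python
from pathlib import Path


def missing_inputs(emb_dirs, label_csvs):
    """Return the configured paths that do not exist on disk.

    Hypothetical helper (not a Pingkit function): embedding inputs must be
    directories and label inputs must be regular files.
    """
    missing = [str(d) for d in emb_dirs if not Path(d).is_dir()]
    missing += [str(f) for f in label_csvs if not Path(f).is_file()]
    return missing
```

For example, `missing_inputs([TRAIN_EMB_DIR, TEST_EMB_DIR], [TRAIN_LABEL_CSV, TEST_LABEL_CSV])` returns an empty list when everything is in place.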

## 1. Imports
First, let's gather the standard libraries and the high‑level Pingkit helpers we'll be using throughout the run.

In [None]:

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from pingkit.embedding  import embed_dataset
from pingkit.extraction import extract_token_vectors
from pingkit.model      import (
    fit, save_artifacts, load_npz_features,
    _evaluate, load_artifacts
)

from sklearn.metrics     import accuracy_score, roc_auc_score
from sklearn.calibration import calibration_curve


## 2. Stack per‑row embeddings into a feature matrix
Pingkit's `extract_token_vectors` scans an embeddings directory (**one file per row per layer** as produced by `embed_dataset`) and concatenates everything into a single compressed `.npz`. We do this separately for the training and test splits—**run this section only once** after generating embeddings.

In [None]:

train_npz = extract_token_vectors(
    str(TRAIN_EMB_DIR),
    output_file=f"{TRAIN_EMB_DIR}/results/features_{EMB_PARTS}_L{LAYER}.npz",
    layers=LAYER,
    parts=EMB_PARTS,
    n_jobs=os.cpu_count(),
)
print("✅  Stacked train features:", train_npz)

test_npz = extract_token_vectors(
    str(TEST_EMB_DIR),
    output_file=f"{TEST_EMB_DIR}/results/features_{EMB_PARTS}_L{LAYER}.npz",
    layers=LAYER,
    parts=EMB_PARTS,
    n_jobs=os.cpu_count(),
)
print("✅  Stacked test  features:", test_npz)
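
Because stacking only needs to happen once per split, one option is to guard the call so that reruns skip it. The sketch below assumes that the presence of the output file is enough to tell that the work is done. `stack_once` is a hypothetical helper, and `build` stands in for the `extract_token_vectors` call shown above.

```python
from pathlib import Path


def stack_once(output_file, build):
    """Call `build(output_file)` only if `output_file` does not exist yet.

    Hypothetical helper: `build` is any callable that writes the stacked
    features to the path it receives. Returns the output path and a flag
    saying whether the build actually ran.
    """
    path = Path(output_file)
    if path.exists():
        return path, False
    # Make sure the target directory exists before writing into it.
    path.parent.mkdir(parents=True, exist_ok=True)
    build(str(path))
    return path, True
```

In this notebook, `build` could be a lambda such as `lambda p: extract_token_vectors(str(TRAIN_EMB_DIR), output_file=p, layers=LAYER, parts=EMB_PARTS, n_jobs=os.cpu_count())`. Delete the `.npz` file to force a rebuild after regenerating embeddings.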


## 3. Load features and labels
Next we load the `.npz` files into Pandas DataFrames, read the label CSVs, and **align** the indices so that every row has both features and a ground‑truth label.

In [None]:

# ── Features ───────────────────────────────────────────────────────
X_train_df, meta = load_npz_features(f"{TRAIN_EMB_DIR}/results/features_{EMB_PARTS}_L{LAYER}.npz")
X_test_df,  _    = load_npz_features(f"{TEST_EMB_DIR}/results/features_{EMB_PARTS}_L{LAYER}.npz")

# ── Labels ─────────────────────────────────────────────────────────
y_train = pd.read_csv(TRAIN_LABEL_CSV, index_col='id')['label']
y_test  = pd.read_csv(TEST_LABEL_CSV,  index_col='id')['label']

# Align indices (drop rows missing on either side)
common_train = X_train_df.index.intersection(y_train.index)
X_train_df   = X_train_df.loc[common_train]
y_train      = y_train.loc[common_train]

common_test  = X_test_df.index.intersection(y_test.index)
X_test_df    = X_test_df.loc[common_test]
y_test       = y_test.loc[common_test]

print(f"Train set: {X_train_df.shape}, labels: {y_train.shape}")
print(f"Test  set: {X_test_df.shape},  labels: {y_test.shape}")
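
The alignment step silently drops ids that appear on only one side, so it can be worth counting them. Here is a small sketch on toy data using plain pandas. `align_on_index` and the toy ids are made up for illustration.

```python
import pandas as pd


def align_on_index(X, y):
    """Keep only ids present in both X and y, and report how many were dropped.

    Hypothetical helper illustrating the alignment step with plain pandas.
    """
    common = X.index.intersection(y.index)
    dropped = {
        "features_without_label": len(X.index.difference(y.index)),
        "labels_without_features": len(y.index.difference(X.index)),
    }
    return X.loc[common], y.loc[common], dropped


# Toy data: id "a" has no label, id "d" has no features.
X_toy = pd.DataFrame({"f0": [0.1, 0.2, 0.3]}, index=["a", "b", "c"])
y_toy = pd.Series([1, 0, 1], index=["b", "c", "d"], name="label")

X_aligned, y_aligned, dropped = align_on_index(X_toy, y_toy)
print(dropped)
```

A large count on either side usually points to an id mismatch (for example, integer ids in one file and string ids in the other) rather than genuinely missing rows.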


## 4. Train the model
We call Pingkit's `fit` helper, which will:
1. Infer a reasonable network width based on the number of training examples.
2. Hold out 20% of the training data (`val_split=0.2`) as an internal validation split for early stopping.
3. Return both the trained model and a per‑epoch history we could plot later.

In [None]:

model, history = fit(
    X_train_df,
    y_train.values,
    model_type     = MODEL_TYPE,
    meta           = meta,
    num_classes    = len(np.unique(y_train)),
    metric         = 'loss',
    batch_size     = BATCH_SIZE,
    learning_rate  = LEARNING_RATE,
    n_epochs       = N_EPOCHS,
    val_split      = 0.2,
    patience       = 10,
    early_stopping = True,
    random_state   = RANDOM_STATE,
    device         = DEVICE,
)
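
To make the `patience` setting concrete, here is a generic patience rule applied to a list of validation losses. This is an illustrative sketch only. Pingkit's internal early-stopping logic may differ, and `early_stopping_epoch` is a hypothetical name.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Generic patience rule written for illustration: training stops once the
    loss has gone `patience` consecutive epochs without beating its best value.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Ran out of epochs without triggering the rule.
    return len(val_losses) - 1, best_epoch
```

With `patience=3`, the losses `[1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]` reach their best value at epoch 2 and stop at epoch 5, after three epochs without improvement.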


## 5. Save trained artifacts
To make future inference trivial, we persist both the network weights **and** the metadata that tells Pingkit exactly how to reconstruct the architecture.

In [None]:

weights_path, meta_path = save_artifacts(
    model,
    path=str(ARTIFACT_ROOT),
    meta=meta,
)
print("Model weights saved to :", weights_path)
print("Metadata saved to      :", meta_path)


## 6. Evaluate on the test split
We reload the saved model (just to prove that serialization works) and compute accuracy and ROC‑AUC on the held‑out set.

In [None]:

model, _ = load_artifacts(str(ARTIFACT_ROOT), device=DEVICE)

device    = next(model.parameters()).device
X_test_np = X_test_df.values.astype(np.float32)

probs, _, _ = _evaluate(
    model,
    X_test_np,
    y_test.values,
    model_type = MODEL_TYPE,
    metric_fn  = lambda y, p: accuracy_score(y, p.argmax(1)),
    device     = device,
)

pred_labels = probs.argmax(1)
acc = accuracy_score(y_test.values, pred_labels)

# Binary vs. multiclass AUC handling
if probs.shape[1] == 2:
    auc = roc_auc_score(y_test.values, probs[:,1])
else:
    auc = roc_auc_score(y_test.values, probs, multi_class='ovr', average='macro')

print(f"Accuracy: {acc:.4f}")
print(f"AUC     : {auc:.4f}")
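
For intuition about what the binary AUC measures, it equals the fraction of (positive, negative) pairs in which the positive example receives the higher score. The pure-Python sketch below is for illustration only, and `pairwise_auc` is a hypothetical name. In practice `roc_auc_score` is the right tool.

```python
def pairwise_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. The O(n_pos * n_neg) loop is meant
    for intuition on small inputs, not for real evaluation.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four pairs are ordered correctly, giving an AUC of 0.75.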


## 7. Calibration curve (optional)
If we're curious about how well‑calibrated the model's probabilities are, we can draw a quick reliability diagram.

In [None]:

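# calibration_curve only accepts binary targets: use the positive class for
# binary probes, and top-1 confidence vs. correctness for multiclass ones
# (assumes labels are class indices, as in the accuracy computation above).
is_binary = probs.shape[1] == 2
y_cal     = y_test.values if is_binary else (probs.argmax(1) == y_test.values).astype(int)
prob_pos  = probs[:, 1] if is_binary else probs.max(1)
frac_pos, mean_pred = calibration_curve(y_cal, prob_pos, n_bins=10)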

plt.figure(figsize=(6,6))
plt.plot(mean_pred, frac_pos, marker='o', linewidth=1.5, label='Model')
plt.plot([0,1],[0,1], linestyle='--', label='Perfect')
plt.xlabel('Mean predicted probability')
plt.ylabel('Fraction of positives')
plt.title('Calibration curve')
plt.legend()
plt.grid(True)
plt.show()
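
If a single number is more convenient than a plot, the reliability diagram can be summarized as an expected calibration error (ECE). The sketch below covers the binary case with equal-width bins. `expected_calibration_error` is a hypothetical helper, and its bin-edge handling is an assumption that may differ slightly from `calibration_curve`.

```python
import numpy as np


def expected_calibration_error(y_true, prob_pos, n_bins=10):
    """Binary ECE over equal-width bins on [0, 1].

    Each non-empty bin contributes |observed positive rate - mean predicted
    probability|, weighted by the share of samples that fall in the bin.
    """
    y_true = np.asarray(y_true, dtype=float)
    prob_pos = np.asarray(prob_pos, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each probability to a bin id in 0..n_bins-1.
    bin_ids = np.clip(np.digitize(prob_pos, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(y_true[mask].mean() - prob_pos[mask].mean())
            ece += mask.mean() * gap
    return float(ece)
```

A value of 0 means that, within every bin, the predicted probabilities match the observed frequencies. Larger values indicate over- or under-confidence.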


That's it—we've **gone from stacked transformer embeddings to a trained, saved, and evaluated probe** in just a few steps. Try experimenting with different layers, embedding parts, or model architectures by tweaking the parameters at the top—Pingkit will handle the rest. Happy probing!