In [1]:
# load data

import pandas as pd
import numpy as np
from sklearn.metrics import accuracy_score, log_loss

interactions = pd.read_csv("../data/interactions.csv")
qc_map = pd.read_csv("../data/question_concept_map.csv")

Compute CBM mastery per student per concept

In [2]:
# CBM mastery = proportion of correct probes for that concept
# the interpretable mastery matrix

cbm_mastery = (
    interactions
    .groupby(["student_id", "concept_id"])["correct"]
    .mean()
    .reset_index()
    .rename(columns={"correct": "mastery"})
)

cbm_mastery.head()


|   | student_id | concept_id | mastery  |
|---|------------|------------|----------|
| 0 | S1         | C1         | 0.4      |
| 1 | S1         | C10        | 0.166667 |
| 2 | S1         | C2         | 0.363636 |
| 3 | S1         | C3         | 0.4      |
| 4 | S1         | C4         | 0.083333 |
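
The output above is in long format, with one row per student and concept. If a student-by-concept matrix is wanted, a pivot is one way to get there. The sketch below is only an illustration: `toy_mastery` copies three rows from the output above, and `mastery_matrix` is a name introduced here, not something defined elsewhere in the notebook.

```python
import pandas as pd

# Three rows copied from the cbm_mastery.head() output above (illustration only).
toy_mastery = pd.DataFrame({
    "student_id": ["S1", "S1", "S1"],
    "concept_id": ["C1", "C10", "C2"],
    "mastery": [0.4, 0.166667, 0.363636],
})

# One row per student, one column per concept, mastery as the cell value.
# Missing student-concept pairs would show up as NaN.
mastery_matrix = toy_mastery.pivot(
    index="student_id", columns="concept_id", values="mastery"
)

print(mastery_matrix)
```

On the full data the same call would apply to `cbm_mastery`, assuming each student and concept pair appears only once, which the `groupby` above guarantees.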


Predict correctness using CBM mastery

For each interaction:
    prediction = mastery of the concept
    no temporal modeling
    no hidden state

In [3]:
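# Attach each student's per-concept mastery to every interaction row.
cbm_preds = interactions.merge(
    cbm_mastery,
    on=["student_id", "concept_id"],
    how="left"
)

# The mastery estimate is used directly as the predicted probability of a correct answer.
cbm_preds["pred_prob"] = cbm_preds["mastery"]
# Threshold at 0.5 to get a hard label for the accuracy metric.
cbm_preds["pred_label"] = (cbm_preds["pred_prob"] >= 0.5).astype(int)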


Evaluate CBM

In [4]:
cbm_accuracy = accuracy_score(cbm_preds["correct"], cbm_preds["pred_label"])
cbm_logloss = log_loss(cbm_preds["correct"], cbm_preds["pred_prob"])

cbm_accuracy, cbm_logloss

(0.7321875, 0.5067855534107106)
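
For reference, both metrics can be reproduced by hand. The sketch below uses made-up values for `y_true` and `p` (they are not taken from `interactions.csv`) and computes accuracy at a 0.5 threshold plus the mean binary log loss with natural logarithms. The clipping constant `1e-15` is an assumption chosen here only to guard against `log(0)`.

```python
import numpy as np

# Made-up labels and predicted probabilities, for illustration only.
y_true = np.array([1, 0, 1, 1])
p = np.array([0.8, 0.3, 0.6, 0.4])

# Accuracy: share of thresholded predictions that match the label.
accuracy = np.mean((p >= 0.5).astype(int) == y_true)

# Binary log loss: mean negative log-likelihood of the observed labels.
p_clipped = np.clip(p, 1e-15, 1 - 1e-15)
logloss = -np.mean(
    y_true * np.log(p_clipped) + (1 - y_true) * np.log(1 - p_clipped)
)

print(accuracy, logloss)
```

Accuracy only looks at which side of 0.5 a prediction falls on, while log loss also penalizes confident predictions that turn out wrong, which is why the two numbers can rank models differently.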

Why CBM outperforms BKT and DKT here

CBM computes mastery using all attempts for a student–concept pair. That means:
    it sees the entire history, including the attempt being predicted.
    it is not constrained to predict step by step.
    it benefits from hindsight.

BKT and DKT, by contrast:
    predict online, one step at a time.
    do not get to look ahead.

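So CBM has an information advantage in this evaluation setup: every prediction is scored against an outcome that already contributed to the mastery estimate, so its scores are not directly comparable with those of the online models.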
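
One way to remove the hindsight would be to predict each attempt from earlier attempts only. The following is a minimal sketch of that idea, not the method used in this notebook. It assumes rows are already in chronological order within each student (no timestamp column is shown above), and the add-one smoothing that yields 0.5 when there is no history is an illustrative choice. `toy` and `online_mastery` are names introduced here.

```python
import pandas as pd

# Toy interaction log; rows are assumed to be in chronological order.
toy = pd.DataFrame({
    "student_id": ["S1", "S1", "S1", "S1", "S2", "S2"],
    "concept_id": ["C1", "C1", "C1", "C2", "C1", "C1"],
    "correct":    [0, 1, 1, 1, 1, 0],
})

grp = toy.groupby(["student_id", "concept_id"])["correct"]

# Earlier attempts and earlier successes for the same student-concept pair,
# excluding the current row.
prior_attempts = grp.cumcount()
prior_correct = grp.cumsum() - toy["correct"]

# Smoothed running mastery: (successes + 1) / (attempts + 2).
# With no history this is 0.5; the constants are an assumption.
toy["online_mastery"] = (prior_correct + 1) / (prior_attempts + 2)

print(toy)
```

Scoring `online_mastery` instead of the full-history mean would put CBM on the same footing as BKT and DKT, since each prediction would then use only information available before the attempt.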

Model Evaluation Summary

| Model | Accuracy | Log Loss | Interpretability | Temporal Modeling        |
|-------|----------|----------|------------------|--------------------------|
| CBM   | 0.732    | 0.507    | Very High        | No                       |
| BKT   | 0.645    | 0.657    | High             | Yes (per concept)        |
| DKT   | ~0.600   | ~0.671   | Low              | Yes (sequence-level)     |
