# Baseline Model

This notebook implements the baseline model. For each problem, we predict the student's mean performance in the corresponding IU assignments, i.e. the proportion of correct answers. This mean IU performance is already computed for every test case as additional information during the method-specific experiments. We therefore do not recompute it here, but extract it from the results of an experiment that has already been run.

Every problem in a UT assignment receives the same predicted probability. When the predictions are binarized for evaluation, the baseline therefore predicts either only 1 or only 0 for a given UT assignment, depending on whether the predicted value lies above the threshold (0.5; the code below additionally evaluates 0.3 and 0.7).
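
As a minimal sketch of this binarization (the helper name `constant_predictions` is hypothetical and not part of the project's `utils` module), the strict `>` comparison mirrors the one used in the code further down:

```python
def constant_predictions(mean_iu_perf, num_problems, threshold=0.5):
    """Return the binary labels the baseline assigns to one UT assignment.

    All problems share the same predicted probability, so the label list
    is constant: all ones if the probability exceeds the threshold,
    all zeros otherwise.
    """
    label = int(mean_iu_perf > threshold)
    return [label] * num_problems


# Values taken from the first rows of the table shown below.
print(constant_predictions(0.758065, 16))  # sixteen ones
print(constant_predictions(0.444444, 16))  # sixteen zeros
```

Because the comparison is strict, a mean IU performance exactly equal to the threshold yields all-zero predictions.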

Because of this simple structure, we can derive closed-form rules for the evaluation metrics, which is why we compute the evaluation metrics directly instead of the predictions. Details can be found in Subsection 5.1.1 of the report.
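
For orientation, the rules used in the code cells of this notebook can be summarized as follows. This is a sketch read off the code comments, not a substitute for the derivation in the report. The symbols are assumptions chosen to match those comments: $m$ is the mean IU performance, $N_1$ and $N_0$ are the numbers of ones and zeros among the true values, and $N = N_1 + N_0$.

```latex
\begin{aligned}
\text{MAE} &= \frac{(1 - m)\,N_1 + m\,N_0}{N}, &
\text{MSE} &= \frac{(1 - m)^2\,N_1 + m^2\,N_0}{N}, \\[4pt]
\text{all predictions } 1:\quad
\text{accuracy} &= \text{precision} = \frac{N_1}{N}, &
\text{recall} &= 1 \ (\text{if } N_1 > 0), \quad
F_1 = \frac{2 N_1}{2 N_1 + N_0}, \\[4pt]
\text{all predictions } 0:\quad
\text{accuracy} &= \frac{N_0}{N}, &
\text{precision} &= \text{recall} = F_1 = 0 \ (\text{by convention}).
\end{aligned}
```

The all-ones case follows from $TP = N_1$, $FP = N_0$, $FN = TN = 0$; the all-zeros case follows from $TP = FP = 0$, $FN = N_1$, $TN = N_0$.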

In [1]:
%load_ext autoreload
%autoreload 2

import numpy as np
import pandas as pd

import sys
import os
sys.path.append(os.path.abspath('../../sources'))

import config
import utils


In [2]:
df, _ = utils.read_predictions_and_conf(
    "content_based_recommendation", "version1", latest=True
)
df = df.set_index(["class_id", "ut_id", "student_id"])[
    ["y_true", "num_ut_probs", "num_iu_probs", "mean_ut_perf", "mean_iu_perf"]
]
df = utils.convert_str_cols_to_lists(df, ["y_true"])
df.head()

Read file version1_20240804_193147


class_id,ut_id,student_id,y_true,num_ut_probs,num_iu_probs,mean_ut_perf,mean_iu_perf
2JFV80TTBO,CD76U7XEG,1IB0KDMKQM,"[1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1]",16,9,0.4375,0.444444
2JFV80TTBO,CD76U7XEG,1MESTUDVQN,"[1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]",16,62,0.3125,0.758065
2JFV80TTBO,CD76U7XEG,1VUKTJH0DS,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1]",16,61,0.875,0.688525
2JFV80TTBO,CD76U7XEG,9XCM0ERZW,"[0, 0, 0, 1, 1]",5,7,0.4,1.0
2JFV80TTBO,CD76U7XEG,ANLS42FC7,"[1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0]",16,13,0.5,0.615385


In [3]:
# get number of 0 and 1 in true values
df[["num_1", "num_0"]] = df["y_true"].apply(lambda y: (y.count(1), y.count(0))).to_list()

In [4]:
# mae = ((1 - m) * N1 + m * N0) / N
df["mae"] = (
    ((1 - df["mean_iu_perf"]) * df["num_1"] + df["mean_iu_perf"] * df["num_0"])
    / df["num_ut_probs"]
).round(config.ROUND_DECIMALS)

# mse = ((1 - m)^2 * N1 + m^2 * N0) / N
df["mse"] = (
    (
        (1 - df["mean_iu_perf"]) ** 2 * df["num_1"]
        + df["mean_iu_perf"] ** 2 * df["num_0"]
    )
    / df["num_ut_probs"]
).round(config.ROUND_DECIMALS)
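
The closed-form expressions above can be checked against a direct element-wise computation. The following is a minimal, self-contained sketch using the first row of the table shown earlier; the function names are hypothetical, and rounding to four decimals is an assumption based on the displayed output rather than on the actual value of `config.ROUND_DECIMALS`.

```python
def direct_errors(y_true, m):
    """MAE and MSE computed element-wise for the constant prediction m."""
    n = len(y_true)
    mae = sum(abs(y - m) for y in y_true) / n
    mse = sum((y - m) ** 2 for y in y_true) / n
    return mae, mse


def closed_form_errors(y_true, m):
    """MAE and MSE from the counts of ones (N1) and zeros (N0) only."""
    n1 = y_true.count(1)
    n0 = y_true.count(0)
    n = n1 + n0
    mae = ((1 - m) * n1 + m * n0) / n
    mse = ((1 - m) ** 2 * n1 + m ** 2 * n0) / n
    return mae, mse


# First row of the table above: 7 ones, 9 zeros, mean_iu_perf = 0.444444.
y_true = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1]
m = 0.444444

direct = direct_errors(y_true, m)
closed = closed_form_errors(y_true, m)
print(round(closed[0], 4), round(closed[1], 4))
```

Both routes agree up to floating-point error, which is why the notebook can skip building the per-problem predictions.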

In [5]:
# get predictions (either all predictions are 1 or all are 0)
for lim in [0.3, 0.5, 0.7]:
    lim_str = int(lim * 100)
    df[f"y_pred_lim_{lim_str}"] = (df["mean_iu_perf"] > lim).astype(int)
# for the "dyn" limit, the prediction is set to 0 for every test case
df["y_pred_lim_dyn"] = 0

for lim_str in [int(lim * 100) for lim in [0.3, 0.5, 0.7]] + ["dyn"]:
    # initialize columns
    for met in ["acc", "f1", "precision", "recall"]:
        df[f"{met}_lim_{lim_str}"] = 0.0

    # compute metrics if predictions are 1
    mask_1 = df[f"y_pred_lim_{lim_str}"] == 1
    # precision = N1 / (N1 + N0)
    df.loc[mask_1, f"precision_lim_{lim_str}"] = (
        df.loc[mask_1, "num_1"] / df.loc[mask_1, "num_ut_probs"]
    )
    # f1 = (2 * N1) / (2 * N1 + N0)
    df.loc[mask_1, f"f1_lim_{lim_str}"] = (2 * df.loc[mask_1, "num_1"]) / (
        2 * df.loc[mask_1, "num_1"] + df.loc[mask_1, "num_0"]
    )
    # accuracy = precision
    df.loc[mask_1, f"acc_lim_{lim_str}"] = df.loc[mask_1, f"precision_lim_{lim_str}"]
    # recall = N1 / N1 = 1 if N1 > 0 else 0
    df.loc[mask_1 & (df["num_1"] > 0), f"recall_lim_{lim_str}"] = 1

    # compute metrics if predictions are 0
    # (f1, precision and recall keep their initial value 0.0 in this case)
    mask_0 = df[f"y_pred_lim_{lim_str}"] == 0
    # accuracy = N0 / (N1 + N0)
    df.loc[mask_0, f"acc_lim_{lim_str}"] = (
        df.loc[mask_0, "num_0"] / df.loc[mask_0, "num_ut_probs"]
    )

    # round metric values
    for met in ["acc", "f1", "precision", "recall"]:
        df[f"{met}_lim_{lim_str}"] = df[f"{met}_lim_{lim_str}"].round(
            config.ROUND_DECIMALS
        )
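
The shortcut rules in the cell above can be compared with metrics computed from an explicit confusion matrix. The sketch below is self-contained and uses the second row of the table shown earlier (5 ones, 11 zeros). The function names are hypothetical, and setting undefined precision, recall and F1 to 0.0 is an assumption that mirrors the 0.0 initialization in the notebook.

```python
def direct_metrics(y_true, y_pred):
    """Accuracy, F1, precision and recall from explicit confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    acc = (tp + tn) / len(pairs)
    # Undefined ratios are set to 0.0, matching the notebook's defaults.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if 2 * tp + fp + fn else 0.0
    return acc, f1, precision, recall


def shortcut_metrics(n1, n0, label):
    """Same metrics for a constant prediction, using only the counts."""
    n = n1 + n0
    if label == 1:
        recall = 1.0 if n1 > 0 else 0.0
        return n1 / n, 2 * n1 / (2 * n1 + n0), n1 / n, recall
    return n0 / n, 0.0, 0.0, 0.0


# Second row of the table above: 5 ones and 11 zeros.
y_true = [1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
n1, n0 = y_true.count(1), y_true.count(0)

all_ones = shortcut_metrics(n1, n0, 1)
all_zeros = shortcut_metrics(n1, n0, 0)
print(all_ones)
print(all_zeros)
```

For this row the all-ones case gives accuracy and precision 0.3125 and recall 1.0, and the all-zeros case gives accuracy 0.6875, in line with the `lim_70` and `lim_dyn` columns in the output below.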

In [6]:
df.head(3)

class_id,ut_id,student_id,y_true,num_ut_probs,num_iu_probs,mean_ut_perf,mean_iu_perf,num_1,num_0,mae,mse,y_pred_lim_30,...,precision_lim_50,recall_lim_50,acc_lim_70,f1_lim_70,precision_lim_70,recall_lim_70,acc_lim_dyn,f1_lim_dyn,precision_lim_dyn,recall_lim_dyn
2JFV80TTBO,CD76U7XEG,1IB0KDMKQM,"[1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 1]",16,9,0.4375,0.444444,7,9,0.4931,0.2461,1,...,0.0,0.0,0.5625,0.0,0.0,0.0,0.5625,0.0,0.0,0.0
2JFV80TTBO,CD76U7XEG,1MESTUDVQN,"[1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]",16,62,0.3125,0.758065,5,11,0.5968,0.4134,1,...,0.3125,1.0,0.3125,0.4762,0.3125,1.0,0.6875,0.0,0.0,0.0
2JFV80TTBO,CD76U7XEG,1VUKTJH0DS,"[1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1]",16,61,0.875,0.688525,14,2,0.3586,0.1441,1,...,0.875,1.0,0.125,0.0,0.0,0.0,0.125,0.0,0.0,0.0


In [7]:
utils.save_evaluation_df(df, {"folder": "baseline"}, "baseline", save_idx=True)

Saved evaluation df with filename baseline_20240826_001703.csv in folder baseline
