# Lab 6 - Bayesian Knowledge Tracing (BKT) and Variants 

This tutorial is partially based on the pyBKT model tutorial and the Jupyter notebooks available on GitHub at [https://github.com/CAHLR/pyBKT](https://github.com/CAHLR/pyBKT). 

One notable application of machine learning in education is represented by **knowledge inference models**, which aim to understand how well a student is learning concepts or skills. Being able to monitor this knowledge makes it possible to improve and personalize online learning platforms or intelligent tutoring systems, by focusing on areas where the student is weak and accelerating the learning of certain concepts.

In this tutorial, we study a range of popular models for modelling students' knowledge based on **Bayesian Knowledge Tracing (BKT)**. BKT was introduced in 1995 as a means to model students' knowledge as a **latent variable** in online learning environments. Specifically, the environment maintains an estimate of the **probability that the student has learned a set of skills**, using a model that is statistically equivalent to a 2-node dynamic Bayesian network. 

For this tutorial, we will rely on **pyBKT**, a Python implementation of the Bayesian Knowledge Tracing algorithm and more recent variants, which estimates student cognitive mastery from problem-solving sequences. This package can be used to define and fit many BKT variants. 

These variants are derived from a range of papers published in the educational data mining literature and, in this tutorial, we will provide you with the main notions and implementation details needed to investigate BKT models in practice.  

**Expected Tasks**

- Follow the pyBKT getting started showcase.
- Solve a range of exercises on BKT models. 

**Learning Objectives**

- Instantiate and run a pipeline on BKT models. 
- Conduct fine-grained analyses on specific learning skills. 
- Understand and experiment with different variants of BKT.
- Compare the performance of BKT setups under different evaluation methods. 
- Inspect the influence of a BKT variant on the internal BKT parameters.

More information on pyBKT is provided in the corresponding [GitHub repository](https://github.com/CAHLR/pyBKT). 

In [2]:
# Traditional packages
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
import math

%matplotlib inline

## Introduction
---

BKT models operationalize the learning of a student as a **Markov process**, building upon the idea that, while students interact with an educational environment, their skill in a given concept improves. To move the theory behind BKT into practice, variables related to forgetting, learning, guessing, slipping, and so on need to be modelled, controlling for instance how fast and how well learning is happening for the student. 

The BKT model assumes that the student’s knowledge can be estimated by means of standardized questions, which can be answered correctly or incorrectly, on a concept or combination of concepts. BKT also assumes that initially a student may not know about a concept, but their knowledge gets better with learning and practice related to that concept. The model is described by the following parameters:

- $P_0$ is the initial probability of mastering that concept (skill). 
- $P_{\text{F}}$ is the probability that the student forgot something previously learned on the concept (skill). 
- $P_{\text{L}}$ is the probability that the student has learned something that was previously not known on the concept (skill). 
- $P_{\text{S}}$ is the probability that the student gave a wrong answer even though they had learned the concept (skill).
- $P_{\text{G}}$ is the probability that the student guessed the right answer while not knowing the concept (skill). 

In this tutorial, we will use a dataset of the student’s responses to questions in a test, along with whether they answered correctly or incorrectly, and we will use a BKT model to find the values of the above probabilities.
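
To make the roles of these five parameters concrete, the sketch below implements the standard BKT update equations in plain Python. It is a minimal illustration, not pyBKT's internal code: the function names `predict_correct` and `update_knowledge` and the parameter values in the usage example are hypothetical.

```python
def predict_correct(p_know, guess, slip):
    """Probability of a correct answer given the current mastery estimate."""
    return p_know * (1 - slip) + (1 - p_know) * guess


def update_knowledge(p_know, correct, learn, guess, slip, forget=0.0):
    """One BKT step: condition on the observed answer, then apply learning/forgetting.

    Assumes guess and slip lie strictly between 0 and 1, so the denominators are non-zero.
    """
    if correct:
        numerator = p_know * (1 - slip)
        denominator = numerator + (1 - p_know) * guess
    else:
        numerator = p_know * slip
        denominator = numerator + (1 - p_know) * (1 - guess)
    posterior = numerator / denominator
    # Transition: a known skill may be forgotten, an unknown skill may be learned.
    return posterior * (1 - forget) + (1 - posterior) * learn


# Illustrative (made-up) parameter values and a short response sequence.
p_know = 0.4  # plays the role of P_0
for answer in [0, 1, 0, 1, 1]:
    p_correct = predict_correct(p_know, guess=0.2, slip=0.1)
    p_know = update_knowledge(p_know, answer, learn=0.3, guess=0.2, slip=0.1)
    print(f"answer={answer}  P(correct)={p_correct:.3f}  P(known after)={p_know:.3f}")
```

A correct answer pushes the mastery estimate up and an incorrect one pushes it down, while $P_{\text{L}}$ and $P_{\text{F}}$ control how the estimate drifts between questions.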

## The ASSISTments data set
---

ASSISTments is a free tool for assigning and assessing math problems and homework. Teachers can select and assign problem sets. Once they get an assignment, students can complete it at their own pace and with the help of hints, multiple chances, and immediate feedback. Teachers get instant results broken down by individual student or for the whole class. Please, find more information on the platform [here](https://www.commonsense.org/education/website/assistments). 

In this tutorial, we will play with a simplified version of a dataset collected from the ASSISTments tool, saved in a CSV file with the following columns:  

- user_id: The ID of the student doing the problem.
- template_id: The ID of the template in ASSISTment (assistments with the same template ID have similar questions).
- assistment_id: The ID of the ASSISTment (an assistment consists of one or more problems).
- order_id: These IDs are chronological and refer to the id of the original problem log.
- problem_id: The ID of the problem.
- skill_name: Skill name associated with the problem.
- correct: 1 if correct on the first attempt, 0 if incorrect on the first attempt or asked for help.
- ms_first_response: The time in milliseconds for the student's first response.
- attempt_count: Number of student attempts on this problem.
- hint_count: Number of hints requested by the student on this problem.
- hint_total: Number of possible hints to be asked on this problem.

In [3]:
DATA_DIR = "./../../data/"
as_data = pd.read_csv(DATA_DIR + 'as_supersmall.csv', encoding='latin', low_memory=False)

In [4]:
as_data.head(10)

| | user_id | template_id | assistment_id | order_id | problem_id | skill_name | correct | ms_first_response | attempt_count | hint_count | hint_total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 70733 | 30060 | 33175 | 35278766 | 51460 | Box and Whisker | 0 | 9575 | 2 | 0 | 4 |
| 1 | 70733 | 30060 | 33182 | 35278780 | 51467 | Box and Whisker | 1 | 6422 | 1 | 0 | 4 |
| 2 | 70733 | 30059 | 33107 | 35278789 | 51392 | Box and Whisker | 0 | 11365 | 3 | 0 | 3 |
| 3 | 70733 | 30060 | 33187 | 35278802 | 51472 | Box and Whisker | 1 | 4412 | 1 | 0 | 4 |
| 4 | 70733 | 30059 | 33111 | 35278810 | 51396 | Box and Whisker | 1 | 6902 | 1 | 0 | 3 |
| 5 | 70872 | 30059 | 33136 | 32268742 | 51421 | Box and Whisker | 1 | 7281 | 1 | 0 | 3 |
| 6 | 70872 | 30799 | 33144 | 32268764 | 51429 | Box and Whisker | 1 | 7234 | 1 | 0 | 3 |
| 7 | 72059 | 30799 | 33155 | 33409110 | 51440 | Box and Whisker | 0 | 38290 | 2 | 0 | 3 |
| 8 | 72059 | 30060 | 33181 | 33409165 | 51466 | Box and Whisker | 0 | 8366 | 4 | 0 | 4 |
| 9 | 72059 | 30060 | 33168 | 33409366 | 51453 | Box and Whisker | 1 | 9661 | 1 | 0 | 4 |


Before delving into the pyBKT description and showcase, we invite you to spend some time exploring the toy dataset presented in this tutorial, e.g., counting how many students, problems, and skills are included, or examining the skills in more detail. You can add one or more cells here to perform your exploration.    
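
As a possible starting point for this exploration, the sketch below counts students, problems, and skills and computes a few per-skill statistics with pandas. The helper name `summarize` is hypothetical, and the small `toy` frame is only a stand-in built from rows of the preview above; in the notebook you would pass `as_data` instead.

```python
import pandas as pd


def summarize(df):
    """Return overall counts and per-skill statistics for an ASSISTments-style frame."""
    counts = {
        "n_students": df["user_id"].nunique(),
        "n_problems": df["problem_id"].nunique(),
        "n_skills": df["skill_name"].nunique(),
    }
    per_skill = df.groupby("skill_name").agg(
        n_students=("user_id", "nunique"),
        n_responses=("correct", "size"),
        mean_correct=("correct", "mean"),
    )
    return counts, per_skill


# Stand-in data copied from rows 0, 1, 2, 5 and 7 of the preview above.
toy = pd.DataFrame({
    "user_id": [70733, 70733, 70733, 70872, 72059],
    "problem_id": [51460, 51467, 51392, 51421, 51440],
    "skill_name": ["Box and Whisker"] * 5,
    "correct": [0, 1, 0, 1, 0],
})

counts, per_skill = summarize(toy)
print(counts)
print(per_skill)
```

Sorting `per_skill` by `n_responses` is one way to spot skills with too few responses to fit reliably.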

## The pyBKT Package
---

In this tutorial, we use the pyBKT package, a Python implementation of the Bayesian Knowledge Tracing algorithm and variants, estimating student cognitive mastery from problem solving sequences. We can import the core class provided by the package, that is Model.

In [5]:
from pyBKT.models import Model

The first step is to construct a BKT model. To be instantiated, a BKT model requires a series of parameters, whose default values and meanings are provided below (e.g., num_fits, seed, defaults, and any model variant(s) that may be used). Each parameter can also be modified at fit/cross-validation time.

- **Default generic parameters**: 
    - num_fits (5) is the number of initialization fits used for the BKT model.
    - defaults (None) is a dictionary that can be used to pass values different from the default ones during initialization.
    - parallel (True) indicates whether the computation will use multi-threading.
    - skills ('.\*') is a regular expression used to indicate the skills the BKT model will be run on.  
    - seed (random.randint(0, 1e8)) is a seed that can be set to enable reproducible experiments. 
    - folds (5) is the number of folds used in case of cross-validation.
    - forgets (False) indicates whether the model will consider that the student may forget something previously learned on the concept. 
    
- **Default additional parameters**:
    - order_id ('order_id') is the name of the CSV column for the chronological IDs that refer to the original problem log. 
    - skill_name ('skill_name') is the name of the CSV column for the skill name associated with the problem.
    - correct ('correct') is the name of the CSV column for the correct / incorrect label on the first attempt.
    - user_id ('user_id') is the name of the CSV column for the ID of the student doing the problem. 
    - multilearn ('template_id') is the name of the CSV column corresponding to the desired learn (and forget) classes. 
    - multiprior ('correct') is the name of the CSV column for mapping multi-prior knowledge.  
    - multigs ('template_id') is the name of the CSV column corresponding to the desired guess/slip classes. 

- **Initializers for learnable parameters**: 
    - 'prior' (None, no initialization) is the initial probability of mastering the concept.
    - 'learns' (None, no initialization) is the probability that the student has learned something that was previously not known.
    - 'guesses' (None, no initialization) is the probability that the student guessed the right answer while not knowing the concept. 
    - 'slips' (None, no initialization) is the probability that the student gave a wrong answer even though they had learned the concept.
    - 'forgets' (None, no initialization) is the probability that the student forgot something previously learned.
    
If you have doubts about the meaning of certain parameters, please ask the TAs or move on to the next examples (they will help you understand). 

In [6]:
model = Model(seed=0)
model

Model(parallel=True, num_fits=5, seed=0, defaults=None)

The Model class is inspired by scikit-learn and, therefore, provides a range of methods a model can be called with:
- The **fit** method fits a BKT model given model and data information. Takes arguments skills, number of initialization fits, default column names (i.e. correct, skill_name), parallelization, and model types.
- The **predict** method predicts using the trained BKT model and test data information. Takes test data path or DataFrame as arguments. Returns a dictionary mapping skills to predicted values for those skills. Note that the predicted values are a tuple of (correct_predictions, state_predictions).
- The **evaluate** method evaluates a BKT model given model and data information. Takes a metric and data path or DataFrame as arguments. Returns the value of the metric for the given trained model tested on the given data.
- The **crossvalidate** method crossvalidates (trains and evaluates) the BKT model. Takes the data, metric, and any arguments that would be passed to the fit function (skills, number of initialization fits, default column names, parallelization, and model types) as arguments. 

We will show a range of examples for each of the above methods. 

### Fitting and evaluating a model

In [7]:
model = Model(seed=0)
%time model.fit(data=as_data, skills='Box and Whisker') 
%time model.evaluate(data=as_data, metric='auc') 

CPU times: user 383 ms, sys: 2.1 ms, total: 385 ms
Wall time: 138 ms
CPU times: user 135 ms, sys: 5.58 ms, total: 141 ms
Wall time: 139 ms


0.5729593444689567

First, we fitted a BKT model on the 'Box and Whisker' skill and then evaluated the corresponding **training AUC** (about 0.57). Note that we ran the BKT fitting process on the full dataset, to understand how well the BKT model can fit the data. Evaluation methods like cross-validation will be presented later in this notebook. Furthermore, the default metric is RMSE, but pyBKT supports AUC ('auc'), RMSE ('rmse'), and accuracy ('accuracy') as metrics. We will also see how to add other metrics. 
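
To clarify what these three metrics measure, here is a minimal plain-Python sketch of their textbook definitions. It is not pyBKT's implementation; in particular, the 0.5 threshold used for accuracy is an assumption, and AUC is computed by brute-force pairwise comparison, which is only practical for small inputs.

```python
import math


def rmse(true_vals, pred_vals):
    """Root mean squared error between labels (0/1) and predicted probabilities."""
    squared = [(t - p) ** 2 for t, p in zip(true_vals, pred_vals)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(true_vals, pred_vals, threshold=0.5):
    """Share of responses whose thresholded prediction matches the label (threshold assumed)."""
    hits = sum(int(p >= threshold) == t for t, p in zip(true_vals, pred_vals))
    return hits / len(true_vals)


def auc(true_vals, pred_vals):
    """Probability that a random correct response is scored above a random incorrect one."""
    pos = [p for t, p in zip(true_vals, pred_vals) if t == 1]
    neg = [p for t, p in zip(true_vals, pred_vals) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Tiny made-up example: five responses and their predicted probabilities.
labels = [0, 1, 0, 1, 1]
scores = [0.3, 0.6, 0.7, 0.8, 0.9]
print(rmse(labels, scores), accuracy(labels, scores), auc(labels, scores))
```

RMSE and accuracy depend on how well calibrated the probabilities are, whereas AUC only depends on how the predictions are ranked, so the three metrics can disagree about which model is better.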

For each skill, you can get the learned parameters for 'prior', 'learns', 'guesses', 'slips', and 'forgets'. Specifically:
- **prior** ($P_0$): the prior probability of "knowing".
- **forgets** ($P_{\text{F}}$): the probability of transitioning to the "not knowing" state given "known".
- **learns** ($P_{\text{L}}$): the probability of transitioning to the "knowing" state given "not known".
- **slips** ($P_{\text{S}}$): the probability of picking an incorrect answer, given "knowing" state.
- **guesses** ($P_{\text{G}}$): the probability of guessing correctly, given "not knowing" state.

In [8]:
model.coef_

{'Box and Whisker': {'prior': 0.787244637189556,
  'learns': array([0.31609322]),
  'guesses': array([0.22952839]),
  'slips': array([0.21404466]),
  'forgets': array([0.])}}

In [9]:
model.params()

| skill | param | class | value |
|---|---|---|---|
| Box and Whisker | prior | default | 0.78724 |
| Box and Whisker | learns | default | 0.31609 |
| Box and Whisker | guesses | default | 0.22953 |
| Box and Whisker | slips | default | 0.21404 |
| Box and Whisker | forgets | default | 0.0 |


We could initialize the prior knowledge to $10^{-40}$ (1e-40) for Box and Whisker, before fitting the model. 

In [10]:
model = Model(seed=0)

model.coef_ = {'Box and Whisker': {'prior': 1e-40}}
model.coef_

{'Box and Whisker': {'prior': 1e-40}}

Then, we can fit the model and observe the resulting AUC score. How does it compare to the AUC score of the previous model? 

In [11]:
%time model.fit(data=as_data, skills='Box and Whisker') 
%time model.evaluate(data=as_data, metric='auc') 

CPU times: user 1.26 s, sys: 11.6 ms, total: 1.28 s
Wall time: 652 ms
CPU times: user 80.1 ms, sys: 10.6 ms, total: 90.7 ms
Wall time: 50.8 ms


0.5459607101586301

In [12]:
model.params()

| skill | param | class | value |
|---|---|---|---|
| Box and Whisker | prior | default | 0.0 |
| Box and Whisker | learns | default | 0.10311 |
| Box and Whisker | guesses | default | 0.69463 |
| Box and Whisker | slips | default | 0.16084 |
| Box and Whisker | forgets | default | 0.0 |


You can also train simple BKT models on different skills in the data set.

In [13]:
model = Model(seed=0)
%time model.fit(data=as_data, skills=['Box and Whisker', 'Scatter Plot']) 
%time model.evaluate(data=as_data, metric='auc') 

CPU times: user 1.23 s, sys: 0 ns, total: 1.23 s
Wall time: 603 ms
CPU times: user 139 ms, sys: 6.05 ms, total: 145 ms
Wall time: 96.7 ms


0.6618193428538256

Then, we can observe the learned parameters for each skill. Note that, when multiple skills are passed to fit, the method will run a fitting procedure for each skill separately (in this case, we will have two BKT models). 

In [14]:
model.params()

| skill | param | class | value |
|---|---|---|---|
| Box and Whisker | prior | default | 0.95817 |
| Box and Whisker | learns | default | 0.02263 |
| Box and Whisker | guesses | default | 0.02986 |
| Box and Whisker | slips | default | 0.23316 |
| Box and Whisker | forgets | default | 0.0 |
| Scatter Plot | prior | default | 0.31026 |
| Scatter Plot | learns | default | 0.49269 |
| Scatter Plot | guesses | default | 0.72363 |
| Scatter Plot | slips | default | 0.01169 |
| Scatter Plot | forgets | default | 0.0 |


You can also enable forgetting, by setting the corresponding parameter in the fit method. 

In [15]:
model = Model(seed=0)
%time model.fit(data=as_data, skills='Box and Whisker', forgets=True) 
%time model.evaluate(data=as_data, metric='auc') 

CPU times: user 136 ms, sys: 3.59 ms, total: 139 ms
Wall time: 51.2 ms
CPU times: user 103 ms, sys: 4.31 ms, total: 107 ms
Wall time: 66.2 ms


0.5631894106523794

In [16]:
model.params()

| skill | param | class | value |
|---|---|---|---|
| Box and Whisker | prior | default | 0.60015 |
| Box and Whisker | learns | default | 0.5611 |
| Box and Whisker | guesses | default | 0.32433 |
| Box and Whisker | slips | default | 0.05659 |
| Box and Whisker | forgets | default | 0.24059 |


We can also train a multiguess/slip BKT model on the same skill. The **multigs** model fits a different guess/slip rate for each class. Note that, with *multigs=True*, the guess and slip classes are specified by the *template_id*. You can specify a custom column mapping by passing *multigs='column_name'*.

In [17]:
model = Model(seed=0)
%time model.fit(data=as_data, skills=['Box and Whisker'], multigs=True) 
%time model.evaluate(data=as_data, metric='auc') 

CPU times: user 791 ms, sys: 0 ns, total: 791 ms
Wall time: 391 ms
CPU times: user 129 ms, sys: 0 ns, total: 129 ms
Wall time: 47.7 ms


0.6890429666981825

Finally, we show the BKT parameters. With *multigs=True*, the guess and slip classes are given by default by the template_id column. Note that assistments with the same template ID have similar questions. What can you observe by looking at the different learned guesses and slips values below?

In [18]:
model.params()

| skill | param | class | value |
|---|---|---|---|
| Box and Whisker | prior | default | 0.10722 |
| Box and Whisker | learns | default | 0.03558 |
| Box and Whisker | guesses | 30059 | 0.78007 |
| Box and Whisker | guesses | 30060 | 0.61374 |
| Box and Whisker | guesses | 30799 | 0.74621 |
| Box and Whisker | guesses | 63446 | 0.0 |
| Box and Whisker | guesses | 63447 | 0.16668 |
| Box and Whisker | guesses | 63448 | 1.0 |
| Box and Whisker | slips | 30059 | 0.00275 |
| Box and Whisker | slips | 30060 | 0.04758 |


The **multilearn** model fits a different learn rate (and forget rate, if enabled) for each class specified. Note that, with multilearn=True, the learn classes are specified by the *template_id*. You can specify a custom column mapping by passing *multilearn='column_name'*.

In [19]:
model = Model(seed=0)
%time model.fit(data=as_data, skills=['Box and Whisker'], multilearn=True) 
%time model.evaluate(data=as_data, metric='auc') 

CPU times: user 375 ms, sys: 4.82 ms, total: 380 ms
Wall time: 185 ms
CPU times: user 98.3 ms, sys: 0 ns, total: 98.3 ms
Wall time: 46.4 ms


0.56733900619813

Looking at the parameters, we observe a 'learns' score for each template_id (the class column in the params dataframe). In this case, what can you observe by looking at the different learns values below?    

In [20]:
model.params()

| skill | param | class | value |
|---|---|---|---|
| Box and Whisker | prior | default | 0.73549 |
| Box and Whisker | learns | 30059 | 0.19355 |
| Box and Whisker | learns | 30060 | 0.33534 |
| Box and Whisker | learns | 30799 | 0.2808 |
| Box and Whisker | learns | 63446 | 0.38742 |
| Box and Whisker | learns | 63447 | 0.45819 |
| Box and Whisker | learns | 63448 | 0.26747 |
| Box and Whisker | guesses | default | 0.3345 |
| Box and Whisker | slips | default | 0.20792 |
| Box and Whisker | forgets | 30059 | 0.0 |


You can also combine multiple variants, and use a different column to specify the different learn and forget classes. In this case, we use user_id, assuming that we are interested in learning the parameters for each student, and we also enable forgetting. 

In [21]:
model = Model(seed=0)
%time model.fit(data=as_data, skills=['Box and Whisker'], forgets=True, multilearn='user_id') 
%time model.evaluate(data=as_data, metric='auc') 

CPU times: user 140 ms, sys: 0 ns, total: 140 ms
Wall time: 53.2 ms
CPU times: user 125 ms, sys: 6.28 ms, total: 131 ms
Wall time: 120 ms


0.7211892005462759

Once we run a BKT model with *forgets=True* and *multilearn='user_id'*, we will observe individual scores for each student, as shown below. 

In [22]:
model.params()

| skill | param | class | value |
|---|---|---|---|
| Box and Whisker | prior | default | 0.89402 |
| Box and Whisker | learns | 70733 | 0.9977 |
| Box and Whisker | learns | 70872 | 0.99827 |
| Box and Whisker | learns | 72059 | 0.87034 |
| Box and Whisker | learns | 79748 | 0.99827 |
| Box and Whisker | learns | 79750 | 0.99833 |
| Box and Whisker | learns | 79769 | 0.9977 |
| Box and Whisker | learns | 81641 | 0.90726 |
| Box and Whisker | learns | 82482 | 0.99699 |
| Box and Whisker | learns | 82533 | 0.99833 |


The best performing models are typically those that combine several useful variants, such as the multilearn and multiguess/slip class variants. After this lab session, you might be interested in testing other skills to see whether this observation holds for them as well.
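
Since multilearn and multigs accept a custom column name, one way to experiment with your own classes is to derive a new categorical column before fitting. The sketch below is only an illustration: the helper `add_hint_class`, the column name `hint_class`, and the cut-off of 3 hints are all hypothetical choices, and the small `toy` frame stands in for `as_data`.

```python
import numpy as np
import pandas as pd


def add_hint_class(df, max_few=3):
    """Return a copy of df with a hypothetical 'hint_class' column derived from hint_total."""
    out = df.copy()
    out["hint_class"] = np.where(
        out["hint_total"] <= max_few, "few hints available", "many hints available"
    )
    return out


# Stand-in data using hint_total values from the dataset preview.
toy = pd.DataFrame({"hint_total": [4, 4, 3, 4, 3]})
toy = add_hint_class(toy)
print(toy["hint_class"].value_counts())

# In the notebook, the derived column could then be passed by name, mirroring
# the multilearn='user_id' call shown earlier:
# model.fit(data=add_hint_class(as_data), skills='Box and Whisker', multigs='hint_class')
```

Classes built from properties of the problem (such as the number of available hints) avoid leaking information about the student's answer into the class label.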

### Make predictions

As we said, the predict method can be executed on the trained BKT model. Its output contains, for each response, the correct_predictions (a score between 0 and 1 that measures the extent to which the model thinks that the student will answer that question correctly) and the state_predictions (a score between 0 and 1 that measures the extent to which the student has mastered that skill, after that question).
        
        
Note that, in the example below, we have run the BKT fitting process on the full dataset, to understand how well the BKT model can fit the data. Evaluation methods like cross-validation will be presented shortly in this notebook.

In [23]:
model = Model(seed=0)
%time model.fit(data=as_data, skills='Box and Whisker') 
%time model.evaluate(data=as_data, metric='auc') 
%time preds = model.predict(data=as_data)

CPU times: user 113 ms, sys: 6.94 ms, total: 120 ms
Wall time: 51.8 ms
CPU times: user 126 ms, sys: 6.77 ms, total: 132 ms
Wall time: 104 ms
CPU times: user 64.9 ms, sys: 2.48 ms, total: 67.4 ms
Wall time: 46 ms


In [24]:
preds[preds['skill_name']=='Box and Whisker'][['user_id', 'correct', 'correct_predictions', 'state_predictions']]

| | user_id | correct | correct_predictions | state_predictions |
|---|---|---|---|---|
| 0 | 70733 | 0 | 0.64188 | 0.67961 |
| 1 | 70733 | 1 | 0.60758 | 0.60714 |
| 2 | 70733 | 0 | 0.73022 | 0.86625 |
| 3 | 70733 | 1 | 0.69049 | 0.78230 |
| 4 | 70733 | 1 | 0.76266 | 0.93478 |
| ... | ... | ... | ... | ... |
| 219 | 96296 | 1 | 0.79353 | 1.00000 |
| 220 | 96296 | 1 | 0.79353 | 1.00000 |
| 221 | 96296 | 1 | 0.79353 | 1.00000 |
| 222 | 96296 | 1 | 0.79353 | 1.00000 |
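
One way to use the state predictions is to summarize, for each student, the last mastery estimate and compare it with a mastery threshold. The sketch below assumes that rows are in chronological order within each student and uses a threshold of 0.95, which is an assumption rather than a pyBKT setting; the helper name `final_mastery` is hypothetical and `toy_preds` is a stand-in built from values in the table above.

```python
import pandas as pd

MASTERY_THRESHOLD = 0.95  # assumed cut-off, not defined by pyBKT


def final_mastery(preds, threshold=MASTERY_THRESHOLD):
    """Last state prediction per student and whether it reaches the threshold."""
    last = preds.groupby("user_id", sort=False)["state_predictions"].last()
    return pd.DataFrame({"final_state": last, "mastered": last >= threshold})


# Stand-in for the predictions dataframe, using values shown in the table above.
toy_preds = pd.DataFrame({
    "user_id": [70733] * 5 + [96296] * 2,
    "state_predictions": [0.67961, 0.60714, 0.86625, 0.78230, 0.93478, 1.0, 1.0],
})

summary = final_mastery(toy_preds)
print(summary)
```

In the notebook, the same helper could be applied to the rows of `preds` belonging to a single skill.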


Note that, if the BKT model is asked to predict on skills not included in the training set, the output predictions for those skills will be a best-effort guess of 0.5 for both the correct and state predictions.

### Extend the evaluation

The pyBKT package also makes it possible to extend the range of metrics you can compute while evaluating a BKT model. To this end, you need to define a custom function that, given true_vals (true values for the correct target) and pred_vals (the predicted values for the correct target), computes and returns the score corresponding to the desired metric.  

In [25]:
def mae(true_vals, pred_vals):
    return np.mean(np.abs(true_vals - pred_vals))

%time model.evaluate(data=as_data, metric=mae)

CPU times: user 175 ms, sys: 13.5 ms, total: 188 ms
Wall time: 59.2 ms


0.3706428847340067
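
Other metrics can be written with the same (true_vals, pred_vals) signature. As a further illustration, the sketch below defines a binary log loss (cross-entropy) with NumPy; the clipping constant `eps` is an assumption added to avoid taking the logarithm of zero.

```python
import numpy as np


def log_loss(true_vals, pred_vals, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    true_vals = np.asarray(true_vals, dtype=float)
    # Clip predictions away from 0 and 1 so that the logarithms stay finite.
    pred_vals = np.clip(np.asarray(pred_vals, dtype=float), eps, 1 - eps)
    losses = true_vals * np.log(pred_vals) + (1 - true_vals) * np.log(1 - pred_vals)
    return float(-np.mean(losses))


# Quick check on made-up values: confident wrong predictions are penalized heavily.
print(log_loss([1, 0], [0.9, 0.1]))
print(log_loss([1, 0], [0.1, 0.9]))
```

It could then be passed to evaluate in the same way as mae above, i.e. `model.evaluate(data=as_data, metric=log_loss)`.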

### Perform cross validation

Finally, the pyBKT package also offers a cross-validation method. You can specify the number of folds, a seed, and a metric (one of the three default ones, namely 'rmse', 'auc' or 'accuracy', or a custom Python function as we have seen above). Furthermore, similarly to the fit method, crossvalidate accepts arguments for selecting a BKT variant and for defining the data path/data and skill names.  

In [26]:
model = Model(seed=0)
%time model.crossvalidate(data=as_data, skills='Box and Whisker', folds=5, metric='auc')

CPU times: user 1.78 s, sys: 0 ns, total: 1.78 s
Wall time: 911 ms


| skill | auc |
|---|---|
| Box and Whisker | 0.54387 |


In this showcase, we opted for five folds due to time constraints. In other cases, you need to select an appropriate number of folds based on the data you are dealing with, as discussed in the lectures.  
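
As a complement to cross-validation, a single held-out evaluation can be set up by splitting the data by student, so that no student appears in both the training and the test set. The sketch below is one possible way to do this with NumPy and pandas; the helper name `split_by_user` is hypothetical, and it makes no claim about how pyBKT builds its own folds internally.

```python
import numpy as np
import pandas as pd


def split_by_user(df, user_col="user_id", test_frac=0.2, seed=0):
    """Split df into (train, test) so that each user falls entirely in one part."""
    rng = np.random.default_rng(seed)
    users = df[user_col].unique()
    n_test = max(1, int(round(test_frac * len(users))))
    test_users = rng.choice(users, size=n_test, replace=False)
    in_test = df[user_col].isin(test_users)
    return df[~in_test], df[in_test]


# Made-up stand-in: 10 students with 3 responses each.
toy = pd.DataFrame({
    "user_id": np.repeat(np.arange(10), 3),
    "correct": [0, 1, 1] * 10,
})
train, test = split_by_user(toy, test_frac=0.2, seed=0)
print(train["user_id"].nunique(), test["user_id"].nunique())
```

The training part can then be passed to fit and the test part to evaluate, following the calls shown earlier in this notebook.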

## Exercises
---

Now it's your turn! We ask you to complete the following exercises. In case you do not finish them during the lab session, please feel free to complete them later, at your earliest convenience. The TAs are happy to address any questions or doubts you might have.

Kindly note that the following exercises have the goal of supporting you in getting familiar with the library functions, and may not fully represent the sequences of steps and the design choices made in a real-world or homework scenario. Elements concerning the latter scenarios will be discussed during the session. Furthermore, due to running time constraints, the following exercises will be run in a train-test split or full data set mode, while we leave the opportunity to run them under a cross-validation setting after this lab session.     

In all your models, we ask you to set the *seed* to 0, to let you reproduce the same results across different runs.

Note that the expected running time may vary according to the device or environment. 

#### Question 1 [expected total time for BKT fitting: 2 mins]

- Fit a BKT model with default parameters on the full data set, only for the skill 'Addition and Subtraction Integers'. 
- Compute the correct predictions from the BKT model, by using the predict method of the Model class. 
- Manually calculate the RMSE between the true correct value and the predicted correct value (refer to Slide 51 of Lecture 4 to get the RMSE formula). 
- Compare with the RMSE returned by the evaluate method of the BKT model. 

In [1]:
### EXERCISE CELL ###

#### Question 2 [expected total time for BKT fitting: 7 mins]
- Perform a user-based train-test split of the data, with 20% of the users in the test set.
- Fit the following two BKT model variants on the training set, only for the skill 'Addition and Subtraction Integers': 
    - default;
    - forgets=True.
- Which of the model variants listed above has the highest AUC for 'Addition and Subtraction Integers' on the test set?

In [2]:
### EXERCISE CELL ###

#### Question 3  [expected total time for BKT fitting: 3 mins]
- Bin the values in the ms_first_response column of *as_data* into categories ('less than 10s', 'less than 20s', 'less than 30s','less than 40s', 'less than 50s', 'other'). 
- Fit BKT models with different learn rates, according to the ms_first_response categories above, on the full data set, only for the skill 'Addition and Subtraction Integers'. You need to play with the multilearn parameter of the BKT fit method.
- Create a bar plot to show the $P_{\text{L}}$ (learns) value for each ms_first_response category above. You basically need to play with the dataframe returned by model.params(), to prepare the data to be shown in the plot.
- Does binned response time influence the $P_{\text{L}}$ parameter for the skill 'Addition and Subtraction Integers'? Which bin results in the highest $P_{\text{L}}$ score?

In [3]:
### EXERCISE CELL ###

#### Question 4 [expected total time for BKT fitting: 8 mins]
- Use the same binning of ms_first_response into categories ('less than 10s', 'less than 20s', 'less than 30s','less than 40s', 'less than 50s', 'other').
- Fit a BKT model with template-id multilearn (default), on the full data set, only for the skill 'Addition and Subtraction Integers'.
- Fit a BKT model with binned-response-time-based multilearn, on the full data set, only for the skill 'Addition and Subtraction Integers'.
- Does the binned-response-time-based multilearn improve the AUC of the model compared to the default template_id-based multilearn?

In [4]:
### EXERCISE CELL ###

## Summary
---

In this tutorial, we have seen several important aspects of Bayesian Knowledge Tracing (BKT). We have shown what a typical data set for knowledge tracing looks like. We have illustrated how BKT models can be trained on different skills. We have shown how different variants of BKT can help you improve the goodness of fit of your model. Finally, we have shown some examples of predictions and evaluations, also covering cross-validation. Many of the ideas described in this tutorial can be adapted to other data sets and projects. If you are interested in the implementation details of the different variants, we invite you to explore the codebase stored in the pyBKT GitHub repository. 