Skip to content
This repository has been archived by the owner on Aug 11, 2024. It is now read-only.
/ KT Public archive

Knowledge Tracing Models with PyTorch

Notifications You must be signed in to change notification settings

seewoo5/KT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

63 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Knowledge Tracing Models

Implementations of various Knowledge Tracing models in PyTorch

IMPORTANT NOTE: This repository is now archived, since I'm not working on the area anymore. Consider to use better alternatives such as pyKT.

Pre-processed Dataset

  • Download Link: https://drive.google.com/file/d/1AOzqCpfxgK271mfYeUEe_sN9Aflszj2u/view?usp=sharing
  • Dataset format: log files are seperated by users. Once you download and unzip each tar.gz file, there's a folder processed, and there are 5 subdirectories named from 1 to 5 for cross validation. Each subdirectory has its own train/val/test separation, where test dataset is shared by all 5 separations. Separation ratio are given in the following table. For each user, {user_id}.csv contains two columns (with headers): tag(skill_id, or question_id, which is a single positive integer) and correctness (0 or 1).
Dataset Train:Val:Test Link
ASSISTments2009 56:14:30 https://sites.google.com/site/assistmentsdata/home/assistment-2009-2010-data/skill-builder-data-2009-2010
ASSISTments2012 6:2:2 https://sites.google.com/site/assistmentsdata/home/2012-13-school-data-with-affect
ASSISTments2015 56:14:30 https://sites.google.com/site/assistmentsdata/home/2015-assistments-skill-builder-data
ASSISTmentsChall 6:2:2 https://sites.google.com/view/assistmentsdatamining
STATICS 56:14:30 https://pslcdatashop.web.cmu.edu/Project?id=48
Junyi Academy 6:2:2 https://pslcdatashop.web.cmu.edu/Project?id=244
KDDCup2010 6:2:2 https://pslcdatashop.web.cmu.edu/KDDCup/
EdNet-KT1 6:2:2 https://github.com/riiid/ednet
  • For ASSISTments2009, ASSISTments2015, and STATICS data we use the same data (with different format) that used in this DKVMN implementation. Also, two files with same name (same user_id) but belong to different subdirectories may not coincides in this case, which actually does not important when we train & test models.
  • Currently, ASSISTments2012 and KDDCup2010 dataset is not pre-processed yet.
  • For EdNet-KT1, the dataset has only one split.

Usage

python main.py --num_workers=8 --gpu=0 --device=cuda --model=DKT --num_epochs=6 
--eval_steps=5000 --train_batch=2048 --test_batch=2048 --seq_size=200 
--input_dim=100 --hidden_dim=100 --name=ASSISTments2009_DKT_dim_100_100 
--dataset_name=ASSISTments2009 --cross_validation=1

Here are descriptions of arguments:

  • name: name of the run. More precisely, the weight of the best model will be saved in the directory weight/{ARGS.name}/.

  • gpu: number(s) of gpu.

  • device: device. cpu, cuda, or others.

  • base_path: the path where datasets are located.

  • num_workers: number of workers for gpu training.

  • dataset_name: the name of the benchmark dataset. Currently, ASSISTments2009, ASSISTments2015, ASSISTmentsChall, STATICS, Junyi, and EdNet-KT1 are available.

  • model: name of the model. DKT, DKVMN, or NPA. (SAKT is not available yet)

  • num_layers: number of LSTM layers, for DKT and NPA. Set to be 1 as a default value.

  • input_dim: input embedding dimension of interactions, for DKT and NPA.

  • hidden_dim: hidden dimension of LSTM models, for DKT and NPA.

  • key_dim: dimension of key vectors of DKVMN.

  • value_dim: dimension of value vectors of DKVMN.

  • summary_dim: dimension of the last FC layer of DKVMN.

  • concept_num: number of latent concepts, for DKVMN.

  • attention_dim: dimension of the attention layer of NPA.

  • fc_dim: largest dimension for the last FC layers of NPA.

  • dropout: dropout rate of the model.

  • random_seed: random seed for initialization, for reproducibility. Set to be 1 as default.

  • num_epochs: number of training epochs.

  • eval_steps: number of steps to evaluate trained model on validation set. The model weight with best performance will be saved.

  • train_batch: batch size while training.

  • test_batch: batch size while testing.

  • lr: learning rate.

  • warmup_step: warmup step for Noam optimizer.

  • seq_size: length of interaction sequence to be feeded into models. The sequence whose length is shorter than seq_size will be padded.

  • cross_validation: if cross_validation is 0, then the model is trained & tested only on the first dataset. If cross_validation is 1, then the model is trained & tested on all 5 splits, and give average results (with standard deviation).

Common features

  • All models are trained with Noam optimizer.
  • For Junyi Academy and EdNet-KT1, the model is trained & tested only on one train/val/test split, since dataset is huge enough and it takes long time for cross-validation.

DKT (Deep Knowledge Tracing)

Dataset ACC (%) AUC (%) Hyper Parameters
ASSISTments2009 77.02 ± 0.07 81.81 ± 0.10 input_dim=100, hidden_dim=100
ASSISTments2015 74.94 ± 0.04 72.94 ± 0.05 input_dim=100, hidden_dim=100
ASSISTmentsChall 68.67 ± 0.09 72.29 ± 0.06 input_dim=100, hidden_dim=100
STATICS 81.27 ± 0.06 82.87 ± 0.10 input_dim=100, hidden_dim=100
Junyi Academy 85.4 80.58 input_dim=100, hidden_dim=100
EdNet-KT1 72.72 76.99 input_dim=100, hidden_dim=100
  • All models are trained with batch size 2048 and sequence size 200.

DKVMN (Dynamic Key-Value Memory Network)

Dataset ACC (%) AUC (%) Hyper Parameters
ASSISTments2009 75.61 ± 0.21 79.56 ± 0.29 key_dim = 50, value_dim = 200, summary_dim = 50, concept_num = 20, batch_size = 1024
ASSISTments2015 74.71 ± 0.02 71.57 ± 0.08 key_dim = 50, value_dim = 100, summary_dim = 50, concept_num = 20, batch_size = 2048
ASSISTmentsChall 67.16 ± 0.05 67.38 ± 0.07 key_dim = 50, value_dim = 100, summary_dim = 50, concept_num = 20, batch_size = 2048
STATICS 80.66 ± 0.09 81.16 ± 0.08 key_dim = 50, value_dim = 100, summary_dim = 50, concept_num = 50, batch_size = 1024
Junyi Academy 85.04 79.68 key_dim = 50, value_dim = 100, summary_dim = 50, concept_num = 50, batch_size = 512
EdNet-KT1 72.32 76.48 key_dim = 100, value_dim = 100, summary_dim = 100, concept_num = 100, batch_size = 256
  • Due to memory issues, not all models are trained with batch size 2048.

NPA (Neural Padagogical Agency)

Dataset ACC (%) AUC (%) Hyper Parameters
ASSISTments2009 77.11 ± 0.08 81.82 ± 0.13 input_dim=100, hidden_dim=100, attention_dim=100, fc_dim=200
ASSISTments2015 75.02 ± 0.05 72.94 ± 0.08 input_dim=100, hidden_dim=100, attention_dim=100, fc_dim=200
ASSISTmentsChall 69.34 ± 0.03 73.26 ± 0.03 input_dim=100, hidden_dim=100, attention_dim=100, fc_dim=200
STATICS 81.38 ± 0.14 83.1 ± 0.25 input_dim=100, hidden_dim=100, attention_dim=100, fc_dim=200
Junyi Academy 85.57 81.10 input_dim=100, hidden_dim=100, attention_dim=100, fc_dim=200
EdNet-KT1 73.05 77.58 input_dim=100, hidden_dim=100, attention_dim=100, fc_dim=200
  • All models are trained with batch size 2048 and sequence size 200.

SAKT (Self-Attentive Knowledge Tracing)

Dataset ACC (%) AUC (%) Hyper Parameters
ASSISTments2009 76.36 ± 0.15 80.78 ± 0.10 hidden_dim=100, seq_size=100, batch_size=512
ASSISTments2015 74.57 ± 0.07 71.49 ± 0.03 hidden_dim=100, seq_size=50, batch_size=512
ASSISTmentsChall 67.53 ± 0.06 69.70 ± 0.32 hidden_dim=100, seq_size=200, batch_size=512
STATICS 80.45 ± 0.13 80.30 ± 0.31 hidden_dim=100, seq_size=500, batch_size=128
Junyi Academy 85.27 80.36 hidden_dim=100, seq_size=200, batch_size=512
EdNet-KT1 72.44 76.60 hidden_dim=200, seq_size=200, batch_size=512

About

Knowledge Tracing Models with PyTorch

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages