<a href="https://colab.research.google.com/github/ZaynabAttahiru/KnowledgeTracing/blob/main/DKTNotebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deep Knowledge Tracing

Deep Knowledge Tracing (DKT) is an application of the Long Short-Term Memory architecture as a model for tracing student's knowledge as they interact with coursework. In the basic form of DKT, the students' interaction is represented as a tuple of question and answer set `x = {xq,xa}`.

In this notebook, we will be implementing the basic model of DKT on a couple of datasets. Later on we will we try and apply a distributed learning approach

## The Model

In this section, we will define a basic RNN model with the architecture of an LSTM in PyTorch. This code was adapted from [this Github repository](https://github.com/chsong513/DeepKnowledgeTracing-DKT-Pytorch).
First, we will install the dependencies necessary for this codebase to work - PyTorch as well as the federated learning framework, [Flower](https://flower.dev).
Additionally we will download the dataset required to run the model in this notebook.

In [1]:
!pip install torch
!pip install flwr==0.17.0
!curl https://raw.githubusercontent.com/ZaynabAttahiru/KnowledgeTracing/main/data_test.csv > ./sample_data/data_train.csv
!curl https://raw.githubusercontent.com/ZaynabAttahiru/KnowledgeTracing/main/data_train.csv > ./sample_data/data_test.csv

Collecting flwr==0.17.0
  Downloading flwr-0.17.0-py3-none-any.whl (229 kB)
[K     |████████████████████████████████| 229 kB 5.1 MB/s 
Installing collected packages: flwr
Successfully installed flwr-0.17.0
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  590k  100  590k    0     0  2278k      0 --:--:-- --:--:-- --:--:-- 2278k
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2033k  100 2033k    0     0  6374k      0 --:--:-- --:--:-- --:--:-- 6354k


In [12]:
%%writefile dktmodel.py

import torch
import torch.nn as nn
from torch.autograd import Variable

class DKTModel(nn.Module):
  def __init__(self, input_dim, hidden_dim, layer_dim, output_dim, device):
    super(DKTModel, self).__init__()
    self.hidden_dim = hidden_dim
    self.layer_dim = layer_dim
    self.output_dim = output_dim
    self.rnn = nn.RNN(input_dim, hidden_dim, layer_dim, batch_first = True, nonlinearity='tanh')
    self.fc = nn.Linear(self.hidden_dim, self.output_dim)
    self.sig = nn.Sigmoid()
    self.device = device

  def forward(self,x):
    h0 = Variable(torch.zeros(self.layer_dim, x.size(0), self.hidden_dim))
    out, hn = self.rnn(x, h0)
    result = self.sig(self.fc(out))
    return result


Overwriting dktmodel.py


## Handling the Data

We will now define a script that includes the helper functions that will handle all the data manipulations required to run the model defined above. This will include defining the students' interactions as one-hot encodings.

In [13]:
%%writefile dataloader.py

import numpy as np
import torch
import torch.utils.data as Data
import itertools

"""Reading the dataset"""
def getData(path, num_of_questions, max_step, data_type):
  print('loading ' + data_type +  ' data...')
  trace_data = []
  with open(path, 'r') as file:
    for len, ques, ans in itertools.zip_longest(*[file] * 3):
      len = int(len.strip().strip(','))
      ques = [int(q) for q in ques.strip().strip(',').split(',')]
      ans = [int(a) for a in ans.strip().strip(',').split(',')]
      slices = len//max_step + (1 if len % max_step > 0 else 0)

      for i in range(slices):
         temp = temp = np.zeros(shape=[max_step, 2 * num_of_questions])
         if len > 0:
           if len >= max_step:
             steps = max_step
           else:
             steps = len
           for j in range(steps):
             if ans[i*max_step + j] == 1:
                temp[j][ques[i*max_step + j]] = 1
             else:
                temp[j][ques[i*max_step + j] + num_of_questions] = 1
           len = len - max_step

         trace_data.append(temp.tolist())
    print('done: ' + str(np.array(trace_data).shape))
    return np.array(trace_data)

"""Load the dataset"""
def getDataLoader(batch_size, num_of_questions, max_step):
  train_data = torch.tensor(getData('sample_data/data_train.csv', num_of_questions, max_step, 'train').astype(float).tolist(),
                            dtype=torch.float32)
  test_data = torch.tensor(getData('sample_data/data_test.csv', num_of_questions, max_step, 'test').astype(float).tolist(),
                            dtype=torch.float32)
  trainloader = Data.DataLoader(train_data, batch_size=batch_size, shuffle=True)
  testloader = Data.DataLoader(test_data, batch_size=batch_size, shuffle=True)
  num_examples = {"trainset": len(trainloader), "testset": len(testloader)}

  return trainloader, testloader


Overwriting dataloader.py


Next up, we will write a script that will contain the functions that will be used to evaluate our model against the dataset.

In [14]:
%%writefile eval.py

import tqdm
import torch
import torch.optim as optim
import torch.nn as nn
from sklearn import metrics
from torch.autograd import Variable

def performance(ground_truth, prediction):
  grtruth = ground_truth.detach().cpu().numpy()
  predict = torch.round(prediction).detach().cpu().numpy()

  fpr, tpr, thresholds = metrics.roc_curve(grtruth, prediction.detach().cpu().numpy())
  auc = metrics.auc(fpr,tpr)

  f1 = metrics.f1_score(grtruth, predict)
  recall = metrics.recall_score(grtruth, predict)
  precision = metrics.precision_score(grtruth, predict)

  print('auc: ' + str(auc) + ' f1: ' + str(f1) + ' recall: ' + str(recall) + ' precision: ' + str(precision) + '\n')
  return auc, recall


class lossFunc(nn.Module):
  def __init__(self, num_of_questions, max_step, device):
    super(lossFunc, self).__init__()
    self.crossEntropy = nn.BCELoss()
    self.num_of_questions = num_of_questions
    self.max_step = max_step
    self.device = device

  def forward(self, pred, batch):
    loss = 0
    prediction = torch.Tensor([], device=self.device)
    ground_truth = torch.Tensor([], device=self.device)

    for student in range(pred.shape[0]):
      delta = batch[student][:,0:self.num_of_questions] + batch[
              student][:,self.num_of_questions:]
      temp = pred[student][:self.max_step - 1].mm(delta[1:].t())
      index = torch.tensor([[i for i in range(self.max_step - 1)]],
                                 dtype=torch.long, device=self.device)
      p = temp.gather(0, index)[0]
      a = (((batch[student][:, 0:self.num_of_questions] -
                   batch[student][:, self.num_of_questions:]).sum(1) + 1) //
                 2)[1:]
      for i in range(len(p) - 1, -1, -1):
        if p[i] > 0:
          p = p[:i + 1]
          a = a[:i + 1]
          break
      loss += self.crossEntropy(p, a)
      prediction = torch.cat([prediction, p])
      ground_truth = torch.cat([ground_truth, a])
    return loss, prediction, ground_truth


"""Defining the train and test functions"""

def train(model, trainloader, optimizer, loss_func, device):
  model.to(device)
  for batch in tqdm.tqdm(trainloader, desc='Training: ', mininterval=2):
    batch = batch.to(device)
    pred = model(batch)
    loss, prediction, ground_truth = loss_func(pred, batch)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

  return model, optimizer


def train_epoch(model, trainloader, loss_func, device, start_epoch, end_epoch, log_progress):
  optimizer = optim.Adam(model.parameters(), lr=0.002)

  print(f"Training from epoch(s) {start_epoch} to {end_epoch} w/ {len(trainloader)} batches each.", flush=True)
  results = []

  for epoch in range(start_epoch, end_epoch+1):
    ground_truth = torch.Tensor([], device=device)
    prediction = torch.Tensor([], device=device)
    pbar = tqdm.tqdm(trainloader, desc=f'Training epoch: {epoch}', mininterval=2) if log_progress else trainloader
    for batch in pbar:
      batch = batch.to(device)
      pred = model(batch)
      loss, predict, truth = loss_func(pred, batch)
      prediction = torch.cat([prediction, predict])
      ground_truth = torch.cat([ground_truth, truth])
      optimizer.zero_grad()
      loss.backward()
      optimizer.step()


def test(model, testloader, loss_func, device):
  model.to(device)
  ground_truth = torch.Tensor([], device=device)
  prediction = torch.Tensor([], device=device)
  for batch in tqdm.tqdm(testloader, desc='Testing: ', mininterval=2):
    batch = batch.to(device)
    pred = model(batch)
    loss, p, a = loss_func(pred, batch)
    prediction = torch.cat([prediction, p])
    ground_truth = torch.cat([ground_truth, a])
  auc, recall = performance(ground_truth, prediction)
  return auc, recall

Overwriting eval.py


Let's get the dataset all loaded up! The dataset is represented as a DataLoader object with a shape of (10217, 50, 248) for the trainset and (2879, 50, 248). The last number represents the input sequence of questions and answers for each student.

In [15]:
from dktmodel import DKTModel
import torch
import torch.optim as optim
import eval
from dataloader import getDataLoader

"""Defining the parameters"""
max_step = 50
batch_size = 64
number_of_questions = 124
input = number_of_questions * 2
hidden = 200
layer = 1
output = number_of_questions
lr = 0.002
epochs = 10
device = torch.device('cpu')

trainloader, testloader = getDataLoader(batch_size, number_of_questions, max_step) 


loading train data...
done: (2879, 50, 248)
loading test data...
done: (10217, 50, 248)


In [16]:
model = DKTModel(input,hidden,layer,output,device)
loss_func = eval.lossFunc(number_of_questions, max_step, device)
eval.train_epoch(model, trainloader, loss_func, device, 1, 10, True)


Training from epoch(s) 1 to 10 w/ 45 batches each.


  2)[1:]
Training epoch: 1: 100%|██████████| 45/45 [00:05<00:00,  7.62it/s]
Training epoch: 2: 100%|██████████| 45/45 [00:05<00:00,  8.21it/s]
Training epoch: 3: 100%|██████████| 45/45 [00:05<00:00,  8.10it/s]
Training epoch: 4: 100%|██████████| 45/45 [00:05<00:00,  8.04it/s]
Training epoch: 5: 100%|██████████| 45/45 [00:05<00:00,  8.13it/s]
Training epoch: 6: 100%|██████████| 45/45 [00:05<00:00,  8.13it/s]
Training epoch: 7: 100%|██████████| 45/45 [00:05<00:00,  8.19it/s]
Training epoch: 8: 100%|██████████| 45/45 [00:05<00:00,  8.20it/s]
Training epoch: 9: 100%|██████████| 45/45 [00:05<00:00,  8.20it/s]
Training epoch: 10: 100%|██████████| 45/45 [00:05<00:00,  8.08it/s]
