prediction of models #3
Comments
You're right, and the other models are the same. The only reason is that the NPA model, which uses a BiLSTM to encode past interactions, can't be trained to predict all the responses in a given sequence because of its bi-directional property. However, all the other models can be trained by computing losses for all interactions in a sequence (not only the last one), and this actually makes training much faster. I'm going to fix it later, but you can also fix it and send a PR if you want.
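To make the difference concrete, here is a minimal sketch (not the repository's actual code; the toy model, names, and shapes are assumptions) of computing the loss only on the last interaction versus on every interaction of a sequence with a unidirectional model:

```python
import torch
import torch.nn as nn

batch, seq_len, num_skills, hidden = 4, 20, 50, 64

# Toy DKT-style model: a unidirectional LSTM over (skill, correctness) interactions
# that outputs, at every step, logits for the next response on each skill.
class TinyDKT(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(2 * num_skills, hidden)  # skill id + correctness
        self.rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_skills)

    def forward(self, interactions):                     # (batch, seq_len)
        h, _ = self.rnn(self.emb(interactions))
        return self.out(h)                               # (batch, seq_len, num_skills)

model = TinyDKT()
interactions = torch.randint(0, 2 * num_skills, (batch, seq_len))
next_skills = torch.randint(0, num_skills, (batch, seq_len))   # skill asked at the next step
next_correct = torch.randint(0, 2, (batch, seq_len)).float()   # response at the next step

logits = model(interactions)
pred = logits.gather(-1, next_skills.unsqueeze(-1)).squeeze(-1)  # (batch, seq_len)
bce = nn.BCEWithLogitsLoss()

# Loss on the last interaction only (what the question refers to).
loss_last = bce(pred[:, -1], next_correct[:, -1])

# Loss on every interaction in the sequence, which is valid for unidirectional
# models and makes training converge much faster, as described above.
loss_all = bce(pred, next_correct)
```

A bidirectional encoder such as the BiLSTM in NPA would leak future interactions into each step's prediction, which is why the all-steps loss can't be used there.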
Thank you for your clarification.
Hi there! Has anyone by chance made an implementation that allows predicting a complete sequence instead of one target_id for a single sequence? If so, I would be very grateful if you could share it. I'll also take this opportunity to confirm: considering that the current code only makes predictions for one target_id, can we compare the obtained results with the state of the art (where whole sequences are considered for prediction)? I apologize in advance if I'm missing some implementation detail. Regards.
@bernardoleite First of all, I may not have time to do the implementation for now. I'm actually planning to refactor the whole repository using Pytorch Lightning and to add some recent KT models, but I don't have enough time to do that. I'm also considering using EduData instead of my own pre-processed datasets.

For your second question, I think that making predictions for only one target_id is the right way to evaluate models, but most of the other results and papers actually divide the whole sequence into several sub-sequences of fixed length and make predictions for each sub-sequence. This gives worse performance than one-by-one prediction. For example, when you want to predict the first question of the second sub-sequence, the input does not include any previous interactions from the first sub-sequence. However, if a model makes predictions for a single target_id at a time, you can feed in as many previous interactions as possible (up to the maximum sequence length the model was trained with), which should give a better prediction result.
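As an illustration of the one-by-one scheme described above, here is a hedged sketch (the model interface and names like `max_len` are assumptions, not code from this repository) that predicts each response using as much preceding history as fits into the model's maximum trained length:

```python
import torch

def predict_one_by_one(model, interactions, next_skills, max_len):
    """Predict every response in a sequence one target_id at a time,
    feeding a sliding window of up to `max_len` previous interactions."""
    model.eval()
    preds = []
    with torch.no_grad():
        for t in range(interactions.size(0)):
            start = max(0, t + 1 - max_len)
            window = interactions[start : t + 1].unsqueeze(0)  # (1, <= max_len)
            logits = model(window)                             # (1, window, num_skills)
            # Probability of a correct response on the skill asked next.
            preds.append(torch.sigmoid(logits[0, -1, next_skills[t]]))
    return torch.stack(preds)
```

Compared with chopping the data into fixed-length sub-sequences, this never discards history at sub-sequence boundaries, which is why it tends to score higher.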
@seewoo5 I believe it's a good option to do the refactoring using Pytorch Lightning (I'm becoming a fan of it too). Regarding the second question, I am more enlightened now. Thanks for the comprehensive explanation. Regards,
Hello @seewoo5,
Thanks for sharing your implementations. I checked your code for DKT and DKVMN, and it seems your models predict only one target_id for a single sequence. Am I right?
Thanks.
Chunpai