# Item Response Ranking for IRT

This notebook shows how to train and use IRR-IRT, an IRT model trained with Item Response Ranking (IRR).
Refer to the [IRR doc](../../docs/IRR.md) for more details.
First, we load the data (here we use the a0910 dataset).
Then we train an IRR-IRT model and save its parameters to a file.
Finally, we load the parameters from the file and evaluate the model on the test dataset.

In [1]:
import logging
from longling.lib.structure import AttrDict
from longling import set_logging_info
from EduCDM.IRR import pair_etl as etl, point_etl as vt_etl, extract_item

set_logging_info()

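# Settings shared by the ETL helpers below
params = AttrDict(
    batch_size=256,
    n_neg=10,
    n_imp=10,
    logger=logging.getLogger(),
    hyper_params={"user_num": 4164}  # 4163 + 1, matching the user count passed to IRT later
)
# 123 is the number of knowledge concepts, the same value passed to IRT later
item_knowledge = extract_item("../../data/a0910/item.csv", 123, params)
# Training data goes through the pairwise ETL; validation and test data go through the pointwise ETL
train_data, train_df = etl("../../data/a0910/train.csv", item_knowledge, params)
valid_data, _ = vt_etl("../../data/a0910/valid.csv", item_knowledge, params)
test_data, _ = vt_etl("../../data/a0910/test.csv", item_knowledge, params)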

train_data, valid_data, test_data

reading records from ../../data/a0910/item.csv: 100%|██████████| 19529/19529 [00:00<00:00, 56917.24it/s]
rating2triplet: 100%|██████████| 17051/17051 [00:15<00:00, 1084.01it/s]


(<longling.lib.iterator.AsyncLoopIter at 0x21339b3f790>,
 <torch.utils.data.dataloader.DataLoader at 0x213360c2070>,
 <torch.utils.data.dataloader.DataLoader at 0x21339bf1040>)
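
The training data above is built by `pair_etl`, which suggests that the model is trained on pairs rather than single responses. As a rough illustration of what pairwise samples can look like, the sketch below groups response records by item and pairs each user who answered correctly with each user who answered incorrectly. This is a simplified assumption for intuition only, not EduCDM's implementation; `build_pairs` and the toy records are hypothetical.

```python
from collections import defaultdict
from itertools import product


def build_pairs(records):
    """Hypothetical helper: for every item, pair each user who answered
    correctly with each user who answered incorrectly."""
    by_item = defaultdict(lambda: {"pos": [], "neg": []})
    for user_id, item_id, score in records:
        key = "pos" if score >= 0.5 else "neg"
        by_item[item_id][key].append(user_id)

    pairs = []
    for item_id, groups in sorted(by_item.items()):
        # Every (correct, incorrect) combination becomes one training pair
        for pos_user, neg_user in product(groups["pos"], groups["neg"]):
            pairs.append((item_id, pos_user, neg_user))
    return pairs


# Toy (user_id, item_id, score) records, made up for illustration
records = [
    (1, 10, 1.0),
    (2, 10, 0.0),
    (3, 10, 0.0),
    (1, 11, 0.0),
    (3, 11, 1.0),
]
pairs = build_pairs(records)
print(pairs)
```

Each tuple reads as "for this item, the first user should be ranked above the second". Real pipelines usually sample a limited number of such pairs per record instead of enumerating all of them.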

In [2]:
train_df

        user_id  item_id  score
0          1615    12977    1.0
1           782    13124    0.0
2          1084    16475    0.0
3           593     8690    0.0
4           127    14225    1.0
...         ...      ...    ...
186044     2280     6019    0.0
186045      121        2    1.0
186046      601     5425    1.0
186047      573     2412    0.0
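
The model in the next cell is sized as `4163 + 1` users and `17746 + 1` items, which is consistent with zero-based IDs where the size is the largest ID plus one. A minimal sketch of that calculation follows, using only the five rows shown at the head of the table, so the results describe this sample and not the full dataset. `table_sizes` is a hypothetical helper, not part of EduCDM.

```python
def table_sizes(rows):
    """Hypothetical helper: sizes needed to index zero-based user and item IDs."""
    user_num = max(user_id for user_id, _, _ in rows) + 1
    item_num = max(item_id for _, item_id, _ in rows) + 1
    return user_num, item_num


# (user_id, item_id, score) rows copied from the head of train_df
sample_rows = [
    (1615, 12977, 1.0),
    (782, 13124, 0.0),
    (1084, 16475, 0.0),
    (593, 8690, 0.0),
    (127, 14225, 1.0),
]
user_num, item_num = table_sizes(sample_rows)
print(user_num, item_num)
```

When sizing a real model, take the maximum over the train, validation, and test files together, since an ID that appears only in one split still needs a slot.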


In [3]:
from EduCDM.IRR import IRT

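cdm = IRT(
    4163 + 1,   # number of users
    17746 + 1,  # number of items
    123         # number of knowledge concepts
)
cdm.train(
    train_data,
    valid_data,
    epoch=2,
)
# Persist the trained parameters to a file
cdm.save("IRR-IRT.params")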

Epoch 0: 727it [01:33,  7.75it/s]
evaluating: 100%|██████████| 101/101 [00:00<00:00, 105.09it/s]
formatting item df: 100%|██████████| 10415/10415 [00:00<00:00, 12372.72it/s]
ranking metrics: 10415it [00:30, 339.73it/s]
Epoch 1: 100%|██████████| 727/727 [01:32<00:00,  7.85it/s]
evaluating: 100%|██████████| 101/101 [00:00<00:00, 114.54it/s]
formatting item df: 100%|██████████| 10415/10415 [00:00<00:00, 12163.67it/s]
ranking metrics: 10415it [00:30, 345.54it/s]


[Epoch 0] Loss: 6.039850, PointLoss: 0.674785, PairLoss: 11.404916
[Epoch 0]
      ndcg@k  precision@k  recall@k      f1@k     len@k  support@k  ndcg@k(B)  \
1   1.000000     0.674796  0.474262  0.526246  1.000000      10415   1.000000   
3   0.890374     0.674828  0.737278  0.685842  1.906961      10415   0.416909   
5   0.893487     0.674204  0.794263  0.711467  2.229573      10415   0.425199   
10  0.893582     0.673970  0.815999  0.719991  2.423428      10415   0.426647   

    precision@k(B)  recall@k(B)   f1@k(B)  len@k(B)  support@k(B)  
1         0.328277     0.255120  0.275113       1.0         10415  
3         0.211042     0.434478  0.268160       3.0         10415  
5         0.149650     0.481674  0.215526       5.0         10415  
10        0.082141     0.502023  0.134496      10.0         10415  
auc: 0.839144	map: 0.912163	mrr: 0.904479	coverage_error: 3.005709	ranking_loss: 0.281795	len: 2.458569	support: 10415	map(B): 0.889633	mrr(B): 0.902134
[Epoch 1] Loss: 6.025793

In [6]:
from EduCDM.IRR.metrics import result_format
# Reload the saved parameters, then evaluate on the test set
cdm.load("IRR-IRT.params")
print(result_format(cdm.eval(test_data)))

evaluating: 100%|██████████| 218/218 [00:00<00:00, 235.21it/s]
formatting item df: 100%|██████████| 13682/13682 [00:01<00:00, 12473.67it/s]
ranking metrics: 13682it [00:43, 312.27it/s]


      ndcg@k  precision@k  recall@k      f1@k     len@k  support@k  ndcg@k(B)  \
1   1.000000     0.667666  0.369623  0.433539  1.000000      13682   1.000000   
3   0.859902     0.667556  0.661748  0.632971  2.268528      13682   0.447393   
5   0.866905     0.667895  0.770354  0.690166  2.981582      13682   0.466546   
10  0.868267     0.667864  0.845262  0.723902  3.723652      13682   0.472685   

    precision@k(B)  recall@k(B)   f1@k(B)  len@k(B)  support@k(B)  
1         0.332773     0.215246  0.242867       1.0         13682  
3         0.252351     0.421886  0.288129       3.0         13682  
5         0.200278     0.509787  0.262800       5.0         13682  
10        0.127204     0.579789  0.192704      10.0         13682  
auc: 0.767497	map: 0.868629	mrr: 0.870646	coverage_error: 4.649708	ranking_loss: 0.320000	len: 4.075428	support: 13682	map(B): 0.818432	mrr(B): 0.870910
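
To help read the `precision@k`, `recall@k`, and `ndcg@k` columns in the reports above, here is a minimal sketch of the textbook definitions for a single ranked list with binary relevance labels. It is not the implementation used by EduCDM, whose handling of ties, short lists, and the `(B)` columns may differ. The function names and the sample list are hypothetical.

```python
import math


def precision_at_k(ranked_labels, k):
    """Fraction of the top-k entries that are relevant.
    Divides by the actual list length when fewer than k entries exist
    (a design choice of this sketch)."""
    top = ranked_labels[:k]
    return sum(top) / len(top) if top else 0.0


def recall_at_k(ranked_labels, k):
    """Fraction of all relevant entries that appear in the top k."""
    total_pos = sum(ranked_labels)
    return sum(ranked_labels[:k]) / total_pos if total_pos else 0.0


def ndcg_at_k(ranked_labels, k):
    """DCG of the given order divided by the DCG of the ideal order."""
    def dcg(labels):
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(labels[:k]))

    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal else 0.0


# Labels sorted by predicted score, best first (1 = correct response)
ranked_labels = [1, 0, 1, 1, 0]
p3 = precision_at_k(ranked_labels, 3)
r3 = recall_at_k(ranked_labels, 3)
n3 = ndcg_at_k(ranked_labels, 3)
print(p3, r3, n3)
```

With these definitions, `ndcg@1` equals 1 whenever the top-ranked entry is relevant, and recall can only grow as `k` increases.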
