## Top-K Ranking Model Training

### Main Arguments
Specify the following arguments. They determine the model, the data (and its path), and the task setting.

``` yaml
model_name: SASRec
loss_n: BPR
dataset: MovieLens_1mimpCTRAll
path: '/work/lihanyu/ReChorus2/data'
metric: NDCG,HR
model_mode: Impression
```

`SASRec` is a sequential model.

The `model_mode` for general and sequential models is one of `["", "Impression"]`, corresponding to the normal top-K ranking setting and the impression-based ranking setting, respectively.

For the top-K or impression-based ranking task, `metric` should be NDCG and HR.
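For readers unfamiliar with the two metrics, here is a minimal sketch of HR@k and NDCG@k with binary relevance labels. The function names are hypothetical and this is not ReChorus's evaluation code, which may differ in details such as how candidates are sampled or how ties are handled.

```python
import math

def hit_ratio_at_k(ranked_labels, k):
    """Return 1.0 if any positive item appears in the top-k positions, else 0.0."""
    return 1.0 if any(ranked_labels[:k]) else 0.0

def ndcg_at_k(ranked_labels, k):
    """NDCG@k for binary labels that are already ordered by predicted score."""
    def dcg(labels):
        # Position 0 gets discount log2(2) = 1, position 1 gets log2(3), ...
        return sum(label / math.log2(pos + 2) for pos, label in enumerate(labels))
    ideal = sorted(ranked_labels, reverse=True)  # best possible ordering
    idcg = dcg(ideal[:k])
    return dcg(ranked_labels[:k]) / idcg if idcg > 0 else 0.0

# Labels of one impression list, sorted by the model's scores (1 = positive).
labels = [0, 1, 0, 0, 1]
hr_at_2 = hit_ratio_at_k(labels, 2)
ndcg_at_2 = ndcg_at_k(labels, 2)
```

Both functions score a single ranked list. A reported value would be the average over all evaluated users or impressions.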

`num_neg` (the number of negative samples used in training) should be specified only in the top-K task.

`loss_n` should be specified only in the impression-based ranking task. A detailed explanation is in [BaseImpressionModel.py](https://github.com/THUwangcy/ReChorus/tree/master/src/models/BaseImpressionModel.py). In the top-K task only BPR is implemented, so `loss_n` does not need to be specified.
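The mode-specific rules above can be checked before launching a run. The sketch below is a hypothetical helper (`build_args` is not part of ReChorus) that turns a config dictionary into command-line arguments and rejects the combinations this section advises against. ReChorus itself may not enforce these rules.

```python
def build_args(config):
    """Convert a config dict to CLI arguments, applying the rules described above."""
    mode = config.get("model_mode", "")
    if mode == "Impression" and "num_neg" in config:
        raise ValueError("num_neg should only be specified in the top-K task")
    if mode == "" and "loss_n" in config:
        raise ValueError("loss_n should only be specified in the impression-based task")
    args = []
    for key, value in config.items():
        if value == "":  # an empty model_mode means plain top-K; omit the flag
            continue
        args += [f"--{key}", str(value)]
    return args

impression_args = build_args({
    "model_name": "SASRec",
    "loss_n": "BPR",
    "dataset": "MovieLens_1mimpCTRAll",
    "metric": "NDCG,HR",
    "model_mode": "Impression",
})
```

The resulting list can be appended to `["python", "src/main.py"]` and passed to `subprocess.run`.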

!python /work/lihanyu/ReChorus/src/main.py --model_name SASRec --num_layers 3 --num_heads 2 --history_max 20 --emb_size 64 --lr 2e-4 --l2 1e-6 --loss_n BPR --dataset MovieLens_1mimpCTRAll --path '/work/lihanyu/ReChorus2/data' --metric NDCG,HR --topk 1,2,3,5,10 --main_metric NDCG@2 --model_mode Impression

Namespace(model_name='SASRec', model_mode='Impression')
--------------------------------------------- BEGIN: 2024-06-05 18:51:01 ---------------------------------------------

 Arguments          | Values               
 batch_size         | 256                 
 data_appendix      |                     
 dataset            | MovieLens_1mimpCT...
 dropout            | 0                   
 early_stop         | 10                  
 emb_size           | 64                  
 epoch              | 200                 
 eval_batch_size    | 256                 
 gpu                | 0                   
 history_max        | 20                  
 impression_idkey   | time                
 l2                 | 1e-06               
 loss_n             | BPR                 
 lr                 | 0.0002              
 main_metric        | NDCG@2              
 num_heads          | 2                   
 num_layers         | 3                   
 num_neg            | 1                   
 num_w

## Re-ranking Model Training

After the base ranker has been trained and saved (it must use the "Impression" `model_mode`), add a yaml file with the backbone ranking model's parameters at:

`model/SASRecImpression/ml_SASRec_best.yaml`

``` yaml
emb_size: 64
num_heads: 2
num_layers: 3
```
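The config file can be written by hand or generated. Below is a minimal sketch that writes a flat `key: value` mapping to the path mentioned above. `write_ranker_config` is a hypothetical helper, and it only handles simple scalar values. For anything nested, PyYAML's `yaml.safe_dump` would be the more robust choice.

```python
from pathlib import Path

def write_ranker_config(path, params):
    """Write a flat dict of scalar parameters as simple `key: value` yaml lines."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)  # create the model folder if needed
    lines = [f"{key}: {value}" for key, value in params.items()]
    path.write_text("\n".join(lines) + "\n")
    return path

# Example call (values copied from the yaml shown above):
# write_ranker_config("model/SASRecImpression/ml_SASRec_best.yaml",
#                     {"emb_size": 64, "num_heads": 2, "num_layers": 3})
```

The values must match the hyperparameters the base ranker was trained with, since the re-ranker rebuilds the backbone from this file.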

### Main Arguments
Specify the following arguments. They determine the model, the data (and its path), and the task setting.

``` yaml
model_name: PRM
loss_n: BPR
dataset: MovieLens_1mimpCTRAll
path: '/work/lihanyu/ReChorus2/data'
metric: NDCG,HR
num_workers: 0
ranker_name: SASRec
ranker_config_file: ml_SASRec_best.yaml
ranker_model_file: 'SASRecImpression__MovieLens_1mimpCTRAll__0__lr=0.0002__l2=1e-06__emb_size=64__num_layers=3__num_heads=2.pt'
model_mode: Sequential
```

`PRM` is a re-ranker model.

The `model_mode` for re-ranker models is one of `["General", "Sequential"]`, matching the model type of the base ranker.

For the impression-based ranking task, `metric` should be NDCG and HR.

`loss_n` should be specified only in the impression-based ranking task. A detailed explanation is in [BaseImpressionModel.py](https://github.com/THUwangcy/ReChorus/tree/master/src/models/BaseImpressionModel.py).
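As an illustration of what a BPR-style loss computes on one impression list, here is a minimal sketch of the textbook pairwise form, averaged over every positive/negative pair. This is an assumption for illustration only. The variant in `BaseImpressionModel.py` may weight or sample pairs differently.

```python
import math

def bpr_loss(pos_scores, neg_scores):
    """Mean of -log(sigmoid(s_pos - s_neg)) over all positive/negative pairs."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    losses = []
    for p in pos_scores:
        for n in neg_scores:
            diff = p - n
            # -log(sigmoid(diff)) == softplus(-diff), written in a numerically stable form
            losses.append(max(-diff, 0.0) + math.log1p(math.exp(-abs(diff))))
    return sum(losses) / len(losses)

# Scores for the clicked and non-clicked items of one impression (made-up numbers).
loss = bpr_loss(pos_scores=[2.0, 1.5], neg_scores=[0.5, -1.0, 0.0])
```

The loss shrinks toward zero as positives are scored further above negatives, and equals `log(2)` when a pair is tied.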

`num_workers` should be 0 for re-ranker models because of the `collate_batch` process in [BaseImpressionModel.py](https://github.com/THUwangcy/ReChorus/tree/master/src/models/BaseImpressionModel.py).

Three arguments specify the trained base ranker to use:
* `ranker_name`
* `ranker_config_file`
* `ranker_model_file`
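The `ranker_model_file` value used in this section appears to follow a `name__dataset__seed__key=value...` pattern. The sketch below rebuilds that filename from its parts. The helper name is hypothetical, and reading the `0` field as the random seed is an assumption inferred from the example. If unsure, copy the actual filename from the saved-model folder.

```python
def ranker_model_filename(ranker_name, dataset, seed, hyperparams, mode="Impression"):
    """Join the pieces with double underscores, mirroring the example filename."""
    parts = [f"{ranker_name}{mode}", dataset, str(seed)]
    parts += [f"{key}={value}" for key, value in hyperparams.items()]
    return "__".join(parts) + ".pt"

ranker_file = ranker_model_filename(
    "SASRec",
    "MovieLens_1mimpCTRAll",
    0,  # assumed to be the random seed
    {"lr": 0.0002, "l2": 1e-06, "emb_size": 64, "num_layers": 3, "num_heads": 2},
)
```

The hyperparameter order and formatting must match the saved file exactly, for example `1e-06` rather than `1e-6`.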


!python /work/lihanyu/ReChorus/src/main.py --model_name PRM --emb_size 64 --history_max 20 --n_blocks 1 --num_heads 1 --lr 1e-3 --l2 0 --loss_n BPR --dataset MovieLens_1mimpCTRAll --path '/work/lihanyu/ReChorus2/data' --metric NDCG,HR --topk 1,2,3,5,10,20 --main_metric NDCG@2 --num_workers 0 --ranker_name SASRec --ranker_config_file ml_SASRec_best.yaml --ranker_model_file 'SASRecImpression__MovieLens_1mimpCTRAll__0__lr=0.0002__l2=1e-06__emb_size=64__num_layers=3__num_heads=2.pt' --model_mode Sequential

Namespace(model_name='PRM', model_mode='Sequential')
--------------------------------------------- BEGIN: 2024-06-05 19:07:54 ---------------------------------------------

 Arguments          | Values               
 batch_size         | 256                 
 data_appendix      |                     
 dataset            | MovieLens_1mimpCT...
 dropout            | 0                   
 early_stop         | 10                  
 emb_size           | 64                  
 epoch              | 200                 
 eval_batch_size    | 256                 
 gpu                | 0                   
 history_max        | 20                  
 impression_idkey   | time                
 l2                 | 0.0                 
 loss_n             | BPR                 
 lr                 | 0.001               
 main_metric        | NDCG@2              
 n_blocks           | 1                   
 num_heads          | 1                   
 num_hidden_unit    | 64                  
 num_neg 

### Log & Saved Model & Saved Recommendation Results

The logged output contains the following parts:
* The Arguments and Values
* Data Processing
* The Model Structure
* Training Process:
  * Test performance before training
  * Loss and validation performance in each training epoch
  * Validation and test performance after training
* Saved Model and Recommendation Results of Validation & Test