This repository is the implementation of Exploration and Regularization of the Latent Action Space in Recommendation in WWW 23'.
@inproceedings{liu2023exploration,
author = {Liu, Shuchang and Cai, Qingpeng and Sun, Bowen and Wang, Yuhao and Jiang, Ji and Zheng, Dong and Jiang, Peng and Gai, Kun and Zhao, Xiangyu and Zhang, Yongfeng},
title = {Exploration and Regularization of the Latent Action Space in Recommendation},
year = {2023},
isbn = {9781450394161},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3543507.3583244},
doi = {10.1145/3543507.3583244},
pages = {833–844},
numpages = {12},
location = {Austin, TX, USA},
series = {WWW '23}
}
conda create -n hac python=3.9
conda activate hac
conda install pytorch torchvision -c pytorch
conda install pandas matplotlib scikit-learn
pip install tqdm
conda install -c anaconda ipykernel
python -m ipykernel install --user --name hac --display-name "HAC"
Modify train_env.sh:
- Change the directories, data_path, and output_path for your dataset
- Set the following arguments with X in {RL4RS, ML1M}:
- --model {X}UserResponse\
- --reader {X}DataReader\
- --train_file ${data_path}{X}_b_train.csv\
- --val_file ${data_path}{X}_b_test.csv\
- Set your model_path and log_path in the script.
Run:
bash train_enb.sh
bash train_xxx.sh
Examples:
DDPG:
bash train_ddpg.sh
BehaviorDDPG:
bash train_superddpg.sh
Online Supervise Learning:
bash train_online_sasrec.sh
Offline Supervise Learning:
train_supervise.sh
Use the same script but change "--n_iter ${N_ITER}" to "--n_iter ${PREVIOUS_N_ITER} ${N_ITER}"
bash test.sh
- HACTraining.ipynb