Exploration and Regularization of the Latent Action Space in Recommendation

This repository contains the implementation of "Exploration and Regularization of the Latent Action Space in Recommendation", published at WWW '23.

Citing

@inproceedings{liu2023exploration,
  author = {Liu, Shuchang and Cai, Qingpeng and Sun, Bowen and Wang, Yuhao and Jiang, Ji and Zheng, Dong and Jiang, Peng and Gai, Kun and Zhao, Xiangyu and Zhang, Yongfeng},
  title = {Exploration and Regularization of the Latent Action Space in Recommendation},
  year = {2023},
  isbn = {9781450394161},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3543507.3583244},
  doi = {10.1145/3543507.3583244},
  pages = {833–844},
  numpages = {12},
  location = {Austin, TX, USA},
  series = {WWW '23}
}

0. Setup

conda create -n hac python=3.9
conda activate hac
conda install pytorch torchvision -c pytorch
conda install pandas matplotlib scikit-learn
pip install tqdm
conda install -c anaconda ipykernel
python -m ipykernel install --user --name hac --display-name "HAC"
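
After installation, a quick import check can confirm that the environment is usable. The following is a minimal sketch, assuming the hac environment is active and python is on PATH; check_module is a hypothetical helper and not part of this repository.

```shell
# Hypothetical helper (not part of this repository): report whether a Python
# module can be imported by the interpreter currently on PATH.
check_module() {
  if python -c "import $1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Import names of the packages installed above
# (scikit-learn is imported as sklearn).
for module in torch torchvision pandas matplotlib sklearn tqdm; do
  check_module "$module" || true
done
```

Any line starting with "missing:" points at a package to reinstall before moving on to the next step.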

1. Pretrain User Response Model as Environment Component

Modify train_env.sh:

Change the directories, data_path, and output_path to match your dataset.
Set the following arguments, with X in {RL4RS, ML1M}:
- --model {X}UserResponse
- --reader {X}DataReader
- --train_file ${data_path}{X}_b_train.csv
- --val_file ${data_path}{X}_b_test.csv
Set your model_path and log_path in the script.
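
To make the substitution of X concrete, the sketch below prints the four dataset-specific arguments for a chosen dataset. The env_args helper and the /path/to/dataset/ value are illustrative assumptions; the actual train_env.sh may lay these out differently.

```shell
# Hypothetical helper (not part of this repository): print the four
# dataset-specific arguments for dataset X, one per line.
# $1 = dataset name (RL4RS or ML1M), $2 = data_path (with trailing slash).
env_args() {
  X="$1"
  data_path="$2"
  printf '%s\n' \
    "--model ${X}UserResponse" \
    "--reader ${X}DataReader" \
    "--train_file ${data_path}${X}_b_train.csv" \
    "--val_file ${data_path}${X}_b_test.csv"
}

# Placeholder path: replace with the directory that holds your CSV files.
env_args ML1M /path/to/dataset/
```

Comparing this output with the corresponding lines in train_env.sh is a simple way to catch a mismatched dataset name or a missing trailing slash in data_path.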

Run:

bash train_env.sh

2. Training

2.1 Script list

bash train_xxx.sh

Examples:

DDPG:

bash train_ddpg.sh

BehaviorDDPG:

bash train_superddpg.sh

Online Supervised Learning:

bash train_online_sasrec.sh

Offline Supervised Learning:

bash train_supervise.sh

2.2 Continue training

Use the same script, but change "--n_iter ${N_ITER}" to "--n_iter ${PREVIOUS_N_ITER} ${N_ITER}".
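
The change can be sketched as a small helper that emits either form of the argument. n_iter_arg is a hypothetical helper, and the numbers are arbitrary example values; how the training scripts interpret the two counts is not specified here.

```shell
# Hypothetical helper (not part of this repository): build the --n_iter
# argument for a fresh run (one value) or a continued run (two values).
n_iter_arg() {
  if [ "$#" -eq 2 ]; then
    # Continued run: previous iteration count first, then the new one.
    echo "--n_iter $1 $2"
  else
    # Fresh run: a single iteration count.
    echo "--n_iter $1"
  fi
}

# Arbitrary example values (assumptions, not taken from the repository).
PREVIOUS_N_ITER=20000
N_ITER=30000

n_iter_arg "$N_ITER"                      # fresh run
n_iter_arg "$PREVIOUS_N_ITER" "$N_ITER"   # continued run
```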

3. Result Observation

bash test.sh

HACTraining.ipynb

