This repository provides the PyTorch-based open-source implementation of Pctx, described in our paper *Pctx: Tokenizing Personalized Context for Generative Recommendation*.
We suggest running our code on 2 GPUs, each with at least 24GB of memory.
After choosing a dataset:
- If you want to reproduce our results directly (test only), follow Part 1: Step0 -> Step1 -> Step2.
- If you want to train Pctx yourself (training, validation, and testing), follow Part 2: Step0 -> Step1 (you may skip) -> Step2 -> Step3.
Part 1: reproduce our results directly (test only)
To reproduce our results directly, please follow the steps outlined below.
Please note that results may vary slightly across environments (e.g., different GPUs) and may therefore show minor discrepancies from those reported in our paper.
Step0: preparation
For the environment (python==3.9.23):
```bash
pip install -r requirements.txt
```
Step1: download our ckpt and sem_ids file
Please download our ckpt and sem_ids files (the test_file_dir folder) via Google Drive.
Place the test_file_dir folder into the Pctx/ directory, so that it sits alongside the .sh files.
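As a quick sanity check before running the test scripts, you can verify that test_file_dir sits next to the `.sh` scripts. This helper is a hypothetical illustration, not part of the repository:

```python
from pathlib import Path

def has_test_files(repo_root: str) -> bool:
    """Hypothetical check: test_file_dir must exist inside repo_root,
    and at least one .sh script must be a sibling of it."""
    root = Path(repo_root)
    ckpt_dir = root / "test_file_dir"
    return ckpt_dir.is_dir() and any(root.glob("*.sh"))
```

If this returns False, re-check where you unpacked the Google Drive download.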
Step2: test on GR task
- Musical_Instruments
```bash
bash test_instrument.sh
```
- Industrial_and_Scientific
```bash
bash test_scientific.sh
```
- Video_Games
```bash
bash test_game.sh
```
Part 2: train Pctx (training, validation, and testing)
Step0: preparation
For the environment (python==3.9.23):
```bash
pip install -r requirements.txt
```
This step is identical to Part 1, Step0. If you have already completed it, please skip this step.
Step1: pre-train the auxiliary neural model (you may skip)
As described in our paper, Pctx obtains encoded user-context representations from an auxiliary (neural) model such as DuoRec.
Before extracting these representations, the auxiliary model must be trained; this step provides the code to train it.
You can run
```bash
bash train_pretrained.sh
```
to train the auxiliary model and obtain the corresponding .pth file for train_upstream_{datasetName}.sh.
For convenience, we have already prepared the .pth files for all three datasets in the pretrained_auxiliary_model_DuoRec folder.
If you do not want to train a new auxiliary model from scratch, skip this step and use the .pth files we provide.
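Conceptually, the auxiliary model maps a user's interaction history to a fixed-size context vector. DuoRec itself is a trained Transformer-based sequential recommender; the sketch below is only a shape-level stand-in that mean-pools item embeddings over a history. All names here are illustrative, not the repository's API:

```python
import numpy as np

def encode_user_context(history: list[int],
                        item_embeddings: np.ndarray) -> np.ndarray:
    """Illustrative context encoder: mean-pool the embeddings of the
    items a user has interacted with. The real auxiliary model (e.g.
    DuoRec) is a trained neural encoder; this only shows the shapes."""
    if not history:
        # cold-start user: return a zero vector of the embedding size
        return np.zeros(item_embeddings.shape[1])
    return item_embeddings[history].mean(axis=0)

# toy example: 10 items with 4-dimensional embeddings
rng = np.random.default_rng(0)
emb = rng.standard_normal((10, 4))
ctx = encode_user_context([1, 3, 7], emb)  # ctx.shape == (4,)
```

The actual representations come from the provided .pth checkpoints; this snippet only illustrates the input/output contract of the upstream stage.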
Step2: for upstream
Related parameters are in genrec/models/Pctx/config_upstream.yaml; the most relevant ones are exposed in the .sh file and are already set to appropriate values.
As described in our paper, we obtain encoded user-context representations from the auxiliary model (pre-trained in Part 2, Step1) and cluster them into centroids; we refer to this process as the upstream stage. This step provides the code to run it.
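The clustering idea behind the upstream stage can be sketched with a minimal Lloyd's k-means over the context vectors. The repository's scripts may use a different clustering implementation and hyperparameters; this is a conceptual sketch only:

```python
import numpy as np

def kmeans_centroids(reprs: np.ndarray, k: int, iters: int = 20,
                     seed: int = 0) -> np.ndarray:
    """Minimal Lloyd's k-means: cluster user-context representations
    (n, d) into k centroids (k, d). Illustrative, not the repo's code."""
    rng = np.random.default_rng(seed)
    # initialize centroids from k distinct representations
    centroids = reprs[rng.choice(len(reprs), size=k, replace=False)]
    for _ in range(iters):
        # assign each representation to its nearest centroid
        dists = np.linalg.norm(reprs[:, None, :] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned members
        for j in range(k):
            members = reprs[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids

# toy example: 100 context vectors of dimension 8 -> 5 centroids
reprs = np.random.default_rng(1).standard_normal((100, 8))
centroids = kmeans_centroids(reprs, k=5)  # shape (5, 8)
```

Each user's context is then represented by its nearest centroid, which is what makes the personalized context tokenizable.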
- Musical_Instruments
```bash
bash train_upstream_instrument.sh
```
- Industrial_and_Scientific
```bash
bash train_upstream_scientific.sh
```
- Video_Games
```bash
bash train_upstream_game.sh
```
Step3: for GR task
Related parameters are in genrec/models/Pctx/config.yaml and genrec/models/default.yaml; the most relevant ones are exposed in the .sh file and are already set to appropriate values.
In this step, we provide the code to train Pctx on the generative recommendation task.
- Musical_Instruments
```bash
bash train_instrument.sh
```
- Industrial_and_Scientific
```bash
bash train_scientific.sh
```
- Video_Games
```bash
bash train_game.sh
```
Citation
Please cite the following paper if you find our code, processed datasets, or tokenizers helpful.
```bibtex
@article{zhong2025pctx,
  title={Pctx: Tokenizing Personalized Context for Generative Recommendation},
  author={Zhong, Qiyong and Su, Jiajie and Ma, Yunshan and McAuley, Julian and Hou, Yupeng},
  journal={arXiv preprint arXiv:2510.21276},
  year={2025}
}
```

