This repository provides the official PyTorch implementation and reproduction code for the paper "Beyond Static Best-of-N: Bayesian List-wise Alignment for LLM-based Recommendation".
We have organized the project structure to facilitate reproduction. The datasets (e.g., Steam) are located in ./data/.
- ./data/sft_data: Data for Supervised Fine-Tuning.
- ./data/csv_data: Raw CSV data for preprocessing and training.
- ./data/bon_data: Generated data for BLADE training.
- SFT Training:
  Before using the BLADE training framework, you need to run SFT to fine-tune your base model for alignment with the recommendation task. Use the following command to perform SFT training:

      # Usage: bash ./scripts/SFT.sh <GPU_ID> <Category>
      bash ./scripts/SFT.sh 0 Steam

- Preprocessing:
  After completing SFT training, you need to generate candidate lists and compute rewards (MGU, ILD, Hit, NDCG) to prepare the data for the BLADE training.

      # Usage: bash ./scripts/run_preprocessing.sh <GPU_IDS> <Category>
      bash ./scripts/run_preprocessing.sh 0,1,2,3 Steam

- BLADE Training:
  Finally, use the following command to perform BLADE training:

      # Usage: CUDA_VISIBLE_DEVICES=<GPU_IDS> bash ./scripts/BLADE.sh <Category>
      CUDA_VISIBLE_DEVICES=0,1,2,3 bash ./scripts/BLADE.sh Steam
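
The three stages have to run in order (SFT, then preprocessing, then BLADE training), so it can be convenient to chain them in one wrapper. The following is a minimal sketch, not a script shipped with this repository: the variable names CATEGORY, GPU_IDS, SFT_GPU and DRY_RUN and the helper functions run and run_pipeline are illustrative assumptions, and only the three commands themselves come from the steps above. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper that runs the three stages in order.
# All variable and function names here are hypothetical, not part of the repository.
set -euo pipefail

CATEGORY="${CATEGORY:-Steam}"     # dataset category passed to every script
GPU_IDS="${GPU_IDS:-0,1,2,3}"     # GPUs for preprocessing and BLADE training
SFT_GPU="${SFT_GPU:-0}"           # single GPU used for SFT
DRY_RUN="${DRY_RUN:-1}"           # 1 = print commands only, 0 = execute them

# Print a command and, unless in dry-run mode, execute it.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

# Because of `set -e`, a failing stage stops the pipeline before the next one starts.
run_pipeline() {
  run bash ./scripts/SFT.sh "$SFT_GPU" "$CATEGORY"
  run bash ./scripts/run_preprocessing.sh "$GPU_IDS" "$CATEGORY"
  run env CUDA_VISIBLE_DEVICES="$GPU_IDS" bash ./scripts/BLADE.sh "$CATEGORY"
}

run_pipeline
```

For example, saving this as a file of your choice and calling it with DRY_RUN=0 CATEGORY=Steam in the environment would run all three stages for the Steam dataset from the repository root.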
