This is the reference code for the main process of SampleLLM, accepted at the WWW'25 Industry Track. The code serves only as a process reference and cannot be run directly.
- data_generation is for the first stage (generating LLM-based samples)
- sampling is for the second stage (feature attribution-based importance sampling)
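
As a rough illustration of what an importance sampling step can look like, the sketch below resamples synthetic rows with probability proportional to a per-row weight. It is not the code in this repository and does not reproduce the paper's method: the function name `importance_resample`, the example rows, and the weight values are all hypothetical, and the sketch simply assumes that a weight per row has already been computed somehow (for example, from feature attribution scores).

```python
import random


def importance_resample(samples, weights, k, seed=0):
    """Draw k items from samples (with replacement), proportional to weights.

    Hypothetical helper for illustration only; how the weights are obtained
    is outside the scope of this sketch.
    """
    if len(samples) != len(weights):
        raise ValueError("samples and weights must have the same length")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.choices(samples, weights=weights, k=k)


# Placeholder synthetic rows and placeholder weights (assumed, not real data).
synthetic_rows = [
    {"user_age": 23, "clicked": 1},
    {"user_age": 41, "clicked": 0},
    {"user_age": 35, "clicked": 1},
]
weights = [0.7, 0.0, 0.3]

selected = importance_resample(synthetic_rows, weights, k=5)
print(selected)
```

Rows with a weight of zero are never drawn, and rows with larger weights appear more often in the resampled set.
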
If you find our work insightful and use the code or refer to our paper, please add the following citation to your references.
@article{gao2025samplellm,
title={SampleLLM: Optimizing Tabular Data Synthesis in Recommendations},
author={Gao, Jingtong and Du, Zhaocheng and Li, Xiaopeng and Zhao, Xiangyu and Wang, Yichao and Li, Xiangyang and Guo, Huifeng and Tang, Ruiming},
journal={arXiv preprint arXiv:2501.16125},
year={2025}
}