HLPD-Aligning-LLMs-to-Human-Language-Preference-for-Machine-Revised-Text-Detection

[AAAI 2026 paper]

Abstract

To prevent misinformation and social issues arising from trustworthy-looking content generated by LLMs, it is crucial to develop efficient and reliable methods for identifying the source of texts. Previous approaches have demonstrated exceptional performance in detecting texts fully generated by LLMs. However, these methods struggle when confronting more advanced LLM output or text with adversarial multi-task machine revision, especially in the black-box setting, where the generating model is unknown. To address this challenge, grounded in the hypothesis that human writing possesses distinctive stylistic patterns, we propose Human Language Preference Detection (HLPD). HLPD employs a reward-based alignment process, Human Language Preference Optimization (HLPO), to shift the scoring model’s token distribution toward human-like writing, making the model more sensitive to human writing, therefore enhancing the identification of machine-revised text. We test HLPD in an adversarial multi-task evaluation framework that leverages a five-dimensional prompt generator and multiple advanced LLMs to create diverse revision scenarios. When detecting texts revised by GPT-series models, HLPD achieves a 15.11% relative improvement in AUROC over ImBD, surpassing Fast-DetectGPT by 45.56%. When evaluated on texts generated by advanced LLMs, HLPD achieves the highest average AUROC, exceeding ImBD by 5.53% and Fast-DetectGPT by 34.14%.
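The abstract describes HLPO only as a reward-based alignment step that shifts the scoring model toward human-like writing; the exact objective is not given here. As a rough illustration of preference-based alignment in general, the sketch below uses the standard DPO loss with the human-written text as the preferred sample and the machine-revised text as the rejected one. The function names, the `beta` value, and the choice of DPO itself are assumptions for illustration, not the paper's formulation.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_style_loss(
    policy_logp_human: float,
    policy_logp_machine: float,
    ref_logp_human: float,
    ref_logp_machine: float,
    beta: float = 0.1,  # hypothetical temperature, chosen for illustration
) -> float:
    """Standard DPO loss for one (human, machine-revised) pair.

    Inputs are summed token log-probabilities of each text under the
    trainable scoring model (policy) and a frozen reference copy.
    Assumption: the human text is treated as the preferred sample.
    """
    human_margin = policy_logp_human - ref_logp_human
    machine_margin = policy_logp_machine - ref_logp_machine
    logits = beta * (human_margin - machine_margin)
    # -log(sigmoid(logits)) == softplus(-logits)
    return softplus(-logits)


# Illustrative numbers only: the policy assigns the human text a higher
# log-probability than the reference does, so the loss drops below log(2).
baseline = dpo_style_loss(-50.0, -48.0, -50.0, -48.0)
improved = dpo_style_loss(-45.0, -52.0, -50.0, -48.0)
```

When the policy and the reference agree, the loss equals log 2; it decreases as the policy raises the likelihood of the human text relative to the machine-revised one, which is the direction of shift the abstract describes.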

Results

Training code coming soon.
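For reference, AUROC, the metric quoted in the abstract, can be computed directly from detector scores. The sketch below is a minimal pure-Python version that assumes a higher score means "more likely machine-revised"; the scores are made-up illustrative values, not results from the paper. The abstract does not define "relative improvement", so `relative_improvement` uses one convention found in related detection work (gain as a fraction of the remaining headroom), which is an assumption.

```python
from typing import Sequence


def auroc(human_scores: Sequence[float], machine_scores: Sequence[float]) -> float:
    """Probability that a random machine-revised text outscores a random
    human text, counting ties as half a win (pairwise definition of AUROC)."""
    wins = 0.0
    for m in machine_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return wins / (len(human_scores) * len(machine_scores))


def relative_improvement(new: float, old: float) -> float:
    """Assumed convention: share of the remaining gap to a perfect
    AUROC of 1.0 that the new detector closes."""
    return (new - old) / (1.0 - old)


# Illustrative scores only (not from the paper).
human = [0.10, 0.40, 0.35]
machine = [0.80, 0.30, 0.90]
score = auroc(human, machine)
gain = relative_improvement(0.90, 0.80)
```

The pairwise loop is quadratic in the number of samples; for larger evaluation sets, a rank-based implementation such as scikit-learn's `roc_auc_score` gives the same value more efficiently.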

