mj-hwang/ReCouPLe

Causally Robust Reward Learning from Reason-Augmented Preference Feedback

Python License: MIT

This repository provides the official implementation of the paper:

Causally Robust Reward Learning from Reason-Augmented Preference Feedback
Minjune Hwang, Yigit Korkmaz, Daniel Seita†, Erdem Bıyık†
ICLR 2026

Preference-based reinforcement learning (PbRL) is widely used to shape agent behavior to match a user's preferences, yet its sparse binary feedback makes it vulnerable to causal confusion. We introduce ReCouPLe, a lightweight framework that uses natural-language rationales and orthogonal decomposition to clarify the true causal signals behind a preference and to improve generalization.
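
As a rough illustration of the term "orthogonal decomposition" only, the sketch below splits a feature vector into a component along a reference direction and a remainder orthogonal to it. This is not ReCouPLe's implementation: the function name `decompose` and the vectors `feature` and `reason` are hypothetical, and how the paper builds or uses such components is not specified here.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def decompose(feature, reason):
    """Split `feature` into a part parallel to `reason` and a part orthogonal to it.

    Both names are hypothetical stand-ins for illustration, e.g. a trajectory
    embedding and a rationale embedding.
    """
    norm_sq = dot(reason, reason)
    if norm_sq == 0.0:
        raise ValueError("reason vector must be non-zero")
    scale = dot(feature, reason) / norm_sq
    parallel = [scale * r for r in reason]
    orthogonal = [f - p for f, p in zip(feature, parallel)]
    return parallel, orthogonal


feature = [3.0, 4.0]
reason = [1.0, 0.0]
parallel, orthogonal = decompose(feature, reason)
print(parallel, orthogonal)
```

The two parts sum back to `feature`, and the second part has zero inner product with `reason`, which is the property the word "orthogonal" refers to.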

Installation

Under Development
