Add a Quantile RewardScaler #6

Open
thejaminator opened this issue Feb 26, 2023 · 1 comment
Labels
good first issue Good for newcomers

Comments

@thejaminator
Owner

conditionme/scaling/scaler.py defines the scalers we currently have.
They don't seem to help much, because the reward distributions we get from reward models tend to be very heavily skewed.

We could implement a quantile scaler. Since it depends only on the rank of each reward rather than its magnitude, it may cope better with the skew.

One option is to bin rewards into 10 buckets, as in Quark.

Thanks @tomekkorbak for this suggestion!
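
To make the request more concrete, here is a minimal sketch of what quantile binning could look like. It is only an illustration under assumptions: the class name `QuantileRewardScaler` and the methods `fit`, `bucket`, and `scale` are hypothetical and do not reflect the existing interface in conditionme/scaling/scaler.py, and mapping bucket indices to [0, 1] is one possible choice, not necessarily what Quark does.

```python
import numpy as np


class QuantileRewardScaler:
    """Hypothetical sketch: maps raw rewards to equal-frequency buckets.

    The name and methods are illustrative assumptions, not the
    interface used by the existing scalers in this repo.
    """

    def __init__(self, n_buckets: int = 10):
        if n_buckets < 2:
            raise ValueError("n_buckets must be at least 2")
        self.n_buckets = n_buckets
        self.edges = None  # interior cut points, set by fit()

    def fit(self, rewards):
        rewards = np.asarray(rewards, dtype=float)
        # Interior quantile levels, e.g. 0.1, 0.2, ..., 0.9 for 10 buckets.
        levels = np.linspace(0.0, 1.0, self.n_buckets + 1)[1:-1]
        self.edges = np.quantile(rewards, levels)
        return self

    def bucket(self, rewards):
        """Return an integer bucket index in [0, n_buckets - 1] per reward."""
        if self.edges is None:
            raise RuntimeError("call fit() before bucket()")
        rewards = np.asarray(rewards, dtype=float)
        # Count how many cut points are <= each reward.
        return np.searchsorted(self.edges, rewards, side="right")

    def scale(self, rewards):
        """Return bucket indices rescaled to the range [0, 1]."""
        return self.bucket(rewards) / (self.n_buckets - 1)


# Usage on a deliberately skewed toy distribution (made-up data).
toy_rewards = np.exp(np.arange(100) / 10.0)
scaler = QuantileRewardScaler(n_buckets=10).fit(toy_rewards)
bucket_counts = np.bincount(scaler.bucket(toy_rewards), minlength=10)
```

Because the cut points are fitted on the observed rewards, each bucket ends up with roughly the same number of samples regardless of how skewed the raw values are. Rewards outside the fitted range simply fall into the first or last bucket.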

@thejaminator added the "good first issue" label Feb 26, 2023
@SahilDave04

Hey, can you elaborate on your request? I don't quite understand what you're asking for.


2 participants