[ICLR 2025] IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Dataset collection and preprocessing framework for NLP extreme multitask learning
Efficient LLM inference on Slurm clusters using vLLM.
Official implementation of the ICLR 2025 paper: Rethinking Bradley-Terry Models in Preference-based Reward Modeling: Foundations, Theory, and Alternatives
[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
A comprehensive collection of work on learning from rewards in the post-training and test-time scaling of LLMs, covering both reward models and learning strategies across the training, inference, and post-inference stages.
An easy Python package for running quick, basic QA evaluations. It includes standardized QA and semantic evaluation metrics: black-box and open-source large language model prompting and evaluation, exact match, F1 score, PEDANT semantic match, and transformer match. The package also supports prompting the OpenAI and Anthropic APIs.
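As a rough illustration of the first two metrics this kind of package exposes, the sketch below implements exact match and token-level F1 in the style of SQuAD-era evaluation scripts. It is a generic example, not the package's actual API; the function names here are hypothetical.

```python
# Minimal sketch of two standard QA metrics: exact match and token-level F1.
# Generic illustration only; a real package would wrap these behind its own API.
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def f1_score(prediction: str, reference: str) -> float:
    """Token-overlap F1 between normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("The Eiffel Tower", "eiffel tower"))   # 1.0
print(round(f1_score("in Paris, France", "Paris"), 2))   # 0.5
```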
Learning to route instances for Human vs AI Feedback (ACL 2025 Main)
[ACL 2024 Findings] DMoERM: Recipes of Mixture-of-Experts for Effective Reward Modeling
The code used in the paper "DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging"
RewardAnything: Generalizable Principle-Following Reward Models
Source code of our paper "Transferring Textual Preferences to Vision-Language Understanding through Model Merging", ACL 2025
Code for SFT and RL
Building an LLM with RLHF: fine-tuning on human-labeled preferences. Based on "Learning to Summarize from Human Feedback", it combines supervised fine-tuning, reward modeling, and PPO to improve response quality and alignment.
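As a rough illustration of the reward-modeling step in such an RLHF pipeline, the sketch below trains a tiny pairwise (Bradley-Terry style) reward model on preference pairs. Everything here (the embedding size, random "response features", and the `RewardHead` module) is a toy placeholder, not the repository's actual setup; a real reward model would score tokenized responses with a language-model backbone.

```python
# Toy sketch of pairwise (Bradley-Terry) reward-model training as used in RLHF.
# All shapes and data are placeholders; real setups score LM responses instead.
import torch
import torch.nn as nn

class RewardHead(nn.Module):
    """Maps a fixed-size response representation to a scalar reward."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x).squeeze(-1)

model = RewardHead()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Fake preference data: feature vectors for the chosen and rejected responses.
chosen = torch.randn(256, 64)
rejected = torch.randn(256, 64)

for step in range(100):
    r_chosen, r_rejected = model(chosen), model(rejected)
    # Pairwise logistic loss: push the chosen reward above the rejected one.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final pairwise loss: {loss.item():.3f}")
```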
Implementation for our COLM paper "Off-Policy Corrected Reward Modeling for RLHF"
Official Repository for RELIC: Enhancing Reward Model Generalization for Low-Resource Indic Languages with Few-Shot Examples