An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Updated Jul 17, 2024 · Python
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
A repo for RLHF training and BoN over LLMs, with support for reward model ensembles.
Shaping Language Models with Cognitive Insights
RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback
[TSMC] Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework
LMRax is a framework built on JAX for training transformer language models with reinforcement learning, along with reward model training.
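Several of the frameworks above train a reward model as part of the RLHF pipeline. The standard objective is the pairwise Bradley-Terry loss: given a preferred and a rejected response, maximize the log-probability that the preferred one scores higher. A minimal sketch (the function name and scalar-reward simplification are illustrative, not any listed library's API):

```python
import math

def pairwise_reward_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry pairwise loss for reward-model training:
    -log(sigmoid(r_chosen - r_rejected)).

    r_chosen / r_rejected are the scalar rewards the model assigns to the
    human-preferred and human-rejected responses, respectively. The loss
    shrinks toward 0 as the margin (r_chosen - r_rejected) grows, and
    equals log(2) when the two rewards are tied.
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

In practice this loss is averaged over a batch of preference pairs and differentiated with the framework's autodiff (e.g. JAX in LMRax); ensembling several reward models, as in the BoN repo above, simply averages or takes the minimum of their scores.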