An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & Mixtral)
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Code for the Bachelor's thesis "The Human Factor: Addressing Diversity in Reinforcement Learning from Human Feedback".
This repository contains the implementation of a Reinforcement Learning from Human Feedback (RLHF) system using custom datasets. The project uses the trlX library to train a preference model that integrates human feedback directly into the optimization of language models.
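For orientation, a minimal sketch of what a trlX training run looks like; the reward function below is a placeholder heuristic standing in for a trained preference model, and the model name and prompts are illustrative choices, not this repository's defaults:

```python
import trlx

# Placeholder reward: stands in for a trained preference model that scores
# each generated sample (higher = more preferred). Purely illustrative.
def reward_fn(samples, **kwargs):
    return [float(len(sample.split())) for sample in samples]

trainer = trlx.train(
    "gpt2",  # illustrative base model
    reward_fn=reward_fn,
    prompts=["Explain RLHF in one sentence."],
    eval_prompts=["What is a reward model?"],
)
```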
RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
[TSMC] Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
A repo for RLHF training and best-of-N (BoN) sampling over LLMs, with support for reward model ensembles.
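As a hedged sketch of the BoN-with-ensemble idea (all function names here are hypothetical, not this repo's API): draw N completions, score each with every reward model, and keep the completion with the best average score.

```python
from statistics import mean
from typing import Callable, List

def best_of_n(
    prompt: str,
    generate: Callable[[str], str],                    # hypothetical LLM sampler
    reward_models: List[Callable[[str, str], float]],  # hypothetical RM ensemble
    n: int = 16,
) -> str:
    """Sample n completions; return the one with the highest mean ensemble reward."""
    candidates = [generate(prompt) for _ in range(n)]
    # Averaging over an ensemble makes the selection harder to reward-hack
    # than scoring with any single reward model.
    scored = [(mean(rm(prompt, c) for rm in reward_models), c) for c in candidates]
    return max(scored, key=lambda t: t[0])[1]
```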
Shaping Language Models with Cognitive Insights
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
LMRax is a JAX-based framework for training transformer language models with reinforcement learning, including reward model training.
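To illustrate the reward-model side, here is a pairwise Bradley-Terry loss in JAX, a common choice for preference training; the arrays are toy placeholders, not LMRax code:

```python
import jax
import jax.numpy as jnp

def preference_loss(reward_chosen, reward_rejected):
    # Bradley-Terry pairwise objective: maximize the margin by which the
    # chosen response outscores the rejected one, -log sigmoid(r_c - r_r).
    return -jnp.mean(jax.nn.log_sigmoid(reward_chosen - reward_rejected))

# Toy reward scores for a batch of three preference pairs (placeholders).
chosen = jnp.array([1.2, 0.7, 2.1])
rejected = jnp.array([0.3, 0.9, 1.0])

loss = preference_loss(chosen, rejected)
grads = jax.grad(preference_loss)(chosen, rejected)  # gradients w.r.t. chosen scores
```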