RewardBench: the first evaluation tool for reward models.
Free and open source code of the https://tournesol.app platform. Meet the community on Discord https://discord.gg/WvcSG55Bf3
The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)
This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refer to our project website at https://sites.google.com/view/san-navistar.
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
Python-based GUI for collecting chemists' feedback on molecules
Official code for the ICML 2024 Spotlight paper "RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences"
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
PyTorch implementations for Offline Preference-Based RL (PbRL) algorithms
Data and models for the paper "Configurable Safety Tuning of Language Models with Synthetic Preference Data"
Code for "Monocular Depth Estimation via Listwise Ranking using the Plackett-Luce Model" as published at CVPR 2021.
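For reference (this sketch is not taken from the repository above), the Plackett-Luce model it uses assigns a probability to a full ranking by repeatedly drawing the top remaining item in proportion to the exponential of its score. A minimal illustration, with scores and function names chosen here for exposition:

```python
import math

def plackett_luce_prob(scores):
    """Probability of observing the ranking given by the list order
    (best first) under the Plackett-Luce model.

    Each position is a softmax choice among the items not yet ranked.
    """
    weights = [math.exp(s) for s in scores]
    prob = 1.0
    for i in range(len(weights)):
        prob *= weights[i] / sum(weights[i:])
    return prob

# Three items with equal scores: every ordering is equally likely (1/3!).
print(plackett_luce_prob([0.0, 0.0, 0.0]))  # 1/6
```

Listwise objectives like this generalize pairwise (Bradley-Terry) comparisons: with only two items, the formula reduces to the standard logistic preference probability.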
This repository contains the source code for our paper: "Feedback-efficient Active Preference Learning for Socially Aware Robot Navigation", accepted to IROS-2022. For more details, please refer to our project website at https://sites.google.com/view/san-fapl.
Preference Learning with Gaussian Processes and Bayesian Optimization
A paper under AAAI-20 review
[P]reference and [R]ule [L]earning algorithm implementation for Python 3 (https://arxiv.org/abs/1812.07895)
Code for the project: "Analysis of Recommendation-systems based on User Preferences".
UI for a straightforward Bradley-Terry feedback loop
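For context (a generic sketch, not code from the repository above), a Bradley-Terry feedback loop maintains a score per item, converts score differences into win probabilities, and nudges scores after each pairwise judgment. Function names and the learning rate below are illustrative assumptions:

```python
import math

def bt_prob(score_i, score_j):
    """Bradley-Terry probability that item i is preferred over item j."""
    return 1.0 / (1.0 + math.exp(score_j - score_i))

def bt_update(scores, winner, loser, lr=0.1):
    """One gradient-ascent step on the Bradley-Terry log-likelihood
    after observing that `winner` was preferred over `loser`."""
    p_win = bt_prob(scores[winner], scores[loser])
    scores[winner] += lr * (1.0 - p_win)
    scores[loser] -= lr * (1.0 - p_win)
    return scores

# Equal scores give a 50/50 prediction; feedback then separates them.
scores = {"a": 0.0, "b": 0.0}
print(bt_prob(scores["a"], scores["b"]))  # 0.5
bt_update(scores, winner="a", loser="b")
```

Repeating this loop over many comparisons drives the scores toward a ranking consistent with the observed preferences.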
In this project, we design a recurrent neural network to simulate Multi-Alternative Decision Field Theory (MDFT), a cognitive model of decision-making, and train the RNN to learn the parameters of MDFT.
Constructive Preference Elicitation for Social Choice With Setwise max-margin Learning.
Python library for preference-based learning