ReQueST — (Re)ward (Que)ry (S)ynthesis via (T)rajectory Optimization

ReQueST is a reward modeling algorithm that asks the user for feedback on hypothetical trajectories synthesized using a pretrained model of the environment dynamics, instead of real trajectories generated by rolling out a partially-trained agent in the environment. Compared to previous approaches, this enables

  1. training more robust reward models that work off-policy,
  2. learning about unsafe states without visiting them, and
  3. better query-efficiency through the use of active learning.
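The idea of synthesizing a query with a dynamics model, rather than rolling out an agent, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the code in this repository: the function names (`rollout`, `disagreement`, `synthesize_query`), the use of reward-ensemble disagreement as the active-learning criterion, and the random-search optimizer are all hypothetical choices made for brevity.

```python
import numpy as np


def rollout(dynamics, s0, actions):
    """Roll a dynamics model forward from s0 under a fixed action sequence."""
    states = [s0]
    for a in actions:
        states.append(dynamics(states[-1], a))
    return np.stack(states)


def disagreement(reward_models, states):
    """Variance of the ensemble's reward predictions, summed over the trajectory."""
    preds = np.stack([np.array([r(s) for s in states]) for r in reward_models])
    return float(preds.var(axis=0).sum())


def synthesize_query(dynamics, reward_models, s0, horizon=10, n_candidates=256, seed=0):
    """Pick the hypothetical trajectory the reward ensemble disagrees on most.

    Random search over action sequences stands in for the trajectory optimizer
    (an assumption; any optimizer could be substituted here).
    """
    rng = np.random.default_rng(seed)
    best_actions, best_value = None, -np.inf
    for _ in range(n_candidates):
        actions = rng.uniform(-1.0, 1.0, size=(horizon, s0.shape[0]))
        value = disagreement(reward_models, rollout(dynamics, s0, actions))
        if value > best_value:
            best_actions, best_value = actions, value
    return rollout(dynamics, s0, best_actions), best_value


# Toy stand-ins (assumptions): additive 2D dynamics and three small reward models.
def toy_dynamics(s, a):
    return s + 0.1 * a


toy_ensemble = [
    lambda s, w=w: float(np.tanh(w @ s))
    for w in (np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0]))
]

trajectory, value = synthesize_query(toy_dynamics, toy_ensemble, np.zeros(2))
```

The returned `trajectory` never touches a real environment: every state comes from the model, which is what makes it possible to ask about unsafe states without visiting them.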

This codebase implements ReQueST in three domains:

  1. An MNIST classification task.
  2. A simple state-based 2D navigation task.
  3. The Car Racing task from the OpenAI Gym.

All experiments use labels from a synthetic oracle instead of a real human.
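As a rough picture of what a synthetic oracle can look like, the sketch below labels states in a 2D navigation setting from known ground truth. Everything in it is an assumption made for illustration: the goal and hazard positions, the radius, the label names, and the functions `oracle_label` and `label_trajectory` are hypothetical and are not taken from this codebase.

```python
import numpy as np

GOAL = np.array([1.0, 1.0])    # hypothetical goal position
HAZARD = np.array([0.5, 0.5])  # hypothetical center of an unsafe region
RADIUS = 0.1                   # hypothetical radius for both regions


def oracle_label(state):
    """Label a single 2D state using ground-truth knowledge of the task."""
    state = np.asarray(state, dtype=float)
    if np.linalg.norm(state - HAZARD) <= RADIUS:
        return "unsafe"
    if np.linalg.norm(state - GOAL) <= RADIUS:
        return "good"
    return "neutral"


def label_trajectory(states):
    """Label every state of a (possibly hypothetical) trajectory."""
    return [oracle_label(s) for s in states]
```

Because the oracle is a plain function, it can answer queries about synthesized trajectories as readily as about real ones.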


  1. Set up the Anaconda virtual environment with conda env create -f environment.yml
  2. Patch the gym car_racing environment by running bash from ReQueST/scripts
  3. Replace gym/envs/box2d/ with ReQueST/scripts/
  4. Clone the world models repo
  5. Download MNIST
  6. Set wm_dir, mnist_dir, and home_dir in ReQueST/
  7. Install the rqst package with python install
  8. Download, then unzip it into ReQueST/data
  9. Open the Jupyter notebooks in ReQueST/notebooks, which provide an entry point to the codebase: there you can play around with the environments, visualize synthesized queries, and reproduce the figures from the paper.


If you find this software useful in your work, we kindly request that you cite the following paper:

  title={Learning Human Objectives by Evaluating Hypothetical Behavior},
  author={Reddy, Siddharth and Dragan, Anca D. and Levine, Sergey and Legg, Shane and Leike, Jan},
  journal={arXiv preprint arXiv:1912.05652},


This is not an officially supported Google product.


Code for the paper, "Learning Human Objectives by Evaluating Hypothetical Behavior"



