Learning Rewards from Linguistic Feedback

This repository contains (1) data, (2) model training, and (3) model analysis code to support the paper:


Install the environment via Conda:

$ conda env create -f environment.yml

Run tests:

$ python -m unittest

Data Exploration

Appendix.pdf contains additional information about models and experiments, including full transcripts from informative teacher-learner pairs.

The IPython notebooks provided under the /notebooks directory can be used to explore the datasets and re-run model evaluation.

To run them:

$ cd notebooks/
$ jupyter lab


The human-human and human-agent datasets are in notebooks/data/ as human_trial_data.json and agent_trial_data.json, respectively.

The easiest way to get started with them is to use the aaai_experiment_data_exploration.ipynb notebook.
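
For a quick look outside the notebook, the JSON files can also be inspected with the Python standard library. The following is a minimal sketch, assuming only that the files are valid JSON. The record schema is not described in this README, so the helper `summarize_json` (a hypothetical name, not part of this repository) only reports the top-level container type and size.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level structure."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)

    summary = {"type": type(data).__name__}
    if isinstance(data, (list, dict)):
        summary["length"] = len(data)
    if isinstance(data, dict):
        # Report at most five keys so large files stay readable.
        summary["sample_keys"] = sorted(map(str, data))[:5]
    return summary


if __name__ == "__main__":
    # File names are taken from this README; run from the repository root.
    for name in ("human_trial_data.json", "agent_trial_data.json"):
        path = Path("notebooks/data") / name
        if path.exists():
            print(name, summarize_json(path))
        else:
            print(name, "not found (run from the repository root)")
```

The notebook remains the better entry point for actual analysis; this helper only confirms that the files load and shows roughly how they are organized.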


Training code / scripts are in the aaai_inference_network_training.ipynb notebook. The data augmentation step will cache results in the notebooks/data/ subfolder.


Evaluation code / scripts are in the aaai_model_evaluation.ipynb notebook.

The notebook can be run independently of the training notebook, in which case it uses the pretrained models. Running it will cache results in the notebooks/data/ subfolder.

Pre-trained Models

Pretrained models are available in the data/model_training_10fold subdirectory. There is one .pt file for each cross-validation split. These models are loaded and used automatically in the aaai_model_evaluation.ipynb notebook.
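
To check which checkpoints are present without opening the notebook, the directory can be listed directly. This is a minimal sketch, not the loading code used by the notebook: `find_fold_checkpoints` and `load_checkpoint` are hypothetical helper names, and it is an assumption that the .pt files were written with `torch.save`. Whether each file stores a state dict or a whole model object is not stated here.

```python
from pathlib import Path


def find_fold_checkpoints(model_dir):
    """Return the .pt files directly inside model_dir, sorted by name."""
    return sorted(Path(model_dir).glob("*.pt"))


def load_checkpoint(path):
    """Load one checkpoint onto the CPU (requires PyTorch)."""
    import torch  # imported lazily so that listing works without PyTorch

    # Assumption: the file was written with torch.save. The kind of object
    # it contains (state dict or full model) is not documented in this README.
    return torch.load(path, map_location="cpu")


if __name__ == "__main__":
    # Path as given in this README; adjust it to your working directory.
    for ckpt in find_fold_checkpoints("data/model_training_10fold"):
        print(ckpt.name)
```

If the list is empty, the path is probably being resolved relative to the wrong working directory.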