tsumers/rewards


Learning Rewards from Linguistic Feedback

This repository contains (1) data, (2) model training, and (3) model analysis code to support the paper: https://arxiv.org/abs/2009.14715

Requirements

Install the environment via Conda:

$ conda env create -f environment.yml

Run tests:

$ python -m unittest

Data Exploration

Appendix.pdf contains additional information about models and experiments, including full transcripts from informative teacher-learner pairs.

The Jupyter notebooks provided under the /notebooks directory can be used to explore the datasets and re-run model evaluation.

To run them:

$ cd notebooks/
$ jupyter lab

Datasets

The human-human and human-agent datasets are in notebooks/data/ as human_trial_data.json and agent_trial_data.json, respectively.
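For a quick look at the two files outside a notebook, a minimal sketch like the following may help. It assumes only that both files are valid JSON and that the script is run from the repository root. The internal structure of the data is not described here, so the helper `summarize_json` (a hypothetical name, not part of the repository) reports just the top-level type and size.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and return (top-level type name, number of top-level entries)."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    # Lists and dicts have a natural size; any other JSON value counts as one entry.
    size = len(data) if isinstance(data, (list, dict)) else 1
    return type(data).__name__, size


if __name__ == "__main__":
    # File names are taken from this README; the working directory is an assumption.
    for name in ("human_trial_data.json", "agent_trial_data.json"):
        path = Path("notebooks/data") / name
        if path.exists():
            print(name, summarize_json(path))
        else:
            print(name, "not found (run from the repository root)")
```

For anything beyond a first look, the exploration notebook remains the better starting point.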

The easiest way to get started with them is to use the aaai_experiment_data_exploration.ipynb notebook.

Training

The training code is in the aaai_inference_network_training.ipynb notebook. The data augmentation step caches its results in the notebooks/data/ subfolder.

Evaluation

Evaluation code / scripts are in the aaai_model_evaluation.ipynb notebook.

This notebook can be run independently of the training notebook and uses the pretrained models. Running it caches results in the notebooks/data/ subfolder.
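Both the training and evaluation notebooks cache intermediate results on disk. The notebooks' actual caching code is not shown here, so the following is only a generic sketch of the compute-or-load pattern. The helper name `cached` and the use of pickle are assumptions for illustration.

```python
import pickle
from pathlib import Path


def cached(path, compute):
    """Return the pickled result at `path` if it exists; otherwise compute, save, and return it."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = compute()
    # Create the cache directory on first use.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result
```

With a pattern like this, deleting the cached file forces the step to be recomputed on the next run.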

Pretrained Models

Pretrained models are available in the data/model_training_10fold subdirectory. There is one .pt file for each cross-validation split. These models are loaded and used automatically in the aaai_model_evaluation.ipynb notebook.
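To inspect the checkpoints outside the evaluation notebook, a sketch along these lines may be useful. It assumes the directory path is given relative to the current working directory. It does not assume anything about what each .pt file contains (a full model or a state dict), since that is not stated here. The function names are hypothetical.

```python
from pathlib import Path


def find_fold_checkpoints(model_dir):
    """Return the .pt files directly under model_dir, sorted by file name."""
    return sorted(Path(model_dir).glob("*.pt"))


def load_checkpoint(path):
    """Load one checkpoint onto the CPU. Requires PyTorch."""
    import torch  # imported lazily so that listing files works without PyTorch

    # Depending on the PyTorch version and how the file was saved,
    # extra arguments to torch.load may be needed.
    return torch.load(path, map_location="cpu")


if __name__ == "__main__":
    # Directory name taken from this README; adjust it to the actual location.
    for ckpt in find_fold_checkpoints("data/model_training_10fold"):
        print(ckpt.name)
```

Listing the files first is a cheap way to confirm that there is one checkpoint per cross-validation split before loading any of them.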