HandMeThat: Human-Robot Communication in Physical and Social Environments

This repository contains the code used to generate the HandMeThat dataset and to evaluate agents on it.

HandMeThat: Human-Robot Communication in Physical and Social Environments

Yanming Wan*, Jiayuan Mao*, and Joshua B. Tenenbaum

[Paper] [Supplementary Material] [Project Page] (* indicates equal contributions.)

Prerequisites

Install Source

Clone this repository:

git clone https://github.com/Simon-Wan/HandMeThat

Clone the third party repositories (XTX, ALFWorld):

git clone https://github.com/princeton-nlp/XTX.git
git clone https://github.com/alfworld/alfworld.git

Add the packages to your PYTHONPATH environment variable:

export PYTHONPATH=.:$PYTHONPATH:<path_to_xtx>:<path_to_alfworld>
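Alternatively, you can extend the module search path at the top of your own scripts instead of exporting the variable. A minimal sketch of our own, with placeholder paths that you should point at your local clones:

import sys

sys.path.insert(0, '/path/to/XTX')       # placeholder: your XTX clone
sys.path.insert(0, '/path/to/alfworld')  # placeholder: your alfworld clone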

Create a Conda Environment

Create a conda environment for HandMeThat and install the requirements:

conda create -n hand-me-that python=3.9
conda activate hand-me-that
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
conda install numpy scipy pyyaml networkx tabulate
conda install h5py tqdm click transformers
conda install -c conda-forge importlib_metadata
pip install jericho lark textworld opencv-python ai2thor jacinle
python -m spacy download en_core_web_sm

These packages include the Python dependencies required by the third-party repositories.
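After installation, a quick import check can confirm that the main dependencies are available. This is a minimal sketch of our own, not a repository script; it only verifies that the packages import and that the spaCy model is present:

import torch
import numpy, scipy, networkx, h5py, transformers
import jericho, textworld
import spacy

print('torch:', torch.__version__, '| CUDA available:', torch.cuda.is_available())
spacy.load('en_core_web_sm')  # raises OSError if the model was not downloaded
print('all imports succeeded')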

Prepare the HandMeThat Dataset

We provide two versions of the HandMeThat dataset. Version 1 (V1) is the original dataset presented in the HandMeThat paper. Version 2 (V2) is a later release containing more data pieces over a smaller set of tasks. Please refer to the Dataset Generation section for more details.

Download the version 1 (V1) dataset from the Google Drive link and place the zipped file at ./datasets/v1.

Unzip the dataset so that ./datasets/v1/HandMeThat_with_expert_demonstration is a folder containing 10,000 JSON files.

Download the version 2 (V2) dataset from the Google Drive link and place the files at ./datasets/v2.

Unzip the dataset so that ./datasets/v2/HandMeThat_with_expert_demonstration is a folder containing 116,146 JSON files. The data split information is provided in ./datasets/v2/HandMeThat_data_info.json.
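To sanity-check the download, you can count the extracted files and peek at the split metadata. A minimal sketch of our own; the internal schema of HandMeThat_data_info.json is not documented here, so we only print its top-level structure:

import glob
import json

files = glob.glob('./datasets/v2/HandMeThat_with_expert_demonstration/*.json')
print('number of data files:', len(files))  # expected: 116,146 for V2

with open('./datasets/v2/HandMeThat_data_info.json') as f:
    data_info = json.load(f)
if isinstance(data_info, dict):
    print('top-level keys:', list(data_info.keys()))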

Quickstart

Play a HandMeThat game:

from data_generation.text_interface.jericho_env import HMTJerichoEnv
import numpy as np
step_limit = 40
dataset = './datasets/v2/HandMeThat_with_expert_demonstration'
eval_env = HMTJerichoEnv(dataset, split='test', fully=False, step_limit=step_limit)
obs, info = eval_env.reset()
print(obs.replace('. ', '.\n'))
for _ in range(step_limit):
    action = input('> ')
    # uncomment the following part to get started with a random agent instead
    # _ = input('Press [Enter] to continue')
    # action = np.random.choice(info['valid'])
    # print('Action:', action)
    obs, reward, done, info = eval_env.step(action)
    print(obs.replace('. ', '.\n'), '\n\n')
    if done:
        break
print('moves: {}, score: {}'.format(info['moves'], info['score']))

Run python main.py to execute the quickstart code.
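Beyond a single interactive episode, the same interface supports scripted rollouts. Below is a minimal sketch of a random-agent evaluation loop over several episodes; it reuses the HMTJerichoEnv API from the quickstart, but the episode loop and average-score computation are our own additions rather than a repository script:

from data_generation.text_interface.jericho_env import HMTJerichoEnv
import numpy as np

step_limit = 40
dataset = './datasets/v2/HandMeThat_with_expert_demonstration'
env = HMTJerichoEnv(dataset, split='test', fully=False, step_limit=step_limit)

scores = []
for episode in range(10):
    obs, info = env.reset()
    for _ in range(step_limit):
        action = np.random.choice(info['valid'])  # uniformly random valid action
        obs, reward, done, info = env.step(action)
        if done:
            break
    scores.append(info['score'])
print('average score over {} episodes: {:.2f}'.format(len(scores), float(np.mean(scores))))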

Dataset Generation

To generate the HandMeThat dataset:

python data_generation/generation.py --num 1000 --quest_type bring_me

To generate HandMeThat data for a particular goal, add the --goal argument to the command above to specify the goal index.

Configure the Sampling Space

  1. The object hierarchy and the initial-position sampling space are specified in text files.
  2. All current goals are listed in ./data_generation/sampling/goal_sampling.py, and new goals can be specified using the given templates.
  3. To adjust the number of objects in each category, please refer to the code.

Differences between V1 and V2 Datasets

  1. V2 only contains tasks on 25 selected goal templates that are more easily predictable by humans.
  2. V2 only contains "bring me" type instructions and mainly focuses on pick-and-place tasks.
  3. We generate more data for each specific goal.
  4. We add a "subgoal" to each data piece: a first-order logic (FOL) sequence corresponding to the desired actions, which can be used for goal inference.
  5. We revise the random truncation of the human trajectory, as well as how the human generates an utterance, to ensure that most generated tasks are solvable by humans.

Baseline Models Training and Evaluation

The current release contains the basic training settings for the Seq2Seq, DRRN, and offline-DRRN models. The models can be evaluated on the validation and test splits.

We tested each model in both the fully- and partially-observable settings on all four hardness levels. The experimental results are presented in the main paper and the supplementary materials. The hyperparameters we used are the default values in this repository.

DRRN / offline-DRRN

To train the model (e.g., 'DRRN' in the 'fully' observable setting):

python scripts/train_rl.py --model DRRN --observability fully

To evaluate the model (e.g., on the validation split) at a specific hardness level (e.g., level1):

python scripts/eval_rl.py --model DRRN --observability fully --level level1 --eval_split validate --memory_file memory_5 --weight_file weights_5

Use --model offlineDRRN for the offline-DRRN setting.

Seq2Seq

To train the model (e.g., in the 'partially' observable setting):

python scripts/train_seq.py --observability partially

To evaluate the model (e.g., on the test split) at a specific hardness level (e.g., level1):

python scripts/eval_seq.py --observability partially --level level1 --eval_split test --eval_model_name weights_50000.pt

Random Agent

To evaluate the random agent:

python scripts/eval.py --agent random --level level1 --eval_split test
