
Data-Efficient Learning of Natural Language to Linear Temporal Logic Translators for Robot Task Specification

Demo

[Homepage] [Paper] [Video] [Poster]

The associated repository for the paper "Data-Efficient Learning of Natural Language to Linear Temporal Logic Translators for Robot Task Specification".

Repo Structure

The constrained-decoding inference code is based on microsoft/semantic_parsing_with_constrained_lm.

Reproduce the Results

The following instructions reproduce the results for our model. For the baselines, check out this link.

Environment Setup

Install the dependencies:

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113 # make sure the version is compatible with your CUDA version
pip install transformers datasets
pip install sentencepiece
pip install jsons appdirs blobfile cached-property httpx typer whoosh more_itertools
pip install --upgrade protobuf==3.20.0
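
As an optional sanity check (the printed versions will simply be whatever your environment resolved), you can confirm that PyTorch sees the GPU and that the main libraries import cleanly:

# Optional sanity check: confirm CUDA visibility and that key libraries import.
python -c "import torch; print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())"
python -c "import transformers, datasets, sentencepiece; print('transformers', transformers.__version__)"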

Download BART-large model:

cd ./run
python ./semantic_parsing_with_constrained_lm/finetune/download_huggingface_lms.py
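
As an optional check that the download succeeded, you can try loading the checkpoint. The huggingface_models/bart-large path is an assumption inferred from the PRETRAINED_MODEL_DIR used in the training step below, and the command is run from the ./run directory:

# Optional: verify the BART-large checkpoint loads from the local directory
# (path is an assumption based on PRETRAINED_MODEL_DIR below).
python -c "from transformers import BartForConditionalGeneration; m = BartForConditionalGeneration.from_pretrained('huggingface_models/bart-large'); print('loaded bart-large with', m.num_parameters(), 'parameters')"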

Prepare the Dataset

The processed dataset (with augmentation from an LLM) is already included in the repo. This step is only needed if you want to reprocess the dataset.

To process the raw dataset yourself, follow the steps below:

  1. Pre-process: In each of the three dataset folders, run all cells in "preprocess.ipynb" to generate the processed dataset (the annotation results are included in the notebook).
  2. Augmentation: For each of the three datasets, run all cells in "augment.ipynb" to generate the augmented dataset. Note that this step requires a GPT-3 API key.
  3. Move to the training folder: Reformat the dataset and move it to the run/semantic_parsing_with_constrained_lm/domains/ltl/data folder (a rough sketch of this step is shown after this list). A script to automate this process will be provided later.
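
Until that script is available, a minimal sketch of step 3 might look like the following. The source folder and the *.jsonl file pattern are purely illustrative assumptions; adapt them to whatever your preprocess/augment notebooks actually produce.

# Hypothetical sketch of step 3: copy one dataset's processed files into the
# training data folder. SRC and the *.jsonl pattern are assumptions, not the
# repo's actual layout.
DATASET=pick                # one of: drone, cleanup, pick
SRC=./${DATASET}            # hypothetical location of the notebook outputs
DST=./run/semantic_parsing_with_constrained_lm/domains/ltl/data
mkdir -p ${DST}
cp ${SRC}/*.jsonl ${DST}/   # reformat the fields first if they differ from the expected format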

Train

In our paper, we use the BART-large model because it is efficient to fine-tune on a single GPU. Our proposed method can be easily applied to other potentially stronger language models like T5-XXL or GPT-3.

export PRETRAINED_MODEL_DIR=huggingface_models/bart-large
export TRAINED_MODEL_DIR=trained_models/

cd ./run
DOMAIN=TODO # for example, DOMAIN=pick-syn-aug
python -m semantic_parsing_with_constrained_lm.finetune.lm_finetune \
        --config-name semantic_parsing_with_constrained_lm.finetune.configs.emnlp_train_config \
        --exp-names ltl_${DOMAIN}_utterance

Here, DOMAIN determines which experiment to run and takes the form {dataset_name}-{experiment_name} (a loop over several domains is sketched after the list below):

  • dataset_name: {drone, cleanup, pick}
  • experiment_name:
    • syn-aug: synthetic with augmentation
    • syn: synthetic without augmentation
    • golden-cross0-split{0,1,2,3,4}: golden dataset with cross-validation
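
For example, to fine-tune on the synthetic-with-augmentation domain of all three datasets in one go, the training command above can simply be looped (a convenience sketch; each iteration is identical to running the command manually):

cd ./run
for DATASET in drone cleanup pick; do
    DOMAIN=${DATASET}-syn-aug
    python -m semantic_parsing_with_constrained_lm.finetune.lm_finetune \
            --config-name semantic_parsing_with_constrained_lm.finetune.configs.emnlp_train_config \
            --exp-names ltl_${DOMAIN}_utterance
done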

Inference

export PRETRAINED_MODEL_DIR=huggingface_models/bart-large
export TRAINED_MODEL_DIR=trained_models/

DOMAIN=TODO

python -m semantic_parsing_with_constrained_lm.run_exp \
--config-name semantic_parsing_with_constrained_lm.configs.ltl_config \
--log-dir logs/ \
--model Bart \
--eval-split test-full \
--exp-names "ltl_Bart_test-full_${DOMAIN}_constrained_utterance_train-0"

The domain name is the same as in the training step.
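
For instance, to evaluate all five cross-validation splits of one golden dataset, the command can be looped over the split index. This sketch assumes a model has already been trained for each split and that the command is run from the ./run directory, as in training:

cd ./run
DATASET=drone               # one of: drone, cleanup, pick
for SPLIT in 0 1 2 3 4; do
    DOMAIN=${DATASET}-golden-cross0-split${SPLIT}
    python -m semantic_parsing_with_constrained_lm.run_exp \
        --config-name semantic_parsing_with_constrained_lm.configs.ltl_config \
        --log-dir logs/ \
        --model Bart \
        --eval-split test-full \
        --exp-names "ltl_Bart_test-full_${DOMAIN}_constrained_utterance_train-0"
done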

Cite

@article{pan2023data,
  title={Data-Efficient Learning of Natural Language to Linear Temporal Logic Translators for Robot Task Specification},
  author={Pan, Jiayi and Chou, Glen and Berenson, Dmitry},
  journal={arXiv preprint arXiv:2303.08006},
  year={2023}
}
