ViLaIn

The official implementation of the Vision-Language Interpreter (ViLaIn). See our paper for more details.

Requirements

This implementation requires Python>=3.10 and torch>=2.0.0. To install PyTorch, follow the instructions at https://pytorch.org/.
Install fast-downward and VAL by following their build instructions. After installation, copy the validate binary into the downward directory.
Install Grounding DINO by following its installation instructions.

The ProDG dataset

The data directory contains PDDL files, observations, and instructions for three domains. The paper refers to this collection as the ProDG dataset. Each domain also includes annotated bounding boxes in annotated_bboxes. The directory structure is as follows:

data
 └─domains
    └─domain.pddl                   (A PDDL domain file)
    └─problems                      (PDDL problem files)
       └─problem*.pddl
    └─observations                  (Observations for the initial state)
       └─problem*.jpg
    └─instructions                  (Linguistic instructions)
       └─problem*.txt
    └─annotated_bboxes              (Annotated bounding boxes)
       └─problem*.json
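
As a rough illustration of how this layout can be traversed, the sketch below groups the files of one domain by problem name. It assumes that files belonging to the same problem share a stem across the four subdirectories (for example problem1.pddl and problem1.jpg), which the tree suggests but does not state. The names LAYOUT, collect_problems, and missing_parts are hypothetical helpers and are not part of this repository.

```python
from pathlib import Path

# Subdirectory name -> file extension, taken from the tree above.
LAYOUT = {
    "problems": ".pddl",
    "observations": ".jpg",
    "instructions": ".txt",
    "annotated_bboxes": ".json",
}


def collect_problems(domain_dir):
    """Group the files of one domain by problem stem (e.g. 'problem1').

    Assumption: matching files share the same stem in every subdirectory.
    """
    domain_dir = Path(domain_dir)
    records = {}
    for subdir, ext in LAYOUT.items():
        for path in sorted((domain_dir / subdir).glob(f"problem*{ext}")):
            records.setdefault(path.stem, {})[subdir] = path
    return records


def missing_parts(records):
    """Return {stem: [missing subdirectory names]} for incomplete problems."""
    return {
        stem: [sub for sub in LAYOUT if sub not in parts]
        for stem, parts in records.items()
        if len(parts) < len(LAYOUT)
    }
```

Calling collect_problems("./data/cooking") and then missing_parts on the result gives a quick check that every problem has its observation, instruction, and bounding-box annotation.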

Results

results/reported_results contains the generated PDDL problems and found plans reported in the paper. Each domain has three subdirectories:

plain: results without corrective reprompting
refine_once: results after applying corrective reprompting to the problems in plain
refine_twice: results after applying corrective reprompting to the problems in refine_once
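
For a quick overview of what each stage contains, something like the following can be used. It is only a sketch: the layout results/reported_results/<domain>/<step> is inferred from the description above rather than confirmed, and count_result_files is a hypothetical helper, not part of this repository.

```python
from pathlib import Path

# The three stages named above, in the order they are produced.
STEPS = ("plain", "refine_once", "refine_twice")


def count_result_files(root, domain):
    """Count files under each stage directory for one domain.

    Assumption: results are stored as <root>/<domain>/<step>/...
    A stage directory that does not exist is reported as 0.
    """
    counts = {}
    for step in STEPS:
        step_dir = Path(root) / domain / step
        if step_dir.is_dir():
            counts[step] = sum(1 for p in step_dir.rglob("*") if p.is_file())
        else:
            counts[step] = 0
    return counts
```

For example, count_result_files("./results/reported_results", "cooking") returns a dictionary with one file count per stage.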

Getting Started

Detecting Objects and Generating Captions

To detect objects with bounding boxes and generate captions, run:

export domain=cooking
export grounding_dino_dir=./GroundingDINO
export result_dir=./results/temp/${domain}

python scripts/main.py \
    --data_dir "./data/${domain}" \
    --result_dir "${result_dir}" \
    --grounding_dino_dir ${grounding_dino_dir} \
    --predict_bboxes

Run this step before generating PDDL problems, because problem generation relies on the predicted bounding boxes and captions.

Generating PDDL Problems and Finding Plans

To generate PDDL problems based on the predicted bounding boxes and captions and find plans, run:

export domain=cooking
export downward_dir=./downward
export result_dir=./results/temp/${domain}
export num_repeat=2
export num_examples=3

python scripts/main.py \
    --downward_dir ${downward_dir} \
    --data_dir "./data/${domain}" \
    --result_dir "${result_dir}" \
    --num_repeat ${num_repeat} \
    --num_examples ${num_examples} \
    --gen_step "plain" \
    --generate_problem \
    --find_plan

Evaluating Generated PDDL Problems and Found Plans

To evaluate the generated PDDL problems and validate the found plans, run:

export domain=cooking
export downward_dir=./downward
export result_dir=./results/temp/${domain}
export num_repeat=2

python scripts/evaluate.py \
    --downward_dir ${downward_dir} \
    --data_dir "./data/${domain}" \
    --result_dir "${result_dir}" \
    --num_repeat ${num_repeat} \
    --gen_step "plain"

Refining Generated PDDL Problems

To refine the generated PDDL problems by corrective reprompting, run:

export domain=cooking
export downward_dir=./downward
export result_dir=./results/temp/${domain}
export num_repeat=2

python scripts/main.py \
    --downward_dir ${downward_dir} \
    --data_dir "./data/${domain}" \
    --result_dir "${result_dir}" \
    --num_repeat ${num_repeat} \
    --gen_step "refine_once" \
    --prev_gen_step "plain" \
    --refine_problem \
    --use_cot \
    --find_plan
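
The same command can be assembled programmatically, for instance to run a second refinement round. The sketch below only reuses the flags shown above. refine_command is a hypothetical helper, and pairing --gen_step "refine_twice" with --prev_gen_step "refine_once" is an assumption inferred from the directory names in the Results section, not something this README states.

```python
import shlex


def refine_command(domain, gen_step, prev_gen_step,
                   num_repeat=2, downward_dir="./downward"):
    """Build the argument list for one corrective-reprompting round.

    Only flags that appear in the command above are used; the paths
    mirror the exported variables in that example.
    """
    return [
        "python", "scripts/main.py",
        "--downward_dir", downward_dir,
        "--data_dir", f"./data/{domain}",
        "--result_dir", f"./results/temp/{domain}",
        "--num_repeat", str(num_repeat),
        "--gen_step", gen_step,
        "--prev_gen_step", prev_gen_step,
        "--refine_problem",
        "--use_cot",
        "--find_plan",
    ]


if __name__ == "__main__":
    # Assumed chain: plain -> refine_once -> refine_twice.
    rounds = [("refine_once", "plain"), ("refine_twice", "refine_once")]
    for step, prev in rounds:
        print(shlex.join(refine_command("cooking", step, prev)))
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the repository root, so that a failed round stops the chain.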

Citation

@misc{shirai2023visionlanguage,
      title={Vision-Language Interpreter for Robot Task Planning}, 
      author={Keisuke Shirai and Cristian C. Beltran-Hernandez and Masashi Hamaya and Atsushi Hashimoto and Shohei Tanaka and Kento Kawaharazuka and Kazutoshi Tanaka and Yoshitaka Ushiku and Shinsuke Mori},
      year={2023},
      eprint={2311.00967},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}
