QingrongH/LLM-TPC

LLM-TPC

Code for the paper "Think-Program-reCtify: 3D Situated Reasoning with Large Language Models"

[Project Page] [Paper]

Install

conda create -n llm-tpc python=3.9 -y
conda activate llm-tpc
pip install openai==0.28 numpy scikit-learn matplotlib omegaconf torch torch_redstone einops tqdm open_clip_torch trimesh plyfile shapely
pip install dgl-cu113 -f https://data.dgl.ai/wheels/repo.html

Dataset

Organize the data under the data directory as follows.

data
├── openshape
│   ├── model.pt
│   └── open_clip_pytorch_model.bin
├── qa
│   └── SQA_test.json
├── scans
│   ├── scene0000_00
│   │   ├── scene0000_00_vh_clean_2.0.010000.segs.json
│   │   ├── scene0000_00_vh_clean_2.labels.ply
│   │   ├── scene0000_00_vh_clean_2.ply
│   │   ├── scene0000_00.aggregation.json
│   │   └── scene0000_00.txt
│   └── ...
└── scannetv2-labels.combined.tsv
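
Before running inference, it can help to confirm that the layout above is complete. The following is a minimal sketch, not part of the LLM-TPC codebase: the function name `find_missing_files` is hypothetical, and the file names are copied from the tree above. It assumes every scene folder contains the same five files as `scene0000_00`.

```python
from pathlib import Path

# Per-scene file suffixes, copied from the directory tree above.
SCENE_SUFFIXES = [
    "_vh_clean_2.0.010000.segs.json",
    "_vh_clean_2.labels.ply",
    "_vh_clean_2.ply",
    ".aggregation.json",
    ".txt",
]

# Files shared across scenes, relative to the data root.
SHARED_FILES = [
    "openshape/model.pt",
    "openshape/open_clip_pytorch_model.bin",
    "qa/SQA_test.json",
    "scannetv2-labels.combined.tsv",
]


def find_missing_files(data_root, scene_ids):
    """Return the expected paths (relative to data_root) that do not exist."""
    root = Path(data_root)
    expected = list(SHARED_FILES)
    for scene_id in scene_ids:
        for suffix in SCENE_SUFFIXES:
            expected.append(f"scans/{scene_id}/{scene_id}{suffix}")
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    # "data" and the scene ID are example values; adjust them to your setup.
    for rel in find_missing_files("data", ["scene0000_00"]):
        print("missing:", rel)
```

An empty result means every expected file was found for the scenes you listed.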

ScanNet

To obtain access to the ScanNet dataset, refer to ScanNet and follow the instructions there. Once your request is approved, you will receive a download-scannet.py script. Use the commands below to download only the portion of ScanNet that LLM-TPC needs:

python download-scannet.py -o data --type _vh_clean_2.0.010000.segs.json
python download-scannet.py -o data --type _vh_clean_2.labels.ply
python download-scannet.py -o data --type _vh_clean_2.ply
python download-scannet.py -o data --type .aggregation.json
python download-scannet.py -o data --type .txt
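
The five commands above differ only in the `--type` argument, so they can be generated from a list. The sketch below is an illustration under assumptions: `build_download_commands` is a hypothetical helper, and it uses only the `-o` and `--type` flags shown above. It prints the commands rather than running them, since the download script may ask for interactive confirmation.

```python
import shlex
import sys

# File-type suffixes, copied from the commands above.
FILE_TYPES = [
    "_vh_clean_2.0.010000.segs.json",
    "_vh_clean_2.labels.ply",
    "_vh_clean_2.ply",
    ".aggregation.json",
    ".txt",
]


def build_download_commands(out_dir="data", script="download-scannet.py",
                            python="python"):
    """Return one argument list per file type, ready for subprocess.run."""
    return [
        [python, script, "-o", out_dir, "--type", file_type]
        for file_type in FILE_TYPES
    ]


if __name__ == "__main__":
    # Use the current interpreter so the commands match the active conda env.
    for cmd in build_download_commands(python=sys.executable):
        print(shlex.join(cmd))
```

To execute the commands instead of printing them, each argument list could be passed to `subprocess.run(cmd, check=True)`.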

SQA3D

Download the question-answer pairs from SQA3D and put SQA_test.json under data/qa.
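
After placing the file, a quick check that it parses as JSON can catch a truncated download. This is a schema-agnostic sketch: it makes no assumption about the fields inside SQA_test.json, and `summarize_json` is a hypothetical helper, not part of the repository.

```python
import json
import os


def summarize_json(path):
    """Load a JSON file and report its top-level type and size."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    summary = {"type": type(data).__name__}
    if isinstance(data, (list, dict)):
        summary["size"] = len(data)
    if isinstance(data, dict):
        # Show at most ten top-level keys to keep the output short.
        summary["keys"] = sorted(data)[:10]
    return summary


if __name__ == "__main__":
    path = "data/qa/SQA_test.json"  # location given in the layout above
    if os.path.exists(path):
        print(summarize_json(path))
    else:
        print("not found:", path)
```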

OpenShape

We use the pointbert-vitg14-rgb and OpenCLIP ViT-bigG-14 checkpoints from OpenShape. Download model.pt from here and open_clip_pytorch_model.bin from here. Put both files under data/openshape.

Inference

cd scripts
# Input your OPENAI_API_KEY in 'llm-tpc/config.json'
python example.py --agent llm-tpc/config.json

Evaluation

cd scripts
python eval.py --log_dir ../logs/test/llm-tpc

Visualization

cd src/dataset
python visualize_bbox.py

Acknowledgement

  • Agents: the codebase we built upon.
  • ReferIt3D: we design APIs for spatial relation recognition based on ReferIt3D.
  • OpenShape: we design APIs for open-vocabulary object attribute classification based on OpenShape.
  • ScanRefer: code for visualization.

Citation

@article{qingrong2024llm-tpc,
  title={Think-Program-reCtify: 3D Situated Reasoning with Large Language Models},
  author={Qingrong He and Kejun Lin and Shizhe Chen and Anwen Hu and Qin Jin},
  journal={arXiv preprint arXiv:2404.14705},
  year={2024}
}
