Code for the paper "Think-Program-reCtify: 3D Situated Reasoning with Large Language Models"
[Project Page] [Paper]
## Installation

```shell
conda create -n llm-tpc python=3.9 -y
conda activate llm-tpc
pip install openai==0.28 numpy scikit-learn matplotlib omegaconf torch torch_redstone einops tqdm open_clip_torch trimesh plyfile shapely
pip install dgl-cu113 -f https://data.dgl.ai/wheels/repo.html
```
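To confirm the environment is complete before running anything, a small sanity check like the following can be used (a sketch, not part of the repo; the list of import names is inferred from the pip commands above):

```python
import importlib.util

# Import names for the packages installed above (pip names can differ,
# e.g. scikit-learn -> sklearn, open_clip_torch -> open_clip).
REQUIRED = ["openai", "numpy", "sklearn", "matplotlib", "omegaconf",
            "torch", "einops", "tqdm", "open_clip", "trimesh",
            "plyfile", "shapely", "dgl"]

def missing_packages(names=REQUIRED):
    """Return the subset of package names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    missing = missing_packages()
    print("All dependencies found." if not missing
          else "Missing: " + ", ".join(missing))
```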
## Data Preparation

Organize the data under `data/` as follows:

```
data
├── openshape
│   ├── model.pt
│   └── open_clip_pytorch_model.bin
├── qa
│   └── SQA_test.json
├── scans
│   ├── scene0000_00
│   │   ├── scene0000_00_vh_clean_2.0.010000.segs.json
│   │   ├── scene0000_00_vh_clean_2.labels.ply
│   │   ├── scene0000_00_vh_clean_2.ply
│   │   ├── scene0000_00.aggregation.json
│   │   └── scene0000_00.txt
│   └── ...
└── scannetv2-labels.combined.tsv
```
To acquire access to the ScanNet dataset, please refer to ScanNet and follow the instructions there. You will receive a `download-scannet.py` script once your request is approved. Use the commands below to download the portion of ScanNet required by LLM-TPC:

```shell
python download-scannet.py -o data --type _vh_clean_2.0.010000.segs.json
python download-scannet.py -o data --type _vh_clean_2.labels.ply
python download-scannet.py -o data --type _vh_clean_2.ply
python download-scannet.py -o data --type .aggregation.json
python download-scannet.py -o data --type .txt
```
Download the question-answer pairs from SQA3D and put `SQA_test.json` under `data/qa`.
We use the pointbert-vitg14-rgb and OpenCLIP ViT-bigG-14 checkpoints from OpenShape. Download `model.pt` from here and `open_clip_pytorch_model.bin` from here, then put them under `data/openshape`.
## Running LLM-TPC

```shell
cd scripts
# Input your OPENAI_API_KEY in 'llm-tpc/config.json'
python example.py --agent llm-tpc/config.json
```
## Evaluation

```shell
cd scripts
python eval.py --log_dir ../logs/test/llm-tpc
```
## Visualization

```shell
cd src/dataset
python visualize_bbox.py
```
## Acknowledgements

- Agents: the codebase we built upon.
- ReferIt3D: we design APIs for spatial relation recognition based on ReferIt3D.
- OpenShape: we design APIs for open-vocabulary object attribute classification based on OpenShape.
- ScanRefer: code for visualization.
## Citation

```bibtex
@article{qingrong2024llm-tpc,
  title={Think-Program-reCtify: 3D Situated Reasoning with Large Language Models},
  author={Qingrong He and Kejun Lin and Shizhe Chen and Anwen Hu and Qin Jin},
  journal={arXiv preprint arXiv:2404.14705},
  year={2024}
}
```