
PointLLM-R: Enhancing 3D Point Cloud Reasoning via Chain-of-Thought

Chaoqi Chen*  Qile Xu*  Wenjun Zhou  Hui Huang
Visual Computing Research Center (VCC), Shenzhen University
*Equal contribution   Corresponding author

     



About

This is the official repository for the ACM SIGGRAPH 2026 paper "PointLLM-R: Enhancing 3D Point Cloud Reasoning via Chain-of-Thought".

Existing 3D multimodal large language models (MLLMs) produce answers via direct question-to-answer mapping, without explicit intermediate reasoning — making their outputs uninterpretable and brittle on complex queries involving multi-hop inference, functional attribute judgment, or commonsense integration.

We address this by proposing a two-stage data generation framework that automatically produces high-quality Chain-of-Thought (CoT) annotations for 3D point cloud QA. The first stage refines raw QA pairs through multi-dimensional quality assessment. The second stage employs HiLPO (Human-in-the-Loop Prompt Optimization), an iterative prompt refinement mechanism that guides an LLM to generate structured, step-by-step rationales grounded in 3D geometry. Using this pipeline, we construct PoCoTI, the first large-scale 3D point cloud dataset with CoT annotations (~55K samples). Fine-tuning PointLLM on PoCoTI yields PointLLM-R-7B, a model that generates verifiable reasoning paths before answering, outperforming all baselines — including 13B-parameter models — on generative 3D classification and captioning benchmarks.

This repo is built on top of PointLLM. Installation, environment setup, Objaverse data preparation, inference, and standard evaluation all follow PointLLM's instructions. This README focuses on what is new: the PoCoTI dataset, the fine-tuning procedure, and differences in evaluation.


Installation

Follow the PointLLM installation guide; the requirements are identical. Then install this repository in editable mode from its root directory:

pip install -e .

Data Preparation

Objaverse Point Clouds

Follow PointLLM's data preparation instructions to download the Objaverse colored point cloud files. The default expected path is ./data/objaverse_data.

PoCoTI Dataset

PoCoTI contains ~55K point cloud QA pairs, each with a structured 5-step CoT annotation. Download from HuggingFace:

huggingface-cli download QileXu/PoCoTI-55K \
    --repo-type dataset \
    --local-dir ./data/anno_data

Make sure the annotation file is located at ./data/anno_data/PoCoTI_55k.json. Each entry has the following structure:

{
  "object_id": "<objaverse_object_id>",
  "conversation_type": "single_round",
  "conversations": [
    {
      "from": "human",
      "value": "<point>\n<question>"
    },
    {
      "from": "gpt",
      "value": "<REASONING>\nStep 1: ...\nStep 2: ...\nStep 3: ...\nStep 4: ...\nStep 5: ...\n</REASONING>\n<ANSWER> ... </ANSWER>"
    }
  ]
}

object_id maps to the Objaverse point cloud files in ./data/objaverse_data.
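To work with the annotations programmatically, the reasoning steps and the final answer can be separated from the response string. The following is a minimal sketch, not part of the released code: the function names (parse_cot_response, parse_entry, load_pocoti) and the sample entry are hypothetical, and it assumes that the annotation file is a JSON list of single-round entries shaped exactly like the structure shown above.

```python
import json
import re

# Patterns for the tag layout shown in the entry structure above.
REASONING_RE = re.compile(r"<REASONING>\s*(.*?)\s*</REASONING>", re.DOTALL)
ANSWER_RE = re.compile(r"<ANSWER>\s*(.*?)\s*</ANSWER>", re.DOTALL)
STEP_RE = re.compile(r"Step\s+(\d+):\s*(.*?)(?=\nStep\s+\d+:|\Z)", re.DOTALL)


def parse_cot_response(value):
    """Split a response string into (list of step texts, answer text)."""
    reasoning = REASONING_RE.search(value)
    answer = ANSWER_RE.search(value)
    if reasoning is None or answer is None:
        raise ValueError("response is missing <REASONING> or <ANSWER> tags")
    steps = [text.strip() for _, text in STEP_RE.findall(reasoning.group(1))]
    return steps, answer.group(1).strip()


def parse_entry(entry):
    """Flatten one single-round entry into a plain dict."""
    turns = {turn["from"]: turn["value"] for turn in entry["conversations"]}
    steps, answer = parse_cot_response(turns["gpt"])
    return {
        "object_id": entry["object_id"],
        # Drop the leading point cloud placeholder from the question.
        "question": turns["human"].replace("<point>\n", "", 1),
        "steps": steps,
        "answer": answer,
    }


def load_pocoti(path="./data/anno_data/PoCoTI_55k.json"):
    """Assumption: the file holds a JSON list of entries."""
    with open(path, encoding="utf-8") as f:
        return [parse_entry(entry) for entry in json.load(f)]


# Hypothetical entry, for illustration only (not taken from the dataset).
SAMPLE_ENTRY = {
    "object_id": "example_id",
    "conversation_type": "single_round",
    "conversations": [
        {"from": "human", "value": "<point>\nWhat is this object?"},
        {
            "from": "gpt",
            "value": "<REASONING>\nStep 1: a\nStep 2: b\nStep 3: c\n"
                     "Step 4: d\nStep 5: e\n</REASONING>\n"
                     "<ANSWER> A chair. </ANSWER>",
        },
    ],
}
```

Calling parse_entry(SAMPLE_ENTRY) yields a dict with five step strings and the answer text, which is convenient for inspecting annotations or checking that every entry has the expected number of steps.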


Fine-tuning PointLLM-R

bash scripts/finetune_CoT.sh

Pre-trained Model

To skip fine-tuning, download PointLLM-R-7B directly:

huggingface-cli download QileXu/PointLLM-R-7B

Evaluation

Evaluation scripts and metrics follow PointLLM. Three benchmarks are supported:

| Benchmark | Script | Split | GT Annotations |
|---|---|---|---|
| Objaverse (classification + captioning) | scripts/eval/objaverse.sh | 3,000 val objects | Provided by PointLLM |
| ModelNet40 (zero-shot classification) | scripts/eval/modelnet40_cls.sh | 2,468 test objects | Provided by PointLLM |
| OmniObject3D (zero-shot classification) | scripts/eval/omniobject3d.sh | 5,989 val objects | QileXu/OmniObject3D_brief_description_val_GT |

Download the OmniObject3D GT annotations:

huggingface-cli download QileXu/OmniObject3D_brief_description_val_GT \
    --repo-type dataset \
    --local-dir ./data/anno_data

The GPT model used as the judge is selected with the --gpt_type flag in the eval scripts.

To evaluate on Objaverse captioning:

# In scripts/eval/objaverse.sh, set:
# MODEL_VERSION=QileXu/PointLLM-R-7B
# PROMPT_INDEX=2        (2 = captioning, 0/1 = classification)
bash scripts/eval/objaverse.sh

Interactive Chat

python pointllm/eval/PointLLM_chat.py \
    --model_path QileXu/PointLLM-R-7B

After startup, the script prompts you to enter an object ID (from ./data/objaverse_data) and then enter your questions interactively. Enter q to quit, or exit to end the current object's conversation and switch to a new one.
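The control flow described above can be pictured as two nested loops: an outer loop over objects and an inner loop over questions. The sketch below is an illustration only and is not the code in PointLLM_chat.py; chat_session and answer_fn are hypothetical names, input is passed in as a list instead of being read from the terminal, and treating q as a quit command at the object ID prompt as well is an assumption.

```python
def chat_session(inputs, answer_fn):
    """Replay a sequence of typed lines through the q / exit control flow.

    inputs:    iterable of strings, as if typed at the prompts
    answer_fn: callable (object_id, question) -> answer string,
               a stand-in for the model call
    Returns a list of (object_id, question, answer) tuples.
    """
    transcript = []
    lines = iter(inputs)
    for object_id in lines:            # outer loop: one pass per object
        if object_id == "q":           # assumption: q also quits here
            return transcript
        for question in lines:         # inner loop: questions on this object
            if question == "q":        # quit the whole session
                return transcript
            if question == "exit":     # end this object, ask for a new ID
                break
            transcript.append(
                (object_id, question, answer_fn(object_id, question))
            )
    return transcript
```

Both loops draw from the same iterator, so after exit the next typed line is read as a new object ID, and anything typed after q is never consumed.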
