Text-to-CAD Generation with Chain-of-Thought and Geometric Reward
Accepted at NeurIPS 2025 (Poster)
CAD-Coder is a framework that generates executable CadQuery scripts from natural language descriptions, enabling direct 3D CAD model creation.
| Resource | Link |
|---|---|
| Paper | arXiv:2505.19713 |
| Dataset | gudo7208/CAD-Coder |
| Model Weights | gudo7208/CAD-Coder |
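The prompt used in the inference example further down asks the model to reply in two parts: its reasoning inside `<think>` tags, then the CadQuery code inside a `\boxed{...}` wrapper around a fenced `python` block, with the final model bound to `r`. As a rough illustration of that layout, the sketch below splits a reply into reasoning and code. The sample reply is hand-written (not real model output), and the helper name `split_reply` is hypothetical, not part of the repository.

```python
import re

# Three backticks, built at runtime so this example can sit inside a Markdown fence.
FENCE = "`" * 3

# Hand-written stand-in for a model reply; NOT real CAD-Coder output.
sample_reply = (
    "<think>One rectangular sketch on XY, extruded 0.075 along the normal.</think>\n"
    "\\boxed{" + FENCE + "python\n"
    "import cadquery as cq\n"
    "r = cq.Workplane('XY').rect(0.6, 0.375).extrude(0.075)\n"
    + FENCE + "}"
)


def split_reply(reply):
    """Return (reasoning, code) from a reply in the <think> + boxed-code layout.

    Hypothetical helper for illustration; missing parts come back as "".
    """
    think = re.search(r"<think>(.*?)</think>", reply, re.DOTALL)
    boxed = re.search(
        r"\\boxed\{" + FENCE + r"python\n(.*?)" + FENCE + r"\}",
        reply,
        re.DOTALL,
    )
    reasoning = think.group(1).strip() if think else ""
    code = boxed.group(1).strip() if boxed else ""
    return reasoning, code


reasoning, code = split_reply(sample_reply)
print(reasoning)
print(code)
```

The extracted code is only a string at this point; running it requires CadQuery to be installed.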
# Clone the repository
git clone https://github.com/gudo7208/CAD-Coder.git
cd CAD-Coder
# Create conda environment
conda create -n cadcoder python=3.10
conda activate cadcoder
# Install dependencies
pip install transformers
pip install "numpy<2.0" cadquery==2.3.1 # Optional: for code execution
pip install vllm # Optional: for batch inference

"""CAD-Coder Inference Example"""
import re
import os
from transformers import AutoModelForCausalLM, AutoTokenizer
model_path = "gudo7208/CAD-Coder"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="auto", device_map="auto")
# Test data (first entry of cad_data_test_cot.json)
prompt = """Please create a CadQuery Python code to generate a model based on the following description. The reasoning process MUST BE enclosed within <think> </think> tags. The final CadQuery code MUST BE put in \\boxed{```python
code
```} with ONLY the executable code inside the box, nothing else.The final model is represented by r.
In the <think> section, simulate the thought process of an engineer converting text descriptions into a CAD model. Follow these steps:
1. **Description Analysis**:
- Break down the description into different parts or components
- Identify key parameters for each part (coordinate systems, Euler angles, translation vectors)
- Understand the spatial relationships and assembly sequence between parts
2. **Coordinate System Planning**:
- Determine the coordinate systems used for each part
- Parse how Euler angle rotations and translation vectors are applied
- Ensure understanding of local-to-global coordinate system transformations
3. **Sketch Construction Strategy**:
- Analyze how to create each 2D sketch (loops, lines, points)
- Determine scaling factors for each sketch
- Plan how to transform sketches into 3D space
4. **Extrusion Operation Planning**:
- Identify extrusion parameters for each part (direction, distance)
- Understand how to add extrusions to existing solids (new or merge)
- Verify dimensions after extrusion match the description
5. **Code Implementation Strategy**:
- Plan the sequence of CadQuery operations
- Determine necessary CadQuery functions and methods
- Consider how to organize code for clarity and readability
6. **When scaling in CadQuery:**:
- Directly scale the size, define a scaling factor variable, and apply the scaling factor directly to all coordinates and dimensions
- Don't try to use a non-existent `.scale()` method on Workplane objects
After your thinking process, provide clean, working CadQuery Python code to create the 3D model.
Think step by step, but only keep a minimum draft for each thinking step, with 50 words at most.
description:
Start by creating a new coordinate system with Euler angles set to zero and a translation vector also set to zero. Next, draw a two-dimensional sketch on the first face. This sketch consists of a single loop made up of four lines. The first line starts at the origin (0.0, 0.0) and ends at (0.6, 0.0). The second line starts at (0.6, 0.0) and ends at (0.6, 0.375). The third line starts at (0.6, 0.375) and ends at (0.0, 0.375). Finally, the fourth line completes the loop by starting at (0.0, 0.375) and ending at the origin (0.0, 0.0). After drawing the sketch, apply a scale factor of 0.6 to the entire sketch. To transform the scaled two-dimensional sketch into a three-dimensional model, extrude the sketch 0.075 units along the normal direction. The final dimensions of the rectangular block are a length of 0.6 units, a width of 0.375 units, and a height of 0.075 units."""

def extract_code(response):
    """Extract Python code from response"""
    # Try \boxed{```python ... ```}
    match = re.search(r'\\boxed\{```python\n(.*?)```\}', response, re.DOTALL)
    if match:
        return match.group(1).strip()
    # Try ```python ... ```
    match = re.search(r'```python\n(.*?)```', response, re.DOTALL)
    if match:
        return match.group(1).strip()
    # No fenced block found: fall back to the raw response
    return response

def execute_code(code, output_path="output.py"):
    """Save and execute CadQuery code"""
    with open(output_path, "w", encoding="utf-8") as f:
        f.write(code)
    print(f"Code saved to {output_path}")
    try:
        # Note: exec() runs the model-generated code in this process;
        # use a sandbox if the code is not trusted.
        exec_globals = {}
        exec(code, exec_globals)
        if 'r' in exec_globals:
            print("Execution SUCCESS! Model 'r' created.")
            return True, exec_globals['r']
        else:
            print("Execution completed but 'r' not found.")
            return False, None
    except Exception as e:
        print(f"Execution FAILED: {e}")
        return False, None

# Generate
text = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    tokenize=False,
    add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048)
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
print("=" * 50)
print("Model Response:")
print("=" * 50)
print(response)
# Extract and execute
print("\n" + "=" * 50)
print("Extracted Code:")
print("=" * 50)
code = extract_code(response)
print(code)
print("\n" + "=" * 50)
print("Execution Result:")
print("=" * 50)
success, result = execute_code(code)

For large-scale inference, use the provided `batch_inference.py` script:
python batch_inference.py \
    --model_path gudo7208/CAD-Coder \
    --data_path cad_data_test_cot.json \
    --output_dir ./output

@article{guan2025cadcoder,
  title={CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward},
  author={Guan, Yandong and Wang, Xilin and Xing, Ximing and Zhang, Jing and Xu, Dong and Yu, Qian},
  journal={arXiv preprint arXiv:2505.19713},
  year={2025}
}

This project is licensed under the Apache 2.0 License.