This project implements a simple, fully custom image-to-code architecture to generate CadQuery scripts from mechanical CAD images using a CNN + LSTM model. It was developed for a technical challenge, with the objective of being simple, efficient, and explainable.
Generate CadQuery Python code that recreates the geometry shown in a CAD image.
The solution follows an encoder-decoder design:

- Encoder (`CnnEncoder`):
  - A lightweight convolutional neural network that reduces a 128x128 image to a 512-dimensional feature vector.
  - Acts as a visual feature extractor for the geometry.
- Decoder (`CodeLSTMDecoder`):
  - A 2-layer LSTM conditioned on the image features.
  - Trained to generate valid Python code token by token using GPT-2 tokenization.
```
  Image                  Tokens
  (RGB)                  (Code)
    |                      |
[ CnnEncoder ]      [ LSTM Decoder ]
    |                      |
 Features    -->    "import cadquery..."
```
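The two components above can be sketched as follows. This is a minimal PyTorch sketch: the layer counts, channel sizes, and the feature-conditioning scheme are illustrative assumptions, not the exact code from this repository.

```python
import torch
import torch.nn as nn

class CnnEncoder(nn.Module):
    """Reduce a 128x128 RGB image to a 512-d feature vector (illustrative sizes)."""
    def __init__(self, feat_dim: int = 512):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),    # -> 64x64
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),   # -> 32x32
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),  # -> 16x16
            nn.AdaptiveAvgPool2d(1),                                # -> 128x1x1
        )
        self.fc = nn.Linear(128, feat_dim)

    def forward(self, img):                        # img: (B, 3, 128, 128)
        return self.fc(self.conv(img).flatten(1))  # (B, feat_dim)

class CodeLSTMDecoder(nn.Module):
    """2-layer LSTM over code tokens, conditioned on image features via the initial hidden state."""
    def __init__(self, vocab_size: int, feat_dim: int = 512, hidden: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.lstm = nn.LSTM(hidden, hidden, num_layers=2, batch_first=True)
        self.init_h = nn.Linear(feat_dim, 2 * hidden)  # project features to both layers' h0
        self.head = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, feats):              # tokens: (B, T), feats: (B, feat_dim)
        # Reshape the projected features into (num_layers, B, hidden) for the LSTM state.
        h0 = self.init_h(feats).view(feats.size(0), 2, -1).permute(1, 0, 2).contiguous()
        c0 = torch.zeros_like(h0)
        out, _ = self.lstm(self.embed(tokens), (h0, c0))
        return self.head(out)                      # (B, T, vocab_size)
```

At training time the decoder is fed the ground-truth code tokens (teacher forcing) and the cross-entropy loss is computed over the predicted token logits.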
Tokenizer: the GPT-2 tokenizer is used for code, as it handles indentation and Python-specific tokens well. `pad_token` is explicitly set to `eos_token` for compatibility.
- Training was done from scratch on the full CADCODER/GenCAD-Code dataset (~147k samples).
- The model was trained for only 10 epochs on an AWS EC2 instance equipped with an NVIDIA L4 GPU.
- Device: NVIDIA L4 (16 GB VRAM)
- CUDA Version: 12.8
- Training time: ~7h
- Memory usage: ~16.7 GB
- Initial loss: 5.77
- Final loss after 10 epochs: ~0.15

The loss decreased consistently, even with a simple model.
You can download the pretrained model weights here:
Download from Google Drive
We evaluate two metrics:

- Valid Syntax Rate (VSR): using `evaluate_syntax_rate_simple`. Checks whether the generated code runs without error.
- IOU (Shape Similarity): using `get_iou_best`. Compares the generated shape with the ground truth.
Due to limited compute resources, I was only able to evaluate a subset of the test set. On a representative example (`sampleIdx = 840`), the results were:
- Valid Syntax Rate: 1.00
- IOU (shape similarity): 0.64
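For intuition about the IOU metric, a voxel-based IoU between two solids can be sketched as follows. This is an illustrative NumPy version, not the actual `get_iou_best` implementation used by the evaluation script:

```python
import numpy as np

def voxel_iou(a: np.ndarray, b: np.ndarray) -> float:
    """IoU between two boolean occupancy grids of identical shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter / union) if union else 1.0

# Two overlapping boxes on an 8x8x8 grid:
a = np.zeros((8, 8, 8), dtype=bool); a[0:4, 0:4, 0:4] = True  # 64 voxels
b = np.zeros((8, 8, 8), dtype=bool); b[2:6, 0:4, 0:4] = True  # 64 voxels
# intersection = 2*4*4 = 32, union = 64 + 64 - 32 = 96, so IoU = 1/3
```

An IOU of 0.64, as measured above, therefore means roughly two thirds of the combined volume of the two shapes overlaps.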
While syntax validity is good, the overall IOU remains relatively low. This indicates that the model often generates code that runs without error, but the resulting shape is still far from accurate. These results are not yet satisfying, and further improvements are needed, especially in structural consistency and geometry alignment. Additionally, the model frequently struggles to correctly generate the initial `import cadquery as cq` statement. To ensure the generated code is syntactically valid and executable, this import line is automatically corrected in the evaluation script when missing or malformed.
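The import correction mentioned above amounts to a simple string check. A minimal sketch (the actual logic in the evaluation script may differ):

```python
def fix_cadquery_import(code: str) -> str:
    """Ensure generated code contains a valid `import cadquery as cq` line."""
    lines = code.splitlines()
    # Drop a malformed first import line, e.g. "mport cadquery as cq".
    if lines and "cadquery" in lines[0] and lines[0].strip() != "import cadquery as cq":
        lines = lines[1:]
    # Prepend the import if it is missing entirely.
    if not any(l.strip() == "import cadquery as cq" for l in lines):
        lines.insert(0, "import cadquery as cq")
    return "\n".join(lines)
```

This keeps already-valid scripts untouched while repairing the two observed failure modes (missing or garbled import).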
python train.py
This will:
- Load the dataset
- Initialize the model from scratch
- Train for 30 epochs (or fewer)
- Save the weights in the `weights/` folder
No virtualenv configuration is provided, but using one is recommended for reproducibility.
python eval.py
This loads one sample, runs inference, and prints:
- Ground truth code
- Generated code
- Syntax validity rate
- IOU (3D shape comparison)
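Inference follows the standard token-by-token greedy loop. A sketch with a stubbed next-token function (in the real model, the next token is the argmax over the LSTM's output logits at each step):

```python
def greedy_decode(next_token_fn, bos_id: int, eos_id: int, max_len: int = 64) -> list[int]:
    """Generate token ids one at a time until EOS or max_len is reached."""
    seq = [bos_id]
    for _ in range(max_len):
        nxt = next_token_fn(seq)  # real model: argmax over decoder logits given seq + image features
        seq.append(nxt)
        if nxt == eos_id:
            break
    return seq

# Stub that emits tokens 1, 2, 3, then EOS (id 0):
script = iter([1, 2, 3, 0])
out = greedy_decode(lambda seq: next(script), bos_id=99, eos_id=0)
# out == [99, 1, 2, 3, 0]
```

The generated id sequence is then decoded back to text with the GPT-2 tokenizer before being executed.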
| Component | Reason |
|---|---|
| GPT-2 Tokenizer | Handles Python-like syntax well; avoids whitespace errors |
| CNN Encoder | Simple and lightweight; avoids heavy ViTs for faster prototyping |
| LSTM Decoder | Easy to train on a limited GPU; interpretable code output |
| No pretrained weights | Full training from scratch to control all layers and demonstrate skill |
- Use a pretrained image encoder (e.g., ResNet18 or ViT) to improve feature extraction
- Replace LSTM with Transformer decoder (e.g., GPT2)
- Improve the training workflow with callbacks, logging, etc.
- Add support for exporting the model and tokenizer to ONNX for deployment and inference optimization
- Test different image sizes, max token lengths, and tokenization strategies
This project shows a functional baseline for CAD-to-code generation, using simple components and no external training tricks. The model is small, explainable, and performs reasonably well after only 10 epochs.
With more time and compute, this approach could be scaled to achieve higher performance.
Author: R.Choukri Date: June 2025