LiAuto-GeoX: Efficient Grounded Driving Transformer

Jiawei Lian^1,2,*, Haoyi Sun^2,*, Yang Wu^1,*, Lifu Mu^2,*, Siyuan Wang^1,2, Le Hui^3,4,†, Ning Mao^2,†,‡, Tao Wei^2, Pan Zhou^2, Kun Zhan^2, Jian Yang^1,†

¹🎓 Nanjing University of Science and Technology ²🏢 Li Auto Inc. ³🎓 Northwestern Polytechnical University
⁴🎓 Department of Computing, The Hong Kong Polytechnic University

^*Equal Contribution ^†✉️ Corresponding Author ^‡Project Leader

Release Plan

[✓] Paper Release
[✓] LiAuto-GeoX Weight
[✓] Inference Instructions
[ ] GeoX-Large
[ ] Data Processing
[ ] Training Pipeline

Pretrained Models

Before using the models, please request access to the checkpoints once they are released.
All released models will be evaluated under the same protocol as reported in the paper.

Model	Parameters	Input Setting	Download
LiAuto-GeoX	0.15B	Surround-view / Video	🤗 Hugging Face
LiAuto-GeoX-Teacher	1.1B	Surround-view	-

Inference

Setup

Install the required dependencies:

pip install -r requirements.txt

Usage Examples

Single Frame Example - Basic inference with RGB images:

CUDA_VISIBLE_DEVICES=2 python inference.py \
    --image_folder /path/to/your/images \
    --port 8082

RGB + Sky Mask Example - Filter out sky regions for cleaner reconstruction:

CUDA_VISIBLE_DEVICES=2 python inference.py \
    --image_folder /path/to/your/images \
    --port 8082 \
    --mask_sky

RGB + Pose Example - Use ground truth camera poses for better accuracy:

CUDA_VISIBLE_DEVICES=2 python inference.py \
    --image_folder /path/to/your/images \
    --camera_folder /path/to/your/cameras \
    --port 8083

After running inference, open your browser and navigate to http://localhost:PORT (replace PORT with your specified port) to visualize the 3D reconstruction results interactively.
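
The three invocations above differ only in a few flags, so they can be assembled programmatically when scripting runs over many scenes. The following is a minimal sketch, not part of the repository: `build_inference_command` and `viewer_url` are hypothetical helper names, and the sketch uses only the flags shown above (`--image_folder`, `--camera_folder`, `--port`, `--mask_sky`) plus the `CUDA_VISIBLE_DEVICES` environment variable.

```python
import os
import shlex


def build_inference_command(image_folder, port=8082, camera_folder=None,
                            mask_sky=False, gpu=None):
    """Assemble an inference.py invocation from the flags documented above.

    Hypothetical helper: returns (argv, env), where argv is the argument
    list and env is a copy of the current environment with
    CUDA_VISIBLE_DEVICES set when a GPU index is given.
    """
    argv = ["python", "inference.py", "--image_folder", str(image_folder)]
    if camera_folder is not None:
        # Optional ground-truth camera poses (RGB + Pose example).
        argv += ["--camera_folder", str(camera_folder)]
    argv += ["--port", str(port)]
    if mask_sky:
        # Optional sky filtering (RGB + Sky Mask example).
        argv.append("--mask_sky")

    env = dict(os.environ)
    if gpu is not None:
        env["CUDA_VISIBLE_DEVICES"] = str(gpu)
    return argv, env


def viewer_url(port):
    """URL of the interactive viewer for a given port."""
    return f"http://localhost:{port}"


if __name__ == "__main__":
    argv, env = build_inference_command(
        "/path/to/your/images",
        port=8083,
        camera_folder="/path/to/your/cameras",
        gpu=2,
    )
    print(shlex.join(argv))
    print(viewer_url(8083))
    # To launch for real, one could call:
    #   subprocess.run(argv, env=env, check=True)
```

Returning the argument list rather than a single string avoids shell-quoting problems with paths that contain spaces; `shlex.join` is used only for display.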

Additional Options:

--conf_threshold: Adjust the confidence threshold (default: 10.0) used to filter low-confidence points. Lower values show more points; higher values show fewer but more confident points.
--mask_black_bg: Filter out black background pixels
--mask_white_bg: Filter out white background pixels
--save_glb: Export the reconstruction as a GLB file
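
To illustrate what the filtering options do conceptually, here is a minimal pure-Python sketch. It is not the repository's implementation. The function name `filter_points`, the per-point data layout, and the color cutoffs `black_max` and `white_min` are assumptions made for illustration. The sketch also assumes `--conf_threshold` is an absolute cutoff on a per-point confidence score; the actual script may interpret it differently (for example, as a percentile).

```python
def filter_points(points, colors, confidences, conf_threshold=10.0,
                  mask_black_bg=False, mask_white_bg=False,
                  black_max=16, white_min=240):
    """Keep points that pass the confidence and background filters.

    points:      list of (x, y, z) tuples
    colors:      list of (r, g, b) tuples with values in 0..255
    confidences: list of floats, one per point
    black_max / white_min are hypothetical cutoffs for "black" and "white".
    """
    kept = []
    for xyz, rgb, conf in zip(points, colors, confidences):
        # Assumed semantics: drop points scoring below the threshold.
        if conf < conf_threshold:
            continue
        # Drop near-black pixels when black-background masking is on.
        if mask_black_bg and all(c <= black_max for c in rgb):
            continue
        # Drop near-white pixels when white-background masking is on.
        if mask_white_bg and all(c >= white_min for c in rgb):
            continue
        kept.append((xyz, rgb, conf))
    return kept


if __name__ == "__main__":
    pts = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 1.0, 3.0)]
    cols = [(0, 0, 0), (120, 130, 140), (255, 255, 255)]
    confs = [50.0, 5.0, 20.0]
    for xyz, rgb, conf in filter_points(pts, cols, confs, mask_black_bg=True):
        print(xyz, rgb, conf)
```

Under these assumptions, raising `conf_threshold` can only shrink the kept set, which matches the documented behavior that higher values show fewer but more confident points.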

Acknowledgements

Thanks to these great repositories: DINOv2, CUT3R, VGGT, DA3, PI3, DVGT, OmniVGGT, FastVGGT, LiteVGGT, SparseWorld-TC, and many other inspiring works in the community.

Citation

If you find LiAuto-GeoX useful for your work, please cite:

@article{lian2026geox,
  author    = {Lian, Jiawei and Sun, Haoyi and Wu, Yang and Mu, Lifu and Wang, Siyuan and Wei, Tao and Hui, Le and Mao, Ning and Zhou, Pan and Zhan, Kun and Yang, Jian},
  title     = {LiAuto-GeoX: Efficient Grounded Driving Transformer},
  journal   = {arXiv:2606.05774},
  year      = {2026},
}
