ljwwwiop/GeoX

LiAuto-GeoX: Efficient Grounded Driving Transformer

Project Page arXiv Hugging Face

Jiawei Lian1,2,*, Haoyi Sun2,*, Yang Wu1,*, Lifu Mu2,*, Siyuan Wang1,2, Le Hui3,4,†, Ning Mao2,†,‡, Tao Wei2, Pan Zhou2, Kun Zhan2, Jian Yang1,†

1🎓 Nanjing University of Science and Technology    2🏢 Li Auto Inc.    3🎓 Northwestern Polytechnical University   
4🎓 Department of Computing, The Hong Kong Polytechnic University

*Equal Contribution    †Corresponding Author    ‡Project Leader


Release Plan

  • [✓] Paper Release
  • [✓] LiAuto-GeoX Weight
  • [✓] Inference Instructions
  • [ ] GeoX-Large
  • [ ] Data Processing
  • [ ] Training Pipeline

Pretrained Models

Once the checkpoints are released, please request access to them before using the models.
All released models will be evaluated under the same protocol as reported in the paper.

Model | Parameters | Input Setting | Download
LiAuto-GeoX | 0.15B | Surround-view / Video | 🤗 Hugging Face
LiAuto-GeoX-Teacher | 1.1B | Surround-view | -

Inference

Setup

Install the required dependencies:

pip install -r requirements.txt

Usage Examples

Single Frame Example - Basic inference with RGB images:

CUDA_VISIBLE_DEVICES=2 python inference.py \
    --image_folder /path/to/your/images \
    --port 8082

RGB + Sky Mask Example - Filter out sky regions for cleaner reconstruction:

CUDA_VISIBLE_DEVICES=2 python inference.py \
    --image_folder /path/to/your/images \
    --port 8082 \
    --mask_sky

RGB + Pose Example - Use ground truth camera poses for better accuracy:

CUDA_VISIBLE_DEVICES=2 python inference.py \
    --image_folder /path/to/your/images \
    --camera_folder /path/to/your/cameras \
    --port 8083

After running inference, open your browser and navigate to http://localhost:PORT (replace PORT with your specified port) to visualize the 3D reconstruction results interactively.
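The three invocations above differ only in which optional flags they pass. As a rough sketch, the command line can be assembled programmatically, for example when sweeping over several image folders. The helper names `build_inference_command` and `viewer_url` are hypothetical and not part of this repository; only the flags shown in the examples above (`--image_folder`, `--camera_folder`, `--port`, `--mask_sky`) are used.

```python
import shlex


def build_inference_command(image_folder, port, camera_folder=None, mask_sky=False):
    """Return the argv list for one inference.py run (hypothetical helper)."""
    cmd = ["python", "inference.py", "--image_folder", image_folder]
    if camera_folder is not None:
        # RGB + Pose mode: camera poses are read from this folder.
        cmd += ["--camera_folder", camera_folder]
    cmd += ["--port", str(port)]
    if mask_sky:
        # RGB + Sky Mask mode: ask the script to filter out sky regions.
        cmd.append("--mask_sky")
    return cmd


def viewer_url(port):
    """URL to open in the browser once inference has finished."""
    return f"http://localhost:{port}"


if __name__ == "__main__":
    cmd = build_inference_command("/path/to/your/images", 8082, mask_sky=True)
    # Print the equivalent shell command; the GPU index is only an example.
    print("CUDA_VISIBLE_DEVICES=2 " + shlex.join(cmd))
    print("Viewer:", viewer_url(8082))
```

To launch the run instead of printing it, the list can be passed to `subprocess.run(cmd, check=True)` with `CUDA_VISIBLE_DEVICES` set in the environment.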

Additional Options:

  • --conf_threshold: Adjust the confidence threshold (default: 10.0) used to filter low-confidence points. Lower values show more points; higher values show fewer but more confident points.
  • --mask_black_bg: Filter out black background pixels.
  • --mask_white_bg: Filter out white background pixels.
  • --save_glb: Export the reconstruction as a GLB file.

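To illustrate how the filtering options interact, here is a minimal pure-Python sketch of confidence and background filtering on a point list. It is not the code in `inference.py`: the function name `filter_points`, the `(xyz, rgb, conf)` point layout, the `>=` comparison against the threshold, and the black/white cutoffs `BLACK_MAX` and `WHITE_MIN` are all assumptions made for the example.

```python
# Assumed cutoffs for "black" and "white" pixels on a 0..255 scale (illustrative only).
BLACK_MAX = 15
WHITE_MIN = 240


def filter_points(points, conf_threshold=10.0, mask_black_bg=False, mask_white_bg=False):
    """Keep points that pass the confidence and background filters.

    Each point is a tuple (xyz, rgb, conf), with rgb channels in 0..255.
    """
    kept = []
    for xyz, rgb, conf in points:
        if conf < conf_threshold:
            continue  # low-confidence point: dropped
        if mask_black_bg and max(rgb) <= BLACK_MAX:
            continue  # every channel is near zero: treated as black background
        if mask_white_bg and min(rgb) >= WHITE_MIN:
            continue  # every channel is near 255: treated as white background
        kept.append((xyz, rgb, conf))
    return kept


if __name__ == "__main__":
    pts = [
        ((0, 0, 0), (120, 130, 140), 12.0),
        ((1, 0, 0), (0, 0, 0), 25.0),
        ((2, 0, 0), (255, 255, 255), 30.0),
        ((3, 0, 0), (90, 90, 90), 4.0),
    ]
    # Raising the threshold can only shrink the set of kept points.
    for threshold in (2.0, 10.0, 20.0):
        print(threshold, len(filter_points(pts, conf_threshold=threshold)))
```

Lowering the threshold keeps more points and raising it keeps fewer, which matches the behaviour described for --conf_threshold above.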
Acknowledgements

Thanks to these great repositories: DINOv2, CUT3R, VGGT, DA3, PI3, DVGT, OmniVGGT, FastVGGT, LiteVGGT, SparseWorld-TC, and many other inspiring works in the community.


Citation

If you find LiAuto-GeoX useful for your work, please cite:

@article{lian2026geox,
  author    = {Lian, Jiawei and Sun, Haoyi and Wu, Yang and Mu, Lifu and Wang, Siyuan and Wei, Tao and Hui, Le and Mao, Ning and Zhou, Pan and Zhan, Kun and Yang, Jian},
  title     = {LiAuto-GeoX: Efficient Grounded Driving Transformer},
  journal   = {arXiv preprint arXiv:2606.05774},
  year      = {2026},
}
