Yiyuan Zhang¹,²,* Kaixiong Gong¹,²,* Wanli Ouyang² Xiangyu Yue¹,†

¹ Multimedia Lab, The Chinese University of Hong Kong
² OpenGVLab, Shanghai AI Laboratory
* Equal Contribution † Corresponding Author

Point as A Foreign Language, Let Large Language Models (LLMs) Perceive 3D Physical World as Reading Articles!

🌟 News

2023.7.31: GitHub repository initialized. The paper will be released very soon.

Motivation

We propose using pretrained language models for point cloud understanding. Unlike existing methods that use images as an intermediate representation, we find that language models can read point clouds as a foreign language. Benefiting from pretraining on large-scale corpora, language models perform better on long-tailed and out-of-distribution tasks in 3D vision.

A Brief Summary

💡 For multimodal research, our method explores the underlying representational relationship between modalities, specifically language and 3D point clouds, and demonstrates that models pretrained on natural language can read 3D point clouds.
💡 For 3D vision research, our method performs end-to-end point cloud understanding without hand-crafted structural designs. It also demonstrates the feasibility of using natural language corpora as pretraining data for 3D vision.
💡 For the vision-language area, our method experimentally validates that 3D point clouds and text can be encoded by the same parameters. This opens a promising new direction for tasks involving modality alignment between text and point clouds.
💡 With outstanding performance on benchmarks including ModelNet-40, S3DIS, and ShapeNetPart, our method demonstrates its effectiveness on both coarse-grained and fine-grained 3D point cloud tasks.
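
To make the "same parameters" idea in the summary concrete, here is a toy sketch in which text tokens and point cloud tokens pass through one shared self-attention layer. It is not this project's implementation: the chunk-based point grouping, the linear projection, the single randomly initialized attention layer standing in for a pretrained language model, and every name below are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # shared embedding width (illustrative value)

# One set of self-attention weights, standing in for a pretrained
# language model layer. Random here; a real model would load them.
W_q, W_k, W_v = (rng.normal(scale=D ** -0.5, size=(D, D)) for _ in range(3))


def shared_encoder(tokens):
    """Single-head self-attention over an (n_tokens, D) array."""
    q, k, v = tokens @ W_q, tokens @ W_k, tokens @ W_v
    scores = q @ k.T / np.sqrt(D)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


def embed_text(token_ids, table):
    """Look up one D-dimensional vector per text token id."""
    return table[token_ids]


def embed_points(points, group_size, proj):
    """Split an (N, 3) cloud into consecutive groups (a hypothetical,
    simplistic grouping) and project each group to one 'word' vector."""
    n_groups = len(points) // group_size
    groups = points[: n_groups * group_size].reshape(n_groups, group_size * 3)
    return groups @ proj


vocab_table = rng.normal(size=(100, D))   # hypothetical 100-word vocabulary
point_proj = rng.normal(size=(8 * 3, D))  # hypothetical point-group projection

text_tokens = embed_text(np.array([5, 17, 42]), vocab_table)
cloud = rng.uniform(-1, 1, size=(64, 3))  # synthetic point cloud
point_tokens = embed_points(cloud, group_size=8, proj=point_proj)

# Both modalities are encoded by the same W_q, W_k, W_v.
text_features = shared_encoder(text_tokens)
point_features = shared_encoder(point_tokens)
print(text_features.shape, point_features.shape)  # (3, 16) (8, 16)
```

Only the two embedding functions are modality-specific in this sketch; everything after them reuses the same weights, which is the property the summary refers to.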

🕙 ToDo

Support Billion-scale Large Language Models.
Large Language Model with More Modalities.
Support Outdoor LiDAR Scenes.

✉️ Contact

If you are interested in this project, contributions are welcome!

To contact us, email yiyuanzhang.ai@gmail.com, kaixionggong@gmail.com, or xyyue@ie.cuhk.edu.hk.

License

This project is released under the Apache 2.0 license.

Acknowledgement

This code is built on the excellent open-source project OpenPoints.
