Extend ImageBind to 3D Point Cloud domain: Point-Bind

Thanks very much for releasing such insightful work!

We have built a project on top of ImageBind, [Point-Bind](https://github.com/ZrrSkywalker/Point-Bind), which aligns the 3D point cloud modality with image, text, and audio. The project has four main characteristics:
- **Align 3D with ImageBind.** With a joint embedding space, 3D objects can be aligned with their corresponding 2D images, textual descriptions, and audio.
- **3D LLM via LLaMA-Adapter.** In [Multi-modal LLaMA-Adapter](https://github.com/ZrrSkywalker/LLaMA-Adapter/tree/main/imagebind_LLM) (ImageBind-LLM), we introduce an LLM that follows 3D instructions in English and Chinese.
- **3D Zero-shot Classification/Segmentation/Detection.** Point-Bind achieves state-of-the-art performance for 3D zero-shot tasks, including classification, segmentation, and detection.
- **Embedding Arithmetic with 3D.** We observe that 3D features from Point-Bind can be added to features from other modalities to compose their semantics.
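
To make the zero-shot and embedding-arithmetic points above concrete, here is a minimal sketch of how a joint embedding space is typically used. It is not Point-Bind's actual API: the function names (`zero_shot_classify`, `compose`) and the 3-dimensional toy vectors are hypothetical stand-ins for real encoder outputs, and cosine similarity over L2-normalized embeddings is assumed as the matching score.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))

def zero_shot_classify(query, candidates):
    """Return the candidate label whose embedding is closest to the query."""
    return max(candidates, key=lambda label: cosine(query, candidates[label]))

def compose(a, b):
    """Embedding arithmetic: add two normalized embeddings and renormalize."""
    a, b = l2_normalize(a), l2_normalize(b)
    return l2_normalize([x + y for x, y in zip(a, b)])

# Hypothetical toy embeddings (assumption: all modalities share one space).
point_cloud_car = [0.9, 0.1, 0.0]   # stand-in for a 3D encoder output
audio_waves = [0.0, 0.1, 0.9]       # stand-in for an audio encoder output

text_embeddings = {
    "car": [1.0, 0.0, 0.0],
    "airplane": [0.0, 1.0, 0.0],
}
image_gallery = {
    "car on road": [0.9, 0.0, 0.1],
    "car by sea": [0.7, 0.0, 0.7],
    "beach": [0.0, 0.0, 1.0],
}

# Zero-shot classification: match the 3D embedding against class-name text embeddings.
predicted_label = zero_shot_classify(point_cloud_car, text_embeddings)

# Embedding arithmetic: 3D car + sound of waves, then retrieve the closest image.
composed = compose(point_cloud_car, audio_waves)
retrieved_image = zero_shot_classify(composed, image_gallery)

print(predicted_label)   # car
print(retrieved_image)   # car by sea
```

With real encoders, the toy vectors would be replaced by embeddings of point clouds, class-name prompts, audio clips, and images; the matching logic stays the same as long as every modality is projected into one shared space.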

The Multi-modal LLaMA-Adapter (ImageBind-LLM) with Point-Bind's 3D embeddings is shown below:
<img width="995" alt="imagebind-llm" src="https://github.com/facebookresearch/ImageBind/assets/54577425/3b1ae782-43b6-4819-97eb-90b69e080280">
Thanks!

