Extend ImageBind to 3D Point Cloud domain: Point-Bind #67

@ZrrSkywalker

Thanks very much for releasing such insightful work!

We have developed Point-Bind, a project built on ImageBind that aligns the 3D point cloud modality with image, text, and audio. It has four main features:

  • Align 3D with ImageBind. With a joint embedding space, 3D objects can be aligned with their corresponding 2D images, textual descriptions, and audio.
  • 3D LLM via LLaMA-Adapter. In Multi-modal LLaMA-Adapter (ImageBind-LLM), we introduce an LLM that follows 3D instructions in English and Chinese.
  • 3D Zero-shot Classify/Seg/Det. Point-Bind achieves state-of-the-art performance on 3D zero-shot tasks, including classification, segmentation, and detection.
  • Embedding Arithmetic with 3D. We observe that 3D features from Point-Bind can be added to features from other modalities to compose their semantics.
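
To make the zero-shot classification and embedding arithmetic points concrete, here is a minimal sketch of how both are typically done in a joint embedding space. It does not use the Point-Bind or ImageBind API: the function names (`zero_shot_classify`, `compose`, `retrieve`) are hypothetical, and the 3-dimensional vectors are toy stand-ins for real encoder outputs. The only assumption is that every modality is embedded into the same space and compared by cosine similarity.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    x = np.asarray(x, dtype=np.float64)
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def zero_shot_classify(point_emb, text_embs):
    """Return the index of the class prompt most similar to the 3D embedding."""
    sims = l2_normalize(text_embs) @ l2_normalize(point_emb)
    return int(np.argmax(sims))


def compose(emb_a, emb_b):
    """Add two unit-length embeddings and re-normalize the sum."""
    return l2_normalize(l2_normalize(emb_a) + l2_normalize(emb_b))


def retrieve(query_emb, gallery_embs):
    """Return the index of the gallery item closest to the query."""
    sims = l2_normalize(gallery_embs) @ l2_normalize(query_emb)
    return int(np.argmax(sims))


# Toy embeddings (assumed values, not real model outputs).
class_names = ["car", "chair", "plane"]
text_embs = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
point_emb = np.array([0.9, 0.1, 0.2])      # a point cloud that is mostly "car"
predicted = class_names[zero_shot_classify(point_emb, text_embs)]

# Embedding arithmetic: a 3D embedding plus an audio embedding.
audio_emb = np.array([0.0, 1.0, 0.0])
query = compose(np.array([1.0, 0.0, 0.0]), audio_emb)
image_gallery = np.array([[1.0, 0.0, 0.0],
                          [0.7, 0.7, 0.1],  # image sharing both semantics
                          [0.0, 0.0, 1.0]])
best_image = retrieve(query, image_gallery)
```

In this toy setup the composed query lies between the two source embeddings, so retrieval favors the gallery item that shares both semantics over items matching only one.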

The Multi-modality LLaMA-Adapter (ImageBind-LLM) with Point-Bind's 3D embeddings is as follows:
[Figure: imagebind-llm]

Thanks!
