How to use ImageBind to generate image or audio? #42

NateDong72 · 2023-05-12T08:41:14Z

I can run the example code, but how do I run the model to generate images and audio?

SoftologyPro · 2023-05-13T05:01:40Z

Agreed. How can you guys spend all that time training the model and writing the paper and setting up the demo website and not spend a few hours giving working example scripts to show us how to use it?

echo-lalia · 2023-05-13T06:00:26Z

I don't think the model can actually generate those things; I think it just 'translates' the information from one form to another. I think it'll have to be built into an extension for SD-WebUI or something, in order to let us play with it more easily.
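To make that distinction concrete, here is a minimal sketch of what an encoder-only model gives you: one vector per input, which you can compare across modalities, but nothing that turns a vector back into pixels or sound. The vectors below are made up for illustration and are not real ImageBind outputs; the softmax over dot products is, as far as I can tell, the same kind of score the repo's example script prints.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def match(query, candidates):
    """Softmax over dot products between one query embedding and each candidate."""
    dots = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    return softmax(dots)

# Hypothetical embeddings standing in for encoder outputs (assumption, toy values).
audio_bark = [2.0, 0.1, 0.0]
text_labels = ["a dog", "a car", "a bird"]
text_embs = [[1.8, 0.0, 0.2], [0.0, 2.0, 0.1], [0.3, 0.2, 1.9]]

probs = match(audio_bark, text_embs)
best = text_labels[probs.index(max(probs))]
print(best)
```

So the audio clip gets matched to a caption, but actually producing an image or a sound from the embedding would need a separate generative model on top.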

WilTay1 · 2023-05-13T06:46:20Z

> I don't think the model can actually generate those things; I think it just 'translates' the information from one form to another. I think it'll have to be built into an extension for SD-WebUI or something, in order to let us play with it more easily.

But the model can be downloaded and loaded in the script.

bakachan19 · 2023-05-15T13:18:22Z

I am also interested in this. Any news?
Also, how can you retrieve an image based on image and audio/text? I am referring to the embedding space arithmetic examples in Figure 4 in the paper.
Do you just sum the image embeddings with the audio/text embedding and perform cosine similarity with all the image embeddings and get the most similar image?
Thanks!
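For reference, the procedure described in this question can be sketched as below. This only illustrates the sum-then-cosine-similarity idea as asked; the thread does not confirm it is exactly what the paper does. The embeddings are toy 3-d vectors, and `retrieve` and the gallery names are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_image_emb, other_emb, gallery):
    """Rank gallery names by cosine similarity to the summed query embedding."""
    # Normalize each modality first so neither one dominates the sum.
    q = [x + y for x, y in zip(normalize(query_image_emb), normalize(other_emb))]
    return sorted(gallery, key=lambda name: cosine(q, gallery[name]), reverse=True)

# Toy embeddings (assumption: made up; real embeddings are much higher-dimensional).
gallery = {
    "dog_on_beach": [0.7, 0.7, 0.0],
    "dog_in_park": [0.9, 0.0, 0.4],
    "empty_beach": [0.0, 1.0, 0.1],
}
image_of_dog = [1.0, 0.0, 0.1]
sound_of_waves = [0.0, 1.0, 0.0]

ranking = retrieve(image_of_dog, sound_of_waves, gallery)
print(ranking[0])
```

With real data you would replace the toy vectors with embeddings from the model and precompute the normalized gallery once, rather than renormalizing on every query.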

ikuinen · 2023-05-16T05:36:20Z

> I am also interested in this. Any news? Also, how can you retrieve an image based on image and audio/text? I am referring to the embedding space arithmetic examples in Figure 4 in the paper. Do you just sum the image embeddings with the audio/text embedding and perform cosine similarity with all the image embeddings and get the most similar image? Thanks!

We made a quick attempt: https://github.com/sail-sg/BindDiffusion

Zeqiang-Lai · 2023-05-16T10:40:41Z

See also Anything2Image and InternGPT; Anything2Image is implemented with Diffusers.

SoftologyPro · 2023-05-17T23:32:26Z

> See also Anything2Image, it is implemented with Diffusers.

This works well and comes with a nice Gradio GUI.

ChloeL19 · 2023-05-23T12:30:52Z

I'm rather new to diffusion, but does ImageBind provide any sort of decoder? I thought it only trained an encoder. If that's the case, how do these diffusion methods work?

Zeqiang-Lai · 2023-05-23T12:46:34Z

> I'm rather new to diffusion, but does Imagebind provide any sort of decoder? I thought it was just training an encoder, and if that's the case how are these diffusion methods working?

Maybe this could help: Zeqiang-Lai/Anything2Image#4

celster · 2023-07-04T13:12:14Z

> I'm rather new to diffusion, but does Imagebind provide any sort of decoder? I thought it was just training an encoder, and if that's the case how are these diffusion methods working?
>
> Maybe this could help Zeqiang-Lai/Anything2Image#4

This is great!!
I'm also looking for "Image+Text --> Image": for example, taking a photo and asking the model to apply some modification (e.g. makeup) to the person in the photo.
