The issue about Audio to Image Generation

Amazing work!

It's well known that https://github.com/lucidrains/DALLE2-pytorch and https://github.com/LAION-AI/dalle2-laion use open-clip as the pretrained text and image encoder. However, I noticed that you used a private DALLE-2 implementation to generate images conditioned on audio.

Is it possible to use an open-source DALLE-2 instead of the private reimplementation? Are there any problems with the open-source DALLE-2? I would appreciate it if you could share your experience.

In my view, if an open-source DALLE-2 could be adapted to ImageBind, it would directly enable some very interesting applications and increase the impact of this work!


The issue about Audio to Image Generation #40