How to generate image from Image+Text? #4
Comments
I don't have time to implement it right now, but you could refer to Anything2Image/anything2image/api.py, line 76 at commit 681958d.
The stable-diffusion-unclip model we use takes two conditions: (1) a prompt and (2) a CLIP image embedding. When we replace the CLIP image embedding with an ImageBind embedding, we get anything2image. The `prompt` in api.py is the prompt condition mentioned above. The `text` is the input that gets encoded into an ImageBind text embedding, which replaces the image embedding and is fed into the diffusion model.
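To make the image+text case concrete, here is a minimal sketch of blending two embeddings into the single vector that would take the place of the CLIP image embedding. It is not the code from api.py: the helper names `l2_normalize` and `mix_embeddings`, the `image_weight` parameter, and the toy 3-dimensional vectors are all hypothetical, and whether to normalize before mixing is left as a switch because the thread does not settle that point.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]


def mix_embeddings(image_emb, text_emb, image_weight=0.5, normalize_inputs=True):
    """Blend two same-sized embeddings into one conditioning vector.

    Hypothetical helper for illustration only, not part of Anything2Image.
    normalize_inputs=True puts both inputs on the unit sphere first, so
    neither modality dominates just because its raw norm is larger.
    """
    if len(image_emb) != len(text_emb):
        raise ValueError("embeddings must have the same dimension")
    if normalize_inputs:
        image_emb = l2_normalize(image_emb)
        text_emb = l2_normalize(text_emb)
    w = image_weight
    # Weighted average; image_weight=0.5 is a plain mean of the two inputs.
    return [w * a + (1.0 - w) * b for a, b in zip(image_emb, text_emb)]


# Toy vectors standing in for real ImageBind outputs (assumed values).
dog_image_emb = [3.0, 0.0, 4.0]
pink_flowers_text_emb = [0.0, 2.0, 0.0]

mixed = mix_embeddings(dog_image_emb, pink_flowers_text_emb)
mixed_raw = mix_embeddings(dog_image_emb, pink_flowers_text_emb,
                           normalize_inputs=False)
```

In a real pipeline the mixed vector would be passed as the embedding condition, while `prompt` stays an ordinary text prompt. Comparing `mixed` with `mixed_raw` on real embeddings is one way to check empirically which variant gives better images.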
Thanks!
Sorry for bothering you again.
but not temperature scaled.
Thank you for your time!
It was obtained by trial and error. I didn't dig into the theory much because of limited time.
Oh, I see.
Hi.
Thanks for the great work you have provided.
In the readme I saw that there are several supported tasks:
Audio to Image
Audio+Text to Image
Audio+Image to Image
Image to Image
Text to Image
Thermal to Image
Depth to Image: Coming soon.
I am new to this type of application, so I was wondering whether it is possible to generate an image from image + text. For example, given an image of a dog and the text "pink flowers", I would like to generate an image that contains a dog and pink flowers.
If so, could you provide the code for an example? I was looking at the code in `api.py` and I am a bit confused about the use of `prompt` and `text`. Moreover, do I need to normalize the image and text embeddings before summing them, or should I normalize the summed embedding?
I greatly appreciate your help.
Thanks.