Stable Diffusion fails to produce images with bad "Shape" above 80 or so characters prompt input #1839

Closed
groovybits opened this issue Mar 12, 2024 · 3 comments

@groovybits

The Stable Diffusion example does not seem to handle prompts longer than about 80 characters. I cannot see where it would allow more tokens to be passed in, which I suspect is wrong?

I get the output below once the input goes somewhere above 80 characters. This command triggers the issue:

 Running `target/debug/examples/stable-diffusion --prompt '1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120' --height 512 --width 512 --n-steps 1 --tracing`
Running with prompt "1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120".
Building the Clip transformer.
Error: shape mismatch in broadcast_add, lhs: [1, 254, 1024], rhs: [1, 77, 1024]

I am digging into it more myself, but I am posting here to find out whether this is a bug in candle's base Stable Diffusion implementation or in the candle example, and whether the devs already know of a quick, easy fix.

Thanks!

@LaurentMazare
Collaborator

(sorry for the late reply) The 77-token limit is actually an underlying limitation of Stable Diffusion; see huggingface/diffusers#2136, for example, for more details.
I'll tweak things so that there is a more explicit error message. It might be possible to chunk the prompt into batches of 77 tokens and combine them later on, but I feel that it may add a bit too much complexity for a niche use case.
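
As a rough illustration of the two ideas above (an explicit length check and chunking into batches of 77), here is a minimal sketch that works on token ids that have already been produced by a tokenizer. The helper names `check_prompt_len` and `chunk_tokens` and the `pad_id` parameter are hypothetical and are not part of candle. The sketch also ignores the start-of-text and end-of-text markers that each chunk would likely need, and it says nothing about how the resulting embeddings would be combined.

```rust
/// Sequence length the text encoder accepts (the 77 seen in the error above).
const MAX_TOKENS: usize = 77;

/// Hypothetical helper: fail early with a readable message instead of
/// letting the model hit a shape mismatch in `broadcast_add`.
fn check_prompt_len(token_ids: &[u32]) -> Result<(), String> {
    if token_ids.len() > MAX_TOKENS {
        return Err(format!(
            "prompt is too long: {} tokens, but the text encoder accepts at most {}",
            token_ids.len(),
            MAX_TOKENS
        ));
    }
    Ok(())
}

/// Hypothetical helper: split the token ids into chunks of at most
/// MAX_TOKENS and pad the last chunk with `pad_id` so that every chunk
/// has exactly MAX_TOKENS entries.
fn chunk_tokens(token_ids: &[u32], pad_id: u32) -> Vec<Vec<u32>> {
    token_ids
        .chunks(MAX_TOKENS)
        .map(|chunk| {
            let mut padded = chunk.to_vec();
            padded.resize(MAX_TOKENS, pad_id);
            padded
        })
        .collect()
}
```

With the 254 tokens from the report, `check_prompt_len` would return an error, while `chunk_tokens` would yield four chunks of 77 ids each, the last one mostly padding.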

@groovybits
Author

Ah, thank you! That makes sense; I guess they handle this in the Python implementations. Yes, a clearer error will help. It is hard to know what is going on, and I may need to truncate the prompt so it doesn't go over the limit, since it also seems tricky to know exactly how many tokens my input will produce and whether that is above 77.
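
For reference, the truncation idea could look roughly like the sketch below. The token count is only known after tokenizing, so the function takes the token ids rather than the prompt string. `fit_to_max_len` is a hypothetical name, and `eos_id` and `pad_id` are assumptions that the caller would have to take from whatever tokenizer is in use.

```rust
/// Sequence length the text encoder accepts (the 77 seen in the error above).
const MAX_TOKENS: usize = 77;

/// Hypothetical helper: force `token_ids` to exactly MAX_TOKENS entries.
/// Returns the fixed-length ids and a flag saying whether anything was cut.
fn fit_to_max_len(mut token_ids: Vec<u32>, eos_id: u32, pad_id: u32) -> (Vec<u32>, bool) {
    let truncated = token_ids.len() > MAX_TOKENS;
    if truncated {
        token_ids.truncate(MAX_TOKENS);
        // Assumption: the sequence should still end with the end-of-text id.
        token_ids[MAX_TOKENS - 1] = eos_id;
    }
    // Short prompts are padded up to the fixed length.
    token_ids.resize(MAX_TOKENS, pad_id);
    (token_ids, truncated)
}
```

The returned flag makes it possible to warn that part of the prompt was dropped, rather than truncating silently.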

@groovybits
Author

Closing this since I am happy and now see how to limit the tokens. Thanks!
