@ArDiouscuros
Contributor

@Any-Winter-4079
Any idea how to test it?

@Any-Winter-4079
Contributor

Any-Winter-4079 commented Oct 7, 2022

@ArDiouscuros Do you have an M1?
If not, there are some example commands and outputs here: https://github.com/invoke-ai/InvokeAI/blob/development/docs/help/SAMPLER_CONVERGENCE.md (plus it's an interesting read, imho)

@ArDiouscuros
Contributor Author

@Any-Winter-4079
No, I don't have an M1.
But a test on Windows / CUDA shows that I can reproduce M1 results on a CUDA system.
I hope it could be useful.

image

@Any-Winter-4079
Contributor

Awesome!

@Any-Winter-4079
Contributor

Any-Winter-4079 commented Oct 7, 2022

"brown corgi, clean background, flat design:1.0" -s 20 -S 637276129 -W 448 -H 640 -C 7.5 -A ddim
Run on an MPS machine:
Screenshot 2022-10-08 at 00 32 21
(So you can definitely reproduce MPS images on CUDA!)

@tildebyte tildebyte force-pushed the mps-compatible-noise branch from 89c91c5 to 0b7ca6a Compare October 9, 2022 16:25
@Birch-san

this should be called "use CPU noise", right? the thing that's special about random on MPS is that it doesn't support seeds, so we fall back to CPU.

@Any-Winter-4079
Contributor

Any-Winter-4079 commented Oct 9, 2022

@Birch-san could we mimic CUDA seed behavior on our CPU?
If we knew how seeds/RNG behave on CUDA, maybe we could mimic it and make images reproducible? Or is that not possible?

@ArDiouscuros
Contributor Author

> @Birch-san could we mimic cuda seed behavior on our CPU? If we knew how seeds behave on cuda, maybe we could mimic it and make images reproducible? Or is that not possible

@Any-Winter-4079
The main issue is that the CPU and CUDA random number generators use different algorithms. If we knew the CUDA implementation of the RNG, we could port it to the CPU. But currently nobody does, and the torch team just closes the issues about different rand results with the same seed because "it is expected".

@Vargol
Contributor

Vargol commented Oct 10, 2022

As far as I can tell, whenever a project has wanted consistent random values across platforms, it has fallen back to using its 'own' random number algorithm rather than one built into the OS (or in this case the CUDA API).

It tends to be a variation on the Mersenne Twister.

Actually, check the Wikipedia page for it:
https://en.wikipedia.org/wiki/Mersenne_Twister
It lists a fair few languages / standard libraries that use it by default.
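
To make the "bring your own generator" idea concrete, here is a minimal sketch using Python's standard `random` module, whose core generator is the Mersenne Twister. The helper name `seeded_noise` is hypothetical and is not part of this PR; the point is only that a generator implemented in software gives the same Gaussian samples for the same seed regardless of which GPU backend later consumes them.

```python
import random


def seeded_noise(seed, count):
    """Return `count` standard-normal samples from a Mersenne Twister seeded with `seed`.

    Hypothetical helper for illustration only: a real pipeline would reshape
    these values into a latent tensor and move it to the target device.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


# Same seed, same samples: the sequence depends only on the algorithm and seed.
first = seeded_noise(637276129, 4)
second = seeded_noise(637276129, 4)
print(first == second)
```

The values would not match what torch produces for the same seed, since torch draws and transforms its samples differently; the sketch only shows why a fixed software algorithm is portable.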

As an aside, I did actually have a play with replacing the noise function with one based on Perlin noise, just to see the effect, but I didn't get anything out of it. I was probably building the tensor wrong, as I don't really get how a 5-dimensional tensor fits together.

@Any-Winter-4079
Contributor

Any-Winter-4079 commented Oct 10, 2022

On https://numba.pydata.org/numba-doc/0.33.0/cuda/random.html it says:

The xoroshiro128+ algorithm has a period of 2**128 - 1, which is shorter than the period of the XORWOW algorithm used by default in cuRAND

Is cuRAND the library in question and if so, maybe XORWOW is what they use?
They point to that algorithm too here, but it's from 2011 https://forums.developer.nvidia.com/t/what-algorithm-does-curand-use/21720

Also https://en.wikipedia.org/wiki/Xorshift

xorwow
Marsaglia suggested scrambling the output by combining it with a simple additive counter modulo 2^32 (which he calls a "Weyl sequence" after Weyl's equidistribution theorem). This also increases the period by a factor of 2^32, to 2^192 − 2^32:

This performs well, but fails a few tests in BigCrush.[6] This generator is the default in Nvidia's CUDA toolkit.[7]
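
For reference, the xorwow step described in that passage can be sketched in pure Python as below. This follows Marsaglia's description (a five-word xorshift combined with an additive counter modulo 2^32); the class name `Xorwow` and the default starting words are assumptions chosen for illustration. It does not reproduce how cuRAND or torch seed the state or turn the integers into Gaussian noise, so it will not match CUDA images as written.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


class Xorwow:
    """Illustrative xorwow generator: xorshift on five words plus a Weyl counter."""

    def __init__(self, x=123456789, y=362436069, z=521288629,
                 w=88675123, v=5783321, d=6615241):
        # Starting words are arbitrary non-zero values (an assumption, not cuRAND's).
        self.state = [n & MASK32 for n in (x, y, z, w, v)]
        self.counter = d & MASK32
        if not any(self.state):
            raise ValueError("xorshift state must not be all zero")

    def next_u32(self):
        x, y, z, w, v = self.state
        t = x ^ (x >> 2)
        t ^= (t << 1) & MASK32
        v_new = (v ^ ((v << 4) & MASK32)) ^ t
        # Shift the words along and append the new one.
        self.state = [y, z, w, v, v_new]
        # Additive counter modulo 2^32 (the "Weyl sequence").
        self.counter = (self.counter + 362437) & MASK32
        return (v_new + self.counter) & MASK32


gen = Xorwow()
sample = [gen.next_u32() for _ in range(3)]
print(sample)
```

Matching CUDA output would additionally require knowing the exact seeding, per-thread offsets, and normal-distribution transform used on the GPU, none of which is established in this thread.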

@Birch-san

> could we mimic cuda seed behavior on our CPU?

fun idea. dunno how. how about asking CUDA users to join us on CPU seed? 😛

@Any-Winter-4079
Contributor

Any-Winter-4079 commented Oct 11, 2022

@ArDiouscuros should we rename it to CPU noise as Birch-san says?
Let me know what you think, and I'll test and merge it if you request my review (and we can discuss CUDA images on MPS in another issue).

@ArDiouscuros
Contributor Author

> @ArDiouscuros should we rename it to CPU noise as Birch San says? Let me know what you think and I'll test to merge it if you request my review (and we can discuss about cuda images on mps on another issue).

@Any-Winter-4079 sure, why not.
But I have no button here to request a review.
At least, I'm not aware of how to make such a request...

@lstein
Collaborator

lstein commented Oct 15, 2022

Hi, just reviewed the code and it looks fine. I'll run on my system later today and merge unless I notice anything amiss.

@mauwii
Contributor

mauwii commented Oct 15, 2022

@lstein, @ArDiouscuros and @Any-Winter-4079: If you need someone with an M1 to test things, feel free to mention me. I have a MacBook Air M1 2020 with 16 GB RAM and 8 GPU cores. I would also have a MacBook Pro available from work, but since it is also "just" a 2020 model with an M1, it is not really different from my MacBook Air (it already has the 8 GPU cores; the only difference would be the cooling system, which is passive on MacBook Air models and active on the MacBook Pro 🙈)

Collaborator

@lstein lstein left a comment

I'm approving this, but since it has been lingering for a while, there are a bunch of conflicts that need to be resolved. I will attempt to do this.

@lstein lstein merged commit 5e8d1ca into invoke-ai:development Oct 25, 2022