
Same seed yields different results on GPU #5

Open
Quasimondo opened this issue May 28, 2020 · 7 comments

Comments

@Quasimondo

Running texturizer with the same seed produces slightly different results on repeated runs. I would expect the same seed to generate an identical texture every time. Here are three examples:

[Attached images: crop1_gen, crop1_gen_a, crop1_gen_b]

@Quasimondo
Author

Looks like this is caused by some non-deterministic CUDA functions, since running with the same seed on the CPU gives identical results every time.

From https://pytorch.org/docs/stable/notes/randomness.html:

There are some PyTorch functions that use CUDA functions that can be a source of nondeterminism. One class of such CUDA functions are atomic operations, in particular atomicAdd, which can lead to the order of additions being nondeterministic. Because floating-point addition is not perfectly associative, atomicAdd with floating-point operands can introduce different floating-point rounding errors on each evaluation, which introduces a source of nondeterministic variance (aka noise) in the result.
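A generic sketch (not this library's code) that shows this effect in isolation: index_add_ on CUDA accumulates with atomicAdd, so with many duplicate indices the summation order, and hence the rounding, can vary between calls.

```python
import torch

# Generic PyTorch sketch, unrelated to texturize itself: index_add_ on CUDA
# accumulates with atomicAdd, so duplicate indices make the order of
# floating-point additions (and thus the rounding) vary between calls.
torch.manual_seed(0)
src = torch.randn(1_000_000, device="cuda")
idx = torch.randint(0, 16, (1_000_000,), device="cuda")

def accumulate():
    out = torch.zeros(16, device="cuda")
    out.index_add_(0, idx, src)  # nondeterministic accumulation order
    return out

a, b = accumulate(), accumulate()
print(torch.equal(a, b))  # may print False: bitwise-different rounding
```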

@htoyryla

Try torch.backends.cudnn.deterministic = True?

I use this all the time; it also appears to save memory.
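For reference, the full set of settings from the reproducibility notes (the seed value is arbitrary):

```python
import torch

torch.manual_seed(1234)                    # seeds the CPU and current-GPU RNGs
torch.cuda.manual_seed_all(1234)           # explicitly seed every GPU
torch.backends.cudnn.deterministic = True  # pick deterministic cuDNN kernels
torch.backends.cudnn.benchmark = False     # disable autotuned kernel selection
```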

@Quasimondo
Author

Ah, sorry, I didn't mention that I had of course already tried all the usual settings mentioned in the docs. They do not help.

@htoyryla

Oh yes... good to know.

@alexjc
Collaborator

alexjc commented May 28, 2020

It's good to know the CPU results are identical, that rules out issues & bugs directly in this library ;-)

It's not clear to me what is causing the non-determinism on GPU. torch.nn.Conv2d should become deterministic with torch.backends.cudnn.deterministic = True, but the only other candidate in the encoder is torch.nn.AvgPool2d, which carries no warnings about determinism in the documentation.
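A minimal sketch to test those two layers in isolation (shapes are made up; the check compares gradients from two nominally identical backward passes, since the backward kernels are the usual suspects):

```python
import torch

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

def run_once():
    # Rebuild everything from the same seed so both runs should match exactly.
    torch.manual_seed(0)
    conv = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1).cuda()
    pool = torch.nn.AvgPool2d(2)
    x = torch.randn(1, 3, 128, 128, device="cuda", requires_grad=True)
    pool(conv(x)).sum().backward()
    return x.grad.clone()

# False here would point at the conv/pool backward kernels themselves.
print(torch.equal(run_once(), run_once()))
```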

It may be that the L-BFGS optimizer itself also introduces some non-determinism. Testing a plain SGD optimizer would rule that out...
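Hypothetically, the swap is a one-liner wherever the optimizer is constructed (image here stands in for whatever tensor is being optimized):

```python
# optimizer = torch.optim.LBFGS([image])
optimizer = torch.optim.SGD([image], lr=0.1)  # plain update rule, no line search
```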

@Quasimondo
Author

Quasimondo commented May 28, 2020

I had first suspected the LBFGS solver and tentatively replaced it with Adam. The results look like crap, but they also differ when run on CUDA with the same seed.

I forgot to quote the most important line from that "Reproducibility" link above:

"There is currently no simple way of avoiding nondeterminism in these functions."

It's a pity, since I was trying to do some seed-interpolation animations as well as a sliding-window test for generating textures larger than fit on the GPU, but neither works with CUDA since the output changes on every run.
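In pseudocode, the interpolation idea was roughly this (seeded_noise and the texturize(...) call are hypothetical stand-ins for the library's entry point):

```python
import torch

def seeded_noise(seed, shape, device="cuda"):
    # A reproducible per-seed starting tensor.
    g = torch.Generator(device=device).manual_seed(seed)
    return torch.randn(shape, generator=g, device=device)

a = seeded_noise(1, (1, 3, 256, 256))
b = seeded_noise(2, (1, 3, 256, 256))
for t in torch.linspace(0.0, 1.0, steps=8):
    start = (1 - t) * a + t * b  # blend the two seeds' starting tensors
    # frame = texturize(start)   # hypothetical call into the library
```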

@alexjc
Collaborator

alexjc commented May 28, 2020

The interpolation would not be deterministic anyway: as soon as you change the content of the starting tensor, it introduces different numbers and hence instability.
https://arxiv.org/abs/1705.02092v1

As for larger textures, I have a variety of code for tiling, but not with Gram matrices yet... I will see what's possible!

@alexjc changed the title from "Same seed yields different results" to "Same seed yields different results on GPU" on May 28, 2020