Wrong result with scatter += (Not a bug, just a misusing, found by Tullio.jl) #145
Comments
You can't write to the same place in parallel.
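To make the failure concrete, here is a minimal sketch in plain Python (not Julia, and not LoopVectorization's actual code generation) that models a vectorized `+=` as "gather the old values for a whole batch, add, then scatter". The function names, the batch width of 4, and the sample indices are all assumptions chosen for illustration.

```python
def scatter_add_serial(dest, idx, vals):
    """Reference loop: each update sees the result of the previous one."""
    out = list(dest)
    for i, v in zip(idx, vals):
        out[i] += v
    return out


def scatter_add_simd(dest, idx, vals, width=4):
    """Toy model of a vectorized loop (hypothetical, for illustration only)."""
    out = list(dest)
    for start in range(0, len(idx), width):
        lanes = idx[start:start + width]
        # Every lane reads its old value before any lane writes.
        gathered = [out[i] for i in lanes]
        summed = [g + v for g, v in zip(gathered, vals[start:start + width])]
        # Lanes that share an index overwrite each other; only one add survives.
        for i, s in zip(lanes, summed):
            out[i] = s
    return out


idx = [0, 0, 1, 0]   # index 0 appears three times in one batch
vals = [1, 1, 1, 1]
serial = scatter_add_serial([0, 0], idx, vals)
simd = scatter_add_simd([0, 0], idx, vals)
print(serial)  # [3, 1]
print(simd)    # [1, 1]
```

With repeated indices inside one batch, the three updates to slot 0 all start from the same old value, so two of them are lost. When every index in a batch is distinct, both functions agree.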
@N5N3, as I said in the Tullio issue, I think this is a problem in documentation. LoopVectorization's README states in the "Warning" section at the top:
Any suggestions on a better way to explain this so people can understand/expect this issue whenever
It seems that I misunderstood SIMD before. I thought the scattering
By the way, if we can't write to the same place in parallel (and SIMD is a kind of parallelism), why is @avx OK for sum?
Iterating in the wrong order would indeed be OK, but doing different steps in parallel is much worse, as several of them can start from the same old value instead of from each other's outputs. Summing can be done safely: it runs a number of separate accumulators in parallel and then adds them at the end. Is there at the moment a way to tell LoopVectorization to leave index
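The "separate accumulators" idea for a sum can be sketched like this, again in plain Python as a model rather than what LoopVectorization literally emits. The name `sum_with_lanes` and the width of 4 are assumptions for illustration.

```python
def sum_with_lanes(xs, width=4):
    """Sum xs using `width` independent accumulators, then combine them."""
    acc = [0.0] * width
    for k, x in enumerate(xs):
        # Element k only ever touches accumulator k % width,
        # so no two lanes write to the same place.
        acc[k % width] += x
    # Horizontal reduction: add the per-lane partial sums at the end.
    return sum(acc)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
total = sum_with_lanes(xs)
print(total)  # 28.0
```

Each lane owns its accumulator, so there is no shared destination to race on. The additions are reassociated, so for general floating-point input the result can differ from a strictly sequential sum in the last bits.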
I'll add that feature as I make changes in the next few weeks.
and the result
If I am misusing something, feel free to close this directly.