Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong result with scatter += (Not a bug, just a misusing, found by Tullio.jl) #145

Closed
N5N3 opened this issue Aug 18, 2020 · 5 comments
Closed

Comments

@N5N3
Copy link

N5N3 commented Aug 18, 2020

using LoopVectorization
function func_avx()
    J = [1 1 1]
    A = [i for i in 1:10]
    B = ones(2,10)
    @avx for i in eachindex(J)
        for j in axes(B,2)
            B[J[i],j] += A[j]
        end
    end
    B
end
function func()
    J = [1 1 1]
    A = [i for i in 1:10]
    B = ones(2,10)
    for i in eachindex(J)
        for j in axes(B,2)
            B[J[i],j] += A[j]
        end
    end
    B
end

and the result

func() return
2×10 Array{Float64,2}:
 4.0  7.0  10.0  13.0  16.0  19.0  22.0  25.0  28.0  31.0
 1.0  1.0   1.0   1.0   1.0   1.0   1.0   1.0   1.0   1.0
func_avx() return
2×10 Array{Float64,2}:
 2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0  10.0  11.0
 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0   1.0   1.0

If I misusing somthing, close it directly.

@N5N3 N5N3 changed the title bug with scatter += ? bug with scatter += (Found by Tullio.jl)? Aug 18, 2020
@chriselrod
Copy link
Member

You can't write to the same place in parallel.

@chriselrod
Copy link
Member

@N5N3, as I said in the Tullio issue, I think this is a problem in documentation.

LoopVectorization's README states in the "Warning" section at the top:

  1. Are not relying on a specific execution order. @avx can and will re-order operations and loops inside its scope, so the correctness cannot depend on a particular order. You cannot implement cumsum with @avx.

Any suggestions on a better way to explain this so people can understand/expect this issue whenever length(unique(J)) != length(J)?

@N5N3
Copy link
Author

N5N3 commented Aug 18, 2020

It seems that i mis-understood simd before,

I thought the scattering += is a sum, as the re-order is ok.

By the way, if we can't write to the same place in parallel (and simd is a kind of parallel), why @avx is ok for sum?

@mcabbott
Copy link

Iterating in the wrong order would indeed be OK, but doing different steps in parallel is much worse, as several of them can start from the same old value, instead of from each other's outputs.

Summing can be done safely, it runs a number of separate accumulators in parallel, and then adds them at the end.

Is there at the moment a way to tell LoopVectorization to leave index i alone, but feel free to vectorise along index j?

@N5N3 N5N3 changed the title bug with scatter += (Found by Tullio.jl)? Wrong result with scatter += (Not a bug, just a misusing, found by Tullio.jl) Aug 18, 2020
@chriselrod
Copy link
Member

Is there at the moment a way to tell LoopVectorization to leave index i alone, but feel free to vectorise along index j?

I'll add that feature as I make changes in the next few weeks.
I'll give the users much better control over its behavior.

@N5N3 N5N3 closed this as completed Jul 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants