if statements & difficulties using matrices vs vectors #22

Closed
magerton opened this issue Jan 17, 2020 · 7 comments
@magerton

Hi @chriselrod, I've upgraded from @vectorize to @avx and am excited to use an officially registered package.

Looks like a change (maybe in SIMDPirates?) broke something for me -- the code below works unvectorized but breaks once I add @avx. There seem to be two issues. First, if statements inside an @avx loop don't work. Second, something goes wrong when I iterate over a matrix or a view rather than a vector. @code_warntype suggests there are problems with type inference. See the unvectorized, broken vectorized, and fixed vectorized code below.
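
Here's a minimal sketch of the pattern I'd like to write (hypothetical toy arrays, not the actual code from my package); it should hit the same two problems reported below:

using LoopVectorization
x, q, lse = rand(100, 10), zeros(100, 10), zeros(100)  # matrices x and q, vector lse
maxk = 5
for k in 1:size(x, 2)
    @avx for i in eachindex(lse)
        tmp = exp(x[i, k])             # indexing a matrix with the outer loop's k
        lse[i] += tmp
        k <= maxk && (q[i, k] = tmp)   # the conditional store that @avx rejects
    end
end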

I'm using Julia 1.3.1 with these package versions:

[[LoopVectorization]]
deps = ["LinearAlgebra", "MacroTools", "Parameters", "SIMDPirates", "SLEEFPirates", "VectorizationBase"]
git-tree-sha1 = "a8d158a4971113443269739f3cc51ae992e95a58"
uuid = "bdcacae8-1622-11e9-2a5c-532679323890"
version = "0.3.6"

[[SIMDPirates]]
deps = ["MacroTools", "VectorizationBase"]
git-tree-sha1 = "500294a8b1001bdda2483fc6d675956798ad8764"
uuid = "21efa798-c60a-11e8-04d3-e1a92915a26a"
version = "0.1.6"

[[SLEEFPirates]]
deps = ["SIMDPirates", "VectorizationBase"]
git-tree-sha1 = "4733445246d3d5536c7aee1bffb55ab37b88347b"
uuid = "476501e8-09a2-5ece-8869-fb82de89a1fa"
version = "0.1.3"

[[VectorizationBase]]
deps = ["CpuId", "LinearAlgebra"]
git-tree-sha1 = "a2576763aa20968ffb5668e2e15d45ae8e364d05"
uuid = "3d5dd08c-fd9d-11e8-17fa-ed2836048c2f"
version = "0.1.9"

The test script:

using LoopVectorization
using Base: OneTo
using LinearAlgebra: stride1
using Test

add_1_dim(x::AbstractArray) = reshape(x, size(x)..., 1)
check_finite(x::AbstractArray) = all(isfinite.(x)) || throw(error("x not finite!"))

"Given k-dimensional array `x` where `n=size(x,k)`, compute multinomial logistic Pr(i ∈ 1:n | x[:, ..., :, k] )"
function softmax3!(q::AA, lse::A, tmpmax::A, x::AA, maxk=size(q, ndims(q)) ) where {T<:Real, A<:Array{T}, AA<:AbstractArray{T}}
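    # I.e., with m = the maximum over the last dim (subtracted for numerical stability),
    # q[..., j] = exp(x[..., j] - m) / sum_{j' = 1:n} exp(x[..., j'] - m), stored for j = 1:maxk.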
    ndims(q) == 1+ndims(lse) || throw(DimensionMismatch())
    xsizes = size(x)
    xsizes == size(q) || throw(DimensionMismatch("size(x) = $(size(x)) but size(q) = $(size(q))"))
    nk = last(xsizes)
    for i = OneTo(ndims(lse))
        size(q,i) == size(lse,i) == size(tmpmax,i) || throw(DimensionMismatch("size(x) = $(size(x)),  size(lse) = $(size(lse)), and size(tmpmax) = $(size(tmpmax))"))
    end
    0 < maxk <= nk || throw(DomainError(maxk))
    1 == stride1(q) == stride1(x) || throw(error("Arrays not strided"))

    isempty(x) && throw(error("x empty"))
    check_finite(x)

    maximum!(add_1_dim(tmpmax), x)
    fill!(lse, zero(T))

    xx = reshape(x, :, nk)
    qq = reshape(q, :, nk)

    for k in OneTo(nk)
        for i in eachindex(lse)
            tmp = exp(xx[i,k] - tmpmax[i])
            lse[i] += tmp
            k <= maxk && (qq[i,k] = tmp)
        end
    end

    qq[:,OneTo(maxk)] ./= vec(lse)
end

function softmax3_avx_broken!(q::AA, lse::A, tmpmax::A, x::AA, maxk=size(q, ndims(q)) ) where {T<:Real, A<:Array{T}, AA<:AbstractArray{T}}
        ndims(q) == 1+ndims(lse) || throw(DimensionMismatch())
        xsizes = size(x)
        xsizes == size(q) || throw(DimensionMismatch("size(x) = $(size(x)) but size(q) = $(size(q))"))
        nk = last(xsizes)
        for i = OneTo(ndims(lse))
            size(q,i) == size(lse,i) == size(tmpmax,i) || throw(DimensionMismatch("size(x) = $(size(x)),  size(lse) = $(size(lse)), and size(tmpmax) = $(size(tmpmax))"))
        end
        0 < maxk <= nk || throw(DomainError(maxk))
        1 == stride1(q) == stride1(x) || throw(error("Arrays not strided"))

        isempty(x) && throw(error("x empty"))
        check_finite(x)

        maximum!(add_1_dim(tmpmax), x)
        fill!(lse, zero(T))

        xx = reshape(x, :, nk)
        qq = reshape(q, :, nk)

        for k in OneTo(maxk)
            @avx for i in eachindex(lse)
                tmp = exp(xx[i,k] - tmpmax[i])
                lse[i] += tmp
                # FIXME - would prefer to replace 2nd loop w/ if stmt
                # k <= maxk && (qq[i,k] = tmp)
                qq[i,k] = tmp
            end
        end

        for k in maxk+1:nk
            @avx for i in eachindex(lse)
                tmp = exp(xx[i,k] - tmpmax[i])
                lse[i] += tmp
            end
        end

        qq[:,OneTo(maxk)] ./= vec(lse)
end

function softmax3_avx_fixed!(q::AA, lse::A, tmpmax::A, x::AA, maxk=size(q, ndims(q)) ) where {T<:Real, A<:Array{T}, AA<:AbstractArray{T}}
        ndims(q) == 1+ndims(lse) || throw(DimensionMismatch())
        xsizes = size(x)
        xsizes == size(q) || throw(DimensionMismatch("size(x) = $(size(x)) but size(q) = $(size(q))"))
        nk = last(xsizes)
        for i = OneTo(ndims(lse))
            size(q,i) == size(lse,i) == size(tmpmax,i) || throw(DimensionMismatch("size(x) = $(size(x)),  size(lse) = $(size(lse)), and size(tmpmax) = $(size(tmpmax))"))
        end
        0 < maxk <= nk || throw(DomainError(maxk))
        1 == stride1(q) == stride1(x) || throw(error("Arrays not strided"))

        isempty(x) && throw(error("x empty"))
        check_finite(x)

        maximum!(add_1_dim(tmpmax), x)
        fill!(lse, zero(T))

        xx = reshape(x, :, nk)
        qq = reshape(q, :, nk)
        tmpmaxvec = vec(tmpmax)  # Always required??
        lsevec = vec(lse)        # Always required??

        for k in OneTo(maxk)
            qk = view(qq, :, k) # required if using a View
            xk = view(xx, :, k) # required if using a View
            @avx for i in eachindex(lsevec)
                tmp = exp(xk[i] - tmpmaxvec[i])
                lsevec[i] += tmp
                qk[i] = tmp
            end
        end

        for k in maxk+1:nk
            qk = view(qq, :, k)
            xk = view(xx, :, k)
            @avx for i in eachindex(lsevec)
                tmp = exp(xk[i] - tmpmaxvec[i])
                lsevec[i] += tmp
            end
        end

        qq[:,OneTo(maxk)] ./= vec(lse)
end


ni, nj, nk = (100, 100, 10)
x = rand(ni, nj, nk)
q = similar(x)
tmpmax = zeros(ni,nj)
lse = similar(tmpmax)

for f! in (softmax3!, softmax3_avx_fixed!, softmax3_avx_broken!)
    @testset "$(f!) with arrays" begin
        fill!(q,0)
        fill!(lse,0)
        @show @code_warntype f!(q, lse, tmpmax, x, 1)
        f!(q, lse, tmpmax, x, 1)
        @test all(sum(q; dims=3) .<= 1)
        fill!(q,0)
        fill!(lse,0)
        f!(q, lse, tmpmax, x)
        @test sum(q; dims=3) ≈ ones(ni,nj)
    end

    @testset "$(f!) with views" begin
        nkm1 = nk-1
        @views qvw = q[:,:,1:nkm1]
        @views xvw = x[:,:,1:nkm1]
        fill!(q,0)
        fill!(lse,0)
        @show @code_warntype f!(qvw, lse, tmpmax, xvw, 1)
        f!(qvw, lse, tmpmax, xvw, 1)
        @test all(sum(qvw; dims=3) .<= 1)
        fill!(q,0)
        fill!(lse,0)
        f!(qvw, lse, tmpmax, xvw)
        @test sum(qvw; dims=3) ≈ ones(ni,nj)
    end
end

The breakage takes two forms.

The vector/matrix issue:

#= D:\libraries\julia\dev\ShaleDrillingLikelihood\example\softmax3test.jl:136 =# @code_warntype(f!(q, lse, tmpmax, x, 1)) = nothing
softmax3_avx_broken! with arrays: Error During Test at D:\libraries\julia\dev\ShaleDrillingLikelihood\example\softmax3test.jl:133
  Got exception outside of a @test
  AssertionError: Ni == N + 1
  Stacktrace:
   [1] packed_strided_ptr_index(::Core.SimpleVector, ::Int64, ::Type{Float64}) at D:\libraries\julia\packages\SIMDPirates\nxDqU\src\memory.jl:996
   [2] #s28#117(::Any, ::Any, ::Any, ::Any, ::Any, ::Any) at D:\libraries\julia\packages\SIMDPirates\nxDqU\src\memory.jl:1020
   [3] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any,N} where N) at .\boot.jl:524
   [4] macro expansion at .\gcutils.jl:91 [inlined]
   [5] softmax3_avx_broken!(::Array{Float64,3}, ::Array{Float64,2}, ::Array{Float64,2}, ::Array{Float64,3}, ::Int64) at D:\libraries\julia\dev\ShaleDrillingLikelihood\example\softmax3test.jl:62
   [6] top-level scope at D:\libraries\julia\dev\ShaleDrillingLikelihood\example\softmax3test.jl:137
   [7] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.3\Test\src\Test.jl:1107
   [8] top-level scope at D:\libraries\julia\dev\ShaleDrillingLikelihood\example\softmax3test.jl:134
   [9] include_string(::Module, ::String, ::String) at .\loading.jl:1075
   [10] include_string(::Module, ::String, ::String, ::Int64) at D:\libraries\julia\packages\CodeTools\sf1Tz\src\eval.jl:30
   [11] (::Atom.var"#127#132"{String,Int64,String,Bool})() at D:\libraries\julia\packages\Atom\X8fAI\src\eval.jl:94
   [12] withpath(::Atom.var"#127#132"{String,Int64,String,Bool}, ::String) at D:\libraries\julia\packages\CodeTools\sf1Tz\src\utils.jl:30
   [13] withpath(::Function, ::String) at D:\libraries\julia\packages\Atom\X8fAI\src\eval.jl:47
   [14] #126 at D:\libraries\julia\packages\Atom\X8fAI\src\eval.jl:93 [inlined]
   [15] with_logstate(::Atom.var"#126#131"{String,Int64,String,Bool}, ::Base.CoreLogging.LogState) at .\logging.jl:395
   [16] with_logger at .\logging.jl:491 [inlined]
   [17] #125 at D:\libraries\julia\packages\Atom\X8fAI\src\eval.jl:92 [inlined]
   [18] hideprompt(::Atom.var"#125#130"{String,Int64,String,Bool}) at D:\libraries\julia\packages\Atom\X8fAI\src\repl.jl:85
   [19] macro expansion at D:\libraries\julia\packages\Atom\X8fAI\src\eval.jl:91 [inlined]
   [20] macro expansion at D:\libraries\julia\packages\Media\ItEPc\src\dynamic.jl:24 [inlined]
   [21] (::Atom.var"#124#129")(::Dict{String,Any}) at D:\libraries\julia\packages\Atom\X8fAI\src\eval.jl:86
   [22] handlemsg(::Dict{String,Any}, ::Dict{String,Any}) at D:\libraries\julia\packages\Atom\X8fAI\src\comm.jl:164
   [23] (::Atom.var"#19#21"{Array{Any,1}})() at .\task.jl:333

and the if-statement issue:

LoadError: "Don't know how to handle expression:\nk <= maxk && (qq[i, k] = tmp)"
in expression starting at D:\libraries\julia\dev\ShaleDrillingLikelihood\example\softmax3test.jl:62
push!(::LoopVectorization.LoopSet, ::Expr, ::Int64) at graphs.jl:821
add_block!(::LoopVectorization.LoopSet, ::Expr, ::Int64) at graphs.jl:280
add_loop!(::LoopVectorization.LoopSet, ::Expr, ::Int64) at graphs.jl:347
add_loop! at graphs.jl:344 [inlined]
copyto! at constructors.jl:6 [inlined]
LoopVectorization.LoopSet(::Expr) at constructors.jl:46
@avx(::LineNumberNode, ::Module, ::Any) at constructors.jl:86
@magerton
Author

Attaching a file with the tests (just strip off the .txt extension that GitHub requires for uploads):

softmax3test.jl.txt

@chriselrod
Member

I'm working on these.

As a note

                # FIXME - would prefer to replace 2nd loop w/ if stmt
                # k <= maxk && (qq[i,k] = tmp)

Once it works, it'll probably be slower than the two separate loops, because it will evaluate the store on every iteration (but it'll apply a mask).
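
Here's a rough scalar picture of the trade-off (illustration only; the helper name is mine, and this is not the code LoopVectorization actually generates):

# With the store in the same loop, the write to qq happens (under a mask) on every
# iteration, even when k > maxk; with two loops, those iterations contain no store at all.
function single_loop_picture!(qq, lse, xx, tmpmax, maxk)
    for k in axes(xx, 2), i in eachindex(lse)
        tmp = exp(xx[i, k] - tmpmax[i])
        lse[i] += tmp
        k <= maxk && (qq[i, k] = tmp)   # in the vectorized version, a masked store
    end
end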

@magerton
Author

Thanks, @chriselrod!

Good to know about the if statement & two loops -- I'll keep them separate then.

@chriselrod
Member

I'm now testing basic conditionals and your softmax3 functions, based on the second (previously broken) version.

However, that FIXME won't work yet, because the library doesn't have much support for using the loop variable for anything other than indexing.
(Also, FWIW, the versions with @avx on the inner loop, like you wrote, got slightly better performance.)

@magerton
Author

Oh, fantastic! Thanks, @chriselrod!

@chriselrod
Member

Closing, since adding the store to the same loop has now been fixed and added to the test suite.
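
For reference, the single-loop kernel this enables looks roughly like the FIXME above (a sketch with a made-up helper name, not re-verified against the released version):

function softmax_kernel!(qq, lse, xx, tmpmax, maxk)
    for k in axes(xx, 2)
        @avx for i in eachindex(lse)
            tmp = exp(xx[i, k] - tmpmax[i])
            lse[i] += tmp
            k <= maxk && (qq[i, k] = tmp)   # conditional store now handled inside @avx
        end
    end
end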

@magerton
Author

Thanks, @chriselrod!
