
Test runner stumbles over path separators #236

Closed · finmod opened this issue Jun 20, 2020 · 8 comments · Fixed by #241

Labels: bug (Something isn't working), tests (Adds or changes tests)

finmod commented Jun 20, 2020

`test CUDA` fails with the following message:

 ptxas application ptx input, line 348; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas fatal   : Ptx assembly aborted due to errors
  Stacktrace:
   [1] CuModule(::String, ::Dict{CUDA.CUjit_option_enum,Any}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\lib\cuda\module.jl:40
   [2] _cufunction(::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:335
   [3] _cufunction at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:302 [inlined]
   [4] #75 at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:21 [inlined]
   [5] get!(::GPUCompiler.var"#75#76"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},typeof(CUDA._cufunction),GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}}, ::Dict{UInt64,Any}, ::UInt64) at .\dict.jl:452
   [6] macro expansion at .\lock.jl:183 [inlined]
   [7] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:19
   [8] + at .\int.jl:53 [inlined]
   [9] hash_64_64 at .\hashing.jl:35 [inlined]
   [10] hash_uint64 at .\hashing.jl:62 [inlined]
   [11] hx at .\float.jl:568 [inlined]
   [12] hash at .\float.jl:571 [inlined]
   [13] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:0
   [14] cached_compilation(::Function, ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:37
   [15] cufunction(::Function, ::Type; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:296
   [16] cufunction(::Function, ::Type{T} where T) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:291
   [17] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:108
   [18] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
   [19] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:227
   [20] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1186
   [21] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184
   [22] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1113
   [23] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184

Error in testset device\wmma:
Error During Test at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
  Test threw exception
  Expression: $(Expr(:escape, quote
    #= C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:228 =# @cuda threads = 32 kernel(a_dev, b_dev, c_dev, d_dev, alpha, beta)
    d = Array(d_dev)
    new_a = if a_layout == ColMajor
            a
        else
            transpose(a)
        end
    new_b = if b_layout == ColMajor
            b
        else
            transpose(b)
        end
    new_c = if c_layout == ColMajor
            c
        else
            transpose(c)
        end
    new_d = if d_layout == ColMajor
            d
        else
            transpose(d)
        end
    if do_mac
        all(isapprox.(alpha * new_a * new_b + beta * new_c, new_d; rtol = sqrt(eps(Float16))))
    else
        all(isapprox.(alpha * new_a * new_b, new_d; rtol = sqrt(eps(Float16))))
    end
end))
  CUDA error: a PTX JIT compilation failed (code 218, ERROR_INVALID_PTX)
  ptxas application ptx input, line 46; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 46; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 63; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 63; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 64; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 64; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 353; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 353; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 358; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 358; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas fatal   : Ptx assembly aborted due to errors
  Stacktrace:
   [1] CuModule(::String, ::Dict{CUDA.CUjit_option_enum,Any}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\lib\cuda\module.jl:40
   [2] _cufunction(::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:335
   [3] _cufunction at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:302 [inlined]
   [4] #75 at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:21 [inlined]
   [5] get!(::GPUCompiler.var"#75#76"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},typeof(CUDA._cufunction),GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}}, ::Dict{UInt64,Any}, ::UInt64) at .\dict.jl:452
   [6] macro expansion at .\lock.jl:183 [inlined]
   [7] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:19
   [8] + at .\int.jl:53 [inlined]
   [9] hash_64_64 at .\hashing.jl:35 [inlined]
   [10] hash_uint64 at .\hashing.jl:62 [inlined]
   [11] hx at .\float.jl:568 [inlined]
   [12] hash at .\float.jl:571 [inlined]
   [13] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:0
   [14] cached_compilation(::Function, ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:37
   [15] cufunction(::Function, ::Type; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:296
   [16] cufunction(::Function, ::Type{T} where T) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:291
   [17] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:108
   [18] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
   [19] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:227
   [20] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1186
   [21] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184
   [22] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1113
   [23] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184

Error in testset device\wmma:
Error During Test at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
  Test threw exception
  Expression: $(Expr(:escape, quote
    #= C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:228 =# @cuda threads = 32 kernel(a_dev, b_dev, c_dev, d_dev, alpha, beta)
    d = Array(d_dev)
    new_a = if a_layout == ColMajor
            a
        else
            transpose(a)
        end
    new_b = if b_layout == ColMajor
            b
        else
            transpose(b)
        end
    new_c = if c_layout == ColMajor
            c
        else
            transpose(c)
        end
    new_d = if d_layout == ColMajor
            d
        else
            transpose(d)
        end
    if do_mac
        all(isapprox.(alpha * new_a * new_b + beta * new_c, new_d; rtol = sqrt(eps(Float16))))
    else
        all(isapprox.(alpha * new_a * new_b, new_d; rtol = sqrt(eps(Float16))))
    end
end))
  CUDA error: a PTX JIT compilation failed (code 218, ERROR_INVALID_PTX)
  ptxas application ptx input, line 44; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 44; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 61; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 61; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 343; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 343; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 348; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 348; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas fatal   : Ptx assembly aborted due to errors
  Stacktrace:
   [1] CuModule(::String, ::Dict{CUDA.CUjit_option_enum,Any}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\lib\cuda\module.jl:40
   [2] _cufunction(::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:335
   [3] _cufunction at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:302 [inlined]
   [4] #75 at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:21 [inlined]
   [5] get!(::GPUCompiler.var"#75#76"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},typeof(CUDA._cufunction),GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}}, ::Dict{UInt64,Any}, ::UInt64) at .\dict.jl:452
   [6] macro expansion at .\lock.jl:183 [inlined]
   [7] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:19
   [8] + at .\int.jl:53 [inlined]
   [9] hash_64_64 at .\hashing.jl:35 [inlined]
   [10] hash_uint64 at .\hashing.jl:62 [inlined]
   [11] hx at .\float.jl:568 [inlined]
   [12] hash at .\float.jl:571 [inlined]
   [13] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:0
   [14] cached_compilation(::Function, ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:37
   [15] cufunction(::Function, ::Type; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:296
   [16] cufunction(::Function, ::Type{T} where T) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:291
   [17] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:108
   [18] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
   [19] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:227
   [20] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1186
   [21] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184
   [22] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1113
   [23] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184

ERROR: LoadError: Test run finished with errors
in expression starting at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\runtests.jl:402
ERROR: Package CUDA errored during testing

(@v1.4) pkg>
maleadt (Member) commented Jun 21, 2020

Could you provide some more details? Which CUDA version (`CUDA.version()`)? Are you using artifacts? Which GPU?
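
For reference, a minimal sketch of how to query this from the REPL with CUDA.jl's standard device functions (`CuDevice`, `name`, `capability`):

```julia
using CUDA

CUDA.version()      # CUDA toolkit version in use
dev = CuDevice(0)   # first GPU in the system
name(dev)           # GPU model
capability(dev)     # compute capability, e.g. v"6.1"
```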

finmod (Author) commented Jun 21, 2020

using CUDA
CUDA.version()
v"10.1.0"
[screenshot: Annotation 2020-06-21 175927]

maleadt (Member) commented Jun 22, 2020

Your first report mentions test failures for the `device/wmma` testset, while your second shows that these tests are skipped (`Skipping the following tests: cutensor, device/wmma`)... Are you reporting the same thing here?

Also include the actual errors; a screenshot like that isn't particularly useful.
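
For context, the WMMA instructions exercised by this testset require a GPU with compute capability 7.0 (`sm_70`) or newer, which is why the suite skips `device/wmma` on older hardware. A minimal sketch of such a guard using CUDA.jl's device API (`skip_tests` is an illustrative name, not necessarily the runner's actual variable):

```julia
using CUDA

# Skip the WMMA testset on GPUs older than sm_70 (compute capability 7.0).
skip_tests = String[]
if capability(device()) < v"7.0"
    push!(skip_tests, "device/wmma")
end
```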

maleadt added the "needs information" label Jun 22, 2020
finmod (Author) commented Jun 22, 2020

All of these outputs come from the same command, `test CUDA`, in the Pkg REPL. The top half of the report is the screenshot above, showing that `device\array` and `device\cuda` fail. The rest of the output is as follows:

ptxas application ptx input, line 348; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas fatal   : Ptx assembly aborted due to errors
  Stacktrace:
   [1] CuModule(::String, ::Dict{CUDA.CUjit_option_enum,Any}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\lib\cuda\module.jl:40
   [2] _cufunction(::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:335
   [3] _cufunction at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:302 [inlined]
   [4] #75 at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:21 [inlined]
   [5] get!(::GPUCompiler.var"#75#76"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},typeof(CUDA._cufunction),GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}}, ::Dict{UInt64,Any}, ::UInt64) at .\dict.jl:452
   [6] macro expansion at .\lock.jl:183 [inlined]
   [7] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:19
   [8] + at .\int.jl:53 [inlined]
   [9] hash_64_64 at .\hashing.jl:35 [inlined]
   [10] hash_uint64 at .\hashing.jl:62 [inlined]
   [11] hx at .\float.jl:568 [inlined]
   [12] hash at .\float.jl:571 [inlined]
   [13] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:0
   [14] cached_compilation(::Function, ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:37
   [15] cufunction(::Function, ::Type; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:296
   [16] cufunction(::Function, ::Type{T} where T) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:291
   [17] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:108
   [18] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
   [19] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:227
   [20] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1186
   [21] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184
   [22] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1113
   [23] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184

Error in testset device\wmma:
Error During Test at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
  Test threw exception
  Expression: $(Expr(:escape, quote
    #= C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:228 =# @cuda threads = 32 kernel(a_dev, b_dev, c_dev, d_dev, alpha, beta)
    d = Array(d_dev)
    new_a = if a_layout == ColMajor
            a
        else
            transpose(a)
        end
    new_b = if b_layout == ColMajor
            b
        else
            transpose(b)
        end
    new_c = if c_layout == ColMajor
            c
        else
            transpose(c)
        end
    new_d = if d_layout == ColMajor
            d
        else
            transpose(d)
        end
    if do_mac
        all(isapprox.(alpha * new_a * new_b + beta * new_c, new_d; rtol = sqrt(eps(Float16))))
    else
        all(isapprox.(alpha * new_a * new_b, new_d; rtol = sqrt(eps(Float16))))
    end
end))
  CUDA error: a PTX JIT compilation failed (code 218, ERROR_INVALID_PTX)
  ptxas application ptx input, line 46; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 46; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 63; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 63; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 64; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 64; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 353; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 353; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 358; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 358; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas fatal   : Ptx assembly aborted due to errors
  Stacktrace:
   [1] CuModule(::String, ::Dict{CUDA.CUjit_option_enum,Any}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\lib\cuda\module.jl:40
   [2] _cufunction(::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:335
   [3] _cufunction at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:302 [inlined]
   [4] #75 at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:21 [inlined]
   [5] get!(::GPUCompiler.var"#75#76"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},typeof(CUDA._cufunction),GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}}, ::Dict{UInt64,Any}, ::UInt64) at .\dict.jl:452
   [6] macro expansion at .\lock.jl:183 [inlined]
   [7] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:19
   [8] + at .\int.jl:53 [inlined]
   [9] hash_64_64 at .\hashing.jl:35 [inlined]
   [10] hash_uint64 at .\hashing.jl:62 [inlined]
   [11] hx at .\float.jl:568 [inlined]
   [12] hash at .\float.jl:571 [inlined]
   [13] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:0
   [14] cached_compilation(::Function, ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:37
   [15] cufunction(::Function, ::Type; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:296
   [16] cufunction(::Function, ::Type{T} where T) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:291
   [17] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:108
   [18] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
   [19] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:227
   [20] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1186
   [21] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184
   [22] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1113
   [23] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184

Error in testset device\wmma:
Error During Test at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
  Test threw exception
  Expression: $(Expr(:escape, quote
    #= C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:228 =# @cuda threads = 32 kernel(a_dev, b_dev, c_dev, d_dev, alpha, beta)
    d = Array(d_dev)
    new_a = if a_layout == ColMajor
            a
        else
            transpose(a)
        end
    new_b = if b_layout == ColMajor
            b
        else
            transpose(b)
        end
    new_c = if c_layout == ColMajor
            c
        else
            transpose(c)
        end
    new_d = if d_layout == ColMajor
            d
        else
            transpose(d)
        end
    if do_mac
        all(isapprox.(alpha * new_a * new_b + beta * new_c, new_d; rtol = sqrt(eps(Float16))))
    else
        all(isapprox.(alpha * new_a * new_b, new_d; rtol = sqrt(eps(Float16))))
    end
end))
  CUDA error: a PTX JIT compilation failed (code 218, ERROR_INVALID_PTX)
  ptxas application ptx input, line 44; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 44; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 61; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 61; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 343; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 343; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas application ptx input, line 348; error   : Feature 'WMMA with floating point types' requires .target sm_70 or higher
  ptxas application ptx input, line 348; error   : Modifier '.m16n16k16' requires .target sm_70 or higher
  ptxas fatal   : Ptx assembly aborted due to errors
  Stacktrace:
   [1] CuModule(::String, ::Dict{CUDA.CUjit_option_enum,Any}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\lib\cuda\module.jl:40
   [2] _cufunction(::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:335
   [3] _cufunction at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:302 [inlined]
   [4] #75 at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:21 [inlined]
   [5] get!(::GPUCompiler.var"#75#76"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},typeof(CUDA._cufunction),GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}}, ::Dict{UInt64,Any}, ::UInt64) at .\dict.jl:452
   [6] macro expansion at .\lock.jl:183 [inlined]
   [7] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:19
   [8] + at .\int.jl:53 [inlined]
   [9] hash_64_64 at .\hashing.jl:35 [inlined]
   [10] hash_uint64 at .\hashing.jl:62 [inlined]
   [11] hx at .\float.jl:568 [inlined]
   [12] hash at .\float.jl:571 [inlined]
   [13] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:0
   [14] cached_compilation(::Function, ::GPUCompiler.FunctionSpec{typeof(kernel),Tuple{CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float16,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},CuDeviceArray{Float32,2,CUDA.AS.Global},Float16,Float32}}, ::UInt64) at C:\Users\Denis\.julia\packages\GPUCompiler\lqbF2\src\cache.jl:37
   [15] cufunction(::Function, ::Type; name::Nothing, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:296
   [16] cufunction(::Function, ::Type{T} where T) at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:291
   [17] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\src\compiler\execution.jl:108
   [18] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\setup.jl:196
   [19] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:227
   [20] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1186
   [21] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184
   [22] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.4\Test\src\Test.jl:1113
   [23] top-level scope at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\device\wmma.jl:184

ERROR: LoadError: Test run finished with errors
in expression starting at C:\Users\Denis\.julia\packages\CUDA\5t6R9\test\runtests.jl:402
ERROR: Package CUDA errored during testing

(@v1.4) pkg>

maleadt (Member) commented Jun 22, 2020

Please use triple backticks to quote output.

Does your system have a single GPU?

maleadt changed the title from "Test CUDA fails" to "Test runner stumbles over path separators" Jun 22, 2020
maleadt (Member) commented Jun 22, 2020

Ah, now I see: this is Windows vs. Linux, `device/wmma` not matching `device\wmma`...
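
In other words, one side of the comparison uses forward slashes while Windows produces backslash-separated test names, so the string comparison fails. A minimal sketch of the kind of fix, normalizing separators before comparing (the names `tests` and `skip_tests` here are illustrative, not the runner's actual variables):

```julia
# Normalize Windows path separators so "device\\wmma" and "device/wmma"
# refer to the same test.
normalize_testname(name) = replace(name, '\\' => '/')

tests = ["device\\wmma", "device\\array"]  # names as discovered on Windows
skip_tests = ["device/wmma"]               # skip list written with forward slashes

filter!(t -> !(normalize_testname(t) in skip_tests), tests)
# tests == ["device\\array"]
```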

maleadt added the bug and tests labels and removed the needs information label Jun 22, 2020
finmod (Author) commented Jun 22, 2020

Most probably... but it must be a single occurrence in the code, because the other tests are fine.

maleadt reopened this Jun 24, 2020
maleadt (Member) commented Aug 25, 2020

Should be fixed again.

maleadt closed this as completed Aug 25, 2020