
LLVM error: Cannot cast between two non-generic address spaces #190

Closed
dpsanders opened this issue Jun 1, 2020 · 12 comments

@dpsanders

julia> using CUDA, StaticArrays, IntervalArithmetic
julia> const M = SA[1.0 2; 3 4];
julia> f(x) = M * x;
julia> v = cu( [ IntervalBox(1..1, 1..1) ] );
1-element CuArray{IntervalBox{2,Float64},1,Nothing}:
 [1, 1] × [1, 1]

julia> f.(v)
ERROR: LLVM error: Cannot cast between two non-generic address spaces
Stacktrace:
 [1] handle_error(::Cstring) at /home/dpsanders/.julia/packages/LLVM/wQgrk/src/core/context.jl:103
 [2] macro expansion at /home/dpsanders/.julia/packages/LLVM/wQgrk/src/base.jl:18 [inlined]
 [3] LLVMTargetMachineEmitToMemoryBuffer at /home/dpsanders/.julia/packages/LLVM/wQgrk/lib/8.0/libLLVM_h.jl:3377 [inlined]
 [4] emit(::LLVM.TargetMachine, ::LLVM.Module, ::LLVM.API.LLVMCodeGenFileType) at /home/dpsanders/.julia/packages/LLVM/wQgrk/src/targetmachine.jl:42
 [5] mcgen at /home/dpsanders/.julia/dev/GPUCompiler/src/mcgen.jl:73 [inlined]
 [6] macro expansion at /home/dpsanders/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:245 [inlined]
 [7] macro expansion at /home/dpsanders/.julia/dev/GPUCompiler/src/driver.jl:204 [inlined]
 [8] macro expansion at /home/dpsanders/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:245 [inlined]
 [9] codegen(::Symbol, ::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, strip::Bool, strict::Bool) at /home/dpsanders/.julia/dev/GPUCompiler/src/driver.jl:200
 [10] compile(::Symbol, ::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, strip::Bool, strict::Bool) at /home/dpsanders/.julia/dev/GPUCompiler/src/driver.jl:36
 [11] _cufunction(::GPUCompiler.FunctionSpec{GPUArrays.var"#26#27",Tuple{CUDA.CuKernelContext,CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(f),Tuple{Base.Broadcast.Extruded{CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/dpsanders/.julia/dev/CUDA/src/compiler/execution.jl:308
 [12] _cufunction at /home/dpsanders/.julia/dev/CUDA/src/compiler/execution.jl:302 [inlined]
 [13] #77 at /home/dpsanders/.julia/dev/GPUCompiler/src/cache.jl:21 [inlined]
 [14] get!(::GPUCompiler.var"#77#78"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},typeof(CUDA._cufunction),GPUCompiler.FunctionSpec{GPUArrays.var"#26#27",Tuple{CUDA.CuKernelContext,CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(f),Tuple{Base.Broadcast.Extruded{CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}}, ::Dict{UInt64,Any}, ::UInt64) at ./dict.jl:452
 [15] macro expansion at ./lock.jl:183 [inlined]
 [16] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{GPUArrays.var"#26#27",Tuple{CUDA.CuKernelContext,CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(f),Tuple{Base.Broadcast.Extruded{CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/dpsanders/.julia/dev/GPUCompiler/src/cache.jl:19
 [17] + at ./int.jl:53 [inlined]
 [18] hash_64_64 at ./hashing.jl:35 [inlined]
 [19] hash_uint64 at ./hashing.jl:62 [inlined]
 [20] hx at ./float.jl:568 [inlined]
 [21] hash at ./float.jl:571 [inlined]
 [22] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{GPUArrays.var"#26#27",Tuple{CUDA.CuKernelContext,CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(f),Tuple{Base.Broadcast.Extruded{CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/dpsanders/.julia/dev/GPUCompiler/src/cache.jl:0
 [23] cached_compilation(::Function, ::GPUCompiler.FunctionSpec{GPUArrays.var"#26#27",Tuple{CUDA.CuKernelContext,CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(f),Tuple{Base.Broadcast.Extruded{CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}}}}}}, ::UInt64) at /home/dpsanders/.julia/dev/GPUCompiler/src/cache.jl:37
 [24] cufunction(::Function, ::Type; name::String, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/dpsanders/.julia/dev/CUDA/src/compiler/execution.jl:296
 [25] macro expansion at /home/dpsanders/.julia/dev/CUDA/src/compiler/execution.jl:108 [inlined]
 [26] gpu_call(::CUDA.CuArrayBackend, ::Function, ::Tuple{CuArray{IntervalBox{2,Float64},1,Nothing},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(f),Tuple{Base.Broadcast.Extruded{CuArray{IntervalBox{2,Float64},1,Nothing},Tuple{Bool},Tuple{Int64}}}}}, ::Int64; name::String) at /home/dpsanders/.julia/dev/CUDA/src/gpuarrays.jl:32
 [27] #gpu_call#1 at /home/dpsanders/.julia/packages/GPUArrays/HGtNV/src/device/execution.jl:61 [inlined]
 [28] copyto! at /home/dpsanders/.julia/packages/GPUArrays/HGtNV/src/host/broadcast.jl:63 [inlined]
 [29] copyto! at ./broadcast.jl:864 [inlined]
 [30] copy at ./broadcast.jl:840 [inlined]
 [31] materialize(::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1},Nothing,typeof(f),Tuple{CuArray{IntervalBox{2,Float64},1,Nothing}}}) at ./broadcast.jl:820
 [32] top-level scope at REPL[19]:1

The following works:

julia> f(M::S, x::T) where {S,T}  = M * x;
julia> f.(Ref(M), v)
1-element CuArray{IntervalBox{2,Float64},1,Nothing}:
 [3, 3] × [7, 7]
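
Wrapping an argument in Ref makes broadcast treat it as a scalar, presumably letting M be passed to the kernel as an argument instead of being captured from global scope. A minimal sketch of the same pattern with plain numbers:

julia> g(a, x) = a * x;
julia> g.(Ref(2.0), [1.0, 2.0, 3.0])
3-element Array{Float64,1}:
 2.0
 4.0
 6.0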
@TommyXR

TommyXR commented Jun 4, 2020

I get the exact same error. It was working fine until I uninstalled some older versions of GCC, I think (unless it's a coincidence). I haven't been able to fix it; removing the artifact and creating new environments don't help. Native CUDA tests run fine, though.

I do not get this error when using the deprecated CuArrays package.

@maleadt
Member

maleadt commented Jun 4, 2020

I'm not seeing this, but instead:

ERROR: InvalidIRError: compiling kernel broadcast(CUDA.CuKernelContext, CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(f),Tuple{Base.Broadcast.Extruded{CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}}}}) resulted in invalid LLVM IR                                                                                                                                                                                                           
Reason: unsupported dynamic function invocation (call to mul_down(a::T, b::T) where T<:Union{Float32, Float64} in RoundingEmulator at /home/tim/Julia/depot/packages/RoundingEmulator/5KuEV/src/rounding.jl:252)
Stacktrace:
 [1] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/rounding.jl:129
 [2] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/rounding.jl:272
 [3] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/arithmetic.jl:101
 [4] macro expansion at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:50
 [5] _mul at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:37
 [6] * at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:8
 [7] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/multidim/arithmetic.jl:42
 [8] f at REPL[3]:1
 [9] _broadcast_getindex_evalf at broadcast.jl:631
 [10] _broadcast_getindex at broadcast.jl:604
 [11] getindex at broadcast.jl:564
 [12] #20 at /home/tim/Julia/pkg/GPUArrays/src/host/broadcast.jl:58
Reason: unsupported dynamic function invocation (call to mul_up(a::T, b::T) where T<:Union{Float32, Float64} in RoundingEmulator at /home/tim/Julia/depot/packages/RoundingEmulator/5KuEV/src/rounding.jl:206)
Stacktrace:
 [1] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/rounding.jl:129
 [2] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/rounding.jl:272
 [3] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/arithmetic.jl:101
 [4] macro expansion at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:50
 [5] _mul at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:37
 [6] * at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:8
 [7] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/multidim/arithmetic.jl:42
 [8] f at REPL[3]:1
 [9] _broadcast_getindex_evalf at broadcast.jl:631
 [10] _broadcast_getindex at broadcast.jl:604
 [11] getindex at broadcast.jl:564
 [12] #20 at /home/tim/Julia/pkg/GPUArrays/src/host/broadcast.jl:58
Reason: unsupported dynamic function invocation (call to mul_down(a::T, b::T) where T<:Union{Float32, Float64} in RoundingEmulator at /home/tim/Julia/depot/packages/RoundingEmulator/5KuEV/src/rounding.jl:252)
Stacktrace:
 [1] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/rounding.jl:129
 [2] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/rounding.jl:272
 [3] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/arithmetic.jl:103
 [4] macro expansion at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:50
 [5] _mul at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:37
 [6] * at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:8
 [7] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/multidim/arithmetic.jl:42
 [8] f at REPL[3]:1
 [9] _broadcast_getindex_evalf at broadcast.jl:631
 [10] _broadcast_getindex at broadcast.jl:604
 [11] getindex at broadcast.jl:564
 [12] #20 at /home/tim/Julia/pkg/GPUArrays/src/host/broadcast.jl:58
Reason: unsupported dynamic function invocation (call to mul_up(a::T, b::T) where T<:Union{Float32, Float64} in RoundingEmulator at /home/tim/Julia/depot/packages/RoundingEmulator/5KuEV/src/rounding.jl:206)
Stacktrace:
 [1] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/rounding.jl:129
 [2] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/rounding.jl:272
 [3] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/intervals/arithmetic.jl:103
 [4] macro expansion at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:50
 [5] _mul at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:37
 [6] * at /home/tim/Julia/depot/packages/StaticArrays/mlIi1/src/matrix_multiply.jl:8
 [7] * at /home/tim/Julia/depot/packages/IntervalArithmetic/DPeAx/src/multidim/arithmetic.jl:42
 [8] f at REPL[3]:1
 [9] _broadcast_getindex_evalf at broadcast.jl:631
 [10] _broadcast_getindex at broadcast.jl:604
 [11] getindex at broadcast.jl:564
 [12] #20 at /home/tim/Julia/pkg/GPUArrays/src/host/broadcast.jl:58

with Julia 1.3 and 1.4, and with both CUDA#master and the released v0.1.
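
These "unsupported dynamic function invocation" reports come from GPUCompiler's IR validator. The underlying type instability can usually be inspected interactively; a sketch, assuming CUDA.jl's @device_code_warntype reflection macro applies to the broadcast:

julia> using CUDA
julia> CUDA.@device_code_warntype f.(v)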

@dpsanders
Author

Hmm, I just ran it in a clean environment and now I get

ERROR: InvalidIRError: compiling kernel broadcast(CUDA.CuKernelContext, CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(f),Tuple{Base.Broadcast.Extruded{CuDeviceArray{IntervalBox{2,Float64},1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}}}}) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to gpu_gc_pool_alloc)
Stacktrace:
 [1] * at /home/dpsanders/.julia/packages/IntervalArithmetic/DPeAx/src/intervals/arithmetic.jl:0

so that looks like it's my fault.

@dpsanders
Author

I don't see anything in the code that looks suspicious, though. I can't check in detail right now; I'll get back to this when I can.

@maleadt
Member

maleadt commented Jun 4, 2020

Reason: unsupported call to an unknown function (call to gpu_gc_pool_alloc)

That one is fixed on CUDA#master. I guess it also includes the mul_down frames?
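
For reference, a sketch of switching to the development version via the Pkg API (as the comments below show, at that point the unreleased dependencies also required using the repository's pinned Manifest):

julia> using Pkg
julia> Pkg.add(PackageSpec(name="CUDA", rev="master"))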

@maleadt
Member

maleadt commented Jun 4, 2020

Just tested with CuArrays/CUDAnative, and also get these compilation failures (not the LLVM error).

@TommyXR do you have another reproducer?

@TommyXR

TommyXR commented Jun 4, 2020

So with CuArrays, I get no problem. I can install the package, use functions like CuArrays.zeros, and perform operations on CuArrays.

With CUDA 0.1.0, I can create arrays and modify them with scalar indexing, but as soon as I try any broadcasting operation on them (like adding them with x .+ x, comparing with x .> 4, ...), it breaks with the exact same error as in the original post.

I am on Julia 1.4.1. I tried using the CUDA master branch, but I get an unsatisfiable requirement, even with a clean environment.

ERROR: Unsatisfiable requirements detected for package GPUCompiler [61eb1bfa]:
 GPUCompiler [61eb1bfa] log:
 ├─possible versions are: [0.1.0, 0.2.0, 0.3.0] or uninstalled
 └─restricted to versions 0.4 by CUDA [052768ef] — no versions left
   └─CUDA [052768ef] log:
     ├─possible versions are: 1.0.0 or uninstalled
     └─CUDA [052768ef] is fixed to version 1.0.0

I tried installing GPUCompiler from its master branch as well, but then I get even more unsatisfiable requirements, involving Adapt and GPUArrays.

@maleadt
Member

maleadt commented Jun 5, 2020

I am on Julia 1.4.1. I tried using the CUDA master branch, but I get an unsatisfiable requirement, even with a clean environment.

You need to use the Manifest.
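
That is, work inside the repository's own project environment so its pinned Manifest.toml resolves the unreleased dependencies. A sketch, with the path standing in for wherever you cloned CUDA.jl:

julia> using Pkg
julia> Pkg.activate("/path/to/CUDA.jl")  # local checkout of the CUDA.jl repository
julia> Pkg.instantiate()                 # install the exact versions pinned in its Manifest.toml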

With CUDA 0.1.0, I can create arrays and modify them with scalar indexing, but as soon as I try any broadcasting operation on them (like adding them with x .+ x, comparing with x .> 4, ...), it breaks with the exact same error as in the original post.

That's strange, as we have CI for Julia 1.4 so that ought to work just fine. Please provide more details: platform, exact Julia version (where you got it from), a minimal reproducer, other package versions, etc.
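
Most of that information can be gathered in one go with standard tooling; a minimal sketch:

julia> versioninfo()                # Julia version, platform, and the bundled LLVM
julia> using Pkg; Pkg.status()      # exact versions of packages in the active environment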

@TommyXR

TommyXR commented Jun 7, 2020

Sorry for the delayed reply; I didn't have access to a computer yesterday.

Here are the specs of my machine:

  • OS: Debian Bullseye, 64-bit (multiarch enabled, but everything listed here is 64-bit)
  • Kernel: 5.6.0-2-amd64
  • GPU: Nvidia RTX 2060 Super
  • CPU: AMD Ryzen 2600
  • Nvidia driver: 440.82-2, from apt
  • Julia installation: 1.4.1+dfsg-1, from the main apt repository
  • CUDA version: 10.2.89, from the artifact (10.1 installed on my machine, working with raw CUDA and tensorflow-gpu)
  • LLVM version: 8 bundled with Julia; 8 and 9 installed on my machine

Using the CUDA#master manifest, you can just pull it and run using CUDA; CuArray(1:100) .+ 1 to get the error message copied below. Again, this doesn't happen when using the CuArrays package (which uses the same CUDA artifact); there everything seems to work fine.

ERROR: LLVM error: Cannot cast between two non-generic address spaces
Stacktrace:
 [1] handle_error(::Cstring) at /home/tommyxr/.julia/packages/LLVM/wQgrk/src/core/context.jl:103
 [2] macro expansion at /home/tommyxr/.julia/packages/LLVM/wQgrk/src/base.jl:18 [inlined]
 [3] LLVMTargetMachineEmitToMemoryBuffer at /home/tommyxr/.julia/packages/LLVM/wQgrk/lib/8.0/libLLVM_h.jl:3377 [inlined]
 [4] emit(::LLVM.TargetMachine, ::LLVM.Module, ::LLVM.API.LLVMCodeGenFileType) at /home/tommyxr/.julia/packages/LLVM/wQgrk/src/targetmachine.jl:42
 [5] mcgen at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/mcgen.jl:73 [inlined]
 [6] macro expansion at /home/tommyxr/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:245 [inlined]
 [7] macro expansion at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/driver.jl:228 [inlined]
 [8] macro expansion at /home/tommyxr/.julia/packages/TimerOutputs/NvIUx/src/TimerOutput.jl:245 [inlined]
 [9] codegen(::Symbol, ::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, strip::Bool, validate::Bool) at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/driver.jl:224
 [10] compile(::Symbol, ::GPUCompiler.CompilerJob{GPUCompiler.PTXCompilerTarget,CUDA.CUDACompilerParams}; libraries::Bool, deferred_codegen::Bool, optimize::Bool, strip::Bool, validate::Bool) at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/driver.jl:36
 [11] compile at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/driver.jl:32 [inlined]
 [12] _cufunction(::GPUCompiler.FunctionSpec{GPUArrays.var"#20#21",Tuple{CUDA.CuKernelContext,CuDeviceArray{Int64,1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(+),Tuple{Base.Broadcast.Extruded{CuDeviceArray{Int64,1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}},Int64}}}}; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/tommyxr/.julia/packages/CUDA/bSp7v/src/compiler/execution.jl:308
 [13] _cufunction at /home/tommyxr/.julia/packages/CUDA/bSp7v/src/compiler/execution.jl:302 [inlined]
 [14] #85 at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/cache.jl:21 [inlined]
 [15] get!(::GPUCompiler.var"#85#86"{Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},typeof(CUDA._cufunction),GPUCompiler.FunctionSpec{GPUArrays.var"#20#21",Tuple{CUDA.CuKernelContext,CuDeviceArray{Int64,1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(+),Tuple{Base.Broadcast.Extruded{CuDeviceArray{Int64,1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}},Int64}}}}}, ::Dict{UInt64,Any}, ::UInt64) at ./dict.jl:452
 [16] macro expansion at ./lock.jl:183 [inlined]
 [17] check_cache(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{GPUArrays.var"#20#21",Tuple{CUDA.CuKernelContext,CuDeviceArray{Int64,1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(+),Tuple{Base.Broadcast.Extruded{CuDeviceArray{Int64,1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}},Int64}}}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/cache.jl:19
 [18] + at ./int.jl:53 [inlined]
 [19] hash_64_64 at ./hashing.jl:35 [inlined]
 [20] hash_uint64 at ./hashing.jl:62 [inlined]
 [21] hx at ./float.jl:568 [inlined]
 [22] hash at ./float.jl:571 [inlined]
 [23] cached_compilation(::typeof(CUDA._cufunction), ::GPUCompiler.FunctionSpec{GPUArrays.var"#20#21",Tuple{CUDA.CuKernelContext,CuDeviceArray{Int64,1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(+),Tuple{Base.Broadcast.Extruded{CuDeviceArray{Int64,1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}},Int64}}}}, ::UInt64; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/cache.jl:0
 [24] cached_compilation(::Function, ::GPUCompiler.FunctionSpec{GPUArrays.var"#20#21",Tuple{CUDA.CuKernelContext,CuDeviceArray{Int64,1,CUDA.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(+),Tuple{Base.Broadcast.Extruded{CuDeviceArray{Int64,1,CUDA.AS.Global},Tuple{Bool},Tuple{Int64}},Int64}}}}, ::UInt64) at /home/tommyxr/.julia/packages/GPUCompiler/3BNIj/src/cache.jl:37
 [25] cufunction(::Function, ::Type; name::String, kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/tommyxr/.julia/packages/CUDA/bSp7v/src/compiler/execution.jl:296
 [26] macro expansion at /home/tommyxr/.julia/packages/CUDA/bSp7v/src/compiler/execution.jl:108 [inlined]
 [27] gpu_call(::CUDA.CuArrayBackend, ::Function, ::Tuple{CuArray{Int64,1,Nothing},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(+),Tuple{Base.Broadcast.Extruded{CuArray{Int64,1,Nothing},Tuple{Bool},Tuple{Int64}},Int64}}}, ::Int64; name::String) at /home/tommyxr/.julia/packages/CUDA/bSp7v/src/gpuarrays.jl:32
 [28] #gpu_call#1 at /home/tommyxr/.julia/packages/GPUArrays/w0xGN/src/device/execution.jl:61 [inlined]
 [29] copyto! at /home/tommyxr/.julia/packages/GPUArrays/w0xGN/src/host/broadcast.jl:56 [inlined]
 [30] copyto! at ./broadcast.jl:864 [inlined]
 [31] copy at ./broadcast.jl:840 [inlined]
 [32] materialize(::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1},Nothing,typeof(+),Tuple{CuArray{Int64,1,Nothing},Int64}}) at ./broadcast.jl:820
 [33] top-level scope at REPL[7]:1

@maleadt
Member

maleadt commented Jun 8, 2020

* Julia installation: 1.4.1+dfsg-1, from the main apt repository

Try with an official binary Julia from https://julialang.org/downloads/.

@TommyXR

TommyXR commented Jun 8, 2020

So everything works fine with the 1.4.2 binary. Actually, it even works with the 1.4.1 one (which is the version I have with apt).

Looks like either apt ships a broken package or my Julia install broke for whatever reason. I'll look into it on my side, but I guess it isn't related to CUDA, so the issue can probably be closed.

Thanks a lot for the help!

@maleadt
Member

maleadt commented Jun 9, 2020

Our LLVM is heavily patched, and distributions often do not use exactly the same patchset, which might explain this error. Glad it got resolved!
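
One quick check of which libLLVM a Julia build links against (this reports the version, though not whether the distribution's patchset matches ours):

julia> Base.libllvm_version   # LLVM version the running Julia was built against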
