
Add math ops (including broadcast) for half types #581

Closed · DrChainsaw opened this issue Nov 29, 2020 · 2 comments
Labels: enhancement (New feature or request)

@DrChainsaw

Math ops like sqrt and log do not seem to be implemented for half-precision types (e.g. Float16 and BFloat16):

julia> sqrt.(cu(Float16[1,2]))
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`
└ @ GPUArrays E:\Programs\julia\.julia\packages\GPUArrays\jhRU7\src\host\indexing.jl:43
ERROR: MethodError: no method matching sqrt(::Float16)
You may have intended to import Base.sqrt
Closest candidates are:
  sqrt(::Float32) at E:\Programs\julia\.julia\packages\CUDA\YeS8q\src\device\intrinsics\math.jl:193
  sqrt(::Float64) at E:\Programs\julia\.julia\packages\CUDA\YeS8q\src\device\intrinsics\math.jl:192
Stacktrace:
 [1] _broadcast_getindex_evalf at .\broadcast.jl:648 [inlined]
 [2] _broadcast_getindex at .\broadcast.jl:621 [inlined]      
 [3] getindex at .\broadcast.jl:575 [inlined]
 [4] copy at .\broadcast.jl:876 [inlined]
 [5] materialize(::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1},Nothing,typeof(CUDA.sqrt),Tuple{CuArray{Float16,1}}}) at .\broadcast.jl:837
 [6] top-level scope at REPL[2]:1

julia> sqrt.(cu(CUDA.BFloat16s.BFloat16[1,2]))
┌ Warning: Performing scalar operations on GPU arrays: This is very slow, consider disallowing these operations with `allowscalar(false)`
└ @ GPUArrays E:\Programs\julia\.julia\packages\GPUArrays\jhRU7\src\host\indexing.jl:43
ERROR: MethodError: no method matching sqrt(::BFloat16s.BFloat16)
You may have intended to import Base.sqrt
Closest candidates are:
  sqrt(::Float32) at E:\Programs\julia\.julia\packages\CUDA\YeS8q\src\device\intrinsics\math.jl:193
  sqrt(::Float64) at E:\Programs\julia\.julia\packages\CUDA\YeS8q\src\device\intrinsics\math.jl:192
Stacktrace:
 [1] _broadcast_getindex_evalf at .\broadcast.jl:648 [inlined]
 [2] _broadcast_getindex at .\broadcast.jl:621 [inlined]
 [3] getindex at .\broadcast.jl:575 [inlined]
 [4] copy at .\broadcast.jl:876 [inlined]
 [5] materialize(::Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1},Nothing,typeof(CUDA.sqrt),Tuple{CuArray{BFloat16s.BFloat16,1}}}) at .\broadcast.jl:837
 [6] top-level scope at REPL[10]:1

Describe the solution you'd like
That correct values are returned for these ops on half-precision element types, including through broadcast.
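
For concreteness, the desired behavior would look something like this (illustrative session; sqrt(2) rounds to 1.414 in Float16):

julia> sqrt.(cu(Float16[1, 2]))
2-element CuArray{Float16,1}:
 1.0
 1.414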

Describe alternatives you've considered

I tried my luck by defining @inline CUDA.sqrt(x::Float16) = ccall("extern hsqrt", llvmcall, Float16, (Float16,), x) after finding this doc, but it seems something more is required (maybe the header is not included?):

julia> sqrt.(cu(Float16[1,2]))
ERROR: InvalidIRError: compiling kernel broadcast_kernel(CUDA.CuKernelContext, CuDeviceArray{Float16,1,1}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(CUDA.sqrt),Tuple{Base.Broadcast.Extruded{CuDeviceArray{Float16,1,1},Tuple{Bool},Tuple{Int64}}}}, Int64) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to hsqrt)

Additional context

I might be able to submit a PR if I get some pointers as to what is required; I'll poke around a bit in the meantime to see if I can spot something. I'm not sure about the broadcast part, though: would it solve itself once the correct math functions are defined? (See the sketch below.)
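
(An observation inferred from the stacktraces above, not from CUDA.jl's sources: the Broadcasted type already contains typeof(CUDA.sqrt), so broadcasting over a CuArray appears to rewrite Base.sqrt to the device intrinsic before dispatch. If that holds, broadcast should indeed solve itself once the scalar device method exists. A minimal sketch of the missing piece, using a Float32 round-trip as a placeholder for a native half intrinsic:)

using CUDA

# Hypothetical method; the real one would presumably live next to the
# Float32/Float64 intrinsics in src/device/intrinsics/math.jl. The cast
# through Float32 is a placeholder, not CUDA.jl's actual implementation.
@inline CUDA.sqrt(x::Float16) = Float16(CUDA.sqrt(Float32(x)))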

DrChainsaw added the enhancement (New feature or request) label on Nov 29, 2020
@DrChainsaw (Author)

Ok, I pretty much forgot that this issue exists: #391

@maleadt (Member) commented Nov 30, 2020

Yeah these are tricky, and nobody is actively working on them right now. The CUDA implementations often just cast to Float32 though, so you could do the same for now.
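
A minimal sketch of that stopgap on the user side, routing half-precision math through the Float32 device intrinsics seen in the stacktraces above. The set of ops, the loop, and whether the BFloat16 conversions compile cleanly on the device are assumptions for illustration, not what CUDA.jl ships:

using CUDA

# Forward each half-precision op to its existing Float32 device intrinsic
# and convert back. BFloat16 is reached via the CUDA.BFloat16s module
# mentioned in the report above.
for T in (Float16, CUDA.BFloat16s.BFloat16), f in (:sqrt, :log)
    @eval @inline CUDA.$f(x::$T) = $T(CUDA.$f(Float32(x)))
end

sqrt.(cu(Float16[1, 2]))  # broadcast now finds a Float16 method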
