I just wanted to make an issue to track this.
So, I was able to get part of the way here with a few manual overloads, and could even differentiate through sum(x) for x a CuArray. However, it didn't work for sum(cos.(x)), which I guess compiles a special kernel. The error from that case looks harder to handle with an overload (included below).
You can try this out on a GPU here: https://colab.research.google.com/drive/1H1FzBaahClBOPO-q09vGr8b5qYWc0cgX?usp=sharing
Here are the rules I manually defined within this notebook:
using CUDA, Mooncake, DifferentiationInterface, GPUArrays
using Mooncake:
    @is_primitive, @zero_adjoint, DefaultCtx, CoDual, NoPullback, NoRData,
    primal, tangent, zero_fcodual

@zero_adjoint(
    DefaultCtx,
    Tuple{typeof(Mooncake.lgetfield),CuArray,Val{:dims}}
)
@zero_adjoint(
    DefaultCtx,
    Tuple{Type{<:CuArray},UndefInitializer,Vararg}
)
@zero_adjoint(
    DefaultCtx,
    Tuple{typeof(Base.mightalias),CuArray,CuArray}
)
@zero_adjoint(
    DefaultCtx,
    Tuple{typeof(Base.unsafe_convert),Type{<:CuDeviceVector},CuArray}
)
@is_primitive(
    DefaultCtx,
    Tuple{typeof(mapreduce),typeof(identity),typeof(Base.add_sum),CuArray}
)

function Mooncake.rrule!!(
    ::CoDual{typeof(mapreduce)},
    ::CoDual{typeof(identity)},
    ::CoDual{typeof(Base.add_sum)},
    A::CoDual{<:CuArray}
)
    y = zero_fcodual(sum(primal(A)))
    pullback(dy) = (tangent(A) .+= dy; ntuple(_ -> NoRData(), 4))
    return y, pullback
end
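For reference, here is a minimal sketch of how the plain-sum case can be checked with just the rules above. This assumes a CUDA-capable device; g and x_check are illustrative names that don't appear elsewhere in this issue.

g(z) = sum(z)                                   # plain sum: hits the manual mapreduce rule above
x_check = cu(randn(512))
grad = DifferentiationInterface.gradient(g, AutoMooncake(), x_check)
# d(sum(z))/dz is one for every element, so the gradient should be all ones:
@assert isapprox(Array(grad), ones(Float32, length(x_check)))
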
I then executed:

x = randn(512)
x_device = cu(x)
f(z) = sum(cos.(z))  # n.b., works with sum(z)!
prep = prepare_gradient(f, AutoMooncake(), x_device)

However, the next error looks a bit trickier, and I worry that it requires handling tasks:
Mooncake.build_rrule(Mooncake.MooncakeInterpreter(), Tuple{CUDA.var"##cufunction#1206", Base.Pairs{Symbol, Union{Nothing, Bool}, Tuple{Symbol, Symbol}, @NamedTuple{always_inline::Bool, maxthreads::Nothing}}, typeof(cufunction), GPUArrays.var"#gpu_broadcast_kernel_linear#38", Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}}}}; debug_mode=false)
Stacktrace:
[1] build_rrule(interp::Mooncake.MooncakeInterpreter{DefaultCtx}, sig_or_mi::Type; debug_mode::Bool, silence_debug_messages::Bool)
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:1136
[2] build_rrule
@ ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:1077 [inlined]
[3] (::Mooncake.DynamicDerivedRule{Dict{Any, Any}})(::CoDual{CUDA.var"##cufunction#1206", Mooncake.NoFData}, ::CoDual{@Kwargs{always_inline::Bool, maxthreads::Nothing}, Mooncake.NoFData}, ::CoDual{typeof(cufunction), Mooncake.NoFData}, ::CoDual{GPUArrays.var"#gpu_broadcast_kernel_linear#38", Mooncake.NoFData}, ::CoDual{Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}}}, Mooncake.NoFData})
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:1736
[4] cufunction
@ ~/.julia/packages/CUDA/ja0IX/src/compiler/execution.jl:365 [inlined]
[5] (::Tuple{Mooncake.Stack{Int32}, Base.RefValue{Tuple{Mooncake.LazyZeroRData{typeof(Core.kwcall), Nothing}, Mooncake.LazyZeroRData{@NamedTuple{always_inline::Bool, maxthreads::Nothing}, Nothing}, Mooncake.LazyZeroRData{typeof(cufunction), Nothing}, Mooncake.LazyZeroRData{GPUArrays.var"#gpu_broadcast_kernel_linear#38", Nothing}, Mooncake.LazyZeroRData{Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}}}, Nothing}}}, CoDual{Tuple{Symbol, Symbol}, Mooncake.NoFData}, Mooncake.DynamicDerivedRule{Dict{Any, Any}}, Mooncake.Stack{Tuple{Any, Any}}})(none::CoDual{typeof(Core.kwcall), Mooncake.NoFData}, none::CoDual{@NamedTuple{always_inline::Bool, maxthreads::Nothing}, Mooncake.NoFData}, none::CoDual{typeof(cufunction), Mooncake.NoFData}, none::CoDual{GPUArrays.var"#gpu_broadcast_kernel...
@ Base.Experimental ./<missing>:0
[6] (::Mooncake.DerivedRule{Tuple{typeof(Core.kwcall), @NamedTuple{always_inline::Bool, maxthreads::Nothing}, typeof(cufunction), GPUArrays.var"#gpu_broadcast_kernel_linear#38", Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}}}}, Tuple{CoDual{typeof(Core.kwcall), Mooncake.NoFData}, CoDual{@NamedTuple{always_inline::Bool, maxthreads::Nothing}, Mooncake.NoFData}, CoDual{typeof(cufunction), Mooncake.NoFData}, CoDual{GPUArrays.var"#gpu_broadcast_kernel_linear#38", Mooncake.NoFData}, CoDual{Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneT...
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:966
[7] (::Mooncake.DynamicDerivedRule{Dict{Any, Any}})(::CoDual{typeof(Core.kwcall), Mooncake.NoFData}, ::CoDual{@NamedTuple{always_inline::Bool, maxthreads::Nothing}, Mooncake.NoFData}, ::CoDual{typeof(cufunction), Mooncake.NoFData}, ::CoDual{GPUArrays.var"#gpu_broadcast_kernel_linear#38", Mooncake.NoFData}, ::CoDual{Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}}}, Mooncake.NoFData})
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:1739
[8] (::Mooncake.RRuleZeroWrapper{Mooncake.DynamicDerivedRule{Dict{Any, Any}}})(::CoDual{typeof(Core.kwcall), Mooncake.NoFData}, ::CoDual{@NamedTuple{always_inline::Bool, maxthreads::Nothing}, Mooncake.NoFData}, ::CoDual{typeof(cufunction), Mooncake.NoFData}, ::CoDual{GPUArrays.var"#gpu_broadcast_kernel_linear#38", Mooncake.NoFData}, ::CoDual{Type{Tuple{KernelAbstractions.CompilerMetadata{KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicCheck, Nothing, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, KernelAbstractions.NDIteration.NDRange{1, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}, CartesianIndices{1, Tuple{Base.OneTo{Int64}}}}}, CuDeviceVector{Float32, 1}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuDeviceVector{Float32, 1}, Tuple{Bool}, Tuple{Int64}}}}}}, Mooncake.NoFData})
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:302
[9] #_#4
@ ~/.julia/packages/CUDA/ja0IX/src/CUDAKernels.jl:109 [inlined]
[10] (::Tuple{Mooncake.Stack{Int32}, Base.RefValue{Tuple{Mooncake.LazyZeroRData{CUDA.CUDAKernels.var"##_#4", Nothing}, Mooncake.LazyZeroRData{Tuple{Int64}, Nothing}, Mooncake.LazyZeroRData{Nothing, Nothing}, Mooncake.LazyZeroRData{KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Nothing}, Mooncake.LazyZeroRData{Tuple{CuArray{Float32, 1, CUDA.DeviceMemory}, Vararg{Any}}, Any}}}, Mooncake.RRuleZeroWrapper{Mooncake.DynamicDerivedRule{Dict{Any, Any}}}, Mooncake.RRuleZeroWrapper{Mooncake.DynamicDerivedRule{Dict{Any, Any}}}, Mooncake.RRuleZeroWrapper{Mooncake.DynamicDerivedRule{Dict{Any, Any}}}, Mooncake.DynamicDerivedRule{Dict{Any, Any}}, Mooncake.RRuleZeroWrapper{Mooncake.DynamicDerivedRule{Dict{Any, Any}}}, Mooncake.LazyDerivedRule{Tuple{CUDA.var"##launch_configuration#1014", Int64, Int64, typeof(launch_configuration), CuFunction}, Mooncake.DerivedRule{Tuple{CUDA.var"##launch_configuration#1014", Int64, Int64, typeof(launch_configuration), CuFunction}, Tuple{CoDual{CUDA.var"##launch_configuration#1014", Mooncake.NoFData}, CoDual{Int64, Mooncake.NoFData}, CoDual{Int64, Mooncake.NoFData}, CoDual{typeof(launch_configuration), Mooncake.NoFData}, CoDual{CuFunction, Mooncake.FData{@NamedTuple{handle::Ptr{Mooncake.NoTangent}, mod::Mooncake.MutableTangent{@NamedTuple{handle::Ptr{Mooncake.NoTangent}, ctx::Mooncake.Tangent{@NamedTuple{handle::Ptr{Moonc...
@ Base.Experimental ./<missing>:0
[11] DerivedRule
@ ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:966 [inlined]
[12] _build_rule!(rule::Mooncake.LazyDerivedRule{Tuple{CUDA.CUDAKernels.var"##_#4", Tuple{Int64}, Nothing, KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, CuArray{Float32, 1, CUDA.DeviceMemory}, Vararg{Any}}, Mooncake.DerivedRule{Tuple{CUDA.CUDAKernels.var"##_#4", Tuple{Int64}, Nothing, KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Tuple{CuArray{Float32, 1, CUDA.DeviceMemory}, Vararg{Any}}}, Tuple{CoDual{CUDA.CUDAKernels.var"##_#4", Mooncake.NoFData}, CoDual{Tuple{Int64}, Mooncake.NoFData}, CoDual{Nothing, Mooncake.NoFData}, CoDual{KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Mooncake.NoFData}, CoDual}, CoDual{Nothing, Mooncake.NoFData}, Tuple{NoRData}, Tuple{NoRData, NoRData, NoRData, NoRData, Any}, true, Val{5}}}, args::Tuple{CoDual{CUDA.CUDAKernels.var"##_#4", Mooncake.NoFData}, CoDual{Tuple{Int64}, Mooncake.NoFData}, CoDual{Nothing, Mooncake.NoFData}, CoDual{KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_...
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:1827
[13] LazyDerivedRule
@ ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:1822 [inlined]
[14] Kernel
@ ~/.julia/packages/CUDA/ja0IX/src/CUDAKernels.jl:108 [inlined]
[15] (::Tuple{Mooncake.Stack{Int32}, Base.RefValue{Tuple{Mooncake.LazyZeroRData{typeof(Core.kwcall), Nothing}, Mooncake.LazyZeroRData{@NamedTuple{ndrange::Tuple{Int64}}, Nothing}, Mooncake.LazyZeroRData{KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Nothing}, Mooncake.LazyZeroRData{Tuple{CuArray{Float32, 1, CUDA.DeviceMemory}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuArray{Float32, 1, CUDA.DeviceMemory}, Tuple{Bool}, Tuple{Int64}}}}}, Nothing}}}, Mooncake.LazyDerivedRule{Tuple{CUDA.CUDAKernels.var"##_#4", Tuple{Int64}, Nothing, KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, CuArray{Float32, 1, CUDA.DeviceMemory}, Vararg{Any}}, Mooncake.DerivedRule{Tuple{CUDA.CUDAKernels.var"##_#4", Tuple{Int64}, Nothing, KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Tuple{CuArray{Float32, 1, CUDA.DeviceMemory}, Vararg{Any}}}, Tuple{CoDual{CUDA.CUDAKernels.var"##_#4", Mooncake.NoFData}, CoDual{Tuple{Int64}, Mooncake.NoFData}, CoDual{Nothing, Mooncake.NoFData}, CoDual{KernelAbstractions.Kernel{CUDABackend,...
@ Base.Experimental ./<missing>:0
[16] (::Mooncake.DerivedRule{Tuple{typeof(Core.kwcall), @NamedTuple{ndrange::Tuple{Int64}}, KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Tuple{CuArray{Float32, 1, CUDA.DeviceMemory}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuArray{Float32, 1, CUDA.DeviceMemory}, Tuple{Bool}, Tuple{Int64}}}}}}, Tuple{CoDual{typeof(Core.kwcall), Mooncake.NoFData}, CoDual{@NamedTuple{ndrange::Tuple{Int64}}, Mooncake.NoFData}, CoDual{KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Mooncake.NoFData}, CoDual{Tuple{CuArray{Float32, 1, CUDA.DeviceMemory}, Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuArray{Float32, 1, CUDA.DeviceMemory}, Tuple{Bool}, Tuple{Int64}}}}}, Tuple{CuArray{Float32, 1, CUDA.DeviceMemory}, Mooncake.FData{@NamedTuple{style::Mooncake.NoFData, f::Mooncake.NoFData, args::Tuple{Mooncake.FData{@NamedTuple{x::CuArray{Float32, 1, CUDA.DeviceMemory}, keeps::Mooncake.NoFData, defaults::Mooncake.NoFData}}}, axes::Mooncake.NoFData}}}}}, CoDual{Nothing, Mooncake.NoFData}, Tuple{NoRData}, NTuple{4, NoRData}, true, Val{4}})(...
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:966
[17] (::Mooncake.DynamicDerivedRule{Dict{Any, Any}})(::CoDual{typeof(Core.kwcall), Mooncake.NoFData}, ::CoDual{@NamedTuple{ndrange::Tuple{Int64}}, Mooncake.NoFData}, ::CoDual{KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Mooncake.NoFData}, ::CoDual{CuArray{Float32, 1, CUDA.DeviceMemory}, CuArray{Float32, 1, CUDA.DeviceMemory}}, ::CoDual{Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuArray{Float32, 1, CUDA.DeviceMemory}, Tuple{Bool}, Tuple{Int64}}}}, Mooncake.FData{@NamedTuple{style::Mooncake.NoFData, f::Mooncake.NoFData, args::Tuple{Mooncake.FData{@NamedTuple{x::CuArray{Float32, 1, CUDA.DeviceMemory}, keeps::Mooncake.NoFData, defaults::Mooncake.NoFData}}}, axes::Mooncake.NoFData}}})
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:1739
[18] (::Mooncake.RRuleZeroWrapper{Mooncake.DynamicDerivedRule{Dict{Any, Any}}})(::CoDual{typeof(Core.kwcall), Mooncake.NoFData}, ::CoDual{@NamedTuple{ndrange::Tuple{Int64}}, Mooncake.NoFData}, ::CoDual{KernelAbstractions.Kernel{CUDABackend, KernelAbstractions.NDIteration.DynamicSize, KernelAbstractions.NDIteration.DynamicSize, GPUArrays.var"#gpu_broadcast_kernel_linear#38"}, Mooncake.NoFData}, ::CoDual{CuArray{Float32, 1, CUDA.DeviceMemory}, CuArray{Float32, 1, CUDA.DeviceMemory}}, ::CoDual{Base.Broadcast.Broadcasted{CUDA.CuArrayStyle{1, CUDA.DeviceMemory}, Tuple{Base.OneTo{Int64}}, typeof(cos), Tuple{Base.Broadcast.Extruded{CuArray{Float32, 1, CUDA.DeviceMemory}, Tuple{Bool}, Tuple{Int64}}}}, Mooncake.FData{@NamedTuple{style::Mooncake.NoFData, f::Mooncake.NoFData, args::Tuple{Mooncake.FData{@NamedTuple{x::CuArray{Float32, 1, CUDA.DeviceMemory}, keeps::Mooncake.NoFData, defaults::Mooncake.NoFData}}}, axes::Mooncake.NoFData}}})
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:302
[19] f
@ ./In[6]:1 [inlined]
[20] (::Tuple{Mooncake.Stack{Int32}, Base.RefValue{Tuple{Mooncake.LazyZeroRData{typeof(f), Nothing}, Mooncake.LazyZeroRData{CuArray{Float32, 1, CUDA.DeviceMemory}, Nothing}}}, Mooncake.LazyDerivedRule{Tuple{typeof(unsafe_copyto!), CuArray{Float32, 1, CUDA.DeviceMemory}, Int64, CuArray{Float32, 1, CUDA.DeviceMemory}, Int64, Int64}, Mooncake.DerivedRule{Tuple{typeof(unsafe_copyto!), CuArray{Float32, 1, CUDA.DeviceMemory}, Int64, CuArray{Float32, 1, CUDA.DeviceMemory}, Int64, Int64}, Tuple{CoDual{typeof(unsafe_copyto!), Mooncake.NoFData}, CoDual{CuArray{Float32, 1, CUDA.DeviceMemory}, CuArray{Float32, 1, CUDA.DeviceMemory}}, CoDual{Int64, Mooncake.NoFData}, CoDual{CuArray{Float32, 1, CUDA.DeviceMemory}, CuArray{Float32, 1, CUDA.DeviceMemory}}, CoDual{Int64, Mooncake.NoFData}, CoDual{Int64, Mooncake.NoFData}}, CoDual{CuArray{Float32, 1, CUDA.DeviceMemory}, CuArray{Float32, 1, CUDA.DeviceMemory}}, Tuple{NoRData}, NTuple{6, NoRData}, false, Val{6}}}, CoDual{Symbol, Mooncake.NoFData}, CoDual{Symbol, Mooncake.NoFData}, CoDual{Symbol, Mooncake.NoFData}, Mooncake.RRuleZeroWrapper{Mooncake.DynamicDerivedRule{Dict{Any, Any}}}, Mooncake.RRuleZeroWrapper{Mooncake.DynamicDerivedRule{Dict{Any, Any}}}, Mooncake.LazyDerivedRule{Tuple{typeof(Base.Broadcast.throwdm), Tuple{Base.OneTo{Int64}}, Tuple{Base.OneTo{Int64}}}, Mooncake.DerivedRule{Tuple{typeof(Base.Broadcast.throwdm), Tuple{Base.OneTo{Int64}}, Tuple{Base.OneTo{Int64}}}, Tuple{CoDual{typeof(Base.Broadcast.throwdm), Mooncake.N...
@ Base.Experimental ./<missing>:0
[21] (::Mooncake.DerivedRule{Tuple{typeof(f), CuArray{Float32, 1, CUDA.DeviceMemory}}, Tuple{CoDual{typeof(f), Mooncake.NoFData}, CoDual{CuArray{Float32, 1, CUDA.DeviceMemory}, CuArray{Float32, 1, CUDA.DeviceMemory}}}, CoDual{Float32, Mooncake.NoFData}, Tuple{Float32}, Tuple{NoRData, NoRData}, false, Val{2}})(::CoDual{typeof(f), Mooncake.NoFData}, ::CoDual{CuArray{Float32, 1, CUDA.DeviceMemory}, CuArray{Float32, 1, CUDA.DeviceMemory}})
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interpreter/s2s_reverse_mode_ad.jl:966
[22] prepare_gradient_cache(::Function, ::Vararg{Any}; kwargs::@Kwargs{debug_mode::Bool, silence_debug_messages::Bool})
@ Mooncake ~/.julia/packages/Mooncake/lBHAV/src/interface.jl:509
[23] prepare_gradient_cache
@ ~/.julia/packages/Mooncake/lBHAV/src/interface.jl:506 [inlined]
[24] prepare_gradient_nokwarg(::Val{true}, ::typeof(f), ::AutoMooncake{Nothing}, ::CuArray{Float32, 1, CUDA.DeviceMemory})
@ DifferentiationInterfaceMooncakeExt ~/.julia/packages/DifferentiationInterface/sPszY/ext/DifferentiationInterfaceMooncakeExt/onearg.jl:114
[25] #prepare_gradient#46
@ ~/.julia/packages/DifferentiationInterface/sPszY/src/first_order/gradient.jl:11 [inlined]
[26] prepare_gradient(::typeof(f), ::AutoMooncake{Nothing}, ::CuArray{Float32, 1, CUDA.DeviceMemory})
@ DifferentiationInterface ~/.julia/packages/DifferentiationInterface/sPszY/src/first_order/gradient.jl:8
[27] top-level scope
@ In[7]:1
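For triage, the same failure should be reproducible without DifferentiationInterface by calling Mooncake's cache-preparation entry point directly (this is the Mooncake.prepare_gradient_cache frame in the stacktrace above; untested sketch, same f and x_device as before):

cache = Mooncake.prepare_gradient_cache(f, x_device)  # should hit the same cufunction path as prepare_gradient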