
White list for builtin functions and stdlib #5

Closed

Roger-luo opened this issue Feb 19, 2020 · 1 comment

Comments

@Roger-luo (Collaborator)
Glad to see this is implemented with Cassette; I never had time to actually finish it.

I'm trying this with Zygote on my own machine-learning models and hit some errors, e.g.:

```
julia> avoid_allocations(record, back, rand(10))
ERROR: BoundsError
Stacktrace:
 [1] copyto! at /Users/roger/.julia/packages/Cassette/7OymZ/src/context.jl:450 [inlined]
 [2] copyto! at ./array.jl:303 [inlined]
 [3] overdub(::Cassette.Context{nametype(ReplayCtx),AutoPreallocation.AllocationReplay{Array{Array,1}},Nothing,Cassette.var"##PassType#404",Nothing,Cassette.DisableHooks}, ::typeof(copyto!), ::Array{Any,1}, ::Array{Any,1}) at /Users/roger/.julia/packages/Cassette/7OymZ/src/overdub.jl:0
 [4] _collect_indices at ./array.jl:578 [inlined]
 [5] overdub(::Cassette.Context{nametype(ReplayCtx),AutoPreallocation.AllocationReplay{Array{Array,1}},Nothing,Cassette.var"##PassType#404",Nothing,Cassette.DisableHooks}, ::typeof(Base._collect_indices), ::Tuple{Base.OneTo{Int64}}, ::Array{Any,1}) at /Users/roger/.julia/packages/Cassette/7OymZ/src/overdub.jl:0
 [6] collect at ./array.jl:562 [inlined]
 [7] _totuple at ./tuple.jl:248 [inlined]
 [8] Tuple at ./tuple.jl:220 [inlined]
 [9] back at /Users/roger/Documents/TNFilters.jl/src/layers/mps.jl:144 [inlined]
 [10] overdub(::Cassette.Context{nametype(ReplayCtx),AutoPreallocation.AllocationReplay{Array{Array,1}},Nothing,Cassette.var"##PassType#404",Nothing,Cassette.DisableHooks}, ::TNFilters.var"#back#21"{Context,MPS{10,2,4,Float64,Array{Float64,3}},Array{Int64,1},Array{Any,1},TNFilters.var"#54#back#7"{TNFilters.var"#5#6"{Array{Float64,3}}}}, ::Array{Float64,1}) at /Users/roger/.julia/packages/Cassette/7OymZ/src/overdub.jl:0
 [11] #75#back at /Users/roger/.julia/packages/ZygoteRules/6nssF/src/adjoint.jl:49 [inlined]
```

This is because part of the backward rules makes use of `Tuple(::Array)` to convert the gradient of a `Tuple` back, which looks like:

```julia
function back(Δ)
    _, Δ = tr_back(Δ)
    grads = []
    for i in length(op):-1:2
        _, Δ, grad_tensor = stack[i-1](Δ)
        accum_param(__context__, op.tensors[i][configs[i]], grad_tensor)
        push!(grads, grad_tensor)
    end
    accum_param(__context__, op.tensors[1][configs[1]], Δ)
    push!(grads, Δ)
    # Tuple(::Vector) here is the conversion that trips the replay above
    return (; tensors=Tuple(grads)), nothing
end
```

I know this could be worked around by not using `Tuple`, but I guess in most cases this is not a bottleneck, and it can be generated by other code anyway (since it's just a type conversion), so #4 might not be helpful in such cases. In practice, I think we could whitelist some functions that use dynamic reallocation but are usually not a performance/memory bottleneck. I think we did this in Alloc.jl before: https://github.com/FluxML/Alloc.jl/pull/1/files#diff-c678e3660b372f14911e30988c8632ceR45
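
For concreteness, a minimal sketch of what such a whitelist could look like, assuming the replay context is dispatchable as `ReplayCtx` (the name that appears in the stack trace above); the recording context would presumably need the same methods, and the real hook point inside AutoPreallocation may differ:

```julia
using Cassette

# Hedged sketch: let whitelisted callables run natively instead of being
# overdubbed call-by-call, so their internal dynamic allocations are
# neither recorded nor replayed. `ReplayCtx` is an assumption based on
# the context name in the stack trace above.
Cassette.overdub(::ReplayCtx, ::Type{T}, xs::AbstractArray) where {T<:Tuple} = T(xs)

# Plain functions can be whitelisted the same way:
for f in (Base.vect, Base.collect)
    @eval Cassette.overdub(::ReplayCtx, ::typeof($f), args...) = $f(args...)
end
```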

I think this would also partly work around #2. For `copy` and `similar`, I found it could be worth handling them explicitly, since there might be some performance issues (possibly related to #2).
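
For example, `copy` and `similar` could each be handled with a single direct-dispatch method, so their inner loops are not overdubbed element by element (which may be where the slowdown in #2 comes from). This is only a sketch; whether their results should still be served from the replay schedule is an open question, and here they simply run natively:

```julia
# Hedged sketch: one dispatch per call to copy/similar, instead of tracing
# their internals. Note this means these allocations bypass the
# record/replay schedule entirely, which may or may not be acceptable.
Cassette.overdub(::ReplayCtx, ::typeof(copy), a::AbstractArray) = copy(a)
Cassette.overdub(::ReplayCtx, ::typeof(similar), a::AbstractArray, args...) = similar(a, args...)
```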

@oxinabox (Owner)

Seems legit, yes.
