Successful inference may poison type limiting cache #29293

Keno · 2018-09-21T03:07:29Z

Consider the following setup:

struct SizedArray{T, Shape, N} <: AbstractArray{T, N}
    a::Array{T, N}
end
SizedArray(a::Array) = SizedArray{eltype(a), size(a), ndims(a)}(a)
Base.size(a::SizedArray{T, Shape}) where {T, Shape} = Shape
Base.getindex(a::SizedArray, args...) = Base.getindex(a, args...)

function Base.:*(a::SizedArray{T}, b::SizedArray{T}) where {T}
    SizedArray{T, (size(a)[1], size(b)[2]), 2}(a.a*b.a)
end

struct ImmutableChain{T<:Tuple}
    ops::T
end
(x::ImmutableChain{Tuple{}})(y) = y
(x::ImmutableChain)(y) = ImmutableChain(Base.tail(x.ops))(x.ops[1](y))

struct Mult{T}
    W::T
end
(x::Mult{T})(y) where {T} = x.W*y

c1 = ImmutableChain((
    Mult(SizedArray(rand(4, 4))),
    Mult(SizedArray(rand(3, 4))),
    Mult(SizedArray(rand(2, 3))),
    Mult(SizedArray(rand(1, 2))),
));

c2 = ImmutableChain((
    Mult(SizedArray(rand(3, 4))),
    Mult(SizedArray(rand(2, 3))),
    Mult(SizedArray(rand(1, 2))),
));

Now, running

@code_typed c2(SizedArray(rand(3, 1)))
@code_typed c1(SizedArray(rand(4, 1)))

results in both of these returning a good answer (SizedArray{Float64,(1, 1),2}).
However, running them the opposite way:

@code_typed c1(SizedArray(rand(4, 1)))
@code_typed c2(SizedArray(rand(3, 1)))

results in inference giving me Any for both. I.e. not only do I get a worse result, but the first invocation poisoned the inference cache for the second invocation, making it very hard to figure out what is going on.

The text was updated successfully, but these errors were encountered:

This attempts to fix inference for the case in #29293 (the one returning `Any`). It does not fix the cache poisoning part of that issue, which is a separate concern. The idea here is that we avoid applying limiting if the argtypes of the frame become strictly simpler (thus guaranteeing eventual termination). It is important that the complexity relation be transitive and anti-reflexive.

vtjnash · 2018-09-21T04:54:51Z

In the past, I’ve frequently closed this as “won’t fix”, since I expect the cost of checking if the answer in the cache is too precise to be excessive. The limit flag was supposed to prevent the better answer from getting wose though—so maybe something is more strange here than usual.

Keno · 2018-09-21T04:59:36Z

In the past, I’ve frequently closed this as “won’t fix”, since I expect the cost of checking if the answer in the cache is too precise to be excessive

Can we have a flag for this? That would help during development.

vtjnash · 2018-09-21T15:18:06Z

That flag is basically just what any code claiming not to be inference should be doing. I don't test with this often, since as I said, it's a bit slow (I think it can make it through bootstrapping these days). Just looking quickly through the code, I think there's only two places where it touches the cache:

diff --git a/base/compiler/typeinfer.jl b/base/compiler/typeinfer.jl
index ae842497ed..f77d69e335 100644
--- a/base/compiler/typeinfer.jl
+++ b/base/compiler/typeinfer.jl
@@ -56,7 +56,7 @@ function typeinf(frame::InferenceState)
                 end
             end
         end
-        if cached
+        if cached && frame.parent === nothing
             for caller in results
                 cache_result(caller, min_valid, max_valid)
             end
@@ -456,19 +456,6 @@ function typeinf_edge(method::Method, @nospecialize(atypes), sparams::SimpleVect
     code = code_for_method(method, atypes, sparams, caller.params.world)
     code === nothing && return Any, nothing
     code = code::MethodInstance
-    if isdefined(code, :inferred)
-        # return rettype if the code is already inferred
-        # staged functions make this hard since they have two "inferred" conditions,
-        # so need to check whether the code itself is also inferred
-        inf = code.inferred
-        if !isa(inf, CodeInfo) || (inf::CodeInfo).inferred
-            if isdefined(code, :inferred_const)
-                return AbstractEvalConstant(code.inferred_const), code
-            else
-                return code.rettype, code
-            end
-        end
-    end
     if !caller.cached && caller.parent === nothing
         # this caller exists to return to the user
         # (if we asked resolve_call_cyle, it might instead detect that there is a cycle that it can't merge)

This attempts to fix inference for the case in #29293 (the one returning `Any`). It does not fix the cache poisoning part of that issue, which is a separate concern. The idea here is that we avoid applying limiting if the argtypes of the frame become strictly simpler (thus guaranteeing eventual termination). It is important that the complexity relation be transitive and anti-reflexive.

vtjnash · 2018-10-23T19:19:45Z

I found what I think may be at least part of the problem. In fe39f10#diff-c0d983631c937ee5acfd0137cd7332deL1865, I added the entire edge to the needs-limiting heuristic. But later in #24337, I only added the first node to the "don't cache" list. However, that might need to add both nodes (parent too), since it was the combination of them that formed the repeated-edge and resulted in the weakened result.

But this is likely also insufficient, since we're still only marking those up to the "topmost" repeated edge, while we likely need to mark everything up to the edge that triggered the limit.

diff --git a/base/compiler/abstractinterpretation.jl b/base/compiler/abstractinterpretation.jl
index e163f9c6d6..d424d3c183 100644
--- a/base/compiler/abstractinterpretation.jl
+++ b/base/compiler/abstractinterpretation.jl
@@ -308,7 +308,7 @@ function abstract_call_method(method::Method, @nospecialize(sig), sparams::Simpl
             end
             infstate = sv
             topmost = topmost::InferenceState
-            while !(infstate.parent === topmost.parent)
+            while !(infstate === topmost.parent)
                 if call_result_unused(infstate)
                     # If we won't propagate the result any further (since it's typically unused),
                     # it's OK that we keep and cache the "limited" result in the parents

This attempts to fix inference for the case in #29293 (the one returning `Any`). It does not fix the cache poisoning part of that issue, which is a separate concern. The idea here is that we avoid applying limiting if the argtypes of the frame become strictly simpler (thus guaranteeing eventual termination). It is important that the complexity relation be transitive and anti-reflexive.

vtjnash · 2018-10-29T17:54:17Z

Actually, upon review, this should be fully fixed by #29795. I thought there were more code-paths to address, but it appears to have been just a single off-by-one-step.

fixed by 92ab600

(cherry picked from commit 92ab600)

This reverts commit 92ab600.

(cherry picked from commit 92ab600)

This attempts to fix inference for the case in #29293 (the one returning `Any`). It does not fix the cache poisoning part of that issue, which is a separate concern. The idea here is that we avoid applying limiting if the argtypes of the frame become strictly simpler (thus guaranteeing eventual termination). It is important that the complexity relation be transitive and anti-reflexive.

(cherry picked from commit 92ab600)

This attempts to fix inference for the case in #29293 (the one returning `Any`). It does not fix the cache poisoning part of that issue, which is a separate concern. The idea here is that we avoid applying limiting if the argtypes of the frame become strictly simpler (thus guaranteeing eventual termination). It is important that the complexity relation be transitive and anti-reflexive.

JeffBezanson added the compiler:inference Type inference label Sep 21, 2018

Keno mentioned this issue Sep 21, 2018

RFC: Allow structural recursion without triggering edge cycle limiting #29294

Closed

vtjnash changed the title ~~Type limiting poisons inference cache~~ Successful inference may poison type limiting cache Sep 21, 2018

vtjnash added a commit that referenced this issue Oct 23, 2018

partially address #29293

e884b25

vtjnash added a commit that referenced this issue Oct 24, 2018

partially address #29293

263d9d4

vtjnash added a commit that referenced this issue Oct 24, 2018

partially address #29293

92ab600

vtjnash closed this as completed Oct 29, 2018

KristofferC pushed a commit that referenced this issue Oct 29, 2018

partially address #29293

6395732

(cherry picked from commit 92ab600)

vtjnash added a commit that referenced this issue Oct 31, 2018

Revert "partially address #29293"

2982a42

This reverts commit 92ab600.

KristofferC pushed a commit that referenced this issue Nov 28, 2018

partially address #29293

58d22f0

(cherry picked from commit 92ab600)

KristofferC pushed a commit that referenced this issue Dec 12, 2018

partially address #29293

9d06cfb

(cherry picked from commit 92ab600)

KristofferC pushed a commit that referenced this issue Feb 11, 2019

partially address #29293

2673dec

(cherry picked from commit 92ab600)

KristofferC pushed a commit that referenced this issue Feb 20, 2020

partially address #29293

542e394

(cherry picked from commit 92ab600)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Successful inference may poison type limiting cache #29293

Successful inference may poison type limiting cache #29293

Keno commented Sep 21, 2018 •

edited

vtjnash commented Sep 21, 2018

Keno commented Sep 21, 2018

vtjnash commented Sep 21, 2018

vtjnash commented Oct 23, 2018

vtjnash commented Oct 29, 2018

Successful inference may poison type limiting cache #29293

Successful inference may poison type limiting cache #29293

Comments

Keno commented Sep 21, 2018 • edited

vtjnash commented Sep 21, 2018

Keno commented Sep 21, 2018

vtjnash commented Sep 21, 2018

vtjnash commented Oct 23, 2018

vtjnash commented Oct 29, 2018

Keno commented Sep 21, 2018 •

edited