Reduce invalidations in deserialization #41872

martinholters · 2021-08-12T12:07:00Z

This is more of an issue than a PR perhaps, but I had to try to fix it to get a handle on it. So let me take my attempt and turn it into a PR for the sake of discussion.

So here is the issue (ref. #38983 (comment)): We start from a package with the followeing (pathological) content:

module Foo

foo(x=1) = hcat(x)

@assert precompile(foo, ())

end # module

Now, in a fresh session (after using Foo once in another session to trigger precompilation):

julia> using SnoopCompile, AbstractTrees

julia> @snoopr using Foo
Any[] # we precompiled Foo.foo() and it wasn't invalidated, so ...

julia> print_tree(@snoopi_deep(Foo.foo()), maxdepth=3) # this shows inference don't doing anything, right?
InferenceTimingNode: 0.015228/0.061387 on Core.Compiler.Timings.ROOT() with 1 direct children
└─ InferenceTimingNode: 0.000138/0.046158 on Foo.foo() with 1 direct children
   └─ InferenceTimingNode: 0.000199/0.046021 on Foo.foo(1::Int64) with 2 direct children
      ├─ InferenceTimingNode: 0.000840/0.045254 on hcat(::Int64) with 13 direct children
      │  ⋮
      │  
      └─ InferenceTimingNode: 0.000526/0.000568 on hcat((1,)::Int64) with 1 direct children
         ⋮

So why does inference basically start over here? Here is my analysis:
Upon deserialization, we first mark all CodeInstances invalid. Then we go and check all methods whether they have been invalidated by external callees. If not, we mark them valid. But we only do this for methods which actually have external callees. In this case, that means we check foo(::Int) and find it is still valid. But we never check foo(), leaving it marked as invalid. And as it is never actually invalidated by anything, @snoopr cannot report it. But then it needs to be re-inferred when called, which thanks to constant propagation and omission of the inlined hcat(::Int) method from the cache becomes relatively heavy.

So what I try to achieve with this change is to reverse that logic and first consider everything valid until it gets detected as invalidated. For methods with external callees this is pretty straight-forward, but we need to consider that if one of these gets invalidated, we have to propagate the invalidation to its callers (which might not have external callees themselves). Luckily, we have the backedges recorded to do just this. So with this patch I get the expected:

julia> using SnoopCompile, AbstractTrees

julia> @snoopr using Foo
Any[]

julia> print_tree(@snoopi_deep(Foo.foo()), maxdepth=3)
InferenceTimingNode: 0.016345/0.016345 on Core.Compiler.Timings.ROOT() with 0 direct children

Invalidation still seems to be working:

julia> Base.hcat(::Int) = "oops"

julia> using SnoopCompile, AbstractTrees

julia> @snoopr using Foo
4-element Vector{Any}:
  MethodInstance for Foo.foo()
 1
  MethodInstance for Foo.foo(::Int64)
  "insert_backedges"

julia> print_tree(@snoopi_deep(Foo.foo()), maxdepth=3)
InferenceTimingNode: 0.015316/0.015540 on Core.Compiler.Timings.ROOT() with 1 direct children
└─ InferenceTimingNode: 0.000078/0.000224 on Foo.foo() with 1 direct children
   └─ InferenceTimingNode: 0.000095/0.000146 on Foo.foo(::Int64) with 1 direct children
      └─ InferenceTimingNode: 0.000051/0.000051 on hcat(::Int64) with 0 direct children
         ⋮

@vtjnash is that reasoning valid and if so, does the implementation strategy look reasonable? Also cc @timholy.

One thing I should add is that the effect on real-world packages seems to be rather limited. But would still be nice to have this sorted out, if only too avoid distraction when trying to figure out why one's precompile statements don't have the desired effect---which is what brought me here...

martinholters · 2021-08-18T07:30:07Z

@timholy does the data pushed to _jl_debug_method_invalidation make sense (i.e. is comprehensible to an adapted SnoopCompile) with this patch (also in connection with #41913)? (Note that the call to jl_invalidate_method_instance will add additional entries.)

timholy · 2021-08-18T08:12:09Z

I think it should be fine, but I haven't tested. Between getting a review (this LGTM, but it would be better if @vtjnash has time to comment) and merging, I can build this locally and test it.

vtjnash · 2021-08-18T14:44:41Z

I believe that you cannot do this order, since the state of the invalidation trees is not quite consistent yet, so we might actually end up deleting many methods whose only fault was being inserted in an unpredictable order. I think we instead need to add to the list of methods here for post-processing the internal ones, with an empty array for the (flattened) edges.

martinholters · 2021-08-19T08:42:54Z

Okay... That goes way beyond my current understanding of the deserialization process and I'm not sure digging deeper into that is a worthwhile investment of my time - looking at stuff where (I at least think that) I know what I'm doing sounds more efficient.

In #38983 and #41872, it was discovered that only CodeInstances with external backedges got validated after deserialization. This stores new CodeInstances in a list, and if they have neither been validated nor invalidated, it validates them. Closes #41872

In #38983 and #41872, it was discovered that only CodeInstances with external backedges get validated after deserialization. This adds a "second chance" for each CodeInstance: it validates any that have neither been validated nor invalidated by the end of deserialization. Closes #41872

In JuliaLang#38983 and JuliaLang#41872, it was discovered that only CodeInstances with external backedges get validated after deserialization. This adds a "second chance" for each CodeInstance: it validates any that have neither been validated nor invalidated by the end of deserialization. Closes JuliaLang#41872

less is more

432473d

martinholters mentioned this pull request Aug 12, 2021

precompile interacts badly with Const-specialization #38983

Closed

kimikage mentioned this pull request Aug 12, 2021

Fatal performance degradation due to precompilation or inference bugs JuliaGraphics/Colors.jl#496

Closed

JeffBezanson added the compiler:precompilation Precompilation of modules label Aug 12, 2021

timholy requested a review from vtjnash August 18, 2021 08:12

timholy mentioned this pull request Aug 22, 2021

Validate CodeInstances with no external edges #41961

Merged

timholy closed this in #41961 Aug 23, 2021

DilumAluthge deleted the mh/fewer-invalidations-during-deserialization branch August 24, 2021 05:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reduce invalidations in deserialization #41872

Reduce invalidations in deserialization #41872

martinholters commented Aug 12, 2021

martinholters commented Aug 18, 2021

timholy commented Aug 18, 2021

vtjnash commented Aug 18, 2021

martinholters commented Aug 19, 2021

Reduce invalidations in deserialization #41872

Reduce invalidations in deserialization #41872

Conversation

martinholters commented Aug 12, 2021

martinholters commented Aug 18, 2021

timholy commented Aug 18, 2021

vtjnash commented Aug 18, 2021

martinholters commented Aug 19, 2021