inference: remove CachedMethodTable #44240

Merged
merged 1 commit into master from avi/lookupcache on Feb 23, 2022
Conversation

aviatesk

Previously the method lookup cache was created per frame, so it wasn't used
much. With this change the cache is created per inference, so a cached result
is reused whenever we see the same match again within the same inference run,
which may speed up lookup a bit.

This commit also sets up a new AbstractInterpreter interface, get_method_lookup_cache,
which specifies the method lookup cache used by each AbstractInterpreter.
NativeInterpreter creates a cache per inference, which is valid because all
lookups within a single inference run happen in the same world age.
External AbstractInterpreters don't opt into this cache by default,
so their behavior doesn't change in any way.
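
As a rough sketch of the shape this interface takes (illustrative only: the interpreter type, its field, and the IdDict cache type here are assumptions, not the exact Base definitions):

```julia
const CC = Core.Compiler

# Hypothetical external interpreter, shown only to illustrate the hook.
struct LookupCachingInterp <: CC.AbstractInterpreter
    # Created fresh for each inference entry, so cached matches stay
    # valid: every lookup in one inference run shares the same world age.
    method_lookup_cache::IdDict{Any,Any}
end
LookupCachingInterp() = LookupCachingInterp(IdDict{Any,Any}())

# Returning `nothing` keeps the old, uncached behavior; this is the
# default for external AbstractInterpreters.
get_method_lookup_cache(::CC.AbstractInterpreter) = nothing
get_method_lookup_cache(interp::LookupCachingInterp) = interp.method_lookup_cache
```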

@aviatesk added the compiler:inference (Type inference) label Feb 18, 2022
@aviatesk requested a review from Keno February 18, 2022 14:25
@aviatesk

@nanosoldier runbenchmarks("inference", vs=":master")

@vtjnash left a comment

Is this that helpful anymore? For concrete types, we already cache those in a hash table at the next level

@aviatesk

I couldn't confirm noticeable performance improvements when running the inference benchmark locally.

> For concrete types, we already cache those in a hash table at the next level

But does this mean the cache might be useful when a signature is abstract?

@nanosoldier

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk

Lol, the benchmark shows that this change makes it less efficient in both time and space...
@nanosoldier runbenchmarks("inference", vs=":master")

@vtjnash commented Feb 18, 2022

yeah, we might want to do this, but have only non-concrete types put here; otherwise we may end up doing the lookup twice in the cache-miss case.
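
A minimal sketch of that idea, with hypothetical names (`lookup_matches` stands in for the internal method-table query; only non-concrete signatures enter the dictionary):

```julia
# Sketch only: hypothetical names, not the actual Core.Compiler code.
# Stand-in for the real (expensive) method-table walk:
lookup_matches(@nospecialize(sig)) =
    Base._methods_by_ftype(sig, -1, Base.get_world_counter())

function cached_lookup(cache::IdDict{Any,Any}, @nospecialize(sig))
    # Concrete signatures are already cached at the next level, so probing
    # the dictionary for them only adds a second lookup on every miss.
    isdispatchtuple(sig) && return lookup_matches(sig)
    return get!(cache, sig) do
        lookup_matches(sig)  # miss: we paid for the probe *and* the walk
    end
end
```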

@nanosoldier

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk

@nanosoldier runbenchmarks("inference", vs=":master")

@nanosoldier

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

@aviatesk

OK, I really couldn't confirm any performance improvement from CachedMethodTable in the current infrastructure.
Now I'd like to propose eliminating it entirely, which should save a bit of space:
@nanosoldier runbenchmarks("inference", vs=":master")

@aviatesk changed the title from "inference: use the same method lookup cache across same inference trial" to "inference: remove CachedMethodTable" Feb 21, 2022
@nanosoldier

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

base/compiler/types.jl: two review threads (outdated, resolved)
@vtjnash added the status:merge me (PR is reviewed. Merge when all tests are passing) label Feb 22, 2022
Since we couldn't confirm any performance benefit from `CachedMethodTable`
in the current infrastructure (see the benchmark results in #44240), now
I'd like to propose eliminating it entirely and saving a bit of space.
@aviatesk merged commit 26b0b6e into master Feb 23, 2022
@aviatesk deleted the avi/lookupcache branch February 23, 2022 05:08
aviatesk added a commit to aviatesk/GPUCompiler.jl that referenced this pull request Feb 23, 2022
Apologies for the confusion!
In <JuliaLang/julia#44240>, we figured out that `CachedMethodTable` doesn't
give us any performance benefit (in either time or space), so we removed it
from Julia Base.
This commit updates GPUCompiler's overloads accordingly.
pchintalapudi pushed a commit to pchintalapudi/julia that referenced this pull request Feb 24, 2022
@DilumAluthge removed the status:merge me (PR is reviewed. Merge when all tests are passing) label Feb 25, 2022
aviatesk added a commit that referenced this pull request Mar 1, 2022
…InferenceState)` interface

In #44240 we removed the `CachedMethodTable` support as it turned out to
be ineffective under the current compiler infrastructure.
Because of this, there is no strong reason to keep a method table per `InferenceState`.
This commit simply removes the `method_table(::AbstractInterpreter, ::InferenceState)`
interface and should make it clearer which interface should be overloaded to
implement contextual dispatch.
aviatesk added a commit that referenced this pull request Mar 1, 2022
aviatesk added a commit that referenced this pull request Mar 2, 2022
staticfloat pushed a commit to JuliaCI/julia-buildkite-testing that referenced this pull request Mar 2, 2022
aviatesk added a commit that referenced this pull request Mar 3, 2022
…InferenceState)` interface (#44389)
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
@aviatesk mentioned this pull request Mar 9, 2022
aviatesk added a commit that referenced this pull request Mar 9, 2022
aviatesk added a commit that referenced this pull request Mar 9, 2022
…InferenceState)` interface (#44389)
aviatesk added a commit that referenced this pull request Aug 29, 2022
`CachedMethodTable` was removed in #44240 because we couldn't confirm any
performance improvement at the time. However, it turns out the optimization is
critical in some real-world cases (e.g. #46492), so this commit revives the
mechanism with the following tweaks that should make it more effective:
- create the method table cache per inference (rather than per local
  inference on a function call, as in the previous implementation)
- only use the cache mechanism for abstract types (concrete types already
  have their lookup results cached at the next level; see the sketch below)
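
A simplified sketch of the revived mechanism follows (hypothetical names; `findall_matches` stands in for the compiler's internal query, and the exact Core.Compiler definitions differ):

```julia
# Simplified sketch; the real definitions live in Core.Compiler.
# Stand-in for the underlying method-table query:
findall_matches(table, @nospecialize(sig)) =
    Base._methods_by_ftype(sig, -1, Base.get_world_counter())

struct CachedMethodTable{T}
    cache::IdDict{Any,Any}  # one instance per inference run
    table::T                # the underlying method table view
end
CachedMethodTable(table) = CachedMethodTable(IdDict{Any,Any}(), table)

function findall_cached(mt::CachedMethodTable, @nospecialize(sig))
    # Concrete signatures bypass the dictionary entirely: the runtime
    # already caches their lookup results at the next level.
    isdispatchtuple(sig) && return findall_matches(mt.table, sig)
    return get!(() -> findall_matches(mt.table, sig), mt.cache, sig)
end
```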

As a result, the following snippet reported in #46492 recovers its
compilation performance:
```julia
using ControlSystems
a_2 = [-5 -3; 2 -9]
C_212 = ss(a_2, [1; 2], [1 0; 0 1], [0; 0])
@time norm(C_212)
```

> on master
```
julia> @time norm(C_212)
364.489044 seconds (724.44 M allocations: 92.524 GiB, 6.01% gc time, 100.00% compilation time)
0.5345224838248489
```

> on this commit
```
julia> @time norm(C_212)
 26.539016 seconds (62.09 M allocations: 5.537 GiB, 5.55% gc time, 100.00% compilation time)
0.5345224838248489
```
aviatesk added a commit that referenced this pull request Aug 30, 2022
KristofferC pushed a commit that referenced this pull request Aug 30, 2022
(cherry picked from commit 8445744)
aviatesk added a commit that referenced this pull request Aug 30, 2022
aviatesk added a commit that referenced this pull request Aug 31, 2022