Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

some updates and fixes to inlining cost #27857

Merged
merged 1 commit into from Jul 7, 2018
Merged

Conversation

JeffBezanson
Copy link
Sponsor Member

@JeffBezanson JeffBezanson commented Jun 29, 2018

  • Instead of always inlining functions marked @inline, increase the cost threshold 10x 20x
  • Don't inline functions inferred not to return
  • statement_cost no longer needs to look at nested Exprs in general
  • Fix cost of :copyast

This also fixes the regression in mapslices (#27417). It was due to inlining a function marked @inline, even though the argument types were abstract and the body contained many dynamic calls. I believe most compilers just increase the threshold (by a lot) for functions marked inline, and have a separate always_inline to truly force inlining. We should do the same thing; otherwise we end up inlining silly things the user is unlikely to have intended.

@JeffBezanson JeffBezanson added the optimizer Optimization passes (mostly in base/compiler/ssair/) label Jun 29, 2018
@JeffBezanson
Copy link
Sponsor Member Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@KristofferC
Copy link
Sponsor Member

KristofferC commented Jun 29, 2018

I believe most compilers just increase the threshold (by a lot) for functions marked inline, and have a separate always_inline to truly force inlining.

Can we get that always_inline then? Because when the inliner gets it "wrong", the effect can be dramatic and not having an escape hatch seems like a step backwards.

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Copy link
Sponsor Member Author

Yes, you can have always_inline, but beware: if we can't infer the types of arguments (f(x::Any)), we will inline the function anyway, with everything in it dynamically dispatched, instead of dispatching to an optimized version. That is a dramatic effect, and probably not what you want.

@timholy
Copy link
Sponsor Member

timholy commented Jun 29, 2018

I suppose another alternative is to aggressively eliminate many current @inline annotations, and save the current @inline to mean always_inline. The data in #22210 (comment) seem to suggest that might be feasible.

But there's a lot of sense in this proposal.

return 0
end
argcost = 0
for a in ex.args
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the motivation for making this non-recursive?

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The structure of the IR has changed such that calls no longer contain other calls, so it's just unnecessary.

@JeffBezanson
Copy link
Sponsor Member Author

Performance summary: some nice speedups from avoiding inlining things with dynamic dispatches, but some big slowdowns in find functions. The slowdowns were mostly related to the use of invoke in deprecations, so I just changed that.

@JeffBezanson
Copy link
Sponsor Member Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Something went wrong when running your job:

NanosoldierError: failed to run benchmarks against primary commit: failed process: Process(`sudo cset shield -e su nanosoldier -- -c ./benchscript.sh`, ProcessExited(1)) [1]

Logs and partial data can be found here
cc @ararslan

@JeffBezanson
Copy link
Sponsor Member Author

@nanosoldier runbenchmarks(ALL, vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Copy link
Sponsor Member Author

Fixes #26446

@JeffBezanson
Copy link
Sponsor Member Author

@nanosoldier runbenchmarks("collection" && "queries & updates", vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Copy link
Sponsor Member Author

@nanosoldier runbenchmarks("collection" && "queries & updates", vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan

@JeffBezanson
Copy link
Sponsor Member Author

@nanosoldier runbenchmarks("string" && "readuntil", vs = ":master")

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - no performance regressions were detected. A full report can be found here. cc @ararslan

end
return plus_saturate(argcost, T_IFUNC_COST[iidx])
elseif ftyp isa DataType && isdefined(ftyp, :instance)
f = ftyp.instance
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch should be dead-code

(also, it's available as the singleton_type function)

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might indicate a problem elsewhere, but it's not dead code --- in fact I needed this to fix a regression in an earlier version of the PR. A call to Core.sizeof had typeof(Core.sizeof) as the type of the function.

end
a = ex.args[2]
if a isa Expr
cost += statement_cost(a, -1, src, spvals, slottypes, params)
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cost = plus_saturate(cost, ...)

- Instead of always inlining functions marked at-inline, increase
  the cost threshold 20x
- Don't inline functions inferred not to return
- statement_cost no longer needs to look at nested Exprs in general
- Fix cost of `:copyast`
@JeffBezanson JeffBezanson merged commit 9277d3a into master Jul 7, 2018
@JeffBezanson JeffBezanson deleted the jb/inliningcostfixes branch July 7, 2018 19:47
@KristofferC
Copy link
Sponsor Member

Bump for @forceinline...

Issues like JuliaArrays/StaticArrays.jl#494 are quite frustrating when you always have to worry that the threshold might have gotten reached.

@JeffBezanson
Copy link
Sponsor Member Author

There might be something wrong with our cost model in this case --- a few dozen arithmetic ops should not be a problem. I'm worried about dynamic dispatches that take thousands of times longer than that. We can add forceinline as well though.

@KristofferC
Copy link
Sponsor Member

KristofferC commented Sep 13, 2018

The example there has a SIMD loop though so might be why it doesn't inline. Anyway, having a way to force would be good in the inevitable cases where the cost model gets it wrong.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Sep 13, 2018

If the function type is maximally typed (isdispatchtuple(specTypes)), we could simply always obey the user annotation. In that case, you're already then at least guaranteed that keeping the dispatch barrier won't improve type information.

JeffBezanson added a commit that referenced this pull request Sep 18, 2018
JeffBezanson added a commit that referenced this pull request Sep 25, 2018
KristofferC pushed a commit that referenced this pull request Sep 30, 2018
KristofferC pushed a commit that referenced this pull request Feb 11, 2019
KristofferC pushed a commit that referenced this pull request Feb 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
optimizer Optimization passes (mostly in base/compiler/ssair/)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants