[query/lir] use lir method splitting instead of method wrapping#8963
This is still 2-5% slower than the current mainline branch. I have a few more things to try. Benchmarks where the change is >20%:
FYI @tpoterba @chrisvittal The combiner improvement persists across multiple runs and is real. That's quite nice!

Phew! I finally got to parity. I'm impressed by how well the old method wrapping logic worked. Benchmarks with >20% change:

oh, man, this is super exciting. 3x on the combiner? yes please! We can probably make incremental performance improvements to the LIR method splitting code to bring the compile and execute back down, and that one I consider a little less critical anyway.
FYI @patrick-schultz Tim came up in scorecard, but I know you had some interest in this and thought you might like to look at my solution.
while (i >= 0) {
  val l = locals(i)
-  if (!l.isInstanceOf[Parameter]) {
+  if (!l.isInstanceOf[Parameter] && l.name != "spills") {
Oops, that wasn't supposed to make it in. Deleted!
tpoterba left a comment
This was a fun read. Mostly cosmetic comments
@@ -950,10 +950,10 @@ object EmitStream {
as I mentioned in chat, these need to be fields because we need persistent state in compiled iterators used in TableMapPartitions.
var k = 0

// recursion will blow out the stack
val stack = mutable.Stack[(Int, Iterator[Int])]()
can we use our ArrayStack here? this one is a List. This might be why our big compile benchmark got slower 🤷
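For illustration, here is a minimal sketch of a non-recursive depth-first traversal driven by an array-backed stack. It is not the code from this PR: `IterativeDfs`, `IntArrayStack`, and `postorder` are hypothetical names, `IntArrayStack` only stands in for the project's `ArrayStack`, and the `(Int, Iterator[Int])` pairs from the hunk above are replaced by a per-node "next successor" index so that the stack holds plain `Int`s.

```scala
import scala.collection.mutable.ArrayBuffer

object IterativeDfs {
  // Hypothetical array-backed stack of Ints; push/pop touch a flat array
  // instead of allocating a list cell per element.
  final class IntArrayStack(initialCapacity: Int = 16) {
    private var a = new Array[Int](math.max(1, initialCapacity))
    private var n = 0
    def isEmpty: Boolean = n == 0
    def push(x: Int): Unit = {
      if (n == a.length) a = java.util.Arrays.copyOf(a, a.length * 2)
      a(n) = x
      n += 1
    }
    def pop(): Int = { n -= 1; a(n) }
    def top: Int = a(n - 1)
  }

  // Postorder DFS over adjacency lists, without recursion, so a deep graph
  // cannot overflow the JVM call stack.
  def postorder(succ: Array[Array[Int]], root: Int): Array[Int] = {
    val visited = new Array[Boolean](succ.length)
    val next = new Array[Int](succ.length) // next successor index to explore, per node
    val out = new ArrayBuffer[Int]()
    val stack = new IntArrayStack()
    visited(root) = true
    stack.push(root)
    while (!stack.isEmpty) {
      val u = stack.top
      if (next(u) < succ(u).length) {
        val v = succ(u)(next(u))
        next(u) += 1
        if (!visited(v)) { visited(v) = true; stack.push(v) }
      } else {
        // all successors of u are done: emit u in postorder
        stack.pop()
        out += u
      }
    }
    out.toArray
  }
}
```

Whether this kind of change accounts for the compile benchmark regression is not established here; it would need to be measured.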
// although that is only non-trivial when the CFG is irreducible, which
// is relatively rare.

class Region(
I have a lot of brain baggage attached to this name. Can we call it CRegion or some similar mangling?
I renamed it PSTRegion.
}
}

class PST(
@@ -0,0 +1,581 @@
package is.hail.lir
this file was a bit hard to follow using github's viewer, but in an editor with type inference, it's quite straightforward. Nice!
def apply(c: Classx[_], m: Method): Unit = {
  new SplitMethod(c, m).split()
object SplitMethod {
  val TargetMethodSize: Int = 500
have we done experiments tweaking this? I'm happy to benchmark a bunch of values when this goes in.
I ran some experiments, but also while I was changing the code, so further tuning might be valuable. Spilling can increase method size, and I've seen methods nearly 2x the target size. The JVM max is 8K. I started with 2K, but found 500 works better. I think I tried 250, but it was not better (I don't remember how much worse).
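To make the role of a target size concrete, here is a toy sketch that greedily packs consecutive statement sizes into chunks of at most `target`. It is an assumption-laden illustration, not the splitting algorithm in this PR: `TargetSizeSketch` and `chunk` are hypothetical names, sizes are abstract units, and spilling is ignored, which is one reason real split methods can end up larger than the target.

```scala
object TargetSizeSketch {
  // Greedily pack consecutive sizes into chunks whose sum stays at or
  // under `target`. A single item larger than `target` gets its own chunk.
  def chunk(sizes: Seq[Int], target: Int): Vector[Vector[Int]] = {
    val chunks = Vector.newBuilder[Vector[Int]]
    var cur = Vector.empty[Int]
    var curSize = 0
    for (s <- sizes) {
      // close the current chunk if adding s would exceed the target
      if (cur.nonEmpty && curSize + s > target) {
        chunks += cur
        cur = Vector.empty
        curSize = 0
      }
      cur = cur :+ s
      curSize += s
    }
    if (cur.nonEmpty) chunks += cur
    chunks.result()
  }
}
```

A smaller target yields more, smaller chunks (more call overhead), and a larger one yields fewer, bigger chunks, which is the trade-off the tuning experiments above explore.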
private val paramFields = m.parameterTypeInfo.zipWithIndex.map { case (ti, i) =>
  c.newField(genName("f", s"arg$i"), ti)
private val spillsClass = new Classx(genName("C", s"${ m.name }Spills"), "java/lang/Object")
private val spillsCtor = {
this is for empirical performance reasons, right?
case _: GotoX => UnitInfo
case _: IfX => BooleanInfo
case x: IfX =>
  if (!regionBlocks(x.Ltrue) && !regionBlocks(x.Lfalse))
could you explain this bit?
    x.setLfalse(newLfalse)
  }
}
case x: SwitchX => IntInfo

i = sortedsubr.length - 1
while (i >= 0) {
  val ri = sortedsubr(i)
  if (ri.size > 20 &&
should this be a parameter to include in benchmarking?
I don't think so.
This is just to make sure that we don't split something that had just been split, which would go into an infinite loop. It is paired with this:
splitSlice(ri.start, ri.end)
val s = blockSize(blockPartitions.find(ri.start))
assert(s < 20)
to make sure split things are sufficiently small that we won't be tempted to resplit them.
I don't think this is a performance issue, but there is potentially a pathological case where splitting could fail to reduce the size of the method: you have a massive region, but it is made up of size-15 child regions which are all independent with no substructure. Then nothing would get split, and we might blow the method size limit. It's actually hard to imagine what such a function might look like: a bunch of switches that all jump to each other. I'm not quite sure how to split such a function, actually.
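The termination argument and the pathological case can be sketched with a flat toy model. It is only an illustration under stated assumptions, not the PR's implementation: `ResplitGuardSketch`, `splitUntilFits`, and `CallStubSize` are hypothetical, a region is reduced to a single size, and a split is modeled as replacing a child by a small call stub.

```scala
object ResplitGuardSketch {
  val MinSplitSize = 20 // threshold quoted in the thread above
  val CallStubSize = 5  // hypothetical size of what remains after a split

  // Repeatedly split out the largest child until the total fits `target`
  // or no child is large enough to split.
  // Returns (child sizes afterwards, number of splits performed).
  def splitUntilFits(children: Vector[Int], target: Int): (Vector[Int], Int) = {
    require(target >= 0)
    // A stub smaller than the threshold can never be picked again,
    // so every split removes one candidate and the loop terminates.
    require(CallStubSize < MinSplitSize)
    var cur = children
    var splits = 0
    var done = false
    while (!done && cur.sum > target) {
      val largest = cur.max
      if (largest > MinSplitSize) {
        cur = cur.updated(cur.indexOf(largest), CallStubSize)
        splits += 1
      } else {
        done = true // nothing left that is big enough to split
      }
    }
    (cur, splits)
  }
}
```

In this model, `splitUntilFits(Vector.fill(100)(15), 500)` performs no splits and leaves a total of 1500, which mirrors the concern about a large region made only of small independent children.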
use locals instead of fields
target method size 500
split out loop regions unconditionally run SplitMethod for loops
…-is#8963)
* [query] use lir method splitting instead of method wrapping use locals instead of fields
* add PST
* wip split uses pst
* compiles
* passes IRSuite
* cleaned up splitting, passing IRSuite
* fix up rebase
* minimal local spilling, IRSuite passes
* cleanup
* move some more fields => locals target method size 500
* compute loop regions in PST split out loop regions unconditionally run SplitMethod for loops
* don't use exceptions for return handling
* don't split loops by default, for benchmarking
* cleanup, fixes
* fix rebase
* back off Field => Local
* address comments
Summary of changes: