-
Notifications
You must be signed in to change notification settings - Fork 3
Description
The rewrite Walk(Inline(...)) scales as O(n^2). Any compiler pass should scale at most linearly in program size.
The quadratic scaling comes from inline.py where each block is split in two, and the inline region is inserted in the middle. Here, splitting blocks in two comes with the extra O(n) factor:
after_block = ir.Block()
stmt = call_like.next_stmt
while stmt is not None:
stmt.detach()
after_block.stmts.append(stmt)
stmt = call_like.next_stmtThe while loop is over all statements within a block. This seems harmless at first. But for heavily inlined kernels, most of the program will eventually be contained in a single block. Here, n can approach the total program size.
Additionally, there are statements like inline_region.blocks.popfirst() that also provide O(n) factors (popping the first element of a list).
@cduck has observed quadratic scaling in practice, so this bottleneck is relevant in practice.
A temporary fix has been introduced in #552.
However, the inline pass should be refactored more generally to remove any O(n^2) components.