Optimize compile times by not skipping allocas #3168
Instead of skipping past allocas whenever inserting a new instruction, which ate up a lot of compilation time, new instructions are now inserted at the default insertion point. The result is that allocas that would have coalesced just after the global loads and input loads are dispersed throughout the instructions. So as part of DXIL finalization, the allocas are moved to the beginning of the entry block of each function. This results in some minor changes to a couple of tests due to the allocas preceding the loads.
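The finalization step described above can be pictured with a small, self-contained sketch. It is not the code from this PR and uses no LLVM or DXC APIs: `ToyInst` and `hoistAllocas` are hypothetical names, and the entry block is modeled as a plain vector. The assumption is only that allocas move to the front while everything else keeps its relative order.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction; hypothetical, not an LLVM or DXC type.
struct ToyInst {
  std::string opcode;
  std::string name;
};

// Move every alloca to the front of the entry block. std::stable_partition
// keeps the relative order of the allocas and of all other instructions.
void hoistAllocas(std::vector<ToyInst> &entryBlock) {
  std::stable_partition(
      entryBlock.begin(), entryBlock.end(),
      [](const ToyInst &inst) { return inst.opcode == "alloca"; });
}
```

A single pass at finalization replaces the per-insertion scan past the allocas, which is where the description says the compile time was going.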
pow2clk
There's some noise here because of the renaming. I've commented on the parts that I think are most relevant. All the renaming is simple and should be entirely unsurprising.
Besides the simple removal of SkipAllocas, this is the meat of the change. It is required because validation will find that uses of the allocas are not dominated by them if they are left where they are inserted.
It does change the output a bit. Previously, loads of globals and inputs ended up before the allocas. This alteration necessitates the minor test changes.
| "getelementptr [4 x float], [4 x float]* %7, i32 0, i32 3", | ||
| "getelementptr [4 x float], [4 x float]* %7, i32 0, i32 10", | ||
| "getelementptr [4 x float], [4 x float]* %3, i32 0, i32 3", | ||
| "getelementptr [4 x float], [4 x float]* %3, i32 0, i32 10", |
Renumbering is a result of the allocas coming first now.
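To illustrate why hoisting changes the numbers in the expected output, here is a toy model of sequential numbering for unnamed values. `firstAllocaSlot` is a hypothetical helper, and the model assumes every listed opcode produces exactly one unnamed value; the real test function contains other values, so the toy numbers will not match the `%7` and `%3` above.

```cpp
#include <string>
#include <vector>

// Returns the slot number the first alloca would receive if unnamed values
// were numbered sequentially from firstSlot in instruction order.
// Returns -1 when the list contains no alloca.
int firstAllocaSlot(const std::vector<std::string> &opcodes,
                    int firstSlot = 0) {
  int slot = firstSlot;
  for (const std::string &op : opcodes) {
    if (op == "alloca")
      return slot;
    ++slot; // in this toy model every opcode yields one unnamed value
  }
  return -1;
}
```

With three loads ahead of it the alloca lands in slot 3; once it is moved to the front it takes slot 0, and every value after it shifts accordingly.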
✅ Build DirectXShaderCompiler 1.0.3676 completed (commit 035737c863 by @pow2clk)
Hmm. I forgot to add @NicoM1 as a reviewer. 😉
D'aww, hope everyone is surviving launch :D PR looks great! All the best to the team!