Commit
[RISCV] Lower fixed vectors extract_vector_elt through stack at high LMUL

This is the extract side of D159332. The goal is to avoid non-linear costing on patterns where an entire vector is split back into scalars. This is an idiomatic pattern for SLP.

Each vslide operation is linear in LMUL on common hardware. (For instance, the sifive-x280 cost model models slides this way.) If we do VL unique extracts, each with a cost linear in LMUL, the overall cost is O(LMUL^2) * VLEN/ETYPE. To avoid the degenerate case, fall back to the stack if we're beyond LMUL2.

There's a subtlety here. For this to work, we're *relying* on an optimization in LegalizeDAG which tries to reuse the stack slot from a previous extract. In practice, this appears to trigger for patterns within a block, but if we ended up with an explode idiom split across multiple blocks, we'd still be in quadratic territory. I don't think that variant is fixable within SDAG.

It's tempting to think we can do better than going through the stack, but well, I haven't found it yet if it exists. Here are the results for sifive-x280 on all the variants I wrote (all 16 x i64 with V):

output/sifive-x280/linear_decomp_with_slidedown.mca:Total Cycles: 20703
output/sifive-x280/linear_decomp_with_vrgather.mca:Total Cycles: 23903
output/sifive-x280/naive_linear_with_slidedown.mca:Total Cycles: 21604
output/sifive-x280/naive_linear_with_vrgather.mca:Total Cycles: 22804
output/sifive-x280/recursive_decomp_with_slidedown.mca:Total Cycles: 15204
output/sifive-x280/recursive_decomp_with_vrgather.mca:Total Cycles: 18404
output/sifive-x280/stack_by_vreg.mca:Total Cycles: 12104
output/sifive-x280/stack_element_by_element.mca:Total Cycles: 4304

I am deliberately excluding scalable vectors. It functionally works, but frankly, the code quality for an idiomatic explode loop is so terrible either way that it felt better to leave that for future work.

Differential Revision: https://reviews.llvm.org/D159375
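
The costing argument in the message can be sketched as a toy model. This is not LLVM code and not the sifive-x280 scheduling model: the function names and the unit costs (one slide costs LMUL units, one whole-vector store costs LMUL units, one scalar load costs 1 unit) are assumptions made only to show the shape of the two curves.

```cpp
#include <cstdint>

// Hypothetical unit-cost model; names and costs are illustrative assumptions,
// not taken from LLVM or from any real scheduling model.

// Elements held by a register group: LMUL * VLEN / element width (in bits).
constexpr std::uint64_t numElements(std::uint64_t lmul, std::uint64_t vlenBits,
                                    std::uint64_t eltBits) {
  return lmul * vlenBits / eltBits;
}

// Extract every element with its own slide, each assumed to cost `lmul` units.
// Total grows as LMUL^2 * VLEN / element width.
constexpr std::uint64_t costViaSlides(std::uint64_t lmul, std::uint64_t vlenBits,
                                      std::uint64_t eltBits) {
  return numElements(lmul, vlenBits, eltBits) * lmul;
}

// Store the vector once (assumed `lmul` units), then do one scalar load per
// element (assumed 1 unit each). This assumes the stack slot is reused across
// all extracts, so the store is paid only once.
constexpr std::uint64_t costViaStack(std::uint64_t lmul, std::uint64_t vlenBits,
                                     std::uint64_t eltBits) {
  return lmul + numElements(lmul, vlenBits, eltBits);
}

// 16 x i64 at an assumed VLEN of 128 bits occupies LMUL = 8.
static_assert(numElements(8, 128, 64) == 16, "16 elements at LMUL8");
static_assert(costViaSlides(8, 128, 64) == 128, "slides: 16 extracts * 8 units");
static_assert(costViaStack(8, 128, 64) == 24, "stack: 8 for store + 16 loads");
```

Under these assumptions, halving LMUL from 8 to 4 cuts the slide cost by a factor of four (128 to 32 units) but only halves the stack cost (24 to 12 units), which is the quadratic versus linear behavior described above. The model ignores everything else a real core does, so it says nothing about the absolute cycle counts in the results.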
Showing 4 changed files with 472 additions and 280 deletions.