PR #2105 implements operand analysis to determine vectorization of gmem loads. Quote:
Are we supporting a general pointwise epilogue fusion, or do we only support biases? For example, if I have a "bias" whose shape is [N, M], transpose it to [M, N], and add it to the matmul result, can this be handled? It is very important that what we accept into the scheduler is compatible with what we assume here. Building out a general vectorization analysis should refer to the pointwise scheduler and use the code in https://github.com/NVIDIA/Fuser/blob/main/csrc/scheduler/vectorize_helper.h; this is what #807 is doing.
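To make the transposed-bias case concrete, here is a minimal NumPy sketch of the epilogue in question (illustrative only, not nvFuser code):

```python
import numpy as np

M, N, K = 4, 6, 8
A = np.random.rand(M, K).astype(np.float32)
B = np.random.rand(K, N).astype(np.float32)
bias = np.random.rand(N, M).astype(np.float32)  # "bias" stored as [N, M]

# The epilogue transposes bias to [M, N] before adding it to the matmul
# result. After the transpose, the bias's innermost (fast) axis in memory
# is M rather than N, so a vectorization analysis that assumes the bias is
# contiguous along N would pick an invalid vector width for this input.
out = A @ B + bias.T
assert out.shape == (M, N)
```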
Regarding this, I believe we have two options:
Make sure that we accept only very limited cases of epilogue fusion (i.e. just bias and activation) into the scheduler, and use this simple analysis.
Use vectorize_helper to build out a complete analysis for pointwise epilogue like the pointwise scheduler.
Whichever option we take, I don't think either is easy or well tested. For option 1, we need to review the scheduler's canScheduleCompileTime code and brainstorm more adversarial examples; for option 2, we need to copy some code from the pointwise scheduler, as in #807 (Matmul, enable epilogue input vectorization). @drzejan2 do you remember the status of #807?
But anyway, epilogue vectorization is a much more difficult task than vectorizing the A and B operands. Can we move it to a separate PR?
Option 1 is tracked in #2167. This issue corresponds to option 2 listed above. We might additionally need to update MatmulParams::SupportedVectorization if we support different vectorization factors for the different input and output tensors.
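If MatmulParams::SupportedVectorization were extended per tensor, it might conceptually look like the following sketch. This is a hypothetical Python analogue purely for discussion; the real struct is C++, and all field names here are assumptions, not nvFuser's API:

```python
from dataclasses import dataclass, field

@dataclass
class SupportedVectorization:
    # Illustrative analogue of MatmulParams::SupportedVectorization with
    # per-role factors; names and defaults are assumptions.
    a: int = 8            # vector width for operand A gmem loads
    b: int = 8            # vector width for operand B gmem loads
    epilogue: int = 4     # shared width for epilogue inputs and outputs

    # A generalization could instead track one factor per tensor,
    # allowing different widths for each epilogue input and output.
    per_tensor: dict = field(default_factory=dict)  # tensor name -> width
```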
This PR restricts the accepted matmul segments for the nvfuser matmul
scheduler to only those containing pointwise epilogues. Additionally, it
rules out cases for which we cannot yet reliably determine epilogue input
vectorization due to transposes (TODO, see #2169).
Note that this check can be lifted when more epilogue cases are
supported, e.g. #2213.
Fixes #2167.
This is stacked on #2175 and its follow-up PR introducing LinearOp,
because currently segmentation fails for matmuls unless the complete
fusion can be scheduled (see #1707). The MatmulOp and LinearOp IR nodes
remove the need to inspect operand producer branches, so segmentation
should work fine once that work is merged. This PR will be marked as
draft until then.
This PR does the following:
1. Rename `RolesMap` to `TensorRolesMap` and introduce `DimRolesMap`
which is a mapping from `ValGroup` to `MatmulDomain`.
2. Compute a canonical dim ordering on `ValGroup`s based on allocation
domains of inputs and outputs. This is used to compute vectorization
properly but can be used for canonicalization of loop domains in
scheduleMatmul in a future PR.
3. Properly infer vectorization for every operand, epilogue input, and
output based on canonical dim ordering.
This is in preparation for further generalization to accommodate multiple
MmaOps in a single Fusion.
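The per-tensor inference in step 3 essentially bounds the vector width by the innermost contiguous extent (in the canonical dim ordering) and the 16-byte (128-bit) hardware load width. A simplified sketch of that bound, not the actual implementation:

```python
def max_vectorization(inner_extent: int, dtype_bytes: int,
                      max_load_bytes: int = 16) -> int:
    """Largest power-of-two vector width that both divides the innermost
    contiguous extent and fits in one max_load_bytes (128-bit) access."""
    width = max_load_bytes // dtype_bytes  # e.g. 8 elements for fp16
    # Halve the width until it evenly divides the contiguous inner extent.
    while width > 1 and inner_extent % width != 0:
        width //= 2
    return max(width, 1)

# fp16 operand with inner extent 64 -> full 8-wide (128-bit) loads
assert max_vectorization(64, 2) == 8
# fp32 epilogue input with inner extent 6 -> only 2-wide loads
assert max_vectorization(6, 4) == 2
```

Note that a transposed epilogue input changes which dimension is innermost in its allocation domain, which is why the canonical dim ordering in step 2 is needed before this bound can be applied per tensor.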
Fixes #2169.
---------
Co-authored-by: Gao, Xiang <qasdfgtyuiop@gmail.com>
Originally posted by @zasdfgbnm in #2105 (comment). See #807 and #682.