Analyze matmul epilogue to determine vectorization #2169

Closed
jacobhinkle opened this issue May 1, 2024 · 0 comments · Fixed by #2303

@jacobhinkle
Collaborator

PR #2105 implements operand analysis to determine vectorization of gmem loads. Quote:

Are we supporting a general pointwise epilogue fusion, or do we only support biases? For example, if I have a "bias" whose shape is [N, M], I transpose it to [M, N] and add it to the matmul result, can this be handled? It is very important that what we accept into the scheduler is compatible with what we assume here. Building out a general vectorization analysis should refer to the pointwise scheduler and use the code in https://github.com/NVIDIA/Fuser/blob/main/csrc/scheduler/vectorize_helper.h, and this is what #807 is doing.

Regarding this, I believe we have two options:

  1. Make sure that we accept only the very limited cases of epilogue fusion (i.e. just with bias and activation) into the schedule and use this simple analysis.
  2. Use vectorize_helper to build out a complete analysis for pointwise epilogue like the pointwise scheduler.

Whichever option we take, I don't think that is easy and well tested. For option 1, we need to review the scheduler canScheduleCompileTime code and brainstorm more adversarial examples, and for option 2, we need to copy some code from the pointwise scheduler, as in #807 (Matmul, enable epilogue input vectorization). @drzejan2 do you remember the status of #807?

But anyway, epilogue vectorization is a much more difficult task than A and B. Can we move it to a separate PR?

Originally posted by @zasdfgbnm in #2105 (comment)

Option 1 is tracked in #2167. This issue corresponds to option 2 listed above. We might additionally need to update MatmulParams::SupportedVectorization if we support different vectorizations for the different input and output tensors.

See #807 and #682
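
To make the transposed-bias concern from the quote concrete, here is a minimal sketch of a per-tensor width check. It is not nvFuser code: `TensorDesc` and `maxVectorWidth` are hypothetical names invented for illustration, and it assumes 16-byte-aligned base pointers and a 16-byte maximum access size. The real analysis would go through vectorize_helper and allocation domains rather than string-labelled dims.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a global-memory tensor; not an nvFuser type.
struct TensorDesc {
  std::vector<std::string> alloc_dims; // allocation order, outermost first
  std::vector<int64_t> extents;        // extent of each allocation dim
  bool inner_contiguous;               // is the innermost dim stride-1?
  int64_t elem_bytes;                  // size of one element in bytes
};

// Returns the largest power-of-two element count (at most 16 bytes per
// access) that divides the innermost extent. If the tensor's innermost
// allocation dim is not the dim we vectorize along, or it is not
// contiguous, only scalar access (width 1) is possible.
// Assumption: the base pointer is 16-byte aligned.
int64_t maxVectorWidth(const TensorDesc& t, const std::string& vec_dim) {
  if (t.alloc_dims.empty() || !t.inner_contiguous ||
      t.alloc_dims.back() != vec_dim) {
    return 1;
  }
  int64_t width = 16 / t.elem_bytes;
  while (width > 1 && t.extents.back() % width != 0) {
    width /= 2;
  }
  return width;
}
```

With a half-precision [M, N] output allocated with N innermost, this gives a width of 8 along N. A bias allocated as [N, M] and transposed in the fusion has M innermost, so the same check returns 1 along N. That is why an analysis that looks only at the output would pick an invalid width for that epilogue input.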

jacobhinkle self-assigned this May 1, 2024
jacobhinkle added a commit that referenced this issue May 9, 2024
This PR restricts the accepted matmul segments for the nvfuser matmul
scheduler to only those containing pointwise epilogues. Additionally, it
rules out cases for which we cannot yet reliably determine epilogue input
vectorization due to transposes (TODO, see #2169).

Note that this check can be lifted when more epilogue cases are
supported, e.g. #2213.

Fixes #2167.

This is stacked on #2175 and a follow-up PR to it that introduces LinearOp
because currently segmentation fails for matmuls unless the complete
fusion can be scheduled (see #1707). The MatmulOp and LinearOp IR nodes
remove the need to inspect operand producer branches, so segmentation
should work fine once that work is merged. This PR will be marked as
draft until then.
jacobhinkle added a commit that referenced this issue Jun 6, 2024
This PR does the following:
1. Rename `RolesMap` to `TensorRolesMap` and introduce `DimRolesMap`
which is a mapping from `ValGroup` to `MatmulDomain`.
2. Compute a canonical dim ordering on `ValGroup`s based on allocation
domains of inputs and outputs. This is used to compute vectorization
properly but can be used for canonicalization of loop domains in
scheduleMatmul in a future PR.
3. Properly infer vectorization for every operand, epilogue input, and
output based on canonical dim ordering.

This is in preparation for further generalization to accommodate multiple
MmaOps in a single Fusion.

Fixes #2169.

---------

Co-authored-by: Gao, Xiang <qasdfgtyuiop@gmail.com>