[MLIR] HLO to LHLO conversion and fusion #41424
Does it have to be either/or or can we have both? Maybe with a different focus at different level/layers of the stack? |
We also have an implementation for fusion in Linalg that works at the buffer level (in the same file that was linked). The main obstacle back then was the lack of bufferization support for Linalg, which has now landed in MLIR core, so we could revisit it. It is not a high priority right now, though.

One difference we have in XLA is that we have a fusion pass (https://github.com/tensorflow/tensorflow/blob/6e7992cae2f4af31ebc9753d7baf21ede69c39b3/tensorflow/compiler/mlir/hlo/lib/Dialect/mhlo/transforms/mhlo_fusion.cc) that pre-computes what operations we want to fuse, so we do not use a fixed point during the actual fusion in Linalg itself. I believe that is also what the aforementioned interfaces are used for. This is modeled with a FusionOp node, and we just fuse everything starting from the result of that node.

I think we already share the core components (Linalg fusion, which is in core, and HLO to Linalg, which is in TensorFlow) and we should expand on building these out. There are some Linalg to HLO patterns, for example, that would be great to move from IREE to TensorFlow. |
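For readers following along, the two-phase approach described above (an analysis pass pre-computes which operations to fuse, and the rewrite then fuses each group without any fixed-point search) can be sketched as a Python toy. This is only a conceptual illustration, not the actual mhlo_fusion.cc logic; the graph encoding and the names `plan_fusion_groups` and `can_fuse` are all hypothetical.

```python
# Toy sketch of pre-computed fusion grouping (the role the XLA-side
# fusion pass / FusionOp plays): decide the groups first, so the actual
# fusion rewrite never needs to iterate to a fixed point.

def plan_fusion_groups(ops, can_fuse):
    """ops: list of (name, inputs) in topological order.
    Returns a dict mapping each op to its fusion-group leader."""
    group = {name: name for name, _ in ops}   # each op starts alone
    for name, inputs in reversed(ops):        # walk from results upward
        for i in inputs:
            if i in group and can_fuse(i, name):
                group[i] = group[name]        # pull producer into consumer's group
    return group

# A simple chain a -> b -> c where everything is fusable collapses
# into a single group led by the final consumer "c".
ops = [("a", ["x"]), ("b", ["a"]), ("c", ["b"])]
print(plan_fusion_groups(ops, lambda producer, consumer: True))
```

A later rewrite would then fuse everything in a group starting from the group leader's result, mirroring the "fuse everything starting from the result of that node" description above.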
Thanks @joker-eph, @sherhut for the replies.
We will need some fusion at buffer level for sure. Operations that have reduction semantics cannot be represented as well in Linalg on tensors (it's an ongoing discussion). So fusing things like matmul/conv/reduce with other ops will probably use Linalg on buffers or the vector dialect.
FWIW, the conversion from HLO to Linalg and the fusion pass already live in TF and MLIR respectively, so just invoking the two passes should give you that. Then the missing piece, as you mention, is the bufferization. But if I understand correctly, it's similar to the one being used (or being considered) for buffer allocation from HLO -> LHLO. So using the fusion-on-tensors pass in Linalg should hopefully not take a lot of work. If there are any issues, we can address them because we are using these passes ourselves and have an incentive to fix them.
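To make the "missing piece" concrete for readers, here is a toy Python model of what a bufferization step has to do: each tensor-producing op gets an explicit buffer allocation, and the op is rewritten to write into that buffer instead of producing an SSA tensor value. This is an illustration under heavily simplified assumptions (no aliasing, no deallocation, function arguments treated as already bufferized); it is not MLIR's actual buffer-allocation pass, and the representation is made up.

```python
# Toy model of bufferization: tensor-level ops (pure SSA values) become
# buffer-level ops (explicit allocations + destination operands).

def bufferize(tensor_ops):
    """tensor_ops: list of (result, opname, operands).
    Returns a flat list of buffer-level instructions."""
    buffer_ops = []
    for result, opname, operands in tensor_ops:
        buf = f"%buf_{result}"
        buffer_ops.append(("alloc", buf))          # materialize storage
        ins = [f"%buf_{o}" for o in operands]      # operands read from buffers
        buffer_ops.append((opname, ins, buf))      # op writes into its buffer
    return buffer_ops

# One tensor add becomes: an allocation, then an op writing into it.
print(bufferize([("0", "mhlo.add", ["arg0", "arg1"])]))
```

The real pass additionally has to decide when a result can reuse an operand's buffer in place, which is exactly where the HLO -> LHLO buffer-allocation work mentioned above overlaps with this.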
Not sure how using a fixed point doesn't achieve what you want. In IREE too, the "things to be fused" are decided beforehand. Using the fusion on tensors should give you the same thing.
This has been a source of confusion for a long time. The only ones we have in IREE go from HLO -> Linalg on buffers. The reason they are in IREE is because, AFAICS, the TF codegen strategy is to go from HLO -> LHLO -> Linalg. That diverges a bit. IREE uses HLO as an "input" IR for now. All other dialects used in IREE either live in IREE or in MLIR. Is there something specific you are looking at that you think we could port from IREE to TF which is outside of the HLO -> Linalg on buffers conversion? |
On Fri, Jul 17, 2020 at 11:25 AM MaheshRavishankar ***@***.***> wrote:

> Thanks @joker-eph, @sherhut for the replies.
>
> > Does it have to be either/or or can we have both? Maybe with a different focus at different level/layers of the stack?

(At the moment I believe the only fusion pass in the MLIR-HLO land is operating on tensors.)

> We will need some fusion at buffer level for sure. Operations that have reduction semantics cannot be represented as well in Linalg on tensors (it's an ongoing discussion). So fusing things like matmul/conv/reduce with other ops will probably use Linalg on buffers or the vector dialect.
>
> > The main obstacle back then was the lack of bufferization support for Linalg, which has now landed in MLIR core, so we could revisit it. It is not a high priority right now, though.
>
> FWIW, the conversion from HLO to Linalg and the fusion pass already live in TF and MLIR respectively, so just invoking the two passes should give you that. Then the missing piece, as you mention, is the bufferization. But if I understand correctly, it's similar to the one being used (or being considered) for buffer allocation from HLO -> LHLO. So using the fusion-on-tensors pass in Linalg should hopefully not take a lot of work. If there are any issues, we can address them because we are using these passes ourselves and have an incentive to fix them.
>
> > One difference we have in XLA is that we have a fusion pass (https://github.com/tensorflow/tensorflow/blob/6e7992cae2f4af31ebc9753d7baf21ede69c39b3/tensorflow/compiler/mlir/hlo/lib/Dialect/mhlo/transforms/mhlo_fusion.cc) that pre-computes what operations we want to fuse, so we do not use a fixed point during the actual fusion in Linalg itself. I believe that is also what the aforementioned interfaces are used for. This is modeled with a FusionOp node, and we just fuse everything starting from the result of that node.
>
> Not sure how using a fixed point doesn't achieve what you want. In IREE too, the "things to be fused" are decided beforehand. Using the fusion on tensors should give you the same thing.
>
> > I think we already share the core components (Linalg fusion, which is in core, and HLO to Linalg, which is in TensorFlow) and we should expand on building these out. There are some Linalg to HLO patterns, for example, that would be great to move from IREE to TensorFlow.
>
> This has been a source of confusion for a long time. The only ones we have in IREE go from HLO -> Linalg on buffers. The reason they are in IREE is because, AFAICS, the TF codegen strategy is to go from HLO -> LHLO -> Linalg. That diverges a bit. IREE uses HLO as an "input" IR for now. All other dialects used in IREE either live in IREE or in MLIR. Is there something specific you are looking at that you think we could port from IREE to TF which is outside of the HLO -> Linalg on buffers conversion?

I still don't understand, so the confusion isn't over yet. There is not a "single unique codegen strategy in TF", and I don't understand why HLO -> Linalg should be outside of TF right now.

Ref: iree-org/iree#2011 (comment)

--
Mehdi |
I meant to say that it was an obstacle, and the generic buffer allocation support in MLIR could handle it now. It simply was not a priority to get done, but that might change once this code gets more use again.
I am not saying that a fixed point does not achieve it. I just tried to explain what the fusion pass that landed recently is about and why these interfaces were added. That is independent of whether we fuse on tensors or buffers, and it is certainly not an argument against fusion on tensors. In fact, I am not arguing against it at all, just providing context.
I wondered about these patterns. They go from HLO to Linalg and could hence be placed in the TF/HLO repos. |
I see that recently there has been a lot of work from @bondhugula on HLO to LHLO conversion. I am wondering what the plan here is w.r.t. fusion. Is the fusion being done in HLO or LHLO? I remember a previous PR added fusion to XLA-HLO (using OpInterfaces); is that being used? Is the plan to do fusion in LHLO, which operates on memrefs?
The reason I ask is that in IREE we have been relying heavily on fusion at the tensor level rather than fusion at the buffer level. For elementwise operations this has worked fairly well so far. Two passes implement this fusion and cover a large number of elementwise operations.
The latter covers fusion of generic ops, indexed generic ops, and tensor reshape ops with each other. Fusion at the tensor level is much easier to do, and the pass effectively performs a fixed-point iteration to achieve maximal fusion. (There might be cases where maximal fusion is detrimental, but I haven't encountered such a case when elementwise operations are involved.) I wanted to see if it would benefit the overall codegen in TensorFlow to use these passes. They aren't complete and might need enhancements to cover additional cases we haven't encountered in IREE, but so far, for fusing elementwise operations, these two passes together have worked very well for us. I am trying to see if we can align codegen strategies between IREE and TF so that we can build more shared components.
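The fixed-point iteration mentioned above can be sketched as a small Python toy: repeatedly merge an elementwise producer into its single consumer until no rewrite applies. The graph encoding and the name `fuse_elementwise` are hypothetical; the real Linalg patterns operate on linalg.generic ops and their region bodies, not on this simplified model.

```python
# Conceptual sketch of fixed-point producer-consumer fusion on a tensor
# dataflow graph: keep applying the "fuse single-use elementwise
# producer into elementwise consumer" rewrite until nothing changes.

def fuse_elementwise(ops):
    """ops: list of (name, kind, inputs) in topological order."""
    changed = True
    while changed:                         # fixed-point loop
        changed = False
        uses = {}
        for name, _, inputs in ops:
            for i in inputs:
                uses.setdefault(i, []).append(name)
        for idx, (name, kind, inputs) in enumerate(ops):
            # Only fuse an elementwise producer with exactly one use.
            if kind != "elementwise" or len(uses.get(name, [])) != 1:
                continue
            consumer = uses[name][0]
            cidx = next(i for i, op in enumerate(ops) if op[0] == consumer)
            cname, ckind, cinputs = ops[cidx]
            if ckind != "elementwise":
                continue
            # Merge: the consumer now reads the producer's inputs directly.
            merged = [x for i in cinputs
                        for x in (inputs if i == name else [i])]
            ops[cidx] = (cname, "elementwise", merged)
            del ops[idx]
            changed = True
            break                          # restart scan after mutating ops
    return ops

# A chain of three elementwise ops collapses into one fused op.
graph = [("a", "elementwise", ["x"]),
         ("b", "elementwise", ["a"]),
         ("c", "elementwise", ["b", "y"])]
print(fuse_elementwise(graph))
```

In this toy, maximal fusion falls out of simply re-running the rewrite to a fixed point, which is the property being contrasted with the pre-computed-groups approach discussed earlier in the thread.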
@stellaraccident, @sherhut, @joker-eph, @antiagainst, @nicolasvasilache for comments/visibility.