Remove IREE usage of the Global Dialect Registry #3036
Conversation
FYI @joker-eph @marbre
LG! I haven't seen anything that looks obviously incorrect or non-idiomatic skimming through this.
I don't know if my approval counts here, but I've been blocked on this so I have a vested interest to see this landing inside Google :)
// The context is not yet available here and runOnOperation is too late
// to load the dialects. Therefore we have to register all dialects that
// the conversion patterns might create.
// TODO(#2958): Register dialects dynamically based on the
// HALConversionDialectInterfaces that are available.
IREE::Strings::StringsDialect, IREE::TensorList::TensorListDialect>();
This is unfortunate - why is it needed? The interface is used to find/work with dialects that are already registered by someone else - if they aren't registered, they aren't needed. For example, if you are iree-opt and don't have the tensorlist dialect pulled in then you won't be able to work with tensorlist - that's expected - but if you are iree-tf-opt and do have it then this pass should find it.
Even if these dialects are registered, we have to tell the pass manager to load them before running the pass pipeline. It uses the declared dependent dialects of the pass to do this (https://github.com/llvm/llvm-project/blob/master/mlir/lib/Pass/Pass.cpp#L749). This pass creates entities from these dialects if it sees the corresponding tf_strings and tf_tensorlist dialects. Per the TODO here, I don't see another way to do this without some changes to MLIR.
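The mechanism described here (the pass manager collecting each pass's declared dependent dialects before the pipeline runs) can be sketched with a toy model. All names below are hypothetical stand-ins for MLIR's `DialectRegistry`/`Pass` machinery, not the real classes; this is only meant to illustrate why the dialect list must be declared statically, up front.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <vector>

// Stand-in for MLIR's DialectRegistry: just a set of dialect names.
using DialectRegistryStub = std::set<std::string>;

struct PassStub {
  virtual ~PassStub() = default;
  // Each pass declares every dialect its patterns might create; by the
  // time runOnOperation() executes it is too late to load them.
  virtual void getDependentDialects(DialectRegistryStub &registry) const {}
};

struct ConvertFlowToHALPassStub : PassStub {
  void getDependentDialects(DialectRegistryStub &registry) const override {
    // A static over-approximation: these get registered even if the
    // corresponding tf_strings/tf_tensorlist ops never appear.
    registry.insert("iree_strings");
    registry.insert("iree_tensorlist");
  }
};

// Mimics what the pass manager does before running: collect dependent
// dialects from every pass in the pipeline so they can be loaded.
DialectRegistryStub collectDependentDialects(
    const std::vector<std::unique_ptr<PassStub>> &passes) {
  DialectRegistryStub registry;
  for (const auto &pass : passes) pass->getDependentDialects(registry);
  return registry;
}
```

The key point the mock captures: collection happens at pipeline-construction time, with no context or module in hand, so the pass cannot consult what is actually loaded.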
This absolutely needs to be fixed in MLIR - is there an upstream bug for it? This is a full inversion of dependencies and like this there is no way to add an out-of-tree custom dialect that lowers into the HAL. That's really bad. If the next thing done is fixing this I'm ok with it, but it's a really bad change and it's not good that this was forced by upstream without a workaround.
@joker-eph could we have a bug to track providing some way of allowing dynamic dialect dependencies for a pass?
That sounds to me right now like a problem to solve within IREE's infrastructure more than within MLIR? It isn't clear to me what kind of facility is missing from upstream right now.
This loads the dialect unnecessarily, though, even if the pass isn't called. What would be nice is some way for this pass's getDependentDialects to look at the currently loaded dialects, but we can't do that here.
Right now with the global registry, we have a way for an out-of-tree user to say "here's how to lower this dialect": if the dialect and its interface are registered, the lowering is picked up. Without it, we don't.
Sure, but another way to see it is that your extension mechanism for users to plug into your system was "hijacking" the upstream global registry in ConvertFlowToHAL.cpp:
for (auto *dialect : context->getLoadedDialects()) {
There are many ways to get around this; for example, an IREEFlowToHalRegistry wouldn't be flawed in the same way and would allow any out-of-tree user to plug themselves in there.
Similarly, every extension point where you hit this issue is just a place where the IREE extension mechanisms need to be revisited and made more defensive (rely less on global side effects that you don't have control over, remove implicitness, etc.).
Ok here's another way that I think is a lot better. Subclasses of HALConversionDialectInterface are responsible for loading dialects they need on construction. It's still not super pretty because there isn't an easy hook for them to implement (calling child methods from constructors is hard), and it sometimes loads the dialects unnecessarily, but at least it doesn't have layering issues.
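The workaround described here (interface subclasses eagerly loading the dialects they need when constructed) can be sketched as follows. The class and method names are hypothetical stand-ins, not the real IREE/MLIR types; the point is only the shape of the ownership: the interface, not the pass, takes responsibility for loading.

```cpp
#include <cassert>
#include <set>
#include <string>

// Stand-in for MLIRContext: tracks which dialects have been loaded.
struct ContextStub {
  std::set<std::string> loadedDialects;
  void getOrLoadDialect(const std::string &name) {
    loadedDialects.insert(name);
  }
};

// Stand-in for HALConversionDialectInterface. Subclasses load their
// target dialects in their own constructors, since a base-class
// constructor cannot call virtual hooks on the derived class.
struct HALConversionInterfaceStub {
  explicit HALConversionInterfaceStub(ContextStub &ctx) { (void)ctx; }
  virtual ~HALConversionInterfaceStub() = default;
};

struct StringsToHALInterfaceStub : HALConversionInterfaceStub {
  explicit StringsToHALInterfaceStub(ContextStub &ctx)
      : HALConversionInterfaceStub(ctx) {
    // Eagerly load the dialect this interface's patterns will create.
    // Possibly unnecessary if the conversion never runs, but it avoids
    // the layering inversion of hard-coding the list in the pass.
    ctx.getOrLoadDialect("iree_strings");
  }
};
```

As the comment notes, this trades a little eagerness (dialects loaded even when unused) for keeping the dependency local to the interface that actually needs it.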
Btw, independently, I made a post related to this here: https://llvm.discourse.group/t/rfc-revamp-dialect-registration/1559/3
I think it's plausible for a dialect to know all dialects that might be created by any of its interfaces or any interfaces implemented on its ops since the set of interfaces implemented by a dialect is a closed world fixed at dialect build time. Thus, I think that having context->getOrLoadDialect<IREE::Strings::StringsDialect>();
in the dialect constructor is actually pretty palatable. Maybe the dependency on IREE::Strings::StringsDialect can be squirreled away in TfStringsToHALConversionInterface::getDependentDialects, but that seems minor. I don't see a strong need to go full lazy and only have IREE::Strings::StringsDialect loaded when TfStringsToHALConversionInterface is loaded, but it seems like a fine design as well -- we will need to be sure to collect all the interfaces in the FlowToHal conversion pass constructor, but that seems layered nicely.
One way this could be handled nicely by the framework is for DialectInterfaces to have a hook "getDependentDialects". addInterfaces in the dialect constructor would automatically call.
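The framework hook proposed here can be sketched concretely. Everything below is hypothetical (MLIR's DialectInterface has no such hook today, per this discussion): if interfaces exposed a getDependentDialects method, an addInterfaces-style registration could invoke it automatically.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <vector>

using RegistryStub = std::set<std::string>;

// Hypothetical DialectInterface base with the proposed hook.
struct InterfaceStub {
  virtual ~InterfaceStub() = default;
  virtual void getDependentDialects(RegistryStub &registry) const {}
};

struct TfStringsToHALInterfaceStub : InterfaceStub {
  void getDependentDialects(RegistryStub &registry) const override {
    // The dependency is squirreled away in the interface, not the pass.
    registry.insert("iree_strings");
  }
};

// Hypothetical Dialect whose addInterface() calls the hook
// automatically, as the comment above proposes.
struct DialectStub {
  RegistryStub dependentDialects;
  std::vector<std::unique_ptr<InterfaceStub>> interfaces;

  template <typename T>
  void addInterface() {
    auto iface = std::make_unique<T>();
    iface->getDependentDialects(dependentDialects);
    interfaces.push_back(std::move(iface));
  }
};
```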
On the topic of
for (auto *dialect : context->getLoadedDialects()) {
I think it is actually layered nicely and is not a hack. Because we maintain the invariant "if there is an op in the module, then its dialect is loaded", then the above is just a handy alternative to this:
std::set<Dialect *> dialectSet;
module.walk([&](Operation *op) { dialectSet.insert(op->getDialect()); });
for (auto *dialect : dialectSet) {
Iterating over getLoadedDialects has the advantage of working even without access to the module. It has the disadvantage that it might iterate over more dialects than strictly necessary, but that seems generally ok.
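The equivalence being argued here can be modeled with toy types (these are illustrative stand-ins, not MLIR's Operation/ModuleOp): walking the module collects exactly the dialects of ops present, and under the invariant "an op's dialect is always loaded", that set is a subset of what getLoadedDialects would return.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Toy op that just records which dialect it belongs to.
struct OpStub {
  std::string dialect;
};

// Toy module supporting a walk() over its ops.
struct ModuleStub {
  std::vector<OpStub> ops;
  void walk(const std::function<void(OpStub *)> &fn) {
    for (auto &op : ops) fn(&op);
  }
};

// The "handy alternative" from the comment: collect the dialects
// actually used by ops in the module.
std::set<std::string> dialectsInModule(ModuleStub &module) {
  std::set<std::string> dialectSet;
  module.walk([&](OpStub *op) { dialectSet.insert(op->dialect); });
  return dialectSet;
}
```

With the invariant in place, iterating loaded dialects over-approximates this set (it may include loaded-but-unused dialects) while never missing one, which is the trade-off the comment describes.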
On the topic of
for (auto *dialect : context->getLoadedDialects()) {
I think it is actually layered nicely and is not a hack. Because we maintain the invariant "if there is an op in the module, then its dialect is loaded", then the above is just a handy alternative to this:
It is nice in principle but:
- That works only if you already have the module; you can't create the pipeline first and then get a module. You'd have to recreate the pipeline for every new module.
- A pass in the pipeline can introduce a new dialect, which will be loaded through getDependentDialects() at run() time, while you seem to need the information at pipeline build time here.
registry.insert<AffineDialect,
                Vulkan::VulkanDialect,
                gpu::GPUDialect,
                linalg::LinalgDialect,
                scf::SCFDialect,
                spirv::SPIRVDialect,
                vector::VectorDialect>();
This is a big layering violation. Is this the final form of this registration system? Requiring the transitive list of all dialects that may be used by any subsequent lowering is very fragile and removes a lot of the composition possibilities - someone adding a small transform through another dialect way down between vector->spirv would now have to pop back up to these files and add the deps/registration/etc. If that dialect stops being used then someone has to remember to scrub all places that may depend on it. Yuck :(
I hope not 😕 The issue here is that because we have a pass pipeline within a pass, the pass has to say what dialects it will use up front for the entire pass pipeline. By the time runOnOperation happens, it's too late to load the dialects. There's a TODO on the original declaration of this method in the superclass referencing this, but I hope that we can revisit how to do this properly with dynamic pass pipelines. If we could walk the passes in the pipeline then we wouldn't need to declare them all here, but right now that requires a context, which we can't get in this method. I could investigate further trying to get all the passes and their dependent dialects, but I think we should just re-evaluate in the context of dynamic pass pipelines.
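The pipeline-within-a-pass problem described here can be illustrated with a mock (all names hypothetical): the outer pass would need to declare the union of every dialect any nested pass might produce, and the computation below shows what "walking the nested passes" would yield if a context were available at that point.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <vector>

using NestedRegistry = std::set<std::string>;

struct NestedPassStub {
  virtual ~NestedPassStub() = default;
  virtual void getDependentDialects(NestedRegistry &registry) const {}
};

struct VectorToSPIRVPassStub : NestedPassStub {
  void getDependentDialects(NestedRegistry &registry) const override {
    registry.insert("spirv");
  }
};

struct LinalgToLoopsPassStub : NestedPassStub {
  void getDependentDialects(NestedRegistry &registry) const override {
    registry.insert("scf");
    registry.insert("affine");
  }
};

// What the outer pass's getDependentDialects would like to compute by
// walking its nested pipeline; in practice it cannot (no context at
// declaration time), so the transitive list is hard-coded instead.
NestedRegistry transitiveDialects(
    const std::vector<std::unique_ptr<NestedPassStub>> &pipeline) {
  NestedRegistry all;
  for (const auto &pass : pipeline) pass->getDependentDialects(all);
  return all;
}
```

The fragility complained about above falls out directly: adding or removing a nested pass silently changes the required union, and nothing forces the hard-coded list to track it.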
Any ETA on those and an upstream LLVM bug tracking it? As above, I'm concerned that this is a big step back and something that is generally really hard to unwind the further it is allowed to persist. As with the flow->hal example breaking out-of-tree custom dialects lowering into the HAL, this change breaks out-of-tree backends lowering out of the HAL.
https://llvm.discourse.group/t/rfc-dynamic-pass-pipeline/1637. I'm reading up on it now.
I don't think this should break out-of-tree backends. The backend has to declare all the dependent dialects for the pass, which is gross, but it's still possible. I'm continuing the discussion about how to make this possible in the linked RFC. For now, I think this is much less bad than the ConvertFlowToHal thing, which I think I've fixed.
To be clear, I think your changes here are fine given what you have to work with: my issues are solely with the fact that these changes were required and that this should have highlighted issues in the registration design that prevented upstream from progressing with breakage. If @joker-eph has bugs to track fixing these issues then I'll feel better about these landing, but I'm not happy about it :/
This reverts commit 7f8f2c5.
Force-pushed from df3e280 to 8969f16.
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info.
Oh ffs. GitHub used your other email for the co-author tag....
Co-authored-by: Marius Brehler <marius.brehler@iml.fraunhofer.de>
Force-pushed from f1efa46 to 00acdaa.
All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the ℹ️ Googlers: Go here for more info.
@GMNGeoffrey Sorry for the trouble :/
Oh, not your fault at all! A combination of missing functionality in GitHub and Googlebot.
Mostly solves #2958, but there are some TODOs left because this doesn't
handle our pipeline-within-a-pipeline workaround for the absence of
dynamic pass registration very gracefully. Hopefully #1036 will solve
some of these issues.