Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove IREE usage of the Global Dialect Registry #3036

Merged
merged 6 commits into from Sep 2, 2020

Conversation

GMNGeoffrey
Copy link
Contributor

@GMNGeoffrey GMNGeoffrey commented Aug 29, 2020

Mostly solves #2958, but there are some TODOs left because this doesn't
handle our pipeline-within-a-pipeline workaround for the absence of
dynamic pass registration very gracefully. Hopefully #1036 will solve
some of these issues.

@GMNGeoffrey
Copy link
Contributor Author

FYI @joker-eph @marbre

Copy link
Contributor

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG! I haven't seen anything that looks obviously incorrect or non-idiomatic skimming through this.

iree/tools/translate_main.cc Show resolved Hide resolved
iree/tools/translate_main.cc Show resolved Hide resolved
Copy link
Contributor

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if my approval counts here, but I've been blocked on this so I have a vested interest to see this landing inside Google :)

Comment on lines 78 to 83
// The context is not yet available here and runOnOperation is too late
// to load the dialects. Therefore we have to register all dialects that
// the conversion patterns might create.
// TODO(#2958): Register dialects dynamically based on the
// HALConversionDialectInterface that are available.
IREE::Strings::StringsDialect, IREE::TensorList::TensorListDialect>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unfortunate - why is it needed? The interface is used to find/work with dialects that are already registered by someone else - if they aren't registered, they aren't needed. For example, if you are iree-opt and don't have the tensorlist dialect pulled in then you won't be able to work with tensorlist - that's expected - but if you are iree-tf-opt and do have it then this pass should find it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if these dialects are registered, we have to tell the pass manager to load them before running the pass pipeline. It uses the declared dependent dialects of the pass to do this (https://github.com/llvm/llvm-project/blob/master/mlir/lib/Pass/Pass.cpp#L749). This pass creates entities from these dialects if it sees the corresponding tf_strings and tf_tensorlist dialects. Per the TODO here, I don't see another way to do this without some changes to MLIR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

This absolutely needs to be fixed in MLIR - is there an upstream bug for it? This is a full inversion of dependencies and like this there is no way to add an out-of-tree custom dialect that lowers into the HAL. That's really bad. If the next thing done is fixing this I'm ok with it, but it's a really bad change and it's not good that this was forced by upstream without a workaround.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joker-eph could we have a bug to track providing some way of allowing dynamic dialect dependencies for a pass?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds to me right now like a problem to solve within IREE's infrastructure more than with MLIR? It isn't clear to me what kind of facility are missing from upstream right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loads the dialect unnecessarily though even if the pass isn't called. What would be nice is for some way for this pass's getDependentDialects to look at the currently loaded dialects, but we can't here.

Copy link
Contributor

@joker-eph joker-eph Sep 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now with the global registry, we have a way for a out-of-tree user to say "here's how to lower this dialect" and if the dialect is registered and the interface for. Without it, we don't.

Sure, but another way to see it that your extension mechanism for users to plug in your system was "hijacking" the upstream global registry in ConvertFlowToHAL.cpp:

    for (auto *dialect : context->getLoadedDialects()) {

There are many ways to get around this, for example a IREEFlowToHalRegistry wouldn't be flawed in the same way and would allow any out-of-tree user to plug themselves there.

Similarly, every extension points where you hit this issue is just a place where the IREE extension mechanisms need to be revisited and be more defensive (rely less on global side effects that you don't have control on, remove implicitness, etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok here's another way that I think is a lot better. Subclasses of HALConversionDialectInterface are responsible for loading dialects they need on construction. It's still not super pretty because there isn't an easy hook for them to implement (calling child methods from constructors is hard), and it sometimes loads the dialects unnecessarily, but at least it doesn't have layering issues.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, independently, I made post related to this here: https://llvm.discourse.group/t/rfc-revamp-dialect-registration/1559/3

I think it's plausible for a dialect to know all dialects that might be created by any of its interfaces or any interfaces implemented on its ops since the set of interfaces implemented by a dialect is a closed world fixed at dialect build time. Thus, I think that having context->getOrLoadDialect<IREE::Strings::StringsDialect>(); in the dialect constructor is actually pretty palatable. Maybe the dependency on IREE::Strings::StringsDialect can be squirreled away in TfStringsToHALConversionInterface::getDependentDialects, but that seems minor. I don't see a strong need to go full lazy and only have IREE::Strings::StringsDialect loaded when TfStringsToHALConversionInterface is loaded, but it seems like a fine design as well -- we will need to be sure to collect all the interfaces in the FlowToHal conversion pass constructor, but that seems layered nicely.

One way this could be handled nicely by the framework is for DialectInterfaces to have a hook "getDependentDialects". addInterfaces in the dialect constructor would automatically call.

On the topic of

for (auto *dialect : context->getLoadedDialects()) {

I think it is actually layered nicely and is not a hack. Because we maintain the invariant "if there is an op in the module, then its dialect is loaded", then the above is just a handy alternative to this:

std::set<Dialect*> dialectSet;
module.walk([&](Operation *op) {
  dialectSet.insert(op->getDialect());
};
for (auto *dialect : dialectSet) {

Iterating over getLoadedDialects has the advantage of working even without access to the module. It has the disadvantage that it might iterate over more dialects than strictly necessary, but that seems generally ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the topic of
for (auto *dialect : context->getLoadedDialects()) {
I think it is actually layered nicely and is not a hack. Because we maintain the invariant "if there is an op in the module, then its dialect is loaded", then the above is just a handy alternative to this:

It is nice in principle but:

  1. That works only if you already have the module, you can't create the pipeline first and then get a module. You'd have to recreate the pipeline for every new module.
  2. A pass in the pipeline can introduce a new dialect, which will be loaded through getDependentDialects() at the "run()" time, while you seem to need the information at pipeline build time here.

iree/compiler/Dialect/HAL/Target/VMLA/VMLATarget.cpp Outdated Show resolved Hide resolved
Comment on lines +241 to +247
registry.insert<AffineDialect,
Vulkan::VulkanDialect,
gpu::GPUDialect,
linalg::LinalgDialect,
scf::SCFDialect,
spirv::SPIRVDialect,
vector::VectorDialect>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big layering violation. Is this the final form of this registration system? Requiring the transitive list of all dialects that may be used by any subsequent lowering is very fragile and removes a lot of the composition possibilities - someone adding a small transform through another dialect way down between vector->spirv would now have to pop back up to these files and add the deps/registration/etc. If that dialect stops being used then someone has to remember to scrub all places that may depend on it. Yuck :(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope not 😕 The issue here is that because we have a pass pipeline within a pass, the pass has to say what dialects it will use up front for the entire pass pipeline. By the time runOnOperation happens, it's too late to load the dialects. There's a TODO on the original declaration of this method in the superclass referencing this, but I hope that we can revisit how to do this properly with dynamic pass pipelines. If we could walk the passes in the pipeline then we wouldn't need to declare them all here, but right now that requires a context, which we can't get in this method. I could investigate further trying to get all the passes and their dependent dialects, but I think we should just re-evaluate in the context of dynamic pass pipelines.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any ETA on those and an upstream LLVM bug tracking it? As above, I'm concerned that this is a big step back and something that is generally really hard to unwind the further it is allowed to persist. As with the flow->hal example breaking out-of-tree custom dialects lowering into the HAL, this change breaks out-of-tree backends lowering out of the HAL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should break out-of-tree backends. The backend has to declare all the dependent dialects for the pass, which is gross, but it's still possible. I'm continuing the discussion about how to make this possible in the linked RFC. For now, I think this is much less bad than the ConvertFlowToHal thing, which I think I've fixed.

iree/compiler/Dialect/HAL/Transforms/BUILD Outdated Show resolved Hide resolved
iree/compiler/Dialect/HAL/Transforms/Passes.cpp Outdated Show resolved Hide resolved
iree/compiler/Dialect/VM/Transforms/Passes.cpp Outdated Show resolved Hide resolved
iree/tools/opt_main.cc Outdated Show resolved Hide resolved
copybara-service bot pushed a commit that referenced this pull request Sep 1, 2020
The default has changed to disable it upstream.

This is a temporary measure. #3036
makes us no longer dependent on the global registry, but there are
some ongoing discussions.

Part of #2958

PiperOrigin-RevId: 329523915
Copy link
Collaborator

@benvanik benvanik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, I think your changes here are fine given what you have to work with: my issues are solely with the fact that these changes were required and that this should have highlighted issues in the registration design that prevented upstream from progressing with breakage. If @joker-eph has bugs to track fixing these issues then I'll feel better about these landing, but I'm not happy about it :/

Comment on lines 78 to 83
// The context is not yet available here and runOnOperation is too late
// to load the dialects. Therefore we have to register all dialects that
// the conversion patterns might create.
// TODO(#2958): Register dialects dynamically based on the
// HALConversionDialectInterface that are available.
IREE::Strings::StringsDialect, IREE::TensorList::TensorListDialect>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

This absolutely needs to be fixed in MLIR - is there an upstream bug for it? This is a full inversion of dependencies and like this there is no way to add an out-of-tree custom dialect that lowers into the HAL. That's really bad. If the next thing done is fixing this I'm ok with it, but it's a really bad change and it's not good that this was forced by upstream without a workaround.

Comment on lines +241 to +247
registry.insert<AffineDialect,
Vulkan::VulkanDialect,
gpu::GPUDialect,
linalg::LinalgDialect,
scf::SCFDialect,
spirv::SPIRVDialect,
vector::VectorDialect>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any ETA on those and an upstream LLVM bug tracking it? As above, I'm concerned that this is a big step back and something that is generally really hard to unwind the further it is allowed to persist. As with the flow->hal example breaking out-of-tree custom dialects lowering into the HAL, this change breaks out-of-tree backends lowering out of the HAL.

iree/compiler/Dialect/VM/Transforms/Passes.cpp Outdated Show resolved Hide resolved
copybara-service bot pushed a commit that referenced this pull request Sep 1, 2020
The default has changed to disable it upstream.

This is a temporary measure. #3036
makes us no longer dependent on the global registry, but there are
some ongoing discussions.

Part of #2958

PiperOrigin-RevId: 329523915
copybara-service bot pushed a commit that referenced this pull request Sep 1, 2020
The default has changed to disable it upstream.

This is a temporary measure. #3036
makes us no longer dependent on the global registry, but there are
some ongoing discussions.

Part of #2958

PiperOrigin-RevId: 329543917
@google-cla
Copy link

google-cla bot commented Sep 2, 2020

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

1 similar comment
@google-cla
Copy link

google-cla bot commented Sep 2, 2020

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added cla: no and removed cla: yes labels Sep 2, 2020
@GMNGeoffrey
Copy link
Contributor Author

Oh ffs. GitHub used your other email for the co-author tag....

Co-authored-by: Marius Brehler <marius.brehler@iml.fraunhofer.de>
@google-cla
Copy link

google-cla bot commented Sep 2, 2020

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@GMNGeoffrey
Copy link
Contributor Author

It's a suggested edit googlebot. Could you just be cool?

just_once_be_cool

@google-cla google-cla bot removed the cla: no label Sep 2, 2020
@GMNGeoffrey GMNGeoffrey merged commit 9228670 into iree-org:main Sep 2, 2020
@GMNGeoffrey GMNGeoffrey deleted the no-global-registry branch September 2, 2020 18:19
@GMNGeoffrey GMNGeoffrey mentioned this pull request Sep 2, 2020
@marbre
Copy link
Contributor

marbre commented Sep 4, 2020

@GMNGeoffrey Sorry for the trouble :/

@GMNGeoffrey
Copy link
Contributor Author

@GMNGeoffrey Sorry for the trouble :/

Oh not your fault at all! A combination of missing functionality in GitHub and Googlebot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants