New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding interfaces for global ops used by the util dialect and fixing SinkGlobalBufferLoads. #10089
Conversation
33c28aa
to
4733991
Compare
This allows us to share more code between dialects using globals and generalizes the Explorer infrastructure to support global ops in any dialect (not just util.global).
This handles multiple uniform and non-uniform global stores that can arise after rodata deduplication. By using the explorer it's also able to track the rodata across block boundaries/calls made from initializers.
// TODO(benvanik): replace this entire pass with generic IPO - the rodata refs | ||
// are kind of constant like and should be trivial to inline, though they can't | ||
// be ConstantLike and will need a new interface so that IPO can materialize | ||
// ops. It's also possible we could use the dialect interface for materializing | ||
// constants to do that, though. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂 new pass with "replace this entire pass" comment, nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🐢 🐢 🐢 🐢 🐢
🐢 🐢 🐢 🐢
🐢 🐢 🐢
🐢 🐢
🐢
Looks like the existing VM tests were relying on canonicalization to constant evaluate everything instead of actually testing the globals at runtime 😨 Test at head after canonicalization, what ends up in the vmfb that's tested: vm.export @test_global_load_f32 attributes {ordinal = 0 : i32}
vm.func @test_global_load_f32() attributes {ordinal = 0 : i32} {
vm.return
} With changes in here that remove that canonicalization (as it was unsafe): vm.global.f32 private @c42 {ordinal = 0 : i32} = 4.250000e+01 : f32
vm.export @test_global_load_f32 attributes {ordinal = 0 : i32}
vm.func @test_global_load_f32() attributes {ordinal = 0 : i32} {
%c9 = vm.const.i32 9
%0 = vm.const.f32 4.250000e+01
%c42 = vm.global.load.f32 @c42 : f32
%1 = vm.cmp.ne.f32.o %c42, %0 : f32
vm.cond_br %1, ^bb2(%c9 : i32), ^bb1
^bb1: // pred: ^bb0
vm.return
^bb2(%2: i32): // pred: ^bb0
vm.fail %2, "@c42 != 42.5"
} This won't run yet as it needs global initialization and such - will see what's all needed and verify these really run. |
Ah....
|
There's some larger reworking here required to make the `vm` compilation modes work a little better: we want them to not change the IR too much but do want some of the nice passes like global inits.
4733991
to
825ac61
Compare
The interfaces allow us to share more code between dialects using globals
and generalizes the Explorer infrastructure to support global ops in any
dialect (not just util.global). We could add external models for ml_program
or iree_input to run the optimization passes there. In this first step we are
only adding them to VM ops to be able to reuse the util passes there. In the
future we could go back to having
flow.global
/stream.global
and such buttoday that's not needed.
This allows for the existing SinkGlobalBufferLoads pass to be reworked to use
the Explorer to perform IR traversal and fix some issues it had with multiple stores
found during #9945 (which after inlining produces multiple duplicate rodata ops).
Future IPO passes will also now be able to function on the VM dialect after
conversion.