Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MLIR][DISC] legalize disc to llvm #50434

Merged
merged 3 commits into from Jul 6, 2021
Merged

Conversation

wyzero
Copy link
Contributor

@wyzero wyzero commented Jun 24, 2021

This PR provides conversion patterns to convert disc_ral.dispatch_op
and gpu.launch_op to its disc-compatible llvm format.

This PR provides conversion patterns to convert `disc_ral.dispatch_op`
and `gpu.launch_op` to its disc-compatible llvm format.
@google-ml-butler google-ml-butler bot added the size:L CL Change Size: Large label Jun 24, 2021
@google-cla google-cla bot added the cla: yes label Jun 24, 2021
@gbaned gbaned self-assigned this Jun 24, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Jun 24, 2021
rewriter.getStringAttr(value), /*alignment=*/0);

rewriter.restoreInsertionPoint(ip);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check that if there is an existing globalOp it also matches the provided value?

Copy link
Contributor Author

@wyzero wyzero Jun 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll add an assert here.

Value loadOrCreateGlobalString(PatternRewriter& rewriter, Operation* op,
StringRef name, StringRef value) {
ModuleOp module = op->getParentOfType<ModuleOp>();
GlobalOp globalOp = module.lookupSymbol<GlobalOp>(name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lookups will be expensive, can you instead build a SymbolTable at the beginning of the function execution and update it through?
(Patterns would have to take it in their constructor and keep a reference to it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'm not sure if I fully understand what you mean. I guess that you say module.lookupSymbol is expensive because this method needs to create new a SymbolTable every time it's called.

It's worth noting that RalToLLVMPass is a function-level instead module-level pass to enable maximum parallelism, thus multiple instance of RalToLLVMPass may run simultaneously. Therefore, we have to re-create a SymbolTable every time we need to lookup a symbol since other threads may insert symbols as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh that seems like a race condition: actually function-level pass can't create symbol in the outer module like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that there is no threading involved, can we look into this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was waiting for you here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry... I misses the updates here. I have fixed it.

PatternRewriter& rewriter, Operation* op) const {
ModuleOp module = op->getParentOfType<ModuleOp>();
LLVMFuncOp func = module.lookupSymbol<LLVMFuncOp>(kRalDispatchFunctionName);
if (!func) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, prefer early exit if possible:

Suggested change
if (!func) {
if (func) return func;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

llvm.func @the_kernel() attributes {gpu.kernel} {
llvm.return
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a precondition of this pass is that we already have a gpu.binary_blob produced. It seems to me that in such cases the module shouldn't also have an IR content?
Not only they seem redundant, but also they could get out-of-sync.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! It does not need to contain the actual IR while we still need to keep the function declaration inside the module since the gpu.launch_op needs to refer it. Thus, I'll remove the content of the the_kernel while keeping the declaration.

// CHECK: %[[T23:.*]] = llvm.getelementptr %[[T22]]{{.*}} : (!llvm.ptr<array<90 x i8>>, i64, i64) -> !llvm.ptr<i8>
// CHECK: llvm.call @disc_ral_call(%[[CTX]], %[[T23]], %[[KPACK]]) : (!llvm.ptr<i8>, !llvm.ptr<i8>, !llvm.ptr<ptr<i8>>) -> ()

gpu.launch_func @kernel_module::@the_kernel blocks in (%c1, %c1, %c1) threads in (%c1, %c1, %c1) args(%0 : memref<?x?xf32>)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you split these patterns into different functions and use CHECK-LABEL to match the functions?

CHECK-LABEL delimit sections and makes FileCheck matching a bit more robust and debugging easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

class RalToLLVMPass : public RalToLLVMPassBase<RalToLLVMPass> {
void getDependentDialects(DialectRegistry& registry) const override {
registry.insert<LLVM::LLVMDialect>();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can put this information in the .td file and it won't be needed here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! It will introduce llvm dialect dependency into header file (e.g. PassDetail.h) if we put this dialect dependency in td file since some passes may shared one .td file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, looks fine to leave it there.

@wyzero wyzero requested a review from joker-eph June 25, 2021 09:06
@wyzero
Copy link
Contributor Author

wyzero commented Jun 29, 2021

@joker-eph Hi, could you help to review the updated version? Thanks!

@wyzero
Copy link
Contributor Author

wyzero commented Jun 30, 2021

@joker-eph Hi, could you help to take a look? Thanks!

@wyzero
Copy link
Contributor Author

wyzero commented Jul 1, 2021

@joker-eph Hi,could you help to take a look? Thanks!

@joker-eph
Copy link
Contributor

You may have missed my comment inline: #50434 (comment) ?

@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jul 2, 2021
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Jul 2, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jul 2, 2021
copybara-service bot pushed a commit to tensorflow/mlir-hlo that referenced this pull request Jul 6, 2021
Imported from GitHub PR tensorflow/tensorflow#50434

This PR provides conversion patterns to convert `disc_ral.dispatch_op`
and `gpu.launch_op` to its disc-compatible llvm format.
Copybara import of the project:

--
410262f963c6ff25875355d1b7885472fc89f84e by Wenyi Zhao <reyizero@gmail.com>:

[MLIR][DISC] legalize disc to llvm

This PR provides conversion patterns to convert `disc_ral.dispatch_op`
and `gpu.launch_op` to its disc-compatible llvm format.

--
3838e485168ce10569b3022f33225a9722073efb by Wenyi Zhao <reyizero@gmail.com>:

fix

--
a9e6891cfdfd317e0e7ec7001287ad87316f987d by Wenyi Zhao <reyizero@gmail.com>:

fix

PiperOrigin-RevId: 383282261
@copybara-service copybara-service bot merged commit b033bb4 into tensorflow:master Jul 6, 2021
PR Queue automation moved this from Approved by Reviewer to Merged Jul 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes ready to pull PR ready for merge process size:L CL Change Size: Large
Projects
PR Queue
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet

4 participants