Splits cleanup block lowered by AsyncToAsyncRuntime. #66123
@llvm/pr-subscribers-mlir-async

Changes

Splits the cleanup block lowered from AsyncToAsyncRuntime. The motivation for this change is to clarify the CFG branched by `async.coro.suspend`. The `async.coro.suspend` op branches into 3 blocks, depending on the state of the coroutine: 1) suspend, 2) resume, 3) cleanup.

Before this change, after the coroutine is resumed and completed, it jumps to a shared cleanup block for destroying the coroutine state. The CFG looks like the following:

```
Entry
  |   \
resume |
  |    |
 Cleanup
    |
   End
```

This CFG can be problematic because the `Cleanup` block is shared and not dominated by `resume`. For instance, if a pass wants to add resume-specific cleanup logic, it may add it to the shared `Cleanup` block, which leads to an "operand does not dominate its use" error because of the other "Entry -> Cleanup" path.

After this change, the CFG looks like the following:

```
Entry (calling async.coro.suspend)
  |                  \
Resume              Destroy (duplicate of Cleanup)
  |                   |
Cleanup               |
  |                  /
 End (ends the coroutine)
```

In this case, the Cleanup block tied to the Resume block is isolated from the other path and strictly dominated by Resume.

Full diff: https://github.com/llvm/llvm-project/pull/66123.diff

2 Files Affected:
```diff
diff --git a/mlir/lib/Dialect/Async/Transforms/AsyncToAsyncRuntime.cpp b/mlir/lib/Dialect/Async/Transforms/AsyncToAsyncRuntime.cpp
index e38904724348a9f..d1a913409ca3663 100644
--- a/mlir/lib/Dialect/Async/Transforms/AsyncToAsyncRuntime.cpp
+++ b/mlir/lib/Dialect/Async/Transforms/AsyncToAsyncRuntime.cpp
@@ -94,8 +94,30 @@ struct CoroMachinery {
   Value coroHandle; // coroutine handle (!async.coro.getHandle value)
   Block *entry;     // coroutine entry block
   std::optional setError; // set returned values to error state
-  Block *cleanup;   // coroutine cleanup block
-  Block *suspend;   // coroutine suspension block
+  Block *cleanup;   // coroutine cleanup block
+
+  // Coroutine cleanup block for destroy after the coroutine is resumed,
+  // e.g. async.coro.suspend state, [suspend], [resume], [destroy]
+  //
+  // This cleanup block is a duplicate of the cleanup block followed by the
+  // resume block. The purpose of having a duplicate cleanup block for destroy
+  // is to make the CFG clear so that the control flow analysis won't confuse.
+  //
+  // The overall structure of the lowered CFG can be the following,
+  //
+  //     Entry (calling async.coro.suspend)
+  //       |                \
+  //     Resume           Destroy (duplicate of Cleanup)
+  //       |                 |
+  //     Cleanup             |
+  //       |                /
+  //      End (ends the corontine)
+  //
+  // If there is resume-specific cleanup logic, it can go into the Cleanup
+  // block but not the destroy block. Otherwise, it can fail block dominance
+  // check.
+  Block *cleanupForDestroy;
+  Block *suspend;   // coroutine suspension block
 };
 } // namespace
@@ -183,16 +205,21 @@ static CoroMachinery setupCoroMachinery(func::FuncOp func) {
   builder.create(originalEntryBlock);

   Block *cleanupBlock = func.addBlock();
+  Block *cleanupBlockForDestroy = func.addBlock();
   Block *suspendBlock = func.addBlock();

   // ------------------------------------------------------------------------ //
-  // Coroutine cleanup block: deallocate coroutine frame, free the memory.
+  // Coroutine cleanup blocks: deallocate coroutine frame, free the memory.
   // ------------------------------------------------------------------------ //
-  builder.setInsertionPointToStart(cleanupBlock);
-  builder.create(coroIdOp.getId(), coroHdlOp.getHandle());
+  auto buildCleanupBlock = [&](Block *cb) {
+    builder.setInsertionPointToStart(cb);
+    builder.create(coroIdOp.getId(), coroHdlOp.getHandle());

-  // Branch into the suspend block.
-  builder.create(suspendBlock);
+    // Branch into the suspend block.
+    builder.create(suspendBlock);
+  };
+  buildCleanupBlock(cleanupBlock);
+  buildCleanupBlock(cleanupBlockForDestroy);

   // ------------------------------------------------------------------------ //
   // Coroutine suspend block: mark the end of a coroutine and return allocated
@@ -227,6 +254,7 @@ static CoroMachinery setupCoroMachinery(func::FuncOp func) {
   machinery.entry = entryBlock;
   machinery.setError = std::nullopt; // created lazily only if needed
   machinery.cleanup = cleanupBlock;
+  machinery.cleanupForDestroy = cleanupBlockForDestroy;
   machinery.suspend = suspendBlock;
   return machinery;
 }
@@ -348,7 +376,7 @@ outlineExecuteOp(SymbolTable &symbolTable, ExecuteOp execute) {
     // Add async.coro.suspend as a suspended block terminator.
     builder.create(coroSaveOp.getState(), coro.suspend,
-                   branch.getDest(), coro.cleanup);
+                   branch.getDest(), coro.cleanupForDestroy);

     branch.erase();
   }
@@ -588,7 +616,7 @@ class AwaitOpLoweringBase : public OpConversionPattern {
     // Add async.coro.suspend as a suspended block terminator.
     builder.setInsertionPointToEnd(suspended);
     builder.create(coroSaveOp.getState(), coro.suspend, resume,
-                   coro.cleanup);
+                   coro.cleanupForDestroy);

     // Split the resume block into error checking and continuation.
     Block *continuation = rewriter.splitBlock(resume, Block::iterator(op));
diff --git a/mlir/test/Dialect/Async/async-to-async-runtime.mlir b/mlir/test/Dialect/Async/async-to-async-runtime.mlir
index 635a86ecdb4bee6..36583b2b94a3c93 100644
--- a/mlir/test/Dialect/Async/async-to-async-runtime.mlir
+++ b/mlir/test/Dialect/Async/async-to-async-runtime.mlir
@@ -25,17 +25,22 @@ func.func @execute_no_async_args(%arg0: f32, %arg1: memref<1xf32>) {
 // CHECK: %[[SAVED:.*]] = async.coro.save %[[HDL]]
 // CHECK: async.runtime.resume %[[HDL]]
 // CHECK: async.coro.suspend %[[SAVED]]
-// CHECK-SAME: ^[[SUSPEND:.*]], ^[[RESUME:.*]], ^[[CLEANUP:.*]]
+// CHECK-SAME: ^[[SUSPEND:.*]], ^[[RESUME:.*]], ^[[DESTROY:.*]]

 // Resume coroutine after suspension.
 // CHECK: ^[[RESUME]]:
 // CHECK:   memref.store
 // CHECK:   async.runtime.set_available %[[TOKEN]]
+// CHECK:   cf.br ^[[CLEANUP:.*]]

 // Delete coroutine.
 // CHECK: ^[[CLEANUP]]:
 // CHECK:   async.coro.free %[[ID]], %[[HDL]]

+// Delete coroutine.
+// CHECK: ^[[DESTROY]]:
+// CHECK:   async.coro.free %[[ID]], %[[HDL]]
+
 // Suspend coroutine, and also a return statement for ramp function.
 // CHECK: ^[[SUSPEND]]:
 // CHECK:   async.coro.end %[[HDL]]
@@ -79,11 +84,15 @@ func.func @nested_async_execute(%arg0: f32, %arg1: f32, %arg2: memref<1xf32>) {
 // CHECK: async.runtime.resume %[[HDL]]
 // CHECK: async.coro.suspend
-// CHECK-SAME: ^[[SUSPEND:.*]], ^[[RESUME:.*]], ^[[CLEANUP:.*]]
+// CHECK-SAME: ^[[SUSPEND:.*]], ^[[RESUME:.*]], ^[[DESTROY:.*]]

 // CHECK: ^[[RESUME]]:
 // CHECK:   memref.store
 // CHECK:   async.runtime.set_available %[[TOKEN]]
+// CHECK:   cf.br ^[[CLEANUP:.*]]
+
+// CHECK: ^[[CLEANUP]]:
+// CHECK: ^[[DESTROY]]:

 // Function outlined from the outer async.execute operation.
 // CHECK-LABEL: func private @async_execute_fn_0
@@ -96,7 +105,7 @@ func.func @nested_async_execute(%arg0: f32, %arg1: f32, %arg2: memref<1xf32>) {
 // Suspend coroutine in the beginning.
 // CHECK: async.runtime.resume %[[HDL]]
 // CHECK: async.coro.suspend
-// CHECK-SAME: ^[[SUSPEND:.*]], ^[[RESUME_0:.*]], ^[[CLEANUP:.*]]
+// CHECK-SAME: ^[[SUSPEND:.*]], ^[[RESUME_0:.*]], ^[[DESTROY_0:.*]]

 // Suspend coroutine second time waiting for the completion of inner execute op.
 // CHECK: ^[[RESUME_0]]:
@@ -104,7 +113,7 @@ func.func @nested_async_execute(%arg0: f32, %arg1: f32, %arg2: memref<1xf32>) {
 // CHECK: %[[SAVED:.*]] = async.coro.save %[[HDL]]
 // CHECK: async.runtime.await_and_resume %[[INNER_TOKEN]], %[[HDL]]
 // CHECK: async.coro.suspend %[[SAVED]]
-// CHECK-SAME: ^[[SUSPEND]], ^[[RESUME_1:.*]], ^[[CLEANUP]]
+// CHECK-SAME: ^[[SUSPEND]], ^[[RESUME_1:.*]], ^[[DESTROY_0]]

 // Check the error of the awaited token after resumption.
 // CHECK: ^[[RESUME_1]]:
@@ -115,9 +124,11 @@ func.func @nested_async_execute(%arg0: f32, %arg1: f32, %arg2: memref<1xf32>) {
 // CHECK: ^[[CONTINUATION:.*]]:
 // CHECK:   memref.store
 // CHECK:   async.runtime.set_available %[[TOKEN]]
+// CHECK:   cf.br ^[[CLEANUP_0:.*]]

 // CHECK: ^[[SET_ERROR]]:
-// CHECK: ^[[CLEANUP]]:
+// CHECK: ^[[CLEANUP_0]]:
+// CHECK: ^[[DESTROY_0]]:
 // CHECK: ^[[SUSPEND]]:

 // -----
@@ -354,7 +365,7 @@ func.func @execute_assertion(%arg0: i1) {
 // Initial coroutine suspension.
 // CHECK: async.coro.suspend
-// CHECK-SAME: ^[[SUSPEND:.*]], ^[[RESUME:.*]], ^[[CLEANUP:.*]]
+// CHECK-SAME: ^[[SUSPEND:.*]], ^[[RESUME:.*]], ^[[DESTROY:.*]]

 // Resume coroutine after suspension.
 // CHECK: ^[[RESUME]]:
@@ -363,7 +374,7 @@ func.func @execute_assertion(%arg0: i1) {
 // Set coroutine completion token to available state.
 // CHECK: ^[[SET_AVAILABLE]]:
 // CHECK:   async.runtime.set_available %[[TOKEN]]
-// CHECK:   cf.br ^[[CLEANUP]]
+// CHECK:   cf.br ^[[CLEANUP:.*]]

 // Set coroutine completion token to error state.
 // CHECK: ^[[SET_ERROR]]:
@@ -374,6 +385,10 @@ func.func @execute_assertion(%arg0: i1) {
 // CHECK: ^[[CLEANUP]]:
 // CHECK:   async.coro.free %[[ID]], %[[HDL]]

+// Delete coroutine.
+// CHECK: ^[[DESTROY]]:
+// CHECK:   async.coro.free %[[ID]], %[[HDL]]
+
 // Suspend coroutine, and also a return statement for ramp function.
 // CHECK: ^[[SUSPEND]]:
 // CHECK:   async.coro.end %[[HDL]]
```
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-core
Splits the cleanup block lowered from AsyncToAsyncRuntime. The motivation for this change is to clarify the CFG branched by `async.coro.suspend`. The `async.coro.suspend` op branches into 3 blocks, depending on the state of the coroutine: 1) suspend, 2) resume, 3) cleanup.

Before this change, after the coroutine is resumed and completed, it jumps to a shared cleanup block for destroying the coroutine state. The CFG looks like the following:

```
Entry
  |   \
resume |
  |    |
 Cleanup
    |
   End
```

This CFG can be problematic, because the `Cleanup` block is a shared block and it is not dominated by `resume`. For instance, if some pass wants to add resume-specific cleanup logic, it can be confused and add it to the shared `Cleanup` block, which leads to the "operand does not dominate its use" error because of the existence of the other "Entry -> Cleanup" path.

After this change, the CFG looks like the following:

```
Entry (calling async.coro.suspend)
  |                  \
Resume              Destroy (duplicate of Cleanup)
  |                   |
Cleanup               |
  |                  /
 End (ends the coroutine)
```

In this case, the Cleanup block tied to the Resume block is isolated from the other path and it is strictly dominated by "Resume".
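The dominance argument above can be checked mechanically. The sketch below (plain Python, not MLIR; block names are illustrative only) runs the classic iterative dominator computation over the two CFG shapes described in this PR, showing that the shared cleanup block is not dominated by the resume block before the change, but the per-path cleanup block is after it:

```python
def dominators(cfg, entry):
    """Iterative dominator computation: dom(b) is the set of blocks that
    appear on every path from the entry block to b."""
    nodes = set(cfg)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            preds = [p for p in cfg if n in cfg[p]]
            # dom(n) = {n} ∪ intersection of dom(p) over all predecessors p.
            new = {n} | set.intersection(*(dom[p] for p in preds))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Before: Entry branches to both Resume and the shared Cleanup block.
before = {
    "Entry": ["Resume", "Cleanup"],
    "Resume": ["Cleanup"],
    "Cleanup": ["End"],
    "End": [],
}

# After: the destroy path gets its own duplicate of the cleanup block.
after = {
    "Entry": ["Resume", "Destroy"],
    "Resume": ["Cleanup"],
    "Cleanup": ["End"],
    "Destroy": ["End"],
    "End": [],
}

print("Resume" in dominators(before, "Entry")["Cleanup"])  # False: shared block
print("Resume" in dominators(after, "Entry")["Cleanup"])   # True: now dominated
```

This is why inserting resume-specific logic into the shared block fails the verifier: a value defined in Resume would be used in a block reachable without passing through Resume.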