-
Notifications
You must be signed in to change notification settings - Fork 100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CIR] Goto #508
[CIR] Goto #508
Conversation
This PR adds `cir.brcond` lowering, which more or less follows the one in LLVM dialect lowering.
This is a suggestion to relax the existing verification even more than we did it in PR llvm#257. Here we also skip verification if a field on the given index is also of incomplete type - and we can not compare it with the result type of the operation. Now the next code fails with type mismatch error: ``` typedef struct Node { struct Node* next; } NodeStru; void foo(NodeStru* a) { a->next = 0; } ``` because the result type is kind of full and the type of field is not (for the reasons discussed in llvm#256). Basically, the problem is in the `GetMemberOp` result type generated as following (via `CIRGenTypes::convertType`) `!cir.ptr<!cir.struct<struct "Node" {!cir.ptr<!cir.struct<struct "Node" incomplete #cir.record.decl.ast>>} #cir.record.decl.ast>>` where the field type at index differs from the record type - compare with `!cir.ptr<!cir.struct<struct "Node" incomplete #cir.record.decl.ast>>` We just slightly relax the previous solution in llvm#257 - and the compilation won't fail in the case of recursive types. Well, if there are some other thoughts?
Just a trivial fix that enables declaration of local structs. Basically, there it's a copy-pasta from the original `CodeGen`, without debug info handling. Co-authored-by: Bruno Cardoso Lopes <bcardosolopes@users.noreply.github.com>
This PR handles globals initializations for c++ code for the case when a global is inited from a function call or another global.
Traditional Clang's codegen generates IDs for anonymous records (e.g. "struct.anon.1") and ensures that they are unique. This patch does the same for CIRGen, which, until now, would just identify any anonymous record as "anon". This will be required to support mutable structs uniquely identified by their names.
Rename `typeName` to just `name`, also use `StringAttr`'s nullability to identify if the record is identified or anonymous. Unnamed structs are also no longer aliased, as they have no unique name to be used in the alias.
This patch adds RTTI support for C++ virtual inheritance. This patch does not include LLVM lowering support.
Lowering Vtable and RTTI globals. Also lowering AddressPoint. based on llvm#259
The `body` attribute is used to identify if a struct type has a body or not. In other words, it separates complete from incomplete structs. For this reason, the `body` attribute was renamed to `incomplete`, matching its keyword in the CIR language.
MLIR's attributes are inherently nullable, so there is no need to use std::optional to represent a nullable attribute. This patch removes std::optional from StructType's ast attribute to simplify its usage.
…ong and untested CastKind::floating path
In the CIR CodeGen function `ScalarExprEmitter::buildScalarCast`, implement conversions from bool to an integral type. This was inadvertently left out in earlier changes. Reorganize the code in the function to be more clear, with better assertion failure messages when encountering an unimplemented construct. This is a partial fix for issue llvm#290
As discussed in llvm#279, we split `CIRGenBuilder` in two parts, which make some of the helpers usable outside of the `CodeGen` part. Basically, I placed casts and binary operations into a separate class, `CIRBaseBuilder`. Later, it can be extended with another helpers. But right now an idea to have a state less builder as a base one.
A silly fix for code like ``` *(char *)0 = 0; ```
) Currently, different specializations of a template will generate distinct types with the same name. To properly differentiate each specialization, this patch includes the template arguments in the name of the type. This will be required to support mutable structs uniquely identified by their names.
Instead of using a single builder for every possible StructType, we now have three builders: identified complete, identified incomplete, and anonymous struct types. This allows us to enforce correctness and to explicitly show the intent when creating a StructType. This patch also adds support for anonymous structs type aliases. When a StructType has no name, it will generate a `ty_anon_<kind>` alias. Conflicts are automatically resolved by MLIR.
There are instances of CIR where anonymous structs are generated as identified structs with an empty name. This patch Adds a verifier for StructType, which ensures structs have a non-empty name. This will be required for properly uniqueing mutable CIR structs.
Essentially, this patch redefines the CIR StructType manually instead of using the autogenerated definition from tablegen. This is the first step to make StructType mutable, as this feature is not yet supported by tablegen. It's mostly a copy of the tablegen definition, with a few notable differences: - A few embellishments are added to make the code more dev-friendly - Addition of a CIRTypesDetails.h file to keep custom storage definitions - The CIR_AnyCIRType constraint is removed, as it is not used and must be defined in C++ to ensure StructType is a part of it. ghstack-source-id: 5f706dc0a61a4a2ed6e2f20ab0937b1a42bfa9cc Pull Request resolved: llvm#302
This allows a named StructType to be mutated after it has been created, if it is identified and incomplete. The motivation for this is to improve the codegen of CIR in certain scenarios where an incomplete type is used and later completed. These usually leave the IR in an inconsistent state, where there are two records types with the same identifier but different definitions (one complete the other incomplete). For example: ```c++ struct Node { Node *next; }; void test(struct Node n) {} ``` Generates: ```mlir !temp_struct = !cir.struct<struct "Node" incomplete> !full_struct = !cir.struct<struct "Node" {!cir.ptr<!temp_struct>}> ``` To generate the `Node` struct type, its members must be created first. However, the `next` member is a recursive reference, so it can only be completed after its parent. This generates a temporary incomplete definition of the `Node` type that remains in the code even after the type to which it refers is completed. As a consequence, accessing the `next` member of a `Node` value fetches the old incomplete version of the type which affects CIR's type-checking capabilities. This patch ensures that, once the parent is fully visited, the `next` member can be completed in place, automatically updating any references to it at a low cost. To represent recursive types, the StructType now is equipped with self-references. These are represented by a `cir.struct` type with just the name of the parent struct that it refers to. The same snippet of code will not generate the following CIR IR: ```mlir !full_struct = !cir.struct<struct "Node" {!cir.ptr<!cir.struct<struct "Node">>}> ``` Summary of the changes made: - Named records are now uniquely identified by their name. An attempt to create a new record with the same will fail. - Anonymous records are uniquely identified by members and other relevant attributes. - StructType has a new `mutate` method that allows it to be mutated after it has been created. Each type can only be mutated if it is identified and incomplete, rendering further changes impossible. - When building a new name StructType, the builder will try to first create, then complete the type, ensuring that: - Inexistent types are created - Existing incomplete types are completed - Existing complete types with matching attributes are reused - Existing complete types with different attributes raise errors - StructType now uses the CyclicParser/Printer guard to avoid infinite recursion and identify when it should print/parse a self-reference. ghstack-source-id: a6d4f650515cbf2d7f6e27d45aae6f768ba44f92 Pull Request resolved: llvm#303
Matrix types are already checked for in `buildScalarConversion`, so they don't need to be checked for again in `buildScalarCast`. Not having to worry about matrix types means the `Element` local variables are no longer necessary. Remove duplicate code by having a variable to store the `CastKind`, and have only one call to `Builder.create<CastOp>`. There are no test changes, because this is refactoring only. There should be no functional changes.
The PR fixes a var initialization with explicit cast.
Add a new entry to enum `CastKind`, `bool_to_float`, since none of the existing enum values adequately covered that conversion. Add code to code gen, CIR validation, LLVM lowering, and the cast test to cover this conversion. Fix ClangIR issue llvm#290
Fix a couple typos in the validation failure messages for scalar casts
You can test this locally with the following command:git-clang-format --diff 9a9729e3ddb7ac6580954cc332ac032c0256b538 46985d2176ae535f42df1c069d9e2fbdcbf8ba9f -- clang/lib/CIR/Dialect/Transforms/RegionInlining.cpp clang/include/clang/CIR/Dialect/Passes.h clang/lib/CIR/CodeGen/CIRGenFunction.cpp clang/lib/CIR/CodeGen/CIRGenFunction.h clang/lib/CIR/CodeGen/CIRGenStmt.cpp clang/lib/CIR/CodeGen/CIRPasses.cpp clang/lib/CIR/Dialect/Transforms/MergeCleanups.cpp clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp View the diff from clang-format here.diff --git a/clang/lib/CIR/Dialect/Transforms/RegionInlining.cpp b/clang/lib/CIR/Dialect/Transforms/RegionInlining.cpp
index 404261e15d..fdc33703c1 100644
--- a/clang/lib/CIR/Dialect/Transforms/RegionInlining.cpp
+++ b/clang/lib/CIR/Dialect/Transforms/RegionInlining.cpp
@@ -2,15 +2,14 @@
#include "mlir/Dialect/Func/IR/FuncOps.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/Support/LogicalResult.h"
-#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
#include "mlir/Transforms/DialectConversion.h"
+#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
#include "clang/CIR/Dialect/IR/CIRDialect.h"
#include "clang/CIR/Dialect/Passes.h"
using namespace mlir;
using namespace cir;
-
namespace {
/// Lowers operations with the terminator trait that have a single successor.
@@ -35,12 +34,11 @@ void walkRegionSkipping(mlir::Region ®ion,
});
}
-
struct RegionInliningPass : public RegionInliningBase<RegionInliningPass> {
using RegionInliningBase::RegionInliningBase;
void runOnOperation() override;
- void process_gotos(SmallVector<GotoOp>& gotos);
+ void process_gotos(SmallVector<GotoOp> &gotos);
};
struct IfOpRegionsInlining : public OpRewritePattern<IfOp> {
@@ -92,8 +90,7 @@ struct IfOpRegionsInlining : public OpRewritePattern<IfOp> {
// I believe it can be done in BrCondOp lowering, and we don't need it here.
rewriter.setInsertionPointToEnd(currentBlock);
rewriter.create<mlir::cir::BrCondOp>(loc, ifOp.getCondition(),
- thenBeforeBody,
- elseBeforeBody);
+ thenBeforeBody, elseBeforeBody);
if (!emptyElse) {
rewriter.setInsertionPointToEnd(elseAfterBody);
@@ -143,7 +140,7 @@ struct ScopeOpRegionsInlining : public OpRewritePattern<ScopeOp> {
// Save stack and then branch into the body of the region.
rewriter.setInsertionPointToEnd(currentBlock);
// TODO(CIR): stackSaveOp
-
+
rewriter.create<mlir::cir::BrOp>(loc, mlir::ValueRange(), beforeBody);
// Replace the scopeop return with a branch that jumps out of the body.
@@ -170,10 +167,9 @@ public:
using mlir::OpInterfaceRewritePattern<
mlir::cir::LoopOpInterface>::OpInterfaceRewritePattern;
- inline void
- lowerConditionOp(mlir::cir::ConditionOp op, mlir::Block *body,
- mlir::Block *exit,
- mlir::PatternRewriter &rewriter) const {
+ inline void lowerConditionOp(mlir::cir::ConditionOp op, mlir::Block *body,
+ mlir::Block *exit,
+ mlir::PatternRewriter &rewriter) const {
mlir::OpBuilder::InsertionGuard guard(rewriter);
rewriter.setInsertionPoint(op);
rewriter.replaceOpWithNewOp<mlir::cir::BrCondOp>(op, op.getCondition(),
@@ -221,7 +217,7 @@ public:
// if (bodyYield)
// lowerTerminator(bodyYield, (step ? step : cond), rewriter);
- for (auto& blk : op.getBody().getBlocks()) {
+ for (auto &blk : op.getBody().getBlocks()) {
auto bodyYield = dyn_cast<mlir::cir::YieldOp>(blk.getTerminator());
if (bodyYield)
lowerTerminator(bodyYield, (step ? step : cond), rewriter);
@@ -243,9 +239,7 @@ public:
}
};
-
-
-void populateRegionInliningPatterns(RewritePatternSet &patterns) {
+void populateRegionInliningPatterns(RewritePatternSet &patterns) {
// clang-format off
patterns.add<
IfOpRegionsInlining,
@@ -255,18 +249,18 @@ void populateRegionInliningPatterns(RewritePatternSet &patterns) {
// clang-format on
}
-// TODO: this is still a draft, not sure it's a good way. Maybe create a separate
-// pass for it
-void RegionInliningPass::process_gotos(SmallVector<GotoOp>& gotos) {
- for (auto& goTo : gotos) {
+// TODO: this is still a draft, not sure it's a good way. Maybe create a
+// separate pass for it
+void RegionInliningPass::process_gotos(SmallVector<GotoOp> &gotos) {
+ for (auto &goTo : gotos) {
mlir::ConversionPatternRewriter rewriter(goTo.getContext());
rewriter.setInsertionPoint(goTo);
auto func = goTo.getOperation()->getParentOfType<mlir::cir::FuncOp>();
- func.getBody().walk([&](mlir::Block* blk) {
- for (auto& op : blk->getOperations()) {
+ func.getBody().walk([&](mlir::Block *blk) {
+ for (auto &op : blk->getOperations()) {
if (auto lab = dyn_cast<mlir::cir::LabelOp>(op)) {
- if (goTo.getLabel() == lab.getLabel()){
+ if (goTo.getLabel() == lab.getLabel()) {
lowerTerminator(goTo, blk, rewriter);
goTo.erase();
return mlir::WalkResult::interrupt();
@@ -291,7 +285,7 @@ void RegionInliningPass::runOnOperation() {
ops.push_back(op);
if (auto goTo = dyn_cast<GotoOp>(op))
- gotos.push_back(goTo);
+ gotos.push_back(goTo);
});
// Apply patterns.
diff --git a/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp b/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
index 0543c34593..199a469da5 100644
--- a/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+++ b/clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
@@ -2666,8 +2666,8 @@ class CIRLabelOpLowering
mlir::LogicalResult
matchAndRewrite(mlir::cir::LabelOp op, OpAdaptor adaptor,
mlir::ConversionPatternRewriter &rewriter) const override {
- rewriter.eraseOp(op);
- return mlir::success();
+ rewriter.eraseOp(op);
+ return mlir::success();
}
};
@@ -2745,8 +2745,8 @@ void populateCIRToLLVMConversionPatterns(mlir::RewritePatternSet &patterns,
CIRVectorExtractLowering, CIRVectorCmpOpLowering, CIRVectorSplatLowering,
CIRVectorTernaryLowering, CIRStackSaveLowering, CIRStackRestoreLowering,
CIRUnreachableLowering, CIRTrapLowering, CIRInlineAsmOpLowering,
- CIRSetBitfieldLowering, CIRGetBitfieldLowering,
- CIRLabelOpLowering>(converter, patterns.getContext());
+ CIRSetBitfieldLowering, CIRGetBitfieldLowering, CIRLabelOpLowering>(
+ converter, patterns.getContext());
}
namespace {
@@ -2939,31 +2939,30 @@ void ConvertCIRToLLVMPass::runOnOperation() {
getOperation()->removeAttr("cir.sob");
getOperation()->removeAttr("cir.lang");
- llvm::SmallVector<mlir::Operation*> ops;
+ llvm::SmallVector<mlir::Operation *> ops;
ops.push_back(module);
- llvm::SmallVector<mlir::Block*> unreachable_blocks;
- module->walk(
- [&](mlir::Block* blk) {
- if (blk->hasNoPredecessors() && !blk->isEntryBlock())
- unreachable_blocks.push_back(blk);
+ llvm::SmallVector<mlir::Block *> unreachable_blocks;
+ module->walk([&](mlir::Block *blk) {
+ if (blk->hasNoPredecessors() && !blk->isEntryBlock())
+ unreachable_blocks.push_back(blk);
});
- std::set<mlir::Block*> visited;
- for (auto* root : unreachable_blocks) {
-
+ std::set<mlir::Block *> visited;
+ for (auto *root : unreachable_blocks) {
+
// We create a work list for each unreachable block.
// Thus we traverse operations in some order.
- std::deque<mlir::Block*> blocks;
+ std::deque<mlir::Block *> blocks;
blocks.push_back(root);
while (!blocks.empty()) {
- auto* blk = blocks.back();
+ auto *blk = blocks.back();
blocks.pop_back();
- if (visited.count(blk))
+ if (visited.count(blk))
continue;
visited.emplace(blk);
- for (auto& op : *blk)
+ for (auto &op : *blk)
ops.push_back(&op);
for (auto it = blk->succ_begin(); it != blk->succ_end(); ++it)
@@ -2971,7 +2970,6 @@ void ConvertCIRToLLVMPass::runOnOperation() {
}
}
-
if (failed(applyPartialConversion(ops, target, std::move(patterns))))
signalPassFailure();
|
@@ -2924,7 +2966,40 @@ void ConvertCIRToLLVMPass::runOnOperation() { | |||
getOperation()->removeAttr("cir.sob"); | |||
getOperation()->removeAttr("cir.lang"); | |||
|
|||
if (failed(applyPartialConversion(module, target, std::move(patterns)))) | |||
llvm::SmallVector<mlir::Operation*> ops; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is the stub I was talking about - and I don't like it. I collect unreachable blocks and then iterate them, discovering successors, add theirs operations and etc.
|
||
// TODO: this is still a draft, not sure it's a good way. Maybe create a separate | ||
// pass for it | ||
void RegionInliningPass::process_gotos(SmallVector<GotoOp>& gotos) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here is place where GoTo
is replaced by CIR BrOp
. Probably it's not a good place - and I can create a separate pass for that after RegionInlining
lowerTerminator(op, exit, rewriter); | ||
}); | ||
|
||
// Lower optional body region yield. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only noticeable change: for some reason YieldOp
in the loop body was searched only in the first block, which won't work if there are several in there:
do {
...
label:
...
} while(...)
Thanks for working on this Oleg, this is long overdue.
Yep. This is how far I went! I appreciate you are building the real support. For the
This way we can make the label symbols local to a function, and easily check whether we should run the pass for it. It might require FuncOp to define more symbols than it's name, shouldn't be a problem though? Alternatively, we could have multiple operations
For this one, verifier should make sure it's the first op in a block.
Yes! The naive plan I had on my mind was basically the same:
Maybe a good first step to build this up would be to start with refactoring out of LowerToLLVM:
Once you complete this step, then you create a new "SolveGotos" pass, that will:
How does this sound? Thoughts?
Yes! |
@bcardosolopes
speaking about the new pass: I would suggest to embed it into the lowering pipeline, i.e. in the
But it's up to you to decide - I'm ready to implement the new pass in the CIR pipeline UPD: though |
We are actually looking into two new passes:
Perfect. I'd break into more incremental PRs to make it easier to land/review.
Actually, I don't think current existing tests need to change at all. Not even I'm not opposed if we start by embeding it into the lowering pipeline, but I'd like that we can test the following scenarios: One question that just occured to me is where should The suggestion I have for now is to keep Also, it sounds to me that
I'm a bit confused, which case do you mean? |
@bcardosolopes Well, my intention was to add the pass in Next, I could add the pass right after lowering prepare. This would make our life easier - because by doing so we could control the way CIR is emitted exactly as we want - before the pass in question and after. But here is a list of cons:
So I think I have a solution. We add new passes after the
Right! So what do you think? Will it work from your point of view? |
Right, we need to make this a bit more uniform. Perhaps cir-opt and clang should share something along the lines of
Those are good points!
Oh boy, I get it now, thanks for clarifying.
Yep, same is true for future work on exceptions and coroutines.
+1
LGTM.
Indeed, this goes back to my first sentence in this comment: we need to make cir-opt and clang use the same pass setup from CIR down to LLVM.
Yes, thanks for pushing this forward. |
ed3955a
to
8b7417c
Compare
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in #508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
@gitoleg can we close this one? |
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in #508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in #508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in llvm#508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in #508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in llvm#508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in llvm#508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in llvm#508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
We start our journey towards `goto` support and this is a first step on this way. There are some discussion in #508 and according to the plan we do the following here: - a new pass called `cir-flatten-cfg` that is a stub now but later will be responsible for the regions inlining. The pass works only if `-emit-flat-cir` is passed in cmd line. Thus, the clang behavior is not changed here from the user's point of view. - The pass will be accomplished with `goto` solver later, so we talk about several passes that are mandatory for the lowering into `llvm` dialect. There are at least two clients of this pass that will be affected: `clang` itself and `cir-opt`, so we need a common point for them: and `populateCIRFlatteningPasses` and `populateCIRToLLVMPasses` guarantee that `CIR` will be in the correct state for all the clients, whatever new passes we will add later.
Please, take a seat, there is a long story ahead.
Intro
The idea is to add a support for
goto
and this PR is a draft, WIP, a proposal and help seeking.There is some
goto
support in the CIR code, but I would say the existing approach won't help us in the real code, where any goto may lead to any place in a function. The existing approach assumes that goto and target label are located in the same lexical scope.Problem description
Initial desire is just extend current approach and resolve gotos and labels somewhere in the end of function building. But it won't work, since CIR operations contains regions, and it's prohibited to reference a block from another region, e.g. we can't create a branch from
IfOp
to another or do branching inside a loop. In other words, we just can't apply the same approach here.Implementation in CIR
Thus, we need operations -
cir.goto
andcir.label
. The latter could be replaced with an attribute for a block, but it's not possible as far I understand. Since labels are unique inside a function - we can use just string attribute here. Thus, for a simple code like:we get the next CIR code
Well, there is a downside here as well - we introduce a new terminator, new branch-like operation which one need to take in account while writing some analysis. But anyway. We don't have much choice here.
Well, looks like that's we need to do in CIR, The next part is lowering.
Lowering Implementation
At first glance, we don't need to do a lot here. Lower
GotoOp
as following: iterate over the blocks and find one which starts with the corresponded label. It could work in the example above. But here arises a bigger problem. Look:Well, I don't provide a CIR code here, but the code under
err
label is unreachable from the lowering point of view. And we can't just connect blocks and think we've solved the problem. Unreachable - means that they won't be visited during the lowering.Lowering observations
Here is some thoughts about lowering.
The good thing - and actually it's what make
goto
possible to implement - is that we don't pass regions into dialect, and inline them instead. Thus, we can refer to any block in a function. Good.In the same time - take a look, we emit CIR operations while lower to llvm dialect. It's kind of wrong.
So. I think we need a separate pass that will inline regions, i.e.
ifOp
,ScopeOp
,LoopOp
... should be removed from CIR code before lowering and replaced with branches and conditional branches in the same fashion like we do in the lowering.I didn't remove the said operations from
LowerToLLVM
in order to make the diff more readable. But actually the plan is that they never appear in the lowering.This is what
RegionInlining
pass is intended - it's almost a copy pasta fromLowerToLLVM.cpp
. So far there are only three operations handled there. Also, I wrote a small function that connectsGotoOp
andLabelOp
there as well- and it's definitely not a good place for that, I need a separate pass for it, Or renameRegionInlining
to something else. It doesn't matter right now, though if you have some thought - I'm eager to hear.Anyway, the point is that we need to have a proper control flow before the lowering. And thus we have it!
Problems and questions
First, I need to understand if the said above is good from your point of view. Maybe there is another possible solution I didn't come to.
Next. There are some problems remains. The biggest one is unreachable blocks that contains CIR code and hence will be rejected by llvm dialect. Unreachable blocks may appear even in the current lowering process when we do regions inlining, blocks splitting.. but it's kind of coincidence (or not?) that theirs operations are replaced by ones from the llvm dialect, while I create exactly CIR operations in the
RegionInliningPass
.Thus I create a stub in
LowerToLLVM.cpp
- that just collect all the unreachable operations and lower them. So may be you know how to do the same better.Next.
SwitchOp
is lowered as ..SwitchOp
) and it means we need to invent here something to inline regions as it's done forLoopOp
for instance. The only idea I have so far is to create a special service operation that will reflect the one from dialect (the same was done forBrCondOp
that never appears in the CIR codegen).Outro
It's possible to support
goto
in CIR!