Assertion failure on switch statement with case label in nested block #522
Comments
Famous "last words" haha, I just hit this in prod! Some "temporary" workaround is using this. Now that @gitoleg's work has landed, we probably have the machinery to solve this. Any chance (or any plans) you might work on this soon, @gitoleg? (cc @wenpen) |
I will take a look in the next few days, sure |
This is all about codegen, so I would say the previous work has nothing to do with it. As for the current issue, it's still not obvious to me how to handle it properly, though I think I can suggest an approach.
As far as I can see, the original codegen doesn't even emit any random code, and I think we need to do the same. We could extend the current issue's example (I will refer to it as example 1) to make it a little more interesting (example 2):
The Now, what I would do. I would create a small class for the switch stmt processing with several fields like case attributes, maybe switch variable type - and maintain a pointer to it in the So we will be able to handle nested switch statements and we can fix example 1. @bcardosolopes what do you think? Is it a good/workable solution? Or am I missing some details here? |
Right, I wasn't implying that, sorry for the misunderstanding. My initial thinking is that we could probably map these weird inner cases with synthetic gotos/labels, since they violate some scope control-flow.
Thanks for looking into this.
If the original codegen is doing that (I haven't checked), I agree we should do the same. It'd probably be nice to maintain the statement for the sake of providing unreachable diagnostics later on, e.g. even though we ignore the emission, we could emit an empty scope with source loc on a basic block that is unreachable. But just initially supporting lowering works for me. I also don't see any current warnings for these types of unreachable statements in clang.
Yep, we currently track some of it already as part of
Sounds great to me, thanks for sharing a solution. Do you have any interest to work on this? |
I do) But I can't promise I'll do it soon. So if someone wants to do it earlier, just let me know, e.g. right here. Otherwise I'll return to this issue later and keep you informed about the progress/problems etc. |
@wenpen do you want to work on this one? |
@piggynl this could be a good issue to tackle as well. |
I'm still working on #528, and I feel these problems should be resolved in a unified solution. |
I have a proposal to fix this issue by modifying the definition of What's the limitation of current
|
@wenpen thanks for taking the time to propose a holistic solution, very nice writeup. I think it overall makes sense. The part that bothers me is having a design that prioritizes the corner case of the language to the detriment of the common use cases, but I think this is hinting at the right direction. Given your proposal, a few questions:
Thanks! |
Here is an example code to discuss the mixed (common and corner) case.
Good point! I feel it's a better structure to make cir.case have a region for "normal" use case. So, I'd like to keep the current logic to scan
My thinking is the same as the current implementation: use
Yes, it looks similar. But when I tried some code to see the cir, it crashed. Have created issue 688 for it. Will have another try later. Thanks for your suggestions~ @bcardosolopes |
I'm not, sorry |
@ChuanqiXu9 Sorry, my work is on hold, happy to see you take it~ |
Out of curiosity, what's the benefit of making a region for a case? I feel it is less straightforward. Can we make it simpler and treat all cases like labels? |
Some updates after playing around with it at an early stage: I feel it is more or less over-designed to make each case have its own region. Even if we ignore the Since I think CIR is at an early stage, the risk of breaking the so-called old behavior should be manageable. Even if anything goes wrong, we can fix it. We can avoid technical debt while the project is young. |
If the alternative here is pure basic blocks instead, I don't see what good it brings either: once it becomes all basic blocks, we lose the higher-level view too early. What approach were you thinking about? I like the regions. It becomes more logical to think about the code (let's say we want to look at a lower-level switch/case with some structuring), merge similar cases, make transformations with the switch, etc. Some lifetime checker violations implemented on switch/cases have a nice abstraction to work with as well right now. I prefer not to design around the edge case; the common case sounds more appealing to me. When we hit the edge cases, we should map that somehow (for example, we could use a symbolic cir.goto to be solved later in the pipeline) but without losing the granularity of the common case (being able to look at lower-level switch/case statements at will). |
Yeah, this is what confuses me and I tried to discuss in #978. What is a region in CIR?
So this was the reason why I thought we shouldn't lower a But after thinking about it more, I feel like maybe CIR models regions as relationships between control flows (ignoring some special cases @sitio-couto mentioned in #978). Then it makes sense. I come from LLVM, and regions are not a first-class concept there, so I preferred to use blocks to model the control flow relationships. But if CIR (or MLIR) prefers using regions to model the control flow, I am fine with that. (Correct me if I am wrong.) This may sound pedantic, but I feel it is important. Otherwise we can't reason about it seriously.
But we have to care about the edge cases. What I care about here is consistency. Given the discussion in #978, we know that the region concept in CIR is not strict, and we somewhat don't care whether the regions are isolated or not before flattening (we're going to teach passes to skip these illegal regions, if I read correctly). Then how about always creating a region for every case? |
Both those statements are true. The edge cases absolutely have to work. But we hopefully don't have to sacrifice a nice design in the process. The design should be optimized for the normal case. If that design also works well for the edge cases, that's great. But it's not always possible.
Consistency is a good goal. But it is not the top priority. Sometimes it has to lose out to other priorities. In this case, the design of C and C++ makes it hard to be both consistent and useful. The flexibility of The vast, vast majority of For the very small number of
were actually written as
If that works (and I don't know if it actually will), that preserves the nice design of a |
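The goto/label mapping raised earlier in the thread can be pictured at the C source level. The sketch below is only an illustration under that reading: the function names and the label `inner_case_2` are hypothetical, and it says nothing about how ClangIR actually represents or lowers such code. The second function moves the nested `case 2` label to the top level of the switch and reaches the original location through a `goto`, so both functions should behave identically.

```c
/* Edge case: `case 2` sits inside the block guarded by `if`. */
int nested_case(int x, int flag) {
    int r = 0;
    switch (x) {
    case 1:
        if (flag) {
            r += 1;
    case 2: /* label nested inside the if-block */
            r += 10;
        }
        r += 100;
        break;
    default:
        r = -1;
        break;
    }
    return r;
}

/* Same behavior, but every case label is at the top level of the switch.
 * The nested entry point becomes an ordinary label reached via goto. */
int nested_case_goto(int x, int flag) {
    int r = 0;
    switch (x) {
    case 1:
        if (flag) {
            r += 1;
    inner_case_2: /* hypothetical synthetic label */
            r += 10;
        }
        r += 100;
        break;
    case 2:
        goto inner_case_2;
    default:
        r = -1;
        break;
    }
    return r;
}
```

In the rewritten form the switch has only top-level cases, which is the shape the common-case design is built around; the irregular jump is isolated in a single `goto`.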
Hi, thanks for the write-up. But let's try to see if we can still have a consistent design for it: e.g., always creating a region for all caseOps. |
I believe there is only one remaining blocking issue before I can send my PR, and I feel like I need some input here. The question is: what's the motivation/goal/intention behind creating a new return for each case? clangir/clang/lib/CIR/CodeGen/CIRGenFunction.h Lines 2249 to 2256 in cab2b44
clangir/clang/lib/CIR/CodeGen/CIRGenFunction.h Lines 2209 to 2213 in cab2b44
Here is a comment, but I can't understand the underlying problem. I mean, what's the problem, and how does the current solution "solve" it? Initially I thought it was related to the lifetime/cleanups for variables, but I failed to get more insight. The following are some random thoughts. They are more or less chaotic, and I am not sure if they help. First, I think, if the case is followed by a
Here the |
Sent #1006. I guess my question above comes not from CIR semantics, but from an MLIR restriction: a block may not refer to blocks in other regions. Correct me if I am wrong. |
Close #522

This fixes the issue that we can't handle `case` labels in nested scopes, and that we can't handle a switch body that is not a compound statement.

The core idea of the patch is to introduce the `cir.case` operation to the language. We can then get the cases by simply traversing the body of the `cir.switch` operation, instead of counting regions and attributes. Every `cir.case` operation has a region, and the `cir.switch` now has only one region too.

To make analysis and optimization easier, I add a new concept, `simple form`. A `cir.switch` operation is in simple form when all the `cir.case` operations owned by the `cir.switch` live in the top-level blocks of the `cir.switch` region and there are no other operations except the ending `cir.yield`.

This resolves the previous `simplified for common-case` vs `general solution` discussion in #522. After implementing this, I feel the correct answer is: we want a general solution for constructing and lowering the operations, but we prefer the simple, common case for analysis and optimization. We had just mixed up the different phases.

For other semantics, see `CIROps.td`. For lowering, we can handle it generally by lowering the cases one by one and finally lowering the switch itself.

Although this patch has 1000+ lines of changes, I feel it is relatively neat, especially since it removes some odd earlier behaviors.

Tested with Spec2017's C benchmarks except 500.perlbench_r.
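For readers unfamiliar with the second shape mentioned in this description, here is a minimal C sketch of switch statements whose body is not a compound statement. The function names are hypothetical and are not taken from the patch or its tests; the code only illustrates what the C language permits.

```c
/* The switch body is a single labeled statement, not a `{ ... }` block. */
int single_statement_body(int x) {
    switch (x)
    case 1:
        return 42;
    return 0; /* reached when x != 1: no label matches, the body is skipped */
}

/* The switch body is a `while` loop, and the only case label lives inside
 * the loop body, so it is also a case label in a nested scope. */
int loop_body(int x) {
    int n = 0;
    switch (x)
        while (n < 3) {
    case 0:
            n++;
        }
    return n;
}
```

With `x == 0`, `loop_body` jumps straight into the loop body and keeps iterating until `n` reaches 3; with any other value no label matches and the whole loop is skipped.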
ClangIR hits an assertion failure when a switch statement contains a case label that is within a nested block statement.
While code like this is unlikely to appear in production and will usually be found in test suites that try to break the compiler, it is legal code in both C and C++ and should not trigger an internal compiler error.
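As an illustration of the pattern described above, the following C function places a `case` label inside a block nested in the switch body. It is a hypothetical sketch written for this summary, not the original reproducer from the report, and the function name is made up.

```c
/* `case 7` is declared inside the braces that follow `default:`,
 * so the label lives in a nested block rather than directly in
 * the switch body. This is legal in both C and C++. */
int classify(int x) {
    int r = 0;
    switch (x) {
    case 0:
        r = 10;
        break;
    default: {
        r = 1;
        break;
    case 7: /* case label inside a nested block */
        r = 70;
        break;
    }
    }
    return r;
}
```

Calling `classify(7)` jumps directly to the nested label, bypassing the `r = 1` assignment at the start of the block.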