#11134 Fix broken AST with (designated) initializers#4550
Conversation
Made similar to around line 790
- _always_, because in some cases this was omitted (around line 790) or too strict (around line 860) - _only_, and not following tokens which happen to be } as well (around line 1030)
auto var{ {{},{}}, {} };
Old AST:
```
{
|-var
`-{
`-,
|-,
| |-{
| `-{
`-{
```
New AST:
```
{
|-var
`-,
|-{
| `-,
| | |-{
| | `-{
`-{
```
Compare the same example, but with `X{}` instead of just `{}`:
`auto var{ a{b{},c{}}, d{} };`
```
{
|-var
`-,
|-{
| |-a
| `-,
| | |-{
| | | `-b
| | `-{
| | | `-c
`-{
`-d
```
This structure is similar to that of the new AST, not the old AST
Code: `auto var{{1,a::b{2,3}}, {4,a::b{5,6}}};`
Old AST:
```
{
|-var
`-{
`-,
|-,
| |-1 'signed int'
| `-{
| | |-::
| | | |-a
| | | `-b
| | `-,
| | | |-2 'signed int'
| | | `-3 'signed int'
`-{
`-,
|-4 'signed int'
`-{
|-::
| |-a
| `-b
`-,
|-5 'signed int'
`-6 'signed int'
```
New AST:
```
{
|-var
`-,
|-{
| `-,
| | |-1 'signed int'
| | `-{
| | | |-::
| | | | |-a
| | | | `-b
| | | `-,
| | | | |-2 'signed int'
| | | | `-3 'signed int'
`-{
`-,
|-4 'signed int'
`-{
|-::
| |-a
| `-b
`-,
|-5 'signed int'
`-6 'signed int'
```
… marked as cpp11init Because of the missing `;` after the class declaration, it was marked as a cpp11init block. Which it isn't, and which now throws an exception
The following unit tests failed on the newly introduced throws, because the code for these tests incorrectly marked some tokens as cpp11init: TestVarID::varid_cpp11initialization TestTokenizer::checkRefQualifiers
| if (Token::Match(nameToken, "%name%|return|: {") && | ||
| (!Token::simpleMatch(nameToken->tokAt(2), "[") || findLambdaEndScope(nameToken->tokAt(2)))) | ||
| (!Token::simpleMatch(nameToken->tokAt(2), "[") || findLambdaEndScope(nameToken->tokAt(2))) && | ||
| !Token::simpleMatch(nameToken->tokAt(-2), ") .") && !Token::Match(nameToken->tokAt(-3), ") &|&& .")) // trailing return type, where -> is replaced by . |
There was a problem hiding this comment.
I just found out about tok->originalName(). Should this be changed to something like this?
!(nameToken->previous()->originalName() == "->" && (nameToken->tokAt(-2)->str() == ")" || Token::Match(nameToken->tokAt(-3), ") &|&&")))And come to think about it, I'm missing const as well (as in auto f() const -> void {}). Any more I'm missing?
There was a problem hiding this comment.
Pushed a fix for this. A bit different approach, using the fact that the for-loop which searches for ; will always work for function bodies that have a non-void return-type (since there must be a return ... ; somewhere). The only cases I could think of that do not have a semicolon in the body, are void-functions. So checking for -> void { is good enough
Observation: the only function body _not_ containing a semicolon, is a void function: something like
auto make_zero(ini& i) -> void {
while(--i > 0) {}
}
Non-void function? Then it must return a value, and thus contain a semicolon, which is checked for a few lines later.
In the following example, vector was marked as cpp11init due to the mismatch of `%any% {`
auto f() -> std::vector<int> { return {}; }
I made the assumption that whenever "%any% {" matches, endtok must be set too.
If this assumtion doesn't hold (so "%any% {" matches, but endtok == nullptr), then the for-loop would search all the way to the end of stream. Which I guess was not the intention.
As far as I see the new ASTs are correct.
Sounds good. So we will see problems
this is ok for me. |
danmar
left a comment
There was a problem hiding this comment.
as far as I see if you remove the comments for the throws we can merge this
| if (tok == end) | ||
| tok = tok->next(); | ||
| else | ||
| throw InternalError(tok, "Syntax error. Unexpected tokens in designated initializer.", InternalError::AST); // unexpected throw? It might indicate a bug with iscpp11init... |
There was a problem hiding this comment.
please remove the comment I guess the throw is a good idea.
There was a problem hiding this comment.
Removed the comments
…with_designated_initializers
During tokenizing, when a
{was encountered, the corresponding}was dealt with inconsistently.For example:
Further back, in a similar condition, it said
This was a bit too greedy, hopping over all consecutive closing braces.
These cases are now made consistent.
I added a couple of
throws to make some assumptions more explicit, especially for cases where the code was incorrectly (but silently) marked as being a c++11 initializer. Pro: those bugs were silent, not anymore. Con: it might be a bit too strict.3 types of testcases failed (now fixed):
;, therefore incorrectly marking it as a c++11 initializer (found by the addedthrows, fixed by adding;to the unit test)TestVarID::varid_cpp11initializationandTestTokenizer::checkRefQualifiers) in which some tokens incorrectly were marked as c++11 initializer (also found by the addedthrows)class A : B { };was correct (Bis no initializer), butclass A : B, C { };wasn't (Cwas incorrectly marked as initializer), due to the comma (now fixed)auto f() -> X {}:Xwas incorrectly marked as cpp11init.AST case 1:
auto var{ {{},{}}, {} };Old AST:
New AST:
AST case 2:
auto var{{1,a::b{2,3}}, {4,a::b{5,6}}};Old AST:
New AST:
Couple of questions from my side:
throws?cpp11init-test. Now I only test the cases I changed, not all existing code wrtisCpp11init, as that would overlap with the tests inastpar()