Skip to content

#11134 Fix broken AST with (designated) initializers#4550

Merged
orbitcowboy merged 11 commits intocppcheck-opensource:mainfrom
gerboengels:11134_fix_broken_ast_with_designated_initializers
Oct 19, 2022
Merged

#11134 Fix broken AST with (designated) initializers#4550
orbitcowboy merged 11 commits intocppcheck-opensource:mainfrom
gerboengels:11134_fix_broken_ast_with_designated_initializers

Conversation

@gerboengels
Copy link
Copy Markdown
Contributor

During tokenizing, when a { was encountered, the corresponding } was dealt with inconsistently.
For example:

           if (Token::Match(tok, "{ . %name% =|{")) {
                //...
                compileBinOp(tok, state, compileExpression); // tok points to closing `}`
            } else if (Token::simpleMatch(tok, "{ }")) {
                // ...
                tok = tok->tokAt(2); // tok points to _after_ `}`

Further back, in a similar condition, it said

            while (Token::simpleMatch(tok, "}"))
                tok = tok->next();

This was a bit too greedy, hopping over all consecutive closing braces.

These cases are now made consistent.

I added a couple of throws to make some assumptions more explicit, especially for cases where the code was incorrectly (but silently) marked as being a c++11 initializer. Pro: those bugs were silent, not anymore. Con: it might be a bit too strict.

3 types of testcases failed (now fixed):

  • Unit test with a class declaration with a missing ;, therefore incorrectly marking it as a c++11 initializer (found by the added throws, fixed by adding ; to the unit test)
  • Unit tests (TestVarID::varid_cpp11initialization and TestTokenizer::checkRefQualifiers) in which some tokens incorrectly were marked as c++11 initializer (also found by the added throws)
    • class A : B { }; was correct (B is no initializer), but class A : B, C { }; wasn't (C was incorrectly marked as initializer), due to the comma (now fixed)
    • Function with trailing return type: auto f() -> X {}: X was incorrectly marked as cpp11init.
  • For 2 tests, the AST was wrong (I think) and now is correct (I think). Please confirm:

AST case 1: auto var{ {{},{}}, {} };
Old AST:

{
|-var
`-{
  `-,
    |-,
    | |-{
    | `-{
    `-{

New AST:

{
|-var
`-,
  |-{
  | `-,
  | | |-{
  | | `-{
  `-{

AST case 2: auto var{{1,a::b{2,3}}, {4,a::b{5,6}}};

Old AST:

{
|-var
`-{
  `-,
    |-,
    | |-1 'signed int'
    | `-{
    | | |-::
    | | | |-a
    | | | `-b
    | | `-,
    | | | |-2 'signed int'
    | | | `-3 'signed int'
    `-{
      `-,
        |-4 'signed int'
        `-{
          |-::
          | |-a
          | `-b
          `-,
            |-5 'signed int'
            `-6 'signed int'

New AST:

{
|-var
`-,
  |-{
  | `-,
  | | |-1 'signed int'
  | | `-{
  | | | |-::
  | | | | |-a
  | | | | `-b
  | | | `-,
  | | | | |-2 'signed int'
  | | | | `-3 'signed int'
  `-{
    `-,
      |-4 'signed int'
      `-{
        |-::
        | |-a
        | `-b
        `-,
          |-5 'signed int'
          `-6 'signed int'

Couple of questions from my side:

  1. Can someone confirm the new AST's are correct? (I'm still new to the cppcheck codebase, have no experience interpreting those AST's)
  2. Keep the added throws?
  3. I'm not sure about the added cpp11init-test. Now I only test the cases I changed, not all existing code wrt isCpp11init, as that would overlap with the tests in astpar()

- _always_, because in some cases this was omitted (around line 790) or too strict (around line 860)
- _only_, and not following tokens which happen to be } as well (around line 1030)
auto var{ {{},{}}, {} };

Old AST:
```
{
|-var
`-{
  `-,
    |-,
    | |-{
    | `-{
    `-{
```
New AST:
```
{
|-var
`-,
  |-{
  | `-,
  | | |-{
  | | `-{
  `-{
```
Compare the same example, but with `X{}` instead of just `{}`:
`auto var{ a{b{},c{}}, d{} };`
```
{
|-var
`-,
  |-{
  | |-a
  | `-,
  | | |-{
  | | | `-b
  | | `-{
  | | | `-c
  `-{
    `-d
```
This structure is similar to that of the new AST, not the old AST
Code: `auto var{{1,a::b{2,3}}, {4,a::b{5,6}}};`

Old AST:
```
{
|-var
`-{
  `-,
    |-,
    | |-1 'signed int'
    | `-{
    | | |-::
    | | | |-a
    | | | `-b
    | | `-,
    | | | |-2 'signed int'
    | | | `-3 'signed int'
    `-{
      `-,
        |-4 'signed int'
        `-{
          |-::
          | |-a
          | `-b
          `-,
            |-5 'signed int'
            `-6 'signed int'
```
New AST:
```
{
|-var
`-,
  |-{
  | `-,
  | | |-1 'signed int'
  | | `-{
  | | | |-::
  | | | | |-a
  | | | | `-b
  | | | `-,
  | | | | |-2 'signed int'
  | | | | `-3 'signed int'
  `-{
    `-,
      |-4 'signed int'
      `-{
        |-::
        | |-a
        | `-b
        `-,
          |-5 'signed int'
          `-6 'signed int'
```
… marked as cpp11init

Because of the missing `;` after the class declaration, it was marked as a cpp11init block.
Which it isn't, and which now throws an exception
The following unit tests failed on the newly introduced throws, because the code for these tests incorrectly marked some tokens as cpp11init:
TestVarID::varid_cpp11initialization
TestTokenizer::checkRefQualifiers
Comment thread lib/tokenlist.cpp Outdated
if (Token::Match(nameToken, "%name%|return|: {") &&
(!Token::simpleMatch(nameToken->tokAt(2), "[") || findLambdaEndScope(nameToken->tokAt(2))))
(!Token::simpleMatch(nameToken->tokAt(2), "[") || findLambdaEndScope(nameToken->tokAt(2))) &&
!Token::simpleMatch(nameToken->tokAt(-2), ") .") && !Token::Match(nameToken->tokAt(-3), ") &|&& .")) // trailing return type, where -> is replaced by .
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found out about tok->originalName(). Should this be changed to something like this?

!(nameToken->previous()->originalName() == "->" && (nameToken->tokAt(-2)->str() == ")" || Token::Match(nameToken->tokAt(-3), ") &|&&")))

And come to think about it, I'm missing const as well (as in auto f() const -> void {}). Any more I'm missing?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a fix for this. A bit different approach, using the fact that the for-loop which searches for ; will always work for function bodies that have a non-void return-type (since there must be a return ... ; somewhere). The only cases I could think of that do not have a semicolon in the body, are void-functions. So checking for -> void { is good enough

Observation: the only function body _not_ containing a semicolon, is a void function: something like
   auto make_zero(ini& i) -> void {
     while(--i > 0) {}
   }
Non-void function? Then it must return a value, and thus contain a semicolon, which is checked for a few lines later.
In the following example, vector was marked as cpp11init due to the mismatch of `%any% {`
auto f() -> std::vector<int> { return {}; }

I made the assumption that whenever "%any% {" matches, endtok must be set too.
If this assumtion doesn't hold (so "%any% {" matches, but endtok == nullptr), then the for-loop would search all the way to the end of stream. Which I guess was not the intention.
@danmar
Copy link
Copy Markdown
Collaborator

danmar commented Oct 16, 2022

Can someone confirm the new AST's are correct? (I'm still new to the cppcheck codebase, have no experience interpreting those AST's)

As far as I see the new ASTs are correct.

Keep the added throws?

Sounds good. So we will see problems

I'm not sure about the added cpp11init-test. Now I only test the cases I changed, not all existing code wrt isCpp11init, as that would overlap with the tests in astpar()

this is ok for me.

Copy link
Copy Markdown
Collaborator

@danmar danmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as far as I see if you remove the comments for the throws we can merge this

Comment thread lib/tokenlist.cpp Outdated
if (tok == end)
tok = tok->next();
else
throw InternalError(tok, "Syntax error. Unexpected tokens in designated initializer.", InternalError::AST); // unexpected throw? It might indicate a bug with iscpp11init...
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove the comment I guess the throw is a good idea.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comments

@orbitcowboy orbitcowboy merged commit 6a01fa9 into cppcheck-opensource:main Oct 19, 2022
@gerboengels gerboengels deleted the 11134_fix_broken_ast_with_designated_initializers branch October 21, 2022 07:17
@chrchr-github
Copy link
Copy Markdown
Collaborator

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants