Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang-tidy] The first PR our of many PRs for the "Initialized Class Members" check. #65189

Closed
wants to merge 3 commits into from

Conversation

adriannistor
Copy link

The goal of this checker is to eliminate UUM (Use of Uninitialized Memory) bugs caused by uninitialized class members.

This checker is different from ProTypeMemberInitCheck in that this checker attempts to eliminate UUMs as a bug class, at the expense of false positives.

This checker is WIP. We are incrementally adding features and increasing coverage until we get to a shape that is acceptable.

The goal of this checker is to eliminate UUM (Use of Uninitialized Memory) bugs caused by uninitialized class members.

This checker is different from ProTypeMemberInitCheck in that this checker attempts to eliminate UUMs as a bug class, at the expense of false positives.

This checker is WIP. We are incrementally adding features and increasing coverage until we get to a shape that is acceptable.
@adriannistor adriannistor requested a review from a team as a code owner September 1, 2023 23:06
@adriannistor
Copy link
Author

@Xazax-hun @gribozavr

Copy link
Collaborator

@Xazax-hun Xazax-hun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really cool check! I can't wait to see how it turns out :)

@adriannistor
Copy link
Author

@ymand

@adriannistor
Copy link
Author

Thank you for the feedback @Xazax-hun ! Working on it!

.. code-block:: c++
struct SomeStruct {
int X;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be good idea to add link to relevant Coding Guidelines.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, fully agree! I added a broader discussion below. To summarize (full context there), we are still working the high level specifications and the low level specifications based on data. Once we have a stable and mature version, we will update the documentation for this checker in a more concrete and actionable way. Until then, we will keep this file (cpp-init-class-members.rst) and the unit tests as a high level and low level specifications, respectively. Full details in the broader discussion below. Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not closing this because technically I did not change the code. However, if you think the comment above fully addresses this issue, please close this (if the comment does not fully address, please let me know what you think! --- thanks!)

@EugeneZelenko
Copy link
Contributor

@PiotrZSL, @carlosgalvezp: Somehow label was not assigned automatically, so I mention you manually.

@PiotrZSL
Copy link
Member

PiotrZSL commented Sep 2, 2023

" Checks that class members are initialized in constructors (implicitly or explicitly). Reports constructors or classes where class members are not initialized."
This is exacly what cppcoreguidelines-pro-type-member-init is doing.

And provide same warnings:

10:7: warning: constructor does not initialize these fields: X [cppcoreguidelines-pro-type-member-init]
50:8: warning: constructor does not initialize these fields: Y [cppcoreguidelines-pro-type-member-init]

The only warning that cppcoreguidelines-pro-type-member-init is doing does not provide is for implicitly deleted constructor, wiht is fine.
I didn't found any rule about this in Google codding standard.

Currently I dont see any reason for this check existence, and any benefit that it could bring over cppcoreguidelines-pro-type-member-init.
If somehow cppcoreguidelines-pro-type-member-init does not cover some important part, then basicly cppcoreguidelines-pro-type-member-init should be extended, or in extreame case just renamed into "bugprone-uninitialized-member", extended, and aliased into cppcoreguidelines or any other standards. Doing same thing from scrach and generate same kind of warnings is basicly reundant.

.bind("ctor"),
this);

Finder->addMatcher(cxxRecordDecl(isDefinition(), hasDefaultConstructor(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasDefaultConstructor and unless(has(cxxConstructorDecl())) may opose them self, be more specific, like has user specific constructor....

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. Why is "may opose [sic] them self" a problem? If they sometimes oppose themselves, the condition is false and the matcher is not added. The important thing is to not oppose themselves always (in which case the condition would always be false), but that is not the case here.

We can refine this condition later.

For now, just saying "is default constructor but has no declaration in the AST" is good enough (catches a lot of our cases). We can refine later if we have data (false positives or false negatives).

My understanding is that you find this condition too broad? In that case, that is good. Because that will allow us to see false positives (and filter them out). If the condition would be too narrow, we would have false negatives. False negatives are a problem, because we may not be aware that we have them.

In short, let's leave this as it is for now and we can refine later based on data.

Thanks!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not closing this because technically I did not change the code. However, if you think the comment above fully addresses this issue, please close this (if the comment does not fully address, please let me know what you think! --- thanks!)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing is that sometimes constructor declarations wont be visible in AST until they actually used first time, this may happen for implicit constructors.

In such case, hasDefaultConstructor may return true, but unless(has(cxxConstructorDecl())) may return false or true, depend if constructor is used or not. I assume that what you wanted to check is to check if default constructor is implicit. Other thing that instead of using "has" it should be "hasMethod".
And here you could use things like unless(has(cxxConstructorDecl(unless(isImplicit())))).

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not re-implement thigns that are already covered by other check (cppcoreguidelines-pro-type-member-init).
I don't see curently any point for this check.
If there are some small difrences bettwen this check and cppcoreguidelines-pro-type-member-init, then simply implement them as a part of cppcoreguidelines-pro-type-member-init (under configuration option), or even better as completly separate check that only check for what is not covered by cppcoreguidelines-pro-type-member-init.

@PiotrZSL PiotrZSL changed the title Adding an initial version of the "Initialized Class Members" checker. [clang-tidy] Adding an initial version of the "Initialized Class Members" checker. Sep 2, 2023
@PiotrZSL PiotrZSL requested a review from a team September 2, 2023 04:40

This checker is different from ProTypeMemberInitCheck in that this checker
attempts to eliminate UUMs as a bug class, at the expense of false
positives.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not entirely clear to me what "UUMs as a bug class" means. Could you provide an example where ProTypeMemberInitCheck doesn't detect the problem, and this check does? Also, why shouldn't that be fixed in ProTypeMemberInitCheck instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Examples

Could you provide an example where ProTypeMemberInitCheck doesn't detect the problem, and this check does?

The 2 cases for which this PR is written (available in the unit tests) are not found by ProTypeMemberInitCheck.

Specifically (I copy them from the unit test):

class PositiveDefaultedDefaultConstructor {
public:
  PositiveDefaultedDefaultConstructor() = default;
  // CHECK-MESSAGES: :[[@LINE-1]]:3: warning: constructor should initialize these fields: X

private:
  int X;
};

and

struct PositiveStruct {
  // CHECK-MESSAGES: :[[@LINE-1]]:8: warning: these fields should be initialized: X, Y
  int X;
  int Y;
};

Also a variation (again, copied them from the unit test)

class PositiveDefaultedConstructorObjectAndPrimitive {
 public:
  PositiveDefaultedConstructorObjectAndPrimitive() = default;
  // CHECK-MESSAGES: :[[@LINE-1]]:3: warning: constructor should initialize these fields: Y

  Helper* GetHelper() { return &X; }

  void SetY(bool enabled) { Y = enabled; }

  bool IsY() { return Y; }

 private:
  Helper X;
  bool Y;
};

Regarding adding to the existing checker.

I fully discuss this in the comment below. Please take a look there for the full context. Thank you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First example is fine, but for that exist check should be just extended.
Second example is not so fine, simply because this is POD, and by adding initialization you change this, and this may impact usage of such class, like zeroed using memset, used in union. For me if you want to force initialization for such classes, this should be separated check or option in exist one, maybe distinguish based on keyword struct vs class.
Last issue is already detected by cppcoreguidelines-pro-type-member-init.

@adriannistor
Copy link
Author

adriannistor commented Sep 6, 2023

@PiotrZSL @carlosgalvezp

Piotr, thank you for your feedback!

You are right: if this was the entire checker, it would make no sense to add a new checker.

It was my mistake, I should have added more context and explanations to this PR (doing that now!).

This is only the first PR out of many PRs.

The first PR (and probably a few subsequent PRs) will look like they could have been added to cppcoreguidelines-pro-type-member-init.

However, the more we add, the more the two checker will diverge.

I actually considered expanding cppcoreguidelines-pro-type-member-init, but based on our current plan (more details next) expanding cppcoreguidelines-pro-type-member-init will soon become a lot of if-then-else blocks.

I updated the documentation (second commit a few minutes ago) in cpp-init-class-members.rst, ReleaseNotes.rst, and the .H file to reflect the above (i.e., this is the first commit our of many commits and this checker is under active development).

If you would prefer that we write the checker until the end and then upstream, we would be very glad to do that, just let me know!

We felt it is better to incrementally upstream the checker such that the community can give early feedback. Additionally, if we develop the checker internally and only upstream the checker at the end, there may be internal assumptions that creep in and then it will be difficult to remove for upstream. Additionally, porting the internal version to the upstream version in one go may require non-trivial work.

However, either way if fine for us, just let me know!

Regarding the checker specifications (especially as they compare the cppcoreguidelines-pro-type-member-init):

  • We have an internal document that sets the direction and high level requirements for the checker. We are actively updating the requirements based on data, so writing a public version for that document would be premature (and it will require us to invest time and effort into something that will anyway likely change). Once that document and the checker and the data are in a stable version, we plan to improve the documentation for requirements.

  • We also have data that is guiding the low level development. We are actively gathering and analyzing the data, so our low level plans are also under constant change. However, you can see the low level requirements captured in the unit tests. As we implement more, the unit tests will capture that. Once the checker is in a stable version, we plan to improve the low level public requirements.

  • We plan to update the Google C++ Style Guide with the final specification for this checker. However, we can do this only when the specification is (very) stable (see above discussion about the specification still being under development) because obviously we do not want to introduce instability into the Google C++ Style Guide.

Additionally, in a comment above I list the 3 cases (copied from the unit test) that are not reported by the current checker (however, this is a minor detail --- as explained above, the key idea here is not that this PR handles 3 cases not handled by the existing check. Rather, the key idea is that this is a first PR our of many PRs).

TLDR:

  • This is the first of many PRs.

  • We are happy to do what is the best for you (just let us know!), i.e., either: (1) upstream the checker incrementally with PRs or (2) upstream the checker all at once at the end (though there are tradeoffs, discussed above).

  • We are building the high level requirements and the low level implementation details as we go, based on data and analysis that we are currently working on. We are including some documentation (to capture the high level requirements) and unit tests (to capture the low level requirements) in the PRs. We plan to update the documentation in a more stable version once the checker, our data, our analysis of our data, and the (high level and low level) requirements are in a more stable and mature shape.

  • We plan to update the Google C++ Style Guide with the final specification for this checker. However, we can do this only when the specification is (very) stable (see above discussion about the specification still being under development) because obviously we do not want to introduce instability into the Google C++ Style Guide.

  • In a comment above I list the 3 cases (copied from the unit test) that are not reported by the current checker . (however, this is a minor detail --- as explained above, the key idea here is not that this PR handles 3 cases not handled by the existing check. Rather, the key idea is that this is a first PR our of many PRs)

Please let me know what you think of the above plan and any feedback you have! I would be very glad to work based on your feedback!

Thanks!

@adriannistor adriannistor changed the title [clang-tidy] Adding an initial version of the "Initialized Class Members" checker. [clang-tidy] The first PR our of many PRs for the "Initialized Class Members" checker. Sep 7, 2023
@adriannistor adriannistor changed the title [clang-tidy] The first PR our of many PRs for the "Initialized Class Members" checker. [clang-tidy] The first PR our of many PRs for the "Initialized Class Members" check. Sep 7, 2023
…enaming method names, changing method parameter types.
@adriannistor
Copy link
Author

@Xazax-hun @carlosgalvezp @EugeneZelenko @PiotrZSL

I addressed all the comments in this PR. Please take a look and let me know what you think!

@PiotrZSL @carlosgalvezp , for the comment regarding the the existing checker, I fully discuss this topic in this newly posted comment. Please take a look and let me know what you think!

Thanks!

@Xazax-hun
Copy link
Collaborator

As far as I understand, the C++ Core Guidelines does not explicitly mention whether we need to add initializers in code like:

struct X {
int a, int b;
};

It is up to the guidelines someone is following whether we require member-initializers, or whether we don't require them, but require initialization at the allocation site:

X x; // warn here?

I think in order to warn about the lack of member initializers, the very least we would need to ask the core guidelines' editors to do clarifications. I am not opposed to that, but for the time being, I think it makes sense to continue the development on a separate check as long as the differences are clearly documented. Once the end result looks similar enough, we can potentially remove one of them, or make some aspects configurable in the same check. What do you think?

Copy link
Collaborator

@Xazax-hun Xazax-hun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine to me, added some a comment about a potential TODO.

if (Ty.isNull())
return false;

// FIXME: For now, this check focuses on several allowlisted types. We will
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if you will need to do more than checking for a couple fo types.

For example, consider the scenario:

// 3rdPartyHeader.h:
struct OthersPair { int first, second; };

// MyCode:
struct MyStruct {
 int a = 1;
 OthersPair p;
};

Here, the user might have no control over OthersPair, it is coming from a 3rd party library. It does not have a constructor to initialize its members, so we need to add member initializers in MyStruct.

and are actively working on more commits. Users who want a mature and stable
check should not use this check yet.

This check is different from ProTypeMemberInitCheck in that this check
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProTypeMemberInitCheck is an implementation detail - please refer to the user-facing check name (cppcoreguidelines-pro-type-member-init) together with a clickable link. Feel free to check out the Release Notes on how we typically refer to checks.

positives. The authors of this check will add more documentation about the
differences with ProTypeMemberInitCheck as the check evolves.

For now, this check reports `X` in the following two patterns:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to show what the "fixed" code would look like.

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest you still didn't clarify how this check is different from an exist ones.
Entire section "Regarding the checker specifications (especially as they compare the cppcoreguidelines-pro-type-member-init):" says only "there are some requirements and plans somewere" but does not say what actually.

The only information about this check comes from examples that you put.
And I see there 2 cases:

  1. Handling default constructors - this should be handled in exist check
  2. Forcing initialization for POD types - this should be actually separate check or separate option to exist one.

Except above I still don't see any reason why this check should exist in this form.
Name of check suggest class members, but then example show POD structs.

Work more on documentation, "This checker is different from ProTypeMemberInitCheck in that this checker
attempts to eliminate UUMs as a bug class, at the expense of false
positives." says nothing, ProTypeMemberInitCheck also "attempts to eliminate UUMs"...

Thing is that you still got big todo list, and things like initialization by using methods, order of initialization and at the end you still going to end up with bunch of "if-else", not even mentioning needed configuration to exclude some types or some scenarios.

Start with a list of requirements, then we can thing if that should be new check, extended exist check, or multiple checks. For me would be sufficient to add "IncludeTriviallyDefaultConstructible" option to exist check.

@carlosgalvezp
Copy link
Contributor

Thanks for the detailed response @adriannistor !

will soon become a lot of if-then-else blocks.

In general the decision as to which checks to have should be based on Coding Guidelines and user-facing experience - not on implementation complexity. In that sense, as a user I would find confusing to have 2 checks that do nearly the same.

If I understand the discussion correctly, the intention is to create a check that is stricter than the C++ Core Guidelines, is that correct? There is some vagueness in the Guidelines as well. Typically we handle this with Options, allowing users to enable a more strict or lenient version. Would that be an alternative?

Long-term, I believe we should just have 1 single check for "initialize your class members". This can be done in different ways, for example expanding the C++ Core Guidelines one, or replacing it with this new check (which should have relevant configuration options to ensure it behaves according to the spec of the Guidelines). I would be fine with temporarily having 2 checks in parallel as long as there's a plan for merging them eventually.

If you would prefer that we write the checker until the end and then upstream, we would be very glad to do that, just let me know!

I think it's great that you are splitting the development into smaller pieces so it's easier to do a quality review. Would it however make sense to keep a WIP documentation where one can see the approximate roadmap of this check, i.e what use cases are implemented and which ones will come?

@adriannistor
Copy link
Author

@carlosgalvezp @PiotrZSL @Xazax-hun @EugeneZelenko

Thank you for your feedback! I appreciate the time you spent on this PR!

After discussion in our team based on the feedback in this PR, our current plan is to develop the check to a more mature shape and then upstream it here (i.e., to not pursue the current approach of upstreaming the checker as we develop it, in many PRs).

This approach will address the feedback in this PR, e.g., clarify the differences to the existing checker, bring the check's (high level and low level) requirements to a more stable shape, etc.

Based on the above, I will close this PR.

I will get back in touch when we are ready with the more stable version!

Thank you!

Adrian

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants