Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang-tidy] Reports macro names matching '^_[^A-Z_]' as reserved identifiers #64130

Closed
odkr opened this issue Jul 26, 2023 · 14 comments
Closed
Labels
bug Indicates an unexpected problem or unintended behavior clang-tidy confirmed Verified by a second party

Comments

@odkr
Copy link

odkr commented Jul 26, 2023

When using the "bugprone-reserved-identifier" plugin (or one of its aliases), clang-tidy reports macro names that start with an underscore and are neither followed by another underscore nor an uppercase letter as being reserved identifiers. I don't know about later versions of the C standard, but it would seem to me that such macro names are neither reserved by ISO/IEC 9899:1999 ("C99") nor ISO/IEC 9899:2011 ("C11") .

The relevant paragraph reads:

All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces. (ISO/IEC 9899:1999, § 7.1.2 (1), my emph.)

However, macros are explicitly "not considered" (para. 1) in § 6.2.1 "Scope of identifiers", where file scope is defined (paras. 2 and 4). They also have a scope of their own, "macro scope" (§ 6.10.3.5). So they seem to not have file scope. And I would argue that § 6.2.1 (1) also excludes them from the definition of "ordinary identifiers" in § 6.2.3 (1). They have a name space of their own (§ 6.10.3 (7)), too. So they don't seem to be in the ordinary name space either. Why should § 7.1.2 (1) apply to them then?

I'm not an expert on C standards, but I'm not alone. Pavel Morozkin agrees in a discussion on reserved identifiers on the GCC issue tracker. And the authors of the CERT C coding standard give various example for identifiers that violate § 7.1.2 (1) of the C standard/rule DCL37 of CERT C, but none that involves a macro name that starts with an underscore and is neither followed by another underscore nor an uppercase letter.

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 26, 2023

@llvm/issue-subscribers-clang-tidy

@PiotrZSL
Copy link
Member

PiotrZSL commented Jul 26, 2023

Actually there is compiler warning --Wreserved-macro-identifier that is part of -Wreserved-identifier that also detect those issues.
Interesting thing is that -Wreserved-identifier is even more strict than bugprone-reserved-identifier, because on test it finds 61 issues instead of 35.

"https://eel.is/c++draft/lex.name#3"
"In addition, some identifiers appearing as a token or preprocessing-token are reserved for use by C++ implementations and shall not be used otherwise; no diagnostic is required.
(3.1)
Each identifier that contains a double underscore __ or begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.
(3.2)
Each identifier that begins with an underscore is reserved to the implementation for use as a name in the global namespace."

Personally I wouldn't change this. What could be done is to add (optional) option to this check to ignore macros that start with single _.

@odkr
Copy link
Author

odkr commented Jul 26, 2023

Hmm, I should have added that -- though I stumbled upon the issue because using such macro names was the least bad option in a project of mine -- I concur that using such names is a Bad Idea. But I thought I'd file it as an issue because the C spec appears to allow it. Adding an option sounds like a good idea to me.

@AaronBallman
Copy link
Collaborator

But I thought I'd file it as an issue because the C spec appears to allow it. Adding an option sounds like a good idea to me.

I don't think it's clear that it is allowed. The wording has changed for C23, but the relevant bits regarding macro names are still the same.

6.4.2p7 (in part):

All identifiers that begin with an underscore are reserved for use as identifiers with file scope in both the ordinary and tag name spaces.

6.2.1p1 (in part):

An identifier can denote: a macro name; ... Macro names and macro parameters are not considered further here, because prior to the semantic phase of program translation any occurrences of macro names in the source file are replaced by the preprocessing token sequences that constitute their macro definitions.

6.2.3p1 (in part):

all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants).

6.10.4.5p1:

A macro definition lasts (independent of block structure) until a corresponding #undef directive is encountered or (if none is encountered) until the end of the preprocessing translation unit. Macro definitions have no significance after translation phase 4.

So a macro name is an ordinary identifier and the identifier has some sort of a scope which isn't clearly defined as such but behaves as if it were file scope because the name is a valid macro name until undefined regardless of scope. So a macro named _foo is most likely reserved but not 100% clear due to the scope issue. It seems others have the same interpretation I do: https://devblogs.microsoft.com/oldnewthing/20230109-00/?p=107685

@odkr
Copy link
Author

odkr commented Jul 28, 2023

Thanks for citing some of the relevant sections! And sorry that it took me a while to respond.

I don't think it's clear that it is allowed.

I agree that the wording of the relevant parts of sec. 6.2. could be improved ;-).

[...] all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants). (§ 6.2.3 (1))

I don't see how this establishes that macro names are "ordinary identifiers"; if anything, it would seem to do the opposite: "ordinary identifiers" are "declared". But macros are defined; there is no such thing as a macro declaration.

So I still believe that § 6.2.1 (1) implies that sec. 6.2 as a whole does not apply to macros. In my view, this is corroborated by sec. 6.2 dealing exclusively with topics that have little, if any, bearing on them: types, storage duration, .... I'd also add name spaces in the sense of § 6.2.3 (1) to that list, because the "job" of such name spaces is to disambiguate between multiple instances of the same identifier if they are visible from the the same point of the programme, but macro names are never "visible" in that sense from any point of the programme (per § 6.2.1 (1)). In my mind, this implies that macros have their own name space, and a sui generies name space at that.

And I don't see how macros "behave" as if they had file scope either. One of the defining features of scope is that identifiers at an inner scope can hide identifiers at an outer scope (§ 6.2.1 (4)). But macro names, supposing for the sake of argument that they are in the ordinary namespace, cannot be hidden. Moreover, that macros can be undefined strikes me as important.

Also, I don't think that how macros "behave" is a strong argument either way. The question is whether "identifiers with file scope" (§ 7.1.2 (1)) applies to macro names. That question should be settled by explicating the meaning and, err, scope of "file scope" in the context of the standard. And that macro names cannot be hidden would seem to imply that they do not have any identifier scope.

Here is another argument: In any given name space and within any given scope, an identifier can only refer to one type of entity, right? But this is legal:

#define foo foo
int foo() { return 0; }

Even Clang-tidy doesn't find fault with it ;-).

@AaronBallman
Copy link
Collaborator

Thanks for citing some of the relevant sections! And sorry that it took me a while to respond.

No worries!

I don't see how this establishes that macro names are "ordinary identifiers"; if anything, it would seem to do the opposite: "ordinary identifiers" are "declared". But macros are defined; there is no such thing as a macro declaration.

I think the identifier for a macro is different than the macro itself. e.g., you define a macro with a declared identifier. But this is getting reallllllly too far into "what do these words mean in English?" territory for my comfort.

Because the standard really doesn't say enough to decide for certain, I think the best way forward is to ask WG14. I'll post the question to the reflectors and if there appears to be a decisive answer, that'll be great; if there's not, it'll require writing a paper and will take a bit longer to nail down.

@AaronBallman AaronBallman added bug Indicates an unexpected problem or unintended behavior confirmed Verified by a second party labels Jul 28, 2023
@AaronBallman
Copy link
Collaborator

Because the standard really doesn't say enough to decide for certain, I think the best way forward is to ask WG14. I'll post the question to the reflectors and if there appears to be a decisive answer, that'll be great;

I got a very quick and decisive answer back. (Response from David Keaton, the current WG14 convener, who has a ton of historical knowledge in this area.)

A single underscore followed by a lower-case letter was intended to be reserved only for identifiers, not for macro names.

The purpose was to have a namespace to be used by linker symbols, so a library could have internal names available to allow whatever magic was needed to implement the library. More specifically, this was about the implementation's C library.

So I think we can now confirm this as a bug against clang-tidy.

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 28, 2023

@llvm/issue-subscribers-bug

@odkr
Copy link
Author

odkr commented Jul 28, 2023

Thanks for taking the time to ask WG14 about this! It's nice to have this clarified.

@AaronBallman
Copy link
Collaborator

Thanks for taking the time to ask WG14 about this! It's nice to have this clarified.

Any time, thank you for raising the issue and discussing it!

@carlosgalvezp
Copy link
Contributor

A bit of a side track, let me know if you'd rather discuss it somewhere else :) If -Wreserved-identifier and bugprone-reserved-identifier are meant to provide the same functionality and the latter has a bug, would it make more sense to just remove the check from clang-tidy and rely on the one from clang? It seems inefficient to maintain two separate implementations.

@PiotrZSL
Copy link
Member

Compiler warning does not support configuration, example: #59119
Check is less strict than a compiler warning. Easiest solution would be simply add option AllowedMacros.

@carlosgalvezp
Copy link
Contributor

Fix here!
https://reviews.llvm.org/D156608

@carlosgalvezp carlosgalvezp added the awaiting-review Has pending Phabricator review label Jul 30, 2023
@carlosgalvezp carlosgalvezp removed the awaiting-review Has pending Phabricator review label Jul 30, 2023
@AaronBallman
Copy link
Collaborator

A bit of a side track, let me know if you'd rather discuss it somewhere else :) If -Wreserved-identifier and bugprone-reserved-identifier are meant to provide the same functionality and the latter has a bug, would it make more sense to just remove the check from clang-tidy and rely on the one from clang? It seems inefficient to maintain two separate implementations.

Because clang-tidy runs Clang under the hood, I can see the appeal to relying on the Clang check, However, the Clang check is off by default (at least at the moment) and does not have as much flexibility in configuration. So I think it's fine to have both checks for now, but if the Clang functionality improves to the point it can be enabled by default (which likely involves improving configurability), we can revisit at that point.

razmser pushed a commit to razmser/llvm-project that referenced this issue Sep 8, 2023
…case letter in bugprone-reserved-identifier

Fixes llvm#64130

Differential Revision: https://reviews.llvm.org/D156608
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior clang-tidy confirmed Verified by a second party
Projects
None yet
Development

No branches or pull requests

6 participants