-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[Clang] improve -Wstring-concatenation to warn on every missing comma in initializer lists #154018
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… in initializer lists
@llvm/pr-subscribers-clang Author: Oleksandr T. (a-tarasyuk) ChangesFixes #153745 This PR addresses a limitation in Full diff: https://github.com/llvm/llvm-project/pull/154018.diff 3 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index b35f4ea42818a..48575b6e821f0 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -162,6 +162,8 @@ Improvements to Clang's diagnostics
an override of a virtual method.
- Fixed fix-it hint for fold expressions. Clang now correctly places the suggested right
parenthesis when diagnosing malformed fold expressions. (#GH151787)
+- ``-Wstring-concatenation`` now diagnoses every missing comma in an initializer list,
+ rather than stopping after the first. (#GH153745)
- Fixed an issue where emitted format-signedness diagnostics were not associated with an appropriate
diagnostic id. Besides being incorrect from an API standpoint, this was user visible, e.g.:
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 8ddbaf34a7f47..b63c157d81dc8 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -14708,7 +14708,16 @@ void Sema::CheckCompleteVariableDeclaration(VarDecl *var) {
isa<InitListExpr>(var->getInit())) {
const auto *ILE = cast<InitListExpr>(var->getInit());
unsigned NumInits = ILE->getNumInits();
- if (NumInits > 2)
+ if (NumInits > 2) {
+ auto concatenatedPartsAt = [&](unsigned Index) -> unsigned {
+ const Expr *E = ILE->getInit(Index);
+ if (E) {
+ if (const auto *S = dyn_cast<StringLiteral>(E->IgnoreImpCasts()))
+ return S->getNumConcatenated();
+ }
+ return 0;
+ };
+
for (unsigned I = 0; I < NumInits; ++I) {
const auto *Init = ILE->getInit(I);
if (!Init)
@@ -14721,24 +14730,23 @@ void Sema::CheckCompleteVariableDeclaration(VarDecl *var) {
// Diagnose missing comma in string array initialization.
// Do not warn when all the elements in the initializer are concatenated
// together. Do not warn for macros too.
- if (NumConcat == 2 && !SL->getBeginLoc().isMacroID()) {
- bool OnlyOneMissingComma = true;
- for (unsigned J = I + 1; J < NumInits; ++J) {
- const auto *Init = ILE->getInit(J);
- if (!Init)
- break;
- const auto *SLJ = dyn_cast<StringLiteral>(Init->IgnoreImpCasts());
- if (!SLJ || SLJ->getNumConcatenated() > 1) {
- OnlyOneMissingComma = false;
- break;
- }
- }
+ if (NumConcat == 2) {
+ if (SL->getBeginLoc().isMacroID())
+ continue;
+
+ unsigned L = I > 0 ? concatenatedPartsAt(I - 1) : 0;
+ unsigned R = I + 1 < NumInits ? concatenatedPartsAt(I + 1) : 0;
+
+ // Skip neighbors with multi-part concatenations.
+ if (L > 1 || R > 1)
+ continue;
- if (OnlyOneMissingComma) {
+ // Diagnose when at least one neighbor is a single literal.
+ if (L || R) {
SmallVector<FixItHint, 1> Hints;
- for (unsigned i = 0; i < NumConcat - 1; ++i)
- Hints.push_back(FixItHint::CreateInsertion(
- PP.getLocForEndOfToken(SL->getStrTokenLoc(i)), ","));
+ // Insert a comma between the two tokens of this element.
+ Hints.push_back(FixItHint::CreateInsertion(
+ PP.getLocForEndOfToken(SL->getStrTokenLoc(0)), ", "));
Diag(SL->getStrTokenLoc(1),
diag::warn_concatenated_literal_array_init)
@@ -14746,10 +14754,9 @@ void Sema::CheckCompleteVariableDeclaration(VarDecl *var) {
Diag(SL->getBeginLoc(),
diag::note_concatenated_string_literal_silence);
}
- // In any case, stop now.
- break;
}
}
+ }
}
diff --git a/clang/test/Sema/string-concat.c b/clang/test/Sema/string-concat.c
index 63abf100c020f..ae6c0897878a7 100644
--- a/clang/test/Sema/string-concat.c
+++ b/clang/test/Sema/string-concat.c
@@ -168,3 +168,17 @@ const char *extra_parens_to_suppress_warning[] = {
"promise"),
"shared_future"
};
+
+const char *multiple_missing_commas[] = {
+ "1",
+ "2" // expected-note {{place parentheses around the string literal to silence warning}}
+ "3", // expected-warning {{suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma?}}
+ "4",
+ "5",
+ "6" // expected-note {{place parentheses around the string literal to silence warning}}
+ "7", // expected-warning {{suspicious concatenation of string literals in an array initialization; did you mean to separate the elements with a comma?}}
+ "8",
+ "9",
+ "10",
+ "11",
+};
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
For what it's worth, this causes
const char *const trunk_errmsg[] = { /* map errcode */
"", /* There is no such error code at index 0*/
"link negotiated speed does not match existing"
" trunk - link was \"low\" speed",
"link negotiated speed does not match"
" existing trunk - link was \"middle\" speed",
"link negotiated speed does not match existing"
" trunk - link was \"high\" speed",
"Attached to non-trunking port - F_Port",
"Attached to non-trunking port - N_Port",
"FLOGI response timeout",
"non-FLOGI frame received",
"Invalid FLOGI response",
"Trunking initialization protocol",
"Trunk peer device mismatch",
}; diff --git a/drivers/scsi/lpfc/lpfc_attr.c b/drivers/scsi/lpfc/lpfc_attr.c
index 33582d48ec09..433cddcae4b6 100644
--- a/drivers/scsi/lpfc/lpfc_attr.c
+++ b/drivers/scsi/lpfc/lpfc_attr.c
@@ -68,12 +68,9 @@
const char *const trunk_errmsg[] = { /* map errcode */
"", /* There is no such error code at index 0*/
- "link negotiated speed does not match existing"
- " trunk - link was \"low\" speed",
- "link negotiated speed does not match"
- " existing trunk - link was \"middle\" speed",
- "link negotiated speed does not match existing"
- " trunk - link was \"high\" speed",
+ "link negotiated speed does not match existing trunk - link was \"low\" speed",
+ "link negotiated speed does not match existing trunk - link was \"middle\" speed",
+ "link negotiated speed does not match existing trunk - link was \"high\" speed",
"Attached to non-trunking port - F_Port",
"Attached to non-trunking port - N_Port",
"FLOGI response timeout",
static const char *ferr_fat_fbd_name[] = {
[22] = "Non-Redundant Fast Reset Timeout",
[2] = ">Tmid Thermal event with intelligent throttling disabled",
[1] = "Memory or FBD configuration CRC read error",
[0] = "Memory Write error on non-redundant retry or "
"FBD configuration Write error on retry",
}; diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 69068f8d0cad..939b855e8fbf 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -193,8 +193,7 @@ static const char *ferr_fat_fbd_name[] = {
[22] = "Non-Redundant Fast Reset Timeout",
[2] = ">Tmid Thermal event with intelligent throttling disabled",
[1] = "Memory or FBD configuration CRC read error",
- [0] = "Memory Write error on non-redundant retry or "
- "FBD configuration Write error on retry",
+ [0] = "Memory Write error on non-redundant retry or FBD configuration Write error on retry",
};
#define GET_FBD_FAT_IDX(fbderr) (((fbderr) >> 28) & 3)
#define FERR_FAT_FBD_ERR_MASK ((1 << 0) | (1 << 1) | (1 << 2) | (1 << 22)) However, this third one is a little interesting because the warning happens on the last instance of a sequence of multi-line concatenations:
const char *sja1105_static_config_error_msg[] = {
[SJA1105_CONFIG_OK] = "",
[SJA1105_TTETHERNET_NOT_SUPPORTED] =
"schedule-table present, but TTEthernet is "
"only supported on T and Q/S",
[SJA1105_INCORRECT_TTETHERNET_CONFIGURATION] =
"schedule-table present, but one of "
"schedule-entry-points-table, schedule-parameters-table or "
"schedule-entry-points-parameters table is empty",
[SJA1105_INCORRECT_VIRTUAL_LINK_CONFIGURATION] =
"vl-lookup-table present, but one of vl-policing-table, "
"vl-forwarding-table or vl-forwarding-parameters-table is empty",
[SJA1105_MISSING_L2_POLICING_TABLE] =
"l2-policing-table needs to have at least one entry",
[SJA1105_MISSING_L2_FORWARDING_TABLE] =
"l2-forwarding-table is either missing or incomplete",
[SJA1105_MISSING_L2_FORWARDING_PARAMS_TABLE] =
"l2-forwarding-parameters-table is missing",
[SJA1105_MISSING_GENERAL_PARAMS_TABLE] =
"general-parameters-table is missing",
[SJA1105_MISSING_VLAN_TABLE] =
"vlan-lookup-table needs to have at least the default untagged VLAN",
[SJA1105_MISSING_XMII_TABLE] =
"xmii-table is missing",
[SJA1105_MISSING_MAC_TABLE] =
"mac-configuration-table needs to contain an entry for each port",
[SJA1105_OVERCOMMITTED_FRAME_MEMORY] =
"Not allowed to overcommit frame memory. L2 memory partitions "
"and VL memory partitions share the same space. The sum of all "
"16 memory partitions is not allowed to be larger than 929 "
"128-byte blocks (or 910 with retagging). Please adjust "
"l2-forwarding-parameters-table.part_spc and/or "
"vl-forwarding-parameters-table.partspc.",
}; I will probably suggest fixing it like so just to be consistent but I am sure someone will ask why it only flags the third instance and none of the others (I realize because of the heuristic but still). diff --git a/drivers/net/dsa/sja1105/sja1105_static_config.c b/drivers/net/dsa/sja1105/sja1105_static_config.c
index ffece8a400a6..45564b62ce41 100644
--- a/drivers/net/dsa/sja1105/sja1105_static_config.c
+++ b/drivers/net/dsa/sja1105/sja1105_static_config.c
@@ -980,15 +980,15 @@ static u64 blk_id_map[BLK_IDX_MAX] = {
const char *sja1105_static_config_error_msg[] = {
[SJA1105_CONFIG_OK] = "",
[SJA1105_TTETHERNET_NOT_SUPPORTED] =
- "schedule-table present, but TTEthernet is "
- "only supported on T and Q/S",
+ ("schedule-table present, but TTEthernet is "
+ "only supported on T and Q/S"),
[SJA1105_INCORRECT_TTETHERNET_CONFIGURATION] =
- "schedule-table present, but one of "
+ ("schedule-table present, but one of "
"schedule-entry-points-table, schedule-parameters-table or "
- "schedule-entry-points-parameters table is empty",
+ "schedule-entry-points-parameters table is empty"),
[SJA1105_INCORRECT_VIRTUAL_LINK_CONFIGURATION] =
- "vl-lookup-table present, but one of vl-policing-table, "
- "vl-forwarding-table or vl-forwarding-parameters-table is empty",
+ ("vl-lookup-table present, but one of vl-policing-table, "
+ "vl-forwarding-table or vl-forwarding-parameters-table is empty"),
[SJA1105_MISSING_L2_POLICING_TABLE] =
"l2-policing-table needs to have at least one entry",
[SJA1105_MISSING_L2_FORWARDING_TABLE] =
@@ -1004,12 +1004,12 @@ const char *sja1105_static_config_error_msg[] = {
[SJA1105_MISSING_MAC_TABLE] =
"mac-configuration-table needs to contain an entry for each port",
[SJA1105_OVERCOMMITTED_FRAME_MEMORY] =
- "Not allowed to overcommit frame memory. L2 memory partitions "
+ ("Not allowed to overcommit frame memory. L2 memory partitions "
"and VL memory partitions share the same space. The sum of all "
"16 memory partitions is not allowed to be larger than 929 "
"128-byte blocks (or 910 with retagging). Please adjust "
"l2-forwarding-parameters-table.part_spc and/or "
- "vl-forwarding-parameters-table.partspc.",
+ "vl-forwarding-parameters-table.partspc."),
};
static sja1105_config_valid_t Maybe this warning could take into account using designators probably means the concatenation is intentional? Not sure if that is possible. |
That looks like perfectly reasonable code we probably should not be warning on IMO. Requiring the user to parenthesize it or concatenate it is kind of annoying. I wonder if another suppression mechanism would be to silence in the presence of a
I'm not certain that a designator is enough of a signal to mean the concatenation is intentional. The tradeoff here is false positives vs false negatives, and I think this is starting to show that we're introducing more false positives. Another idea would be to revert these changes and instead make a clang-tidy check that's more aggressive. |
FWIW, we're seeing this in Fuchsia too. The logs show that this is mostly in third party code, but that also means we may not be able to easily get it changed, and will have to suppress these on a case-by-case basis. In particular the tcpdump cases also seem benign to me:
I think that would be our preference. |
@ilovepi @nathanchance @AaronBallman I’ve opened a PR to roll back the changes for now, to finalize the intended behavior of this flag. |
…ng comma in initializer lists" (#154369) Revert #154018 changes due to excessive _false positives_. The warning caused multiple benign reports in large codebases (e.g. _Linux kernel_, _Fuchsia_, _tcpdump_). Since many of these concatenations are intentional and follow project style rules, the diagnostic introduced more false positives than value. This will be revisited as a potential `clang-tidy` check instead.
…every missing comma in initializer lists" (#154369) Revert llvm/llvm-project#154018 changes due to excessive _false positives_. The warning caused multiple benign reports in large codebases (e.g. _Linux kernel_, _Fuchsia_, _tcpdump_). Since many of these concatenations are intentional and follow project style rules, the diagnostic introduced more false positives than value. This will be revisited as a potential `clang-tidy` check instead.
We noticed false positives in our internal codebase as well, and used parens to fix them. As expect, the warning also identified additional bugs. I hope there will either be a flag that performs as expected (concatenation is bad unless it is explicitly marked as good) or a clang-tidy to capture the same effect. It seems perverse to rely on indentation and project formatting to signal safety, as many of the above false positives do. This reminds me of implicit fallthroughs where projects used many inconsistent variants of Here there are many ways to signal intended concatenation, but they all rely on fallible human inspection to determine that they're fine. I don't think it's good to rely on visual inspection. clang-tidy is less desirable from our perspective because it is often a retroactive catch rather than an immediate catch. |
Fixes #153745
This PR addresses a limitation in
-Wstring-concatenation
, where only the first missing comma in an initializer list was diagnosed.