-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[DRAFT][clang-format][lld] Run clang-format on lld #80461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Using the same critera as llvm/utils/git/code-format-helper.py, clang-format source files in the lld tree. Instructions and a script to help downstream repos resolve conflicts will be committed separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed the first few changes here, and I see a couple of examples where auto-format is producing worse result. To me, this definitely calls for the need for human judgement on these changes. I would not approve this review as it stands.
case IMAGE_REL_AMD64_ADDR64: | ||
add64(off, s + imageBase); | ||
break; | ||
case IMAGE_REL_AMD64_ADDR32NB: add32(off, s); break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be slightly less readable, but not terrible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a situation where I think having it look slightly worse is worth the general consistency autoformatting brings... But for the record you could also go for // clang-format: off
:
// clang-format off
case IMAGE_REL_AMD64_ADDR32NB: add32(off, s); break;
case IMAGE_REL_AMD64_REL32: add32(off, s - p - 4); break;
case IMAGE_REL_AMD64_REL32_1: add32(off, s - p - 5); break;
case IMAGE_REL_AMD64_REL32_2: add32(off, s - p - 6); break;
case IMAGE_REL_AMD64_REL32_3: add32(off, s - p - 7); break;
case IMAGE_REL_AMD64_REL32_4: add32(off, s - p - 8); break;
case IMAGE_REL_AMD64_REL32_5: add32(off, s - p - 9); break;
// clang-format on
size_t MergeChunk::getSize() const { | ||
return builder.getSize(); | ||
} | ||
size_t MergeChunk::getSize() const { return builder.getSize(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a poor formatting choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One-liner get/set functions are extremely common. More often in .h files than .cpp files though. It does look odd in context. I'd be inclined to move these to the .h file actually, and keep them as single-line functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this is worse.
void setIsTagged
and static unsigned getVerDefNum
in lld/ELF/Symbols.h are similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There appears to be a AllowShortFunctionsOnASingleLine
setting for clang-format that we can try setting: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#allowshortfunctionsonasingleline
using llvm::StringRef; | ||
using llvm::COFF::IMAGE_FILE_MACHINE_UNKNOWN; | ||
using llvm::COFF::WindowsSubsystem; | ||
using llvm::StringRef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, this is only clang-format right? This doesn't look like a formatting change at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#sortusingdeclarations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and I would say sorting things where semantics don't change alphabetically (or rather by "ASCII") does more good than harm...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the doc, it appears LLVM style has been set to LexicographicNumeric, and I agree Lexicographic would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't mean to propose a change to "ASCII", mainly wanted to illustrate what happened. It's probably best to avoid option changes mostly.
CV = 0x1, /// CodeView | ||
PData = 0x2, /// Procedure Data | ||
Fixup = 0x4, /// Relocation Table | ||
None = 0x0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks harmful to readability - only minorly so, admittedly.
0x51, // push ecx | ||
0x52, // push edx | ||
0x50, // push eax | ||
0x68, 0, 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This markedly damages readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should use // clang-format off
here (or find other ways to express things that aren't mangled by clang-format).
static const uint8_t tailMergeX86[] = {
// clang-format off
0x51, // push ecx
0x52, // push edx
0x50, // push eax
0x68, 0, 0, 0, 0, // push offset ___DELAY_IMPORT_DESCRIPTOR_<DLLNAME>_dll
0xE8, 0, 0, 0, 0, // call ___delayLoadHelper2@8
0x5A, // pop edx
0x59, // pop ecx
0xFF, 0xE0, // jmp eax
// clang-format on
};
or another (admittedly not too great attempt) without disabling clang format:
#define INST(x) x
static const uint8_t tailMergeSequence[] = {
INST(0x51), // push ecx
INST(0x52), // push edx
INST(0x50), // push eax
INST(0x68, 0, 0, 0,
0), // push offset ___DELAY_IMPORT_DESCRIPTOR_<DLLNAME>_dll
INST(0xE8, 0, 0, 0, 0), // call ___delayLoadHelper2@8
INST(0x5A), // pop edx
INST(0x59), // pop ecx
INST(0xFF, 0xE0), // jmp eax
}
return a->value < b->value; | ||
}); | ||
mapSyms.erase( | ||
std::unique(mapSyms.begin(), mapSyms.end(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this was output from a previous version of clang-format. Personally I'd avoid such a change, but if a special rule makes things harder, I think it would be fine to format this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting this so we can have a constructive discussion on the effects of auto-formatting!
Generally: We should take some time and tweak clang-format
settings before mass-formatting. We should also get used to more // clang-format off
usage when auto-formatting. In my experience working 3 years in am auto-formatted project, you usually you have to get used to not always achieving the same level of "beauty" of human written code. But in my opinion the overal improvements is worth it for big projects with a lot of developers...
case IMAGE_REL_AMD64_ADDR64: | ||
add64(off, s + imageBase); | ||
break; | ||
case IMAGE_REL_AMD64_ADDR32NB: add32(off, s); break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a situation where I think having it look slightly worse is worth the general consistency autoformatting brings... But for the record you could also go for // clang-format: off
:
// clang-format off
case IMAGE_REL_AMD64_ADDR32NB: add32(off, s); break;
case IMAGE_REL_AMD64_REL32: add32(off, s - p - 4); break;
case IMAGE_REL_AMD64_REL32_1: add32(off, s - p - 5); break;
case IMAGE_REL_AMD64_REL32_2: add32(off, s - p - 6); break;
case IMAGE_REL_AMD64_REL32_3: add32(off, s - p - 7); break;
case IMAGE_REL_AMD64_REL32_4: add32(off, s - p - 8); break;
case IMAGE_REL_AMD64_REL32_5: add32(off, s - p - 9); break;
// clang-format on
size_t MergeChunk::getSize() const { | ||
return builder.getSize(); | ||
} | ||
size_t MergeChunk::getSize() const { return builder.getSize(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There appears to be a AllowShortFunctionsOnASingleLine
setting for clang-format that we can try setting: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#allowshortfunctionsonasingleline
using llvm::StringRef; | ||
using llvm::COFF::IMAGE_FILE_MACHINE_UNKNOWN; | ||
using llvm::COFF::WindowsSubsystem; | ||
using llvm::StringRef; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to be: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#sortusingdeclarations
0x51, // push ecx | ||
0x52, // push edx | ||
0x50, // push eax | ||
0x68, 0, 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should use // clang-format off
here (or find other ways to express things that aren't mangled by clang-format).
static const uint8_t tailMergeX86[] = {
// clang-format off
0x51, // push ecx
0x52, // push edx
0x50, // push eax
0x68, 0, 0, 0, 0, // push offset ___DELAY_IMPORT_DESCRIPTOR_<DLLNAME>_dll
0xE8, 0, 0, 0, 0, // call ___delayLoadHelper2@8
0x5A, // pop edx
0x59, // pop ecx
0xFF, 0xE0, // jmp eax
// clang-format on
};
or another (admittedly not too great attempt) without disabling clang format:
#define INST(x) x
static const uint8_t tailMergeSequence[] = {
INST(0x51), // push ecx
INST(0x52), // push edx
INST(0x50), // push eax
INST(0x68, 0, 0, 0,
0), // push offset ___DELAY_IMPORT_DESCRIPTOR_<DLLNAME>_dll
INST(0xE8, 0, 0, 0, 0), // call ___delayLoadHelper2@8
INST(0x5A), // pop edx
INST(0x59), // pop ecx
INST(0xFF, 0xE0), // jmp eax
}
LWA = 58, | ||
STD = 62 | ||
}; | ||
enum DSFormOpcd { LD = 58, LWA = 58, STD = 62 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could add a comma here for formatting purposes:
enum DSFormOpcd { LD = 58, LWA = 58, STD = 62 }; | |
enum DSFormOpcd { | |
LD = 58, | |
LWA = 58, | |
STD = 62, | |
}; |
|
||
#include "llvm/ADT/SmallVector.h" | ||
#include "Relocations.h" | ||
#include "llvm/ADT/SmallVector.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note include sorting here! This can technically change semantics for poorly written include files. I personally like the setting (and also the incentive to not have poorly written include files)... If you want to enforce some order you can form groups of include
statements separated by empty line and each group gets sorted separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM style has required self-contained headers for a lot longer than we've had clang-format.
Using the same critera as llvm/utils/git/code-format-helper.py, clang-format source files in the lld tree.
This is an example for https://discourse.llvm.org/t/rfc-clang-format-all-the-things/76614 and shows formatting all files, not just the small/medium files.
Instructions and a script to help downstream repos resolve conflicts will be committed separately.