Skip to content

Conversation

pogo59
Copy link
Collaborator

@pogo59 pogo59 commented Feb 2, 2024

Using the same critera as llvm/utils/git/code-format-helper.py, clang-format source files in the lld tree.

This is an example for https://discourse.llvm.org/t/rfc-clang-format-all-the-things/76614 and shows formatting all files, not just the small/medium files.

Instructions and a script to help downstream repos resolve conflicts will be committed separately.

Using the same critera as llvm/utils/git/code-format-helper.py,
clang-format source files in the lld tree.

Instructions and a script to help downstream repos resolve conflicts
will be committed separately.
Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I skimmed the first few changes here, and I see a couple of examples where auto-format is producing worse result. To me, this definitely calls for the need for human judgement on these changes. I would not approve this review as it stands.

case IMAGE_REL_AMD64_ADDR64:
add64(off, s + imageBase);
break;
case IMAGE_REL_AMD64_ADDR32NB: add32(off, s); break;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be slightly less readable, but not terrible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a situation where I think having it look slightly worse is worth the general consistency autoformatting brings... But for the record you could also go for // clang-format: off:

  // clang-format off
  case IMAGE_REL_AMD64_ADDR32NB: add32(off, s); break;
  case IMAGE_REL_AMD64_REL32:    add32(off, s - p - 4); break;
  case IMAGE_REL_AMD64_REL32_1:  add32(off, s - p - 5); break;
  case IMAGE_REL_AMD64_REL32_2:  add32(off, s - p - 6); break;
  case IMAGE_REL_AMD64_REL32_3:  add32(off, s - p - 7); break;
  case IMAGE_REL_AMD64_REL32_4:  add32(off, s - p - 8); break;
  case IMAGE_REL_AMD64_REL32_5:  add32(off, s - p - 9); break;
  // clang-format on

size_t MergeChunk::getSize() const {
return builder.getSize();
}
size_t MergeChunk::getSize() const { return builder.getSize(); }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a poor formatting choice.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One-liner get/set functions are extremely common. More often in .h files than .cpp files though. It does look odd in context. I'd be inclined to move these to the .h file actually, and keep them as single-line functions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is worse.
void setIsTagged and static unsigned getVerDefNum in lld/ELF/Symbols.h are similar.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be a AllowShortFunctionsOnASingleLine setting for clang-format that we can try setting: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#allowshortfunctionsonasingleline

using llvm::StringRef;
using llvm::COFF::IMAGE_FILE_MACHINE_UNKNOWN;
using llvm::COFF::WindowsSubsystem;
using llvm::StringRef;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, this is only clang-format right? This doesn't look like a formatting change at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and I would say sorting things where semantics don't change alphabetically (or rather by "ASCII") does more good than harm...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the doc, it appears LLVM style has been set to LexicographicNumeric, and I agree Lexicographic would be better.

Copy link
Contributor

@MatzeB MatzeB Feb 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't mean to propose a change to "ASCII", mainly wanted to illustrate what happened. It's probably best to avoid option changes mostly.

CV = 0x1, /// CodeView
PData = 0x2, /// Procedure Data
Fixup = 0x4, /// Relocation Table
None = 0x0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks harmful to readability - only minorly so, admittedly.

0x51, // push ecx
0x52, // push edx
0x50, // push eax
0x68, 0, 0,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This markedly damages readability.

Copy link
Contributor

@MatzeB MatzeB Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use // clang-format off here (or find other ways to express things that aren't mangled by clang-format).

static const uint8_t tailMergeX86[] = {
// clang-format off
    0x51,              // push  ecx
    0x52,              // push  edx
    0x50,              // push  eax
    0x68, 0, 0, 0, 0,  // push  offset ___DELAY_IMPORT_DESCRIPTOR_<DLLNAME>_dll
    0xE8, 0, 0, 0, 0,  // call  ___delayLoadHelper2@8
    0x5A,              // pop   edx
    0x59,              // pop   ecx
    0xFF, 0xE0,        // jmp   eax
// clang-format on
};

or another (admittedly not too great attempt) without disabling clang format:

#define INST(x) x
static const uint8_t tailMergeSequence[] = {
    INST(0x51), // push  ecx
    INST(0x52), // push  edx
    INST(0x50), // push  eax
    INST(0x68, 0, 0, 0,
         0), // push  offset ___DELAY_IMPORT_DESCRIPTOR_<DLLNAME>_dll
    INST(0xE8, 0, 0, 0, 0), // call  ___delayLoadHelper2@8
    INST(0x5A),             // pop   edx
    INST(0x59),             // pop   ecx
    INST(0xFF, 0xE0),       // jmp   eax
}

return a->value < b->value;
});
mapSyms.erase(
std::unique(mapSyms.begin(), mapSyms.end(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was output from a previous version of clang-format. Personally I'd avoid such a change, but if a special rule makes things harder, I think it would be fine to format this.

Copy link
Contributor

@MatzeB MatzeB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this so we can have a constructive discussion on the effects of auto-formatting!

Generally: We should take some time and tweak clang-format settings before mass-formatting. We should also get used to more // clang-format off usage when auto-formatting. In my experience working 3 years in am auto-formatted project, you usually you have to get used to not always achieving the same level of "beauty" of human written code. But in my opinion the overal improvements is worth it for big projects with a lot of developers...

case IMAGE_REL_AMD64_ADDR64:
add64(off, s + imageBase);
break;
case IMAGE_REL_AMD64_ADDR32NB: add32(off, s); break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a situation where I think having it look slightly worse is worth the general consistency autoformatting brings... But for the record you could also go for // clang-format: off:

  // clang-format off
  case IMAGE_REL_AMD64_ADDR32NB: add32(off, s); break;
  case IMAGE_REL_AMD64_REL32:    add32(off, s - p - 4); break;
  case IMAGE_REL_AMD64_REL32_1:  add32(off, s - p - 5); break;
  case IMAGE_REL_AMD64_REL32_2:  add32(off, s - p - 6); break;
  case IMAGE_REL_AMD64_REL32_3:  add32(off, s - p - 7); break;
  case IMAGE_REL_AMD64_REL32_4:  add32(off, s - p - 8); break;
  case IMAGE_REL_AMD64_REL32_5:  add32(off, s - p - 9); break;
  // clang-format on

size_t MergeChunk::getSize() const {
return builder.getSize();
}
size_t MergeChunk::getSize() const { return builder.getSize(); }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be a AllowShortFunctionsOnASingleLine setting for clang-format that we can try setting: https://clang.llvm.org/docs/ClangFormatStyleOptions.html#allowshortfunctionsonasingleline

using llvm::StringRef;
using llvm::COFF::IMAGE_FILE_MACHINE_UNKNOWN;
using llvm::COFF::WindowsSubsystem;
using llvm::StringRef;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0x51, // push ecx
0x52, // push edx
0x50, // push eax
0x68, 0, 0,
Copy link
Contributor

@MatzeB MatzeB Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use // clang-format off here (or find other ways to express things that aren't mangled by clang-format).

static const uint8_t tailMergeX86[] = {
// clang-format off
    0x51,              // push  ecx
    0x52,              // push  edx
    0x50,              // push  eax
    0x68, 0, 0, 0, 0,  // push  offset ___DELAY_IMPORT_DESCRIPTOR_<DLLNAME>_dll
    0xE8, 0, 0, 0, 0,  // call  ___delayLoadHelper2@8
    0x5A,              // pop   edx
    0x59,              // pop   ecx
    0xFF, 0xE0,        // jmp   eax
// clang-format on
};

or another (admittedly not too great attempt) without disabling clang format:

#define INST(x) x
static const uint8_t tailMergeSequence[] = {
    INST(0x51), // push  ecx
    INST(0x52), // push  edx
    INST(0x50), // push  eax
    INST(0x68, 0, 0, 0,
         0), // push  offset ___DELAY_IMPORT_DESCRIPTOR_<DLLNAME>_dll
    INST(0xE8, 0, 0, 0, 0), // call  ___delayLoadHelper2@8
    INST(0x5A),             // pop   edx
    INST(0x59),             // pop   ecx
    INST(0xFF, 0xE0),       // jmp   eax
}

LWA = 58,
STD = 62
};
enum DSFormOpcd { LD = 58, LWA = 58, STD = 62 };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add a comma here for formatting purposes:

Suggested change
enum DSFormOpcd { LD = 58, LWA = 58, STD = 62 };
enum DSFormOpcd {
LD = 58,
LWA = 58,
STD = 62,
};

Comment on lines 12 to +13

#include "llvm/ADT/SmallVector.h"
#include "Relocations.h"
#include "llvm/ADT/SmallVector.h"
Copy link
Contributor

@MatzeB MatzeB Feb 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note include sorting here! This can technically change semantics for poorly written include files. I personally like the setting (and also the incentive to not have poorly written include files)... If you want to enforce some order you can form groups of include statements separated by empty line and each group gets sorted separately.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM style has required self-contained headers for a lot longer than we've had clang-format.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants