Skip to content

Conversation

rjmansfield
Copy link
Contributor

The --totals option was not working for Mach-O files because the Darwin segment size calculation skipped the totals accumulation.

@llvmbot
Copy link
Member

llvmbot commented Sep 10, 2025

@llvm/pr-subscribers-llvm-binary-utilities

Author: Ryan Mansfield (rjmansfield)

Changes

The --totals option was not working for Mach-O files because the Darwin segment size calculation skipped the totals accumulation.


Full diff: https://github.com/llvm/llvm-project/pull/157904.diff

2 Files Affected:

  • (modified) llvm/test/tools/llvm-size/totals.test (+148)
  • (modified) llvm/tools/llvm-size/llvm-size.cpp (+32-10)
diff --git a/llvm/test/tools/llvm-size/totals.test b/llvm/test/tools/llvm-size/totals.test
index 6f97dc18fee78..40de272b38ff9 100644
--- a/llvm/test/tools/llvm-size/totals.test
+++ b/llvm/test/tools/llvm-size/totals.test
@@ -16,6 +16,16 @@
 # CHECK-NEXT: [[FILE2]]
 # CHECK-NEXT:      18      36      72     126      7e (TOTALS)
 
+# RUN: yaml2obj %s --docnum=3 -o %t-macho.o
+# RUN: yaml2obj %s --docnum=4 -o %t-macho2.o
+# RUN: llvm-size --totals %t-macho.o %t-macho2.o \
+# RUN:   | FileCheck %s --check-prefix=MACHO-CHECK -DFILE1=%t-macho.o -DFILE2=%t-macho2.o
+
+# MACHO-CHECK:      __TEXT	__DATA	__OBJC	others	dec	hex
+# MACHO-CHECK-NEXT: 20	100	0	32	152	98	[[FILE1]]
+# MACHO-CHECK-NEXT: 20	200	0	32	252	fc	[[FILE2]]
+# MACHO-CHECK-NEXT: 40	300	0	64	404	194	(TOTALS)
+
 --- !ELF
 FileHeader:
   Class:   ELFCLASS64
@@ -55,3 +65,141 @@ Sections:
     Type:    SHT_NOBITS
     Flags:   [SHF_ALLOC, SHF_WRITE]
     Size:    32
+
+--- !mach-o
+FileHeader:
+  magic:           0xFEEDFACF
+  cputype:         0x100000C
+  cpusubtype:      0x0
+  filetype:        0x1
+  ncmds:           2
+  sizeofcmds:      352
+  flags:           0x2000
+  reserved:        0x0
+LoadCommands:
+  - cmd:             LC_SEGMENT_64
+    cmdsize:         312
+    segname:         ''
+    vmaddr:          0
+    vmsize:          152
+    fileoff:         384
+    filesize:        152
+    maxprot:         7
+    initprot:        7
+    nsects:          3
+    flags:           0
+    Sections:
+      - sectname:        __text
+        segname:         __TEXT
+        addr:            0x0
+        size:            20
+        offset:          0x180
+        align:           2
+        reloff:          0x0
+        nreloc:          0
+        flags:           0x80000400
+        reserved1:       0x0
+        reserved2:       0x0
+        reserved3:       0x0
+      - sectname:        __data
+        segname:         __DATA
+        addr:            0x18
+        size:            100
+        offset:          0x198
+        align:           2
+        reloff:          0x0
+        nreloc:          0
+        flags:           0x0
+        reserved1:       0x0
+        reserved2:       0x0
+        reserved3:       0x0
+      - sectname:        __compact_unwind
+        segname:         __LD
+        addr:            0x20
+        size:            32
+        offset:          0x200
+        align:           3
+        reloff:          0x0
+        nreloc:          0
+        flags:           0x2000000
+        reserved1:       0x0
+        reserved2:       0x0
+        reserved3:       0x0
+  - cmd:             LC_BUILD_VERSION
+    cmdsize:         32
+    platform:        1
+    minos:           851968
+    sdk:             983040
+    ntools:          1
+    Tools:
+      - tool:            3
+        version:         68157696
+
+--- !mach-o
+FileHeader:
+  magic:           0xFEEDFACF
+  cputype:         0x100000C
+  cpusubtype:      0x0
+  filetype:        0x1
+  ncmds:           2
+  sizeofcmds:      352
+  flags:           0x2000
+  reserved:        0x0
+LoadCommands:
+  - cmd:             LC_SEGMENT_64
+    cmdsize:         312
+    segname:         ''
+    vmaddr:          0
+    vmsize:          252
+    fileoff:         384
+    filesize:        252
+    maxprot:         7
+    initprot:        7
+    nsects:          3
+    flags:           0
+    Sections:
+      - sectname:        __text
+        segname:         __TEXT
+        addr:            0x0
+        size:            20
+        offset:          0x180
+        align:           2
+        reloff:          0x0
+        nreloc:          0
+        flags:           0x80000400
+        reserved1:       0x0
+        reserved2:       0x0
+        reserved3:       0x0
+      - sectname:        __data
+        segname:         __DATA
+        addr:            0x18
+        size:            200
+        offset:          0x198
+        align:           2
+        reloff:          0x0
+        nreloc:          0
+        flags:           0x0
+        reserved1:       0x0
+        reserved2:       0x0
+        reserved3:       0x0
+      - sectname:        __compact_unwind
+        segname:         __LD
+        addr:            0x20
+        size:            32
+        offset:          0x260
+        align:           3
+        reloff:          0x0
+        nreloc:          0
+        flags:           0x2000000
+        reserved1:       0x0
+        reserved2:       0x0
+        reserved3:       0x0
+  - cmd:             LC_BUILD_VERSION
+    cmdsize:         32
+    platform:        1
+    minos:           851968
+    sdk:             983040
+    ntools:          1
+    Tools:
+      - tool:            3
+        version:         68157696
diff --git a/llvm/tools/llvm-size/llvm-size.cpp b/llvm/tools/llvm-size/llvm-size.cpp
index afb6b2386218e..f7839a0f369c1 100644
--- a/llvm/tools/llvm-size/llvm-size.cpp
+++ b/llvm/tools/llvm-size/llvm-size.cpp
@@ -78,6 +78,7 @@ static OutputFormatTy OutputFormat;
 static bool DarwinLongFormat;
 static RadixTy Radix = RadixTy::decimal;
 static bool TotalSizes;
+static bool HasMachOFiles = false;
 
 static std::vector<std::string> InputFilenames;
 
@@ -92,6 +93,10 @@ static uint64_t TotalObjectData = 0;
 static uint64_t TotalObjectBss = 0;
 static uint64_t TotalObjectTotal = 0;
 
+// Darwin-specific totals
+static uint64_t TotalObjectObjc = 0;
+static uint64_t TotalObjectOthers = 0;
+
 static void error(const Twine &Message, StringRef File = "") {
   HadError = true;
   if (File.empty())
@@ -283,6 +288,7 @@ static void printDarwinSegmentSizes(MachOObjectFile *MachO) {
   uint64_t total_data = 0;
   uint64_t total_objc = 0;
   uint64_t total_others = 0;
+  HasMachOFiles = true;
   for (const auto &Load : MachO->load_commands()) {
     if (Load.C.cmd == MachO::LC_SEGMENT_64) {
       MachO::segment_command_64 Seg = MachO->getSegment64LoadCommand(Load);
@@ -340,6 +346,14 @@ static void printDarwinSegmentSizes(MachOObjectFile *MachO) {
   }
   uint64_t total = total_text + total_data + total_objc + total_others;
 
+  if (TotalSizes) {
+    TotalObjectText += total_text;
+    TotalObjectData += total_data;
+    TotalObjectObjc += total_objc;
+    TotalObjectOthers += total_others;
+    TotalObjectTotal += total;
+  }
+
   if (!BerkeleyHeaderPrinted) {
     outs() << "__TEXT\t__DATA\t__OBJC\tothers\tdec\thex\n";
     BerkeleyHeaderPrinted = true;
@@ -852,16 +866,24 @@ static void printBerkeleyTotals() {
   std::string fmtbuf;
   raw_string_ostream fmt(fmtbuf);
   const char *radix_fmt = getRadixFmt();
-  fmt << "%#7" << radix_fmt << "\t"
-      << "%#7" << radix_fmt << "\t"
-      << "%#7" << radix_fmt << "\t";
-  outs() << format(fmtbuf.c_str(), TotalObjectText, TotalObjectData,
-                   TotalObjectBss);
-  fmtbuf.clear();
-  fmt << "%7" << (Radix == octal ? PRIo64 : PRIu64) << "\t"
-      << "%7" PRIx64 "\t";
-  outs() << format(fmtbuf.c_str(), TotalObjectTotal, TotalObjectTotal)
-         << "(TOTALS)\n";
+
+  if (HasMachOFiles) {
+    // Darwin format totals: __TEXT __DATA __OBJC others dec hex
+    outs() << TotalObjectText << "\t" << TotalObjectData << "\t" << TotalObjectObjc << "\t"
+           << TotalObjectOthers << "\t" << TotalObjectTotal << "\t"
+           << format("%" PRIx64, TotalObjectTotal) << "\t(TOTALS)\n";
+  } else {
+    fmt << "%#7" << radix_fmt << "\t"
+        << "%#7" << radix_fmt << "\t"
+        << "%#7" << radix_fmt << "\t";
+    outs() << format(fmtbuf.c_str(), TotalObjectText, TotalObjectData,
+                     TotalObjectBss);
+    fmtbuf.clear();
+    fmt << "%7" << (Radix == octal ? PRIo64 : PRIu64) << "\t"
+        << "%7" PRIx64 "\t";
+    outs() << format(fmtbuf.c_str(), TotalObjectTotal, TotalObjectTotal)
+           << "(TOTALS)\n";
+  }
 }
 
 int llvm_size_main(int argc, char **argv, const llvm::ToolContext &) {

Copy link

github-actions bot commented Sep 10, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@jh7370
Copy link
Collaborator

jh7370 commented Sep 12, 2025

See #86644. I suspect this fixes the issue (so add the relevant annotation to the PR description, please).

@rjmansfield rjmansfield force-pushed the size-totals-macho branch 2 times, most recently from b8ab0e4 to c53d117 Compare September 12, 2025 16:05
@rjmansfield
Copy link
Contributor Author

See #86644. I suspect this fixes the issue (so add the relevant annotation to the PR description, please).

Unfortunately it doesn't. The 4G __PAGEZERO segment is still counted as a part of others. Since size is meant to report the vmsize, and the zero filled __PAGEZERO segment is part of the binary, the number is correct but can be misleading to users with respect to the other sizes. I think in this case a macho specific option that excludes __PAGEZERO might provide more reasonable total. I can push up a separate patch for that.

The --totals option was not working for Mach-O files because the Darwin
segment size calculation skipped the totals accumulation.
@rjmansfield
Copy link
Contributor Author

Test failures seem to be unrelated lldb tests:

******************** Failed Tests (2): lldb-api :: commands/expression/import-std-module/vector-dbg-info-content/TestDbgInfoContentVectorFromStdModule.py lldb-api :: commands/expression/import-std-module/vector-of-vectors/TestVectorOfVectorsFromStdModule.py

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LLVM uses a squash and merge approach to landing PRs, which means all commits in the PR get combined into one commit that gets rebased onto main, with a title and description that matches the PR title and description. This has a couple of consequences:

  1. Please keep the PR to one feature/fix. Make additional PRs for subsequent features. If needing to stack a series of changes, see "stacked pull requests" in the GitHub docs.
  2. Please don't force push to branches that are under active review, as it complicates the review process and is actually against LLVM policy. Instead, use fixup commits added to the branch, to make any necessary changes. If you need to rebase, use a merge commit, since the squash and merge will make the merge commit disappear.

Comment on lines 21 to 22
# RUN: llvm-size --totals %t-macho.o %t-macho2.o \
# RUN: | FileCheck %s --check-prefix=MACHO-CHECK --strict-whitespace --match-full-lines --implicit-check-not={{.}} -DFILE1=%t-macho.o -DFILE2=%t-macho2.o
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# RUN: llvm-size --totals %t-macho.o %t-macho2.o \
# RUN: | FileCheck %s --check-prefix=MACHO-CHECK --strict-whitespace --match-full-lines --implicit-check-not={{.}} -DFILE1=%t-macho.o -DFILE2=%t-macho2.o
# RUN: llvm-size --totals %t-macho.o %t-macho2.o | \
# RUN: FileCheck %s --check-prefix=MACHO-CHECK --strict-whitespace \
# RUN: --match-full-lines --implicit-check-not={{.}} -DFILE1=%t-macho.o -DFILE2=%t-macho2.o

Two things: a) the FileCheck command should be split over multiple lines, since it's getting quite long; b) I have a slight preference for continuing lines as in the above suggestion, since the trailing | \ indicates the next line continues, but as a new command (i.e. there are no more args to come), while the indentation on the second line indicates it is a follow-on command from the previous one).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about that, thanks for letting me know. I think I'll have to force push once more to drop the other commit

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, you could probably have reverted the commit in your branch, rather than force pushed. No big deal though.

Copy link
Collaborator

@jh7370 jh7370 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, otherwise LGTM. Do you have commit access?

Co-authored-by: James Henderson <James.Henderson@sony.com>
@rjmansfield
Copy link
Contributor Author

One nit, otherwise LGTM. Do you have commit access?

No, I do not. If you don't mind merging on my behalf, I'd appreciate it.

@jh7370 jh7370 merged commit dcd0a2e into llvm:main Sep 18, 2025
9 checks passed
SeongjaeP pushed a commit to SeongjaeP/llvm-project that referenced this pull request Sep 23, 2025
The --totals option was not working for Mach-O files because the Darwin
segment size calculation skipped the totals accumulation.

---------

Co-authored-by: James Henderson <James.Henderson@sony.com>
@rjmansfield rjmansfield deleted the size-totals-macho branch September 29, 2025 23:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants