Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Object] Remove restriction universal archives having both IR and native #67505

Merged
merged 1 commit into from
Oct 3, 2023

Conversation

drodriguez
Copy link
Contributor

Mach-O archives seems to be able to contain both IR objects and native objects mixed together. Apple tooling seems to deal with them correctly.

The current implementation was adding an additional restriction of all the objects in the archive being either IR objects or native objects.

The changes in this commit remove that restriction and allow mixing both IR and native objects, while still checking that the CPU restrictions still apply (all objects in a slice need to be the same CPU type/subtype).

A test that was testing for the previous behaviour had been modified to test that mixed archives are allowed and that they create the expected results.

Additionally, locally I checked the results of Apple's libtool against llvm-libtool-darwin with this code, and the resulting libraries are almost identical with expected differences in the GUID and code signatures load commands, and some minor differences in the rest of the binary.

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 27, 2023

@llvm/pr-subscribers-llvm-binary-utilities

Changes

Mach-O archives seems to be able to contain both IR objects and native objects mixed together. Apple tooling seems to deal with them correctly.

The current implementation was adding an additional restriction of all the objects in the archive being either IR objects or native objects.

The changes in this commit remove that restriction and allow mixing both IR and native objects, while still checking that the CPU restrictions still apply (all objects in a slice need to be the same CPU type/subtype).

A test that was testing for the previous behaviour had been modified to test that mixed archives are allowed and that they create the expected results.

Additionally, locally I checked the results of Apple's libtool against llvm-libtool-darwin with this code, and the resulting libraries are almost identical with expected differences in the GUID and code signatures load commands, and some minor differences in the rest of the binary.


Full diff: https://github.com/llvm/llvm-project/pull/67505.diff

2 Files Affected:

  • (modified) llvm/lib/Object/MachOUniversalWriter.cpp (+42-44)
  • (modified) llvm/test/tools/llvm-lipo/create-archive-input.test (+16-6)
diff --git a/llvm/lib/Object/MachOUniversalWriter.cpp b/llvm/lib/Object/MachOUniversalWriter.cpp
index 909a10b2c072a2e..ea7fe3848877601 100644
--- a/llvm/lib/Object/MachOUniversalWriter.cpp
+++ b/llvm/lib/Object/MachOUniversalWriter.cpp
@@ -100,7 +100,7 @@ Slice::Slice(const IRObjectFile &IRO, uint32_t CPUType, uint32_t CPUSubType,
 
 Slice::Slice(const MachOObjectFile &O) : Slice(O, calculateAlignment(O)) {}
 
-using MachoCPUTy = std::pair<unsigned, unsigned>;
+using MachoCPUTy = std::pair<uint32_t, uint32_t>;
 
 static Expected<MachoCPUTy> getMachoCPUFromTriple(Triple TT) {
   auto CPU = std::make_pair(MachO::getCPUType(TT), MachO::getCPUSubType(TT));
@@ -117,10 +117,16 @@ static Expected<MachoCPUTy> getMachoCPUFromTriple(StringRef TT) {
   return getMachoCPUFromTriple(Triple{TT});
 }
 
+static Expected<MachoCPUTy>
+getMachoCPUFromObjectFile(const MachOObjectFile *O) {
+  return std::make_pair(O->getHeader().cputype, O->getHeader().cpusubtype);
+}
+
 Expected<Slice> Slice::create(const Archive &A, LLVMContext *LLVMCtx) {
   Error Err = Error::success();
   std::unique_ptr<MachOObjectFile> MFO = nullptr;
   std::unique_ptr<IRObjectFile> IRFO = nullptr;
+  std::optional<MachoCPUTy> CPU = std::nullopt;
   for (const Archive::Child &Child : A.children(Err)) {
     Expected<std::unique_ptr<Binary>> ChildOrErr = Child.getAsBinary(LLVMCtx);
     if (!ChildOrErr)
@@ -134,65 +140,57 @@ Expected<Slice> Slice::create(const Archive &A, LLVMContext *LLVMCtx) {
                                    .c_str());
     if (Bin->isMachO()) {
       MachOObjectFile *O = cast<MachOObjectFile>(Bin);
-      if (IRFO) {
-        return createStringError(
-            std::errc::invalid_argument,
-            "archive member %s is a MachO, while previous archive member "
-            "%s was an IR LLVM object",
-            O->getFileName().str().c_str(), IRFO->getFileName().str().c_str());
-      }
-      if (MFO &&
-          std::tie(MFO->getHeader().cputype, MFO->getHeader().cpusubtype) !=
-              std::tie(O->getHeader().cputype, O->getHeader().cpusubtype)) {
+      auto ObjectCPU = getMachoCPUFromObjectFile(O);
+      if (!ObjectCPU)
+        return ObjectCPU.takeError();
+
+      if (CPU && CPU != *ObjectCPU) {
+        // If CPU != nullptr, one of MFO, IRFO will be != nullptr.
+        StringRef PreviousName = MFO ? MFO->getFileName() : IRFO->getFileName();
         return createStringError(
             std::errc::invalid_argument,
             ("archive member " + O->getFileName() + " cputype (" +
-             Twine(O->getHeader().cputype) + ") and cpusubtype(" +
-             Twine(O->getHeader().cpusubtype) +
+             Twine(ObjectCPU->first) + ") and cpusubtype(" +
+             Twine(ObjectCPU->second) +
              ") does not match previous archive members cputype (" +
-             Twine(MFO->getHeader().cputype) + ") and cpusubtype(" +
-             Twine(MFO->getHeader().cpusubtype) +
-             ") (all members must match) " + MFO->getFileName())
+             Twine(CPU->first) + ") and cpusubtype(" + Twine(CPU->second) +
+             ") (all members must match) " + PreviousName)
                 .str()
                 .c_str());
       }
       if (!MFO) {
         ChildOrErr.get().release();
         MFO.reset(O);
+        if (!CPU)
+          CPU.emplace(*ObjectCPU);
       }
     } else if (Bin->isIR()) {
       IRObjectFile *O = cast<IRObjectFile>(Bin);
-      if (MFO) {
-        return createStringError(std::errc::invalid_argument,
-                                 "archive member '%s' is an LLVM IR object, "
-                                 "while previous archive member "
-                                 "'%s' was a MachO",
-                                 O->getFileName().str().c_str(),
-                                 MFO->getFileName().str().c_str());
+      auto ObjectCPU = getMachoCPUFromTriple(O->getTargetTriple());
+      if (!ObjectCPU)
+        return ObjectCPU.takeError();
+
+      if (CPU && CPU != *ObjectCPU) {
+        // If CPU != nullptr, one of MFO, IRFO will be != nullptr.
+        StringRef PreviousName =
+            IRFO ? IRFO->getFileName() : MFO->getFileName();
+        return createStringError(
+            std::errc::invalid_argument,
+            ("archive member " + O->getFileName() + " cputype (" +
+             Twine(ObjectCPU->first) + ") and cpusubtype(" +
+             Twine(ObjectCPU->second) +
+             ") does not match previous archive members cputype (" +
+             Twine(CPU->first) + ") and cpusubtype(" + Twine(CPU->second) +
+             ") (all members must match) " + PreviousName)
+                .str()
+                .c_str());
       }
-      if (IRFO) {
-        Expected<MachoCPUTy> CPUO = getMachoCPUFromTriple(O->getTargetTriple());
-        Expected<MachoCPUTy> CPUFO =
-            getMachoCPUFromTriple(IRFO->getTargetTriple());
-        if (!CPUO)
-          return CPUO.takeError();
-        if (!CPUFO)
-          return CPUFO.takeError();
-        if (*CPUO != *CPUFO) {
-          return createStringError(
-              std::errc::invalid_argument,
-              ("archive member " + O->getFileName() + " cputype (" +
-               Twine(CPUO->first) + ") and cpusubtype(" + Twine(CPUO->second) +
-               ") does not match previous archive members cputype (" +
-               Twine(CPUFO->first) + ") and cpusubtype(" +
-               Twine(CPUFO->second) + ") (all members must match) " +
-               IRFO->getFileName())
-                  .str()
-                  .c_str());
-        }
-      } else {
+
+      if (!IRFO) {
         ChildOrErr.get().release();
         IRFO.reset(O);
+        if (!CPU)
+          CPU.emplace(*ObjectCPU);
       }
     } else
       return createStringError(std::errc::invalid_argument,
diff --git a/llvm/test/tools/llvm-lipo/create-archive-input.test b/llvm/test/tools/llvm-lipo/create-archive-input.test
index 5558797db87b691..c4323811af6fd86 100644
--- a/llvm/test/tools/llvm-lipo/create-archive-input.test
+++ b/llvm/test/tools/llvm-lipo/create-archive-input.test
@@ -28,17 +28,27 @@
 # RUN: llvm-lipo %t-ir-armv7-x86_64-universal.o -thin x86_64 -output %t-ir-extracted-x86_64.o
 # RUN: cmp %t-ir-extracted-x86_64.o %t-ir-x86_64.o
 
-# RUN: llvm-ar cr %t.different_types0.a %t-i386.o %t-ir-x86_64.o
-# RUN: not llvm-lipo -create %t.different_types0.a -output /dev/null 2>&1 | FileCheck --check-prefix=ARCHIVE-WITH-MACHO-AND-IR %s
-# RUN: llvm-ar cr %t.different_types1.a %t-ir-x86_64.o %t-i386.o 
-# RUN: not llvm-lipo -create %t.different_types1.a -output /dev/null 2>&1 | FileCheck --check-prefix=ARCHIVE-WITH-IR-AND-MACHO %s
+# RUN: llvm-ar cr %t.different_types0-i386.a %t-i386.o
+# RUN: llvm-ar cr %t.different_types0-x86_64.a %t-x86_64.o %t-ir-x86_64.o
+# RUN: llvm-lipo -create %t.different_types0-i386.a %t.different_types0-x86_64.a -output %t.different_types0-universal.a
+# RUN: llvm-lipo %t.different_types0-universal.a -thin i386 -output %t.different_types0-extracted-i386.a
+# RUN: llvm-lipo %t.different_types0-universal.a -thin x86_64 -output %t.different_types0-extracted-x86_64.a
+# RUN: cmp %t.different_types0-extracted-i386.a %t.different_types0-i386.a
+# RUN: cmp %t.different_types0-extracted-x86_64.a %t.different_types0-x86_64.a
+
+# RUN: llvm-ar cr %t.different_types1-i386.a %t-i386.o
+# RUN: llvm-ar cr %t.different_types1-x86_64.a %t-ir-x86_64.o %t-x86_64.o
+# RUN: llvm-lipo -create %t.different_types1-x86_64.a %t.different_types1-i386.a -output %t.different_types1-universal.a
+# RUN: llvm-lipo %t.different_types1-universal.a -thin i386 -output %t.different_types1-extracted-i386.a
+# RUN: llvm-lipo %t.different_types1-universal.a -thin x86_64 -output %t.different_types1-extracted-x86_64.a
+# RUN: cmp %t.different_types1-extracted-i386.a %t.different_types1-i386.a
+# RUN: cmp %t.different_types1-extracted-x86_64.a %t.different_types1-x86_64.a
+
 # RUN: llvm-ar cr %t.different_architectures_ir.a %t-ir-x86_64.o %t-ir-armv7.o
 # RUN: not llvm-lipo -create %t.different_architectures_ir.a -output /dev/null 2>&1 | FileCheck --check-prefix=ARCHIVE-WITH-DIFFERENT-ARCHS %s
 
 # EMPTY-ARCHIVE: empty archive
 # ARCHIVE-WITH-DIFFERENT-ARCHS: all members must match
-# ARCHIVE-WITH-MACHO-AND-IR: is an LLVM IR object, while previous archive member {{.*}} was a MachO
-# ARCHIVE-WITH-IR-AND-MACHO: is a MachO, while previous archive member {{.*}} was an IR LLVM object
 # ARCHIVE-WITH-FAT-BINARY: fat file (not allowed in an archive)
 #
 # INFO-i386-x86_64: i386 x86_64

Copy link
Collaborator

@alexander-shaposhnikov alexander-shaposhnikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG

Mach-O archives seems to be able to contain both IR objects and native
objects mixed together. Apple tooling seems to deal with them correctly.

The current implementation was adding an additional restriction of all
the objects in the archive being either IR objects or native objects.

The changes in this commit remove that restriction and allow mixing both
IR and native objects, while still checking that the CPU restrictions
still apply (all objects in a slice need to be the same CPU type/subtype).

A test that was testing for the previous behaviour had been modified to
test that mixed archives are allowed and that they create the expected
results.

Additionally, locally I checked the results of Apple's `libtool` against
`llvm-libtool-darwin` with this code, and the resulting libraries are
almost identical with expected differences in the GUID and code
signatures load commands, and some minor differences in the rest of the
binary.
@drodriguez drodriguez merged commit b116747 into llvm:main Oct 3, 2023
3 checks passed
@drodriguez drodriguez deleted the llvm-libtool-darwin-mixed branch October 3, 2023 02:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants