Skip to content

[lld] check cache in loadDylib before real_path #143595

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 17, 2025

Conversation

rmaz
Copy link
Contributor

@rmaz rmaz commented Jun 10, 2025

In #140791 an attempt was made to defer the cost of real_path to after checking
the dylib cache. This was reverted due to issues incorrect handling of the
reference after the map updated.

In this PR the approach is changed to avoid mutating the dylib cache when
checking for real path presence, and updating the map only at the end of the
function.

@llvmbot
Copy link
Member

llvmbot commented Jun 10, 2025

@llvm/pr-subscribers-lld-macho

Author: Richard Howell (rmaz)

Changes

In #140791 an attempt was made to defer the cost of real_path to after checking
the dylib cache. This was reverted due to issues incorrect handling of the
reference after the map updated.

In this PR the approach is changed to recursively call loadDylib with a
resolved symlink path if it differs from the original path. This avoids the
complicated logic holding on to two references that need to be updated
together.


Full diff: https://github.com/llvm/llvm-project/pull/143595.diff

1 Files Affected:

  • (modified) lld/MachO/DriverUtils.cpp (+22-6)
diff --git a/lld/MachO/DriverUtils.cpp b/lld/MachO/DriverUtils.cpp
index f7f6be049f0e1..30cb83f8d7a2e 100644
--- a/lld/MachO/DriverUtils.cpp
+++ b/lld/MachO/DriverUtils.cpp
@@ -15,12 +15,14 @@
 #include "lld/Common/Reproduce.h"
 #include "llvm/ADT/CachedHashString.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StringRef.h"
 #include "llvm/LTO/LTO.h"
 #include "llvm/Option/Arg.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Option/Option.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/Path.h"
 #include "llvm/TextAPI/InterfaceFile.h"
 #include "llvm/TextAPI/TextAPIReader.h"
@@ -227,12 +229,7 @@ static DenseMap<CachedHashStringRef, DylibFile *> loadedDylibs;
 
 DylibFile *macho::loadDylib(MemoryBufferRef mbref, DylibFile *umbrella,
                             bool isBundleLoader, bool explicitlyLinked) {
-  // Frameworks can be found from different symlink paths, so resolve
-  // symlinks before looking up in the dylib cache.
-  SmallString<128> realPath;
-  std::error_code err = fs::real_path(mbref.getBufferIdentifier(), realPath);
-  CachedHashStringRef path(!err ? uniqueSaver().save(StringRef(realPath))
-                                : mbref.getBufferIdentifier());
+  CachedHashStringRef path(mbref.getBufferIdentifier());
   DylibFile *&file = loadedDylibs[path];
   if (file) {
     if (explicitlyLinked)
@@ -240,6 +237,25 @@ DylibFile *macho::loadDylib(MemoryBufferRef mbref, DylibFile *umbrella,
     return file;
   }
 
+  // Frameworks can be found from different symlink paths, so resolve
+  // symlinks and look up in the dylib cache.
+  SmallString<128> realPath;
+  std::error_code err = fs::real_path(mbref.getBufferIdentifier(), realPath);
+  if (!err && !StringRef(realPath).ends_with(mbref.getBufferIdentifier())) {
+    // Symlink has resolved to a different path, load the dylib from there
+    // and cache the result for the symlink too.
+    if (auto *realFile =
+            loadDylib(MemoryBufferRef(mbref.getBuffer(),
+                                      uniqueSaver().save(StringRef(realPath))),
+                      umbrella, isBundleLoader, explicitlyLinked)) {
+      loadedDylibs[path] = realFile;
+      return realFile;
+    }
+
+    // The map may have mutated, so update the reference.
+    file = loadedDylibs[path];
+  }
+
   DylibFile *newFile;
   file_magic magic = identify_magic(mbref.getBuffer());
   if (magic == file_magic::tapi_file) {

@llvmbot
Copy link
Member

llvmbot commented Jun 10, 2025

@llvm/pr-subscribers-lld

Author: Richard Howell (rmaz)

Changes

In #140791 an attempt was made to defer the cost of real_path to after checking
the dylib cache. This was reverted due to issues incorrect handling of the
reference after the map updated.

In this PR the approach is changed to recursively call loadDylib with a
resolved symlink path if it differs from the original path. This avoids the
complicated logic holding on to two references that need to be updated
together.


Full diff: https://github.com/llvm/llvm-project/pull/143595.diff

1 Files Affected:

  • (modified) lld/MachO/DriverUtils.cpp (+22-6)
diff --git a/lld/MachO/DriverUtils.cpp b/lld/MachO/DriverUtils.cpp
index f7f6be049f0e1..30cb83f8d7a2e 100644
--- a/lld/MachO/DriverUtils.cpp
+++ b/lld/MachO/DriverUtils.cpp
@@ -15,12 +15,14 @@
 #include "lld/Common/Reproduce.h"
 #include "llvm/ADT/CachedHashString.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/StringRef.h"
 #include "llvm/LTO/LTO.h"
 #include "llvm/Option/Arg.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Option/Option.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/FileSystem.h"
+#include "llvm/Support/MemoryBufferRef.h"
 #include "llvm/Support/Path.h"
 #include "llvm/TextAPI/InterfaceFile.h"
 #include "llvm/TextAPI/TextAPIReader.h"
@@ -227,12 +229,7 @@ static DenseMap<CachedHashStringRef, DylibFile *> loadedDylibs;
 
 DylibFile *macho::loadDylib(MemoryBufferRef mbref, DylibFile *umbrella,
                             bool isBundleLoader, bool explicitlyLinked) {
-  // Frameworks can be found from different symlink paths, so resolve
-  // symlinks before looking up in the dylib cache.
-  SmallString<128> realPath;
-  std::error_code err = fs::real_path(mbref.getBufferIdentifier(), realPath);
-  CachedHashStringRef path(!err ? uniqueSaver().save(StringRef(realPath))
-                                : mbref.getBufferIdentifier());
+  CachedHashStringRef path(mbref.getBufferIdentifier());
   DylibFile *&file = loadedDylibs[path];
   if (file) {
     if (explicitlyLinked)
@@ -240,6 +237,25 @@ DylibFile *macho::loadDylib(MemoryBufferRef mbref, DylibFile *umbrella,
     return file;
   }
 
+  // Frameworks can be found from different symlink paths, so resolve
+  // symlinks and look up in the dylib cache.
+  SmallString<128> realPath;
+  std::error_code err = fs::real_path(mbref.getBufferIdentifier(), realPath);
+  if (!err && !StringRef(realPath).ends_with(mbref.getBufferIdentifier())) {
+    // Symlink has resolved to a different path, load the dylib from there
+    // and cache the result for the symlink too.
+    if (auto *realFile =
+            loadDylib(MemoryBufferRef(mbref.getBuffer(),
+                                      uniqueSaver().save(StringRef(realPath))),
+                      umbrella, isBundleLoader, explicitlyLinked)) {
+      loadedDylibs[path] = realFile;
+      return realFile;
+    }
+
+    // The map may have mutated, so update the reference.
+    file = loadedDylibs[path];
+  }
+
   DylibFile *newFile;
   file_magic magic = identify_magic(mbref.getBuffer());
   if (magic == file_magic::tapi_file) {

@rmaz rmaz force-pushed the lldframeworksymlink branch 2 times, most recently from 1cdfcd9 to 276ce5d Compare June 11, 2025 17:03
@rmaz rmaz changed the title [lld] recursively call loadDylib for symlinks [lld] check cache in loadDylib before real_path Jun 11, 2025
@rmaz rmaz force-pushed the lldframeworksymlink branch from 276ce5d to cf85cff Compare June 11, 2025 17:06
@ellishg
Copy link
Contributor

ellishg commented Jun 11, 2025

Can we add a test case that would have crashed the previous PR?

@rmaz
Copy link
Contributor Author

rmaz commented Jun 11, 2025

Can we add a test case that would have crashed the previous PR?

Tricky to manage in a unit test, but i'll try and come up with a test that fails without this change involving re-exported library names and symlinks.

@rmaz rmaz force-pushed the lldframeworksymlink branch from cf85cff to 8efe2c0 Compare June 11, 2025 20:38
@rmaz rmaz requested review from Prabhuk and zmodem June 11, 2025 21:03
@rmaz rmaz force-pushed the lldframeworksymlink branch from 8efe2c0 to 32c5cbf Compare June 12, 2025 16:44
Copy link
Contributor

@drodriguez drodriguez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zmodem I am not sure if you have been able to test this in your internal systems. I understand this was complicated to reproduce, but if it is even more difficult to reproduce now, it might mean it is removed completely. Obviously landing and waiting for your internal systems to catch up is also a possibility.

@zmodem
Copy link
Collaborator

zmodem commented Jun 16, 2025

I didn't try it, but the code looks fine so I don't think there should be any problems.

@zmodem
Copy link
Collaborator

zmodem commented Jun 17, 2025

I tried the patch on the reproducer I uploaded, and that links without issue.

@ellishg ellishg merged commit 35f6d91 into llvm:main Jun 17, 2025
7 checks passed
ajaden-codes pushed a commit to Jaddyen/llvm-project that referenced this pull request Jun 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants