Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[profile] Implement a non-mmap path when reading profile files from a non-local filesystem #131177

Merged
merged 20 commits into from
Mar 18, 2025

Conversation

w2yehia
Copy link
Contributor

@w2yehia w2yehia commented Mar 13, 2025

On AIX, when accessing mmap'ed memory associated to a file on NFS, a SIGBUS might be raised at random.
The problem is still in open state with the OS team.

This PR teaches the profile runtime, under certain conditions, to avoid the mmap when reading the profile file during online merging.
This PR has no effect on any platform other than AIX because I'm not aware of this problem on other platforms.
Other platforms can easily opt-in to this functionality in the future.

The logic in function is_local_filesystem was copied from llvm/lib/Support/Unix/Path.inc (https://reviews.llvm.org/D58801), because it seems that the compiler-rt/profile cannot reuse code from llvm except through InstrProfData.inc.

Thanks to @hubert-reinterpretcast for substantial feedback downstream.

@llvmbot llvmbot added compiler-rt PGO Profile Guided Optimizations labels Mar 13, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 13, 2025

@llvm/pr-subscribers-pgo

Author: Wael Yehia (w2yehia)

Changes

… non-local filesystem


Full diff: https://github.com/llvm/llvm-project/pull/131177.diff

5 Files Affected:

  • (modified) compiler-rt/lib/profile/InstrProfilingFile.c (+15-14)
  • (modified) compiler-rt/lib/profile/InstrProfilingUtil.c (+116)
  • (modified) compiler-rt/lib/profile/InstrProfilingUtil.h (+19)
  • (modified) compiler-rt/test/profile/Posix/instrprof-fork.c (+23-15)
  • (added) compiler-rt/test/profile/instrprof-no-mmap-during-merging.c (+25)
diff --git a/compiler-rt/lib/profile/InstrProfilingFile.c b/compiler-rt/lib/profile/InstrProfilingFile.c
index e5eca7947cf9b..6d6d03476a77c 100644
--- a/compiler-rt/lib/profile/InstrProfilingFile.c
+++ b/compiler-rt/lib/profile/InstrProfilingFile.c
@@ -428,17 +428,18 @@ static int getProfileFileSizeForMerging(FILE *ProfileFile,
  * \p ProfileBuffer. Returns -1 on failure. On success, the caller is
  * responsible for unmapping the mmap'd buffer in \p ProfileBuffer. */
 static int mmapProfileForMerging(FILE *ProfileFile, uint64_t ProfileFileSize,
-                                 char **ProfileBuffer) {
-  *ProfileBuffer = mmap(NULL, ProfileFileSize, PROT_READ, MAP_SHARED | MAP_FILE,
-                        fileno(ProfileFile), 0);
-  if (*ProfileBuffer == MAP_FAILED) {
+                                 ManagedMemory *ProfileBuffer) {
+  lprofGetFileContentBuffer(ProfileFile, ProfileFileSize, ProfileBuffer);
+
+  if (ProfileBuffer->Status == MM_INVALID) {
     PROF_ERR("Unable to merge profile data, mmap failed: %s\n",
              strerror(errno));
     return -1;
   }
 
-  if (__llvm_profile_check_compatibility(*ProfileBuffer, ProfileFileSize)) {
-    (void)munmap(*ProfileBuffer, ProfileFileSize);
+  if (__llvm_profile_check_compatibility(ProfileBuffer->Addr,
+                                         ProfileFileSize)) {
+    (void)lprofReleaseBuffer(ProfileBuffer, ProfileFileSize);
     PROF_WARN("Unable to merge profile data: %s\n",
               "source profile file is not compatible.");
     return -1;
@@ -453,7 +454,7 @@ static int mmapProfileForMerging(FILE *ProfileFile, uint64_t ProfileFileSize,
 */
 static int doProfileMerging(FILE *ProfileFile, int *MergeDone) {
   uint64_t ProfileFileSize;
-  char *ProfileBuffer;
+  ManagedMemory ProfileBuffer;
 
   /* Get the size of the profile on disk. */
   if (getProfileFileSizeForMerging(ProfileFile, &ProfileFileSize) == -1)
@@ -469,9 +470,9 @@ static int doProfileMerging(FILE *ProfileFile, int *MergeDone) {
     return -1;
 
   /* Now start merging */
-  if (__llvm_profile_merge_from_buffer(ProfileBuffer, ProfileFileSize)) {
+  if (__llvm_profile_merge_from_buffer(ProfileBuffer.Addr, ProfileFileSize)) {
     PROF_ERR("%s\n", "Invalid profile data to merge");
-    (void)munmap(ProfileBuffer, ProfileFileSize);
+    (void)lprofReleaseBuffer(&ProfileBuffer, ProfileFileSize);
     return -1;
   }
 
@@ -480,7 +481,7 @@ static int doProfileMerging(FILE *ProfileFile, int *MergeDone) {
   (void)COMPILER_RT_FTRUNCATE(ProfileFile,
                               __llvm_profile_get_size_for_buffer());
 
-  (void)munmap(ProfileBuffer, ProfileFileSize);
+  (void)lprofReleaseBuffer(&ProfileBuffer, ProfileFileSize);
   *MergeDone = 1;
 
   return 0;
@@ -702,13 +703,13 @@ static void initializeProfileForContinuousMode(void) {
     } else {
       /* The merged profile has a non-zero length. Check that it is compatible
        * with the data in this process. */
-      char *ProfileBuffer;
+      ManagedMemory ProfileBuffer;
       if (mmapProfileForMerging(File, ProfileFileSize, &ProfileBuffer) == -1) {
         lprofUnlockFileHandle(File);
         fclose(File);
         return;
       }
-      (void)munmap(ProfileBuffer, ProfileFileSize);
+      (void)lprofReleaseBuffer(&ProfileBuffer, ProfileFileSize);
     }
   } else {
     File = fopen(Filename, FileOpenMode);
@@ -1346,12 +1347,12 @@ COMPILER_RT_VISIBILITY int __llvm_profile_set_file_object(FILE *File,
     } else {
       /* The merged profile has a non-zero length. Check that it is compatible
        * with the data in this process. */
-      char *ProfileBuffer;
+      ManagedMemory ProfileBuffer;
       if (mmapProfileForMerging(File, ProfileFileSize, &ProfileBuffer) == -1) {
         lprofUnlockFileHandle(File);
         return 1;
       }
-      (void)munmap(ProfileBuffer, ProfileFileSize);
+      (void)lprofReleaseBuffer(&ProfileBuffer, ProfileFileSize);
     }
     mmapForContinuousMode(0, File);
     lprofUnlockFileHandle(File);
diff --git a/compiler-rt/lib/profile/InstrProfilingUtil.c b/compiler-rt/lib/profile/InstrProfilingUtil.c
index c637b9d0b893c..4bec5feca823f 100644
--- a/compiler-rt/lib/profile/InstrProfilingUtil.c
+++ b/compiler-rt/lib/profile/InstrProfilingUtil.c
@@ -21,6 +21,15 @@
 #include <unistd.h>
 #endif
 
+#ifdef _AIX
+#include <sys/statfs.h>
+// <sys/vmount.h> depends on `uint` to be a typedef from <sys/types.h> to
+// `uint_t`; however, <sys/types.h> does not always declare `uint`. We provide
+// the typedef prior to including <sys/vmount.h> to work around this issue.
+typedef uint_t uint;
+#include <sys/vmount.h>
+#endif
+
 #ifdef COMPILER_RT_HAS_UNAME
 #include <sys/utsname.h>
 #endif
@@ -258,6 +267,113 @@ COMPILER_RT_VISIBILITY FILE *lprofOpenFileEx(const char *ProfileName) {
   return f;
 }
 
+// Return 1 (true) if the file descriptor Fd represents a file that is on a
+// local filesystem, otherwise return 0.
+static int is_local_filesystem(int Fd) {
+#if defined(_AIX)
+  struct statfs Vfs;
+  if (fstatfs(Fd, &Vfs) != 0) {
+    PROF_ERR("%s: fstatfs(%d) failed: %s\n", __func__, Fd, strerror(errno));
+    return 0;
+  }
+
+  int Ret;
+  size_t BufSize = 2048u;
+  char *Buf;
+  int Tries = 3;
+  while (Tries--) {
+    Buf = malloc(BufSize);
+    Ret = mntctl(MCTL_QUERY, BufSize, Buf);
+    if (Ret != 0)
+      break;
+    BufSize = *(unsigned int *)Buf;
+    free(Buf);
+  }
+
+  if (Ret != -1) {
+    // Look for the correct vmount entry.
+    char *CurObjPtr = Buf;
+    while (Ret--) {
+      struct vmount *Vp = (struct vmount *)CurObjPtr;
+      _Static_assert(sizeof(Vfs.f_fsid) == sizeof(Vp->vmt_fsid),
+                     "fsid length mismatch");
+      if (memcmp(&Vfs.f_fsid, &Vp->vmt_fsid, sizeof Vfs.f_fsid) == 0) {
+        free(Buf);
+        return (Vp->vmt_flags & MNT_REMOTE) == 0;
+      }
+      CurObjPtr += Vp->vmt_length;
+    }
+  }
+
+  free(Buf);
+  // There was an error in mntctl or vmount entry not found; "remote" is the
+  // conservative answer.
+#endif
+  return 0;
+}
+
+static int isMmapSafe(int Fd) {
+  if (getenv("LLVM_NO_MMAP")) // For testing purposes.
+    return 0;
+  (void)&is_local_filesystem; // a fake reference to satisfy -Wunused-function
+#ifdef _AIX
+  return is_local_filesystem(Fd);
+#endif
+  return 1;
+}
+
+COMPILER_RT_VISIBILITY void lprofGetFileContentBuffer(FILE *F, uint64_t Length,
+                                                      ManagedMemory *Buf) {
+  Buf->Status = MM_INVALID;
+
+  if (!F || isMmapSafe(fileno(F))) {
+    Buf->Addr = mmap(NULL, Length, PROT_READ, MAP_SHARED | MAP_FILE,
+                     F ? fileno(F) : -1, 0);
+    if (Buf->Addr != MAP_FAILED)
+      Buf->Status = MM_MMAP;
+    return;
+  }
+
+  if (getenv("LLVM_PROFILE_VERBOSE"))
+    PROF_NOTE("Could not use mmap; using fread instead.%s\n", "");
+
+  void *Buffer = malloc(Length);
+
+  if (!Buffer) {
+    PROF_ERR("%s: malloc failed: %s\n", __func__, strerror(errno));
+    return;
+  }
+  if (fseek(F, 0L, SEEK_SET) != 0) {
+    PROF_ERR("%s: fseek(0, SEEK_SET) failed: %s\n", __func__, strerror(errno));
+    return;
+  }
+
+  // Read the entire file into memory.
+  size_t BytesRead = fread(Buffer, 1, Length, F);
+  if (BytesRead != (size_t)Length || ferror(F)) {
+    PROF_ERR("%s: fread failed: %s\n", __func__, strerror(errno));
+    return;
+  }
+
+  // Reading was successful, record the result in the Buf parameter.
+  Buf->Addr = Buffer;
+  Buf->Status = MM_MALLOC;
+}
+
+void lprofReleaseBuffer(ManagedMemory *Buf, size_t Length) {
+  switch (Buf->Status) {
+  case MM_MALLOC:
+    free(Buf->Addr);
+    break;
+  case MM_MMAP:
+    munmap(Buf->Addr, Length);
+    break;
+  case MM_INVALID:
+    PROF_ERR("%s: Buffer has invalid state: %d", __func__, Buf->Status);
+    break;
+  }
+}
+
 COMPILER_RT_VISIBILITY const char *lprofGetPathPrefix(int *PrefixStrip,
                                                       size_t *PrefixLen) {
   const char *Prefix = getenv("GCOV_PREFIX");
diff --git a/compiler-rt/lib/profile/InstrProfilingUtil.h b/compiler-rt/lib/profile/InstrProfilingUtil.h
index 227c2aa0a7cae..9a7550aad9fab 100644
--- a/compiler-rt/lib/profile/InstrProfilingUtil.h
+++ b/compiler-rt/lib/profile/InstrProfilingUtil.h
@@ -31,6 +31,25 @@ int lprofUnlockFileHandle(FILE *F);
  * lock for exclusive access. The caller will block
  * if the lock is already held by another process. */
 FILE *lprofOpenFileEx(const char *Filename);
+
+enum MemoryStatus {
+  MM_INVALID = 0x1, // Addr is not a valid address
+  MM_MMAP = 0x2,    // Addr was mmap'ed
+  MM_MALLOC = 0x4   // Addr was malloc'ed
+};
+typedef struct {
+  void *Addr;
+  enum MemoryStatus Status;
+} ManagedMemory;
+
+/* Read the content of a file using mmap or fread into a buffer.
+ * Certain files (e.g. NFS mounted) cannot be opened reliably with mmap,
+ * so we use fread in those cases. The corresponding lprofReleaseBuffer
+ * will free/munmap the buffer.
+ */
+void lprofGetFileContentBuffer(FILE *F, uint64_t FileSize, ManagedMemory *Buf);
+void lprofReleaseBuffer(ManagedMemory *FileBuffer, size_t Length);
+
 /* PS4 doesn't have setenv/getenv/fork. Define a shim. */
 #if __ORBIS__
 #include <sys/types.h>
diff --git a/compiler-rt/test/profile/Posix/instrprof-fork.c b/compiler-rt/test/profile/Posix/instrprof-fork.c
index 8df0abd73c4c3..6724efa30a2f6 100644
--- a/compiler-rt/test/profile/Posix/instrprof-fork.c
+++ b/compiler-rt/test/profile/Posix/instrprof-fork.c
@@ -3,11 +3,15 @@
 // RUN: %clang_pgogen=%t.profdir -o %t -O2 %s
 // RUN: %run %t
 // RUN: llvm-profdata show --all-functions --counts %t.profdir/default_*.profraw  | FileCheck %s
+// RUN: rm -fr %t.profdir
+// RUN: env LLVM_NO_MMAP=1 %run %t
+// RUN: llvm-profdata show --all-functions --counts %t.profdir/default_*.profraw  | FileCheck %s
+
 //
 // CHECK: func1:
-// CHECK: Block counts: [4]
+// CHECK: Block counts: [21]
 // CHECK:  func2:
-// CHECK: Block counts: [1]
+// CHECK: Block counts: [10]
 
 #include <sys/wait.h>
 #include <unistd.h>
@@ -16,17 +20,21 @@ __attribute__((noinline)) void func1() {}
 __attribute__((noinline)) void func2() {}
 
 int main(void) {
-  //                       child     | parent
-  int status;         // func1 func2 | func1 func2
-  func1();            //   +1        |   +1        (*)
-  pid_t pid = fork(); //             |
-  if (pid == -1)      //             |
-    return 1;         //             |
-  if (pid == 0)       //             |
-    func2();          //         +1  |
-  func1();            //   +1        |   +1
-  if (pid)            // ------------+------------
-    wait(&status);    //    2     1  |    2    0
-  return 0;           // (*)  the child inherits counter values prior to fork
-                      //      from the parent in non-continuous mode.
+  //                           child     | parent
+  //                         func1 func2 | func1 func2
+  func1();              //   +10       |   +1        (*)
+  int i = 10;           //             |
+  while (i-- > 0) {     //             |
+    pid_t pid = fork(); //             |
+    if (pid == -1)      //             |
+      return 1;         //             |
+    if (pid == 0) {     //             |
+      func2();          //         +10 |
+      func1();          //   +10       |
+      return 0;         //             |
+    }                   //             |
+  }                     // ------------+------------
+  int status;           //   20     10 |   1     0
+  wait(&status);        // (*)  the child inherits counter values prior to fork
+  return 0;             //      from the parent in non-continuous mode.
 }
diff --git a/compiler-rt/test/profile/instrprof-no-mmap-during-merging.c b/compiler-rt/test/profile/instrprof-no-mmap-during-merging.c
new file mode 100644
index 0000000000000..ee74b158a78a4
--- /dev/null
+++ b/compiler-rt/test/profile/instrprof-no-mmap-during-merging.c
@@ -0,0 +1,25 @@
+// RUN: mkdir -p %t.d && cd %t.d
+// RUN: rm -f *.profraw
+// RUN: %clang_pgogen %s -o a.out
+
+// Need to run a.out twice, the second time a merge will occur which will trigger an mmap.
+// RUN: ./a.out
+// RUN: llvm-profdata show default_*.profraw --all-functions --counts --memop-sizes 2>&1 | FileCheck %s -check-prefix=PROFDATA
+// RUN: env LLVM_NO_MMAP=1 LLVM_PROFILE_VERBOSE=1 ./a.out 2>&1 | FileCheck %s
+// RUN: llvm-profdata show default_*.profraw --all-functions --counts --memop-sizes 2>&1 | FileCheck %s -check-prefix=PROFDATA2
+
+// CHECK: Could not use mmap; using fread instead.
+// PROFDATA: Block counts: [1]
+// PROFDATA: [  0,    0,          1 ]
+// PROFDATA: Maximum function count: 1
+// PROFDATA2: Block counts: [2]
+// PROFDATA2: [  0,    0,          2 ]
+// PROFDATA2: Maximum function count: 2
+
+#include <string.h>
+int ar[8];
+int main() {
+  memcpy(ar, ar + 2, ar[0]);
+  memcpy(ar, ar + 2, ar[2]);
+  return ar[2];
+}

Copy link

github-actions bot commented Mar 13, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 9ef7287d42526014abb0cf2aa53ac2c3087198be 3fd025e3321727c79d903833a09d46d589155cbb --extensions h,c -- compiler-rt/test/profile/instrprof-no-mmap-during-merging.c compiler-rt/lib/profile/InstrProfilingFile.c compiler-rt/lib/profile/InstrProfilingPort.h compiler-rt/lib/profile/InstrProfilingUtil.c compiler-rt/lib/profile/InstrProfilingUtil.h compiler-rt/test/profile/Posix/instrprof-fork.c
View the diff from clang-format here.
diff --git a/compiler-rt/test/profile/Posix/instrprof-fork.c b/compiler-rt/test/profile/Posix/instrprof-fork.c
index a1b99b492a..595b8511e5 100644
--- a/compiler-rt/test/profile/Posix/instrprof-fork.c
+++ b/compiler-rt/test/profile/Posix/instrprof-fork.c
@@ -32,11 +32,11 @@ int main(void) {
       func2();          //         +10 |
       func1();          //   +10       |
       return 0;         //             |
-    }                   //             |
-  }                     // ------------+------------
-  int status;           //   20     10 |   1     0
-  i = 10;               // (*)  the child inherits counter values prior to fork
-  while (i-- > 0)       //      from the parent in non-continuous mode.
+    } //             |
+  } // ------------+------------
+  int status;     //   20     10 |   1     0
+  i = 10;         // (*)  the child inherits counter values prior to fork
+  while (i-- > 0) //      from the parent in non-continuous mode.
     wait(&status);
   return 0;
 }

@w2yehia w2yehia requested review from hubert-reinterpretcast, vitalybuka and jdoerfert and removed request for jdoerfert March 13, 2025 18:45
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
w2yehia and others added 2 commits March 14, 2025 00:29
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
w2yehia and others added 6 commits March 14, 2025 19:35
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Copy link
Collaborator

@hubert-reinterpretcast hubert-reinterpretcast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some suggestions.

Comment on lines 320 to 324
#ifdef _AIX
return is_local_filesystem(Fd);
#else
return 1;
#endif
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't using mmap be an issue even on other platforms when the profile is stored on NFS (e.g. Linux)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find any evidence that it's broken, in general, on linux or other platforms.

Note that in this case, we're using mmap to read a file, while maintaining exclusive access to it (opened with lprofOpenFileEx) -- we've seen instances, but on AIX only, where we receive a SIGBUS when accessing mmap'ed memory (in clang, while reading a source file) when the file underneath is changed by another process (happens in a makefile build).

No one reported a problem in the past 5+ years, so I think we're good not doing anything on non-AIX platforms for now.

w2yehia and others added 2 commits March 15, 2025 18:07
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
@petrhosek petrhosek requested a review from david-xl March 15, 2025 22:10
w2yehia and others added 5 commits March 17, 2025 10:54
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
With this change, an initialized ManagedMemory object will get the MS_INVALID(==0) status.
@w2yehia
Copy link
Contributor Author

w2yehia commented Mar 17, 2025

@petrhosek @david-xl Thank you for your review.
I've left two unresolved comments of yours for you to close or reply to.

@w2yehia w2yehia merged commit 14c95e0 into llvm:main Mar 18, 2025
9 of 10 checks passed
@ahatanak
Copy link
Collaborator

@aeubanks
Copy link
Contributor

I've reverted this PR in 32476b9 due to the tests failing on mac. Feel free to reland once you've figured out what's going on there

@w2yehia
Copy link
Contributor Author

w2yehia commented Mar 20, 2025

@ahatanak @aeubanks Thanks for notifying and reverting.
I don't have a mac but got a colleague to try out a fix for me. I also couldn't find a clang on macos on godbolt.org.
The failure is due to missing value profiling data.
The memcpy calls are not lowered by the FE to the llvm.memcpy intrinsic. They are macro expanded to memcpy_chk which is not recognized by the value profiling instrumentation.
So memory intrinsics (at least memcpy) value profiling is currently broken on macos.

w2yehia added a commit that referenced this pull request Mar 20, 2025
… non-local filesystem (#131177)

On AIX, when accessing mmap'ed memory associated to a file on NFS, a
SIGBUS might be raised at random.
The problem is still in open state with the OS team.

This PR teaches the profile runtime, under certain conditions, to avoid
the mmap when reading the profile file during online merging.
This PR has no effect on any platform other than AIX because I'm not
aware of this problem on other platforms.
Other platforms can easily opt-in to this functionality in the future.

The logic in function `is_local_filesystem` was copied from
[llvm/lib/Support/Unix/Path.inc](https://github.com/llvm/llvm-project/blob/f388ca3d9d9a58e3d189458b590ba68dfd9e5a2d/llvm/lib/Support/Unix/Path.inc#L515)
(https://reviews.llvm.org/D58801), because it seems that the
compiler-rt/profile cannot reuse code from llvm except through
`InstrProfData.inc`.

Thanks to @hubert-reinterpretcast for substantial feedback downstream.

---------

Co-authored-by: Wael Yehia <wyehia@ca.ibm.com>
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler-rt PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants