Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Offload]: Skip copying of unused kernel-mapped data #124723

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

pradt2
Copy link

@pradt2 pradt2 commented Jan 28, 2025

This commit skips copying of buffers that aren't used by the kernel.

Context

Currently kernel-mapped data is always copied to and/or from device even if it's not used by the kernel. E.g.

double A[1024];

#pragma omp target teams distribute parallel for map(tofrom:A[0:1024])
for (int i = 0; i < 1024; i++) {
  // empty kernel, A mapping is not used, but data is still copied to and from device
}

This change aims to detect such unused mappings and act on them by skipping memory transfers.

Approach

There are multiple ways of doing this. I aimed for my approach to be as non-invasive as possible. Currently, the optimisation is applied regardless of use of custom mappers. We would probably want to skip this optimisation when custom mappers are present to be on the safe side. I will add this handling as soon as I get a green light that this approach is acceptable.

Tests

Currently there are no tests for this. I'm happy to (at least try to) add tests as soon as we agree on the technical approach.

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-offload

Author: None (pradt2)

Changes

This commit skips copying of buffers that aren't used by the kernel.

Approach

There are multiple ways of doing this. I aimed for my approach to be as non-invasive as possible. Currently, the optimisation is applied regardless of use of custom mappers. We would probably want to skip this optimisation when custom mappers are present to be on the safe side. I will add this handling as soon as I get a green light that this approach is acceptable.

Tests

Currently there are no tests for this. I'm happy to (at least try to) add tests as soon as we agree on the technical approach.


Full diff: https://github.com/llvm/llvm-project/pull/124723.diff

1 Files Affected:

  • (modified) offload/src/omptarget.cpp (+36-2)
diff --git a/offload/src/omptarget.cpp b/offload/src/omptarget.cpp
index 89fa63347babe2..3a9fea2239d5bb 100644
--- a/offload/src/omptarget.cpp
+++ b/offload/src/omptarget.cpp
@@ -1197,6 +1197,35 @@ class PrivateArgumentManagerTy {
   }
 };
 
+static std::unique_ptr<int64_t[]> maskIgnorableMappings(int64_t DeviceId, int32_t ArgNum, int64_t *ArgTypes,
+                                                        int64_t *ArgSizes, map_var_info_t *ArgNames) {
+    std::unique_ptr<int64_t[]> ArgTypesOverride = std::make_unique<int64_t[]>(ArgNum);
+
+    for (int32_t I = 0; I < ArgNum; ++I) {
+        bool IsTargetParam = ArgTypes[I] & OMP_TGT_MAPTYPE_TARGET_PARAM;
+
+        bool IsMapTo = ArgTypes[I] & OMP_TGT_MAPTYPE_TO;
+        if (IsTargetParam || !IsMapTo) {
+            ArgTypesOverride[I] = ArgTypes[I];
+            continue;
+        }
+
+        bool IsMapFrom = ArgTypes[I] & OMP_TGT_MAPTYPE_FROM;
+        const char *Type = IsMapFrom ? "tofrom" : "to";
+
+        // Optimisation: A 'to' or 'tofrom' mapping is not
+        // used by the kernel. Change its type such that
+        // no new mapping is created, but any existing
+        // mapping has its counter decremented.
+        INFO(OMP_INFOTYPE_ALL, DeviceId, "%s(%s)[%" PRId64 "] %s\n", Type,
+             getNameFromMapping(ArgNames[I]).c_str(), ArgSizes[I], "is not used and will not be copied");
+
+        ArgTypesOverride[I] = ArgTypes[I] & ~(OMP_TGT_MAPTYPE_TO | OMP_TGT_MAPTYPE_FROM);
+    }
+
+    return ArgTypesOverride;
+}
+
 /// Process data before launching the kernel, including calling targetDataBegin
 /// to map and transfer data to target device, transferring (first-)private
 /// variables.
@@ -1417,11 +1446,16 @@ int target(ident_t *Loc, DeviceTy &Device, void *HostPtr,
 
   int NumClangLaunchArgs = KernelArgs.NumArgs;
   int Ret = OFFLOAD_SUCCESS;
+
+    std::unique_ptr<int64_t[]> ArgTypesOverride =
+            maskIgnorableMappings(DeviceId, NumClangLaunchArgs, KernelArgs.ArgTypes,
+                                  KernelArgs.ArgSizes, KernelArgs.ArgNames);
+
   if (NumClangLaunchArgs) {
     // Process data, such as data mapping, before launching the kernel
     Ret = processDataBefore(Loc, DeviceId, HostPtr, NumClangLaunchArgs,
                             KernelArgs.ArgBasePtrs, KernelArgs.ArgPtrs,
-                            KernelArgs.ArgSizes, KernelArgs.ArgTypes,
+                            KernelArgs.ArgSizes, ArgTypesOverride.get(),
                             KernelArgs.ArgNames, KernelArgs.ArgMappers, TgtArgs,
                             TgtOffsets, PrivateArgumentManager, AsyncInfo);
     if (Ret != OFFLOAD_SUCCESS) {
@@ -1473,7 +1507,7 @@ int target(ident_t *Loc, DeviceTy &Device, void *HostPtr,
     // variables
     Ret = processDataAfter(Loc, DeviceId, HostPtr, NumClangLaunchArgs,
                            KernelArgs.ArgBasePtrs, KernelArgs.ArgPtrs,
-                           KernelArgs.ArgSizes, KernelArgs.ArgTypes,
+                           KernelArgs.ArgSizes, ArgTypesOverride.get(),
                            KernelArgs.ArgNames, KernelArgs.ArgMappers,
                            PrivateArgumentManager, AsyncInfo);
     if (Ret != OFFLOAD_SUCCESS) {

@pradt2
Copy link
Author

pradt2 commented Jan 28, 2025

@jdoerfert

@jplehr
Copy link
Contributor

jplehr commented Jan 28, 2025

Not sure if this is meant to work / pass tests already. I see many failures when running check-offload with this patch.

Copy link
Contributor

@shiltian shiltian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test?

@shiltian shiltian requested review from jdoerfert and jhuber6 January 28, 2025 14:08
@RaviNarayanaswamy
Copy link

What about the case when you have map(always, to....). Need to copy the data.

@shiltian
Copy link
Contributor

I also wonder how you could know if a map is used w/o information passed from compiler...

Copy link
Member

@jdoerfert jdoerfert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned, we need tests, incl. for the modifiers, e.g., always.

@@ -1197,6 +1197,35 @@ class PrivateArgumentManagerTy {
}
};

static std::unique_ptr<int64_t[]> maskIgnorableMappings(int64_t DeviceId, int32_t ArgNum, int64_t *ArgTypes,
int64_t *ArgSizes, map_var_info_t *ArgNames) {
std::unique_ptr<int64_t[]> ArgTypesOverride = std::make_unique<int64_t[]>(ArgNum);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not changing ArgTypes in place?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get a segfault when I do, I think this is because the pointer points straight to a region of read-only memory.

Copy link

⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.
Please turn off Keep my email addresses private setting in your account.
See LLVM Discourse for more information.

Copy link

github-actions bot commented Jan 29, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 170b9caf335eb99cdecba0fb6cdf45728bcac11d 12561c445d04f13a769f0f8a2c0c01934a0a8062 --extensions h,cpp -- offload/test/mapping/skip_transfers.cpp offload/include/OpenMP/Mapping.h offload/include/Shared/Debug.h offload/libomptarget/omptarget.cpp
View the diff from clang-format here.
diff --git a/offload/include/OpenMP/Mapping.h b/offload/include/OpenMP/Mapping.h
index 1595e06714..f5be04b1a3 100644
--- a/offload/include/OpenMP/Mapping.h
+++ b/offload/include/OpenMP/Mapping.h
@@ -388,10 +388,9 @@ struct LookupResult {
   TargetPointerResultTy TPR;
 
   bool isEmpty() const {
-      bool IsEmpty = Flags.IsContained == 0
-              & Flags.ExtendsBefore == 0
-              & Flags.ExtendsAfter == 0;
-      return IsEmpty;
+    bool IsEmpty = Flags.IsContained == 0 & Flags.ExtendsBefore == 0 &
+                   Flags.ExtendsAfter == 0;
+    return IsEmpty;
   }
 };
 
diff --git a/offload/libomptarget/omptarget.cpp b/offload/libomptarget/omptarget.cpp
index 6fe3def424..8e692e5b30 100644
--- a/offload/libomptarget/omptarget.cpp
+++ b/offload/libomptarget/omptarget.cpp
@@ -1200,66 +1200,77 @@ public:
 /// Try to find redundant mappings associated with a kernel launch,
 /// and provide a masked version of the kernel argument types that
 /// avoid redundant to data transfers between the host and device.
-static std::unique_ptr<int64_t[]> maskRedundantTransfers(DeviceTy &Device, int32_t ArgNum,
-                                                         int64_t *ArgTypes, int64_t *ArgSizes,
-                                                         map_var_info_t *ArgNames, void **ArgPtrs,
-                                                         void **ArgMappers) {
-    std::unique_ptr<int64_t[]> ArgTypesOverride = std::make_unique<int64_t[]>(ArgNum);
+static std::unique_ptr<int64_t[]>
+maskRedundantTransfers(DeviceTy &Device, int32_t ArgNum, int64_t *ArgTypes,
+                       int64_t *ArgSizes, map_var_info_t *ArgNames,
+                       void **ArgPtrs, void **ArgMappers) {
+  std::unique_ptr<int64_t[]> ArgTypesOverride =
+      std::make_unique<int64_t[]>(ArgNum);
 
-    MappingInfoTy &MappingInfo = Device.getMappingInfo();
-    MappingInfoTy::HDTTMapAccessorTy HDTTMap = MappingInfo
-            .HostDataToTargetMap.getExclusiveAccessor();
+  MappingInfoTy &MappingInfo = Device.getMappingInfo();
+  MappingInfoTy::HDTTMapAccessorTy HDTTMap =
+      MappingInfo.HostDataToTargetMap.getExclusiveAccessor();
 
-    int64_t UnusedArgs = 0;
+  int64_t UnusedArgs = 0;
 
-    for (int32_t I = 0; I < ArgNum; ++I) {
-        tgt_map_type ArgType = (tgt_map_type) ArgTypes[I];
-
-        // Check for unused implicit mappings
-        bool IsArgUnused = ArgType == OMP_TGT_MAPTYPE_NONE;
-
-        // Check for unused `map(buf[0:size])` mappings
-        IsArgUnused |= ArgType == OMP_TGT_MAPTYPE_FROM
-                || ArgType == OMP_TGT_MAPTYPE_TO
-                || ArgType == (OMP_TGT_MAPTYPE_FROM | OMP_TGT_MAPTYPE_TO);
-
-        // Check for unused `map(wrapper.buf[0:size])` mappings
-        IsArgUnused |= UnusedArgs == ArgNum - 1 && ArgType & OMP_TGT_MAPTYPE_MEMBER_OF
-                && ((ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) == OMP_TGT_MAPTYPE_PTR_AND_OBJ
-                || (ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) == (OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_TO)
-                || (ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) == (OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_FROM | OMP_TGT_MAPTYPE_TO));
-
-        bool IsExistingMapping = !MappingInfo.lookupMapping(HDTTMap, ArgPtrs[I], ArgSizes[I]).isEmpty();
-
-        bool IsCustomMapped = ArgMappers && ArgMappers[I];
-
-        if (IsExistingMapping | IsCustomMapped | !IsArgUnused) {
-            ArgTypesOverride[I] = ArgTypes[I];
-            continue;
-        }
-
-        const std::string Name = ArgNames && ArgNames[I] ?
-                                 getNameFromMapping(ArgNames[I]) : std::string("unknown");
-
-        bool IsArgFrom = ArgType & OMP_TGT_MAPTYPE_FROM;
-        bool IsArgTo = ArgType & OMP_TGT_MAPTYPE_TO;
-
-        const char *Type = IsArgFrom && IsArgTo ? "tofrom"
-                               : IsArgFrom ? "from"
-                               : IsArgTo ? "to"
-                               : "unknown";
-
-        // Optimisation:
-        // A new mapping is not used by the kernel.
-        // Change the type such that no data is transferred to and/or from the device.
-        INFO(OMP_INFOTYPE_REDUNDANT_TRANSFER, Device.DeviceID, "%s(%s)[%" PRId64 "] %s\n", Type,
-             Name.c_str(), ArgSizes[I], "is not used and will not be copied");
-
-        ArgTypesOverride[I] = ArgTypes[I] & ~(OMP_TGT_MAPTYPE_TO | OMP_TGT_MAPTYPE_FROM);
-        UnusedArgs++;
+  for (int32_t I = 0; I < ArgNum; ++I) {
+    tgt_map_type ArgType = (tgt_map_type)ArgTypes[I];
+
+    // Check for unused implicit mappings
+    bool IsArgUnused = ArgType == OMP_TGT_MAPTYPE_NONE;
+
+    // Check for unused `map(buf[0:size])` mappings
+    IsArgUnused |= ArgType == OMP_TGT_MAPTYPE_FROM ||
+                   ArgType == OMP_TGT_MAPTYPE_TO ||
+                   ArgType == (OMP_TGT_MAPTYPE_FROM | OMP_TGT_MAPTYPE_TO);
+
+    // Check for unused `map(wrapper.buf[0:size])` mappings
+    IsArgUnused |= UnusedArgs == ArgNum - 1 &&
+                   ArgType & OMP_TGT_MAPTYPE_MEMBER_OF &&
+                   ((ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) ==
+                        OMP_TGT_MAPTYPE_PTR_AND_OBJ ||
+                    (ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) ==
+                        (OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_TO) ||
+                    (ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) ==
+                        (OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_FROM |
+                         OMP_TGT_MAPTYPE_TO));
+
+    bool IsExistingMapping =
+        !MappingInfo.lookupMapping(HDTTMap, ArgPtrs[I], ArgSizes[I]).isEmpty();
+
+    bool IsCustomMapped = ArgMappers && ArgMappers[I];
+
+    if (IsExistingMapping | IsCustomMapped | !IsArgUnused) {
+      ArgTypesOverride[I] = ArgTypes[I];
+      continue;
     }
 
-    return ArgTypesOverride;
+    const std::string Name = ArgNames && ArgNames[I]
+                                 ? getNameFromMapping(ArgNames[I])
+                                 : std::string("unknown");
+
+    bool IsArgFrom = ArgType & OMP_TGT_MAPTYPE_FROM;
+    bool IsArgTo = ArgType & OMP_TGT_MAPTYPE_TO;
+
+    const char *Type = IsArgFrom && IsArgTo ? "tofrom"
+                       : IsArgFrom          ? "from"
+                       : IsArgTo            ? "to"
+                                            : "unknown";
+
+    // Optimisation:
+    // A new mapping is not used by the kernel.
+    // Change the type such that no data is transferred to and/or from the
+    // device.
+    INFO(OMP_INFOTYPE_REDUNDANT_TRANSFER, Device.DeviceID,
+         "%s(%s)[%" PRId64 "] %s\n", Type, Name.c_str(), ArgSizes[I],
+         "is not used and will not be copied");
+
+    ArgTypesOverride[I] =
+        ArgTypes[I] & ~(OMP_TGT_MAPTYPE_TO | OMP_TGT_MAPTYPE_FROM);
+    UnusedArgs++;
+  }
+
+  return ArgTypesOverride;
 }
 
 /// Process data before launching the kernel, including calling targetDataBegin
@@ -1483,9 +1494,9 @@ int target(ident_t *Loc, DeviceTy &Device, void *HostPtr,
   int NumClangLaunchArgs = KernelArgs.NumArgs;
   int Ret = OFFLOAD_SUCCESS;
 
-    std::unique_ptr<int64_t[]> ArgTypesOverride =
-            maskRedundantTransfers(Device, NumClangLaunchArgs, KernelArgs.ArgTypes,
-                                  KernelArgs.ArgSizes, KernelArgs.ArgNames, KernelArgs.ArgPtrs, KernelArgs.ArgMappers);
+  std::unique_ptr<int64_t[]> ArgTypesOverride = maskRedundantTransfers(
+      Device, NumClangLaunchArgs, KernelArgs.ArgTypes, KernelArgs.ArgSizes,
+      KernelArgs.ArgNames, KernelArgs.ArgPtrs, KernelArgs.ArgMappers);
 
   if (NumClangLaunchArgs) {
     // Process data, such as data mapping, before launching the kernel
diff --git a/offload/test/mapping/skip_transfers.cpp b/offload/test/mapping/skip_transfers.cpp
index ff0459f01f..c784840b4d 100644
--- a/offload/test/mapping/skip_transfers.cpp
+++ b/offload/test/mapping/skip_transfers.cpp
@@ -7,66 +7,77 @@
 // clang-format on
 
 int main() {
-    float DataStack = 0;
+  float DataStack = 0;
 
-    // CHECK-NOT: omptarget device 0 info: from(unknown)[4] is not used and will not be copied
-    #pragma omp target map(from: DataStack)
-    {
-        DataStack = 1;
-    }
+// CHECK-NOT: omptarget device 0 info: from(unknown)[4] is not used and will not
+// be copied
+#pragma omp target map(from : DataStack)
+  {
+    DataStack = 1;
+  }
 
-    // CHECK-NOT: omptarget device 0 info: to(unknown)[4] is not used and will not be copied
-    #pragma omp target map(always to: DataStack)
-    ;
+// CHECK-NOT: omptarget device 0 info: to(unknown)[4] is not used and will not
+// be copied
+#pragma omp target map(always to : DataStack)
+  ;
 
-    // CHECK: omptarget device 0 info: tofrom(unknown)[4] is not used and will not be copied
-    #pragma omp target map(tofrom: DataStack)
-    ;
+// CHECK: omptarget device 0 info: tofrom(unknown)[4] is not used and will not
+// be copied
+#pragma omp target map(tofrom : DataStack)
+  ;
 
-    int Size = 16;
-    double *Data = new double[Size];
+  int Size = 16;
+  double *Data = new double[Size];
 
-    // CHECK-NOT: omptarget device 0 info: tofrom(unknown)[8] is not used and will not be copied
-    #pragma omp target map(tofrom: Data[0:1])
-    {
-        Data[0] = 1;
-    }
+// CHECK-NOT: omptarget device 0 info: tofrom(unknown)[8] is not used and will
+// not be copied
+#pragma omp target map(tofrom : Data[0 : 1])
+  {
+    Data[0] = 1;
+  }
 
-    // CHECK-NOT: omptarget device 0 info: tofrom(unknown)[16] is not used and will not be copied
-    #pragma omp target map(always tofrom: Data[0:2])
-    ;
+// CHECK-NOT: omptarget device 0 info: tofrom(unknown)[16] is not used and will
+// not be copied
+#pragma omp target map(always tofrom : Data[0 : 2])
+  ;
 
-    // CHECK: omptarget device 0 info: from(unknown)[24] is not used and will not be copied
-    #pragma omp target map(from: Data[0:3])
-    ;
+// CHECK: omptarget device 0 info: from(unknown)[24] is not used and will not be
+// copied
+#pragma omp target map(from : Data[0 : 3])
+  ;
 
-    // CHECK: omptarget device 0 info: to(unknown)[24] is not used and will not be copied
-    #pragma omp target map(to: Data[0:3])
-    ;
+// CHECK: omptarget device 0 info: to(unknown)[24] is not used and will not be
+// copied
+#pragma omp target map(to : Data[0 : 3])
+  ;
 
-    // CHECK: omptarget device 0 info: tofrom(unknown)[32] is not used and will not be copied
-    #pragma omp target map(tofrom: Data[0:4])
-    ;
+// CHECK: omptarget device 0 info: tofrom(unknown)[32] is not used and will not
+// be copied
+#pragma omp target map(tofrom : Data[0 : 4])
+  ;
 
-    // CHECK-NOT: omptarget device 0 info: to(unknown)[40] is not used and will not be copied
-    #pragma omp target map(to: Data[0:5])
-    {
-        #pragma omp teams
-        Data[0] = 1;
-    }
+// CHECK-NOT: omptarget device 0 info: to(unknown)[40] is not used and will not
+// be copied
+#pragma omp target map(to : Data[0 : 5])
+  {
+#pragma omp teams
+    Data[0] = 1;
+  }
 
-    struct {
-      double *Data;
-    } Wrapper { .Data = Data };
+  struct {
+    double *Data;
+  } Wrapper{.Data = Data};
 
-    // CHECK-NOT: omptarget device 0 info: tofrom(unknown)[48] is not used and will not be copied
-    #pragma omp target map(tofrom: Wrapper.Data[0:6])
-    {
-        Wrapper.Data[0] = 1;
-    }
+// CHECK-NOT: omptarget device 0 info: tofrom(unknown)[48] is not used and will
+// not be copied
+#pragma omp target map(tofrom : Wrapper.Data[0 : 6])
+  {
+    Wrapper.Data[0] = 1;
+  }
 
-    // CHECK: omptarget device 0 info: unknown(unknown)[8] is not used and will not be copied
-    // CHECK: omptarget device 0 info: tofrom(unknown)[56] is not used and will not be copied
-    #pragma omp target map(tofrom: Wrapper.Data[0:7])
-    ;
+// CHECK: omptarget device 0 info: unknown(unknown)[8] is not used and will not
+// be copied CHECK: omptarget device 0 info: tofrom(unknown)[56] is not used and
+// will not be copied
+#pragma omp target map(tofrom : Wrapper.Data[0 : 7])
+  ;
 }

@pradt2
Copy link
Author

pradt2 commented Feb 6, 2025

Thanks for your feedback. This latest iteration handles modifiers, and custom mappers. Tests are in place and are passing locally. It would be good to run the test pipeline now to confirm.

@jplehr
Copy link
Contributor

jplehr commented Feb 6, 2025

I checked out the PR and used the CMake cache file from offload/cmake/caches/AMDGPUBot.cmake for my build config. I get a build error. It seems this was accidentally rebased onto something broken.

[1105/7623] Building CXX object lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/CFIFixup.cpp.o
FAILED: lib/CodeGen/CMakeFiles/LLVMCodeGen.dir/CFIFixup.cpp.o

I rebased onto current main which let me build and run tests. Failures that appears previously appear to be addressed.

(OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_FROM |
OMP_TGT_MAPTYPE_TO));

bool IsExistingMapping =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to check these if we know we won't skip this one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reordered the checks so as to minimise the number of times we call MappingInfo. Let me know if this is OK

MappingInfoTy::HDTTMapAccessorTy HDTTMap =
MappingInfo.HostDataToTargetMap.getExclusiveAccessor();

int64_t UnusedArgs = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a good way to check for a "LastWasUnused". Make it explicit. Also, do we have a test where the object is used but the pointee is not?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not checking for "LastWasUnused". I'm checking for "AllAreUnused". Anyway, I made the check more explicit. Let me know if this is OK

(OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_TO) ||
(ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) ==
(OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_FROM |
OMP_TGT_MAPTYPE_TO));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make static helper for each of these with descriptive names, please.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I rearranged the checks to make them more explicit

// Check for unused `map(buf[0:size])` mappings
IsArgUnused |= ArgType == OMP_TGT_MAPTYPE_FROM ||
ArgType == OMP_TGT_MAPTYPE_TO ||
ArgType == (OMP_TGT_MAPTYPE_FROM | OMP_TGT_MAPTYPE_TO);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there no other flags allowed? It might be so, just checking.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All test cases that we discussed are supported with these checks. In principle there may be some flags that could be ignored but I wasn't able to produce a test case to check for them.

@jhuber6
Copy link
Contributor

jhuber6 commented Feb 11, 2025

Was this closed accidentally?

@pradt2
Copy link
Author

pradt2 commented Feb 12, 2025

Yes I will sort it out now

@pradt2 pradt2 reopened this Feb 13, 2025
@jplehr
Copy link
Contributor

jplehr commented Feb 14, 2025

I have another question: The title says that it skips unused data copies, but a lot of the info output talks about redundant copies. Can you help me understand what's that distinction and how this patch determines if data is unused?

@shiltian
Copy link
Contributor

I have the same question. W/o compiler passing the information to the runtime, how can we determine if data is used or not.

@pradt2
Copy link
Author

pradt2 commented Feb 18, 2025

@jplehr @shiltian

The details vary slightly depending on the mapping, but the bottom line is that unused mappings can be detected just by looking at the ArgType flags of the kernel arguments.

For example, "freestanding buffers", i.e. #pragma omp target map(buf[0:N])), that are unused in the kernel are missing the OMP_TGT_MAPTYPE_TARGET_PARAM flag, whereas unused "wrapped buffers", i.e. #pragma omp target map(wrapper.buf[0:N]) can be detected by determining if the parent object (here - wrapper) is missing the OMP_TGT_MAPTYPE_TARGET_PARAM flag.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants