-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Offload]: Skip copying of unused kernel-mapped data #124723
base: main
Are you sure you want to change the base?
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-offload Author: None (pradt2) ChangesThis commit skips copying of buffers that aren't used by the kernel. ApproachThere are multiple ways of doing this. I aimed for my approach to be as non-invasive as possible. Currently, the optimisation is applied regardless of use of custom mappers. We would probably want to skip this optimisation when custom mappers are present to be on the safe side. I will add this handling as soon as I get a green light that this approach is acceptable. TestsCurrently there are no tests for this. I'm happy to (at least try to) add tests as soon as we agree on the technical approach. Full diff: https://github.com/llvm/llvm-project/pull/124723.diff 1 Files Affected:
diff --git a/offload/src/omptarget.cpp b/offload/src/omptarget.cpp
index 89fa63347babe2..3a9fea2239d5bb 100644
--- a/offload/src/omptarget.cpp
+++ b/offload/src/omptarget.cpp
@@ -1197,6 +1197,35 @@ class PrivateArgumentManagerTy {
}
};
+static std::unique_ptr<int64_t[]> maskIgnorableMappings(int64_t DeviceId, int32_t ArgNum, int64_t *ArgTypes,
+ int64_t *ArgSizes, map_var_info_t *ArgNames) {
+ std::unique_ptr<int64_t[]> ArgTypesOverride = std::make_unique<int64_t[]>(ArgNum);
+
+ for (int32_t I = 0; I < ArgNum; ++I) {
+ bool IsTargetParam = ArgTypes[I] & OMP_TGT_MAPTYPE_TARGET_PARAM;
+
+ bool IsMapTo = ArgTypes[I] & OMP_TGT_MAPTYPE_TO;
+ if (IsTargetParam || !IsMapTo) {
+ ArgTypesOverride[I] = ArgTypes[I];
+ continue;
+ }
+
+ bool IsMapFrom = ArgTypes[I] & OMP_TGT_MAPTYPE_FROM;
+ const char *Type = IsMapFrom ? "tofrom" : "to";
+
+ // Optimisation: A 'to' or 'tofrom' mapping is not
+ // used by the kernel. Change its type such that
+ // no new mapping is created, but any existing
+ // mapping has its counter decremented.
+ INFO(OMP_INFOTYPE_ALL, DeviceId, "%s(%s)[%" PRId64 "] %s\n", Type,
+ getNameFromMapping(ArgNames[I]).c_str(), ArgSizes[I], "is not used and will not be copied");
+
+ ArgTypesOverride[I] = ArgTypes[I] & ~(OMP_TGT_MAPTYPE_TO | OMP_TGT_MAPTYPE_FROM);
+ }
+
+ return ArgTypesOverride;
+}
+
/// Process data before launching the kernel, including calling targetDataBegin
/// to map and transfer data to target device, transferring (first-)private
/// variables.
@@ -1417,11 +1446,16 @@ int target(ident_t *Loc, DeviceTy &Device, void *HostPtr,
int NumClangLaunchArgs = KernelArgs.NumArgs;
int Ret = OFFLOAD_SUCCESS;
+
+ std::unique_ptr<int64_t[]> ArgTypesOverride =
+ maskIgnorableMappings(DeviceId, NumClangLaunchArgs, KernelArgs.ArgTypes,
+ KernelArgs.ArgSizes, KernelArgs.ArgNames);
+
if (NumClangLaunchArgs) {
// Process data, such as data mapping, before launching the kernel
Ret = processDataBefore(Loc, DeviceId, HostPtr, NumClangLaunchArgs,
KernelArgs.ArgBasePtrs, KernelArgs.ArgPtrs,
- KernelArgs.ArgSizes, KernelArgs.ArgTypes,
+ KernelArgs.ArgSizes, ArgTypesOverride.get(),
KernelArgs.ArgNames, KernelArgs.ArgMappers, TgtArgs,
TgtOffsets, PrivateArgumentManager, AsyncInfo);
if (Ret != OFFLOAD_SUCCESS) {
@@ -1473,7 +1507,7 @@ int target(ident_t *Loc, DeviceTy &Device, void *HostPtr,
// variables
Ret = processDataAfter(Loc, DeviceId, HostPtr, NumClangLaunchArgs,
KernelArgs.ArgBasePtrs, KernelArgs.ArgPtrs,
- KernelArgs.ArgSizes, KernelArgs.ArgTypes,
+ KernelArgs.ArgSizes, ArgTypesOverride.get(),
KernelArgs.ArgNames, KernelArgs.ArgMappers,
PrivateArgumentManager, AsyncInfo);
if (Ret != OFFLOAD_SUCCESS) {
|
Not sure if this is meant to work / pass tests already. I see many failures when running |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test?
What about the case when you have map(always, to....). Need to copy the data. |
I also wonder how you could know if a map is used w/o information passed from compiler... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned, we need tests, incl. for the modifiers, e.g., always.
offload/src/omptarget.cpp
Outdated
@@ -1197,6 +1197,35 @@ class PrivateArgumentManagerTy { | |||
} | |||
}; | |||
|
|||
static std::unique_ptr<int64_t[]> maskIgnorableMappings(int64_t DeviceId, int32_t ArgNum, int64_t *ArgTypes, | |||
int64_t *ArgSizes, map_var_info_t *ArgNames) { | |||
std::unique_ptr<int64_t[]> ArgTypesOverride = std::make_unique<int64_t[]>(ArgNum); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we not changing ArgTypes in place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get a segfault when I do, I think this is because the pointer points straight to a region of read-only memory.
|
You can test this locally with the following command:git-clang-format --diff 170b9caf335eb99cdecba0fb6cdf45728bcac11d 12561c445d04f13a769f0f8a2c0c01934a0a8062 --extensions h,cpp -- offload/test/mapping/skip_transfers.cpp offload/include/OpenMP/Mapping.h offload/include/Shared/Debug.h offload/libomptarget/omptarget.cpp View the diff from clang-format here.diff --git a/offload/include/OpenMP/Mapping.h b/offload/include/OpenMP/Mapping.h
index 1595e06714..f5be04b1a3 100644
--- a/offload/include/OpenMP/Mapping.h
+++ b/offload/include/OpenMP/Mapping.h
@@ -388,10 +388,9 @@ struct LookupResult {
TargetPointerResultTy TPR;
bool isEmpty() const {
- bool IsEmpty = Flags.IsContained == 0
- & Flags.ExtendsBefore == 0
- & Flags.ExtendsAfter == 0;
- return IsEmpty;
+ bool IsEmpty = Flags.IsContained == 0 & Flags.ExtendsBefore == 0 &
+ Flags.ExtendsAfter == 0;
+ return IsEmpty;
}
};
diff --git a/offload/libomptarget/omptarget.cpp b/offload/libomptarget/omptarget.cpp
index 6fe3def424..8e692e5b30 100644
--- a/offload/libomptarget/omptarget.cpp
+++ b/offload/libomptarget/omptarget.cpp
@@ -1200,66 +1200,77 @@ public:
/// Try to find redundant mappings associated with a kernel launch,
/// and provide a masked version of the kernel argument types that
/// avoid redundant to data transfers between the host and device.
-static std::unique_ptr<int64_t[]> maskRedundantTransfers(DeviceTy &Device, int32_t ArgNum,
- int64_t *ArgTypes, int64_t *ArgSizes,
- map_var_info_t *ArgNames, void **ArgPtrs,
- void **ArgMappers) {
- std::unique_ptr<int64_t[]> ArgTypesOverride = std::make_unique<int64_t[]>(ArgNum);
+static std::unique_ptr<int64_t[]>
+maskRedundantTransfers(DeviceTy &Device, int32_t ArgNum, int64_t *ArgTypes,
+ int64_t *ArgSizes, map_var_info_t *ArgNames,
+ void **ArgPtrs, void **ArgMappers) {
+ std::unique_ptr<int64_t[]> ArgTypesOverride =
+ std::make_unique<int64_t[]>(ArgNum);
- MappingInfoTy &MappingInfo = Device.getMappingInfo();
- MappingInfoTy::HDTTMapAccessorTy HDTTMap = MappingInfo
- .HostDataToTargetMap.getExclusiveAccessor();
+ MappingInfoTy &MappingInfo = Device.getMappingInfo();
+ MappingInfoTy::HDTTMapAccessorTy HDTTMap =
+ MappingInfo.HostDataToTargetMap.getExclusiveAccessor();
- int64_t UnusedArgs = 0;
+ int64_t UnusedArgs = 0;
- for (int32_t I = 0; I < ArgNum; ++I) {
- tgt_map_type ArgType = (tgt_map_type) ArgTypes[I];
-
- // Check for unused implicit mappings
- bool IsArgUnused = ArgType == OMP_TGT_MAPTYPE_NONE;
-
- // Check for unused `map(buf[0:size])` mappings
- IsArgUnused |= ArgType == OMP_TGT_MAPTYPE_FROM
- || ArgType == OMP_TGT_MAPTYPE_TO
- || ArgType == (OMP_TGT_MAPTYPE_FROM | OMP_TGT_MAPTYPE_TO);
-
- // Check for unused `map(wrapper.buf[0:size])` mappings
- IsArgUnused |= UnusedArgs == ArgNum - 1 && ArgType & OMP_TGT_MAPTYPE_MEMBER_OF
- && ((ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) == OMP_TGT_MAPTYPE_PTR_AND_OBJ
- || (ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) == (OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_TO)
- || (ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) == (OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_FROM | OMP_TGT_MAPTYPE_TO));
-
- bool IsExistingMapping = !MappingInfo.lookupMapping(HDTTMap, ArgPtrs[I], ArgSizes[I]).isEmpty();
-
- bool IsCustomMapped = ArgMappers && ArgMappers[I];
-
- if (IsExistingMapping | IsCustomMapped | !IsArgUnused) {
- ArgTypesOverride[I] = ArgTypes[I];
- continue;
- }
-
- const std::string Name = ArgNames && ArgNames[I] ?
- getNameFromMapping(ArgNames[I]) : std::string("unknown");
-
- bool IsArgFrom = ArgType & OMP_TGT_MAPTYPE_FROM;
- bool IsArgTo = ArgType & OMP_TGT_MAPTYPE_TO;
-
- const char *Type = IsArgFrom && IsArgTo ? "tofrom"
- : IsArgFrom ? "from"
- : IsArgTo ? "to"
- : "unknown";
-
- // Optimisation:
- // A new mapping is not used by the kernel.
- // Change the type such that no data is transferred to and/or from the device.
- INFO(OMP_INFOTYPE_REDUNDANT_TRANSFER, Device.DeviceID, "%s(%s)[%" PRId64 "] %s\n", Type,
- Name.c_str(), ArgSizes[I], "is not used and will not be copied");
-
- ArgTypesOverride[I] = ArgTypes[I] & ~(OMP_TGT_MAPTYPE_TO | OMP_TGT_MAPTYPE_FROM);
- UnusedArgs++;
+ for (int32_t I = 0; I < ArgNum; ++I) {
+ tgt_map_type ArgType = (tgt_map_type)ArgTypes[I];
+
+ // Check for unused implicit mappings
+ bool IsArgUnused = ArgType == OMP_TGT_MAPTYPE_NONE;
+
+ // Check for unused `map(buf[0:size])` mappings
+ IsArgUnused |= ArgType == OMP_TGT_MAPTYPE_FROM ||
+ ArgType == OMP_TGT_MAPTYPE_TO ||
+ ArgType == (OMP_TGT_MAPTYPE_FROM | OMP_TGT_MAPTYPE_TO);
+
+ // Check for unused `map(wrapper.buf[0:size])` mappings
+ IsArgUnused |= UnusedArgs == ArgNum - 1 &&
+ ArgType & OMP_TGT_MAPTYPE_MEMBER_OF &&
+ ((ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) ==
+ OMP_TGT_MAPTYPE_PTR_AND_OBJ ||
+ (ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) ==
+ (OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_TO) ||
+ (ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) ==
+ (OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_FROM |
+ OMP_TGT_MAPTYPE_TO));
+
+ bool IsExistingMapping =
+ !MappingInfo.lookupMapping(HDTTMap, ArgPtrs[I], ArgSizes[I]).isEmpty();
+
+ bool IsCustomMapped = ArgMappers && ArgMappers[I];
+
+ if (IsExistingMapping | IsCustomMapped | !IsArgUnused) {
+ ArgTypesOverride[I] = ArgTypes[I];
+ continue;
}
- return ArgTypesOverride;
+ const std::string Name = ArgNames && ArgNames[I]
+ ? getNameFromMapping(ArgNames[I])
+ : std::string("unknown");
+
+ bool IsArgFrom = ArgType & OMP_TGT_MAPTYPE_FROM;
+ bool IsArgTo = ArgType & OMP_TGT_MAPTYPE_TO;
+
+ const char *Type = IsArgFrom && IsArgTo ? "tofrom"
+ : IsArgFrom ? "from"
+ : IsArgTo ? "to"
+ : "unknown";
+
+ // Optimisation:
+ // A new mapping is not used by the kernel.
+ // Change the type such that no data is transferred to and/or from the
+ // device.
+ INFO(OMP_INFOTYPE_REDUNDANT_TRANSFER, Device.DeviceID,
+ "%s(%s)[%" PRId64 "] %s\n", Type, Name.c_str(), ArgSizes[I],
+ "is not used and will not be copied");
+
+ ArgTypesOverride[I] =
+ ArgTypes[I] & ~(OMP_TGT_MAPTYPE_TO | OMP_TGT_MAPTYPE_FROM);
+ UnusedArgs++;
+ }
+
+ return ArgTypesOverride;
}
/// Process data before launching the kernel, including calling targetDataBegin
@@ -1483,9 +1494,9 @@ int target(ident_t *Loc, DeviceTy &Device, void *HostPtr,
int NumClangLaunchArgs = KernelArgs.NumArgs;
int Ret = OFFLOAD_SUCCESS;
- std::unique_ptr<int64_t[]> ArgTypesOverride =
- maskRedundantTransfers(Device, NumClangLaunchArgs, KernelArgs.ArgTypes,
- KernelArgs.ArgSizes, KernelArgs.ArgNames, KernelArgs.ArgPtrs, KernelArgs.ArgMappers);
+ std::unique_ptr<int64_t[]> ArgTypesOverride = maskRedundantTransfers(
+ Device, NumClangLaunchArgs, KernelArgs.ArgTypes, KernelArgs.ArgSizes,
+ KernelArgs.ArgNames, KernelArgs.ArgPtrs, KernelArgs.ArgMappers);
if (NumClangLaunchArgs) {
// Process data, such as data mapping, before launching the kernel
diff --git a/offload/test/mapping/skip_transfers.cpp b/offload/test/mapping/skip_transfers.cpp
index ff0459f01f..c784840b4d 100644
--- a/offload/test/mapping/skip_transfers.cpp
+++ b/offload/test/mapping/skip_transfers.cpp
@@ -7,66 +7,77 @@
// clang-format on
int main() {
- float DataStack = 0;
+ float DataStack = 0;
- // CHECK-NOT: omptarget device 0 info: from(unknown)[4] is not used and will not be copied
- #pragma omp target map(from: DataStack)
- {
- DataStack = 1;
- }
+// CHECK-NOT: omptarget device 0 info: from(unknown)[4] is not used and will not
+// be copied
+#pragma omp target map(from : DataStack)
+ {
+ DataStack = 1;
+ }
- // CHECK-NOT: omptarget device 0 info: to(unknown)[4] is not used and will not be copied
- #pragma omp target map(always to: DataStack)
- ;
+// CHECK-NOT: omptarget device 0 info: to(unknown)[4] is not used and will not
+// be copied
+#pragma omp target map(always to : DataStack)
+ ;
- // CHECK: omptarget device 0 info: tofrom(unknown)[4] is not used and will not be copied
- #pragma omp target map(tofrom: DataStack)
- ;
+// CHECK: omptarget device 0 info: tofrom(unknown)[4] is not used and will not
+// be copied
+#pragma omp target map(tofrom : DataStack)
+ ;
- int Size = 16;
- double *Data = new double[Size];
+ int Size = 16;
+ double *Data = new double[Size];
- // CHECK-NOT: omptarget device 0 info: tofrom(unknown)[8] is not used and will not be copied
- #pragma omp target map(tofrom: Data[0:1])
- {
- Data[0] = 1;
- }
+// CHECK-NOT: omptarget device 0 info: tofrom(unknown)[8] is not used and will
+// not be copied
+#pragma omp target map(tofrom : Data[0 : 1])
+ {
+ Data[0] = 1;
+ }
- // CHECK-NOT: omptarget device 0 info: tofrom(unknown)[16] is not used and will not be copied
- #pragma omp target map(always tofrom: Data[0:2])
- ;
+// CHECK-NOT: omptarget device 0 info: tofrom(unknown)[16] is not used and will
+// not be copied
+#pragma omp target map(always tofrom : Data[0 : 2])
+ ;
- // CHECK: omptarget device 0 info: from(unknown)[24] is not used and will not be copied
- #pragma omp target map(from: Data[0:3])
- ;
+// CHECK: omptarget device 0 info: from(unknown)[24] is not used and will not be
+// copied
+#pragma omp target map(from : Data[0 : 3])
+ ;
- // CHECK: omptarget device 0 info: to(unknown)[24] is not used and will not be copied
- #pragma omp target map(to: Data[0:3])
- ;
+// CHECK: omptarget device 0 info: to(unknown)[24] is not used and will not be
+// copied
+#pragma omp target map(to : Data[0 : 3])
+ ;
- // CHECK: omptarget device 0 info: tofrom(unknown)[32] is not used and will not be copied
- #pragma omp target map(tofrom: Data[0:4])
- ;
+// CHECK: omptarget device 0 info: tofrom(unknown)[32] is not used and will not
+// be copied
+#pragma omp target map(tofrom : Data[0 : 4])
+ ;
- // CHECK-NOT: omptarget device 0 info: to(unknown)[40] is not used and will not be copied
- #pragma omp target map(to: Data[0:5])
- {
- #pragma omp teams
- Data[0] = 1;
- }
+// CHECK-NOT: omptarget device 0 info: to(unknown)[40] is not used and will not
+// be copied
+#pragma omp target map(to : Data[0 : 5])
+ {
+#pragma omp teams
+ Data[0] = 1;
+ }
- struct {
- double *Data;
- } Wrapper { .Data = Data };
+ struct {
+ double *Data;
+ } Wrapper{.Data = Data};
- // CHECK-NOT: omptarget device 0 info: tofrom(unknown)[48] is not used and will not be copied
- #pragma omp target map(tofrom: Wrapper.Data[0:6])
- {
- Wrapper.Data[0] = 1;
- }
+// CHECK-NOT: omptarget device 0 info: tofrom(unknown)[48] is not used and will
+// not be copied
+#pragma omp target map(tofrom : Wrapper.Data[0 : 6])
+ {
+ Wrapper.Data[0] = 1;
+ }
- // CHECK: omptarget device 0 info: unknown(unknown)[8] is not used and will not be copied
- // CHECK: omptarget device 0 info: tofrom(unknown)[56] is not used and will not be copied
- #pragma omp target map(tofrom: Wrapper.Data[0:7])
- ;
+// CHECK: omptarget device 0 info: unknown(unknown)[8] is not used and will not
+// be copied CHECK: omptarget device 0 info: tofrom(unknown)[56] is not used and
+// will not be copied
+#pragma omp target map(tofrom : Wrapper.Data[0 : 7])
+ ;
}
|
Thanks for your feedback. This latest iteration handles modifiers, and custom mappers. Tests are in place and are passing locally. It would be good to run the test pipeline now to confirm. |
I checked out the PR and used the CMake cache file from
I rebased onto current |
offload/src/omptarget.cpp
Outdated
(OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_FROM | | ||
OMP_TGT_MAPTYPE_TO)); | ||
|
||
bool IsExistingMapping = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to check these if we know we won't skip this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reordered the checks so as to minimise the number of times we call MappingInfo. Let me know if this is OK
offload/src/omptarget.cpp
Outdated
MappingInfoTy::HDTTMapAccessorTy HDTTMap = | ||
MappingInfo.HostDataToTargetMap.getExclusiveAccessor(); | ||
|
||
int64_t UnusedArgs = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a good way to check for a "LastWasUnused". Make it explicit. Also, do we have a test where the object is used but the pointee is not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not checking for "LastWasUnused". I'm checking for "AllAreUnused". Anyway, I made the check more explicit. Let me know if this is OK
offload/src/omptarget.cpp
Outdated
(OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_TO) || | ||
(ArgType & ~OMP_TGT_MAPTYPE_MEMBER_OF) == | ||
(OMP_TGT_MAPTYPE_PTR_AND_OBJ | OMP_TGT_MAPTYPE_FROM | | ||
OMP_TGT_MAPTYPE_TO)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make static helper for each of these with descriptive names, please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rearranged the checks to make them more explicit
offload/src/omptarget.cpp
Outdated
// Check for unused `map(buf[0:size])` mappings | ||
IsArgUnused |= ArgType == OMP_TGT_MAPTYPE_FROM || | ||
ArgType == OMP_TGT_MAPTYPE_TO || | ||
ArgType == (OMP_TGT_MAPTYPE_FROM | OMP_TGT_MAPTYPE_TO); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there no other flags allowed? It might be so, just checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All test cases that we discussed are supported with these checks. In principle there may be some flags that could be ignored but I wasn't able to produce a test case to check for them.
Was this closed accidentally? |
Yes I will sort it out now |
I have another question: The title says that it skips unused data copies, but a lot of the info output talks about redundant copies. Can you help me understand what's that distinction and how this patch determines if data is unused? |
I have the same question. W/o compiler passing the information to the runtime, how can we determine if data is used or not. |
The details vary slightly depending on the mapping, but the bottom line is that unused mappings can be detected just by looking at the For example, "freestanding buffers", i.e. |
This commit skips copying of buffers that aren't used by the kernel.
Context
Currently kernel-mapped data is always copied to and/or from device even if it's not used by the kernel. E.g.
This change aims to detect such unused mappings and act on them by skipping memory transfers.
Approach
There are multiple ways of doing this. I aimed for my approach to be as non-invasive as possible. Currently, the optimisation is applied regardless of use of custom mappers. We would probably want to skip this optimisation when custom mappers are present to be on the safe side. I will add this handling as soon as I get a green light that this approach is acceptable.
Tests
Currently there are no tests for this. I'm happy to (at least try to) add tests as soon as we agree on the technical approach.