Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenMP] Fix runtime problem due to wrong map size. #74692

Merged
merged 2 commits into from
Dec 7, 2023

Conversation

jyu2-git
Copy link
Contributor

@jyu2-git jyu2-git commented Dec 7, 2023

Currently we are missing set up-boundary address for FinalArraySection as highests elements in partial struct data.

Currently for:
#pragma omp target map(D.a) map(D.b[:2])
The size is:
%a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
%b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
%2 = getelementptr float, ptr %arrayidx, i32 1
%3 = ptrtoint ptr %2 to i64
%4 = ptrtoint ptr %a to i64
%5 = sub i64 %3, %4
%6 = sdiv exact i64 %5, ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64)

Where %2 is wrong for (D.b[:2]) is pointer to first element of array section. It should pointe to last element of array section.

The fix is to emit the pointer to the last element of array section and use this pointer as the highest element in partial struct data.

After change IR:
%a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
%b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
%b1 = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx2 = getelementptr inbounds [2 x float], ptr %b1, i64 0, i64 1
%1 = getelementptr float, ptr %arrayidx2, i32 1
%2 = ptrtoint ptr %1 to i64
%3 = ptrtoint ptr %a to i64
%4 = sub i64 %2, %3
%5 = sdiv exact i64 %4, ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64)

Currently we are missing set up-boundary address for FinalArraySection
as highests elements in partial struct data.

Currently for:
\#pragma omp target map(D.a) map(D.b[:2])
The size is:
  %a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
  %b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
  %arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
  %2 = getelementptr float, ptr %arrayidx, i32 1
  %3 = ptrtoint ptr %2 to i64
  %4 = ptrtoint ptr %a to i64
  %5 = sub i64 %3, %4
  %6 = sdiv exact i64 %5, ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64)

Where %arrayidx is wrong for (D.b[:2]) should be:
  %arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 1

The fix is to emit the pointer to the last element of array section and
use this pointer as the highest element in partial struct data.

After change IR:
  %a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
  %b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
  %arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
  %b1 = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
  %arrayidx2 = getelementptr inbounds [2 x float], ptr %b1, i64 0, i64 1
  %1 = getelementptr float, ptr %arrayidx2, i32 1
  %2 = ptrtoint ptr %1 to i64
  %3 = ptrtoint ptr %a to i64
  %4 = sub i64 %2, %3
  %5 = sdiv exact i64 %4, ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64)
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen clang:openmp OpenMP related changes to Clang openmp:libomptarget OpenMP offload runtime labels Dec 7, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Dec 7, 2023

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: None (jyu2-git)

Changes

Currently we are missing set up-boundary address for FinalArraySection as highests elements in partial struct data.

Currently for:
#pragma omp target map(D.a) map(D.b[:2])
The size is:
%a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
%b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
%2 = getelementptr float, ptr %arrayidx, i32 1
%3 = ptrtoint ptr %2 to i64
%4 = ptrtoint ptr %a to i64
%5 = sub i64 %3, %4
%6 = sdiv exact i64 %5, ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64)

Where %arrayidx is wrong for (D.b[:2]) should be:
%arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 1

The fix is to emit the pointer to the last element of array section and use this pointer as the highest element in partial struct data.

After change IR:
%a = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 0
%b = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx = getelementptr inbounds [2 x float], ptr %b, i64 0, i64 0
%b1 = getelementptr inbounds %struct.DataTy, ptr %D, i32 0, i32 1
%arrayidx2 = getelementptr inbounds [2 x float], ptr %b1, i64 0, i64 1
%1 = getelementptr float, ptr %arrayidx2, i32 1
%2 = ptrtoint ptr %1 to i64
%3 = ptrtoint ptr %a to i64
%4 = sub i64 %2, %3
%5 = sdiv exact i64 %4, ptrtoint (ptr getelementptr (i8, ptr null, i32 1) to i64)


Full diff: https://github.com/llvm/llvm-project/pull/74692.diff

3 Files Affected:

  • (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+8-1)
  • (modified) clang/test/OpenMP/target_data_use_device_addr_codegen.cpp (+7-1)
  • (added) openmp/libomptarget/test/offloading/target_map_for_member_data.cpp (+23)
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 55648963df36a..7f7e6f5306664 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7389,7 +7389,14 @@ class MappableExprsHandler {
           } else if (FieldIndex < PartialStruct.LowestElem.first) {
             PartialStruct.LowestElem = {FieldIndex, LowestElem};
           } else if (FieldIndex > PartialStruct.HighestElem.first) {
-            PartialStruct.HighestElem = {FieldIndex, LowestElem};
+            if (IsFinalArraySection) {
+              Address HB =
+                  CGF.EmitOMPArraySectionExpr(OASE, /*IsLowerBound=*/false)
+                      .getAddress(CGF);
+              PartialStruct.HighestElem = {FieldIndex, HB};
+            } else {
+              PartialStruct.HighestElem = {FieldIndex, LowestElem};
+            }
           }
         }
 
diff --git a/clang/test/OpenMP/target_data_use_device_addr_codegen.cpp b/clang/test/OpenMP/target_data_use_device_addr_codegen.cpp
index 7e70cdf74ad37..ae0653d0585d4 100644
--- a/clang/test/OpenMP/target_data_use_device_addr_codegen.cpp
+++ b/clang/test/OpenMP/target_data_use_device_addr_codegen.cpp
@@ -144,7 +144,13 @@ int main() {
 // CHECK: [[ARR_IDX6:%.+]] = getelementptr inbounds [4 x i32], ptr [[ARR_ADDR]], i64 0, i64 0
 // CHECK: [[A_ADDR2:%.+]] = getelementptr inbounds %struct.S, ptr [[THIS]], i32 0, i32 0
 // CHECK: [[P4:%.+]] = mul nuw i64 [[CONV:%.+]], 4
-// CHECK: [[ARR_END:%.+]] = getelementptr i32, ptr [[ARR_IDX6]], i32 1
+// CHECK: [[A_ADDR3:%.+]] = getelementptr inbounds %struct.S, ptr [[THIS]], i32 0, i32 0
+// CHECK: [[L5:%.+]] = load i32, ptr [[A_ADDR3]]
+// CHECK: [[L6:%.+]] = sext i32 [[L5]] to i64
+// CHECK: [[LB_ADD_LEN:%lb_add_len]] = add nsw i64 -1, [[L6]]
+// CHECK: [[ARR_ADDR9:%.+]] = getelementptr inbounds %struct.S, ptr [[THIS]], i32 0, i32 3
+// CHECK: [[ARR_IDX10:%arrayidx.+]] = getelementptr inbounds [4 x i32], ptr [[ARR_ADDR9]], i64 0, i64 %lb_add_len
+// CHECK: [[ARR_END:%.+]] = getelementptr i32, ptr [[ARR_IDX10]], i32 1
 // CHECK: [[E:%.+]] = ptrtoint ptr [[ARR_END]] to i64
 // CHECK: [[B:%.+]] = ptrtoint ptr [[A_ADDR]] to i64
 // CHECK: [[DIFF:%.+]] = sub i64 [[E]], [[B]]
diff --git a/openmp/libomptarget/test/offloading/target_map_for_member_data.cpp b/openmp/libomptarget/test/offloading/target_map_for_member_data.cpp
new file mode 100644
index 0000000000000..8c8b4668c32e6
--- /dev/null
+++ b/openmp/libomptarget/test/offloading/target_map_for_member_data.cpp
@@ -0,0 +1,23 @@
+// clang-format off
+// RUN: %libomptarget-compilexx-generic && env LIBOMPTARGET_DEBUG=1 %libomptarget-run-generic 2>&1 | %fcheck-generic
+// clang-format on
+
+struct DataTy {
+  float a;
+  float b[2];
+};
+
+int main(int argc, char **argv) {
+  DataTy D;
+#pragma omp target map(D.a) map(D.b[ : 2])
+  {
+    D.a = 0;
+    D.b[0] = 1;
+  }
+  return 0;
+}
+// clang-format off
+// CHECK: omptarget --> Entry  0: Base=[[DAT_HST_PTR_BASE:0x.*]], Begin=[[DAT_HST_PTR_BASE]], Size=12
+// CHECK: omptarget --> Entry  1: Base=[[DAT_HST_PTR_BASE]], Begin=[[DAT_HST_PTR_BASE]], Size=4,
+// CHECK: omptarget --> Entry  2: Base=[[DAT_HST_PTR_BASE]], Begin=[[DATUM_HST_PTR_BASE:0x.*]], Size=8,
+// clang-format on


struct DataTy {
float a;
float b[2];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you change the size of the array to, say, 3 instead of 2? Required to check that we map the requested array section, not the whole array.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I changed that.

Copy link
Member

@alexey-bataev alexey-bataev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the description is not quite correct. Looks like %arrayidx is correct, but %2 is incorrect.

@jyu2-git
Copy link
Contributor Author

jyu2-git commented Dec 7, 2023

I think the description is not quite correct. Looks like %arrayidx is correct, but %2 is incorrect.

I changed description using %2 pointer to last element of array section. Thanks.

@jyu2-git jyu2-git changed the title [OpenMP] Fix runtime problem due wrong map size. [OpenMP] Fix runtime problem due to wrong map size. Dec 7, 2023
@jyu2-git jyu2-git merged commit 0113722 into llvm:main Dec 7, 2023
4 checks passed
@jplehr
Copy link
Contributor

jplehr commented Dec 7, 2023

It appears that this breaks the AMDGPU OpenMP Offload buildbot: https://lab.llvm.org/buildbot/#/builders/193/builds/43297

@jplehr
Copy link
Contributor

jplehr commented Dec 7, 2023

I'm looking into it locally

@jplehr
Copy link
Contributor

jplehr commented Dec 7, 2023

The issue comes from the bot building without setting CMake option -DLIBOMPTARGET_ENABLE_DEBUG=ON. This makes the environment variable LIBOMPTARGET_DEBUG=1 have no effect in the test, i.e., no output for FileCheck to test.

@jyu2-git
Copy link
Contributor Author

jyu2-git commented Dec 7, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category openmp:libomptarget OpenMP offload runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants