Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[OpenMP] Increment kernel args version, used by runtime for detecting dyn_ptr. #85363

Merged
merged 3 commits into from
Mar 19, 2024

Conversation

dhruvachak
Copy link
Contributor

A kernel implicit parameter (dyn_ptr) was introduced some time back. This patch increments the kernel args version for a compiler supporting dyn_ptr. The version will be used by the runtime to determine whether the implicit parameter is generated by the compiler. The versioning is required to support use cases where code generated by an older compiler is linked with a newer runtime.

If approved, this patch should be backported to release 18.

… dyn_ptr.

A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args version for a compiler supporting
dyn_ptr. The version will be used by the runtime to determine whether
the implicit parameter is generated by the compiler. The versioning is
required to support use cases where code generated by an older compiler
is linked with a newer runtime.

If approved, this patch should be backported to release 18.
@llvmbot llvmbot added clang Clang issues not falling into any other category flang:openmp clang:openmp OpenMP related changes to Clang openmp:libomptarget OpenMP offload runtime labels Mar 15, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 15, 2024

@llvm/pr-subscribers-flang-openmp

Author: None (dhruvachak)

Changes

A kernel implicit parameter (dyn_ptr) was introduced some time back. This patch increments the kernel args version for a compiler supporting dyn_ptr. The version will be used by the runtime to determine whether the implicit parameter is generated by the compiler. The versioning is required to support use cases where code generated by an older compiler is linked with a newer runtime.

If approved, this patch should be backported to release 18.


Patch is 1.13 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85363.diff

139 Files Affected:

  • (modified) clang/test/OpenMP/bug60602.cpp (+2-2)
  • (modified) clang/test/OpenMP/distribute_codegen.cpp (+10-10)
  • (modified) clang/test/OpenMP/distribute_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp (+8-8)
  • (modified) clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp (+24-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp (+32-32)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_num_threads_codegen.cpp (+24-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/distribute_private_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/distribute_simd_codegen.cpp (+20-20)
  • (modified) clang/test/OpenMP/distribute_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_simd_private_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/distribute_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/map_struct_ordering.cpp (+1-1)
  • (modified) clang/test/OpenMP/reduction_implicit_map.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_codegen_global_capture.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_firstprivate_codegen.cpp (+24-24)
  • (modified) clang/test/OpenMP/target_has_device_addr_codegen.cpp (+15-15)
  • (modified) clang/test/OpenMP/target_has_device_addr_codegen_01.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_is_device_ptr_codegen.cpp (+44-44)
  • (modified) clang/test/OpenMP/target_map_codegen_03.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_map_codegen_hold.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_map_deref_array_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_map_member_expr_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_offload_mandatory_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_ompx_dyn_cgroup_mem_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_parallel_codegen.cpp (+14-14)
  • (modified) clang/test/OpenMP/target_parallel_for_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/target_parallel_for_simd_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/target_parallel_generic_loop_codegen-1.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_parallel_generic_loop_codegen-2.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_parallel_generic_loop_uses_allocators_codegen.cpp (+1-1)
  • (modified) clang/test/OpenMP/target_parallel_if_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_parallel_num_threads_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_task_affinity_codegen.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_teams_codegen.cpp (+22-22)
  • (modified) clang/test/OpenMP/target_teams_distribute_codegen.cpp (+14-14)
  • (modified) clang/test/OpenMP/target_teams_distribute_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_distribute_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_schedule_codegen.cpp (+60-60)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp (+24-24)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_schedule_codegen.cpp (+60-60)
  • (modified) clang/test/OpenMP/target_teams_distribute_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_reduction_codegen.cpp (+40-40)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_codegen-1.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp (+5-5)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_uses_allocators_codegen.cpp (+1-1)
  • (modified) clang/test/OpenMP/target_teams_map_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_num_teams_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_teams_thread_limit_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_codegen.cpp (+20-20)
  • (modified) clang/test/OpenMP/teams_distribute_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_distribute_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_copyin_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp (+8-8)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_num_threads_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_schedule_codegen.cpp (+60-60)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp (+32-32)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_num_threads_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp (+60-60)
  • (modified) clang/test/OpenMP/teams_distribute_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_simd_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_simd_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_distribute_simd_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_firstprivate_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_generic_loop_codegen-1.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_generic_loop_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_generic_loop_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_generic_loop_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_private_codegen.cpp (+10-10)
  • (modified) llvm/include/llvm/Frontend/OpenMP/OMPConstants.h (+4-1)
  • (modified) openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h (+2-2)
  • (modified) openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp (+13-8)
  • (modified) openmp/libomptarget/src/interface.cpp (+13-3)
  • (modified) openmp/libomptarget/src/omptarget.cpp (+2-1)
diff --git a/clang/test/OpenMP/bug60602.cpp b/clang/test/OpenMP/bug60602.cpp
index 3ecc70cab778a1..d3aed5d02c1d06 100644
--- a/clang/test/OpenMP/bug60602.cpp
+++ b/clang/test/OpenMP/bug60602.cpp
@@ -95,7 +95,7 @@ int kernel_within_loop(int *a, int *b, int N, int num_iters) {
 // CHECK-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [3 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [3 x i64], ptr [[DOTOFFLOAD_SIZES]], i32 0, i32 0
 // CHECK-NEXT:    [[TMP28:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK-NEXT:    store i32 2, ptr [[TMP28]], align 4
+// CHECK-NEXT:    store i32 3, ptr [[TMP28]], align 4
 // CHECK-NEXT:    [[TMP29:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK-NEXT:    store i32 3, ptr [[TMP29]], align 4
 // CHECK-NEXT:    [[TMP30:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -181,7 +181,7 @@ int kernel_within_loop(int *a, int *b, int N, int num_iters) {
 // CHECK-NEXT:    [[ADD:%.*]] = add i32 [[TMP71]], 1
 // CHECK-NEXT:    [[TMP72:%.*]] = zext i32 [[ADD]] to i64
 // CHECK-NEXT:    [[TMP73:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 0
-// CHECK-NEXT:    store i32 2, ptr [[TMP73]], align 4
+// CHECK-NEXT:    store i32 3, ptr [[TMP73]], align 4
 // CHECK-NEXT:    [[TMP74:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 1
 // CHECK-NEXT:    store i32 3, ptr [[TMP74]], align 4
 // CHECK-NEXT:    [[TMP75:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 2
diff --git a/clang/test/OpenMP/distribute_codegen.cpp b/clang/test/OpenMP/distribute_codegen.cpp
index 31ec6ff911905e..34d14c89fedaed 100644
--- a/clang/test/OpenMP/distribute_codegen.cpp
+++ b/clang/test/OpenMP/distribute_codegen.cpp
@@ -163,7 +163,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK1-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK1-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -354,7 +354,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK1-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK1-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -545,7 +545,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK1-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK1-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -744,7 +744,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[ADD4:%.*]] = add nsw i32 [[TMP9]], 1
 // CHECK1-NEXT:    [[TMP10:%.*]] = zext i32 [[ADD4]] to i64
 // CHECK1-NEXT:    [[TMP11:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP11]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP11]], align 4
 // CHECK1-NEXT:    [[TMP12:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 1, ptr [[TMP12]], align 4
 // CHECK1-NEXT:    [[TMP13:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -911,7 +911,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[TMP5:%.*]] = getelementptr inbounds [1 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP6:%.*]] = getelementptr inbounds [1 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP7:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP7]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP7]], align 4
 // CHECK1-NEXT:    [[TMP8:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 1, ptr [[TMP8]], align 4
 // CHECK1-NEXT:    [[TMP9:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1084,7 +1084,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK3-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK3-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1271,7 +1271,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK3-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK3-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1458,7 +1458,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK3-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK3-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1653,7 +1653,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[ADD4:%.*]] = add nsw i32 [[TMP9]], 1
 // CHECK3-NEXT:    [[TMP10:%.*]] = zext i32 [[ADD4]] to i64
 // CHECK3-NEXT:    [[TMP11:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP11]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP11]], align 4
 // CHECK3-NEXT:    [[TMP12:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 1, ptr [[TMP12]], align 4
 // CHECK3-NEXT:    [[TMP13:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1820,7 +1820,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[TMP5:%.*]] = getelementptr inbounds [1 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP6:%.*]] = getelementptr inbounds [1 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP7:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP7]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP7]], align 4
 // CHECK3-NEXT:    [[TMP8:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 1, ptr [[TMP8]], align 4
 // CHECK3-NEXT:    [[TMP9:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
diff --git a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
index 800a002e439680..567d12119db218 100644
--- a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
@@ -542,7 +542,7 @@ int main() {
 // CHECK9-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP25]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP25]], align 4
 // CHECK9-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 5, ptr [[TMP26]], align 4
 // CHECK9-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -838,7 +838,7 @@ int main() {
 // CHECK9-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP20]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP20]], align 4
 // CHECK9-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP21]], align 4
 // CHECK9-NEXT:    [[TMP22:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1207,7 +1207,7 @@ int main() {
 // CHECK11-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK11-NEXT:    store i32 2, ptr [[TMP25]], align 4
+// CHECK11-NEXT:    store i32 3, ptr [[TMP25]], align 4
 // CHECK11-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK11-NEXT:    store i32 5, ptr [[TMP26]], align 4
 // CHECK11-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1501,7 +1501,7 @@ int main() {
 // CHECK11-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK11-NEXT:    store i32 2, ptr [[TMP20]], align 4
+// CHECK11-NEXT:    store i32 3, ptr [[TMP20]], align 4
 // CHECK11-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK11-NEXT:    store i32 4, ptr [[TMP21]], align 4
 // CHECK11-NEXT:    [[TMP22:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
diff --git a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
index 772372076e9477..5d1a12d27145e0 100644
--- a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
@@ -527,7 +527,7 @@ int main() {
 // CHECK9-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP25]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP25]], align 4
 // CHECK9-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 5, ptr [[TMP26]], align 4
 // CHECK9-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -841,7 +841,7 @@ int main() {
 // CHECK9-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP20]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP20]], align 4
 // CHECK9-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP21]], align 4
 // CHECK9-NEXT:    [[TMP22:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1229,7 +1229,7 @@ int main() {
 // CHECK11-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK11-NEXT:    store i32 2, ptr [[TMP25]], align 4
+// CHECK11-NEXT:    store i32 3, ptr [[TMP25]], align 4
 // CHECK11-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK11-NEXT:    store i32 5, ptr [[TMP26]], align 4
 // CHECK11-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1541,7 +1541,7 @@ int main() {
 // CHECK11-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK11-NEXT:    store i32 2, ptr [[TMP20]], align 4
+// CHECK11-NEXT:    store i32 3, ptr [[TMP20]], align 4
 // CHECK11-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK11-NEXT:    store i32 4, ptr [[TMP21]], align 4
 // CHECK11-NEXT:    [[TMP22:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
diff --git a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
index 95adefa8020f62..d5d37725988bfa 100644
--- a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
@@ -4372,7 +4372,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD:%.*]] = add nsw i32 [[TMP21]], 1
 // CHECK9-NEXT:    [[TMP22:%.*]] = zext i32 [[ADD]] to i64
 // CHECK9-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP23]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP23]], align 4
 // CHECK9-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP24]], align 4
 // CHECK9-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -4447,7 +4447,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD13:%.*]] = add nsw i32 [[TMP59]], 1
 // CHECK9-NEXT:    [[TMP60:%.*]] = zext i32 [[ADD13]] to i64
 // CHECK9-NEXT:    [[TMP61:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP61]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP61]], align 4
 // CHECK9-NEXT:    [[TMP62:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP62]], align 4
 // CHECK9-NEXT:    [[TMP63:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 2
@@ -4531,7 +4531,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD27:%.*]] = add nsw i32 [[TMP102]], 1
 // CHECK9-NEXT:    [[TMP103:%.*]] = zext i32 [[ADD27]] to i64
 // CHECK9-NEXT:    [[TMP104:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS28]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP104]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP104]], align 4
 // CHECK9-NEXT:    [[TMP105:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS28]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 5, ptr [[TMP105]], align 4
 // CHECK9-NEXT:    [[TMP106:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS28]], i32 0, i32 2
@@ -4606,7 +4606,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD41:%.*]] = add nsw i32 [[TMP140]], 1
 // CHECK9-NEXT:    [[TMP141:%.*]] = zext i32 [[ADD41]] to i64
 // CHECK9-NEXT:    [[TMP142:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS42]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP142]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP142]], align 4
 // CHECK9-NEXT:    [[TMP143:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS42]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP143]], align 4
 // CHECK9-NEXT:    [[TMP144:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS42]], i32 0, i32 2
@@ -4690,7 +4690,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD56:%.*]] = add nsw i32 [[TMP183]], 1
 /...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 15, 2024

@llvm/pr-subscribers-clang

Author: None (dhruvachak)

Changes

A kernel implicit parameter (dyn_ptr) was introduced some time back. This patch increments the kernel args version for a compiler supporting dyn_ptr. The version will be used by the runtime to determine whether the implicit parameter is generated by the compiler. The versioning is required to support use cases where code generated by an older compiler is linked with a newer runtime.

If approved, this patch should be backported to release 18.


Patch is 1.13 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85363.diff

139 Files Affected:

  • (modified) clang/test/OpenMP/bug60602.cpp (+2-2)
  • (modified) clang/test/OpenMP/distribute_codegen.cpp (+10-10)
  • (modified) clang/test/OpenMP/distribute_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_if_codegen.cpp (+8-8)
  • (modified) clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_num_threads_codegen.cpp (+24-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_if_codegen.cpp (+32-32)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_num_threads_codegen.cpp (+24-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/distribute_private_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/distribute_simd_codegen.cpp (+20-20)
  • (modified) clang/test/OpenMP/distribute_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/distribute_simd_private_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/distribute_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/map_struct_ordering.cpp (+1-1)
  • (modified) clang/test/OpenMP/reduction_implicit_map.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_codegen_global_capture.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_firstprivate_codegen.cpp (+24-24)
  • (modified) clang/test/OpenMP/target_has_device_addr_codegen.cpp (+15-15)
  • (modified) clang/test/OpenMP/target_has_device_addr_codegen_01.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_is_device_ptr_codegen.cpp (+44-44)
  • (modified) clang/test/OpenMP/target_map_codegen_03.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_map_codegen_hold.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_map_deref_array_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_map_member_expr_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_offload_mandatory_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_ompx_dyn_cgroup_mem_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_parallel_codegen.cpp (+14-14)
  • (modified) clang/test/OpenMP/target_parallel_for_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/target_parallel_for_simd_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/target_parallel_generic_loop_codegen-1.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_parallel_generic_loop_codegen-2.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_parallel_generic_loop_uses_allocators_codegen.cpp (+1-1)
  • (modified) clang/test/OpenMP/target_parallel_if_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_parallel_num_threads_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_task_affinity_codegen.cpp (+2-2)
  • (modified) clang/test/OpenMP/target_teams_codegen.cpp (+22-22)
  • (modified) clang/test/OpenMP/target_teams_distribute_codegen.cpp (+14-14)
  • (modified) clang/test/OpenMP/target_teams_distribute_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_distribute_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_schedule_codegen.cpp (+60-60)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp (+24-24)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_schedule_codegen.cpp (+60-60)
  • (modified) clang/test/OpenMP/target_teams_distribute_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_reduction_codegen.cpp (+40-40)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_codegen.cpp (+28-28)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_codegen-1.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp (+5-5)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_uses_allocators_codegen.cpp (+1-1)
  • (modified) clang/test/OpenMP/target_teams_map_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/target_teams_num_teams_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/target_teams_thread_limit_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_codegen.cpp (+20-20)
  • (modified) clang/test/OpenMP/teams_distribute_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_distribute_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_copyin_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_if_codegen.cpp (+8-8)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_num_threads_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_schedule_codegen.cpp (+60-60)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp (+32-32)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_num_threads_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_proc_bind_codegen.cpp (+3-3)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp (+60-60)
  • (modified) clang/test/OpenMP/teams_distribute_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_simd_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_simd_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_distribute_simd_dist_schedule_codegen.cpp (+18-18)
  • (modified) clang/test/OpenMP/teams_distribute_simd_firstprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_simd_lastprivate_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_simd_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_distribute_simd_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_firstprivate_codegen.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_generic_loop_codegen-1.cpp (+12-12)
  • (modified) clang/test/OpenMP/teams_generic_loop_collapse_codegen.cpp (+6-6)
  • (modified) clang/test/OpenMP/teams_generic_loop_private_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_generic_loop_reduction_codegen.cpp (+4-4)
  • (modified) clang/test/OpenMP/teams_private_codegen.cpp (+10-10)
  • (modified) llvm/include/llvm/Frontend/OpenMP/OMPConstants.h (+4-1)
  • (modified) openmp/libomptarget/plugins-nextgen/common/include/PluginInterface.h (+2-2)
  • (modified) openmp/libomptarget/plugins-nextgen/common/src/PluginInterface.cpp (+13-8)
  • (modified) openmp/libomptarget/src/interface.cpp (+13-3)
  • (modified) openmp/libomptarget/src/omptarget.cpp (+2-1)
diff --git a/clang/test/OpenMP/bug60602.cpp b/clang/test/OpenMP/bug60602.cpp
index 3ecc70cab778a1..d3aed5d02c1d06 100644
--- a/clang/test/OpenMP/bug60602.cpp
+++ b/clang/test/OpenMP/bug60602.cpp
@@ -95,7 +95,7 @@ int kernel_within_loop(int *a, int *b, int N, int num_iters) {
 // CHECK-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [3 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [3 x i64], ptr [[DOTOFFLOAD_SIZES]], i32 0, i32 0
 // CHECK-NEXT:    [[TMP28:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK-NEXT:    store i32 2, ptr [[TMP28]], align 4
+// CHECK-NEXT:    store i32 3, ptr [[TMP28]], align 4
 // CHECK-NEXT:    [[TMP29:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK-NEXT:    store i32 3, ptr [[TMP29]], align 4
 // CHECK-NEXT:    [[TMP30:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -181,7 +181,7 @@ int kernel_within_loop(int *a, int *b, int N, int num_iters) {
 // CHECK-NEXT:    [[ADD:%.*]] = add i32 [[TMP71]], 1
 // CHECK-NEXT:    [[TMP72:%.*]] = zext i32 [[ADD]] to i64
 // CHECK-NEXT:    [[TMP73:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 0
-// CHECK-NEXT:    store i32 2, ptr [[TMP73]], align 4
+// CHECK-NEXT:    store i32 3, ptr [[TMP73]], align 4
 // CHECK-NEXT:    [[TMP74:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 1
 // CHECK-NEXT:    store i32 3, ptr [[TMP74]], align 4
 // CHECK-NEXT:    [[TMP75:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 2
diff --git a/clang/test/OpenMP/distribute_codegen.cpp b/clang/test/OpenMP/distribute_codegen.cpp
index 31ec6ff911905e..34d14c89fedaed 100644
--- a/clang/test/OpenMP/distribute_codegen.cpp
+++ b/clang/test/OpenMP/distribute_codegen.cpp
@@ -163,7 +163,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK1-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK1-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -354,7 +354,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK1-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK1-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -545,7 +545,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK1-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK1-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -744,7 +744,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[ADD4:%.*]] = add nsw i32 [[TMP9]], 1
 // CHECK1-NEXT:    [[TMP10:%.*]] = zext i32 [[ADD4]] to i64
 // CHECK1-NEXT:    [[TMP11:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP11]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP11]], align 4
 // CHECK1-NEXT:    [[TMP12:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 1, ptr [[TMP12]], align 4
 // CHECK1-NEXT:    [[TMP13:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -911,7 +911,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK1-NEXT:    [[TMP5:%.*]] = getelementptr inbounds [1 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP6:%.*]] = getelementptr inbounds [1 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK1-NEXT:    [[TMP7:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK1-NEXT:    store i32 2, ptr [[TMP7]], align 4
+// CHECK1-NEXT:    store i32 3, ptr [[TMP7]], align 4
 // CHECK1-NEXT:    [[TMP8:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK1-NEXT:    store i32 1, ptr [[TMP8]], align 4
 // CHECK1-NEXT:    [[TMP9:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1084,7 +1084,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK3-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK3-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1271,7 +1271,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK3-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK3-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1458,7 +1458,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[TMP16:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP17:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP18]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP18]], align 4
 // CHECK3-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 4, ptr [[TMP19]], align 4
 // CHECK3-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1653,7 +1653,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[ADD4:%.*]] = add nsw i32 [[TMP9]], 1
 // CHECK3-NEXT:    [[TMP10:%.*]] = zext i32 [[ADD4]] to i64
 // CHECK3-NEXT:    [[TMP11:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP11]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP11]], align 4
 // CHECK3-NEXT:    [[TMP12:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 1, ptr [[TMP12]], align 4
 // CHECK3-NEXT:    [[TMP13:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1820,7 +1820,7 @@ int fint(void) { return ftemplate<int>(); }
 // CHECK3-NEXT:    [[TMP5:%.*]] = getelementptr inbounds [1 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP6:%.*]] = getelementptr inbounds [1 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK3-NEXT:    [[TMP7:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK3-NEXT:    store i32 2, ptr [[TMP7]], align 4
+// CHECK3-NEXT:    store i32 3, ptr [[TMP7]], align 4
 // CHECK3-NEXT:    [[TMP8:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK3-NEXT:    store i32 1, ptr [[TMP8]], align 4
 // CHECK3-NEXT:    [[TMP9:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
diff --git a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
index 800a002e439680..567d12119db218 100644
--- a/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_firstprivate_codegen.cpp
@@ -542,7 +542,7 @@ int main() {
 // CHECK9-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP25]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP25]], align 4
 // CHECK9-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 5, ptr [[TMP26]], align 4
 // CHECK9-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -838,7 +838,7 @@ int main() {
 // CHECK9-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP20]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP20]], align 4
 // CHECK9-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP21]], align 4
 // CHECK9-NEXT:    [[TMP22:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1207,7 +1207,7 @@ int main() {
 // CHECK11-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK11-NEXT:    store i32 2, ptr [[TMP25]], align 4
+// CHECK11-NEXT:    store i32 3, ptr [[TMP25]], align 4
 // CHECK11-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK11-NEXT:    store i32 5, ptr [[TMP26]], align 4
 // CHECK11-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1501,7 +1501,7 @@ int main() {
 // CHECK11-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK11-NEXT:    store i32 2, ptr [[TMP20]], align 4
+// CHECK11-NEXT:    store i32 3, ptr [[TMP20]], align 4
 // CHECK11-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK11-NEXT:    store i32 4, ptr [[TMP21]], align 4
 // CHECK11-NEXT:    [[TMP22:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
diff --git a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
index 772372076e9477..5d1a12d27145e0 100644
--- a/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
+++ b/clang/test/OpenMP/distribute_lastprivate_codegen.cpp
@@ -527,7 +527,7 @@ int main() {
 // CHECK9-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP25]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP25]], align 4
 // CHECK9-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 5, ptr [[TMP26]], align 4
 // CHECK9-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -841,7 +841,7 @@ int main() {
 // CHECK9-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK9-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP20]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP20]], align 4
 // CHECK9-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP21]], align 4
 // CHECK9-NEXT:    [[TMP22:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1229,7 +1229,7 @@ int main() {
 // CHECK11-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [5 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK11-NEXT:    store i32 2, ptr [[TMP25]], align 4
+// CHECK11-NEXT:    store i32 3, ptr [[TMP25]], align 4
 // CHECK11-NEXT:    [[TMP26:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK11-NEXT:    store i32 5, ptr [[TMP26]], align 4
 // CHECK11-NEXT:    [[TMP27:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -1541,7 +1541,7 @@ int main() {
 // CHECK11-NEXT:    [[TMP18:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_BASEPTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP19:%.*]] = getelementptr inbounds [4 x ptr], ptr [[DOTOFFLOAD_PTRS]], i32 0, i32 0
 // CHECK11-NEXT:    [[TMP20:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK11-NEXT:    store i32 2, ptr [[TMP20]], align 4
+// CHECK11-NEXT:    store i32 3, ptr [[TMP20]], align 4
 // CHECK11-NEXT:    [[TMP21:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK11-NEXT:    store i32 4, ptr [[TMP21]], align 4
 // CHECK11-NEXT:    [[TMP22:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
diff --git a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
index 95adefa8020f62..d5d37725988bfa 100644
--- a/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
+++ b/clang/test/OpenMP/distribute_parallel_for_codegen.cpp
@@ -4372,7 +4372,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD:%.*]] = add nsw i32 [[TMP21]], 1
 // CHECK9-NEXT:    [[TMP22:%.*]] = zext i32 [[ADD]] to i64
 // CHECK9-NEXT:    [[TMP23:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP23]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP23]], align 4
 // CHECK9-NEXT:    [[TMP24:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP24]], align 4
 // CHECK9-NEXT:    [[TMP25:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS]], i32 0, i32 2
@@ -4447,7 +4447,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD13:%.*]] = add nsw i32 [[TMP59]], 1
 // CHECK9-NEXT:    [[TMP60:%.*]] = zext i32 [[ADD13]] to i64
 // CHECK9-NEXT:    [[TMP61:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP61]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP61]], align 4
 // CHECK9-NEXT:    [[TMP62:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP62]], align 4
 // CHECK9-NEXT:    [[TMP63:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS14]], i32 0, i32 2
@@ -4531,7 +4531,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD27:%.*]] = add nsw i32 [[TMP102]], 1
 // CHECK9-NEXT:    [[TMP103:%.*]] = zext i32 [[ADD27]] to i64
 // CHECK9-NEXT:    [[TMP104:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS28]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP104]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP104]], align 4
 // CHECK9-NEXT:    [[TMP105:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS28]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 5, ptr [[TMP105]], align 4
 // CHECK9-NEXT:    [[TMP106:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS28]], i32 0, i32 2
@@ -4606,7 +4606,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD41:%.*]] = add nsw i32 [[TMP140]], 1
 // CHECK9-NEXT:    [[TMP141:%.*]] = zext i32 [[ADD41]] to i64
 // CHECK9-NEXT:    [[TMP142:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS42]], i32 0, i32 0
-// CHECK9-NEXT:    store i32 2, ptr [[TMP142]], align 4
+// CHECK9-NEXT:    store i32 3, ptr [[TMP142]], align 4
 // CHECK9-NEXT:    [[TMP143:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS42]], i32 0, i32 1
 // CHECK9-NEXT:    store i32 4, ptr [[TMP143]], align 4
 // CHECK9-NEXT:    [[TMP144:%.*]] = getelementptr inbounds [[STRUCT___TGT_KERNEL_ARGUMENTS]], ptr [[KERNEL_ARGS42]], i32 0, i32 2
@@ -4690,7 +4690,7 @@ int main() {
 // CHECK9-NEXT:    [[ADD56:%.*]] = add nsw i32 [[TMP183]], 1
 /...
[truncated]

@nikic nikic removed their request for review March 15, 2024 08:04
@jdoerfert
Copy link
Member

The versioning is required to support use cases where code generated by an older compiler is linked with a newer runtime.

Is that supported?

@dhruvachak
Copy link
Contributor Author

The versioning is required to support use cases where code generated by an older compiler is linked with a newer runtime.

Is that supported?

I think compatibility across released versions is not supported in upstream LLVM. But downstream, there are use cases. For example, classic-flang that does not generate the implicit parameter. While we maintained this change downstream for some time, it would be very helpful for maintenance if this could be merged upstream. The change in this patch is fairly simple and non-intrusive.

@jeffreysandoval
Copy link

The versioning is required to support use cases where code generated by an older compiler is linked with a newer runtime.

Is that supported?

I think compatibility across released versions is not supported in upstream LLVM. But downstream, there are use cases. For example, classic-flang that does not generate the implicit parameter. While we maintained this change downstream for some time, it would be very helpful for maintenance if this could be merged upstream. The change in this patch is fairly simple and non-intrusive.

We're hitting this with HPE's compiler/library, too, because adding the extra kernel argument without explicitly communicating it to libomptarget breaks the compiler-library __tgt ABI. I think a version increment is expected for a change like this. It's the kind of change that may not be a problem for long, assuming all LLVM-based compilers eventually follow the new policy of adding the extra argument. But, until that happens the transition can be very difficult for users that mix a variety of compilers. E.g., right now HPE's libcrayacc (which implements the __tgt interface) cannot "get it right" for all compilers of interest, since vanilla upstream expects the extra argument to be added but AMD ROCm does not. Incrementing the version allows the library to determine whether the argument needs to be added.

This seems like the most expedient/straightforward fix. Another option is to add an explicit KernelArgsTy::Flags field, though that probably requires a version increment, too (unless we're guaranteed that unused flags are always initialized to 0). Or, a different approach would be to add a new OMP_TGT_MAPTYPE enum value (e.g., OMP_TGT_MAPTYPE_TARGET_PARAM_DYN_PTR) that would allow the compiler to explicitly represent this extra kernel argument in the normal argument list. The library would add special-case handling to fill in the launch env pointer for this kernel argument. These options would be more implementation effort, so I propose moving forward with this patch first. But if there is interest in these other approaches, they could be investigated later. These mechanisms could be used to avoid the need for the isCtorOrDtor() check before adding the extra kernel argument in libomptarget (i.e., the launch of a Ctor or Dtor kernel could explicitly specify that a kernel launch env is not needed).

Copy link
Member

@jdoerfert jdoerfert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG.

@dhruvachak dhruvachak merged commit b5d02bb into llvm:main Mar 19, 2024
4 checks passed
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
… dyn_ptr. (llvm#85363)

A kernel implicit parameter (dyn_ptr) was introduced some time back.
This patch increments the kernel args version for a compiler supporting
dyn_ptr. The version will be used by the runtime to determine whether
the implicit parameter is generated by the compiler. The versioning is
required to support use cases where code generated by an older compiler
is linked with a newer runtime.

If approved, this patch should be backported to release 18.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category flang:openmp openmp:libomptarget OpenMP offload runtime
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants