8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long, float and double arrays) #14227

Closed
45 commits; showing changes from 4 commits
Commits
e98e5ef
8309130: x86_64 AVX512 intrinsics for Arrays.sort methods (int, long,…
vamsi-parasa May 30, 2023
923a7ca
remove libstdc++
vamsi-parasa May 30, 2023
6d140d5
Merge branch 'master' of https://git.openjdk.java.net/jdk into avx512…
vamsi-parasa May 30, 2023
30a50d9
fix license
vamsi-parasa Jun 1, 2023
a7c2b6e
Update test/micro/org/openjdk/bench/java/util/ArraysSort.java
vamsi-parasa Jun 1, 2023
1dc9589
fix license in one file
vamsi-parasa Jun 1, 2023
3bd12ec
Merge branch 'openjdk:master' into avx512sort
vamsi-parasa Jun 8, 2023
53a5309
replace multiple intrinsics with one general intrinsic
vamsi-parasa Jun 23, 2023
25fa86e
merge master
vamsi-parasa Jun 23, 2023
2bd0419
minor cleanups
vamsi-parasa Jun 27, 2023
e09c050
change API to enable MemorySegment
vamsi-parasa Jul 25, 2023
5eac7b3
update arraySort docstring
vamsi-parasa Jul 25, 2023
240fde1
add special cases to float and double arrays
vamsi-parasa Jul 25, 2023
17b5127
Update src/java.base/share/classes/java/util/Arrays.java
vamsi-parasa Aug 1, 2023
a2e14d4
fix arraySort API and fastdebug issue
vamsi-parasa Aug 4, 2023
7065f1c
moved stubroutines definitions to vmStructs_jvmci.cpp
vamsi-parasa Aug 4, 2023
37f3c52
Update avx512 sort, benchmarks, shenandoahSupport
vamsi-parasa Aug 4, 2023
e0ffc81
More avx512 sort cleanups
vamsi-parasa Aug 4, 2023
13f4aaf
Change name from libavx512_x86_64 to libx86_64
vamsi-parasa Aug 4, 2023
c49657e
change names from avx512 to x86_64
vamsi-parasa Aug 7, 2023
5846799
Fix signature for Shenandoah support
vamsi-parasa Aug 11, 2023
07349ec
Fix preservation of NaNs for floats and doubles
vamsi-parasa Aug 15, 2023
9153059
Decomposed DPQS using AVX512 partitioning and AVX512 sort (for small …
vamsi-parasa Aug 22, 2023
8b80b80
Update avx512-common-qsort.h
vamsi-parasa Aug 23, 2023
96cdd19
Update copyright for DPQS.java; replace avx512 pivot calculation with…
vamsi-parasa Aug 23, 2023
5173849
add parallelSort benchmarking
vamsi-parasa Aug 23, 2023
df17b3e
Fix unused assignment in DPQS.java and space in Arrays.java
vamsi-parasa Aug 24, 2023
f3b5fcf
Move sort and partition intrinsics from Arrays.java to DPQS.java
vamsi-parasa Aug 25, 2023
e44f11a
Remove unnecessary import in Arrays.java
vamsi-parasa Aug 25, 2023
9642d85
Clean up parameters passed to arrayPartition; update the check to loa…
vamsi-parasa Aug 28, 2023
bbec6bf
Merge branch 'master' of https://git.openjdk.org/jdk into avx512sort
vamsi-parasa Aug 31, 2023
1746eed
update build script
vamsi-parasa Aug 31, 2023
a0f006b
Update make/modules/java.base/Lib.gmk
vamsi-parasa Aug 31, 2023
0ec5f52
Change name of the avxsort library to libx86_64_sort
vamsi-parasa Aug 31, 2023
c096ff6
Fix regression when intrinsics are disabled; enable insertion sort in…
vamsi-parasa Sep 8, 2023
ed8b95c
Refactor stub handling to use a generic function for all types
vamsi-parasa Sep 12, 2023
172b2d3
Refactor the sort and partition intrinsics to accept method reference…
vamsi-parasa Sep 13, 2023
e63a2aa
Move functional interfaces close to the associated methods
vamsi-parasa Sep 15, 2023
7fc1afa
Remove the unnecessary exception in single pivot partitioning fallbac…
vamsi-parasa Sep 18, 2023
bf41d2a
Rename arraySort and arrayPartition Java methods to sort and partitio…
vamsi-parasa Sep 19, 2023
3e0b8cf
Update DualPivotQuicksort.java
vamsi-parasa Sep 19, 2023
b04cb6c
change variable names of indexPivot* to pivotIndex*
vamsi-parasa Sep 20, 2023
dbf4332
Update CompileThresholdScaling only for the sort and partition intrin…
vamsi-parasa Sep 22, 2023
f11d6b4
Merge branch 'master' of https://git.openjdk.java.net/jdk into avx512…
vamsi-parasa Oct 5, 2023
a5262d8
fix code style and formatting
vamsi-parasa Oct 5, 2023
21 changes: 21 additions & 0 deletions make/modules/java.base/Lib.gmk
@@ -230,3 +230,24 @@ ifeq ($(ENABLE_FALLBACK_LINKER), true)

TARGETS += $(BUILD_LIBFALLBACKLINKER)
endif

################################################################################

ifeq ($(call isTargetOs, linux)+$(call isTargetCpu, x86_64)+$(INCLUDE_COMPILER2), true+true+true)
  $(eval $(call SetupJdkLibrary, BUILD_LIBAVX512_X86_64, \
      NAME := avx512_x86_64, \
      OPTIMIZATION := HIGH, \
      CFLAGS := $(CFLAGS_JDKLIB) -mavx512f -mavx512dq, \
      CXXFLAGS := $(CXXFLAGS_JDKLIB) -mavx512f -mavx512dq, \
      LDFLAGS := $(LDFLAGS_JDKLIB) \
          $(call SET_SHARED_LIBRARY_ORIGIN), \
      LDFLAGS_linux := -Wl$(COMMA)--no-as-needed, \
      LDFLAGS_windows := -defaultlib:msvcrt, \
      LIBS := $(LIBCXX), \
      LIBS_linux := -lc -lm -ldl, \
  ))

  TARGETS += $(BUILD_LIBAVX512_X86_64)
endif

################################################################################
26 changes: 26 additions & 0 deletions src/hotspot/cpu/x86/stubGenerator_x86_64.cpp
@@ -4126,6 +4126,32 @@ void StubGenerator::generate_compiler_stubs() {
= CAST_FROM_FN_PTR(address, SharedRuntime::montgomery_square);
}

// Get avx512 sort stub routine addresses
void *libavx512_x86_64 = nullptr;
char ebuf_avx512[1024];
char dll_name_avx512[JVM_MAXPATHLEN];
if (os::dll_locate_lib(dll_name_avx512, sizeof(dll_name_avx512), Arguments::get_dll_dir(), "avx512_x86_64")) {
  libavx512_x86_64 = os::dll_load(dll_name_avx512, ebuf_avx512, sizeof ebuf_avx512);
}
if (libavx512_x86_64 != nullptr) {
  log_info(library)("Loaded library %s, handle " INTPTR_FORMAT, JNI_LIB_PREFIX "avx512_x86_64" JNI_LIB_SUFFIX, p2i(libavx512_x86_64));

  if (UseAVX > 2 && VM_Version::supports_avx512dq()) {

    snprintf(ebuf_avx512, sizeof(ebuf_avx512), "avx512_sort_int");
    StubRoutines::_arraysort_int = (address)os::dll_lookup(libavx512_x86_64, ebuf_avx512);

    snprintf(ebuf_avx512, sizeof(ebuf_avx512), "avx512_sort_long");
    StubRoutines::_arraysort_long = (address)os::dll_lookup(libavx512_x86_64, ebuf_avx512);

    snprintf(ebuf_avx512, sizeof(ebuf_avx512), "avx512_sort_float");
    StubRoutines::_arraysort_float = (address)os::dll_lookup(libavx512_x86_64, ebuf_avx512);

    snprintf(ebuf_avx512, sizeof(ebuf_avx512), "avx512_sort_double");
    StubRoutines::_arraysort_double = (address)os::dll_lookup(libavx512_x86_64, ebuf_avx512);
  }
}

// Get svml stub routine addresses
void *libjsvml = nullptr;
char ebuf[1024];
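
Note on the avx512 lookup in this hunk: each `StubRoutines::_arraysort_*` field stays `nullptr` unless the library loads, the CPU check passes, and the symbol resolves. A minimal standalone sketch of that pattern follows. `SortStubs`, `resolve_sort_stubs`, and the `lookup` callback are hypothetical names for illustration, not HotSpot APIs; `lookup` stands in for `os::dll_lookup` on an already loaded handle, and `cpu_ok` for the `UseAVX > 2 && VM_Version::supports_avx512dq()` guard.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for the four StubRoutines::_arraysort_* fields.
struct SortStubs {
  void* sort_int = nullptr;
  void* sort_long = nullptr;
  void* sort_float = nullptr;
  void* sort_double = nullptr;
};

// Hypothetical helper. `lookup` models a symbol lookup on a loaded library
// and may return nullptr for a missing symbol; `cpu_ok` models the CPU
// feature guard from the hunk.
inline SortStubs resolve_sort_stubs(
    bool cpu_ok, const std::function<void*(const std::string&)>& lookup) {
  SortStubs stubs;            // every entry starts out null
  if (!cpu_ok) return stubs;  // unsupported CPU: leave all entries null
  stubs.sort_int    = lookup("avx512_sort_int");
  stubs.sort_long   = lookup("avx512_sort_long");
  stubs.sort_float  = lookup("avx512_sort_float");
  stubs.sort_double = lookup("avx512_sort_double");
  return stubs;
}
```

A caller would treat any null entry as "no stub available" and keep the regular Java sort path for that element type.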
10 changes: 10 additions & 0 deletions src/hotspot/share/classfile/vmIntrinsics.hpp
@@ -341,6 +341,16 @@ class methodHandle;
do_name( copyOf_name, "copyOf") \
do_signature(copyOf_signature, "([Ljava/lang/Object;ILjava/lang/Class;)[Ljava/lang/Object;") \
\
do_intrinsic(_arraySortI, java_util_Arrays, arraySort_name, arraySortI_signature, F_S) \
do_name( arraySort_name, "arraySort") \
do_signature(arraySortI_signature, "([III)V") \
do_intrinsic(_arraySortL, java_util_Arrays, arraySort_name, arraySortL_signature, F_S) \
do_signature(arraySortL_signature, "([JII)V") \
do_intrinsic(_arraySortF, java_util_Arrays, arraySort_name, arraySortF_signature, F_S) \
do_signature(arraySortF_signature, "([FII)V") \
do_intrinsic(_arraySortD, java_util_Arrays, arraySort_name, arraySortD_signature, F_S) \
do_signature(arraySortD_signature, "([DII)V") \
\
do_intrinsic(_copyOfRange, java_util_Arrays, copyOfRange_name, copyOfRange_signature, F_S) \
do_name( copyOfRange_name, "copyOfRange") \
do_signature(copyOfRange_signature, "([Ljava/lang/Object;IILjava/lang/Class;)[Ljava/lang/Object;") \
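
Note on the four `arraySort` signatures in this hunk: they differ only in the array element code. `([III)V` reads as "takes `int[]`, `int`, `int`, returns `void`", with `J`, `F`, and `D` standing for `long`, `float`, and `double`. The sketch below builds the same strings; `array_sort_descriptor` is a hypothetical helper for illustration, not part of HotSpot.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: builds the JVM method descriptor for a static method
// shaped like `void arraySort(T[] a, int fromIndex, int toIndex)`.
// elem_code is the JVM field-type letter: 'I' int, 'J' long, 'F' float,
// 'D' double.
inline std::string array_sort_descriptor(char elem_code) {
  switch (elem_code) {
    case 'I':
    case 'J':
    case 'F':
    case 'D':
      // "(" + "[" + elem + "I" + "I" + ")" + "V"
      return std::string("([") + elem_code + "II)V";
    default:
      throw std::invalid_argument("unsupported element type");
  }
}
```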
4 changes: 4 additions & 0 deletions src/hotspot/share/opto/c2compiler.cpp
@@ -575,6 +575,10 @@ bool C2Compiler::is_intrinsic_supported(const methodHandle& method) {
case vmIntrinsics::_min_strict:
case vmIntrinsics::_max_strict:
case vmIntrinsics::_arraycopy:
case vmIntrinsics::_arraySortI:
case vmIntrinsics::_arraySortL:
case vmIntrinsics::_arraySortF:
case vmIntrinsics::_arraySortD:
case vmIntrinsics::_indexOfL:
case vmIntrinsics::_indexOfU:
case vmIntrinsics::_indexOfUL:
59 changes: 59 additions & 0 deletions src/hotspot/share/opto/library_call.cpp
@@ -292,6 +292,11 @@ bool LibraryCallKit::try_to_inline(int predicate) {

case vmIntrinsics::_arraycopy: return inline_arraycopy();

case vmIntrinsics::_arraySortI:
case vmIntrinsics::_arraySortL:
case vmIntrinsics::_arraySortF:
case vmIntrinsics::_arraySortD: return inline_arraysort(intrinsic_id());

case vmIntrinsics::_compareToL: return inline_string_compareTo(StrIntrinsicNode::LL);
case vmIntrinsics::_compareToU: return inline_string_compareTo(StrIntrinsicNode::UU);
case vmIntrinsics::_compareToLU: return inline_string_compareTo(StrIntrinsicNode::LU);
@@ -5192,6 +5197,60 @@ void LibraryCallKit::create_new_uncommon_trap(CallStaticJavaNode* uncommon_trap_
uncommon_trap_call->set_req(0, top()); // not used anymore, kill it
}

//------------------------------inline_arraysort-----------------------
bool LibraryCallKit::inline_arraysort(vmIntrinsics::ID id) {

  address stubAddr = nullptr;
  const char *stubName;
  stubName = "arraysort_stub";
  BasicType bt;

  switch(id) {
    case vmIntrinsics::_arraySortI:
      bt = T_INT;
      break;
    case vmIntrinsics::_arraySortL:
      bt = T_LONG;
      break;
    case vmIntrinsics::_arraySortF:
      bt = T_FLOAT;
      break;
    case vmIntrinsics::_arraySortD:
      bt = T_DOUBLE;
      break;
    default:
      break;
  }

  stubAddr = StubRoutines::select_arraysort_function(bt);
  if (stubAddr == nullptr) return false;

  Node* array = argument(0);
  Node* fromIndex = argument(1);
  Node* toIndex = argument(2);

  array = must_be_not_null(array, true);

  const TypeAryPtr* array_type = array->Value(&_gvn)->isa_aryptr();
  assert(array_type != nullptr && array_type->elem() != Type::BOTTOM, "args are strange");

  // for the quick and dirty code we will skip all the checks.
  // we are just trying to get the call to be generated.
  Node* array_fromIndex = array;
  if (fromIndex != nullptr || toIndex != nullptr) {
    assert(fromIndex != nullptr && toIndex != nullptr, "");
    array_fromIndex = array_element_address(array, fromIndex, bt);
  }

  // Call the stub.
  make_runtime_call(RC_LEAF|RC_NO_FP, OptoRuntime::array_sort_Type(),
                    stubAddr, stubName, TypePtr::BOTTOM,
                    array_fromIndex, fromIndex, toIndex);

  return true;
}
}
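
Note on the call shape in `inline_arraysort`: the stub receives the address of the element at `fromIndex` (not the array base), followed by `fromIndex` and `toIndex`. The sketch below illustrates one way a callee could consume those arguments. It rests on an assumption this hunk does not confirm: that the stub sorts `toIndex - fromIndex` elements starting at the pointer it receives, with `toIndex` exclusive as in `Arrays.sort`. `ArraySortStub`, `scalar_sort_int_stub`, and `call_sort_stub` are hypothetical names, and `std::sort` is only a scalar stand-in for the AVX512 routine.

```cpp
#include <algorithm>
#include <cstdint>

// Shape implied by array_sort_Type(): (pointer, int, int) -> void.
using ArraySortStub = void (*)(void* array_from, int32_t from_index,
                               int32_t to_index);

// Hypothetical scalar stand-in for avx512_sort_int. ASSUMPTION: array_from
// already points at element from_index, so only the length is derived from
// the two indexes.
inline void scalar_sort_int_stub(void* array_from, int32_t from_index,
                                 int32_t to_index) {
  int32_t* first = static_cast<int32_t*>(array_from);
  std::sort(first, first + (to_index - from_index));
}

// Mirrors the call site: pass the address of element from_index, in the
// spirit of array_element_address(array, fromIndex, bt). No bounds checks
// are done here, matching the "skip all the checks" comment in the hunk.
inline void call_sort_stub(ArraySortStub stub, int32_t* array,
                           int32_t from_index, int32_t to_index) {
  stub(array + from_index, from_index, to_index);
}
```

With this convention, `call_sort_stub(scalar_sort_int_stub, a, 1, 5)` sorts only `a[1]` through `a[4]` and leaves the remaining elements in place.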


//------------------------------inline_arraycopy-----------------------
// public static native void java.lang.System.arraycopy(Object src, int srcPos,
// Object dest, int destPos,
2 changes: 1 addition & 1 deletion src/hotspot/share/opto/library_call.hpp
@@ -279,7 +279,7 @@ class LibraryCallKit : public GraphKit {
JVMState* arraycopy_restore_alloc_state(AllocateArrayNode* alloc, int& saved_reexecute_sp);
void arraycopy_move_allocation_here(AllocateArrayNode* alloc, Node* dest, JVMState* saved_jvms_before_guards, int saved_reexecute_sp,
uint new_idx);

bool inline_arraysort(vmIntrinsics::ID id);
typedef enum { LS_get_add, LS_get_set, LS_cmp_swap, LS_cmp_swap_weak, LS_cmp_exchange } LoadStoreKind;
bool inline_unsafe_load_store(BasicType type, LoadStoreKind kind, AccessKind access_kind);
bool inline_unsafe_fence(vmIntrinsics::ID id);
19 changes: 19 additions & 0 deletions src/hotspot/share/opto/runtime.cpp
@@ -857,6 +857,25 @@ const TypeFunc* OptoRuntime::array_fill_Type() {
return TypeFunc::make(domain, range);
}

const TypeFunc* OptoRuntime::array_sort_Type() {
  // create input type (domain)
  int num_args = 3;
  int argcnt = num_args;
  const Type** fields = TypeTuple::fields(argcnt);
  int argp = TypeFunc::Parms;
  fields[argp++] = TypePtr::NOTNULL; // array(fromIndex)
  fields[argp++] = TypeInt::INT; // fromIndex
  fields[argp++] = TypeInt::INT; // toIndex
  assert(argp == TypeFunc::Parms+argcnt, "correct decoding");
  const TypeTuple* domain = TypeTuple::make(TypeFunc::Parms+argcnt, fields);

  // no result type needed
  fields = TypeTuple::fields(1);
  fields[TypeFunc::Parms+0] = nullptr; // void
  const TypeTuple* range = TypeTuple::make(TypeFunc::Parms, fields);
  return TypeFunc::make(domain, range);
}

// for aescrypt encrypt/decrypt operations, just three pointers returning void (length is constant)
const TypeFunc* OptoRuntime::aescrypt_block_Type() {
// create input type (domain)
1 change: 1 addition & 0 deletions src/hotspot/share/opto/runtime.hpp
@@ -268,6 +268,7 @@ class OptoRuntime : public AllStatic {

static const TypeFunc* array_fill_Type();

static const TypeFunc* array_sort_Type();
static const TypeFunc* aescrypt_block_Type();
static const TypeFunc* cipherBlockChaining_aescrypt_Type();
static const TypeFunc* electronicCodeBook_aescrypt_Type();
17 changes: 17 additions & 0 deletions src/hotspot/share/runtime/stubRoutines.cpp
@@ -175,6 +175,11 @@ address StubRoutines::_hf2f = nullptr;
address StubRoutines::_vector_f_math[VectorSupport::NUM_VEC_SIZES][VectorSupport::NUM_SVML_OP] = {{nullptr}, {nullptr}};
address StubRoutines::_vector_d_math[VectorSupport::NUM_VEC_SIZES][VectorSupport::NUM_SVML_OP] = {{nullptr}, {nullptr}};

address StubRoutines::_arraysort_int = nullptr;
address StubRoutines::_arraysort_long = nullptr;
address StubRoutines::_arraysort_float = nullptr;
address StubRoutines::_arraysort_double = nullptr;

address StubRoutines::_cont_thaw = nullptr;
address StubRoutines::_cont_returnBarrier = nullptr;
address StubRoutines::_cont_returnBarrierExc = nullptr;
@@ -647,3 +652,15 @@ UnsafeCopyMemoryMark::~UnsafeCopyMemoryMark() {
}
}
}

address StubRoutines::select_arraysort_function(BasicType t) {
  switch(t) {
    case T_INT: return _arraysort_int;
    case T_LONG: return _arraysort_long;
    case T_FLOAT: return _arraysort_float;
    case T_DOUBLE: return _arraysort_double;
    default:
      ShouldNotReachHere();
      return nullptr;
  }
}
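
Note on `select_arraysort_function`: it may return `nullptr` when no stub was resolved, and `inline_arraysort` then returns `false` so the intrinsic is not used. The sketch below shows the same select-then-fall-back idea in standalone form. `ElemType`, `SortFn`, `StubTable`, `select_sort_function`, and `sort_ints` are hypothetical names, and the `(data, length)` signature is a simplification rather than the stub signature from this PR.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

enum class ElemType { Int, Long, Float, Double };

// Hypothetical simplified stub signature: pointer plus element count.
using SortFn = void (*)(void* data, int32_t length);

// Hypothetical table of per-type stubs; null means "not available".
struct StubTable {
  SortFn sort_int = nullptr;
  SortFn sort_long = nullptr;
  SortFn sort_float = nullptr;
  SortFn sort_double = nullptr;
};

// Per-type selection, analogous in spirit to select_arraysort_function.
inline SortFn select_sort_function(const StubTable& table, ElemType type) {
  switch (type) {
    case ElemType::Int:    return table.sort_int;
    case ElemType::Long:   return table.sort_long;
    case ElemType::Float:  return table.sort_float;
    case ElemType::Double: return table.sort_double;
  }
  return nullptr;
}

// Returns true if a stub ran, false if the portable fallback was used.
inline bool sort_ints(const StubTable& table, std::vector<int32_t>& values) {
  SortFn fn = select_sort_function(table, ElemType::Int);
  if (fn == nullptr) {
    std::sort(values.begin(), values.end());  // fallback path
    return false;
  }
  fn(values.data(), static_cast<int32_t>(values.size()));
  return true;
}
```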
5 changes: 5 additions & 0 deletions src/hotspot/share/runtime/stubRoutines.hpp
@@ -153,6 +153,10 @@ class StubRoutines: AllStatic {
static BufferBlob* _compiler_stubs_code; // code buffer for C2 intrinsics
static BufferBlob* _final_stubs_code; // code buffer for all other routines

static address _arraysort_int;
static address _arraysort_long;
static address _arraysort_float;
static address _arraysort_double;
// Leaf routines which implement arraycopy and their addresses
// arraycopy operands aligned on element type boundary
static address _jbyte_arraycopy;
@@ -372,6 +376,7 @@ class StubRoutines: AllStatic {
static UnsafeArrayCopyStub UnsafeArrayCopy_stub() { return CAST_TO_FN_PTR(UnsafeArrayCopyStub, _unsafe_arraycopy); }

static address generic_arraycopy() { return _generic_arraycopy; }
static address select_arraysort_function(BasicType t);

static address jbyte_fill() { return _jbyte_fill; }
static address jshort_fill() { return _jshort_fill; }
4 changes: 4 additions & 0 deletions src/hotspot/share/runtime/vmStructs.cpp
@@ -588,6 +588,10 @@
static_field(StubRoutines, _checkcast_arraycopy_uninit, address) \
static_field(StubRoutines, _unsafe_arraycopy, address) \
static_field(StubRoutines, _generic_arraycopy, address) \
static_field(StubRoutines, _arraysort_int, address) \
static_field(StubRoutines, _arraysort_long, address) \
static_field(StubRoutines, _arraysort_float, address) \
static_field(StubRoutines, _arraysort_double, address) \
\
/*****************/ \
/* SharedRuntime */ \