Closed
66 commits
157286d
JDK-8323582
eme64 Nov 11, 2024
ef81708
more parts
eme64 Nov 11, 2024
09dffff
keep predicates until after superword
eme64 Nov 12, 2024
b7a9e11
deopt check for alignment
eme64 Nov 12, 2024
9804647
create_loop_nest only add the auto_vectorization_check once per bci
eme64 Nov 12, 2024
43501db
parse auto_vectorization_parse_predicate_proj in VLoop
eme64 Nov 12, 2024
4c34402
apply_speculative_runtime_checks
eme64 Nov 12, 2024
527473c
refactor add_speculative_alignment_check
eme64 Nov 12, 2024
c69687b
some TODO's
eme64 Nov 12, 2024
fd7940c
PhaseIdealLoop::maybe_multiversion_for_auto_vectorization_runtime_che…
eme64 Nov 13, 2024
4e434bc
refactor unswitching for multiversioning
eme64 Nov 14, 2024
2b6ee5a
small refactor
eme64 Nov 14, 2024
b6eef5e
add in multiversioning, with cond one
eme64 Nov 14, 2024
56a7c7b
add opaque node
eme64 Nov 14, 2024
6e05a30
fix type issues, now multiversions recursively
eme64 Nov 15, 2024
f701569
add multiversion flags
eme64 Nov 15, 2024
1d5c146
stall the stalled_slow loop
eme64 Nov 15, 2024
e202aa6
find multiversion fast proj from VLoop
eme64 Nov 15, 2024
154eae5
prep for multiversion check addition
eme64 Nov 15, 2024
bef2e54
broken state before lunch
eme64 Nov 15, 2024
1ae75e9
fix last commits
eme64 Nov 15, 2024
3b19506
find multiversion opaque from slow_path
eme64 Nov 15, 2024
7ad7eef
unstalling mechanism
eme64 Nov 15, 2024
e2dfca3
block native memory addresses if speculation not possible
eme64 Nov 15, 2024
688665b
some descriptions
eme64 Nov 15, 2024
8832c9b
rename to OpaqueMultiversioning
eme64 Nov 15, 2024
814c965
rename unswitch --> multiversion
eme64 Nov 18, 2024
c1c3712
descriptions
eme64 Nov 18, 2024
6c37124
manual merge
eme64 Nov 18, 2024
9999ad1
more work in PhaseIdealLoop::do_multiversioning
eme64 Nov 19, 2024
fdcf314
more todo's fixed
eme64 Nov 19, 2024
bf589ed
no cfg for multiversioning
eme64 Nov 20, 2024
6ea8201
node budget
eme64 Nov 20, 2024
c21e37d
cleanup
eme64 Nov 20, 2024
0203bbe
run IGVN before SuperWord
eme64 Nov 21, 2024
c2a4bfd
fix assert in IdealLoopTree::policy_range_check
eme64 Nov 21, 2024
3d923f7
manual merge
eme64 Nov 21, 2024
3322250
add stub of test TestMemorySegmentUnalignedAddress.java
eme64 Nov 21, 2024
052c7ea
knarly manual merge
eme64 Jan 21, 2025
f03dd83
resolve merge issues in loopUnswitch.cpp
eme64 Jan 21, 2025
1f7d66d
fixed more from merge
eme64 Jan 21, 2025
be899f9
rm some code we no longer need after merge with MemPointer / VPointer
eme64 Jan 21, 2025
fe956ad
Fix JDK-8323582 TODO comments
eme64 Jan 21, 2025
d8ec0f0
rm old multiversion code
eme64 Jan 21, 2025
efe8ffa
comments for PhaseIdealLoop::create_new_if_for_multiversion
eme64 Jan 21, 2025
8590cbd
trace speculative runtime checks
eme64 Jan 21, 2025
e4d6752
more trace
eme64 Jan 21, 2025
349b42e
improve trace flags a little more
eme64 Jan 22, 2025
699b2aa
Merge branch 'master' into JDK-8323582-SW-native-alignment
eme64 Jan 22, 2025
f8b8cf5
better comments
eme64 Jan 22, 2025
65f3fbc
more comments about multiversioning
eme64 Jan 22, 2025
8181a84
more comments
eme64 Jan 22, 2025
cdccc33
rm TODO
eme64 Jan 22, 2025
891478a
refactor verify
eme64 Jan 22, 2025
bd09153
add Verify/AlignVector runs to test
eme64 Jan 22, 2025
6ee5b90
stub for slicing
eme64 Jan 22, 2025
9191052
test changed to unaligned ints
eme64 Jan 22, 2025
1bdece0
3 test versions
eme64 Jan 22, 2025
8eecab2
IR rules for all cases
eme64 Jan 22, 2025
68709f9
copyright and rm CFG check
eme64 Jan 22, 2025
bfa62b9
register opaque with igvn
eme64 Jan 23, 2025
c53985f
remove multiversion mark if we break the structure
eme64 Jan 23, 2025
a98ffab
Merge branch 'master' into JDK-8323582-SW-native-alignment
eme64 Feb 19, 2025
b3044bc
adjust selector if probability
eme64 Feb 20, 2025
23248f9
stall -> delay, plus some more comments
eme64 Feb 25, 2025
8eb5229
Merge branch 'master' into JDK-8323582-SW-native-alignment
eme64 Feb 25, 2025
1 change: 1 addition & 0 deletions src/hotspot/share/jvmci/vmStructs_jvmci.cpp
@@ -708,6 +708,7 @@
declare_constant(Deoptimization::Reason_constraint) \
declare_constant(Deoptimization::Reason_div0_check) \
declare_constant(Deoptimization::Reason_loop_limit_check) \
+ declare_constant(Deoptimization::Reason_auto_vectorization_check) \
declare_constant(Deoptimization::Reason_type_checked_inlining) \
declare_constant(Deoptimization::Reason_optimized_type_check) \
declare_constant(Deoptimization::Reason_aliasing) \
6 changes: 6 additions & 0 deletions src/hotspot/share/opto/c2_globals.hpp
@@ -346,6 +346,12 @@
develop(bool, TraceLoopUnswitching, false, \
"Trace loop unswitching") \
\
+ product(bool, LoopMultiversioning, true, DIAGNOSTIC, \
+ "Enable loop multiversioning (for speculative compilation)") \
+ \
+ develop(bool, TraceLoopMultiversioning, false, \
+ "Trace loop multiversioning") \
+ \
product(bool, AllowVectorizeOnDemand, true, \
"Globally suppress vectorization set in VectorizeMethod") \
\
2 changes: 1 addition & 1 deletion src/hotspot/share/opto/cfgnode.hpp
@@ -428,7 +428,7 @@ class IfNode : public MultiBranchNode {
IfNode(Node* control, Node* bol, float p, float fcnt);
IfNode(Node* control, Node* bol, float p, float fcnt, AssertionPredicateType assertion_predicate_type);

- static IfNode* make_with_same_profile(IfNode* if_node_profile, Node* ctrl, BoolNode* bol);
+ static IfNode* make_with_same_profile(IfNode* if_node_profile, Node* ctrl, Node* bol);

virtual int Opcode() const;
virtual bool pinned() const { return true; }
1 change: 1 addition & 0 deletions src/hotspot/share/opto/classes.hpp
@@ -277,6 +277,7 @@ macro(OnSpinWait)
macro(Opaque1)
macro(OpaqueLoopInit)
macro(OpaqueLoopStride)
+ macro(OpaqueMultiversioning)
macro(OpaqueZeroTripGuard)
macro(OpaqueNotNull)
macro(OpaqueInitializedAssertionPredicate)
1 change: 1 addition & 0 deletions src/hotspot/share/opto/graphKit.cpp
@@ -4086,6 +4086,7 @@ void GraphKit::add_parse_predicates(int nargs) {
if (UseProfiledLoopPredicate) {
add_parse_predicate(Deoptimization::Reason_profile_predicate, nargs);
}
+ add_parse_predicate(Deoptimization::Reason_auto_vectorization_check, nargs);
// Loop Limit Check Predicate should be near the loop.
add_parse_predicate(Deoptimization::Reason_loop_limit_check, nargs);
}
6 changes: 5 additions & 1 deletion src/hotspot/share/opto/ifnode.cpp
@@ -469,7 +469,7 @@ static Node* split_if(IfNode *iff, PhaseIterGVN *igvn) {
return new ConINode(TypeInt::ZERO);
}

- IfNode* IfNode::make_with_same_profile(IfNode* if_node_profile, Node* ctrl, BoolNode* bol) {
+ IfNode* IfNode::make_with_same_profile(IfNode* if_node_profile, Node* ctrl, Node* bol) {
// Assert here that we only try to create a clone from an If node with the same profiling if that actually makes sense.
// Some If node subtypes should not be cloned in this way. In theory, we should not clone BaseCountedLoopEndNodes.
// But they can end up being used as normal If nodes when peeling a loop - they serve as zero-trip guard.
@@ -2177,6 +2177,7 @@ ParsePredicateNode::ParsePredicateNode(Node* control, Deoptimization::DeoptReaso
switch (deopt_reason) {
case Deoptimization::Reason_predicate:
case Deoptimization::Reason_profile_predicate:
+ case Deoptimization::Reason_auto_vectorization_check:
case Deoptimization::Reason_loop_limit_check:
break;
default:
@@ -2214,6 +2215,9 @@ void ParsePredicateNode::dump_spec(outputStream* st) const {
case Deoptimization::DeoptReason::Reason_profile_predicate:
st->print("Profiled Loop ");
break;
+ case Deoptimization::DeoptReason::Reason_auto_vectorization_check:
+ st->print("Auto_Vectorization_Check ");
+ break;
case Deoptimization::DeoptReason::Reason_loop_limit_check:
st->print("Loop Limit Check ");
break;
33 changes: 31 additions & 2 deletions src/hotspot/share/opto/loopTransform.cpp
@@ -745,6 +745,11 @@ void PhaseIdealLoop::do_peeling(IdealLoopTree *loop, Node_List &old_new) {
cl->set_trip_count(cl->trip_count() - 1);
if (cl->is_main_loop()) {
cl->set_normal_loop();
+ if (cl->is_multiversion()) {
+ // Peeling also destroys the connection of the main loop
+ // to the multiversion_if.
+ cl->set_no_multiversion();
Contributor

Would we want to change the multiversion guard at this point so it constant folds and the slow version is removed?

Contributor Author

I suppose we can probably do that. Otherwise, we just have to wait until the OpaqueMultiversioningNode constant folds after loop-opts.

+ }
#ifndef PRODUCT
if (PrintOpto && VerifyLoopOptimizations) {
tty->print("Peeling a 'main' loop; resetting to 'normal' ");
@@ -1174,8 +1179,9 @@ bool IdealLoopTree::policy_range_check(PhaseIdealLoop* phase, bool provisional,
if (!bol->is_Bool()) {
assert(bol->is_OpaqueNotNull() ||
bol->is_OpaqueTemplateAssertionPredicate() ||
- bol->is_OpaqueInitializedAssertionPredicate(),
- "Opaque node of a non-null-check or an Assertion Predicate");
+ bol->is_OpaqueInitializedAssertionPredicate() ||
+ bol->is_OpaqueMultiversioning(),
+ "Opaque node of a non-null-check or an Assertion Predicate or Multiversioning");
continue;
}
if (bol->as_Bool()->_test._test == BoolTest::ne) {
@@ -3354,6 +3360,23 @@ bool IdealLoopTree::iteration_split_impl(PhaseIdealLoop *phase, Node_List &old_n
// Do nothing special to pre- and post- loops
if (cl->is_pre_loop() || cl->is_post_loop()) return true;

+ // With multiversioning, we create a fast_loop and a slow_loop, and a multiversion_if that
+ // decides which loop is taken at runtime. At first, the multiversion_if always takes the
+ // fast_loop, and we only optimize the fast_loop. Since we are not sure if we will ever use
+ // the slow_loop, we delay optimizations for it, so we do not waste compile time and code
+ // size. If we never change the condition of the multiversion_if, the slow_loop is eventually
+ // folded away after loop-opts. While optimizing the fast_loop, we may want to perform some
+ // speculative optimization, for which we need a runtime-check. We add this runtime-check
+ // condition to the multiversion_if. Now, it becomes possible to execute the slow_loop at
+ // runtime, and we resume optimizations for slow_loop ("un-delay" it).
+ // TLDR: If the slow_loop is still in "delay" mode, check if the multiversion_if was changed
+ // and we should now resume optimizations for it.
+ if (cl->is_multiversion_delayed_slow_loop() &&
+ !phase->try_resume_optimizations_for_delayed_slow_loop(this)) {
+ // We are still delayed, so wait with further loop-opts.
+ return true;
+ }

// Compute loop trip count from profile data
compute_profile_trip_cnt(phase);
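
The delay/resume behaviour described in the comment of this hunk can be modelled in isolation. The following is a minimal sketch, not HotSpot code: `MultiversionIfModel`, `SlowLoopModel`, `add_runtime_check` and `try_resume` are hypothetical names. They only mirror the idea that the slow_loop stays delayed until a runtime check has been added to the multiversion_if.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the multiversion_if. It starts without runtime
// checks, which models a condition that always selects the fast_loop.
struct MultiversionIfModel {
  std::vector<std::string> runtime_checks;

  void add_runtime_check(const std::string& description) {
    assert(!description.empty());
    runtime_checks.push_back(description);
  }

  // With at least one check, the slow_loop may be taken at runtime.
  bool slow_loop_reachable() const { return !runtime_checks.empty(); }
};

// Hypothetical stand-in for the slow_loop and its "delay" flag.
struct SlowLoopModel {
  bool delayed = true;
};

// Returns true if the slow_loop may be optimized now. This mirrors the idea
// of "un-delaying" once the multiversion_if condition has been changed.
bool try_resume(SlowLoopModel& slow_loop, const MultiversionIfModel& mv_if) {
  if (!slow_loop.delayed) {
    return true;   // Already resumed earlier.
  }
  if (!mv_if.slow_loop_reachable()) {
    return false;  // Still delayed: nothing can select the slow_loop yet.
  }
  slow_loop.delayed = false;
  return true;
}
```

In this model, `try_resume` on a fresh pair reports that the slow_loop is still delayed. After `add_runtime_check` has been called, the next `try_resume` clears the flag and later calls keep returning true.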

@@ -3413,6 +3436,12 @@ bool IdealLoopTree::iteration_split_impl(PhaseIdealLoop *phase, Node_List &old_n
if (!phase->may_require_nodes(estimate)) {
return false;
}

+ // We are going to add pre-loop and post-loop.
+ // But should we also multi-version for auto-vectorization speculative
+ // checks, i.e. fast and slow-paths?
+ phase->maybe_multiversion_for_auto_vectorization_runtime_checks(this, old_new);

phase->insert_pre_post_loops(this, old_new, peel_only);
}
// Adjust the pre- and main-loop limits to let the pre and post loops run
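
At the source level, the effect of multiversioning with a speculative runtime check resembles the sketch below. It is an illustration under assumptions only: the check is taken to be an address-alignment test (suggested by the branch name `JDK-8323582-SW-native-alignment` and the commit "deopt check for alignment"), and `is_aligned`, `load_int`, `sum_ints`, `SumResult` and `LoopVersion` are hypothetical names that do not exist in HotSpot. C2 performs the corresponding transformation on its IR, not on source code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class LoopVersion { Fast, Slow };

struct SumResult {
  std::int64_t sum;
  LoopVersion version;  // Which loop version was selected at runtime.
};

// True if the address is a multiple of `alignment` (must be a power of two).
static bool is_aligned(const unsigned char* p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Reads one int from any address without assuming alignment.
static std::int32_t load_int(const unsigned char* p) {
  std::int32_t value;
  std::memcpy(&value, p, sizeof(value));
  return value;
}

// Sums `count` ints stored at `base`. The `if` plays the role of the
// multiversion_if with an alignment check as its runtime condition.
SumResult sum_ints(const unsigned char* base, std::size_t count) {
  std::int64_t sum = 0;
  if (is_aligned(base, sizeof(std::int32_t))) {
    // fast_loop: the version an optimizer may vectorize with aligned accesses.
    for (std::size_t i = 0; i < count; i++) {
      sum += load_int(base + i * sizeof(std::int32_t));
    }
    return {sum, LoopVersion::Fast};
  }
  // slow_loop: same semantics, but no alignment assumption.
  for (std::size_t i = 0; i < count; i++) {
    sum += load_int(base + i * sizeof(std::int32_t));
  }
  return {sum, LoopVersion::Slow};
}
```

Both loop bodies are deliberately identical here. The point of the sketch is the shape: one guard, two copies of the loop, and only the guarded copy is allowed to rely on the speculated property.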