Closed
Changes from 1 commit
Commits
47 commits
3dd72b8
8307513: C2: intrinsify Math.max(long,long) and Math.min(long,long)
galderz Jul 8, 2024
e43b390
Add IR test
galderz Jul 18, 2024
f910739
Refactor inline methods to unify their implementations
galderz Jul 19, 2024
ce71a0e
Add math vectorized JMH benchmark
galderz Jul 23, 2024
8d66f7b
Rename benchmark class to MathLoopBench
galderz Aug 27, 2024
605a78a
Fix multi long tests to use long arrays
galderz Aug 27, 2024
1522e26
Implement cmovL as a jump+mov branch
galderz Sep 9, 2024
a64fcda
Switch movl to movq
galderz Sep 11, 2024
13ed872
Fix format of assembly for the movl to movq switch
galderz Sep 11, 2024
da720c5
Distribute values targetting a branch percentage
galderz Sep 12, 2024
0b71cb5
Fix min case to distribute numbers as per probability
galderz Sep 12, 2024
fe3aff4
Fix compilation error
galderz Sep 12, 2024
0047a4b
Add an intermediate % that is more representative of real life
galderz Sep 12, 2024
f622852
Skip single array benchmarks
galderz Sep 16, 2024
6fd8805
Add min/max benchmark that includes loops and reductions
galderz Sep 24, 2024
93799d5
Renamed benchmark methods
galderz Sep 24, 2024
c06e869
Multiply array value in reduction for vectorization to kick in
galderz Sep 25, 2024
28778c8
Remove previous benchmark effort
galderz Sep 27, 2024
bc648aa
Revert "Fix format of assembly for the movl to movq switch"
galderz Sep 27, 2024
7a07aa8
Revert "Switch movl to movq"
galderz Sep 27, 2024
16ae2a3
Revert "Implement cmovL as a jump+mov branch"
galderz Sep 27, 2024
3f712e2
Merge branch 'master' into topic.intrinsify-max-min-long
galderz Oct 17, 2024
6cc5484
Avoid creating result array in benchmark method
galderz Oct 9, 2024
c956012
Encapsulate benchmark state within an inner class
galderz Oct 10, 2024
0b19789
Add clipping range benchmark that uses min/max
galderz Oct 10, 2024
e669893
Restore previous benchmark iterations and default param size
galderz Oct 10, 2024
dcf6b54
Make state class non-final
galderz Oct 10, 2024
b19fc81
Double/Float tests only when avx enabled
galderz Oct 15, 2024
f6f0244
Renamed benchmark class
galderz Oct 17, 2024
0a8718e
Use same default size as in other vector reduction benchmarks
galderz Oct 17, 2024
aca0922
Merge branch 'master' into topic.intrinsify-max-min-long
galderz Dec 12, 2024
65e2e48
Add empty line
galderz Dec 17, 2024
c964c26
Add max reduction test
galderz Dec 17, 2024
cfe0239
Fix style
galderz Dec 17, 2024
7353a07
Adjust min/max identity IR test expectations after changes
galderz Dec 17, 2024
130b475
Added comment around the assertions
galderz Dec 17, 2024
4d4753f
Tests should also run on aarch64 asimd=true envs
galderz Dec 18, 2024
fb0f731
Fix license header
galderz Dec 18, 2024
c049198
Test can only run with 256 bit registers or bigger
galderz Jan 9, 2025
abbaf87
Make sure it runs with cpus with either avx512 or asimd
galderz Jan 13, 2025
94397d3
Fix copyright years
galderz Jan 17, 2025
f83d886
Renaming methods and variables and add docu on algorithms
galderz Jan 17, 2025
724a346
Fix typo
galderz Jan 17, 2025
a190ae6
Merge branch 'master' into topic.intrinsify-max-min-long
galderz Feb 7, 2025
d0e793a
Add simple reduction benchmarks on top of multiply ones
galderz Feb 17, 2025
38537fc
Add assertion comments
galderz Mar 7, 2025
1aa690d
Merge branch 'master' into topic.intrinsify-max-min-long
galderz Mar 7, 2025
@@ -131,6 +131,7 @@ public static void ReductionInit(long[] longs, int probability) {

@Test
@IR(applyIfAnd = {"SuperWordReductions", "true", "MaxVectorSize", ">=32"},
applyIfCPUFeatureOr = {"avx512", "true", "asimd" , "true"},
counts = {IRNode.MIN_REDUCTION_V, " > 0"})

> @eme64 I've addressed all your comments except aarch64 testing. asimd is not enough, you need sve for this, but I'm yet to make it work even with sve, something's up and need to debug it further.

Hi @galderz, may I ask whether these long-reduction cases fail to work even with sve? It might be related to the limitation here. Some sve machines have only 128-bit vectors.

Contributor


That's right. Neoverse V2 is 4 pipes of 128 bits, V1 is 2 pipes of 256 bits.
That comment is "interesting". Maybe it should be tunable by the back end. Given that Neoverse V2 can issue 4 SVE operations per clock cycle, it might still be a win.

Galder, how about you disable that line and give it another try?
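As a back-of-the-envelope illustration of the point above, the sketch below multiplies the pipe counts by the pipe widths quoted in this comment. It is an idealized peak figure only; the class and method names are hypothetical, and real throughput for a reduction also depends on latency and dependency chains, which this does not model.

```java
// Hypothetical illustration; the pipe counts and widths are the ones quoted
// in the comment above, not values looked up in CPU documentation.
final class SvePeakWidth {
    // Idealized peak vector bits processed per cycle: pipes * bits per pipe.
    static int peakBitsPerCycle(int pipes, int bitsPerPipe) {
        return pipes * bitsPerPipe;
    }

    public static void main(String[] args) {
        System.out.println("Neoverse V1: " + peakBitsPerCycle(2, 256) + " bits/cycle");
        System.out.println("Neoverse V2: " + peakBitsPerCycle(4, 128) + " bits/cycle");
    }
}
```

Under these assumed numbers both cores reach the same idealized peak, which is why narrower vectors on V2 are not necessarily a loss.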

Contributor


FYI: I'm working on removing the line here.

The issue is that on some platforms 2-element vectors are, for some reason, noticeably slower, and we need a cost model to give us a better heuristic than the hard "no". See my draft #20964.

But yes: why don't you remove the line, and see if that makes it work. If so, then don't worry about this case for now, and maybe leave a comment in the test. We can then fix that later.

Contributor Author


Yeah, this limit prevents reductions like this from working on 128-bit registers:

      // Length 2 reductions of INT/LONG do not offer performance benefits
      if (((arith_type->basic_type() == T_INT) || (arith_type->basic_type() == T_LONG)) && (size == 2)) {
        retValue = false;

I tried removing that today, but then the profitability checks fail to pass. So I'm not going down that route for now.
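To make the arithmetic behind this limit concrete, here is a minimal sketch in plain Java (not HotSpot code) of how many `long` lanes fit into a vector of a given size. The class and method names are hypothetical; the only inputs taken from this thread are the `size == 2` rule quoted above and the `MaxVectorSize >= 32` condition on the test, with `MaxVectorSize` assumed to be in bytes.

```java
// Hypothetical helper, for illustration only; not part of HotSpot or the test.
final class LongLaneMath {
    // Number of 64-bit long lanes in a vector of the given size in bytes.
    static int longLanes(int vectorSizeBytes) {
        return vectorSizeBytes / Long.BYTES;
    }

    // Mirrors the quoted rule: a length-2 long reduction is rejected.
    static boolean rejectedByLengthTwoRule(int vectorSizeBytes) {
        return longLanes(vectorSizeBytes) == 2;
    }

    public static void main(String[] args) {
        for (int bytes : new int[] {16, 32, 64}) {
            System.out.println((bytes * 8) + "-bit vector: " + longLanes(bytes)
                + " long lanes, rejected=" + rejectedByLengthTwoRule(bytes));
        }
    }
}
```

A 16-byte (128-bit) vector holds exactly two `long` lanes and so falls under the rule, while 32-byte and 64-byte vectors hold four and eight lanes and do not.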

public static long minReductionImplement(long[] a, long res) {
for (int i = 0; i < a.length; i++) {
@@ -142,6 +143,7 @@ public static long minReductionImplement(long[] a, long res) {

@Test
@IR(applyIfAnd = {"SuperWordReductions", "true", "MaxVectorSize", ">=32"},
applyIfCPUFeatureOr = {"avx512", "true", "asimd" , "true"},
counts = {IRNode.MAX_REDUCTION_V, " > 0"})
public static long maxReductionImplement(long[] a, long res) {
for (int i = 0; i < a.length; i++) {
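The hunks above show only the loop headers of the two test methods; their bodies are not visible here. As a rough sketch of the kind of loop such an IR test targets, assuming the body simply accumulates with `Math.min` or `Math.max` (the actual test bodies may differ), a long min/max reduction could look like this. The class and method names are hypothetical.

```java
// Hypothetical sketch of long min/max reduction loops; the real test bodies
// are not shown in the diff and may do more work per element.
final class MinMaxReductionSketch {
    // Folds the array into a running minimum, starting from res.
    static long minReduction(long[] a, long res) {
        for (int i = 0; i < a.length; i++) {
            res = Math.min(res, a[i]);
        }
        return res;
    }

    // Folds the array into a running maximum, starting from res.
    static long maxReduction(long[] a, long res) {
        for (int i = 0; i < a.length; i++) {
            res = Math.max(res, a[i]);
        }
        return res;
    }

    public static void main(String[] args) {
        long[] data = {3L, -1L, 7L};
        System.out.println(minReduction(data, Long.MAX_VALUE));
        System.out.println(maxReduction(data, Long.MIN_VALUE));
    }
}
```

Starting the minimum from `Long.MAX_VALUE` and the maximum from `Long.MIN_VALUE` uses the identity element of each operation, so the initial value never affects the result for a non-empty array.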