8310308: IR Framework: check for type and size of vector nodes #14539
👋 Welcome back epeter! A progress list of the required criteria for merging this PR into |
…estCyclicDependency.java
@eme64 this pull request can not be integrated into git checkout JDK-8310308
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push |
Looks good to me.
TestFormat.checkNoReport(!vmInfo.canTrustVectorSize(), "sanity");
// If we have a size specified but cannot trust the size, and must check an upper
// bound, this can be impossible to count correctly - if we have an incorrect size
// we may count either too many nodes. We just create a impossible regex which will
- // we may count either too many nodes. We just create a impossible regex which will
+ // we may count either too many nodes. We just create an impossible regex which will
@@ -87,7 +87,7 @@ static float[] test() {
}
``` | |||

- However, the size does not have to be specified. In most cases, one either wants to have vectorization at the maximal possible vector width, or no vectorization at all. Hence, the default size is `IRNode.VECTOR_SIZE_MAX`, except when using `failOn` or `counts` with comparisons `<`, `<=` or `=0`, where we have a default of `IRNode.VECTOR_SIZE_ANY`.
+ However, the size does not have to be specified. In most cases, one either wants to have vectorization at the maximal possible vector width, or no vectorization at all. Hence, for lower bound counts ('>' or '>=') the default size is `IRNode.VECTOR_SIZE_MAX`, and for upper bound counts ('<' or '<=' or '=0' or failOn) the default is `IRNode.VECTOR_SIZE_ANY`. Equal count comparisons with a strictly positive count (e.g. '=2') are not allowed for vector nodes. On machines with 'canTrustVectorSize == false' (cascade lake) the maximal vector width is not predictable currently. Hence, on such a machine we have to automatically weaken the IR rules. All lower bound counts are performed checking with `IRNode.VECTOR_SIZE_ANY`. Upper bound counts with no user specified size are performed with `IRNode.VECTOR_SIZE_ANY` but upper bound counts with a user specified size are not checked at all. Details and reasoning can be found in [RawIRNode](./driver/irmatching/irrule/checkattribute/parsing/RawIRNode.java).
- However, the size does not have to be specified. In most cases, one either wants to have vectorization at the maximal possible vector width, or no vectorization at all. Hence, for lower bound counts ('>' or '>=') the default size is `IRNode.VECTOR_SIZE_MAX`, and for upper bound counts ('<' or '<=' or '=0' or failOn) the default is `IRNode.VECTOR_SIZE_ANY`. Equal count comparisons with a strictly positive count (e.g. '=2') are not allowed for vector nodes. On machines with 'canTrustVectorSize == false' (cascade lake) the maximal vector width is not predictable currently. Hence, on such a machine we have to automatically weaken the IR rules. All lower bound counts are performed checking with `IRNode.VECTOR_SIZE_ANY`. Upper bound counts with no user specified size are performed with `IRNode.VECTOR_SIZE_ANY` but upper bound counts with a user specified size are not checked at all. Details and reasoning can be found in [RawIRNode](./driver/irmatching/irrule/checkattribute/parsing/RawIRNode.java).
+ However, the size does not have to be specified. In most cases, one either wants to have vectorization at the maximal possible vector width, or no vectorization at all. Hence, for lower bound counts ('>' or '>=') the default size is `IRNode.VECTOR_SIZE_MAX`, and for upper bound counts ('<' or '<=' or '=0' or failOn) the default is `IRNode.VECTOR_SIZE_ANY`. Equal count comparisons with a strictly positive count (e.g. '=2') are not allowed for vector nodes. On machines with 'canTrustVectorSize == false' (Cascade Lake) the maximal vector width is not predictable currently. Hence, on such a machine we have to automatically weaken the IR rules. All lower bound counts are performed checking with `IRNode.VECTOR_SIZE_ANY`. Upper bound counts with no user specified size are performed with `IRNode.VECTOR_SIZE_ANY` but upper bound counts with a user specified size are not checked at all. Details and reasoning can be found in [RawIRNode](./driver/irmatching/irrule/checkattribute/parsing/RawIRNode.java).
Same for other occurrences.
}
case "=" -> {
    // if 0, we expect none -> expect to not find any with any size
    return comparison.getGivenValue() > 0;
    // "=0" is same as setting upper bound - just like for failOn. But i we compare equals a
- // "=0" is same as setting upper bound - just like for failOn. But i we compare equals a
+ // "=0" is same as setting upper bound - just like for failOn. But if we compare equals a
// if 0, we expect none -> expect to not find any with any size
return comparison.getGivenValue() > 0;
// "=0" is same as setting upper bound - just like for failOn. But i we compare equals a
// strictly positive number it is like setting both and upper and lower bound (equal).
- // strictly positive number it is like setting both and upper and lower bound (equal).
+ // strictly positive number it is like setting both upper and lower bound (equal).
I have only some minor comments left, otherwise, the update looks good to me!
@@ -105,6 +105,8 @@ public class IRNode {
private static final String STORE_OF_CLASS_POSTFIX = "(:|\\+)\\S* \\*" + END;
private static final String LOAD_OF_CLASS_POSTFIX = "(:|\\+)\\S* \\*" + END;

public static final String IMPOSSIBLE_NODE_REGEX = "impossible_node_regex";
Maybe add additional `#` to be on the safe side to never accidentally match it:
- public static final String IMPOSSIBLE_NODE_REGEX = "impossible_node_regex";
+ public static final String IMPOSSIBLE_NODE_REGEX = "#impossible_node_regex#";
Line is now removed, not required any more.
System.err.println("--- VMInfo from Test VM ---");
System.err.println("cpuFeatures: " + getStringValue("cpuFeatures"));
System.err.println("MaxVectorSize: " + getLongValue("MaxVectorSize"));
System.err.println("MaxVectorSizeIsDefault: " + getLongValue("MaxVectorSizeIsDefault"));
System.err.println("LoopMaxUnroll: " + getLongValue("LoopMaxUnroll"));
System.err.println("UseAVX: " + getLongValue("UseAVX"));
System.err.println("UseAVXIsDefault: " + getLongValue("UseAVXIsDefault"));
if (isDefaultCascadeLake()) {
    System.err.println(" -> You are on default Cascade Lake");
    System.err.println(" -> SuperWord expected to run with 32 byte, not 64 byte, VectorAPI expected to use 64 byte");
    System.err.println(" -> \"canTrustVectorSize == false\", some vector node IR rules are made weaker.");
}
Is this a leftover from debugging? If you want to print this information for debugging purposes, I suggest to move this code to `VMInfoParser` and additionally guard it with `VERBOSE || PRINT_IR_ENCODING`. The name `PRINT_IR_ENCODING` is not completely correct here but we might want to clean this up separately at some other point in time.
You can keep the verification of calling the `get*()` methods here, though.
I can just remove it. It is not necessary any more I think.
* make use of the full MaxVectorSize. For Cascade Lake we by default only use
* 32 bytes for SuperWord even though MaxVectorSize is 64. But the VectorAPI still
Suggestion: For Cascade Lake, we only use 32 bytes for SuperWord by default even though MaxVectorSize is 64.
Co-authored-by: Christian Hagedorn <christian.hagedorn@oracle.com>
Thanks for the updates, looks good!
Thanks @TobiHartmann @chhagedorn for all the discussions, help and reviews!
Going to push as commit a02d65e.
Your commit was automatically rebased without conflicts.
For some changes to `SuperWord`, and maybe auto-vectorization in general, I want to strengthen the IR Framework.
Motivation
I want to not just find the relevant IR nodes, but also assert that they have the maximal length that they could have on the respective platform (given the CPU features and `MaxVectorSize`). Without this verification it is possible that a future change leads to a regression where we still vectorize but at shorter vector widths than before, leading to performance loss.
How to use it
All `IRNode`s in `test/hotspot/jtreg/compiler/lib/ir_framework/IRNode.java` that are created with `vectorNode` are now matched with their `type` and `size`. The regex might now look something like this:
"(\d+(\s){2}(VectorCastF2X.*)+(\s){2}===.*vector[A-Za-z]\[8\]:\{int\})"
which would match IR nodes dumped like this:
1150 VectorCastF2X === _ 1151 [[ 1146 ]] #vectory[8]:{int} ...
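As a rough illustration of how such a regex separates nodes by size, the sketch below runs the regex quoted above against two hand-written dump lines. The class name, constants and helper are hypothetical. The double spaces in the dump lines are an assumption: the regex requires two whitespace characters after the node index and before `===`, so the single spaces in the quoted dump line are presumably collapsed whitespace.

```java
import java.util.regex.Pattern;

public class VectorNodeRegexSketch {
    // The regex from the description, written as a Java string literal.
    public static final String REGEX =
        "(\\d+(\\s){2}(VectorCastF2X.*)+(\\s){2}===.*vector[A-Za-z]\\[8\\]:\\{int\\})";

    // Assumed dump layout: two spaces around the node name (hypothetical lines).
    public static final String DUMP_8_INTS =
        " 1150  VectorCastF2X  === _ 1151  [[ 1146 ]]  #vectory[8]:{int}";
    public static final String DUMP_4_INTS =
        " 1150  VectorCastF2X  === _ 1151  [[ 1146 ]]  #vectorx[4]:{int}";

    // True if the regex is found anywhere in the given dump line.
    public static boolean matches(String irLine) {
        return Pattern.compile(REGEX).matcher(irLine).find();
    }

    public static void main(String[] args) {
        System.out.println("8 ints matched: " + matches(DUMP_8_INTS));
        System.out.println("4 ints matched: " + matches(DUMP_4_INTS));
    }
}
```

The second line has the same node name and element type but a different element count, so a rule that asks for 8 ints does not count it.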
The goal was to keep it simple and straightforward. In most cases, you can just use the nodes as before, and implicitly we now check for maximal size automatically. However, in some cases we want to ensure there is no or only a limited number of nodes (`failOn` or comparison `<` or `<=` or `=0`). In those cases we usually want to make sure there is not any node of any size, so we match with any size by default. The size can also explicitly be constrained using `IRNode.VECTOR_SIZE`.
Some examples:
`@IR(counts = {IRNode.LOAD_VECTOR_I, " >0 "})`
-> search for a `LoadVector` node with `type` `int`, and maximal `size` possible on the machine (limited by CPU features and `MaxVectorSize`). This is the most common use case.
`@IR(failOn = { IRNode.LOAD_VECTOR_L, IRNode.STORE_VECTOR })`
-> fail if there is a `LoadVector` with type `long`, of `any` size.
`@IR(counts = { IRNode.XOR_VI, IRNode.VECTOR_SIZE_4, " > 0 "})`
-> find at least one `XorV` node with type `int` and exactly `4` elements. Useful for VectorAPI when the vector species is fixed.
`@IR(counts = { IRNode.LOAD_VECTOR_D, IRNode.VECTOR_SIZE + "min(4, max_double)", " >0 " })`
-> search for a `LoadVector` node with `type` `double`, and `size` exactly equal to `min(4, max_double)` (so 4 elements, or if the hardware allows fewer `doubles`, then that number).
`@IR(counts = { IRNode.ABS_VF, IRNode.VECTOR_SIZE + "min(LoopMaxUnroll, max_float)", ">= 1" })`
-> find at least one `AbsV` node with type `float`, and `size` exactly equal to the smaller of `LoopMaxUnroll` or the maximal size allowed for `floats` (useful for tests where `LoopMaxUnroll` is artificially lowered, which sometimes prevents the maximal filling of vectors).
`@IR(counts = {IRNode.VECTOR_CAST_I2F, IRNode.VECTOR_SIZE + "min(max_int, max_float)", ">0"})`
-> find at least one `VectorCastI2X` node that casts to type `float`, and where the size is exactly equal to the smaller of the maximal sizes for `ints` and `floats`. This is helpful when there are multiple types in the loop, and the number of elements is limited by the sizes of multiple types.
I had to change lots of occurrences, hence you can find many more examples in the tests.
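The default-size rules, together with the weakening that the README change in the review thread describes for machines with `canTrustVectorSize == false`, can be summarized as a small decision function. This is only a sketch of the described behaviour, not the framework's code. The class, enum and method names are hypothetical; according to the README text, the actual logic and its reasoning live in `RawIRNode`.

```java
public class DefaultVectorSizeSketch {
    /** Which vector size a rule ends up being matched against (hypothetical enum). */
    public enum SizeCheck { MAX, ANY, USER_SIZE, NOT_CHECKED }

    /**
     * @param comparison         "failOn", or a counts comparison such as ">0", "<= 2", "=0"
     * @param userSize           true if the rule carries an explicit size specifier
     * @param canTrustVectorSize false where the maximal vector width is not predictable
     */
    public static SizeCheck sizeCheckFor(String comparison, boolean userSize,
                                         boolean canTrustVectorSize) {
        String c = comparison.replace(" ", "");
        boolean lowerBound = c.startsWith(">");
        boolean upperBound = c.equals("failOn") || c.startsWith("<") || c.equals("=0");
        if (!lowerBound && !upperBound) {
            // e.g. "=2": equality with a strictly positive count is not allowed.
            throw new IllegalArgumentException("not allowed for vector nodes: " + comparison);
        }
        if (lowerBound) {
            if (!canTrustVectorSize) {
                return SizeCheck.ANY;          // weakened: any size satisfies the rule
            }
            return userSize ? SizeCheck.USER_SIZE : SizeCheck.MAX;
        }
        if (!userSize) {
            return SizeCheck.ANY;              // upper bounds default to any size
        }
        // A user-specified size on an upper bound cannot be counted reliably
        // when the size cannot be trusted, so the rule is not checked at all.
        return canTrustVectorSize ? SizeCheck.USER_SIZE : SizeCheck.NOT_CHECKED;
    }

    public static void main(String[] args) {
        System.out.println(sizeCheckFor(" >0 ", false, true));
        System.out.println(sizeCheckFor("failOn", false, true));
        System.out.println(sizeCheckFor("<= 2", true, false));
    }
}
```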
Details
Vector nodes that should be tested for `type` and `size` are now to be created with `VECTOR_PREFIX` and `vectorNode`, see `IRNode.java`.
When specifying such a `vectorNode` in an IR rule, one first uses the `irNodePlaceholder` (e.g. `LOAD_VECTOR_I`), and following it one can optionally add an `IRNode.VECTOR_SIZE` specifier, which is then parsed by `parseVectorNodeSize`. This allows either naming a concrete size (e.g. `IRNode.VECTOR_SIZE_8`), a tag (`IRNode.VECTOR_SIZE + "<tag>"`) where the tag can be one of the tags listed in `parseVectorNodeSizeTag`, or a `min(...)` clause which computes the minimum value of a comma separated list of tags. As a last resort one can match for any size (`IRNode.VECTOR_SIZE_ANY`).
The maximal vector size for any type is computed in `getMaxElementsForType`, under consideration of the CPU features and the `MaxVectorSize`.
Changes to tests
Unfortunately, I had to change a lot of IR rules, though not substantially. Most changes are because we usually had nodes like `MAX_V` or `LOAD_VECTOR` which matched for any type, and I had to create one node per type now (e.g. `MAX_VF, MAX_VD`, or `LOAD_VECTOR_I, LOAD_VECTOR_L, LOAD_VECTOR_F, ...`). While this was a lot of work, it is still good to know that we are generating the nodes with the correct types.
In the VectorAPI tests there were many which required concrete sizes due to the concrete size of the vector species. This is nice to test, since it guarantees that the vector species indeed generate the expected vector sizes.
A few tests required more attention, where I had to use patterns like `IRNode.VECTOR_SIZE + "min(...)"`. These are especially interesting, as they test cases like mixed types (e.g. casting between types).
A few tests had loop iteration counts that were too small (maybe 512), such that the loops were not sufficiently unrolled to reach the maximal vector width. This happened especially with byte cases, which require an unrolling factor of 64 to fill 512bit registers. I increased the loop iteration counts, such that we can also properly test the largest vector widths. This improves our test coverage.
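The unrolling factor of 64 for bytes follows from plain arithmetic: a register of a given width holds width divided by element width many elements. The sketch below spells this out. The class and method names are hypothetical, and it deliberately ignores CPU features, which the real `getMaxElementsForType` takes into account.

```java
public class VectorLanesSketch {
    // Number of elements of the given byte size that fit into one vector register.
    public static int lanes(int registerBits, int elementBytes) {
        return registerBits / (elementBytes * 8);
    }

    public static void main(String[] args) {
        // Elements needed per loop iteration to fill a 512-bit register.
        System.out.println("byte:   " + lanes(512, Byte.BYTES));
        System.out.println("short:  " + lanes(512, Short.BYTES));
        System.out.println("int:    " + lanes(512, Integer.BYTES));
        System.out.println("double: " + lanes(512, Double.BYTES));
    }
}
```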
Future Work
There are a few nodes that I did not yet handle with `vectorNode` (e.g. `VECTOR_REINTERPRET`, `OR_V_MASK`, `MACRO_LOGIC_V`, `LOAD_VECTOR_GATHER(_MASKED)`). Some of these only have very few tests and are all from the Vector API, which was not my priority here. They can easily be converted should the need arise in the future.
While looking at lots of IR tests I also came up with these RFEs:
JDK-8310891 C2 SuperWord tests: move platform requirements to IR rules
JDK-8310523 Add IR tests for nodes that have too few IR tests yet
JDK-8310533 [IR Framework] Add possibility to automatically verify that a test method always returns the same result
Testing
tier1-tier6 and stress-testing.
Reviewing
Using `git`
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14539/head:pull/14539
$ git checkout pull/14539
Update a local copy of the PR:
$ git checkout pull/14539
$ git pull https://git.openjdk.org/jdk.git pull/14539/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 14539
View PR using the GUI difftool:
$ git pr show -t 14539
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14539.diff