[WIP] Add ValueNumbering support for GT_SIMD and GT_HWINTRINSIC tree nodes #31834

Open
wants to merge 22 commits into
base: master
from

Conversation

@briansull
Contributor

briansull commented Feb 6, 2020

No description provided.

@tannergooding

Member

tannergooding commented Feb 7, 2020


#ifdef FEATURE_HW_INTRINSICS
case GT_HWINTRINSIC:
return true; // allow Hardware Intrinsics to be CSE-ed

@tannergooding

tannergooding Feb 7, 2020

Member

Are there any HWIntrinsics we don't want CSE'd or any that need special handling (such as the non-temporal load/stores)?

@CarolEidt

CarolEidt Feb 7, 2020

Member

Excellent thought - I would say this should include:

  • any of the "special" loads (LoadFence, any NonTemporal)
  • any store (i.e. HW_Category_MemoryStore and StoreFence)
  • any prefetch
  • MemoryFence

Perhaps we should consider adding a flag for this, but a reasonable proxy in the meantime would be to exclude any HW_Category_MemoryStore, HW_Category_MemoryLoad and HW_Category_Special:

  • There would be few, if any, loads that the JIT could identify as CSE's (local stack loads would generally not show up as actual load intrinsics).
  • The only HW_Category_Special that I found that this would unnecessarily exclude is CompareLessThan
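A minimal sketch of the proxy described above, assuming a simplified category enum. Only the three excluded category names come from this thread; `SketchCategory`, its `SimpleSIMD` member, and `IsCseCandidateCategory` are hypothetical and are not the JIT's actual types or API.

```cpp
#include <cstdint>

// Hypothetical stand-in for the JIT's intrinsic categories. Only MemoryLoad,
// MemoryStore and Special mirror names mentioned in the discussion.
enum class SketchCategory : std::uint8_t
{
    SimpleSIMD,  // assumed name for an ordinary arithmetic intrinsic
    MemoryLoad,  // stands in for HW_Category_MemoryLoad
    MemoryStore, // stands in for HW_Category_MemoryStore
    Special      // stands in for HW_Category_Special
};

// Conservative proxy: refuse to CSE anything that touches memory or that
// needs special handling, and allow everything else.
constexpr bool IsCseCandidateCategory(SketchCategory category)
{
    switch (category)
    {
        case SketchCategory::MemoryLoad:
        case SketchCategory::MemoryStore:
        case SketchCategory::Special:
            return false;
        default:
            return true;
    }
}
```

The trade-off is the one noted above: a category-level check is coarse, so it also rejects the occasional harmless intrinsic that happens to be filed under an excluded category.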
@briansull

Contributor Author

briansull commented Feb 7, 2020

One issue that I ran into while working on this feature was that the numArgs for the SIMD and HW_INTRINSICS is not always set to a useful value. Value numbering needs to know the exact number of arguments to expect, and some instructions have a "-1" for numArgs or specify 2 when they can accept either 1 or 2.

For the purposes of Value Numbering it would be better to have multiple entries in the table when the intrinsic can take different numbers of arguments.
Currently I just bail out when the table specifies a numArgs of -1.
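As a rough illustration of that bail-out, here is a sketch that only accepts an intrinsic for value numbering when the table gives a fixed arity that matches the node. `SketchIntrinsicInfo` and `TryGetFixedArity` are hypothetical names for this example; the real tables and checks in the JIT look different.

```cpp
// Hypothetical table row: tableNumArgs is -1 when the intrinsic accepts a
// variable number of arguments.
struct SketchIntrinsicInfo
{
    const char* name;
    int         tableNumArgs;
};

// Returns true and writes the arity only when the table entry is fixed and
// agrees with the operand count actually present on the node.
bool TryGetFixedArity(const SketchIntrinsicInfo& info, int actualOperandCount, int* arity)
{
    if (info.tableNumArgs < 0)
    {
        return false; // variable arity: bail out of value numbering
    }
    if (info.tableNumArgs != actualOperandCount)
    {
        return false; // table says 2 but the node has 1 (or vice versa)
    }
    *arity = info.tableNumArgs;
    return true;
}
```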

@briansull

Contributor Author

briansull commented Feb 7, 2020

For true binary operations, it would be useful to have a "commutative" column that indicates that the operands can be safely swapped.

@tannergooding

Member

tannergooding commented Feb 7, 2020

For true binary operations, it would be useful to have a "commutative" column that indicates that the operands can be safely swapped.

You can use static bool IsCommutative(NamedIntrinsic id): https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/hwintrinsic.h#L202. You would get the id via AsHWIntrinsic()->gtHWIntrinsicId

This is likewise hooked up through the standard OperIsCommutative API.

It's also worth noting that there are some intrinsics, like FusedMultiplyAdd, which are commutative but have 3 operands and require special handling to account for this (the operation is a fused a * b + c and so the a and b operands are commutative).
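One way to picture the special handling for the three-operand case is to canonicalize only the two multiplicands before building a value-numbering key. This is a sketch under that assumption, not how the JIT does it; `ValueNum` here is a plain `int` and `CanonicalizeFmaOperands` is a hypothetical helper.

```cpp
#include <array>
#include <utility>

using ValueNum = int; // simplified stand-in for a JIT value number

// For a fused a * b + c, only a and b may be swapped. Ordering them makes
// (a, b, c) and (b, a, c) produce the same key, while c stays in place.
std::array<ValueNum, 3> CanonicalizeFmaOperands(ValueNum a, ValueNum b, ValueNum c)
{
    if (a > b)
    {
        std::swap(a, b);
    }
    return {{a, b, c}};
}
```

Moving the addend into a multiplicand slot still yields a different key, which is the behavior a plain "commutative" flag on a binary operation cannot express.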

For the purposes of Value Numbering it would be better to have multiple entries in the table when the intrinsic can take different numbers of arguments.

This would make lookup during importation quite a bit more expensive. We already have to check the names, I don't think we want to also have to start checking other parameters.

There are existing helper methods such as static int lookupNumArgs(const GenTreeHWIntrinsic* node): https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/hwintrinsic.h#L135 which will return the correct number of arguments for a given node and will attempt to do it as efficiently as possible.

briansull added 9 commits Jan 30, 2020
  - Zero out the bbAssertionIn values, as these can be referenced in RangeCheck::MergeAssertion
    and this is shared state with the CSE phase: bbCseIn
Added bool OperIsMemoryLoad() to GenTreeSIMD, returns true for SIMDIntrinsicInitArray
Added valuenumfuncs.h to src/coreclr/src/jit/CMakeLists.txt
Mutate the global heap when performing a HW_INTRINSIC memory store operation
Printing of SIMD constants only supports 0
@briansull briansull force-pushed the briansull:vn-simd branch from 5e05ec9 to 51201c1 Feb 7, 2020
@briansull

Contributor Author

briansull commented Feb 7, 2020

There are existing helper methods such as static int lookupNumArgs(const GenTreeHWIntrinsic* node): https://github.com/dotnet/runtime/blob/master/src/coreclr/src/jit/hwintrinsic.h#L135 which will return the correct number of arguments for a given node and will attempt to do it as efficiently as possible.

The Value number operation defined by this enum (below), requires that each enum value have a fixed value for arity and commutativity, as we just pass around the value number and then expect to be able to unpack it based upon the VNFunc.

enum VNFunc
{
    // Implicitly, elements of genTreeOps here.
    VNF_Boundary = GT_COUNT,
#define ValueNumFuncDef(nm, arity, commute, knownNonNull, sharedStatic) VNF_##nm,
#include "valuenumfuncs.h"
    VNF_COUNT
};
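To show why each entry needs a fixed arity and commutativity, here is a self-contained sketch of the same X-macro pattern with the attributes expanded into a lookup table. The inline list stands in for valuenumfuncs.h, and every name prefixed with `Sketch` or `SVNF_` (as well as the three entries) is made up for illustration.

```cpp
// Hypothetical inline list standing in for valuenumfuncs.h.
#define SKETCH_VN_FUNCS(def)                                                   \
    def(SketchAdd, 2, true)                                                    \
    def(SketchSub, 2, false)                                                   \
    def(SketchNeg, 1, false)

// First expansion: one enum value per entry.
enum SketchVNFunc
{
#define AS_ENUM(nm, arity, commute) SVNF_##nm,
    SKETCH_VN_FUNCS(AS_ENUM)
#undef AS_ENUM
    SVNF_COUNT
};

struct SketchVNFuncAttribs
{
    int  arity;
    bool commutative;
};

// Second expansion: one attribute row per entry, indexed by the enum value.
static const SketchVNFuncAttribs g_sketchAttribs[SVNF_COUNT] = {
#define AS_ATTRIBS(nm, arity, commute) {arity, commute},
    SKETCH_VN_FUNCS(AS_ATTRIBS)
#undef AS_ATTRIBS
};

inline int SketchArityOf(SketchVNFunc func)
{
    return g_sketchAttribs[func].arity;
}

inline bool SketchIsCommutative(SketchVNFunc func)
{
    return g_sketchAttribs[func].commutative;
}
```

Because the attributes are looked up from the function id alone, an intrinsic that takes either 1 or 2 arguments has no single row that describes it.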
@tannergooding

Member

tannergooding commented Feb 8, 2020

The Value number operation defined by this enum (below), requires that each enum value have a fixed value for arity and commutativity, as we just pass around the value number and then expect to be able to unpack it based upon the VNFunc

@CarolEidt, do you have any concerns about updating the hwintrinsic tables to only contain constant data (so entries that currently have a variable number of args or supporting multiple SIMD sizes would be split into multiple entries, for example)?

It would make the table larger but I think we could probably cut some of the current lookup cost and still make this manageable (like having a table per column as we've done elsewhere; or tracking the first/last entry for a given ISA).
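A rough sketch of the "first/last entry for a given ISA" idea, assuming a flat table in which each intrinsic appears once per supported argument count. The table contents, the ISA indices, and the names `SketchEntry`, `SketchIsaRange` and `LookupEntry` are all hypothetical.

```cpp
#include <cstring>

// Hypothetical flat table: one row per (name, numArgs) combination.
struct SketchEntry
{
    const char* name;
    int         numArgs;
};

static const SketchEntry g_sketchEntries[] = {
    {"Add", 2},     {"Sqrt", 1},                  // rows for ISA 0
    {"Abs", 1},     {"Shuffle", 2}, {"Shuffle", 3} // rows for ISA 1
};

// Inclusive index range of each ISA's rows in the flat table.
struct SketchIsaRange
{
    int first;
    int last;
};

static const SketchIsaRange g_sketchRanges[] = {{0, 1}, {2, 4}};

// Scans only the rows of the requested ISA. The cheap integer comparison
// runs before the string comparison. Returns the row index, or -1.
int LookupEntry(int isa, const char* name, int numArgs)
{
    const SketchIsaRange& range = g_sketchRanges[isa];
    for (int i = range.first; i <= range.last; i++)
    {
        if (g_sketchEntries[i].numArgs == numArgs && std::strcmp(g_sketchEntries[i].name, name) == 0)
        {
            return i;
        }
    }
    return -1;
}
```

Restricting the scan to one ISA's slice is what keeps the extra rows from making every lookup proportionally slower.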

@CarolEidt

Member

CarolEidt commented Feb 8, 2020

do you have any concerns about updating the hwintrinsic tables to only contain constant data (so entries that currently have a variable number of args or supporting multiple SIMD sizes would be split into multiple entries

It would be unfortunate, but I don't think it would be too prohibitive to split it out by number of args. I don't think it's necessary to split it out by different SIMD sizes, as nodes of different sizes should never have the same value type - isn't that right @briansull?

@briansull

Contributor Author

briansull commented Feb 8, 2020

I don't think it's necessary to split it out by different SIMD sizes, as nodes of different sizes should never have the same value type - isn't that right @briansull?

Actually the type size isn't part of the value number, so there are a couple of nodes where Value Numbering adds an additional arg that holds a constant representing the size of the operation. We do that for GT_CAST, and I also had to add this for the SIMD Init operation, as one test was CSE-ing two different sized Vector Inits: 3-way and 4-way. So it's awkward, but it is something that we can already handle.

            case SIMDIntrinsicInit:
            {
                // Also encode the resulting type as op2vnp
                ValueNumPair op2vnp;
                op2vnp.SetBoth(vnStore->VNForIntCon(INT32(tree->TypeGet())));

                excSetPair     = op1Xvnp;
                normalPair     = vnStore->VNPairForFunc(tree->TypeGet(), GetVNFuncForNode(tree), op1vnp, op2vnp);
                tree->gtVNPair = vnStore->VNPWithExc(normalPair, excSetPair);
                return;
            }
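The effect of the extra operand can be seen in a toy value-number store, sketched here under heavy simplification. `SketchVNStore` and its `VNForFunc` are hypothetical and only mimic the idea of keying on (function, op1, op2); in the snippet above the second operand is a value number for an integer constant, whereas this sketch passes a type tag directly.

```cpp
#include <map>
#include <tuple>

using ValueNum = int; // simplified stand-in for a JIT value number

// Toy store: identical (func, op1, op2) keys always get the same value
// number, and any difference in the key produces a fresh one.
class SketchVNStore
{
public:
    ValueNum VNForFunc(int func, ValueNum op1, ValueNum op2)
    {
        const auto key = std::make_tuple(func, op1, op2);
        const auto it  = m_map.find(key);
        if (it != m_map.end())
        {
            return it->second;
        }
        const ValueNum vn = m_next++;
        m_map.emplace(key, vn);
        return vn;
    }

private:
    std::map<std::tuple<int, ValueNum, ValueNum>, ValueNum> m_map;
    ValueNum                                                m_next = 100;
};
```

With the result type folded into the key, a 3-way and a 4-way Init of the same scalar no longer share a value number, so they are not treated as the same CSE candidate.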
…a SIMD LclVar in PerformCSE
briansull added 4 commits Feb 12, 2020
…StructHndMismatch and avoid making the candidate into a CSE
briansull added 7 commits Feb 12, 2020
Added Mismatched Struct Handle assert in ConsiderCandidates
added VNF_SimdType // A value number function to compose a SIMD type
added vnDumpSimdType
Removed Mismatched Struct Handle assert
5 participants