Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Current proposal for fallback code gen for Hardware Intrinsics with immediate parameters is problematic #9522

Closed
4creators opened this issue Jan 9, 2018 · 6 comments

Comments

@4creators
Copy link
Contributor

Hardware Intrinsics according to current consensus on their implementation should fall back to generating in Jit switch table based implementations for immediate operands. This approach should be applied when Jit compiler encounters method requiring immediate parameter with non compile constant value passed or when similar method is called via reflection.

The following example illustrates problem which can be encountered with such implementation:

void MethodWithBug()
{
    MethodInfo shiftLeft = typeof(Sse2).GetMethod("ShiftLeftLogical", BindingFlags.Public | BindingFlags.Static );

    Vector128<uint>[] results = new 
    for (byte i = 0; i < vectors.Length; i++)
    {
        // Jit will have to recompile call every time another i value is passed
        Vector128<uint> result = shiftLeft.Invoke(null, new object[] { FooGetVector128<uint>(), i});
  
        // do something with result
    }
}

Problem with the above code is that everytime shifLeft.Invoke is called with new value of i Jit has to compile whole method again. This could lead to several problems:

  • every call would equal new method compilation
  • Jit would have to handle all jitted method versions
  • execution would be extremely slow

IMO it would be better to drop support for execution of method with invalid parameters and throw ArgumentException or convert immediate argument to Vector where it is possible (majority of ShiftNNN instructions but not all will support such conversion).

We can as well expand usage of Analyzers to raise error for such code or provide check inside reflection to throw on such calls. This same holds for delegates.

@tannergooding @fiigii @sdmaclea @CarolEidt @jkotas

@mikedn
Copy link
Contributor

mikedn commented Jan 9, 2018

Problem with the above code is that everytime shifLeft.Invoke is called with new value of i Jit has to compile whole method again.

Huh?!?

@tannergooding
Copy link
Member

tannergooding commented Jan 9, 2018

Problem with the above code is that everytime shifLeft.Invoke is called with new value of i Jit has to compile whole method again.

This is not the case.

See dotnet/coreclr@a8d8aa3#diff-847442c7efdf7a78e78711bea30fd4b2R544 for how these could be implemented.

For the software fallback case, we JIT the method once and emit a giant switch table for all 255 cases (or max + default, for the scenario where all 255 cases aren't supported) where each switch entry executes the appropriate instruction with the relevant constant.

@4creators
Copy link
Contributor Author

OK got it. But than we end up with huge emitted method - about 1 kB for 256 switch cases for single intrinsic call. Usually implementations make 10s or sometimes 100s of calls and if part is with immediate operands this would translate into huge memory consumption.

@mikedn
Copy link
Contributor

mikedn commented Jan 9, 2018

Usually implementations make 10s or sometimes 100s of calls and if part is with immediate operands this would translate into huge memory consumption.

What does the number of calls has to do with memory consumption?

@CarolEidt
Copy link
Contributor

Calling these intrinsics with a non-immediate operand is not the recommended usage - in fact, we've talked about the desirability of analysis tools that would identify such cases. The reason we are planning to support it is that it is required in any event to support reflection and diagnostic tools, and it therefore presents uniform behavior when it is supported for the general case.

Each intrinsic would only ever be JIT'd once per instantiated generic parameter, so I don't see that the expansion will have undue impact on generated code size.

@4creators
Copy link
Contributor Author

Each intrinsic would only ever be JIT'd once per instantiated generic parameter

That addresses my worry. Thanks for explanation.

@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 18, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants