New IL instruction for typeswitch #12260
Why would this be better than a sequence of if/else pairs? Is this about reducing IL size? Do you expect the JIT to have a more optimal implementation than the C# compiler generating the equivalent of the code below?

```csharp
if (o.GetType() == typeof(string))
{
    ...
}
else if (o.GetType() == typeof(object))
{
    ...
}
else
{
    ...
}
```
@davidwrighton I think for literature on compiler optimization of pattern matching, the following OCaml papers give you the requisite knowledge and understanding: Compiling Pattern Matching to Good Decision Trees (Maranget).

Further, by hoisting this to a logical IL instruction, it is conceivable that really large F#/C# data-science algorithms using complex pattern matching could benefit from hot-path optimization of the compiled decision tree. I realize C# only implements basic switch-on-type pattern matching right now, but the general benefits of further optimization are widespread. For OOP type matching like switch-on-type, the compiler can specifically reason that:

a) there are significantly more "default case" type matches that fall through to the default case

The only remaining question is: if you're going to add an instruction, why take only one type parameter? The IL instruction should allow stacking so that it works as a decision tree. For a view into how OCaml does this, see the free chapter 23 of Real World OCaml: https://v1.realworldocaml.org/v1/en/html/the-compiler-backend-byte-code-and-native-code.html

@gafter What do you think?
Here are two reasons:
[edited 2019-11-04 to add]
@jzabroski Thanks for those references!
Ok, you can flip the sequence to use
The most efficient way to do a type switch over a large number of types is a hashtable. The runtime can make this work well by providing an efficient way to get a type hash code, if that is not efficient enough today. Other than that, the compiler or a compiler support library can build about as efficient a hashtable as the runtime itself. I think it is likely that building the hashtable in the compiler or the compiler support library would give you more flexibility for implementing more complex matching patterns efficiently.
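As an illustration of the hashtable idea (a hypothetical sketch with invented names, not the compiler-support library being discussed): for exact-type matches, a table built once per switch site maps the operand's runtime type to a case index in amortized constant time, regardless of how many cases there are.

```csharp
using System;
using System.Collections.Generic;

static class HashtableTypeSwitch
{
    // Hypothetical compiler-support table: exact runtime type -> case index.
    // Built once per switch site; each dispatch is then a single hash lookup.
    static readonly Dictionary<Type, int> Cases = new Dictionary<Type, int>
    {
        { typeof(string), 0 },
        { typeof(int),    1 },
        { typeof(double), 2 },
    };

    // Returns the case index for o's exact runtime type, or -1 for the default case.
    public static int CaseIndex(object o) =>
        Cases.TryGetValue(o.GetType(), out int i) ? i : -1;
}
```

Note this handles only exact-type matches; subtype and interface matches are not answered by this table alone, which is where the design discussion gets harder.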
Are you suggesting that we build a hashtable dynamically and incrementally? Wouldn't that prevent the unloading of types once they appear as keys in that table?
That would remove all flexibility that the runtime would otherwise have to implement type switches in the most efficient manner for the platform.
Yes, the hashtable would need to be aware of type unloading. Hashtables aware of type unloading are not unusual and the primitives available for building them are not great. It is something that we should improve.
As the C# pattern matching is getting more complex, I think it is starting to look pretty similar to ... We should consider taking a similar route for pattern matching as we have taken for ...
@jkotas You may have convinced me. It looks like the break-even point for using a hashtable is about 20 types in a switch. See dotnet/roslyn#31515 (comment) for a benchmark. I'm not sure how the JIT could do better. Without evidence to the contrary, perhaps this issue should be closed?
Improved the prototype library so that the break-even point for using a hashtable is about 10 types in a switch. See https://github.com/gafter/TypeSwitch/blob/master/TypeSwitch/TypeSwitchDispatch.cs
I believe it is more appropriate for this kind of support to be directly in the runtime, where it can be optimized dynamically based on the program's actual behavior. The library approach fixes the time-space tradeoff at compile-time and makes it nearly impossible for the runtime to do any better.
Maybe we can teach the JIT to recognize the pattern:
No IL changes, and the IL works on older runtimes.
They are subtype tests, not exact type tests.
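The distinction matters because an exact-type test and a subtype test can disagree on the same operand. A minimal illustration (hypothetical helper names, chosen only to make the contrast concrete):

```csharp
static class TestKinds
{
    // Exact test: true only when the runtime type is exactly object.
    public static bool ExactObject(object o) => o.GetType() == typeof(object);

    // Subtype test: true for any non-null reference, since every type derives from object.
    public static bool SubtypeObject(object o) => o is object;
}
```

For a `string` operand, `ExactObject` is false while `SubtypeObject` is true, so a JIT pattern-match over `GetType() == typeof(T)` chains cannot be reused as-is for `is`-style switch cases.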
The runtime/JIT does recognize many framework types as intrinsics and optimizes them in custom ways. The library approach does not prevent the time-space tradeoff from being optimized based on actual program behavior if the type is recognized as an intrinsic.
@jkotas Can you please give me an example of a case in which the runtime eliminates a static field due to replacing it and its use with intrinsic behavior? |
The JIT is able to inline the value of readonly static fields. It does not eliminate it. The field is still there, but it is not used on the hot path. |
@jkotas Making an appropriate time-space tradeoff would require eliminating the static field (which uses a lot of space to get reasonable performance). If the runtime does not do that kind of thing, then a new instruction would be an appropriate approach.
@jkotas Is there any way I can check whether the JIT has inlined something? For example, from some code I have written, how would I tell whether this has been inlined:

```csharp
static class Constructor<T>
    where T : new()
{
    public static readonly Func<IScope, T> Empty =
        ((Expression<Func<IScope, T>>)(scope => scope.Exists<T>() ?? new T())).Compile();
}
```

Similarly, is there any "Auditing Machine" I can query inside the runtime to determine how far away the code is/was from being JITed?
@jzabroski Your question is not relevant to this issue. Please open a new issue on this.
@gafter I do not think elimination of the static field would be required for a decent time-space tradeoff. Designs that require low-level implementation are harder to evolve and improve. They may be better in theory, but we find them to be worse in practice because of the experience and amount of work required to optimize them. I do not have a problem with this issue being open as a new IL instruction proposal. But it is not obvious to me that implementing this as a new IL instruction will achieve the best results.
Dumping my notes for future reference. A code sequence

would be defined to be equivalent to the following, except required to execute in (amortized) constant time:
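The IL sequences from those notes are not reproduced in this capture, so here is a hedged C# sketch of the reference semantics such an instruction might be specified against (hypothetical names): a first-match scan over the listed types, which a conforming implementation would be required to answer in amortized constant time rather than by actually scanning.

```csharp
using System;

static class TypeSwitchSemantics
{
    // Reference semantics only, not an implementation strategy: the first
    // listed type that the operand's runtime type is assignable to wins;
    // -1 means fall through to the default case.
    public static int Match(object o, Type[] cases)
    {
        Type t = o.GetType();
        for (int i = 0; i < cases.Length; i++)
            if (cases[i].IsAssignableFrom(t))
                return i;
        return -1;
    }
}
```

Because the cases are subtype tests, order matters: listing `typeof(object)` first would shadow every later case.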
Let's make the JIT recognize that IL pattern then. No need for new opcodes.
Then all the patterns the JIT recognizes must be documented
The problem is that this pattern must be reliably recognized by all JITs, not just one of them under some circumstances. Otherwise programmers cannot rely on the performance in writing their algorithms. Adding a new instruction adds confidence that JITs recognize it.
It seems like this is something that might fit fairly well as an intrinsic. No new IL or IL pattern matching would be needed, and it could be engineered so that JIT/runtime optimization was possible but also optional. We'd agree on some BCL method(s) for this operation. This method (or set of methods) would be targeted by C#, which when doing pattern matching would follow it up with an actual switch to direct control to the appropriate label. There might be some complexities fitting this in with side-effect ordering; I don't know how much code can run during the matching phase currently.
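The comment's proposed method name is elided in this capture, so the sketch below uses invented names (`TypeSwitchMatch`, `Describe`) purely to illustrate the shape being suggested: a BCL helper returns a case index, and the compiler follows it with an ordinary switch to direct control to the right label.

```csharp
using System;

static class Bcl
{
    // Hypothetical BCL intrinsic: index of the first case type the operand
    // is an instance of, or -1 for the default case. A JIT that recognizes
    // this call could replace it with an optimized dispatch.
    public static int TypeSwitchMatch(object o, Type[] cases)
    {
        for (int i = 0; i < cases.Length; i++)
            if (cases[i].IsInstanceOfType(o))
                return i;
        return -1;
    }
}

static class Demo
{
    static readonly Type[] Cases = { typeof(string), typeof(int) };

    // Roughly what C# codegen might emit for a pattern-matching switch:
    // match once, then switch on the resulting index.
    public static string Describe(object o) => Bcl.TypeSwitchMatch(o, Cases) switch
    {
        0 => "string",
        1 => "int",
        _ => "other",
    };
}
```

An unrecognized call still runs correctly (as a linear scan), which is the "possible but also optional" optimization property the comment describes.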
That is a very pragmatic middle ground. Also, by hoisting TypeSwitch up to the intrinsic level, it might even be possible to do a fancy C#-to-FPGA synthesizer (a completely different target architecture), as you have not yet dipped into the needless abstraction of CIL registers (and even so, the registers become fields in a special component named device in the FPGA). So your approach is likely the most portable, and it meets the specific use case of "emit an event if not optimized", so that developers can see in BenchmarkDotNet (or whatever) what's going on and on what basis they can make optimizer assumptions.
The "non-optimized" path is likely to be significantly worse than the straightforward IL sequence when not optimized (for a few reasons*). That is very unfortunate, though possibly not fatal.

The fatal flaw in this approach is that it relies on an optional optimization for an asymptotic performance improvement. A straightforward complexity analysis demonstrates that you cannot reliably get that asymptotic improvement to the program as a whole unless the optimization is reliably applied (nearly) everywhere it is applicable. That is why I believe a new IL instruction is the best approach: it guarantees that the runtime recognizes the "pattern". I'll explain in more detail when we meet on Monday afternoon.

If the performance is linear time (in the size of the data to the call) when not optimized but constant (amortized) time (which is what we want) when "optimized", then it is not something that we would be likely to want to rely on in designing language features (though, as I will explain, we need to rely on that in designing upcoming features), and probably not something we would benefit from using in normal C# code-gen scenarios.

*Some of the reasons that your proposed intrinsic is likely to be significantly slower than the (also unfortunately linear-time) straightforward IL sequence:
It is fine to add a guarantee that the generated code will always be amortized constant time and never linear (for a large number of types; not observable for a small number of types), for either the API or the IL instruction design. For non-optimized code, the JIT is going to call a method to implement this, even if it is a special IL instruction. For optimized code, the JIT can recognize the method call as an intrinsic and achieve the same outcomes as an IL instruction; again, no fundamental difference between the two designs. I believe that the interesting optimizations for the API vs. IL instruction design discussion are:

The question to ask is whether introducing a new IL instruction is worth the trouble and whether it will make the above set of optimizations significantly easier to do.
Adding @dsyme to this thread. |
A primitive or intrinsic to aid efficient compilation of pattern matching would be very useful, and it is clearly part of an improved overall design for .NET in the 21st century. I thought a lot about proposing such an intrinsic around 2001-03, as we worked through the foundations necessary for F# (generics etc.), but didn't iterate to a concrete design.
How would you guarantee constant-time switches for n types, each of which has thousands of implementors/derived types? E.g.:

```csharp
private string Switch(object o) {
    return o switch {
        IReadOnlyList _ => ...,
        IList _ => ...,
        IEnumerable _ => ...,
        ISerializable _ => ...,
        // etc.
    };
}
```
Ah, I see that we cache the index that a type was found at, so that after first usage it's constant time. |
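A hedged sketch of that caching scheme (hypothetical names, not the prototype's actual code): the linear subtype scan runs once per distinct concrete runtime type, after which every dispatch for that type is a single hashtable probe. Note the cache keys are `Type` objects, which is exactly the type-unloading interaction raised earlier in the thread.

```csharp
using System;
using System.Collections.Concurrent;

sealed class CachingTypeSwitch
{
    readonly Type[] _cases;

    // Cache: concrete runtime type -> index of the first matching case.
    // The O(n) scan over the cases runs once per distinct runtime type.
    readonly ConcurrentDictionary<Type, int> _cache = new ConcurrentDictionary<Type, int>();

    public CachingTypeSwitch(params Type[] cases) => _cases = cases;

    public int GetIndex(object o) =>
        _cache.GetOrAdd(o.GetType(), t =>
        {
            for (int i = 0; i < _cases.Length; i++)
                if (_cases[i].IsAssignableFrom(t))
                    return i;
            return -1; // default case
        });
}
```

This makes interface cases amortized constant time per runtime type, at the cost of the cache growing with (and rooting) every distinct type it sees.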
This issue tracks the same request as #29002. I believe the current consensus is that it can be done as an API in which the set of types to match is passed as generic type arguments (e.g. the elements of a tuple type). See https://github.com/gafter/TypeSwitch/blob/master/TypeSwitch/TypeSwitchDispatch.cs for a prototype that does not address the issue mentioned in #29002 (comment)
It would be very convenient to have a CLR instruction for switching on the type of an object. The C# compiler could use this to generate code for a pattern-matching switch statement. The IL instruction would be followed by a series of (type, label) pairs. At runtime it would consume its operand, determine the first type in the list that the object inherits from, and jump to the corresponding label with a reference of the appropriate type on the stack. It would be required to operate in constant amortized time, no matter the number of types and whether or not those types are sealed.
We currently generate a series of type tests and branches for this case.
It is intended that this instruction would run in constant time for user-written types.