Casting to a variant generic interface is much slower than to non-variant #4090

ikopylov opened this issue Mar 31, 2015 · 6 comments

@ikopylov
Contributor

The CoreFx issue (https://github.com/dotnet/corefx/issues/1182) proposed an explicit check for whether the IReadOnlyCollection<T> interface is supported, but the idea was rejected because casting to a variant generic interface is slow. According to my tests, it is ~20 times slower than casting to a non-variant interface.

Source code of the test: CastToInterfaceTest.cs
Test results:

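| Cast From    | Cast To                  | Is Variant | Total time (ms) |
|------------- |------------------------- |----------- |---------------- |
| List<int>    | ICollection<int>         | false      |             560 |
| List<double> | ICollection<int>         | false      |            1290 |
| Thread       | IReadOnlyCollection<int> | true       |            9713 |
| List<int>    | IReadOnlyCollection<int> | true       |           21988 |
| List<double> | IReadOnlyCollection<int> | true       |           24294 |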

It would be great to see some improvements here.

@ikopylov
Contributor Author

Some thoughts.

If I understand correctly, the casting code is here:
https://github.com/dotnet/coreclr/blob/b4bd51a485b0afedbcd168eba148ca6d17eff218/src/vm/methodtable.cpp#L1585

I see two possible optimizations:

  1. Add a fast check for whether the generic variant interface is implemented with the same set of generic arguments (full equivalence). This can be done easily by comparing MethodTable pointers inside CanCastByVarianceToInterfaceOrDelegate.
  2. Do the following check earlier:
if (GetTypeDefRid() != pTargetMT->GetTypeDefRid() || GetModule() != pTargetMT->GetModule()) {...}

I'm not sure these changes would help noticeably, but I believe the .NET team can find the real cause and fix the problem if that is possible.
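
The two suggestions above can be sketched on a toy model. This is not the CoreCLR code: `ToyType`, `IsAssignable`, `CanCastByVariance` and `CanCastToInterface` are hypothetical names, and the sketch assumes one canonical object per instantiated type (so pointer equality means "same type") and a single covariant type argument per interface.

```cpp
#include <vector>

// Hypothetical stand-in for a runtime type descriptor (NOT the real MethodTable).
struct ToyType {
    int typeDefId;                           // identifies the generic definition
    const ToyType* arg;                      // single generic argument, or nullptr
    const ToyType* base;                     // base class, or nullptr
    std::vector<const ToyType*> interfaces;  // implemented interface instantiations
};

// True if `from` is `to` or derives from it (walks the base-class chain).
bool IsAssignable(const ToyType* from, const ToyType* to) {
    for (const ToyType* t = from; t != nullptr; t = t->base) {
        if (t == to) return true;
    }
    return false;
}

// Slow path: structural covariance check between two interface instantiations.
bool CanCastByVariance(const ToyType* iface, const ToyType* target) {
    // Suggestion 2: reject different generic definitions before any deeper work.
    if (iface->typeDefId != target->typeDefId) return false;
    if (iface->arg == nullptr || target->arg == nullptr) return false;
    return IsAssignable(iface->arg, target->arg);
}

bool CanCastToInterface(const ToyType* source, const ToyType* target) {
    // Suggestion 1: exact-instantiation fast path, a plain pointer comparison.
    for (const ToyType* iface : source->interfaces) {
        if (iface == target) return true;
    }
    // Only fall back to the variance walk when no exact match exists.
    for (const ToyType* iface : source->interfaces) {
        if (CanCastByVariance(iface, target)) return true;
    }
    return false;
}
```

In this model a cast to the exact interface instantiation never reaches the variance walk, which is the case the first suggestion targets.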

@omariom
Contributor

omariom commented Apr 1, 2015

wow. That's a big difference.
@ikopylov you could build CLR yourself and perf test your changes.

@jkotas
Member

jkotas commented Apr 29, 2016

This issue was fixed for CoreRT by caching the casting results (dotnet/corert@ede4733). We may consider implementing a similar cache for CoreCLR as well.
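
The general idea of caching cast results can be sketched as a map keyed on the (source type, target type) pair. This is only an illustration of the technique, not the CoreRT change: `CastCache`, `SlowCheck` and `CanCast` are hypothetical names, types are represented by opaque pointers, and the sketch is neither thread-safe nor bounded in size, both of which a real runtime would need.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical cache of cast results; types are identified by opaque pointers.
class CastCache {
    using Key = std::pair<const void*, const void*>;

    // Combines the hashes of the two type pointers into one bucket index.
    struct KeyHash {
        std::size_t operator()(const Key& k) const {
            std::size_t h1 = std::hash<const void*>{}(k.first);
            std::size_t h2 = std::hash<const void*>{}(k.second);
            return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
        }
    };

public:
    // The expensive check (e.g. a variance walk) that the cache fronts.
    using SlowCheck = std::function<bool(const void*, const void*)>;

    explicit CastCache(SlowCheck slow) : slow_(std::move(slow)) {}

    bool CanCast(const void* source, const void* target) {
        Key key{source, target};
        auto it = results_.find(key);
        if (it != results_.end()) return it->second;  // hit: skip the slow check
        bool result = slow_(source, target);          // miss: compute once
        results_.emplace(key, result);                // remember yes and no answers
        return result;
    }

    std::size_t Size() const { return results_.size(); }

private:
    SlowCheck slow_;
    std::unordered_map<Key, bool, KeyHash> results_;
};
```

Negative results are cached as well, so a repeated failing cast pays for the slow check only once.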

@danielcrabtree

I stumbled into this myself recently and did some investigation. Here are the details I provided on dotnet/coreclr#11094 that relate to this issue.

Results Summary

Using BenchmarkDotNet I've found that covariant and contravariant casting is approximately:

  • 200x slower than regular casting on .NET Framework.
  • 60x slower than regular casting on .NET Core.
  • 33x slower than generic casting on .NET Framework.
  • 17x slower than generic casting on .NET Core.

The code to reproduce these experiments and my investigation is found in this blog post.

Using BenchmarkDotNet I've found that covariant and contravariant casting + method call is approximately:

  • 3.25x slower than dynamic cast + method call on .NET Framework.
  • 2.75x slower than dynamic cast + method call on .NET Core.

The code to reproduce these experiments and my investigation is found in another blog post. Note: these results exclude the first call to dynamic, which is ~1200x slower than the first call using covariant and contravariant casting.

Also, see the comments on my blog posts, where readers have reproduced these results and conducted their own insightful experiments that led to my findings.

Test Environment and Raw .NET Core Results

Please note that my blog posts only include the results from .NET Framework on 64bit RyuJIT. The results above are from .NET Framework on 64bit RyuJIT and .NET Core on 64bit RyuJIT. I have also run tests on .NET Framework on 32bit LegacyJIT and while there are absolute performance differences, the results are on the same order of magnitude as those presented above.

Here are the test environments as reported by BenchmarkDotNet for the results presented on my blog posts:

BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7 CPU 970 3.20GHz, ProcessorCount=12
Frequency=3128907 Hz, Resolution=319.6004 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0
  DefaultJob : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0

BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7 CPU 970 3.20GHz, ProcessorCount=12
Frequency=3128910 Hz, Resolution=319.6001 ns, Timer=TSC
  [Host]     : Clr 4.0.30319.42000, 32bit LegacyJIT-v4.6.1637.0
  DefaultJob : Clr 4.0.30319.42000, 32bit LegacyJIT-v4.6.1637.0

Here are the results for .NET Core that are not on my blog post and the .NET Core test environment (the target framework is set to .NETCoreApp 1.1):

BenchmarkDotNet=v0.10.3.0, OS=Microsoft Windows 10.0.14393
Processor=Intel(R) Core(TM) i7 CPU 970 3.20GHz, ProcessorCount=12
Frequency=3128910 Hz, Resolution=319.6001 ns, Timer=TSC
dotnet cli version=1.0.3
  [Host]     : .NET Core 4.6.25009.03, 64bit RyuJIT
  DefaultJob : .NET Core 4.6.25009.03, 64bit RyuJIT

Casting Results:
==================================================
             Method |       Mean |    StdDev | Scaled | Scaled-StdDev |
------------------- |----------- |---------- |------- |-------------- |
         ObjectCast |  0.4848 ns | 0.0008 ns |   0.42 |          0.00 |
 ImplementationCast |  1.1616 ns | 0.0006 ns |   1.00 |          0.00 |
      InterfaceCast |  2.8750 ns | 0.0049 ns |   2.48 |          0.00 |
        GenericCast |  4.0288 ns | 0.0090 ns |   3.47 |          0.01 |
      CovariantCast | 70.5734 ns | 0.0317 ns |  60.76 |          0.04 |
  ContravariantCast | 71.1789 ns | 0.0116 ns |  61.28 |          0.03 |

Covariant Casting + Method Call Results:
==================================================
 |   Method |       Mean |    StdDev |     Median | Scaled | Scaled-StdDev |
 |--------- |----------- |---------- |----------- |------- |-------------- |
 |   Direct | 18.4310 ns | 0.1172 ns | 18.3872 ns |   1.00 |          0.00 |
 | Implicit | 18.3568 ns | 0.1045 ns | 18.2858 ns |   1.00 |          0.01 |
 | Explicit | 84.4047 ns | 0.2345 ns | 84.4041 ns |   4.58 |          0.03 |
 |  Dynamic | 30.6939 ns | 0.0023 ns | 30.6943 ns |   1.67 |          0.01 |

Contravariant Casting + Method Call Results:
==================================================
 |   Method |       Mean |    StdDev | Scaled | Scaled-StdDev |
 |--------- |----------- |---------- |------- |-------------- |
 |   Direct | 17.7818 ns | 0.0041 ns |   1.00 |          0.00 |
 | Implicit | 17.7725 ns | 0.0047 ns |   1.00 |          0.00 |
 | Explicit | 83.4591 ns | 0.0097 ns |   4.69 |          0.00 |
 |  Dynamic | 28.2057 ns | 0.0670 ns |   1.59 |          0.00 |

grant-d referenced this issue in k2workflow/Clay Aug 31, 2018
* TODO: Casting to covariant interface is up to 200x slower: https://github.com/dotnet/coreclr/issues/603

@EgorBo
Member

EgorBo commented Oct 24, 2018

[image]
^ just proof that CoreRT does a great job here 🙂

@VSadov VSadov self-assigned this Mar 22, 2019
@VSadov
Member

VSadov commented Oct 29, 2019

Fixed in dotnet/coreclr#23548

@VSadov VSadov closed this as completed Oct 29, 2019
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 30, 2020
@msftgits msftgits added this to the Future milestone Jan 30, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 6, 2021