[arm64] These benchmarks are faster on Rosetta-x64 than on native arm64 #60616

Closed
EgorBo opened this issue Oct 19, 2021 · 7 comments
Labels: arch-arm64, area-CodeGen-coreclr, tenet-performance

@EgorBo
Member

EgorBo commented Oct 19, 2021

The following simple benchmarks are faster on Rosetta (x64 emulation) than on native arm64 on the same machine, an Apple M1 Mac mini. I think it's a clear sign that something can be improved on the arm64 side:

using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public enum MyEnum
{
    A,B,C,D,E,F
}

public class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);


    [Benchmark]
    [Arguments(MyEnum.F)]
    public string EnumToString(MyEnum e) => e.ToString();


    [Benchmark]
    [Arguments(10000)]
    public char[] AllocateUninit(int len) => GC.AllocateUninitializedArray<char>(len);
    

    private static readonly int[] _array = Enumerable.Range(1, 1000).ToArray();
    [Benchmark]
    public void ArrayReverse() => Array.Reverse(_array);
}

Rosetta (.NET 6.0 rc2 osx-x64):

|          Method |       Mean |     Error |    StdDev |
|---------------- |-----------:|----------:|----------:|
|    ArrayReverse | 176.320 ns | 2.4289 ns | 2.2720 ns |
|  AllocateUninit | 343.028 ns | 1.0766 ns | 0.8405 ns |
|    EnumToString |  23.064 ns | 0.0278 ns | 0.0247 ns |

Native (.NET 6.0 rc2 osx-arm64):

|          Method |       Mean |     Error |    StdDev |
|---------------- |-----------:|----------:|----------:|
|    ArrayReverse | 201.936 ns | 0.0345 ns | 0.0323 ns |
|  AllocateUninit | 524.582 ns | 1.5993 ns | 1.4960 ns |
|    EnumToString |  44.961 ns | 0.1068 ns | 0.0999 ns |
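
To read the two tables side by side, the native slowdown is just the ratio of the two means. Below is a minimal sketch of that arithmetic; the `RosettaVsNative` class and `Slowdown` helper are hypothetical names used only for illustration and are not part of the benchmark project.

```csharp
using System;

public static class RosettaVsNative
{
    // Ratio of the native arm64 mean to the Rosetta x64 mean.
    // A value above 1 means the native run is slower.
    public static double Slowdown(double rosettaNs, double nativeNs) => nativeNs / rosettaNs;

    public static void Print()
    {
        // Means (in ns) copied from the two tables above.
        var rows = new (string Method, double Rosetta, double Native)[]
        {
            ("ArrayReverse",   176.320, 201.936),
            ("AllocateUninit", 343.028, 524.582),
            ("EnumToString",    23.064,  44.961),
        };

        foreach (var (method, rosetta, native) in rows)
            Console.WriteLine($"{method,-15} {Slowdown(rosetta, native):F2}x");
    }
}
```

With these numbers, native arm64 comes out roughly 1.15x, 1.53x and 1.95x slower for ArrayReverse, AllocateUninit and EnumToString respectively.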

AllocateUninit and EnumToString are most likely GC issues. I'm not sure it's the same as #60166, as that function reports 2MB for LLC size (however, that might be less than the actual size, see https://en.wikipedia.org/wiki/Apple_M1).

PS: I'm sure we'll find more such cases if we run the whole dotnet/performance suite on Rosetta vs native - I already launched a script to do so - it should finish in two days.

/cc @dotnet/jit-contrib @dotnet/gc

EgorBo added the tenet-performance label Oct 19, 2021
dotnet-issue-labeler bot added the untriaged label Oct 19, 2021
@dotnet-issue-labeler

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

EgorBo added the arch-arm64 and area-CodeGen-coreclr labels Oct 19, 2021
@ghost

ghost commented Oct 19, 2021

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

@janvorli
Member

> as that function reports 2MB for LLC size

That is strange, the sysctl that we call reports 4MB L2 cache size on my M1.

@EgorBo
Member Author

EgorBo commented Oct 19, 2021

> > as that function reports 2MB for LLC size
>
> That is strange, the sysctl that we call reports 4MB L2 cache size on my M1.

Yes, I was wrong: the function reports 4MB. I only wanted to note that the wiki says it's 12MB for the performance cores, but I'm pretty sure that's not the reason for these issues.

@janvorli
Member

That's true. It seems that the OS reports the conservative value used by the energy-efficient cores. I have checked, and there doesn't seem to be any other way to get the cache size using sysctl, so I don't know if there is a way to get the performance core cache size.
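
For anyone who wants to check the reported value on their own machine, here is a rough sketch of reading an integer sysctl from C# via P/Invoke. It assumes the key in question is `hw.l2cachesize` and that it is returned as a 64-bit integer; the thread does not name the exact key the runtime queries, and the `MacSysctl` class is a hypothetical helper, not runtime code.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MacSysctl
{
    // Native signature:
    // int sysctlbyname(const char *name, void *oldp, size_t *oldlenp, void *newp, size_t newlen);
    [DllImport("libc", SetLastError = true)]
    private static extern int sysctlbyname(
        string name, out long oldp, ref nuint oldlenp, IntPtr newp, nuint newlen);

    // Returns the value of an integer sysctl, or null when not running on macOS
    // or when the call fails. Assumes the value fits in 64 bits.
    public static long? ReadInt64(string name)
    {
        if (!OperatingSystem.IsMacOS())
            return null;

        nuint len = (nuint)sizeof(long);
        int rc = sysctlbyname(name, out long value, ref len, IntPtr.Zero, (nuint)0);
        return rc == 0 ? value : (long?)null;
    }
}
```

Calling `MacSysctl.ReadInt64("hw.l2cachesize")` returns the size in bytes on macOS and `null` elsewhere, so the result can be compared directly against the 4MB figure mentioned above.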

@EgorBo
Member Author

EgorBo commented Oct 20, 2021

More benchmarks that are slower on native arm64 than under Rosetta on the same M1 CPU:

summary:
better: 977, geomean: 1.576
worse: 161, geomean: 1.329
total diff: 1138

| Slower                                                                           | diff/base | Base Median (ns) | Diff Median (ns) | Modality |
| -------------------------------------------------------------------------------- | ---------:| ----------------:| ----------------:| -------- |
| System.Globalization.Tests.Perf_DateTimeCultureInfo.Parse(culturestring: ja)     |      6.59 |           300.53 |          1979.50 |         |
| System.Buffers.Binary.Tests.BinaryReadAndWriteTests.ReadStructFieldByFieldUsingB |      4.39 |             5.12 |            22.45 |         |
| System.Buffers.Text.Tests.Base64Tests.Base64Encode(NumberOfBytes: 1000)          |      4.19 |           147.05 |           616.13 |         |
| System.Buffers.Text.Tests.Base64Tests.Base64EncodeDestinationTooSmall(NumberOfBy |      4.02 |           153.28 |           615.93 |         |
| System.Diagnostics.Perf_Process.Start                                            |      3.54 |       4618187.50 |      16356041.00 |         |
| SIMD.ConsoleMandel.VectorDoubleSinglethreadADT                                   |      3.38 |     502304541.00 |    1696300250.00 |         |
| System.Diagnostics.Perf_Process.StartAndWaitForExit                              |      2.29 |       8650450.89 |      19782744.79 |         |
| Microsoft.Extensions.DependencyInjection.GetServiceIEnumerable.Transient         |      2.21 |           118.41 |           261.96 |         |
| PerfLabTests.StackWalk.Walk                                                      |      2.20 |      22078975.71 |      48627036.50 | several?|
| SIMD.ConsoleMandel.VectorFloatSinglethreadADT                                    |      2.11 |     268342625.00 |     565315479.00 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.ProducerConsumer(RentalSiz |      2.08 |           508.53 |          1056.67 |         |
| System.Collections.Concurrent.AddRemoveFromDifferentThreads<String>.ConcurrentQu |      1.74 |      15371937.00 |      26794646.00 |         |
| System.Collections.Concurrent.Count<String>.Queue_EnqueueCountDequeue(Size: 512) |      1.73 |            11.20 |            19.36 |         |
| Benchstone.BenchI.BenchE.Test                                                    |      1.69 |     281416000.00 |     474800917.00 |         |
| System.Buffers.Binary.Tests.BinaryReadAndWriteTests.ReadStructFieldByFieldUsingB |      1.67 |            18.58 |            31.00 |         |
| System.Collections.Concurrent.AddRemoveFromDifferentThreads<Int32>.ConcurrentQue |      1.65 |      16991040.50 |      27988708.00 | several?|
| System.Collections.CopyTo<String>.Memory(Size: 2048)                             |      1.62 |           235.29 |           382.19 | bimodal |
| System.Collections.CtorFromCollection<Int32>.SortedDictionaryDeepCopy(Size: 512) |      1.61 |         13650.02 |         22026.81 |         |
| System.Collections.IndexerSetReverse<Int32>.Span(Size: 512)                      |      1.55 |           175.11 |           271.54 |         |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.53 |           140.54 |           215.43 |         |
| System.Collections.IterateFor<String>.Array(Size: 512)                           |      1.52 |           185.00 |           282.10 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.ProducerConsumer(RentalSiz |      1.52 |          3671.85 |          5587.57 | bimodal |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleParallel(RentalSize: |      1.49 |           200.91 |           298.71 |         |
| System.Collections.CreateAddAndClear<Int32>.ConcurrentQueue(Size: 512)           |      1.48 |          4817.63 |          7138.38 |         |
| System.Collections.Concurrent.Count<Int32>.Queue_EnqueueCountDequeue(Size: 512)  |      1.46 |            12.86 |            18.73 |         |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.45 |           188.51 |           274.01 |         |
| Benchstone.BenchI.TreeInsert.Test                                                |      1.45 |         35253.85 |         51211.78 | bimodal |
| System.Collections.IndexerSet<String>.Array(Size: 512)                           |      1.45 |           197.82 |           287.16 |         |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.45 |           152.18 |           220.35 |         |
| System.Collections.CtorFromCollectionNonGeneric<String>.Stack(Size: 512)         |      1.45 |          8914.65 |         12904.38 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleParallel(RentalSize: 4 |      1.45 |           242.43 |           350.50 | several?|
| System.Collections.Tests.Add_Remove_SteadyState<String>.ConcurrentBag(Count: 512 |      1.44 |            32.24 |            46.47 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.MultipleSerial(RentalSize: 4 |      1.44 |         13129.79 |         18910.53 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleSerial(RentalSize: 409 |      1.42 |          2209.51 |          3148.04 |         |
| System.Collections.IterateForEach<String>.Span(Size: 512)                        |      1.42 |           178.65 |           253.68 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.MultipleSerial(RentalSize: 4 |      1.42 |         14778.31 |         20935.38 | several?|
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.41 |           152.73 |           215.85 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.ProducerConsumer(RentalSize: |      1.40 |          1807.35 |          2534.79 |         |
| System.Collections.IndexerSetReverse<Int32>.Array(Size: 512)                     |      1.39 |           228.02 |           316.23 |         |
| System.Collections.CreateAddAndClear<String>.ConcurrentQueue(Size: 512)          |      1.39 |          5496.37 |          7612.95 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleSerial(RentalSize: 4 |      1.38 |          2154.65 |          2982.23 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleSerial(RentalSize: 409 |      1.38 |          2249.49 |          3101.94 |         |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.38 |            99.72 |           137.45 |         |
| System.Collections.CtorFromCollection<String>.SortedDictionaryDeepCopy(Size: 512 |      1.37 |         17337.07 |         23681.98 |         |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.36 |           157.37 |           214.46 |         |
| System.Collections.IndexerSet<Int32>.Span(Size: 512)                             |      1.36 |           169.49 |           230.78 |         |
| System.Collections.IterateFor<String>.ImmutableArray(Size: 512)                  |      1.35 |           183.15 |           248.16 |         |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.35 |           158.62 |           214.84 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleSerial(RentalSize: 4 |      1.35 |          2201.76 |          2980.77 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.MultipleSerial(RentalSize: 4 |      1.35 |         14277.59 |         19286.85 |         |
| System.Collections.ContainsFalse<Int32>.Stack(Size: 512)                         |      1.35 |         57545.30 |         77616.76 |         |
| Microsoft.Extensions.DependencyInjection.GetServiceIEnumerable.Scoped            |      1.34 |            54.29 |            72.90 |         |
| System.Collections.CreateAddAndClear<String>.ConcurrentBag(Size: 512)            |      1.33 |         14646.01 |         19498.61 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleSerial(RentalSize: 4 |      1.33 |          2811.41 |          3737.58 |         |
| System.Collections.IterateForEach<Int32>.Array(Size: 512)                        |      1.32 |           217.42 |           287.64 |         |
| System.Collections.IndexerSetReverse<String>.Array(Size: 512)                    |      1.32 |           239.60 |           316.32 |         |
| System.Collections.IterateFor<Int32>.ImmutableArray(Size: 512)                   |      1.32 |           218.61 |           287.73 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleSerial(RentalSize: 4 |      1.31 |          2853.88 |          3746.01 |         |
| LinqBenchmarks.Count00ForX                                                       |      1.31 |      76997218.75 |     100802875.00 |         |
| System.Collections.CtorDefaultSize<Int32>.ConcurrentQueue                        |      1.30 |            55.46 |            72.37 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.MultipleSerial(RentalSize: |      1.30 |         19155.43 |         24939.33 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleParallel(RentalSize: |      1.30 |          1687.52 |          2196.01 | several?|
| System.Collections.IndexerSet<Int32>.ConcurrentDictionary(Size: 512)             |      1.29 |         13695.98 |         17627.36 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleParallel(RentalSize: 4 |      1.28 |          3768.25 |          4831.34 |         |
| LinqBenchmarks.Where00ForX                                                       |      1.27 |     160637896.00 |     204757291.00 | bimodal |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.27 |           108.45 |           137.97 |         |
| System.Buffers.Text.Tests.Base64Tests.ConvertToBase64CharArray(NumberOfBytes: 10 |      1.27 |           670.82 |           850.24 |         |
| System.Collections.IterateForEach<Int32>.ConcurrentDictionary(Size: 512)         |      1.26 |         17375.08 |         21972.85 |         |
| System.Collections.IndexerSet<Int32>.Array(Size: 512)                            |      1.26 |           227.47 |           286.77 |         |
| System.Collections.IterateFor<Int32>.Array(Size: 512)                            |      1.26 |           228.48 |           287.70 |         |
| System.Collections.IterateForEach<Int32>.List(Size: 512)                         |      1.26 |           395.38 |           497.33 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleSerial(RentalSize: 409 |      1.25 |            21.28 |            26.61 |         |
| System.Collections.IndexerSetReverse<String>.Span(Size: 512)                     |      1.25 |           218.39 |           272.68 |         |
| System.Collections.IterateFor<Int32>.List(Size: 512)                             |      1.24 |           249.85 |           310.69 |         |
| System.Perf_Convert.ToBase64CharArray(binaryDataSize: 1024, formattingOptions: N |      1.24 |           685.19 |           851.70 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.MultipleSerial(RentalSize: 4 |      1.24 |         16681.64 |         20647.50 | bimodal |
| System.Collections.ContainsFalse<Int32>.Span(Size: 512)                          |      1.24 |         48085.68 |         59495.81 |         |
| System.Collections.IterateFor<String>.List(Size: 512)                            |      1.24 |           251.74 |           311.37 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleParallel(RentalSize: |      1.23 |          3727.43 |          4601.15 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.MultipleSerial(RentalSize: |      1.23 |         20755.86 |         25545.18 | bimodal |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleSerial(RentalSize: 409 |      1.23 |          1649.56 |          2028.00 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleParallel(RentalSize: 4 |      1.23 |          3378.27 |          4151.76 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.ProducerConsumer(RentalSize: |      1.23 |           640.00 |           785.53 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.MultipleSerial(RentalSize: |      1.22 |         21494.91 |         26263.16 | several?|
| System.Perf_Convert.ChangeType                                                   |      1.22 |            33.36 |            40.72 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.ProducerConsumer(RentalSize: |      1.22 |          3370.25 |          4114.12 | several?|
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.22 |            90.82 |           110.87 |         |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-U |      1.22 |           481.29 |           586.84 |         |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.22 |           176.54 |           214.94 |         |
| System.Collections.Tests.Perf_BitArray.BitArrayCopyToByteArray(Size: 512)        |      1.22 |            96.42 |           117.37 |         |
| LinqBenchmarks.Where01LinqMethodX                                                |      1.22 |     269209541.50 |     327139792.00 |         |
| System.Collections.ContainsTrue<Int32>.Span(Size: 512)                           |      1.21 |         26465.11 |         32038.82 |         |
| System.Collections.IterateForEach<Int32>.ImmutableArray(Size: 512)               |      1.21 |           238.67 |           287.79 |         |
| System.Collections.IterateForEach<String>.ConcurrentDictionary(Size: 512)        |      1.20 |         21502.15 |         25897.47 |         |
| System.Collections.ContainsFalse<Int32>.Array(Size: 512)                         |      1.20 |         50629.39 |         60934.57 |         |
| System.Collections.ContainsTrue<Int32>.List(Size: 512)                           |      1.20 |         26984.51 |         32435.26 |         |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, O |      1.20 |           489.85 |           588.07 |         |
| System.Collections.ContainsTrue<Int32>.Queue(Size: 512)                          |      1.20 |         27818.07 |         33328.78 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleParallel(RentalSize: 4 |      1.20 |          3284.96 |          3929.36 |         |
| System.Collections.ContainsFalse<Int32>.Queue(Size: 512)                         |      1.20 |         52506.74 |         62749.16 |         |
| LinqBenchmarks.Where01LinqQueryX                                                 |      1.19 |     276607167.00 |     327933209.00 |         |
| System.Collections.Tests.Add_Remove_SteadyState<Int32>.ConcurrentBag(Count: 512) |      1.18 |            27.97 |            33.08 |         |
| System.Collections.IterateForEach<String>.Array(Size: 512)                       |      1.18 |           238.55 |           282.12 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleParallel(RentalSize: |      1.18 |          4203.38 |          4965.83 |         |
| System.Collections.TryAddDefaultSize<String>.ConcurrentDictionary(Count: 512)    |      1.18 |         68944.04 |         81074.48 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.MultipleSerial(RentalSize: |      1.17 |         20647.41 |         24235.39 |         |
| LinqBenchmarks.Where01ForX                                                       |      1.17 |     150912500.00 |     176946125.00 |         |
| Microsoft.Extensions.DependencyInjection.GetService.Scoped                       |      1.17 |            62.89 |            73.71 |         |
| System.Perf_Convert.ToBase64CharArray(binaryDataSize: 1024, formattingOptions: I |      1.17 |           785.19 |           916.57 |         |
| System.Collections.IndexerSet<String>.Span(Size: 512)                            |      1.17 |           177.80 |           207.45 |         |
| System.Perf_Convert.ToBase64String(formattingOptions: None)                      |      1.17 |           858.77 |          1001.15 |         |
| Benchstone.BenchI.IniArray.Test                                                  |      1.17 |      56395416.75 |      65710104.13 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Byte>.SingleSerial(RentalSize: 409 |      1.16 |          1725.39 |          2006.73 | several?|
| System.Collections.ContainsFalse<Int32>.ImmutableArray(Size: 512)                |      1.16 |         53329.54 |         61844.41 |         |
| System.Diagnostics.Perf_Activity.EnumerateActivityTagsLarge                      |      1.16 |         10164.48 |         11755.17 |         |
| System.Collections.ContainsTrue<Int32>.ICollection(Size: 512)                    |      1.16 |         28277.25 |         32699.46 |         |
| System.Collections.ContainsTrue<Int32>.Array(Size: 512)                          |      1.16 |         28549.70 |         33003.68 |         |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-U |      1.15 |            63.51 |            73.30 |         |
| System.ComponentModel.Tests.Perf_TypeDescriptorTests.GetConverter(typeToConvert: |      1.15 |           119.20 |           137.33 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleParallel(RentalSize: |      1.15 |          4683.98 |          5383.40 |         |
| System.Collections.ContainsFalse<Int32>.List(Size: 512)                          |      1.15 |         52484.24 |         60291.75 |         |
| Microsoft.Extensions.DependencyInjection.ActivatorUtilitiesBenchmark.CreateInsta |      1.15 |           959.78 |          1102.49 |         |
| BenchmarksGame.Mandelbrot_2.Bench(width: 4000, checksum: "C7-E6-66-43-66-73-F8-A |      1.15 |    1030941791.00 |    1183280875.00 |         |
| System.Collections.CopyTo<String>.Span(Size: 2048)                               |      1.15 |           228.38 |           261.66 |         |
| System.Collections.CtorDefaultSize<String>.ConcurrentQueue                       |      1.14 |            81.76 |            92.95 |         |
| Span.Sorting.BubbleSortArray(Size: 512)                                          |      1.13 |        142937.46 |        162082.96 |         |
| Benchstone.BenchI.MulMatrix.Test                                                 |      1.13 |     376508416.00 |     426447791.00 |         |
| System.Collections.CreateAddAndClear<Int32>.Span(Size: 512)                      |      1.13 |           470.70 |           531.43 |         |
| System.Collections.ContainsTrue<Int32>.ImmutableArray(Size: 512)                 |      1.13 |         30055.76 |         33930.74 |         |
| PerfLabTests.LowLevelPerf.GenericClassWithSTringGenericInstanceMethod            |      1.13 |        139251.75 |        156748.26 |         |
| System.Collections.IterateForEach<String>.ReadOnlySpan(Size: 512)                |      1.13 |           178.82 |           201.28 |         |
| System.Collections.Concurrent.AddRemoveFromSameThreads<String>.ConcurrentQueue(S |      1.13 |     281975688.00 |     317363292.00 |         |
| System.Perf_Convert.FromBase64String                                             |      1.12 |            59.17 |            66.42 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.SingleParallel(RentalSize: |      1.12 |          5267.31 |          5875.60 |         |
| System.Collections.Tests.Perf_BitArray.BitArrayByteArrayCtor(Size: 512)          |      1.11 |           146.58 |           163.24 |         |
| Benchstone.BenchI.XposMatrix.Test                                                |      1.11 |         14195.12 |         15781.00 |         |
| System.Collections.Tests.Perf_BitArray.BitArraySetLengthShrink(Size: 512)        |      1.11 |           149.01 |           165.24 |         |
| Benchstone.BenchF.Trap.Test                                                      |      1.11 |     112110354.00 |     124062250.00 |         |
| Benchstone.BenchI.HeapSort.Test                                                  |      1.11 |        250767.83 |        277148.94 |         |
| System.Buffers.Tests.RentReturnArrayPoolTests<Object>.MultipleSerial(RentalSize: |      1.10 |           201.13 |           222.03 |         |
| System.Collections.CopyTo<String>.Array(Size: 2048)                              |      1.09 |           214.33 |           234.65 |         |
| SciMark2.kernel.benchSparseMult                                                  |      1.09 |     526069541.00 |     574615250.00 |         |
| System.Collections.Sort<IntStruct>.List(Size: 512)                               |      1.09 |          3167.00 |          3456.72 |         |
| System.Buffers.Text.Tests.Utf8ParserTests.TryParseDouble(value: 1.79769313486231 |      1.09 |           258.72 |           282.36 |         |
| System.Collections.IterateForEachNonGeneric<Int32>.ArrayList(Size: 512)          |      1.09 |          3914.18 |          4267.01 |         |
| System.Collections.Sort<BigStruct>.Array(Size: 512)                              |      1.08 |          5399.93 |          5858.71 |         |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (, No |      1.08 |           293.79 |           317.80 |         |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (, None,  |      1.08 |           289.13 |           312.52 |         |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, I |      1.08 |           288.75 |           312.07 |         |
| System.Globalization.Tests.StringSearch.IndexOf_Word_NotFound(Options: (en-US, N |      1.08 |           289.39 |           312.32 |         |
| System.Collections.CtorFromCollection<Int32>.Dictionary(Size: 512)               |      1.08 |          1450.84 |          1563.63 |         |
| System.Collections.Sort<IntStruct>.Array(Size: 512)                              |      1.08 |          3214.22 |          3459.75 |         |
| System.Collections.Tests.Perf_BitArray.BitArrayRightShift(Size: 512)             |      1.07 |           171.46 |           184.11 |         |
| Benchstone.BenchI.CSieve.Test                                                    |      1.07 |       3953432.95 |       4231179.67 |         |
| System.Collections.CreateAddAndClear<Int32>.ConcurrentBag(Size: 512)             |      1.07 |          9635.24 |         10311.60 |         |
| System.Buffers.Text.Tests.Utf8ParserTests.TryParseDouble(value: -1.7976931348623 |      1.07 |           264.20 |           282.46 |         |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-U |      1.07 |           292.89 |           312.33 |         |
| System.Collections.CtorFromCollection<Int32>.ConcurrentBag(Size: 512)            |      1.06 |          6626.17 |          7055.03 |         |
| System.Collections.IterateFor<String>.ReadOnlySpan(Size: 512)                    |      1.06 |           185.96 |           197.66 |         |
| System.Collections.IterateFor<String>.Span(Size: 512)                            |      1.06 |           189.60 |           201.04 |         |
| System.Globalization.Tests.StringSearch.LastIndexOf_Word_NotFound(Options: (en-U |      1.06 |           294.75 |           312.04 |         |
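
For reference, the `diff/base` column is the diff median divided by the base median, and the `geomean` figures in the summary are geometric means of such ratios. The sketch below shows that arithmetic under the assumption that "base" is the Rosetta run and "diff" is the native arm64 run, which the table itself does not state; `RatioSummary` is a hypothetical helper, not the tool that produced the table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RatioSummary
{
    // diff/base for one benchmark: how many times slower the "diff" run
    // is compared with the "base" run.
    public static double Ratio(double baseMedianNs, double diffMedianNs) =>
        diffMedianNs / baseMedianNs;

    // Geometric mean, computed in log space to avoid overflow
    // when multiplying many ratios together.
    public static double GeoMean(IEnumerable<double> ratios) =>
        Math.Exp(ratios.Select(r => Math.Log(r)).Average());
}
```

For example, `RatioSummary.Ratio(300.53, 1979.50)` gives about 6.59, matching the first row. The medians in the table are rounded, so recomputed ratios for other rows can differ from the printed value in the last digit.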

JulieLeeMSFT removed the untriaged label Nov 3, 2021
JulieLeeMSFT added this to the Future milestone Nov 3, 2021
@EgorBo
Member Author

EgorBo commented Oct 22, 2022

Fixed with #64576

@EgorBo EgorBo closed this as completed Oct 22, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Nov 22, 2022