[ARM64] Performance regression: Utf8Encoding #41699

Closed
Tracked by #93172 ...
adamsitnik opened this issue Sep 1, 2020 · 37 comments · Fixed by #42052
Labels: arch-arm64, area-System.Text.Encoding, Priority:1 (Work that is critical for the release, but we could probably ship without), tenet-performance (Performance related issue)

@adamsitnik
Member

adamsitnik commented Sep 1, 2020

After running the 3.1 vs 5.0 benchmarks on the "Ubuntu arm64 Qualcomm Machines" owned by the JIT Team, I've found a few regressions related to Utf8Encoding. They are all reproducible, and I've verified that loop alignment is not the cause (by running them with --envVars COMPlus_JitAlignLoops:1).

It looks like an ARM64-specific regression; I was not able to reproduce it on ARM (the 32-bit variant).

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f netcoreapp3.1 netcoreapp5.0 --architecture arm64 --filter Perf_Utf8Encoding

BenchmarkDotNet=v0.12.1.1405-nightly, OS=ubuntu 16.04
Unknown processor
[Host] : .NET Core 3.1.8 (CoreCLR 4.700.20.41105, CoreFX 4.700.20.41903), Arm64 RyuJIT
Job-VTSQOV : .NET Core 3.1.8 (CoreCLR 4.700.20.41105, CoreFX 4.700.20.41903), Arm64 RyuJIT
Job-RAMSQZ : .NET Core 5.0.0 (CoreCLR 5.0.20.41714, CoreFX 5.0.20.41714), Arm64 RyuJIT

Method Runtime Input Mean Ratio Allocated
GetByteCount .NET Core 3.1 EnglishAllAscii 38.00 us 1.00 -
GetByteCount .NET Core 5.0 EnglishAllAscii 40.66 us 1.07 -
GetBytes .NET Core 3.1 EnglishAllAscii 101.09 us 1.00 163840 B
GetBytes .NET Core 5.0 EnglishAllAscii 104.96 us 1.04 163855 B
GetString .NET Core 3.1 EnglishAllAscii 103.47 us 1.00 327648 B
GetString .NET Core 5.0 EnglishAllAscii 95.76 us 0.93 327677 B
GetByteCount .NET Core 3.1 EnglishMostlyAscii 117.50 us 1.00 -
GetByteCount .NET Core 5.0 EnglishMostlyAscii 221.40 us 1.88 -
GetBytes .NET Core 3.1 EnglishMostlyAscii 273.49 us 1.00 169880 B
GetBytes .NET Core 5.0 EnglishMostlyAscii 377.67 us 1.38 169895 B
GetString .NET Core 3.1 EnglishMostlyAscii 262.55 us 1.00 327656 B
GetString .NET Core 5.0 EnglishMostlyAscii 250.18 us 0.95 327685 B
GetByteCount .NET Core 3.1 Chinese 53.34 us 1.00 -
GetByteCount .NET Core 5.0 Chinese 90.21 us 1.69 -
GetBytes .NET Core 3.1 Chinese 245.94 us 1.00 177752 B
GetBytes .NET Core 5.0 Chinese 279.62 us 1.14 177768 B
GetString .NET Core 3.1 Chinese 373.80 us 1.00 150112 B
GetString .NET Core 5.0 Chinese 358.11 us 0.96 150126 B
GetByteCount .NET Core 3.1 Cyrillic 45.35 us 1.00 -
GetByteCount .NET Core 5.0 Cyrillic 76.01 us 1.68 -
GetBytes .NET Core 3.1 Cyrillic 193.34 us 1.00 100880 B
GetBytes .NET Core 5.0 Cyrillic 222.10 us 1.15 100889 B
GetString .NET Core 3.1 Cyrillic 262.69 us 1.00 130856 B
GetString .NET Core 5.0 Cyrillic 259.83 us 0.99 130868 B
GetByteCount .NET Core 3.1 Greek 58.36 us 1.00 -
GetByteCount .NET Core 5.0 Greek 97.41 us 1.67 -
GetBytes .NET Core 3.1 Greek 275.88 us 1.00 129248 B
GetBytes .NET Core 5.0 Greek 314.00 us 1.14 129260 B
GetString .NET Core 3.1 Greek 394.55 us 1.00 164264 B
GetString .NET Core 5.0 Greek 394.35 us 1.00 164278 B

Docs

Profiling workflow for dotnet/runtime repository
Benchmarking workflow for dotnet/runtime repository

cc @kunalspathak @carlossanlop @pgovind @tannergooding

@adamsitnik adamsitnik added this to the 5.0.0 milestone Sep 1, 2020
@ghost

ghost commented Sep 1, 2020

Tagging subscribers to this area: @tarekgh, @krwq
See info in area-owners.md if you want to be subscribed.

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Sep 1, 2020
@tarekgh tarekgh added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-System.Text.Encoding labels Sep 1, 2020
@tarekgh
Member

tarekgh commented Sep 1, 2020

I have changed the tag to JIT for now, as this is most likely not the UTF8Encoding code itself. If proven otherwise, please re-tag it with the encoding label.

@kunalspathak
Member

@pgovind
Contributor

pgovind commented Sep 1, 2020

Just adding a note here: Despite what the name suggests, the 4 regressions listed here are likely from methods in UTF8Utility and/or ASCIIUtility. The GetString benchmark doesn't seem to show any improvements, but it's not straightforward to reverse the changes that this benchmark hits because the UTF8Utility and ASCIIUtility methods are highly coupled and they do show decent speedup in the other benchmarks.

@JulieLeeMSFT
Member

@echesakovMSFT please look into this.

CC @AndyAyersMS

@JulieLeeMSFT JulieLeeMSFT removed the untriaged New issue has not been triaged by the area owner label Sep 1, 2020
@echesakov
Contributor

For the following simple repro

robox@DDARM64S-003:~/echesako/Runtime_41699$ cat Program.cs
using System.IO;
using System.Runtime.CompilerServices;
using System.Text;

namespace Runtime_41699
{
    public class Program
    {
        public static void Main()
        {
            string unicode;
            byte[] bytes;
            UTF8Encoding utf8Encoding;

            unicode = File.ReadAllText("/home/robox/echesako/Runtime_41699/EnglishMostlyAscii.txt");
            utf8Encoding = new UTF8Encoding();
            bytes = utf8Encoding.GetBytes(unicode);

            while (true)
            {
                 Consume(utf8Encoding.GetByteCount(unicode));
            }
        }

//      public int GetByteCount() => _utf8Encoding.GetByteCount(_unicode);
//      public byte[] GetBytes() => _utf8Encoding.GetBytes(_unicode);
//      public string GetString() => _utf8Encoding.GetString(_bytes);

        [MethodImpl(MethodImplOptions.NoInlining)]
        private static void Consume<T>(in T _) { }
    }
}

I am seeing about 4 times as many cache-misses on net5.0 and about twice as many stalled-cycles-backend. The following counter statistics were collected for the loop only.

robox@DDARM64S-003:~/echesako/Runtime_41699$ cat netcoreapp3.1-Runtime_41699.txt
# started on Fri Sep  4 11:37:51 2020


 Performance counter stats for process id '35473':

           351,895      branch-misses
         5,009,414      cache-misses
    77,972,050,461      cpu-cycles
   157,542,185,470      instructions              #    2.02  insn per cycle
                                                  #    0.03  stalled cycles per insn
        66,244,057      stalled-cycles-frontend   #    0.08% frontend cycles idle
     4,396,095,861      stalled-cycles-backend    #    5.64% backend cycles idle

      30.004692431 seconds time elapsed

robox@DDARM64S-003:~/echesako/Runtime_41699$ cat net5.0-Runtime_41699.txt
# started on Fri Sep  4 11:39:04 2020


 Performance counter stats for process id '35507':

           270,958      branch-misses
        21,878,633      cache-misses
    77,971,943,239      cpu-cycles
    98,498,800,981      instructions              #    1.26  insn per cycle
                                                  #    0.10  stalled cycles per insn
        58,217,904      stalled-cycles-frontend   #    0.07% frontend cycles idle
     9,625,981,589      stalled-cycles-backend    #   12.35% backend cycles idle

      30.005090846 seconds time elapsed
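
To make the "4 times" and "twice" figures easy to check, here is a minimal sketch that recomputes the ratios from the two captures above. The class and member names are hypothetical; only the counter values come from the perf output.

```csharp
using System;

public static class PerfCounterRatios
{
    // Counter values copied from the netcoreapp3.1 and net5.0 captures above.
    public const long CacheMisses31 = 5_009_414;
    public const long CacheMisses50 = 21_878_633;
    public const long BackendStalls31 = 4_396_095_861;
    public const long BackendStalls50 = 9_625_981_589;
    public const long Cycles31 = 77_972_050_461;
    public const long Cycles50 = 77_971_943_239;
    public const long Instructions31 = 157_542_185_470;
    public const long Instructions50 = 98_498_800_981;

    // Generic "how many times larger" helper; also used for insn per cycle.
    public static double Ratio(long numerator, long denominator)
        => (double)numerator / denominator;

    public static void PrintSummary()
    {
        Console.WriteLine($"cache-misses 5.0/3.1:   {Ratio(CacheMisses50, CacheMisses31):F2}x");
        Console.WriteLine($"backend stalls 5.0/3.1: {Ratio(BackendStalls50, BackendStalls31):F2}x");
        Console.WriteLine($"insn per cycle 3.1:     {Ratio(Instructions31, Cycles31):F2}");
        Console.WriteLine($"insn per cycle 5.0:     {Ratio(Instructions50, Cycles50):F2}");
    }
}
```

Calling PerfCounterRatios.PrintSummary() prints the four ratios; the insn per cycle values should agree with the annotations perf already printed (2.02 and 1.26).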

@echesakov
Contributor

echesakov commented Sep 4, 2020

Combining PopCount with GetNonAsciiBytes in Utf16Utility.GetPointerToFirstInvalidChar

diff --git a/src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf16Utility.Validation.cs b/src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf16Utility.Validation.cs
index f2df0ccdf53..73a1ea29bec 100644
--- a/src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf16Utility.Validation.cs
+++ b/src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf16Utility.Validation.cs
@@ -146,16 +146,18 @@ static Utf16Utility()

                         Vector128<ushort> charIsThreeByteUtf8Encoded;
                         uint mask;
+                        uint popcnt;

                         if (AdvSimd.IsSupported)
                         {
                             charIsThreeByteUtf8Encoded = AdvSimd.Subtract(vectorZero, AdvSimd.ShiftRightLogical(utf16Data, 11));
-                            mask = GetNonAsciiBytes(AdvSimd.Or(charIsNonAscii, charIsThreeByteUtf8Encoded).AsByte(), bitMask128);
+                            popcnt = GetNonAsciiBytesAndPopCount(AdvSimd.Or(charIsNonAscii, charIsThreeByteUtf8Encoded).AsByte(), bitMask128);
                         }
                         else
                         {
                             charIsThreeByteUtf8Encoded = Sse2.Subtract(vectorZero, Sse2.ShiftRightLogical(utf16Data, 11));
                             mask = (uint)Sse2.MoveMask(Sse2.Or(charIsNonAscii, charIsThreeByteUtf8Encoded).AsByte());
+                            popcnt = (uint)BitOperations.PopCount(mask);
                         }

                         // Each even bit of mask will be 1 only if the char was >= 0x0080,
@@ -182,8 +184,6 @@ static Utf16Utility()
                         // unpaired surrogates in our data. (Unpaired surrogates would invalidate
                         // our computed result and we'd have to throw it away.)

-                        uint popcnt = (uint)BitOperations.PopCount(mask);
-
                         // Surrogates need to be special-cased for two reasons: (a) we need
                         // to account for the fact that we over-counted in the addition above;
                         // and (b) they require separate validation.
@@ -485,6 +485,22 @@ static Utf16Utility()
             return pInputBuffer;
         }

+        [MethodImpl(MethodImplOptions.AggressiveInlining)]
+        private static uint GetNonAsciiBytesAndPopCount(Vector128<byte> value, Vector128<byte> bitMask128)
+        {
+            Debug.Assert(AdvSimd.Arm64.IsSupported);
+
+            Vector128<byte> mostSignificantBitIsSet = AdvSimd.ShiftRightArithmetic(value.AsSByte(), 7).AsByte();
+            Vector128<byte> extractedBits = AdvSimd.And(mostSignificantBitIsSet, bitMask128);
+
+            // self-pairwise add until all flags have moved to the first two bytes of the vector
+            extractedBits = AdvSimd.Arm64.AddPairwise(extractedBits, extractedBits);
+            extractedBits = AdvSimd.Arm64.AddPairwise(extractedBits, extractedBits);
+            extractedBits = AdvSimd.Arm64.AddPairwise(extractedBits, extractedBits);
+            Vector128<byte> popcnt = AdvSimd.PopCount(extractedBits);
+            return AdvSimd.Arm64.AddPairwise(popcnt, popcnt).ToScalar();
+        }
+
         [MethodImpl(MethodImplOptions.AggressiveInlining)]
         private static uint GetNonAsciiBytes(Vector128<byte> value, Vector128<byte> bitMask128)
         {

seems to help a little bit with stalled-cycles-backend

robox@DDARM64S-003:~/echesako/Runtime_41699$ perf stat -e "branch-misses,cache-misses,cpu-cycles,instructions,stalled-cycles-frontend,stalled-cycles-backend" -p 35949 sleep 30

 Performance counter stats for process id '35949':

           287,604      branch-misses
        24,786,695      cache-misses
    77,971,487,139      cpu-cycles
    95,056,853,183      instructions              #    1.22  insn per cycle
                                                  #    0.07  stalled cycles per insn
        58,614,087      stalled-cycles-frontend   #    0.08% frontend cycles idle
     6,650,628,910      stalled-cycles-backend    #    8.53% backend cycles idle

      30.005114026 seconds time elapsed

This avoids moving the mask back and forth between the SIMD and general-purpose register files.
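
For anyone who wants to convince themselves that the pairwise-add sequence in the diff yields the same count as PopCount(mask), below is a scalar sketch that emulates the 16 byte lanes with plain arrays. It does not use the AdvSimd intrinsics, so it runs on any machine. The class and method names are hypothetical, and it assumes bitMask128 holds 0x01, 0x02, ..., 0x80 in each 8-byte half, which the diff does not show.

```csharp
using System;
using System.Numerics;

public static class PairwisePopCountSketch
{
    // Scalar stand-in for AdvSimd.Arm64.AddPairwise on 16 byte lanes:
    // the low 8 result lanes are pair sums of `left`, the high 8 of `right`.
    private static byte[] AddPairwise(byte[] left, byte[] right)
    {
        var result = new byte[16];
        for (int i = 0; i < 8; i++)
        {
            result[i] = unchecked((byte)(left[2 * i] + left[2 * i + 1]));
            result[8 + i] = unchecked((byte)(right[2 * i] + right[2 * i + 1]));
        }
        return result;
    }

    // Mirrors the shape of GetNonAsciiBytesAndPopCount from the diff.
    public static uint PopCountViaPairwiseAdds(byte[] value)
    {
        var extractedBits = new byte[16];
        for (int i = 0; i < 16; i++)
        {
            // Shifting right arithmetically by 7 gives 0xFF or 0x00; ANDing with
            // the assumed bit mask keeps one distinct bit per lane in each half.
            bool msbSet = (value[i] & 0x80) != 0;
            extractedBits[i] = msbSet ? (byte)(1 << (i % 8)) : (byte)0;
        }

        // Three self-pairwise adds: lane 0 collects bytes 0..7, lane 1 bytes 8..15.
        for (int round = 0; round < 3; round++)
            extractedBits = AddPairwise(extractedBits, extractedBits);

        // Per-lane population count, then one more pairwise add; lane 0 is the total.
        var popcnt = new byte[16];
        for (int i = 0; i < 16; i++)
            popcnt[i] = (byte)BitOperations.PopCount((uint)extractedBits[i]);
        return AddPairwise(popcnt, popcnt)[0];
    }

    // MoveMask-style path: build a 16-bit mask in a scalar, then PopCount it.
    public static uint PopCountViaMask(byte[] value)
    {
        uint mask = 0;
        for (int i = 0; i < 16; i++)
            if ((value[i] & 0x80) != 0) mask |= 1u << i;
        return (uint)BitOperations.PopCount(mask);
    }
}
```

Both methods should agree for any 16-byte input. The difference on real hardware is only where the intermediate value lives: the first shape stays in vector registers until the final ToScalar, while the second materializes the mask in a general-purpose register first.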

@echesakov
Contributor

The measurements below were taken on

processor       : 0
model name      : ARMv8 Processor rev 1 (v8l)
BogoMIPS        : 38.40
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd07
CPU revision    : 1

processor       : 1
model name      : ARMv8 Processor rev 1 (v8l)
BogoMIPS        : 38.40
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd07
CPU revision    : 1

processor       : 2
model name      : ARMv8 Processor rev 1 (v8l)
BogoMIPS        : 38.40
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd07
CPU revision    : 1

processor       : 3
model name      : ARMv8 Processor rev 1 (v8l)
BogoMIPS        : 38.40
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd07
CPU revision    : 1

.NET Core 3.1.6

BenchmarkDotNet=v0.12.1.1405-nightly, OS=ubuntu 18.04
ARMv8 Processor rev 1 (v8l), 4 logical cores
.NET Core SDK=6.0.100-alpha.1.20454.4
  [Host]     : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), Arm64 RyuJIT
  Job-VJGWPE : .NET Core 3.1.6 (CoreCLR 4.700.20.26901, CoreFX 4.700.20.31603), Arm64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Runtime=.NET Core 3.1  Arguments=/p:DebugType=portable
Toolchain=netcoreapp3.1  IterationTime=250.0000 ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Input Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
GetByteCount EnglishAllAscii 118.9 μs 0.24 μs 0.21 μs 118.9 μs 118.6 μs 119.4 μs - - - -
GetBytes EnglishAllAscii 268.4 μs 2.14 μs 1.78 μs 267.6 μs 266.4 μs 272.4 μs 49.7881 49.7881 49.7881 163840 B
GetString EnglishAllAscii 322.4 μs 1.78 μs 1.49 μs 321.7 μs 321.5 μs 326.5 μs 99.4898 99.4898 99.4898 327648 B
GetByteCount EnglishMostlyAscii 328.3 μs 0.43 μs 0.39 μs 328.1 μs 328.0 μs 329.3 μs - - - -
GetBytes EnglishMostlyAscii 680.3 μs 0.80 μs 0.62 μs 680.3 μs 679.3 μs 681.5 μs 51.6304 51.6304 51.6304 169880 B
GetString EnglishMostlyAscii 591.1 μs 2.17 μs 1.92 μs 590.2 μs 588.7 μs 594.6 μs 99.5370 99.5370 99.5370 327656 B
GetByteCount Chinese 149.9 μs 0.29 μs 0.26 μs 149.9 μs 149.7 μs 150.5 μs - - - -
GetBytes Chinese 646.8 μs 1.21 μs 0.95 μs 647.0 μs 645.1 μs 647.9 μs 55.0000 55.0000 55.0000 177752 B
GetString Chinese 943.7 μs 2.97 μs 2.63 μs 943.3 μs 940.9 μs 950.1 μs 44.1176 44.1176 44.1176 150112 B
GetByteCount Cyrillic 130.3 μs 0.21 μs 0.19 μs 130.4 μs 130.0 μs 130.7 μs - - - -
GetBytes Cyrillic 487.0 μs 1.30 μs 1.08 μs 486.8 μs 485.2 μs 489.2 μs 29.2969 29.2969 29.2969 100880 B
GetString Cyrillic 648.8 μs 1.74 μs 1.45 μs 649.5 μs 646.6 μs 650.9 μs 39.0625 39.0625 39.0625 130856 B
GetByteCount Greek 163.7 μs 0.08 μs 0.07 μs 163.7 μs 163.6 μs 163.8 μs - - - -
GetBytes Greek 723.1 μs 3.83 μs 3.58 μs 721.8 μs 718.7 μs 728.8 μs 39.7727 39.7727 39.7727 129248 B
GetString Greek 968.3 μs 8.42 μs 7.47 μs 965.3 μs 960.8 μs 980.7 μs 47.7941 47.7941 47.7941 164264 B

.NET Core 5.0.0

BenchmarkDotNet=v0.12.1.1405-nightly, OS=ubuntu 18.04
ARMv8 Processor rev 1 (v8l), 4 logical cores
.NET Core SDK=6.0.100-alpha.1.20454.4
  [Host]     : .NET Core 5.0.0 (CoreCLR 5.0.20.41714, CoreFX 5.0.20.41714), Arm64 RyuJIT
  Job-OUMKUS : .NET Core 5.0.0 (CoreCLR 5.0.20.41714, CoreFX 5.0.20.41714), Arm64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Runtime=.NET Core 5.0  Arguments=/p:DebugType=portable
Toolchain=netcoreapp5.0  IterationTime=250.0000 ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Input Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
GetByteCount EnglishAllAscii 109.6 μs 0.51 μs 0.47 μs 109.3 μs 109.3 μs 110.6 μs - - - -
GetBytes EnglishAllAscii 261.1 μs 2.71 μs 2.53 μs 259.9 μs 258.7 μs 266.3 μs 48.9583 48.9583 48.9583 163854 B
GetString EnglishAllAscii 205.6 μs 0.58 μs 0.46 μs 205.6 μs 204.8 μs 206.3 μs 99.5066 99.5066 99.5066 327677 B
GetByteCount EnglishMostlyAscii 565.9 μs 0.93 μs 0.78 μs 565.9 μs 564.2 μs 567.2 μs - - - 1 B
GetBytes EnglishMostlyAscii 912.2 μs 1.65 μs 1.29 μs 912.0 μs 910.3 μs 914.8 μs 52.0833 52.0833 52.0833 169896 B
GetString EnglishMostlyAscii 574.2 μs 1.69 μs 1.32 μs 573.6 μs 572.5 μs 576.9 μs 99.5370 99.5370 99.5370 327685 B
GetByteCount Chinese 258.3 μs 0.83 μs 0.77 μs 257.9 μs 257.5 μs 259.8 μs - - - -
GetBytes Chinese 749.3 μs 3.55 μs 3.14 μs 747.9 μs 746.7 μs 756.1 μs 53.5714 53.5714 53.5714 177768 B
GetString Chinese 896.2 μs 9.65 μs 9.03 μs 891.6 μs 889.5 μs 914.0 μs 45.1389 45.1389 45.1389 150126 B
GetByteCount Cyrillic 223.8 μs 0.46 μs 0.39 μs 223.6 μs 223.4 μs 224.8 μs - - - -
GetBytes Cyrillic 592.1 μs 4.66 μs 4.13 μs 590.1 μs 588.8 μs 602.4 μs 30.0926 30.0926 30.0926 100889 B
GetString Cyrillic 630.9 μs 1.32 μs 1.17 μs 630.6 μs 629.6 μs 633.7 μs 37.5000 37.5000 37.5000 130868 B
GetByteCount Greek 281.7 μs 0.35 μs 0.31 μs 281.8 μs 281.2 μs 282.3 μs - - - -
GetBytes Greek 844.0 μs 2.50 μs 2.09 μs 843.5 μs 841.8 μs 848.9 μs 39.4737 39.4737 39.4737 129260 B
GetString Greek 951.6 μs 10.59 μs 9.91 μs 949.3 μs 941.3 μs 971.0 μs 47.7941 47.7941 47.7941 164279 B

.NET Core 5.0.0 (with the suggested change to Utf16Utility.GetPointerToFirstInvalidChar)

BenchmarkDotNet=v0.12.1.1405-nightly, OS=ubuntu 18.04
ARMv8 Processor rev 1 (v8l), 4 logical cores
.NET Core SDK=6.0.100-alpha.1.20454.4
  [Host]     : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 5.0.20.41714), Arm64 RyuJIT
  Job-XEOYJB : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 5.0.20.41714), Arm64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Runtime=.NET Core 5.0  Arguments=/p:DebugType=portable
Toolchain=netcoreapp5.0  IterationTime=250.0000 ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Input Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
GetByteCount EnglishAllAscii 109.2 μs 0.46 μs 0.41 μs 109.0 μs 108.8 μs 110.0 μs - - - -
GetBytes EnglishAllAscii 260.3 μs 2.08 μs 1.74 μs 259.7 μs 259.0 μs 265.2 μs 48.9583 48.9583 48.9583 163854 B
GetString EnglishAllAscii 207.3 μs 1.23 μs 1.03 μs 206.9 μs 206.3 μs 209.7 μs 99.5066 99.5066 99.5066 327677 B
GetByteCount EnglishMostlyAscii 414.5 μs 0.49 μs 0.41 μs 414.3 μs 414.3 μs 415.6 μs - - - -
GetBytes EnglishMostlyAscii 753.2 μs 3.51 μs 2.93 μs 752.4 μs 750.2 μs 759.9 μs 50.5952 50.5952 50.5952 169895 B
GetString EnglishMostlyAscii 574.0 μs 8.49 μs 7.94 μs 570.8 μs 566.9 μs 590.0 μs 98.2143 98.2143 98.2143 327685 B
GetByteCount Chinese 189.1 μs 0.16 μs 0.14 μs 189.1 μs 189.0 μs 189.5 μs - - - -
GetBytes Chinese 675.3 μs 1.07 μs 0.89 μs 675.1 μs 673.9 μs 677.0 μs 54.3478 54.3478 54.3478 177768 B
GetString Chinese 895.7 μs 6.66 μs 6.23 μs 892.0 μs 889.4 μs 904.7 μs 45.1389 45.1389 45.1389 150126 B
GetByteCount Cyrillic 164.1 μs 0.11 μs 0.09 μs 164.1 μs 164.0 μs 164.3 μs - - - -
GetBytes Cyrillic 527.7 μs 2.71 μs 2.40 μs 527.0 μs 525.4 μs 533.3 μs 29.1667 29.1667 29.1667 100889 B
GetString Cyrillic 624.7 μs 2.56 μs 2.00 μs 624.6 μs 622.2 μs 630.4 μs 38.4615 38.4615 38.4615 130868 B
GetByteCount Greek 206.9 μs 0.44 μs 0.39 μs 206.7 μs 206.6 μs 208.0 μs - - - -
GetBytes Greek 764.2 μs 3.49 μs 3.27 μs 762.9 μs 760.4 μs 769.6 μs 38.6905 38.6905 38.6905 129260 B
GetString Greek 962.2 μs 4.29 μs 3.59 μs 961.0 μs 958.4 μs 970.9 μs 46.8750 46.8750 46.8750 164279 B

It's clear from the GetByteCount benchmark data that the stalled cycles due to PopCount are only one of potentially many causes of the performance regression here. We need a thorough analysis to discover them all.
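
As a rough way to quantify "one of potentially many causes", the sketch below computes what fraction of the 3.1 to 5.0 slowdown the suggested change removes. The helper name is hypothetical; the inputs are the GetByteCount means from the three tables above.

```csharp
using System;

public static class RegressionShare
{
    // Fraction of the baseline -> regressed slowdown that a patched build removes.
    // 0.0 means nothing was recovered, 1.0 means the patched build matches the baseline.
    public static double RecoveredFraction(double baselineUs, double regressedUs, double patchedUs)
        => (regressedUs - patchedUs) / (regressedUs - baselineUs);

    public static void PrintGetByteCountShares()
    {
        // Means in microseconds: 3.1.6, 5.0.0, 5.0.0 with the suggested change.
        Console.WriteLine($"EnglishMostlyAscii: {RecoveredFraction(328.3, 565.9, 414.5):P0}");
        Console.WriteLine($"Chinese:            {RecoveredFraction(149.9, 258.3, 189.1):P0}");
        Console.WriteLine($"Cyrillic:           {RecoveredFraction(130.3, 223.8, 164.1):P0}");
        Console.WriteLine($"Greek:              {RecoveredFraction(163.7, 281.7, 206.9):P0}");
    }
}
```

For these four inputs the recovered fraction comes out at roughly two thirds, which is consistent with the change helping while leaving a sizable part of the regression unexplained.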

I am moving this to .NET 6.0. I don't believe this is a JIT issue, so I am relabeling this back to area-System.Text.Encoding.

cc @JulieLeeMSFT @jeffhandley

@echesakov echesakov added area-System.Text.Encoding and removed area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Sep 4, 2020
@ghost

ghost commented Sep 4, 2020

Tagging subscribers to this area: @tarekgh, @krwq
See info in area-owners.md if you want to be subscribed.

@echesakov echesakov modified the milestones: 5.0.0, 6.0.0 Sep 4, 2020
@echesakov echesakov removed their assignment Sep 4, 2020
@JulieLeeMSFT
Member

@jeffhandley I am assigning this to you now.

@tarekgh
Member

tarekgh commented Sep 4, 2020

Thanks @echesakovMSFT for your analysis.

CC @GrabYourPitchforks

@echesakov
Contributor

One more observation: if I remove the code under if ((AdvSimd.Arm64.IsSupported && BitConverter.IsLittleEndian) || Sse2.IsSupported) in Utf16Utility.GetPointerToFirstInvalidChar, keep the code under else if (Vector.IsHardwareAccelerated), and change that condition to if (Vector.IsHardwareAccelerated) (which I presume is true on Arm64), I get the following results

(AdvSimd.Arm64.IsSupported && BitConverter.IsLittleEndian)

BenchmarkDotNet=v0.12.1.1405-nightly, OS=ubuntu 18.04
ARMv8 Processor rev 1 (v8l), 4 logical cores
.NET Core SDK=6.0.100-alpha.1.20454.4
  [Host]     : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 5.0.20.41714), Arm64 RyuJIT
  Job-IXQCIC : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 5.0.20.41714), Arm64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Runtime=.NET Core 5.0  Arguments=/p:DebugType=portable
Toolchain=netcoreapp5.0  IterationTime=250.0000 ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Input Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
GetByteCount EnglishAllAscii 109.7 μs 0.65 μs 0.60 μs 109.5 μs 109.1 μs 111.0 μs - - - -
GetBytes EnglishAllAscii 263.1 μs 4.69 μs 4.39 μs 260.7 μs 259.0 μs 273.0 μs 48.9583 48.9583 48.9583 163854 B
GetString EnglishAllAscii 206.7 μs 0.43 μs 0.36 μs 206.6 μs 206.2 μs 207.5 μs 99.5066 99.5066 99.5066 327677 B
GetByteCount EnglishMostlyAscii 566.1 μs 2.25 μs 2.11 μs 564.5 μs 563.8 μs 568.9 μs - - - 1 B
GetBytes EnglishMostlyAscii 909.5 μs 2.34 μs 1.96 μs 909.3 μs 907.1 μs 914.0 μs 52.0833 52.0833 52.0833 169896 B
GetString EnglishMostlyAscii 573.2 μs 1.73 μs 1.35 μs 572.8 μs 572.1 μs 577.0 μs 99.5370 99.5370 99.5370 327685 B
GetByteCount Chinese 257.7 μs 0.37 μs 0.33 μs 257.7 μs 257.0 μs 258.1 μs - - - -
GetBytes Chinese 749.4 μs 3.56 μs 2.97 μs 748.3 μs 746.2 μs 756.4 μs 53.5714 53.5714 53.5714 177768 B
GetString Chinese 899.9 μs 7.84 μs 7.33 μs 895.3 μs 893.2 μs 913.5 μs 45.1389 45.1389 45.1389 150126 B
GetByteCount Cyrillic 223.9 μs 0.20 μs 0.18 μs 223.9 μs 223.5 μs 224.0 μs - - - -
GetBytes Cyrillic 593.4 μs 3.35 μs 2.97 μs 592.4 μs 590.1 μs 598.7 μs 30.0926 30.0926 30.0926 100889 B
GetString Cyrillic 630.6 μs 1.14 μs 0.95 μs 630.8 μs 628.6 μs 631.7 μs 37.5000 37.5000 37.5000 130868 B
GetByteCount Greek 280.7 μs 0.95 μs 0.84 μs 280.3 μs 279.6 μs 282.1 μs - - - -
GetBytes Greek 844.0 μs 2.82 μs 2.35 μs 843.8 μs 840.8 μs 849.3 μs 39.4737 39.4737 39.4737 129260 B
GetString Greek 963.8 μs 1.75 μs 1.46 μs 964.2 μs 960.2 μs 965.7 μs 47.7941 47.7941 47.7941 164279 B

(Vector.IsHardwareAccelerated)

BenchmarkDotNet=v0.12.1.1405-nightly, OS=ubuntu 18.04
ARMv8 Processor rev 1 (v8l), 4 logical cores
.NET Core SDK=6.0.100-alpha.1.20454.4
  [Host]     : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 5.0.20.41714), Arm64 RyuJIT
  Job-ECUDLG : .NET Core 5.0.0 (CoreCLR 42.42.42.42424, CoreFX 5.0.20.41714), Arm64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Runtime=.NET Core 5.0  Arguments=/p:DebugType=portable
Toolchain=netcoreapp5.0  IterationTime=250.0000 ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Input Mean Error StdDev Median Min Max Gen 0 Gen 1 Gen 2 Allocated
GetByteCount EnglishAllAscii 109.2 μs 0.19 μs 0.16 μs 109.2 μs 109.0 μs 109.6 μs - - - -
GetBytes EnglishAllAscii 263.9 μs 0.79 μs 0.66 μs 263.6 μs 263.3 μs 265.2 μs 49.5690 49.5690 49.5690 163855 B
GetString EnglishAllAscii 206.2 μs 1.16 μs 0.96 μs 205.9 μs 205.0 μs 208.5 μs 99.3151 99.3151 99.3151 327677 B
GetByteCount EnglishMostlyAscii 274.0 μs 0.23 μs 0.18 μs 274.0 μs 273.5 μs 274.2 μs - - - -
GetBytes EnglishMostlyAscii 612.1 μs 2.85 μs 2.38 μs 611.3 μs 610.3 μs 618.0 μs 50.4808 50.4808 50.4808 169895 B
GetString EnglishMostlyAscii 573.9 μs 4.85 μs 4.53 μs 572.0 μs 569.4 μs 583.5 μs 98.2143 98.2143 98.2143 327685 B
GetByteCount Chinese 125.3 μs 0.26 μs 0.23 μs 125.3 μs 124.9 μs 125.7 μs - - - -
GetBytes Chinese 615.4 μs 2.08 μs 1.84 μs 615.8 μs 610.7 μs 617.8 μs 55.2885 55.2885 55.2885 177769 B
GetString Chinese 900.0 μs 6.04 μs 5.65 μs 896.8 μs 894.4 μs 912.4 μs 45.1389 45.1389 45.1389 150126 B
GetByteCount Cyrillic 108.8 μs 0.07 μs 0.07 μs 108.7 μs 108.7 μs 108.9 μs - - - -
GetBytes Cyrillic 466.2 μs 0.97 μs 0.76 μs 466.3 μs 464.7 μs 467.1 μs 29.4118 29.4118 29.4118 100889 B
GetString Cyrillic 623.3 μs 1.09 μs 0.91 μs 623.3 μs 621.5 μs 624.8 μs 37.5000 37.5000 37.5000 130868 B
GetByteCount Greek 136.8 μs 0.15 μs 0.14 μs 136.7 μs 136.6 μs 137.1 μs - - - -
GetBytes Greek 679.6 μs 2.22 μs 1.86 μs 680.1 μs 675.1 μs 682.0 μs 38.0435 38.0435 38.0435 129260 B
GetString Greek 971.7 μs 6.79 μs 6.35 μs 968.5 μs 965.5 μs 984.9 μs 47.7941 47.7941 47.7941 164279 B
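
To make the two configurations compared above concrete, here is a minimal sketch of the dispatch order being discussed. It only reports which branch would be taken on the current machine and does not contain any of the real validation code. The class name, method name, and the preferPortableVector switch are hypothetical; the IsSupported and IsHardwareAccelerated checks are the ones quoted in the comment.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

public static class ValidationPathProbe
{
    // preferPortableVector = true models the experiment above, where the
    // intrinsics branch is removed and Vector<T> becomes the first choice.
    public static string SelectPath(bool preferPortableVector)
    {
        if (!preferPortableVector &&
            ((AdvSimd.Arm64.IsSupported && BitConverter.IsLittleEndian) || Sse2.IsSupported))
        {
            return "intrinsics";   // AdvSimd on Arm64, Sse2 on x86/x64
        }

        if (Vector.IsHardwareAccelerated)
        {
            return "Vector<T>";    // portable vectorized fallback
        }

        return "scalar";           // no vector acceleration available
    }
}
```

On the Arm64 machine used here, SelectPath(false) would be expected to report the intrinsics branch and SelectPath(true) the Vector<T> branch, which corresponds to the first and second tables respectively.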

@kunalspathak
Member

One more observation: if I remove the code under if ((AdvSimd.Arm64.IsSupported && BitConverter.IsLittleEndian) || Sse2.IsSupported) in Utf16Utility.GetPointerToFirstInvalidChar, keep the code under else if (Vector.IsHardwareAccelerated), and change that condition to if (Vector.IsHardwareAccelerated) (which I presume is true on Arm64), I get the following results

If I understand correctly, you are saying that when you revert the changes in GetPointerToFirstInvalidChar() so that the code falls back to the Vector.IsHardwareAccelerated path (which is what happened before we optimized the method with ARM64 intrinsics), we see the improvements?

@kunalspathak
Member

From an offline conversation with @pgovind and @carlossanlop, I recall that the benchmarks touched the methods improved in #39506, #39508, #39050, #39041 and #38653 (correct me if I am wrong). While you are there, can you make a similar change in the other places these PRs touched (especially the methods that mimic the SSE logic and could be improved by a different algorithm for ARM64) to see whether the vectorized implementation was fast enough?

@kunalspathak
Member

There are regressions in GetBytes() as well, which also calls Utf8Utility.TranscodeToUtf8, but I assume we don't want to revert that because the benchmark also calls Utf16Utility.GetPointerToFirstInvalidChar, which we are reverting anyway?

@GrabYourPitchforks
Member

GrabYourPitchforks commented Sep 10, 2020

I opened dotnet/performance#1512 to track changing the benchmarks so that each benchmark is testing exactly one worker function. But the best evidence we have right now suggests that GetPointerToFirstInvalidChar is the bulk of the regression, so that's where the efforts / reversions are currently being focused.
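
For context on why GetByteCount depends so heavily on this one worker, the sketch below is a scalar reference for the kind of result the validation step produces: the index of the first invalid char plus an adjustment that turns a UTF-16 char count into a UTF-8 byte count. It is an illustration only, not the CoreLib implementation; the class name, method names, and signatures are hypothetical, and the surrogate handling follows the comments visible in the diff earlier in this thread.

```csharp
using System;
using System.Text;

public static class Utf16ValidationSketch
{
    // Returns the index of the first invalid char (input.Length if none).
    // utf8Adjustment is how many extra UTF-8 bytes the valid prefix needs
    // beyond one byte per UTF-16 char.
    public static int IndexOfFirstInvalidChar(ReadOnlySpan<char> input, out long utf8Adjustment)
    {
        utf8Adjustment = 0;
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (c < 0x80)
            {
                i++;                     // ASCII: 1 byte
            }
            else if (c < 0x800)
            {
                utf8Adjustment += 1;     // 2 bytes
                i++;
            }
            else if (!char.IsSurrogate(c))
            {
                utf8Adjustment += 2;     // 3 bytes
                i++;
            }
            else if (char.IsHighSurrogate(c) && i + 1 < input.Length && char.IsLowSurrogate(input[i + 1]))
            {
                utf8Adjustment += 2;     // a valid pair: 2 chars become 4 bytes
                i += 2;
            }
            else
            {
                break;                   // unpaired surrogate: stop here
            }
        }
        return i;
    }

    public static long GetUtf8ByteCount(string text)
    {
        int validLength = IndexOfFirstInvalidChar(text.AsSpan(), out long adjustment);
        if (validLength != text.Length)
            throw new ArgumentException("Input contains an unpaired surrogate.", nameof(text));
        return validLength + adjustment;
    }
}
```

For well-formed input, GetUtf8ByteCount should match Encoding.UTF8.GetByteCount. The vectorized code paths discussed in this issue compute the same adjustment many chars at a time, which is where the mask and PopCount steps come in.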

@jeffhandley
Member

Re-opening this issue, as #42052 was meant to be a temporary workaround and the underlying issue is still open.

@jeffhandley jeffhandley reopened this Feb 27, 2022
@akoeplinger
Member

@jeffhandley should this still be in the 5.0.0 milestone or be moved to 6.0/7.0?

@jeffhandley jeffhandley modified the milestones: 5.0.0, 7.0.0 Mar 17, 2022
@jeffhandley jeffhandley modified the milestones: 7.0.0, Future Aug 9, 2022
@JulieLeeMSFT
Member

Assigning to @TIHan.

@JulieLeeMSFT JulieLeeMSFT added the Priority:1 Work that is critical for the release, but we could probably ship without label Mar 20, 2023
@JulieLeeMSFT JulieLeeMSFT modified the milestones: Future, 8.0.0 Mar 20, 2023
@TIHan
Member

TIHan commented Mar 23, 2023

I have some data comparing .NET 5, 7, 8:

BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22000.1574/21H2)
Snapdragon 7c 2.40 GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=8.0.100-preview.2.23157.25
[Host] : .NET 7.0.4 (7.0.423.11508), Arm64 RyuJIT AdvSIMD
Job-YXVEBW : .NET 5.0.17 (5.0.1722.21314), Arm64 RyuJIT AdvSIMD
Job-HYZVQW : .NET 7.0.4 (7.0.423.11508), Arm64 RyuJIT AdvSIMD
Job-MMQVIX : .NET 8.0.0 (8.0.23.12803), Arm64 RyuJIT AdvSIMD

PowerPlanMode=00000000-0000-0000-0000-000000000000 IterationTime=250.0000 ms MaxIterationCount=20
MinIterationCount=15 WarmupCount=1

Method Runtime Input Mean Error StdDev Median Min Max Ratio RatioSD Gen0 Gen1 Gen2 Allocated Alloc Ratio
GetByteCount .NET 5.0 EnglishAllAscii 36.34 us 0.592 us 0.554 us 36.09 us 35.78 us 37.33 us 1.00 0.00 - - - - NA
GetByteCount .NET 7.0 EnglishAllAscii 12.90 us 0.248 us 0.243 us 12.84 us 12.66 us 13.53 us 0.36 0.01 - - - - NA
GetByteCount .NET 8.0 EnglishAllAscii 12.13 us 0.380 us 0.373 us 12.02 us 11.73 us 13.21 us 0.33 0.01 - - - - NA
GetBytes .NET 5.0 EnglishAllAscii 98.71 us 1.625 us 1.440 us 98.52 us 97.00 us 101.97 us 1.00 0.00 52.4691 52.4691 52.4691 167576 B 1.00
GetBytes .NET 7.0 EnglishAllAscii 82.24 us 1.497 us 1.401 us 82.08 us 79.84 us 84.30 us 0.84 0.02 52.4611 52.4611 52.4611 167594 B 1.00
GetBytes .NET 8.0 EnglishAllAscii 78.99 us 1.570 us 1.468 us 78.70 us 76.72 us 81.75 us 0.80 0.02 52.3897 52.3897 52.3897 167594 B 1.00
GetString .NET 5.0 EnglishAllAscii 89.11 us 1.685 us 1.803 us 89.50 us 86.48 us 91.65 us 1.00 0.00 99.9313 99.9313 99.9313 335120 B 1.00
GetString .NET 7.0 EnglishAllAscii 91.48 us 1.823 us 1.951 us 91.26 us 88.83 us 96.25 us 1.03 0.03 99.6429 99.6429 99.6429 335154 B 1.00
GetString .NET 8.0 EnglishAllAscii 88.26 us 1.677 us 1.722 us 88.63 us 85.71 us 90.54 us 0.99 0.02 99.7268 99.7268 99.7268 335154 B 1.00
GetByteCount .NET 5.0 EnglishMostlyAscii 89.67 us 1.470 us 1.375 us 89.46 us 87.98 us 92.34 us 1.00 0.00 - - - - NA
GetByteCount .NET 7.0 EnglishMostlyAscii 71.91 us 0.320 us 0.299 us 71.75 us 71.60 us 72.40 us 0.80 0.01 - - - - NA
GetByteCount .NET 8.0 EnglishMostlyAscii 68.23 us 0.530 us 0.469 us 68.00 us 67.77 us 69.29 us 0.76 0.01 - - - - NA
GetBytes .NET 5.0 EnglishMostlyAscii 221.88 us 1.747 us 1.634 us 221.87 us 219.81 us 225.12 us 1.00 0.00 52.0833 52.0833 52.0833 173616 B 1.00
GetBytes .NET 7.0 EnglishMostlyAscii 207.47 us 1.624 us 1.519 us 207.27 us 205.11 us 210.26 us 0.94 0.01 52.5000 52.5000 52.5000 173634 B 1.00
GetBytes .NET 8.0 EnglishMostlyAscii 206.32 us 4.211 us 4.850 us 204.08 us 201.10 us 216.76 us 0.94 0.02 51.9481 51.9481 51.9481 173634 B 1.00
GetString .NET 5.0 EnglishMostlyAscii 265.13 us 18.047 us 20.783 us 255.43 us 242.75 us 314.89 us 1.00 0.00 87.8906 87.8906 87.8906 335126 B 1.00
GetString .NET 7.0 EnglishMostlyAscii 289.28 us 16.755 us 19.295 us 280.65 us 267.11 us 333.52 us 1.10 0.09 80.0000 80.0000 80.0000 335152 B 1.00
GetString .NET 8.0 EnglishMostlyAscii 408.62 us 117.412 us 135.211 us 405.00 us 264.96 us 668.61 us 1.55 0.52 80.5288 80.5288 80.5288 335152 B 1.00
GetByteCount .NET 5.0 Chinese 40.90 us 0.258 us 0.215 us 40.83 us 40.60 us 41.33 us 1.00 0.00 - - - - NA
GetByteCount .NET 7.0 Chinese 33.29 us 0.187 us 0.166 us 33.29 us 33.09 us 33.64 us 0.81 0.01 - - - - NA
GetByteCount .NET 8.0 Chinese 31.50 us 0.222 us 0.197 us 31.40 us 31.33 us 31.91 us 0.77 0.01 - - - - NA
GetBytes .NET 5.0 Chinese 228.18 us 1.467 us 1.225 us 228.20 us 226.42 us 229.93 us 1.00 0.00 55.2536 55.2536 55.2536 180680 B 1.00
GetBytes .NET 7.0 Chinese 242.76 us 1.883 us 1.761 us 242.37 us 240.12 us 246.59 us 1.06 0.01 54.8077 54.8077 54.8077 180699 B 1.00
GetBytes .NET 8.0 Chinese 235.66 us 2.260 us 1.887 us 235.10 us 234.21 us 240.67 us 1.03 0.01 55.0373 55.0373 55.0373 180699 B 1.00
GetString .NET 5.0 Chinese 366.05 us 2.145 us 1.901 us 365.51 us 363.91 us 370.45 us 1.00 0.00 46.5116 46.5116 46.5116 155960 B 1.00
GetString .NET 7.0 Chinese 384.52 us 1.771 us 1.570 us 383.93 us 382.36 us 387.29 us 1.05 0.01 47.2561 47.2561 47.2561 155977 B 1.00
GetString .NET 8.0 Chinese 394.77 us 3.329 us 3.114 us 394.23 us 391.20 us 401.45 us 1.08 0.01 46.8750 46.8750 46.8750 155976 B 1.00
GetByteCount .NET 5.0 Cyrillic 29.29 us 0.322 us 0.285 us 29.15 us 29.08 us 29.97 us 1.00 0.00 - - - - NA
GetByteCount .NET 7.0 Cyrillic 27.10 us 0.191 us 0.169 us 27.06 us 26.88 us 27.44 us 0.93 0.01 - - - - NA
GetByteCount .NET 8.0 Cyrillic 23.12 us 0.127 us 0.113 us 23.09 us 22.98 us 23.30 us 0.79 0.01 - - - - NA
GetBytes .NET 5.0 Cyrillic 176.68 us 1.383 us 1.226 us 176.72 us 174.90 us 178.73 us 1.00 0.00 31.9767 31.9767 31.9767 102272 B 1.00
GetBytes .NET 7.0 Cyrillic 173.70 us 1.335 us 1.115 us 173.50 us 172.31 us 176.47 us 0.98 0.01 31.9444 31.9444 31.9444 102283 B 1.00
GetBytes .NET 8.0 Cyrillic 169.95 us 1.239 us 1.098 us 169.83 us 167.53 us 172.34 us 0.96 0.01 31.9293 31.9293 31.9293 102283 B 1.00
GetString .NET 5.0 Cyrillic 264.45 us 1.351 us 1.128 us 264.30 us 262.80 us 266.54 us 1.00 0.00 41.3136 41.3136 41.3136 133640 B 1.00
GetString .NET 7.0 Cyrillic 271.45 us 1.796 us 1.680 us 271.21 us 268.76 us 274.51 us 1.03 0.01 40.9483 40.9483 40.9483 133654 B 1.00
GetString .NET 8.0 Cyrillic 268.52 us 1.359 us 1.135 us 268.45 us 266.14 us 270.14 us 1.02 0.01 40.9483 40.9483 40.9483 133654 B 1.00
GetByteCount .NET 5.0 Greek 44.41 us 0.245 us 0.204 us 44.34 us 44.22 us 44.97 us 1.00 0.00 - - - - NA
GetByteCount .NET 7.0 Greek 36.18 us 0.175 us 0.137 us 36.14 us 36.01 us 36.49 us 0.81 0.01 - - - - NA
GetByteCount .NET 8.0 Greek 34.11 us 0.081 us 0.072 us 34.11 us 34.01 us 34.25 us 0.77 0.00 - - - - NA
GetBytes .NET 5.0 Greek 272.78 us 1.707 us 1.513 us 272.91 us 270.64 us 275.45 us 1.00 0.00 40.9483 40.9483 40.9483 131792 B 1.00
GetBytes .NET 7.0 Greek 268.80 us 2.116 us 1.979 us 268.40 us 265.05 us 271.83 us 0.99 0.01 41.3136 41.3136 41.3136 131807 B 1.00
GetBytes .NET 8.0 Greek 264.43 us 1.487 us 1.319 us 264.35 us 261.96 us 266.61 us 0.97 0.01 40.9483 40.9483 40.9483 131806 B 1.00
GetString .NET 5.0 Greek 422.14 us 4.472 us 4.183 us 421.95 us 416.77 us 429.72 us 1.00 0.00 52.3649 52.3649 52.3649 169352 B 1.00
GetString .NET 7.0 Greek 427.54 us 3.342 us 2.963 us 426.73 us 424.04 us 433.39 us 1.01 0.01 52.3649 52.3649 52.3649 169370 B 1.00
GetString .NET 8.0 Greek 423.67 us 2.330 us 2.179 us 423.65 us 420.46 us 428.13 us 1.00 0.01 50.9868 50.9868 50.9868 169370 B 1.00

It looks like GetByteCount has improved but some of the GetString cases have regressed.

@kunalspathak
Member

The original issue reported a regression relative to .NET Core 3.1 (@adamsitnik, do you remember if that is accurate?). If so, we might need to compare against .NET Core 3.1. At that time we only had Linux arm64, though, so you will have to test it on a Linux arm64 box.

@adamsitnik
Member Author

"@adamsitnik do you remember if that is accurate?"

I don't remember the details, but looking at my old description of the issue, I am sure that you are right: it was a 3.1 vs 5.0 regression found on Ubuntu machines (the ones owned by the JIT Team, as back then I had no access to any other ARM machines).

@TIHan
Member

TIHan commented Mar 27, 2023

This was on a Linux ARM64 box.

.NET 3.1 results:

| Method | Input | Mean | Error | StdDev | Median | Min | Max | Gen0 | Gen1 | Gen2 | Allocated |
|---|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| GetByteCount | EnglishAllAscii | 34.04 us | 0.067 us | 0.062 us | 34.04 us | 33.94 us | 34.16 us | - | - | - | - |
| GetBytes | EnglishAllAscii | 82.25 us | 0.189 us | 0.168 us | 82.25 us | 82.07 us | 82.65 us | 49.7382 | 49.7382 | 49.7382 | 163840 B |
| GetString | EnglishAllAscii | 71.68 us | 0.313 us | 0.261 us | 71.72 us | 71.35 us | 72.08 us | 99.7159 | 99.7159 | 99.7159 | 327648 B |
| GetByteCount | EnglishMostlyAscii | 93.30 us | 0.840 us | 0.785 us | 93.60 us | 91.36 us | 93.65 us | - | - | - | - |
| GetBytes | EnglishMostlyAscii | 237.55 us | 0.298 us | 0.279 us | 237.53 us | 237.14 us | 237.99 us | 51.7857 | 51.7857 | 51.7857 | 169880 B |
| GetString | EnglishMostlyAscii | 217.55 us | 0.410 us | 0.384 us | 217.46 us | 216.93 us | 218.19 us | 99.8264 | 99.8264 | 99.8264 | 327656 B |
| GetByteCount | Chinese | 41.65 us | 0.364 us | 0.340 us | 41.42 us | 41.41 us | 42.13 us | - | - | - | - |
| GetBytes | Chinese | 199.19 us | 0.366 us | 0.306 us | 199.07 us | 198.84 us | 199.82 us | 55.3797 | 55.3797 | 55.3797 | 177752 B |
| GetString | Chinese | 325.68 us | 0.185 us | 0.173 us | 325.67 us | 325.31 us | 326.04 us | 46.8750 | 46.8750 | 46.8750 | 150112 B |
| GetByteCount | Cyrillic | 36.38 us | 0.005 us | 0.004 us | 36.38 us | 36.37 us | 36.39 us | - | - | - | - |
| GetBytes | Cyrillic | 163.33 us | 0.059 us | 0.046 us | 163.33 us | 163.25 us | 163.43 us | 30.6122 | 30.6122 | 30.6122 | 100880 B |
| GetString | Cyrillic | 225.84 us | 0.401 us | 0.313 us | 225.91 us | 225.26 us | 226.32 us | 39.8551 | 39.8551 | 39.8551 | 130856 B |
| GetByteCount | Greek | 46.46 us | 0.005 us | 0.005 us | 46.46 us | 46.46 us | 46.47 us | - | - | - | - |
| GetBytes | Greek | 240.81 us | 0.787 us | 0.736 us | 240.88 us | 239.89 us | 242.25 us | 39.4231 | 39.4231 | 39.4231 | 129248 B |
| GetString | Greek | 346.43 us | 0.396 us | 0.331 us | 346.44 us | 345.90 us | 347.13 us | 48.6111 | 48.6111 | 48.6111 | 164264 B |

.NET 7 results:

| Method | Input | Mean | Error | StdDev | Median | Min | Max | Gen0 | Gen1 | Gen2 | Allocated |
|---|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| GetByteCount | EnglishAllAscii | 10.60 us | 0.002 us | 0.002 us | 10.60 us | 10.59 us | 10.60 us | - | - | - | - |
| GetBytes | EnglishAllAscii | 44.57 us | 0.058 us | 0.054 us | 44.58 us | 44.47 us | 44.65 us | 49.9302 | 49.9302 | 49.9302 | 163874 B |
| GetString | EnglishAllAscii | 59.10 us | 0.193 us | 0.181 us | 59.02 us | 58.86 us | 59.48 us | 99.8134 | 99.8134 | 99.8134 | 327715 B |
| GetByteCount | EnglishMostlyAscii | 73.94 us | 0.096 us | 0.090 us | 73.90 us | 73.81 us | 74.12 us | - | - | - | - |
| GetBytes | EnglishMostlyAscii | 179.52 us | 0.194 us | 0.172 us | 179.52 us | 179.20 us | 179.80 us | 52.5568 | 52.5568 | 52.5568 | 169916 B |
| GetString | EnglishMostlyAscii | 184.24 us | 0.204 us | 0.159 us | 184.25 us | 183.97 us | 184.44 us | 99.2647 | 99.2647 | 99.2647 | 327723 B |
| GetByteCount | Chinese | 34.35 us | 0.035 us | 0.033 us | 34.33 us | 34.32 us | 34.40 us | - | - | - | - |
| GetBytes | Chinese | 192.66 us | 0.160 us | 0.134 us | 192.66 us | 192.45 us | 192.88 us | 54.8780 | 54.8780 | 54.8780 | 177790 B |
| GetString | Chinese | 309.72 us | 0.156 us | 0.139 us | 309.71 us | 309.50 us | 309.94 us | 46.5686 | 46.5686 | 46.5686 | 150144 B |
| GetByteCount | Cyrillic | 21.09 us | 0.033 us | 0.027 us | 21.08 us | 21.08 us | 21.17 us | - | - | - | - |
| GetBytes | Cyrillic | 132.44 us | 0.189 us | 0.168 us | 132.41 us | 132.24 us | 132.79 us | 30.9874 | 30.9874 | 30.9874 | 100901 B |
| GetString | Cyrillic | 215.88 us | 0.380 us | 0.337 us | 215.85 us | 215.41 us | 216.58 us | 39.9306 | 39.9306 | 39.9306 | 130884 B |
| GetByteCount | Greek | 38.45 us | 0.292 us | 0.273 us | 38.60 us | 37.94 us | 38.70 us | - | - | - | - |
| GetBytes | Greek | 212.63 us | 0.644 us | 0.602 us | 212.42 us | 211.87 us | 214.07 us | 39.3836 | 39.3836 | 39.3836 | 129275 B |
| GetString | Greek | 331.28 us | 0.825 us | 0.731 us | 331.00 us | 330.44 us | 333.17 us | 49.2021 | 49.2021 | 49.2021 | 164298 B |

.NET 7 is an all-up improvement over .NET 3.1 results. cc @kunalspathak
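
As a rough cross-check of that claim, the Mean columns of the .NET 3.1 and .NET 7 results above can be compared directly. This is a minimal sketch, not part of the original benchmark run: the dictionary names and the `ratios` helper are illustrative, and it looks only at the reported means, ignoring the error margins.

```python
# Mean times in microseconds, copied from the .NET 3.1 and .NET 7 results above.
# The names net31, net7 and ratios are illustrative, not from the benchmark suite.
net31 = {
    ("GetByteCount", "EnglishAllAscii"): 34.04,
    ("GetBytes", "EnglishAllAscii"): 82.25,
    ("GetString", "EnglishAllAscii"): 71.68,
    ("GetByteCount", "EnglishMostlyAscii"): 93.30,
    ("GetBytes", "EnglishMostlyAscii"): 237.55,
    ("GetString", "EnglishMostlyAscii"): 217.55,
    ("GetByteCount", "Chinese"): 41.65,
    ("GetBytes", "Chinese"): 199.19,
    ("GetString", "Chinese"): 325.68,
    ("GetByteCount", "Cyrillic"): 36.38,
    ("GetBytes", "Cyrillic"): 163.33,
    ("GetString", "Cyrillic"): 225.84,
    ("GetByteCount", "Greek"): 46.46,
    ("GetBytes", "Greek"): 240.81,
    ("GetString", "Greek"): 346.43,
}
net7 = {
    ("GetByteCount", "EnglishAllAscii"): 10.60,
    ("GetBytes", "EnglishAllAscii"): 44.57,
    ("GetString", "EnglishAllAscii"): 59.10,
    ("GetByteCount", "EnglishMostlyAscii"): 73.94,
    ("GetBytes", "EnglishMostlyAscii"): 179.52,
    ("GetString", "EnglishMostlyAscii"): 184.24,
    ("GetByteCount", "Chinese"): 34.35,
    ("GetBytes", "Chinese"): 192.66,
    ("GetString", "Chinese"): 309.72,
    ("GetByteCount", "Cyrillic"): 21.09,
    ("GetBytes", "Cyrillic"): 132.44,
    ("GetString", "Cyrillic"): 215.88,
    ("GetByteCount", "Greek"): 38.45,
    ("GetBytes", "Greek"): 212.63,
    ("GetString", "Greek"): 331.28,
}


def ratios(baseline, candidate):
    """Return candidate/baseline mean-time ratio per benchmark (below 1.0 means faster)."""
    return {key: candidate[key] / baseline[key] for key in baseline}


speed = ratios(net31, net7)

# Print the benchmarks from most improved to least improved.
for (method, text), ratio in sorted(speed.items(), key=lambda kv: kv[1]):
    print(f"{method:<13} {text:<19} {ratio:.2f}")

# Any benchmark whose .NET 7 mean is slower than its .NET 3.1 mean.
regressions = [key for key, ratio in speed.items() if ratio > 1.0]
print("regressions:", regressions)
```

A ratio below 1.0 means the .NET 7 mean is lower than the .NET 3.1 mean for that benchmark. Comparing means alone is a simplification; a proper comparison would also account for the reported error and standard deviation.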

@TIHan
Member

TIHan commented Mar 27, 2023

Closing as these are not regressions anymore.

@TIHan TIHan closed this as completed Mar 27, 2023
@kunalspathak
Member

Thanks @TIHan for checking this.
