Move the IsLeft/IsRight decision out of the loop and use computed substring set #88516

Closed
wants to merge 5 commits into from

Conversation


@IDisposable IDisposable commented Jul 7, 2023

While reviewing #87510, I noticed that the inline code in the comparers sits on the hot-loop path, so it might be faster to move the IsLeft-conditional code out of the loop by adding a static Slicer to the comparer. The Slicer is the same logic that was originally in the bodies of the Equals and GetHashCode methods, and it also matches the delegates that were being passed to the internal CreateAnalysisResults method.

The slicing really only changes once per Count, so this moves the IsLeft-dependent logic into an aggressively inlined Slicer extension method, which makes things a bit faster and slightly reduces allocations.

Builds on #88709, a trivially correct change that has since been merged.

The summary of changes:

  • Eliminated the inner TryUseSubstring because we can just early-return the calculated results as we build them
  • Hoisted the calculation of acceptableNonUniqueCount out to the top level since it never changes, which means we pass it into the HasSufficientUniquenessFactor method for it to "use up" internally (passed by value, so unchanged at the call site)
  • Eliminated the delegate ReadOnlySpan<char> GetSpan and its use, which helps reduce dynamic dispatch overhead in the CreateAnalysisResults method
  • Eliminated the IsLeft field of the SubstringComparer since we can tell by the Index being negative that we're doing right-justified slicing (and documented that on the class)
  • Changed the logic managing the Index and Count on the comparer for right-justified substrings.
  • Added [MethodImpl(MethodImplOptions.AggressiveInlining)] to the Equals and GetHashCode overrides.
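
The hoisting described in the second bullet can be illustrated with a minimal sketch. Only the names acceptableNonUniqueCount and HasSufficientUniquenessFactor come from this PR; the signature, the `budget` parameter name, the prefix-based slicing, and the divisor of 5 (chosen so this tiny sample gets a budget of one collision) are assumptions made for illustration, not the actual runtime code.

```csharp
using System;
using System.Collections.Generic;

string[] uniqueStrings = { "apple", "apricot", "banana", "blueberry", "cherry" };

// Computed once, before trying any substring length.
// The divisor is an assumption for this sketch, not the runtime's threshold.
int acceptableNonUniqueCount = uniqueStrings.Length / 5;

var set = new HashSet<string>();
for (int count = 1; count <= 3; count++)
{
    // The budget is passed by value, so each attempt starts from the same number.
    bool ok = HasSufficientUniquenessFactor(set, uniqueStrings, count, acceptableNonUniqueCount);
    Console.WriteLine($"count={count}: {ok}");
}

// The caller's copy was never modified.
Console.WriteLine($"budget at call site: {acceptableNonUniqueCount}");

// Hypothetical signature: checks left-justified prefixes of `count` characters.
static bool HasSufficientUniquenessFactor(HashSet<string> set, string[] strings, int count, int budget)
{
    set.Clear();
    foreach (string s in strings)
    {
        // Each collision "uses up" one unit of the local budget.
        if (!set.Add(s.Substring(0, count)) && --budget < 0)
        {
            return false;
        }
    }
    return true;
}
```

With this sample, one-character prefixes collide twice and fail, while two- and three-character prefixes stay within the budget of one collision.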

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Jul 7, 2023

ghost commented Jul 7, 2023

Tagging subscribers to this area: @dotnet/area-system-collections
See info in area-owners.md if you want to be subscribed.

Issue Details

While reviewing #87510, I noticed that the inline code in the comparers sits on the hot-loop path, so it might be faster to move the IsLeft-conditional code out of the loop by adding a Slicer to the comparer. The Slicer is the same logic that was in the bodies of the Equals and GetHashCode methods, and it also matches the delegate that was passed to the internal CreateAnalysisResults method.

The slicing really only changes once per Count, so this moves the IsLeft-dependent logic into aggressively inlined SliceLeft and SliceRight methods to see if that makes things faster. It also reuses the same slicers for the calls to CreateAnalysisResults so they will get the same JIT perf benefits.

Also made a subtle tweak: because we know that ignoreCase is true (due to the test above), we can set the starting state of canSwitchIgnoreCaseToCaseSensitive explicitly to true.

BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22631.1972)
Intel Core i7-10875H CPU 2.30GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=8.0.100-preview.7.23322.33
  [Host]     : .NET 8.0.0 (8.0.23.32106), X64 RyuJIT AVX2
  Job-ZWELZX : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-KUEFVA : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:EnableUnsafeBinaryFormatterSerialization=true  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  WarmupCount=1  
| Method | Job | Toolchain | Count | Mean | Error | StdDev | Median | Min | Max | Ratio | MannWhitney(3ms) | RatioSD | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ToFrozenDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 2,057.56 ns | 23.550 ns | 20.877 ns | 2,053.70 ns | 2,004.57 ns | 2,095.98 ns | 1.26 | Same | 0.03 | 0.2027 | - | - | 1720 B | 1.00 |
| ToFrozenDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 1,644.78 ns | 32.011 ns | 35.580 ns | 1,631.83 ns | 1,603.70 ns | 1,715.13 ns | 1.00 | Base | 0.00 | 0.2047 | - | - | 1720 B | 1.00 |
| TryGetValue_True_FrozenDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 63.10 ns | 0.752 ns | 0.628 ns | 63.24 ns | 61.43 ns | 63.91 ns | 1.00 | Same | 0.02 | - | - | - | - | NA |
| TryGetValue_True_FrozenDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 63.30 ns | 0.842 ns | 0.788 ns | 63.31 ns | 62.34 ns | 65.07 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| ToFrozenDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 11,800.30 ns | 139.610 ns | 123.760 ns | 11,764.33 ns | 11,681.12 ns | 12,075.98 ns | 1.26 | Same | 0.01 | 1.4098 | 0.0470 | - | 12112 B | 1.00 |
| ToFrozenDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 9,348.64 ns | 91.765 ns | 76.628 ns | 9,367.20 ns | 9,209.04 ns | 9,481.99 ns | 1.00 | Base | 0.00 | 1.4145 | 0.0372 | - | 12112 B | 1.00 |
| TryGetValue_True_FrozenDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 647.76 ns | 12.870 ns | 12.640 ns | 645.45 ns | 630.03 ns | 677.18 ns | 1.00 | Same | 0.02 | - | - | - | - | NA |
| TryGetValue_True_FrozenDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 645.03 ns | 8.896 ns | 8.321 ns | 645.45 ns | 631.23 ns | 658.04 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| ToFrozenDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 107,105.57 ns | 1,078.411 ns | 955.984 ns | 106,742.36 ns | 106,269.73 ns | 109,294.56 ns | 1.26 | Same | 0.01 | 11.1301 | 2.5685 | - | 94864 B | 1.00 |
| ToFrozenDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 84,805.63 ns | 1,384.405 ns | 1,294.973 ns | 84,801.32 ns | 82,494.28 ns | 87,164.12 ns | 1.00 | Base | 0.00 | 11.2434 | 2.6455 | - | 94864 B | 1.00 |
| TryGetValue_True_FrozenDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 7,823.96 ns | 106.727 ns | 99.833 ns | 7,841.29 ns | 7,668.40 ns | 7,999.15 ns | 1.04 | Same | 0.02 | - | - | - | - | NA |
| TryGetValue_True_FrozenDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 7,542.00 ns | 104.779 ns | 92.883 ns | 7,555.03 ns | 7,364.24 ns | 7,702.98 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| ToFrozenDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 1,641,164.67 ns | 23,876.639 ns | 22,334.223 ns | 1,633,353.12 ns | 1,612,245.00 ns | 1,676,639.38 ns | 1.18 | Same | 0.02 | 143.7500 | 143.7500 | 143.7500 | 926133 B | 1.00 |
| ToFrozenDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 1,393,440.90 ns | 17,878.950 ns | 16,723.981 ns | 1,394,471.88 ns | 1,360,370.31 ns | 1,421,018.75 ns | 1.00 | Base | 0.00 | 145.8333 | 145.8333 | 145.8333 | 926134 B | 1.00 |
| TryGetValue_True_FrozenDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 129,167.96 ns | 2,134.712 ns | 1,996.811 ns | 129,264.68 ns | 124,220.63 ns | 132,173.42 ns | 1.01 | Same | 0.02 | - | - | - | - | NA |
| TryGetValue_True_FrozenDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 128,107.54 ns | 1,706.036 ns | 1,595.827 ns | 127,981.50 ns | 126,341.48 ns | 131,397.43 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| ToDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 138.87 ns | 1.287 ns | 1.204 ns | 138.58 ns | 136.88 ns | 141.00 ns | 0.94 | Same | 0.02 | 0.0526 | - | - | 440 B | 1.00 |
| ToDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 147.84 ns | 2.518 ns | 2.355 ns | 147.02 ns | 145.32 ns | 153.22 ns | 1.00 | Base | 0.00 | 0.0526 | - | - | 440 B | 1.00 |
| ToImmutableDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 1,299.27 ns | 83.702 ns | 96.391 ns | 1,250.37 ns | 1,202.89 ns | 1,479.53 ns | 1.12 | Same | 0.08 | 0.0860 | - | - | 736 B | 1.00 |
| ToImmutableDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 1,178.81 ns | 16.472 ns | 15.407 ns | 1,171.80 ns | 1,158.97 ns | 1,204.85 ns | 1.00 | Base | 0.00 | 0.0847 | - | - | 736 B | 1.00 |
| TryGetValue_True_Dictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 128.08 ns | 1.673 ns | 1.483 ns | 128.16 ns | 125.29 ns | 130.57 ns | 1.00 | Same | 0.01 | - | - | - | - | NA |
| TryGetValue_True_Dictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 128.21 ns | 1.885 ns | 1.763 ns | 128.02 ns | 125.36 ns | 131.44 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| TryGetValue_True_ImmutableDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 164.46 ns | 2.012 ns | 1.882 ns | 164.70 ns | 162.56 ns | 168.03 ns | 1.00 | Same | 0.01 | - | - | - | - | NA |
| TryGetValue_True_ImmutableDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10 | 164.62 ns | 2.097 ns | 1.962 ns | 163.06 ns | 162.89 ns | 167.53 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| ToDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 1,097.04 ns | 21.699 ns | 20.298 ns | 1,095.36 ns | 1,072.24 ns | 1,134.88 ns | 1.13 | Same | 0.03 | 0.3714 | 0.0043 | - | 3128 B | 1.00 |
| ToDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 972.18 ns | 16.163 ns | 15.119 ns | 968.03 ns | 958.70 ns | 1,002.61 ns | 1.00 | Base | 0.00 | 0.3706 | 0.0039 | - | 3128 B | 1.00 |
| ToImmutableDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 17,821.67 ns | 327.754 ns | 306.582 ns | 17,853.37 ns | 17,381.97 ns | 18,333.97 ns | 0.95 | Same | 0.02 | 0.7301 | - | - | 6496 B | 1.00 |
| ToImmutableDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 18,780.61 ns | 259.705 ns | 230.222 ns | 18,796.94 ns | 18,384.63 ns | 19,208.59 ns | 1.00 | Base | 0.00 | 0.7449 | - | - | 6496 B | 1.00 |
| TryGetValue_True_Dictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 1,442.70 ns | 19.214 ns | 17.973 ns | 1,442.10 ns | 1,410.93 ns | 1,475.64 ns | 0.95 | Same | 0.01 | - | - | - | - | NA |
| TryGetValue_True_Dictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 1,524.14 ns | 21.632 ns | 18.064 ns | 1,521.86 ns | 1,490.48 ns | 1,565.28 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| TryGetValue_True_ImmutableDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 1,772.76 ns | 24.517 ns | 22.934 ns | 1,769.24 ns | 1,735.35 ns | 1,807.67 ns | 0.88 | Same | 0.02 | - | - | - | - | NA |
| TryGetValue_True_ImmutableDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 100 | 2,006.92 ns | 29.358 ns | 26.025 ns | 2,011.73 ns | 1,962.37 ns | 2,050.82 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| ToDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 9,776.74 ns | 168.881 ns | 149.708 ns | 9,747.11 ns | 9,596.95 ns | 10,043.39 ns | 0.98 | Same | 0.02 | 3.6833 | 0.3877 | - | 31016 B | 1.00 |
| ToDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 10,041.68 ns | 178.905 ns | 167.348 ns | 9,997.95 ns | 9,819.95 ns | 10,373.81 ns | 1.00 | Base | 0.00 | 3.6694 | 0.4032 | - | 31016 B | 1.00 |
| ToImmutableDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 281,817.08 ns | 4,379.124 ns | 4,300.886 ns | 281,971.82 ns | 275,266.78 ns | 288,499.65 ns | 1.04 | Same | 0.03 | 6.9444 | 1.1574 | - | 64097 B | 1.00 |
| ToImmutableDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 272,101.95 ns | 5,216.817 ns | 5,357.287 ns | 270,768.10 ns | 266,252.05 ns | 285,728.56 ns | 1.00 | Base | 0.00 | 7.5431 | 2.1552 | - | 64097 B | 1.00 |
| TryGetValue_True_Dictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 17,205.75 ns | 183.564 ns | 162.725 ns | 17,180.39 ns | 16,956.92 ns | 17,508.42 ns | 0.98 | Same | 0.02 | - | - | - | - | NA |
| TryGetValue_True_Dictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 17,615.34 ns | 196.353 ns | 183.669 ns | 17,640.08 ns | 17,242.77 ns | 17,890.94 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| TryGetValue_True_ImmutableDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 65,528.70 ns | 303.425 ns | 253.374 ns | 65,455.07 ns | 65,180.80 ns | 66,062.89 ns | 0.99 | Same | 0.01 | - | - | - | - | NA |
| TryGetValue_True_ImmutableDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 1000 | 66,317.31 ns | 919.599 ns | 860.194 ns | 65,766.51 ns | 65,670.49 ns | 68,352.98 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| ToDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 135,225.46 ns | 1,934.785 ns | 1,809.799 ns | 134,627.07 ns | 132,521.68 ns | 137,873.59 ns | 1.07 | Same | 0.04 | 50.0000 | 46.7391 | 45.1087 | 283324 B | 1.00 |
| ToDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 126,728.45 ns | 3,881.702 ns | 4,470.174 ns | 126,718.90 ns | 117,977.22 ns | 134,411.62 ns | 1.00 | Base | 0.00 | 46.4876 | 42.8719 | 41.8388 | 283336 B | 1.00 |
| ToImmutableDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 3,905,016.98 ns | 64,558.467 ns | 60,388.030 ns | 3,887,075.00 ns | 3,836,512.50 ns | 4,007,631.25 ns | 1.00 | Same | 0.02 | 62.5000 | 46.8750 | - | 640108 B | 1.00 |
| ToImmutableDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 3,921,738.06 ns | 46,705.911 ns | 41,403.573 ns | 3,921,446.88 ns | 3,879,031.25 ns | 4,003,846.88 ns | 1.00 | Base | 0.00 | 62.5000 | 46.8750 | - | 640108 B | 1.00 |
| TryGetValue_True_Dictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 239,574.73 ns | 3,271.035 ns | 2,731.463 ns | 240,082.88 ns | 233,648.37 ns | 243,980.29 ns | 1.00 | Same | 0.01 | - | - | - | - | NA |
| TryGetValue_True_Dictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 239,909.95 ns | 2,255.089 ns | 1,999.078 ns | 239,680.77 ns | 235,758.37 ns | 243,502.01 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |
| TryGetValue_True_ImmutableDictionary | Job-ZWELZX | \runtime\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 1,116,305.32 ns | 5,426.525 ns | 4,531.395 ns | 1,114,173.66 ns | 1,113,416.52 ns | 1,127,808.48 ns | 0.99 | Same | 0.01 | - | - | - | - | NA |
| TryGetValue_True_ImmutableDictionary | Job-KUEFVA | \runtime_baseline\artifacts\bin\testhost\net8.0-windows-Release-x64\shared\Microsoft.NETCore.App\8.0.0\CoreRun.exe | 10000 | 1,133,305.93 ns | 16,927.154 ns | 15,005.481 ns | 1,127,919.87 ns | 1,121,172.32 ns | 1,168,315.62 ns | 1.00 | Base | 0.00 | - | - | - | - | NA |

In the JustifiedSubstringComparer, uses the current Slicer

Author: IDisposable
Assignees: -
Labels:

area-System.Collections, community-contribution

Milestone: -

@IDisposable IDisposable force-pushed the faster-freeze-strings branch 5 times, most recently from 00e52d1 to d9d188b Compare July 12, 2023 07:41
public override bool Equals(string? x, string? y) => x.AsSpan(IsLeft ? Index : (x!.Length + Index), Count).SequenceEqual(y.AsSpan(IsLeft ? Index : (y!.Length + Index), Count));
public override int GetHashCode(string s) => Hashing.GetHashCodeOrdinal(s.AsSpan(IsLeft ? Index : (s.Length + Index), Count));
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public override bool Equals(string? x, string? y) => x!.Slicer(Index, Count).SequenceEqual(y!.Slicer(Index, Count));

The Slicer does the left/right based on the sign of the Index value now, which should JIT down better.

I really wish that String.AsSpan understood the use of -1 start intrinsically as string.Length - 1... that would make the ternary unneeded.
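
As a rough illustration of the sign-based slicing described here, the sketch below uses a standalone static local function rather than the extension method from the diff, so that it runs as a single top-level program. The sample string and the printed values are assumptions chosen for the demo; the slicing expression itself mirrors the one in this PR.

```csharp
using System;
using System.Runtime.CompilerServices;

// Non-negative index: slice from the left. Negative index: slice relative to the end.
Console.WriteLine(Slicer("hello world", 0, 5).ToString());   // left-justified: "hello"
Console.WriteLine(Slicer("hello world", -5, 5).ToString());  // right-justified: "world"

// Same ternary as the Slicer in the diff, written as a local function for this demo.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static ReadOnlySpan<char> Slicer(string s, int index, int count) =>
    s.AsSpan(index >= 0 ? index : s.Length + index, count);
```

Because the sign of the index carries the left/right decision, the comparer no longer needs a separate IsLeft field to pick the slice.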

{
// For each index, get a uniqueness factor for the right-justified substrings.
// If any is above our threshold, we're done.
comparer.Index = -count;

Here's a real improvement. We do the math for setting a negative index (i.e. one measured from the right side), starting with count characters from the right (as before). Then on line 81 we keep decrementing comparer.Index as we go, moving the "cursor" left.
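
To make the moving cursor concrete, here is a small sketch that lists the right-justified windows visited when the index starts at -count and is decremented. The helper name RightJustifiedWindows and the sample input are hypothetical; the real loop updates comparer.Index in place and feeds a HashSet rather than building a list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collect every right-justified window of width `count`.
static List<string> RightJustifiedWindows(string s, int count)
{
    var windows = new List<string>();
    int index = -count;                 // window flush with the right edge
    while (s.Length + index >= 0)       // stop once the window would pass the left edge
    {
        windows.Add(s.AsSpan(s.Length + index, count).ToString());
        index--;                        // move the cursor one character to the left
    }
    return windows;
}

// For "abcdef" and count = 2 this prints: ef, de, cd, bc, ab
Console.WriteLine(string.Join(", ", RightJustifiedWindows("abcdef", 2)));
```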

@IDisposable IDisposable force-pushed the faster-freeze-strings branch 3 times, most recently from 8c2b2e4 to 865a9ee Compare July 21, 2023 05:50
@@ -263,25 +241,34 @@ public AnalysisResults(bool ignoreCase, bool allAsciiIfIgnoreCase, int hashIndex
public bool RightJustifiedSubstring => HashIndex < 0;
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static ReadOnlySpan<char> Slicer(this string s, int index, int count) => s.AsSpan((index >= 0 ? index : s.Length + index), count);

This ternary is the sole remaining conditional jump in the hot loop, but there's no good way to avoid it while also avoiding delegate overhead, so I made it as simple as possible. The number of jumps (old vs. new) is identical, but this is a tiny, tiny block of JITtable goodness.


I have tried running this as a jumpless method body, but it makes little difference and is harder to grok. It means passing fromRight = 0 for left or fromRight = 1 for right, which requires the SubstringComparers to carry the left/right multiplier with them (so more state...):

public static ReadOnlySpan<char> Slicer(this string s, byte fromRight, int index, int count) => s.AsSpan((s.Length * fromRight) + index, count);

Also, if we COULD swap the HashSet<string>'s comparer out for left vs. right, then we could embed that knowledge in a trait and go fully jumpless, but that would require allocating two HashSet<string>s, which might be an allocation regression nobody wants :(


I did something like that in PR #89689, which is a huge win. Since we can't swap the comparer, I ended up creating both a left and a right HashSet, each with a backing comparer that "hard codes" the left/right logic. HUGE WINS, see that PR.

@IDisposable IDisposable changed the title Move the IsLeft/IsRight decision out of the loop Move the IsLeft/IsRight decision out of the loop and pass computed set Jul 21, 2023
@IDisposable IDisposable changed the title Move the IsLeft/IsRight decision out of the loop and pass computed set Move the IsLeft/IsRight decision out of the loop and use computed substring set Jul 21, 2023
@IDisposable IDisposable marked this pull request as draft July 21, 2023 08:48
@IDisposable IDisposable marked this pull request as ready for review July 27, 2023 23:15
@@ -129,7 +109,7 @@ private static bool TryUseSubstring(ReadOnlySpan<string> uniqueStrings, bool ign
foreach (string s in uniqueStrings)
{
// Get the span for the substring.
ReadOnlySpan<char> substring = getSubstringSpan(s, index, count);
ReadOnlySpan<char> substring = count == 0 ? s.AsSpan() : Slicer(s, index, count);

We need to re-slice the string again here. It would be awesome if we could have a HashSet<ReadOnlySpan<char>>, but that's not going to happen, as spans are (ref) structs, not objects.

The slicing is really only changed once per Count, so move the
IsLeft-dependent logic into `Slicer` method and eliminate the `GetSpan` delegate.

Changed to also pass the already-computed `set` of unique substrings to the `CreateAnalysisResults` method, so we don't compute the slices twice. In order that either the set or the original `uniqueStrings` can be passed, changed that argument of the `Analyze` method to take the `uniqueStrings` as a `string[]` (which it already is at all call sites).
Subtle bug in that the entire string is being placed in the set, not the span.
Since we are working with the same set of input strings in each strategy there's no reason to take the length every time we make an attempt (per count, both left and right justified).
Hoist the calculation of the acceptable number of collisions out to the top, do it once, and pass that number into the `HasSufficientUniquenessFactor` method for it to (locally) use up.
Benchmarks ever so slightly better.
Looks like the overhead of IEnumerable<string> is not worth the savings for the benchmark test data. Perhaps it would matter less if we were freezing more strings, but not likely
@IDisposable

Performance tests (note: PR #89689 wipes the floor with this, so we should merge that instead)

  • Baseline is main as of d5c4a4e
  • Faster Freeze is the code in this PR, which shows a 1%-3% performance improvement with unchanged allocations
BenchmarkDotNet=v0.13.2.2052-nightly, OS=Windows 11 (10.0.22631.2115)
Intel Core i7-10875H CPU 2.30GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=8.0.100-preview.7.23322.33
  [Host]     : .NET 8.0.0 (8.0.23.32106), X64 RyuJIT AVX2
  Job-PMFJSX : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-XUZNBJ : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-ORKLIM : .NET 8.0.0 (42.42.42.42424), X64 RyuJIT AVX2

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:EnableUnsafeBinaryFormatterSerialization=true  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  WarmupCount=1  
| Method | Job | Toolchain | Count | Mean | Error | StdDev | Median | Min | Max | Ratio | MannWhitney(2ms) | RatioSD | Gen0 | Gen1 | Gen2 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ToFrozenDictionary | Job-XUZNBJ | Baseline | 10 | 1,271.2 ns | 24.41 ns | 28.11 ns | 1,275.2 ns | 1,216.0 ns | 1,302.7 ns | 1.00 | Base | 0.00 | 0.2055 | - | - | 1.68 KB | 1.00 |
| ToFrozenDictionary | Job-ORKLIM | Faster Freeze | 10 | 1,246.0 ns | 27.06 ns | 31.17 ns | 1,263.2 ns | 1,194.8 ns | 1,283.6 ns | 0.98 | Same | 0.02 | 0.2033 | - | - | 1.67 KB | 1.00 |
| ToFrozenDictionary | Job-XUZNBJ | Baseline | 100 | 7,568.4 ns | 168.25 ns | 193.76 ns | 7,598.1 ns | 7,197.6 ns | 7,828.9 ns | 1.00 | Base | 0.00 | 1.4281 | 0.0304 | - | 11.83 KB | 1.00 |
| ToFrozenDictionary | Job-ORKLIM | Faster Freeze | 100 | 7,523.3 ns | 156.77 ns | 180.54 ns | 7,583.8 ns | 7,023.0 ns | 7,818.5 ns | 0.99 | Same | 0.03 | 1.4343 | 0.0305 | - | 11.82 KB | 1.00 |
| ToFrozenDictionary | Job-XUZNBJ | Baseline | 1000 | 70,430.6 ns | 1,377.38 ns | 1,586.20 ns | 70,729.1 ns | 67,759.9 ns | 73,232.6 ns | 1.00 | Base | 0.00 | 11.3147 | 2.6940 | - | 92.64 KB | 1.00 |
| ToFrozenDictionary | Job-ORKLIM | Faster Freeze | 1000 | 68,147.0 ns | 1,356.55 ns | 1,507.80 ns | 68,636.0 ns | 64,592.7 ns | 69,908.1 ns | 0.97 | Same | 0.02 | 11.1229 | 2.6483 | - | 92.63 KB | 1.00 |
| ToFrozenDictionary | Job-XUZNBJ | Baseline | 10000 | 1,144,862.8 ns | 21,895.81 ns | 20,481.36 ns | 1,150,661.8 ns | 1,108,326.4 ns | 1,179,999.3 ns | 1.00 | Base | 0.00 | 145.8333 | 145.8333 | 145.8333 | 904.43 KB | 1.00 |
| ToFrozenDictionary | Job-ORKLIM | Faster Freeze | 10000 | 1,131,728.1 ns | 21,741.06 ns | 19,272.89 ns | 1,134,184.4 ns | 1,105,237.5 ns | 1,158,713.3 ns | 0.99 | Same | 0.03 | 148.4375 | 148.4375 | 148.4375 | 904.42 KB | 1.00 |

@IDisposable

Did much better in PR #89689

@dotnet dotnet locked as resolved and limited conversation to collaborators Sep 8, 2023