Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Add Span<T>.Sort(...) by changing Array.Sort internals to be Span based #24419

Closed
wants to merge 42 commits into from

Conversation

nietras
Copy link

@nietras nietras commented May 6, 2019

After the exercise in futility of my previous PRs #16986 (and dotnet/corefx#26859), where I tried adding Sort(...) in as performant a way as possible in C# code only (using refs throughout and with the intent to replace existing array sort code too), and these being declined due to security concerns, this PR tries to add Sort(...) for spans by doing as minimal as possible changes to existing Array.Sort code while trying to avoid yet another duplication of sorting code.

The PR starts with exploring whether these minimal changes are a valid path forward, hence changes are cursory (just to explore the path forward). What is missing is whether it is possible to make an overload for TrySZSort that works for Span<T>, is it?

CC: @danmosemsft @jkotas

This is a work in progress and here are some TODOs:

  • Benchmark minimal changes needed to be sure there are no performance regressions
  • Handle TComparer for TKey only (currently based on Comparison<TKey>)
    • Supported via boxing first, later PR can add proper support
  • Handle Comparison<TKey> for TKey, TValue case (no such version exists)
    • Supported via wrapping in ComparisonComparer<T>, later PR can add proper support
  • Clean up null checks and debug asserts (not needed for spans since cannot be null)

ref byte byteRef = ref Unsafe.As<T, byte>(ref array.GetPinnableReference());
fixed (byte* ptr = &byteRef)
{
if (Array.TrySZSort(new IntPtr(ptr), typeof(T), IntPtr.Zero, typeof(T), index, index + length - 1))
Copy link
Author

@nietras nietras May 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a way to either call TrySZSort here for spans as suggested here

fixed (byte* keysPtr = &keysRef)
fixed (byte* itemsPtr = &itemsRef)
{
if (Array.TrySZSort(new IntPtr(keysPtr), typeof(TKey), new IntPtr(itemsPtr), typeof(TValue), index, index + length - 1))
Copy link
Author

@nietras nietras May 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a way to either call TrySZSort here for spans as suggested here

@stephentoub
Copy link
Member

This general approach makes sense to me for 3.0, as long as it can be done without regressing array sorting perf. If array perf doesn't regress, then at least we end up with something functional for spans, and it can be optimized for subsequent releases.

@jkotas
Copy link
Member

jkotas commented May 7, 2019

I agree with @stephentoub that a simple, safe, less-optimized implementation is a good way to start on this.

@jkotas
Copy link
Member

jkotas commented May 7, 2019

The array to Span conversion throws for co-variant array. There may be a bit of unsafe code needed to avoid throwing for Sort on co-variant arrays.


if (length > 1)
{
#if CORECLR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be fine to skip this optimization in the initial implementation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the #if CORECLR, note the comment though, this will mean Span sort might be different for equal keys since the different Sort variants are not 100% identical as discussed in #16986

@jkotas
Copy link
Member

jkotas commented May 7, 2019

as long as it can be done without regressing array sorting perf

Could you please collect some performance numbers on this?

@nietras
Copy link
Author

nietras commented May 7, 2019

a simple, safe, less-optimized implementation is a good way to start

@stephentoub @jkotas good :) Your initial thoughts mirror mine, we are facing two issues with changing the "backbone" of sort with Span-based versions:

  • Co-variant arrays
  • Possible performance regressions

array to Span conversion throws for co-variant array. There may be a bit of unsafe code needed to avoid throwing for Sort on co-variant arrays.

@jkotas yes exactly, which is why I was wondering if this was a viable path, could you expand on the "bit of unsafe code" needed? I assume we have to convert to a Span with the same type as the underlying array, but what about the custom comparers then?

as long as it can be done without regressing array sorting perf

@stephentoub right that is the question. Lets for now assume that the creation of the Spans for the input arrays has neglible overhead even for small numbers of elements. The biggest problem here then is whether the Span based sorting code will not regress performance.

I haven't done a lot coreclr based development so I will have to get into how to do benchmarks of this first. :)

One thing I can say though, from the experience from the previous PRs, if Span API methods does not call TrySZSort, which is what @jkotas is suggesting and I think is a good start, then the performance for Span for the types supported by TrySZSort will be significantly slower will high likelihood. This could be addressed later I think that is a good plan.

@nietras
Copy link
Author

nietras commented May 7, 2019

Could you please collect some performance numbers on this?

@jkotas I assume there are existing benchmarks in place I can use or?

@jkotas
Copy link
Member

jkotas commented May 7, 2019

I assume we have to convert to a Span with the same type as the underlying array

I think we need to convert T[] to Span<T>, with the assumption that the sort code only swaps elements within the array and co-variant checks are not needed. It can be something like:

new Span<T>(ref Unsafe.As<byte, T>(ref array.GetRawSzArrayData()), a.Length)

I assume there are existing benchmarks in place I can use or?

There is basic one in https://github.com/dotnet/performance repo:
https://github.com/dotnet/performance/blob/2296c92879ed0744cec9b983ac28ac208e7018bd/src/benchmarks/micro/corefx/System.Collections/Sort.cs#L40 . We may need more comprehensive coverage than this for this PR.

@nietras
Copy link
Author

nietras commented May 7, 2019

new Span(ref Unsafe.As<byte, T>(ref array.GetRawSzArrayData()), a.Length)

@jkotas I should have updated to do that for all array to span conversion. Let me know if it looks ok. :)

@nietras
Copy link
Author

nietras commented May 8, 2019

@jkotas now that I have added the proper Span Sort(...) methods, the first issues arise. And things are coming back to me from a year ago :)

The span Sort methods take a TComparer to allow value type use and inlining of the comparer. I have added support for that for TKey, TValue scenarios, since this was easy for the existing code. However, for just TKey the only version that exist is based on Comparison<TKey> and changing a comparer to a delegate by forwarding the method comparer.Compare.

Overall, the array sorting API and code is not exactly consistent in API surface and underlying code across situations. The Span sort API is. But we then face how to use existing sort code and if we can change that or have to add/duplicate sorting code, which means bigger changes.

How do you think we should proceed?

I think we should limit changes and do initial benchmarks as soon as possible to be sure these changes do not lead to performance regressions. But would like at least to have "complete" code that supports both Array and Span APIs so changes needed for that are present and hence benchmarked.

@mikedn
Copy link

mikedn commented May 8, 2019

However, for just TKey the only version that exist is based on Comparison and changing a comparer to a delegate by forwarding the method comparer.Compare.

That's because delegate calls are slightly faster than interface calls. I suppose you can replace the delegate with TComparer throughout the sorting code and wrap the existing delegate in a suitable TComparer.

That said, the whole thing with TComparer is a bit unusual. While it's true that it allows inlining of the compare predicate and that should speed up things, it can also result in significant code bloat if this technique becomes widely used. Well, I suppose it won't since it's far more convenient to write Sort((x, y) => …) than to create a special struct comparer.

@nietras
Copy link
Author

nietras commented Jun 2, 2019

@jkotas ah it appears the master had changed which was the reason for nullables changes, I have merged with latest master to resolve conflict.

@nietras
Copy link
Author

nietras commented Jun 2, 2019

ci build fails with:

shared\System\Collections\Generic\ArraySortHelper.cs(155,10): error CS0246: The type or namespace name 'MethodImplAttribute' could not be found (are you missing a using directive or an assembly reference?) [F:\workspace\_work\1\s\src\System.Private.CoreLib\System.Private.CoreLib.csproj]
shared\System\Collections\Generic\ArraySortHelper.cs(155,10): error CS0246: The type or namespace name 'MethodImpl' could not be found (are you missing a using directive or an assembly reference?) [F:\workspace\_work\1\s\src\System.Private.CoreLib\System.Private.CoreLib.csproj]
shared\System\Collections\Generic\ArraySortHelper.cs(155,21): error CS0103: The name 'MethodImplOptions' does not exist in the current context [F:\workspace\_work\1\s\src\System.Private.CoreLib\System.Private.CoreLib.csproj]

i.e. type or namespace name 'MethodImpl' could not be found how can I resolve this?

@nietras
Copy link
Author

nietras commented Jun 2, 2019

I also have problems building locally after merging with master... even clean.cmd fails:

.\clean.cmd
Installing dotnet using Arcade...
running: powershell -NoProfile -ExecutionPolicy unrestricted -Command ". C:\git\oss\coreclr\eng\configure-toolset.ps1; . C:\git\oss\coreclr\eng\common\tools.ps1; InitializeBuildTool"

Name                           Value
----                           -----
Path                           C:\git\oss\coreclr\.dotnet\dotnet.exe
Command                        msbuild
Tool                           dotnet
Framework                      netcoreapp2.1
Running: dotnet msbuild /nologo /verbosity:minimal /clp:Summary /nodeReuse:false /flp:v=normal;LogFile=clean.log /t:CleanAllProjects
C:\git\oss\coreclr\all.locproj(14,3): error MSB4019: The imported project "C:\tools\devdiv\loc\Loctask\Localization.settings.targets" was not found. Confirm that the path in the <Import> declaration is correct, and that the file exists on disk.

Build FAILED.

C:\git\oss\coreclr\all.locproj(14,3): error MSB4019: The imported project "C:\tools\devdiv\loc\Loctask\Localization.settings.targets" was not found. Confirm that the path in the <Import> declaration is correct, and that the file exists on disk.

@@ -1578,7 +1578,9 @@ public static void Sort<T>(T[] array, int index, int length, System.Collections.
return;
}

ArraySortHelper<TKey, TValue>.Default.Sort(keys, items, index, length, comparer);
var spanKeys = new Span<TKey>(ref Unsafe.As<byte, TKey>(ref keys.GetRawArrayData()), keys.Length);
var spanItems = new Span<TValue>(ref Unsafe.As<byte, TValue>(ref items!.GetRawArrayData()), items!.Length); // TODO-NULLABLE: https://github.com/dotnet/csharplang/issues/538
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: please change the comment to:

// TODO-NULLABLE: Remove ! when [DoesNotReturn] respected

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -154,7 +152,8 @@ private static void SwapIfGreater(T[] keys, Comparison<T> comparer, int a, int b
}
}

private static void Swap(T[] a, int i, int j)
[MethodImpl(MethodImplOptions.AggressiveInlining)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this was found to important on Swap, I'm curious that it wasn't found to be important on SwapIfGreater? Is that because SwapIfGreater already wasn't getting inlined?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because SwapIfGreater already wasn't getting inlined?

@stephentoub yes. And depending on input SwapIfGreater is not called nearly as often. Depends a lot on specific input data "median of three killer sequence" would probably change that statement. There are plenty of things that can be done to improve the performance of existing sorting code, as my previous PRs did. Although, .NET Core has also moved since so some things might no longer be needed.


/// <summary>
/// Sorts the elements in the entire <see cref="Span{T}" />
/// using the <see cref="IComparable" /> implementation of each
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This is going to link to the non-generic IComparable rather than the generic one. Same for other occurrences.

/// Sorts the elements in the entire <see cref="Span{T}" />
/// using the <typeparamref name="TComparer" />.
/// </summary>
public static void Sort<T, TComparer>(this Span<T> span, TComparer comparer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be TComparer? comparer instead of TComparer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually the desired / approved shape, having Sort be generic on both the T and the comparer type?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved shape, having Sort be generic on both the T and the comparer type?

@stephentoub yes, this is the approved API. It was specifically designed to allow for value type comparers and hence specific code paths for that. Yes that means "code bloat" that is the purpose of it. :) In fact, it was a specific design goal of mine for the API after struggling for years with the restrictive and somewhat inconsistent Array.Sort API, where we have often had to simply resort to put stl::sort in a native dll and interop to it, given we had native memory and a memory copy would be too expensive, and had specific sorting needs.

There already is specific code paths and "code bloat" for value types in current sorting code for IComparable<T> where T : struct, nothing new here, this simply adds orthogonality to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was specifically designed to allow for value type comparers and hence specific code paths for that

Is that really a common case? This implementation also doesn't appear to be benefiting from that at all, and worse, hides the fact that it's allocating.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that really a common case?

@stephentoub in my view yes. I have seen enough Array.Reverse too. In my view there is no downside to this, most users will simply use Sort() or Sort((x, y) => x.CompareTo(y)), advanced users who need this can use the value type comparer.

doesn't appear to be benefiting from that at all, and worse, hides the fact that it's allocating.

@stephentoub we agreed to do this step wise. One step at a time. I intend to fix this in later PRs, this will require code duplication etc.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub note that Array.Sort already does allocs for a custom comparer, converting it to Comparison<T>. I have noted this before and shown this in benchmarks. Can't run the latest ones now since I can't build coreclr anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be best to omit this shape of the API from this step.

The value prop of this special API is better performance. If the implementation does not actually provide better performance, it is just going to cause confusion.

/// One or more elements do not implement the <see cref="IComparable" /> interface.
/// </exception>
public static void Sort<T>(this Span<T> span) =>
Sort(span, (IComparer<T>)null!);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This ! shouldn't be necessary once the below change is made.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(It'll need to be (IComparer?) for the cast.)

ThrowHelper.ThrowArgumentNullException(ExceptionArgument.comparison);
if (span.Length > 1)
{
ArraySortHelper<T>.Sort(span, 0, span.Length, comparison!); // TODO-NULLABLE: https://github.com/dotnet/csharplang/issues/538
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment nit as earlier.

/// element of the <see cref= "Span{TKey}"/>.
/// </summary>
public static void Sort<TKey, TValue>(this Span<TKey> keys, Span<TValue> items) =>
Sort(keys, items, (IComparer<TKey>)null!);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, the ! shouldn't be necessary.

/// using the <typeparamref name="TComparer" />.
/// </summary>
public static void Sort<TKey, TValue, TComparer>(this Span<TKey> keys,
Span<TValue> items, TComparer comparer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, this TComparer comparer should be TComparer? comparer.

{
// TODO: Add Comparison<T> overload that is not available in existing Sort code.
// Hence Comparison<T> is wrapped in ComparisonComparer<T> for now.
var comparisonComparer = new ComparisonComparer<TKey>(comparison!); // TODO-NULLABLE: https://github.com/dotnet/csharplang/issues/538
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment nit.

// might differ in both performance and sorting for equal keys

// TODO: Add TComparer support until then value type comparers will be boxed
// and performance will be affected
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a TODO for when... as part of this PR, or subsequently? If subsequently, please make sure there's an open issue for it and include the issue # here in the TODO. But this also contributes to my question about whether the TComparer needs to be generic.

{
// Span based Sort does not call `TrySZSort`, but instead
// uses the managed generic code only. This means Span based Sort
// might differ in both performance and sorting for equal keys
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't affect this PR, but have you measure to see what the difference is between an array-based sort that does use TrySZSort and a span-based sort that doesn't? How far off are they?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub comparing int vs IntStruct can indicate this (showing the reverse in this case).

int

Method Toolchain Size Mean Error StdDev Median Min Max Ratio RatioSD Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
Array \coreclr-base 512 5.665 us 0.0367 us 0.0343 us 5.670 us 5.608 us 5.725 us 1.00 0.00 - - - -
Array \coreclr 512 5.697 us 0.0393 us 0.0368 us 5.693 us 5.636 us 5.779 us 1.01 0.01 - - - -

IntStruct

Method Toolchain Size Mean Error StdDev Median Min Max Ratio RatioSD Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
Array \coreclr-base\ 512 3.785 us 0.1062 us 0.1222 us 3.714 us 3.684 us 4.021 us 1.00 0.00 - - - -
Array \coreclr\ 512 3.641 us 0.0299 us 0.0280 us 3.646 us 3.581 us 3.701 us 0.96 0.03 - - - -

Quite a difference here in favor of the managed code path.

Note that this is not just about perf, native sorting code does a NaNPrepass (if I remember correctly) for floats, that the span version would then not do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above is for

BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.17763.503 (1809/October2018Update/Redstone5)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100-preview5-011568
  [Host]     : .NET Core 3.0.0-preview5-27626-15 (CoreCLR 4.6.27622.75, CoreFX 4.700.19.22408), 64bit RyuJIT
  Job-JTKJDV : .NET Core ? (CoreCLR 4.6.27803.0, CoreFX 4.700.19.22901), 64bit RyuJIT
  Job-DBBNCI : .NET Core ? (CoreCLR 4.6.27808.0, CoreFX 4.700.19.22901), 64bit RyuJIT

Runtime=Core  InvocationCount=10000  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  UnrollFactor=1  
WarmupCount=1  

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub you can see some more benchmarks in the gist here:

https://gist.github.com/nietras/ccf8199a9722a396731a0597d55916bf

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub after merging with master I have rerun the benchmarks with higher invocation count (25000). This is more inline with what I expected with regards to native (int) vs managed code (IntStruct). native being a bit faster with the caveats that this is based on using IntStruct as a replacement for int.

coreclr-base = 5af77fa (master)
coreclr = 11dd34a (from this PR branch)

Int32

BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.17763.503 (1809/October2018Update/Redstone5)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100-preview5-011568
  [Host]     : .NET Core 3.0.0-preview5-27626-15 (CoreCLR 4.6.27622.75, CoreFX 4.700.19.22408), 64bit RyuJIT
  Job-VKBMMT : .NET Core ? (CoreCLR 4.700.19.30301, CoreFX 4.700.19.27908), 64bit RyuJIT
  Job-SABOCD : .NET Core ? (CoreCLR 4.700.19.30301, CoreFX 4.700.19.27908), 64bit RyuJIT

Runtime=Core  InvocationCount=25000  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  UnrollFactor=1  
WarmupCount=1  
Method Size Mean Error StdDev Median Min Max Ratio RatioSD Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
Array \coreclr-base 512 3.495 us 0.1296 us 0.1387 us 3.469 us 3.323 us 3.753 us 1.00 0.00 - - - -
Array \coreclr 512 3.640 us 0.2213 us 0.2460 us 3.573 us 3.317 us 4.222 us 1.05 0.08 - - - -
Array_ComparerClass \coreclr-base 512 19.367 us 0.1069 us 0.0948 us 19.357 us 19.207 us 19.560 us 1.00 0.00 - - - 64 B
Array_ComparerClass \coreclr 512 19.474 us 0.1196 us 0.1060 us 19.452 us 19.321 us 19.695 us 1.01 0.01 - - - 64 B
Array_ComparerStruct \coreclr-base 512 22.870 us 0.1593 us 0.1490 us 22.820 us 22.696 us 23.192 us 1.00 0.00 - - - 88 B
Array_ComparerStruct \coreclr 512 23.718 us 0.0634 us 0.0593 us 23.709 us 23.614 us 23.859 us 1.04 0.01 - - - 88 B
Array_Comparison \coreclr-base 512 18.615 us 0.0971 us 0.0861 us 18.598 us 18.519 us 18.826 us 1.00 0.00 - - - -
Array_Comparison \coreclr 512 19.760 us 0.2582 us 0.2415 us 19.647 us 19.556 us 20.369 us 1.06 0.01 - - - -
List \coreclr-base 512 3.385 us 0.0916 us 0.0980 us 3.335 us 3.298 us 3.607 us 1.00 0.00 - - - -
List \coreclr 512 3.759 us 0.4917 us 0.5465 us 3.534 us 3.279 us 5.224 us 1.10 0.17 - - - -

IntStruct

BenchmarkDotNet=v0.11.3.1003-nightly, OS=Windows 10.0.17763.503 (1809/October2018Update/Redstone5)
Intel Core i7-8700 CPU 3.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=3.0.100-preview5-011568
  [Host]     : .NET Core 3.0.0-preview5-27626-15 (CoreCLR 4.6.27622.75, CoreFX 4.700.19.22408), 64bit RyuJIT
  Job-VKBMMT : .NET Core ? (CoreCLR 4.700.19.30301, CoreFX 4.700.19.27908), 64bit RyuJIT
  Job-SABOCD : .NET Core ? (CoreCLR 4.700.19.30301, CoreFX 4.700.19.27908), 64bit RyuJIT

Runtime=Core  InvocationCount=25000  IterationTime=250.0000 ms  
MaxIterationCount=20  MinIterationCount=15  UnrollFactor=1  
WarmupCount=1  
Method Size Mean Error StdDev Median Min Max Ratio RatioSD Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
Array \coreclr-base 512 3.729 us 0.0498 us 0.0416 us 3.731 us 3.635 us 3.786 us 1.00 0.00 - - - -
Array \coreclr 512 3.925 us 0.2357 us 0.2522 us 3.913 us 3.654 us 4.505 us 1.07 0.08 - - - -
Array_ComparerClass \coreclr-base 512 20.788 us 0.0535 us 0.0474 us 20.782 us 20.669 us 20.867 us 1.00 0.00 - - - 64 B
Array_ComparerClass \coreclr 512 21.212 us 0.0889 us 0.0788 us 21.192 us 21.101 us 21.398 us 1.02 0.00 - - - 64 B
Array_ComparerStruct \coreclr-base 512 23.963 us 0.0966 us 0.0856 us 23.949 us 23.830 us 24.183 us 1.00 0.00 - - - 88 B
Array_ComparerStruct \coreclr 512 22.890 us 0.1253 us 0.1047 us 22.894 us 22.684 us 23.034 us 0.96 0.01 - - - 88 B
Array_Comparison \coreclr-base 512 20.161 us 0.1313 us 0.1164 us 20.157 us 19.995 us 20.401 us 1.00 0.00 - - - -
Array_Comparison \coreclr 512 21.104 us 0.0972 us 0.0910 us 21.101 us 20.995 us 21.292 us 1.05 0.01 - - - -
List \coreclr-base 512 4.891 us 1.0011 us 1.1528 us 4.679 us 3.741 us 7.906 us 1.00 0.00 - - - -
List \coreclr 512 4.047 us 0.1858 us 0.1988 us 4.076 us 3.765 us 4.503 us 0.84 0.17 - - - -

@jkotas
Copy link
Member

jkotas commented Jun 2, 2019

The name 'MethodImplOptions' does not exist in the current context

Are you missing using System.Runtime.CompilerServices;?

I also have problems building locally after merging with master.

You can try deleting %USERPROFILE%\.dotnet, %USERPROFILE%\.netcoreeng and %USERPROFILE%\.nuget. The build system started caching stuff in these global locations recently, and things fail in cryptic ways when these cached bits are corrupted.

@nietras
Copy link
Author

nietras commented Jun 3, 2019

@jkotas @stephentoub added a gist with the latest benchmark run (without string).

https://gist.github.com/nietras/c206d4a7d9f173cc9271eabc8d01b1e6

This is for:

coreclr-base = 5af77fa (master)
coreclr = 11dd34a (from this PR branch)

Notably, BigStruct sees some big improvements when switching to Span<T>. Perhaps, indicating that swapping is inlined here and not for the old code without AggressiveInlining.

Method Size Mean Error StdDev Median Min Max Ratio RatioSD Gen 0/1k Op Gen 1/1k Op Gen 2/1k Op Allocated Memory/Op
Array \coreclr-base\ 512 11.964 us 0.3498 us 0.3743 us 11.769 us 11.610 us 13.004 us 1.00 0.00 - - - -
Array \coreclr\ 512 8.234 us 0.0504 us 0.0472 us 8.245 us 8.159 us 8.310 us 0.69 0.02 - - - -

IntClass appears to regress by about 3-4%. IntStruct is probably within variance but there are 8% regressions with the 25000 invocations per iteration, but then List is 16% faster so there is no doubt some variance here for these. int sees minor regressions too, which of course makes no sense, since code is the same when running default.

I'll try to run it with more invocations later at some point.

@stephentoub
Copy link
Member

I'm taking a look at fixing this up.

@@ -1578,7 +1578,9 @@ public static void Sort<T>(T[] array, int index, int length, System.Collections.
return;
}

ArraySortHelper<TKey, TValue>.Default.Sort(keys, items, index, length, comparer);
var spanKeys = new Span<TKey>(ref Unsafe.As<byte, TKey>(ref keys.GetRawArrayData()), keys.Length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be GetRawSzArrayData()

@nietras
Copy link
Author

nietras commented Nov 5, 2019

I'm taking a look at fixing this up.

@stephentoub sounds good :) It took a long time before the changes/fixes to benchmarks got added to https://github.com/dotnet/performance/ and at that time I unfortunately did not have time to revisit this. Sorry about that. Let me know if you have any questions or need review or similar :)

@stephentoub
Copy link
Member

Thanks, @nietras.

Closing as replaced by #27700.

@stephentoub stephentoub closed this Nov 6, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
8 participants