Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: ImmutableArray.UnsafeFreeze<T>(T[]) #28064

Open
benjamin-hodgson opened this issue Mar 14, 2018 · 9 comments

Comments

Projects
None yet
6 participants
@benjamin-hodgson
Copy link
Collaborator

commented Mar 14, 2018

Use case: efficiently creating an ImmutableArray with a fixed (>4) number of elements.

Presently if you need to create an ImmutableArray<T> with a known collection of elements, the most efficient way to do so is to create an ImmutableArray<T>.Builder, set its Capacity to allocate an array of the correct size, fill the Builder with repeated calls to Add, and then use MoveToImmutable to create the ImmutableArray. (This strategy can be marginally improved by storing a cached Builder in a ThreadLocal.)

For element counts smaller than 4, there are overloads of ImmutableArray.Create which are implemented by creating an array and then immediately turning it into an ImmutableArray, but for more than 4 Create takes a params[] and copies it into an ImmutableArray. There is no way to create an array directly, fill it, and freeze it into an ImmutableArray without copying while pinky-promising never to mutate the original array again.

I set up a microbenchmark (see below) which uses codegen to call ImmutableArray<T>'s private T[] constructor, and it's around 4-5 times faster for an array of 1000 longs. I found that a library of mine was spending a reasonable chunk of its time creating ImmutableArrays, and implementing this hack led to a global ~10% speedup in my end-to-end benchmarks. I'm not entirely happy about depending upon private implementation details like that though.

I propose adding a officially-supported static method to freeze an array into an ImmutableArray without copying it. I suggest calling it UnsafeFreeze to emphasise the fact that this method is risky: to use it correctly you have to make sure never to mutate the array that you passed in. (This name may be confusing given that it doesn't have anything to do with C#'s unsafe; I'm open to alternative suggestions.)

public static class ImmutableArray
{
    [EditorBrowsable(EditorBrowsableState.Never)]
    // might wanna add [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static ImmutableArray<T> UnsafeFreeze<T>(T[] array)
        => new ImmutableArray<T>(array);
}

There may be a use case for a similar UnsafeThaw method which turns an ImmutableArray<T> into a T[] without copying, but I can't think of one right now.

NB. There is precedent in the BCL for APIs which ask you to relinquish ownership of the argument. ArrayPool.Return requires you not to continue using the array you returned because it is liable to get handed out to a future caller of Rent.


Here's the code for the microbenchmark I mentioned above (using BenchmarkDotNet):

public class Bench
{
    static readonly ThreadLocal<ImmutableArray<long>.Builder> _cachedBuilder = new ThreadLocal<ImmutableArray<long>.Builder>(ImmutableArray.CreateBuilder<long>);
    static readonly Func<long[], ImmutableArray<long>> _unsafeFreeze = GetUnsafeFreeze<long>();

    [Benchmark(Baseline = true)]
    public void CachedBuilder()
    {
        var builder = _cachedBuilder.Value;
        builder.Capacity = 1000;
        for (long i = 0; i < builder.Capacity; i++)
        {
            builder.Add(i);
        }
        builder.MoveToImmutable();
    }
    [Benchmark]
    public void UnsafeFreeze()
    {
        var array = new long[1000];
        for (long i = 0; i < array.Length; i++)
        {
            array[i] = i;
        }
        _unsafeFreeze(array);
    }

    static Func<T[], ImmutableArray<T>> GetUnsafeFreeze<T>()
    {
        var ctor = typeof(ImmutableArray<T>)
            .GetConstructors(BindingFlags.NonPublic | BindingFlags.Instance)
            .Single(c => c.GetParameters().Count() == 1 && c.GetParameters().Single().ParameterType.Equals(typeof(T[])));
        var param = Expression.Parameter(typeof(T[]));
        var body = Expression.New(ctor, param);
        var func = Expression.Lambda<Func<T[], ImmutableArray<T>>>(body, param);
        return func.Compile();
    }
}

And the results:

        Method |     Mean |     Error |    StdDev | Scaled | ScaledSD |
-------------- |---------:|----------:|----------:|-------:|---------:|
 CachedBuilder | 8.439 us | 0.1667 us | 0.4089 us |   1.00 |     0.00 |
  UnsafeFreeze | 1.944 us | 0.0399 us | 0.1169 us |   0.23 |     0.02 |
@jkotas

This comment has been minimized.

Copy link
Member

commented Mar 15, 2018

We had methods like these, but they were removed: #196

The current "solution" is to use hack like what you have done. You can find another version of the hack in #196. Multiple versions of this hack are replicating in uncontrolled way. I have even seen one in F#. I think it is a much worse situation than having officially santioned unsafe interop methods. Maybe we should reconsider?

@benjamin-hodgson

This comment has been minimized.

Copy link
Collaborator Author

commented Mar 15, 2018

@jkotas Thanks, I didn't know about #196. I quite like the idea of using a ref parameter and setting it to null; it is a little more complex but it has some minor safety benefits in the relatively common case where the array has only one reference to it. I also quite like the ByteArrayUnion idea in that PR as a way for me to implement my existing hack, might turn out to be marginally faster than calling a dynamic virtual method like I currently do.

I agree with you, the status quo of proliferating unsupported hacks and workarounds is bad for everyone: bad for users who have to write ugly and dangerous code, and bad for you guys who have to consider users who may be depending on private implementation details.

@jkotas

This comment has been minimized.

Copy link
Member

commented Mar 15, 2018

The leanest and fastest version of this hack is to use System.Runtime.CompilerServices.Unsafe. With that, the hack is one-linker like this:

byte[] a = ...
ImmutableArray<byte> ia = Unsafe.As<byte[], ImmutableArray<byte>>(ref a);
@jkotas

This comment has been minimized.

Copy link
Member

commented Mar 15, 2018

cc @VSadov @gafter @jaredpar If I remember correctly, you had strong opinions about not having an officially sanctioned unsafe API for this.

@VSadov

This comment has been minimized.

Copy link
Member

commented Mar 16, 2018

The whole point of ImmutableArray<T> is to be immutable. The type does not offer any other value.

There is a number of patterns, such as argument validation without making private copies that rely on instances never changing values. If we could rely on a convention, we would not need the type and just use ordinary arrays.

I am a bit amused that we have to defend that immutableArray<T> must be immutable, but no one proposes the same ideas for System.String.
Perhaps there is a problem with documentation...

Using System.Runtime.CompilerServices.Unsafe to get around immutability seems the right approach.
That is basically the purpose of System.Runtime.CompilerServices.Unsafe - to get around the type system when so desired.

@VSadov

This comment has been minimized.

Copy link
Member

commented Mar 16, 2018

Another solution could be just using ordinary arrays on the hot internal paths.

I can see how ImmutableArray<T> could be an unnecessary obstacle in a closed system which could implement "I will not mutate" convention and where performance is paramount.
We do use ordinary arrays in Roslyn when performance is critical and immutability stands in the way and when we do not need to expose the data across public APIs.

@jkotas

This comment has been minimized.

Copy link
Member

commented Mar 16, 2018

I agree that getting around immutability is fundamentally unsafe operation. The question is whether we should have a API to codify the unsafe pattern instead of telling everybody to depend on internal implementation details.

For example, we have introduced Memory<T> System.Runtime.InteropServices.MemoryMarshal.AsMemory<T>(ReadOnlyMemory<T> readOnlyMemory). We could have said that folks should use Unsafe to cast ReadOnlyMemory<T> to Memory<T> and take a fragile dependency on the internal implementation details, but we did not. We have API to codify the pattern that gives us control over it - our API telemetry can see how many places it used for, etc. The API is in System.Runtime.InteropServices that is namespace reserved for unsafe code.

@svick

This comment has been minimized.

Copy link
Contributor

commented Mar 17, 2018

I looked into why the safe code is so much slower than the unsafe code, since a good optimizing compiler should be able to make the performance of using Builder close to the performance of bypassing that abstraction.

My conclusion is that adding a new dangerous API is not necessary. What is necessary is:

  1. The caller should be changed from setting Capacity and using Add() to setting Count and using the indexer setter:

    public void CachedBuilderIndexer()
    {
        var builder = _cachedBuilder.Value;
        builder.Count = 1000;
        for (int i = 0; i < builder.Count; i++)
        {
            builder[i] = i;
        }
        builder.MoveToImmutable();
    }
  2. The implementation of ImmutableArray<T>.Builder indexer setter should be tweaked so that the CLR will inline it.

With these two changes, by my measurements, the performance is only 1.24× slower than UnsafeFreeze, which I think is acceptable, especially considering that the original code is 5× slower than UnsafeFreeze.


If you want more details, here's what I did:

First I looked at the disassembly of CachedBuilder (whose scaled performance relative to UnsafeFreeze is 5×) and noticed that the Add() method was not being inlined. So I created my own stripped-down copy of ImmutableArray<T> and tweaked its Add() to make it shorter in IL. This didn't actually put it under the threshold, but it did trigger "Inline candidate is mostly loads and stores.", which meant the method was inlined. The result is that the performance of this method, MyInlinedCachedBuilder, was 2.1×.

When I looked at the disassembly of the previous method, I noticed the presence of (inlined) EnsureCapacity. Since the array will not be resized in the loop, that is not necessary. So I switched to setting Count and then using the indexer setter. The performance of this method, MyCachedBuilderIndexer, was 3.0×. This is worse than the previous method, but we're not done yet.

When I looked at the disassembly, I noticed that the indexer was not being inlined. So I tweaked its code by extracting throw into a separate method. This meant that the indexer was now being inlined, resulting in performance of this method, MyInlinedCachedBuilderIndexer of 1.24×, relative to UnsafeFreeze.

I ran this on .NET Core 2.1.0-preview2-26131-06. The code is here, the raw results were:

Method Mean Error StdDev Scaled ScaledSD Gen 0 Allocated
CachedBuilder 7.562 us 0.0469 us 0.0416 us 4.93 0.41 2.5024 7.84 KB
CachedBuilder
Indexer
4.502 us 0.0122 us 0.0108 us 2.94 0.25 2.5024 7.84 KB
MyCachedBuilder 7.894 us 0.0195 us 0.0183 us 5.15 0.43 2.4414 7.84 KB
MyInlinedCached
Builder
3.247 us 0.0147 us 0.0137 us 2.12 0.18 2.5024 7.84 KB
MyCachedBuilder
Indexer
4.591 us 0.0691 us 0.0612 us 2.99 0.25 2.5024 7.84 KB
MyInlinedCached
BuilderIndexer
1.909 us 0.0352 us 0.0312 us 1.24 0.11 2.5330 7.84 KB
UnsafeFreeze 1.545 us 0.0448 us 0.1322 us 1.00 0.00 2.5330 7.84 KB
@jkotas

This comment has been minimized.

Copy link
Member

commented Mar 18, 2018

More powerful builder helps in some cases, but it won't ever solve the interop case when you need to interop efficiently with the existing code that expects regular arrays and guarantees immutability via documentation. Here is an example of this use case from Roslyn: https://github.com/dotnet/roslyn/blob/944ae114140ba77cbd4be370bf13a7f758f740b3/src/Compilers/Core/Portable/NativePdbWriter/PdbWriter.cs#L593

@safern safern added this to the Future milestone Mar 20, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.