Use C# Array Pool buffer when calling BebopWriter.Create() to Encode #144

Closed
ProgrammerAL opened this issue Aug 25, 2021 · 3 comments · Fixed by #155
Labels
enhancement New feature or request

Comments

@ProgrammerAL
Contributor

Is your feature request related to a problem? Please describe.
I was benchmarking Bebop recently and noticed it allocated considerably more memory than the other binary serialization protocols I was comparing it to (Protobuf and MessagePack). The results of the benchmarks are at: https://github.com/ProgrammerAl/SerializationBenchmarks#benchmark-results

It looks like this is happening because new instances of BebopWriter are created with an empty byte array that is then grown/re-allocated as values are written to it. I propose adding a new overload of BebopWriter.Create() that accepts an array, so the writer doesn't start out empty. By default this array would come from ArrayPool<byte>.Shared.

Describe the solution you'd like
There are 2 alternatives I tried locally.

The first one, temporarily named EncodeImmutablyWithArrayPool(), rents the buffer from the pool using the exact size needed for encoding, i.e. ArrayPool<byte>.Shared.Rent(record.ByteCount). The benefit is that we know no additional array will be allocated, so it uses much less memory. In my benchmarks it was consistently 20-30 ns slower than the current code; my assumption is the slowdown comes from the call to GetByteCount().

[global::System.Runtime.CompilerServices.MethodImpl(global::Bebop.Runtime.BebopConstants.HotPath)]
public static global::System.Collections.Immutable.ImmutableArray<byte> EncodeImmutablyWithArrayPool(global::Bebop.Codegen.Library record)
{
    // Rent may return an array larger than requested; only Length bytes are used.
    var buffer = ArrayPool<byte>.Shared.Rent(record.ByteCount);
    try
    {
        var writer = global::Bebop.Runtime.BebopWriter.Create(buffer);
        __EncodeInto(record, ref writer);
        return writer.ToImmutableArray();
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer, clearArray: false);
    }
}

The second example, temporarily named EncodeImmutablyWithArrayPoolFixed(), rents an array from the pool using a constant size, i.e. ArrayPool<byte>.Shared.Rent(1000). In my benchmarks this was consistently 20-30 ns faster than the current code. We can tune the constant, but if the rented buffer is too small we're back to the original problem of allocating another array. Either way, this still allocates much less memory than the current code and runs faster.

[global::System.Runtime.CompilerServices.MethodImpl(global::Bebop.Runtime.BebopConstants.HotPath)]
public static global::System.Collections.Immutable.ImmutableArray<byte> EncodeImmutablyWithArrayPoolFixed(global::Bebop.Codegen.Library record)
{
    // Rent a fixed-size buffer; if the record needs more, the writer still grows.
    var buffer = ArrayPool<byte>.Shared.Rent(1000);
    try
    {
        var writer = global::Bebop.Runtime.BebopWriter.Create(buffer);
        __EncodeInto(record, ref writer);
        return writer.ToImmutableArray();
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer, clearArray: false);
    }
}

Additional context
Below are the benchmarks for the above mentioned methods:

| Method                   | Runtime       | Mean     | Error   | StdDev  | Min      | Max      | Gen 0  | Gen 1 | Gen 2 | Allocated |
|--------------------------|---------------|----------|---------|---------|----------|----------|--------|-------|-------|-----------|
| EncodeOriginal           | .NET 4.7.2    | 468.3 ns | 1.39 ns | 1.30 ns | 466.3 ns | 469.9 ns | 0.0811 | -     | -     | 514 B     |
| EncodeWithArrayPool      | .NET 4.7.2    | 518.0 ns | 3.60 ns | 3.19 ns | 513.2 ns | 524.3 ns | 0.0172 | -     | -     | 112 B     |
| EncodeWithArrayPoolFixed | .NET 4.7.2    | 427.3 ns | 1.27 ns | 1.19 ns | 425.1 ns | 428.9 ns | 0.0176 | -     | -     | 112 B     |
| EncodeOriginal           | .NET Core 5.0 | 206.6 ns | 0.50 ns | 0.47 ns | 205.8 ns | 207.3 ns | 0.0610 | -     | -     | 512 B     |
| EncodeWithArrayPool      | .NET Core 5.0 | 236.2 ns | 0.51 ns | 0.47 ns | 235.2 ns | 236.9 ns | 0.0134 | -     | -     | 112 B     |
| EncodeWithArrayPoolFixed | .NET Core 5.0 | 183.3 ns | 0.42 ns | 0.40 ns | 182.4 ns | 183.7 ns | 0.0134 | -     | -     | 112 B     |

Here's the benchmark code:

    [MinColumn]
    [MaxColumn]
    [MemoryDiagnoser]
    [SimpleJob(RuntimeMoniker.Net472)]
    [SimpleJob(RuntimeMoniker.NetCoreApp50)]
    public class ObjectReadWrite
    {
        private readonly Library _library;

        public ObjectReadWrite()
        {
            var testGuid = Guid.Parse("81c6987b-48b7-495f-ad01-ec20cc5f5be1");
            var song = new Song
            {
                Title = "Donna Lee",
                Year = 1974,
                Performers = new Musician[]
                {
                    new Musician {Name = "Charlie Parker", Plays = Instrument.Sax},
                    new Musician {Name = "Miles Davis", Plays = Instrument.Trumpet}
                }
            };

            _library = new Library { Songs = new Dictionary<Guid, Song> { { testGuid, song } } };
        }

        [Benchmark]
        public ImmutableArray<byte> EncodeOriginal()
            => Library.EncodeImmutably(_library);

        [Benchmark]
        public ImmutableArray<byte> EncodeWithArrayPool()
            => Library.EncodeImmutablyWithArrayPool(_library);

        [Benchmark]
        public ImmutableArray<byte> EncodeWithArrayPoolFixed()
            => Library.EncodeImmutablyWithArrayPoolFixed(_library);
    }

If this is a feature you want added, I can volunteer to take it on.

Another alternative I did not try is to rent a buffer from the array pool of size MaxByteCount for that record type. Currently the property re-calculates the value each time it's called, but it's essentially a constant. We could change it to a constant emitted by the compiler and use that constant when renting a buffer from the array pool. This would take more effort and could be done at a later date.
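To illustrate that idea, here's a rough sketch. This is hypothetical: the Bebop compiler does not currently emit such a constant, and the method name and the constant's value below are made up for illustration.

```csharp
// Hypothetical generated code: the compiler emits the record's maximum
// encoded size as a compile-time constant instead of recomputing it per call.
public const int MaxByteCount = 512; // illustrative value only

public static global::System.Collections.Immutable.ImmutableArray<byte> EncodeImmutablyWithMaxByteCount(global::Bebop.Codegen.Library record)
{
    // No per-call size computation; the rented buffer is always large enough.
    var buffer = ArrayPool<byte>.Shared.Rent(MaxByteCount);
    try
    {
        var writer = global::Bebop.Runtime.BebopWriter.Create(buffer);
        __EncodeInto(record, ref writer);
        return writer.ToImmutableArray();
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer, clearArray: false);
    }
}
```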

@ProgrammerAL ProgrammerAL added the enhancement New feature or request label Aug 25, 2021
@andrewmd5
Contributor

andrewmd5 commented Aug 27, 2021

Thank you for putting this issue together! I started investigating using ArrayPool a few weeks ago and couldn't quite decide on the best approach. Another place where perf wins could be gained is improvements to the string encoding / decoding.

I believe a better option would be to allow for a buffer to be passed into Encode methods that can be managed by the calling application. That way the pooling strategy isn't left up to the Bebop runtime itself. If you want to submit a PR for that it is more than welcome!
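For what it's worth, caller-managed pooling with such an overload might look something like this. This is only a sketch: EncodeInto here is the hypothetical buffer-accepting method being discussed, not an existing API, and Send stands in for whatever the application does with the bytes.

```csharp
// Hypothetical caller-side pooling: the application, not the Bebop runtime,
// decides when to rent and return the buffer.
var buffer = ArrayPool<byte>.Shared.Rent(1024);
try
{
    int bytesWritten = Library.EncodeInto(record, buffer); // hypothetical overload
    Send(buffer.AsSpan(0, bytesWritten)); // use only the bytes actually written
}
finally
{
    ArrayPool<byte>.Shared.Return(buffer);
}
```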

@ProgrammerAL
Contributor Author

I spent some time over the weekend thinking about this. I agree that the Bebop library shouldn't be the thing managing pooling for something as simple as encoding an object. My concern is that everyone who wants to use this new encode method would have to write the same wrapping code calling this new method. I think eventually this can be included in the SDK/Generated Code, but I don't have a good idea for what that looks like. I'll think about it some more and if I have any good ideas, I'll make a separate issue to talk about them.

In the meantime, for this issue, I'll go with what you said and create an encoding method that can receive a byte[] buffer. Here's what I'm envisioning:

  • New method public EncodeResult EncodeInto(byte[] outBuffer)
    • EncodeResult is a new struct with 2 properties
      • int Length is the number of bytes written, i.e. BebopWriter.Length
      • Result is an enum stating the result of the encode process
        • 0 - UnknownError - A default value for an unspecified error (probably won't be used in the initial PR)
        • 1 - Success - Everything was good when encoding to the buffer
        • 2 - BufferTooSmall - The given buffer was not big enough to hold the encoded bytes
  • Add a new overload of BebopWriter.Create() that accepts a given byte[]

I went back and forth on the EncodeResult struct. I considered just throwing an exception if the input buffer was too small, but figured this would be better overall for performance.
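A rough sketch of the types described above (names and shapes are just this proposal, not a final API):

```csharp
// Hypothetical sketch of the proposed EncodeResult struct and result enum.
public enum EncodeResultKind : byte
{
    UnknownError = 0,   // default/fallback error (probably unused in the initial PR)
    Success = 1,        // the record was fully written to the buffer
    BufferTooSmall = 2  // the given buffer could not hold the encoded bytes
}

public readonly struct EncodeResult
{
    public EncodeResult(int length, EncodeResultKind result)
    {
        Length = length;
        Result = result;
    }

    public int Length { get; }              // bytes written, i.e. BebopWriter.Length
    public EncodeResultKind Result { get; } // outcome of the encode
}
```

Returning a small struct instead of throwing keeps the failure path allocation-free on the hot path.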

@ProgrammerAL
Contributor Author

Closing this issue because the PR has been created: #155

@andrewmd5 andrewmd5 linked a pull request Sep 14, 2021 that will close this issue