Use C# Array Pool buffer when calling BebopWriter.Create() to Encode #144

Closed
ProgrammerAL opened this issue Aug 25, 2021 · 3 comments · Fixed by #155
Labels
enhancement New feature or request

Comments

@ProgrammerAL
Contributor

Is your feature request related to a problem? Please describe.
I was benchmarking Bebop recently and noticed it allocated considerably more memory than the other binary serialization protocols I was comparing it to (Protobuf and MessagePack). The results of the benchmarks are at: https://github.com/ProgrammerAl/SerializationBenchmarks#benchmark-results

It looks like this is happening because new instances of BebopWriter are created with an empty byte array that is then grown/re-allocated as values are written to it. I propose adding a new overload of BebopWriter.Create() that accepts an array, so the writer doesn't start out empty. By default this array would come from ArrayPool<byte>.Shared.

Describe the solution you'd like
There are 2 alternatives I tried locally.

The first one, temporarily named EncodeImmutablyWithArrayPool(), rents the buffer from the pool using the exact size needed for encoding, i.e. ArrayPool<byte>.Shared.Rent(record.ByteCount). The benefit is that we know no additional array will be allocated, so it uses much less memory. In my benchmarks it was consistently 20-30 ns slower than the current code; my assumption is the slowdown comes from the call to GetByteCount().

[global::System.Runtime.CompilerServices.MethodImpl(global::Bebop.Runtime.BebopConstants.HotPath)]
public static global::System.Collections.Immutable.ImmutableArray<byte> EncodeImmutablyWithArrayPool(global::Bebop.Codegen.Library record)
{
    // Rent may return an array larger than requested; only Length bytes are used.
    var buffer = ArrayPool<byte>.Shared.Rent(record.ByteCount);
    try
    {
        var writer = global::Bebop.Runtime.BebopWriter.Create(buffer);
        __EncodeInto(record, ref writer);
        return writer.ToImmutableArray();
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer, clearArray: false);
    }
}

The second example, temporarily named EncodeImmutablyWithArrayPoolFixed(), rents an array from the pool using a constant size, i.e. ArrayPool<byte>.Shared.Rent(1000). In my benchmarks this was consistently 20-30 ns faster than the current code. We can tune the constant, but if the rented buffer is too small we're back to the original problem of allocating another array. Either way, this still allocates much less memory than the current code and runs faster.

[global::System.Runtime.CompilerServices.MethodImpl(global::Bebop.Runtime.BebopConstants.HotPath)]
public static global::System.Collections.Immutable.ImmutableArray<byte> EncodeImmutablyWithArrayPoolFixed(global::Bebop.Codegen.Library record)
{
    // Rent a fixed-size buffer; if the record needs more, the writer still grows.
    var buffer = ArrayPool<byte>.Shared.Rent(1000);
    try
    {
        var writer = global::Bebop.Runtime.BebopWriter.Create(buffer);
        __EncodeInto(record, ref writer);
        return writer.ToImmutableArray();
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer, clearArray: false);
    }
}

Additional context
Below are the benchmarks for the above mentioned methods:

| Method                   | Runtime       | Mean     | Error   | StdDev  | Min      | Max      | Gen 0  | Gen 1 | Gen 2 | Allocated |
|--------------------------|---------------|----------|---------|---------|----------|----------|--------|-------|-------|-----------|
| EncodeOriginal           | .NET 4.7.2    | 468.3 ns | 1.39 ns | 1.30 ns | 466.3 ns | 469.9 ns | 0.0811 | -     | -     | 514 B     |
| EncodeWithArrayPool      | .NET 4.7.2    | 518.0 ns | 3.60 ns | 3.19 ns | 513.2 ns | 524.3 ns | 0.0172 | -     | -     | 112 B     |
| EncodeWithArrayPoolFixed | .NET 4.7.2    | 427.3 ns | 1.27 ns | 1.19 ns | 425.1 ns | 428.9 ns | 0.0176 | -     | -     | 112 B     |
| EncodeOriginal           | .NET Core 5.0 | 206.6 ns | 0.50 ns | 0.47 ns | 205.8 ns | 207.3 ns | 0.0610 | -     | -     | 512 B     |
| EncodeWithArrayPool      | .NET Core 5.0 | 236.2 ns | 0.51 ns | 0.47 ns | 235.2 ns | 236.9 ns | 0.0134 | -     | -     | 112 B     |
| EncodeWithArrayPoolFixed | .NET Core 5.0 | 183.3 ns | 0.42 ns | 0.40 ns | 182.4 ns | 183.7 ns | 0.0134 | -     | -     | 112 B     |

Here's the benchmark code:

    [MinColumn]
    [MaxColumn]
    [MemoryDiagnoser]
    [SimpleJob(RuntimeMoniker.Net472)]
    [SimpleJob(RuntimeMoniker.NetCoreApp50)]
    public class ObjectReadWrite
    {
        private readonly Library _library;

        public ObjectReadWrite()
        {
            var testGuid = Guid.Parse("81c6987b-48b7-495f-ad01-ec20cc5f5be1");
            var song = new Song
            {
                Title = "Donna Lee",
                Year = 1974,
                Performers = new Musician[]
                {
                    new Musician {Name = "Charlie Parker", Plays = Instrument.Sax},
                    new Musician {Name = "Miles Davis", Plays = Instrument.Trumpet}
                }
            };

            _library = new Library { Songs = new Dictionary<Guid, Song> { { testGuid, song } } };
        }

        [Benchmark]
        public ImmutableArray<byte> EncodeOriginal()
            => Library.EncodeImmutably(_library);

        [Benchmark]
        public ImmutableArray<byte> EncodeWithArrayPool()
            => Library.EncodeImmutablyWithArrayPool(_library);

        [Benchmark]
        public ImmutableArray<byte> EncodeWithArrayPoolFixed()
            => Library.EncodeImmutablyWithArrayPoolFixed(_library);
    }

If this is a feature you want added, I can volunteer to take it on.

Another alternative I did not try is to rent a buffer from the array pool of size MaxByteCount for that record type. Currently the property re-calculates the value each time it's called, but it's essentially a constant. We could change it to a constant emitted by the compiler and use that constant when renting a buffer from the array pool. This would take more effort and could be done at a later date.
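To illustrate that idea, here's a rough sketch. This is hypothetical: the Bebop compiler does not currently emit such a constant, and the method name and the constant's value below are made up for illustration.

```csharp
// Hypothetical generated code: the compiler emits the record's maximum
// encoded size as a compile-time constant instead of recomputing it per call.
public const int MaxByteCount = 512; // illustrative value only

public static global::System.Collections.Immutable.ImmutableArray<byte> EncodeImmutablyWithMaxByteCount(global::Bebop.Codegen.Library record)
{
    // No per-call size computation; the rented buffer is always large enough.
    var buffer = ArrayPool<byte>.Shared.Rent(MaxByteCount);
    try
    {
        var writer = global::Bebop.Runtime.BebopWriter.Create(buffer);
        __EncodeInto(record, ref writer);
        return writer.ToImmutableArray();
    }
    finally
    {
        ArrayPool<byte>.Shared.Return(buffer, clearArray: false);
    }
}
```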

@ProgrammerAL ProgrammerAL added the enhancement New feature or request label Aug 25, 2021
@andrewmd5
Contributor

andrewmd5 commented Aug 27, 2021

Thank you for putting this issue together! I started investigating using ArrayPool a few weeks ago and couldn't quite decide on the best approach. Another place where perf wins could be gained is improvements to the string encoding / decoding.

I believe a better option would be to allow for a buffer to be passed into Encode methods that can be managed by the calling application. That way the pooling strategy isn't left up to the Bebop runtime itself. If you want to submit a PR for that it is more than welcome!
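For what it's worth, caller-managed pooling with such an overload might look something like this. This is only a sketch: EncodeInto here is the hypothetical buffer-accepting method being discussed, not an existing API, and Send stands in for whatever the application does with the bytes.

```csharp
// Hypothetical caller-side pooling: the application, not the Bebop runtime,
// decides when to rent and return the buffer.
var buffer = ArrayPool<byte>.Shared.Rent(1024);
try
{
    int bytesWritten = Library.EncodeInto(record, buffer); // hypothetical overload
    Send(buffer.AsSpan(0, bytesWritten)); // use only the bytes actually written
}
finally
{
    ArrayPool<byte>.Shared.Return(buffer);
}
```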

@ProgrammerAL
Contributor Author

I spent some time over the weekend thinking about this. I agree that the Bebop library shouldn't be the thing managing pooling for something as simple as encoding an object. My concern is that everyone who wants to use this new encode method would have to write the same wrapping code calling this new method. I think eventually this can be included in the SDK/Generated Code, but I don't have a good idea for what that looks like. I'll think about it some more and if I have any good ideas, I'll make a separate issue to talk about them.

In the meantime, for this issue, I'll go with what you said and create an encoding method that can receive a byte[] buffer. Here's what I'm envisioning:

  • New method public EncodeResult EncodeInto(byte[] outBuffer)
    • EncodeResult is a new struct with 2 properties
      • int Length is the number of bytes written, i.e. BebopWriter.Length
      • Result is an enum stating the result of the encode process
        • 0 - UnknownError - A default value for an unspecified error (probably won't be used in the initial PR)
        • 1 - Success - Everything was good when encoding to the buffer
        • 2 - BufferTooSmall - The given buffer was not big enough to hold the encoded bytes
  • Add a new overload of BebopWriter.Create() that accepts a given byte[]

I went back and forth on the EncodeResult struct. I considered just throwing an exception if the input buffer was too small, but figured this would be better overall for performance.
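A rough sketch of the types described above (names and shapes are just this proposal, not a final API):

```csharp
// Hypothetical sketch of the proposed EncodeResult struct and result enum.
public enum EncodeResultKind : byte
{
    UnknownError = 0,   // default/fallback error (probably unused in the initial PR)
    Success = 1,        // the record was fully written to the buffer
    BufferTooSmall = 2  // the given buffer could not hold the encoded bytes
}

public readonly struct EncodeResult
{
    public EncodeResult(int length, EncodeResultKind result)
    {
        Length = length;
        Result = result;
    }

    public int Length { get; }              // bytes written, i.e. BebopWriter.Length
    public EncodeResultKind Result { get; } // outcome of the encode
}
```

Returning a small struct instead of throwing keeps the failure path allocation-free on the hot path.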

@ProgrammerAL
Contributor Author

Closing this issue because the PR has been created: #155

@andrewmd5 andrewmd5 linked a pull request Sep 14, 2021 that will close this issue