-
-
Notifications
You must be signed in to change notification settings - Fork 687
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Compression.LZ4ContiguousBlock and implementation #681
Conversation
…nto v2.0 # Conflicts: # src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackSerializer.cs
I've added benchmark, this is results. 9 properties simple object, 1000 array
9 properties simple object, 10000 array
It seems to change drastically when the pool is used up. Edit: |
SequencePool.MinimumSpanLength = 4098
SequencePool.MinimumSpanLength = 32768
SequencePool.MinimumSpanLength = 65536
(The property values are random so result has variations) Looking at all the code paths, I think that MinimumSpanLength should be 65536. var maxLength = 0;
foreach (var item in msgpackUncompressedData)
{
sequenceCount++;
Math.Max(maxLength, item.Length);
}
// rent/return only once.
var maxCompressedLength = LZ4Codec.MaximumOutputLength(maxLength);
var lz4Buffer = msgpackUncompressedData.ArrayPool.Rent(maxCompressedLength);
try
{
foreach (var item in msgpackUncompressedData)
{
int lz4Length = LZ4Codec.Encode(item.Span, lz4Buffer);
writer.Write(lz4Buffer.AsSpan(0, lz4Length));
}
}
finally
{
msgpackUncompressedData.ArrayPool.Return(lz4Buffer);
} Since there are cases where SequencePool is nested (use two), you may need to set Environment.ProcessorCount * 2 for the worst case. Int32(small data) Lz4 Serialization benchmark result:
LZ4 option should be same as None but result is too slow. if (options.Compression.IsCompression() && !PrimitiveChecker<T>.IsFixedSizePrimitive)
{
} Deserialize also has for optimization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like how this avoids copying data into a large contiguous buffer. :)
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackCompression.cs
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackCompression.cs
Outdated
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackCompression.cs
Outdated
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackCompression.cs
Outdated
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackSerializer.cs
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackSerializer.cs
Outdated
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackSerializer.cs
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackSerializer.cs
Outdated
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackSerializer.cs
Outdated
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/ThisLibraryExtensionTypeCodes.cs
Outdated
Show resolved
Hide resolved
Also add support for changing the `MinimumSpanLength` property for rented `Sequence<T>` on an as-needed basis. This should allow the upcoming LZ4 compression scheme added in MessagePack-CSharp#681 to crank up the value to 64KB for the benefits of larger compression blocks. Closes MessagePack-CSharp#680
Also add support for changing the `MinimumSpanLength` property for rented `Sequence<T>` on an as-needed basis. This should allow the upcoming LZ4 compression scheme added in MessagePack-CSharp#681 to crank up the value to 64KB for the benefits of larger compression blocks. Closes MessagePack-CSharp#680
I see you're targeting v2.0 with this PR. Are you keen to get it in for 2.0 (which we're trying to stabilize) or would 2.1 (master branch) be better? |
I want to target to 2.0 if possible. I think this API(Lz4BlockArray) is better than the traditional Lz4Block (it's like avoiding LOH in StringBuilder .NET 4, I can also reduce LOH during deserialization) But if we can't agree on the implementation and it takes too much time, we should change it to 2.1. |
Pushed fixed code. Finally I've changed serialization code to following. foreach (var item in msgpackUncompressedData)
{
var maxCompressedLength = LZ4Codec.MaximumOutputLength(item.Length);
var lz4Span = writer.GetSpan(maxCompressedLength + 5);
int lz4Length = LZ4Codec.Encode(item.Span, lz4Span.Slice(5, lz4Span.Length - 5));
WriteBin32Header((uint)lz4Length, lz4Span);
writer.Advance(lz4Length + 5);
}
|
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackSerializer.cs
Show resolved
Hide resolved
src/MessagePack.UnityClient/Assets/Scripts/MessagePack/MessagePackSerializer.cs
Outdated
Show resolved
Hide resolved
reply to WriteInt32: Oh, yes, I'm sure it works well, But when calculate sequnceCount, we can calc write size too. var extHeaderSize = 0;
foreach (var item in msgpackUncompressedData)
{
sequenceCount++;
extHeaderSize += GetUInt32WriteSize((uint)item.Length);
} that fixed simply and I've pushed. |
Thank you for approval. |
Related to #675, #680
ReusableSequenceWithMinSize.Rent()
instead ofnew Sequence
in LZ4 compression(new Sequence is still exists in Deserialize(Stream))unit test dumps serialized binary size.
when
MinimumSpanLength = 4096
, shows this result.