Add Improvements to Serialization #748

Closed
JasonBock opened this Issue Jul 21, 2017 · 3 comments

Comments

Projects
None yet
3 participants
@JasonBock
Contributor

JasonBock commented Jul 21, 2017

I've done a couple of local tests to CSLA's serialization engine that I want to eventually submit a PR for, that's what this issue is for :)

Here's a summary of the changes:

  • Make two new classes, CslaLegacyBinaryReader and CslaLegacyBinaryWriter. These will be kept in CSLA for a little while as a backup strategy until everyone's comfortable with these new changes.
  • In CslaBinaryReader, remove the boxing calls in Read() to do direct calls to the stream. Also, move the string reads into a ReadString() method.
  • In CslaBinaryWriter, change DictionaryCheckResult to be a struct to reduce memory allocation. Or, alternatively, use a tuple type in C#7 and remove that custom type altogether.
  • With IsDirty and ReferenceId, there's no need (with my approach) to store the "known type" before it. It's required that, for child and value data, they show up in the exact order that you're reading them from. That would save 1 or 2 bytes per child and value data. That may not sound like much, and I get that maybe for consistency's sake it should be kept it in (i.e. each value always has "known type" definition in front of it in the stream) but any little bit (no pun intended) counts :).
  • Decimal.GetBits() always has 4 elements (https://msdn.microsoft.com/en-us/library/system.decimal.getbits(v=vs.110).aspx). There's no need to store that size in the serialization stream. Assume it's always 4. That saves 4 bytes per decimal.
  • The List<int> deserialization can be improved by creating an array first, reading the values from the reader into the array, and then passing that array to a new List<int>.

I don't have the exact percentages on speed and memory allocation available just yet - I've done individual Benchmark.NET-based tests on these improvements but just on the readers and writers individually - but once I have those numbers I'll post those just so people can see what the improvement actually is.

@JasonBock JasonBock self-assigned this Jul 21, 2017

@rockfordlhotka rockfordlhotka added this to the 4.7.100 milestone Jul 21, 2017

@JasonBock

This comment has been minimized.

Show comment Hide comment
@JasonBock

JasonBock Jul 22, 2017

Contributor

So I ran some more Benchmark.NET tests, and I'm seeing:

  • 15% reduction in memory allocation
  • 10% reduction in speed
  • 3% reduction in the serialization payload size

These percentages will vary, especially the payload size. I wouldn't expect people to see any more than what my tests got, but there's definitely an improvement :).

Contributor

JasonBock commented Jul 22, 2017

So I ran some more Benchmark.NET tests, and I'm seeing:

  • 15% reduction in memory allocation
  • 10% reduction in speed
  • 3% reduction in the serialization payload size

These percentages will vary, especially the payload size. I wouldn't expect people to see any more than what my tests got, but there's definitely an improvement :).

@tfreitasleal

This comment has been minimized.

Show comment Hide comment
@tfreitasleal

tfreitasleal Jul 22, 2017

Member

Hi Jason,
I don't quite understand the bit where you say It's required that, for child and value data, they show up in the exact order that you're reading them from.
I understand that this must happen or else we'll be in trouble. Who is responsible for the showing the data in the same order? Is it the serializer job or do we depend on some other piece of code?

Member

tfreitasleal commented Jul 22, 2017

Hi Jason,
I don't quite understand the bit where you say It's required that, for child and value data, they show up in the exact order that you're reading them from.
I understand that this must happen or else we'll be in trouble. Who is responsible for the showing the data in the same order? Is it the serializer job or do we depend on some other piece of code?

@JasonBock

This comment has been minimized.

Show comment Hide comment
@JasonBock

JasonBock Jul 22, 2017

Contributor

@tfreitasleal it's an implementation detail of the serialization layout. Basically every value put into the stream is prepended with a CslaKnownTypes value. But in a couple of cases, there's no reason to prepend the value because you know that it'll be a bool or an int. Hence we can save a bit of space if we don't put those known type values in the stream.

Contributor

JasonBock commented Jul 22, 2017

@tfreitasleal it's an implementation detail of the serialization layout. Basically every value put into the stream is prepended with a CslaKnownTypes value. But in a couple of cases, there's no reason to prepend the value because you know that it'll be a bool or an int. Hence we can save a bit of space if we don't put those known type values in the stream.

rockfordlhotka added a commit that referenced this issue Jul 24, 2017

#748 Added serlization changes to reader and writer (#749)
* #680 Fixes trivia issue

Analyzer no longer removes leading and trailing trivia from new field
node.

* #748 Updated reader and writer with improvements

@rockfordlhotka rockfordlhotka referenced this issue in MarimerLLC/cslaforum Oct 11, 2017

Open

CSLA .NET 4.7 (coming soon) #442

@rockfordlhotka rockfordlhotka referenced this issue in MarimerLLC/cslaforum Mar 22, 2018

Open

CSLA .NET version 4.7.100 release #510

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment