Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve deserialization performance by ~22%. #97

Closed
wants to merge 1 commit into from

Conversation

optimiz3
Copy link
Contributor

Deserialization CPU performance was suffering because both LookupDiscriminatorConvention and LookupSerializer were being called for nearly each class member. Both functions take out and release the global config read-lock. Profiling showed that around 25% of all CPU time was being spent in ReaderWriterLockSlim.TryEnterReadLock and ReaderWriterLockSlim.ExitReadLock due to these functions.

Fix makes the BsonClassMap and BsonMemberMap persist references to the appropriate IDiscriminatorConvention and IBsonSerializer singletons. By doing this, lookups are avoided while maintaining semantic correctness due to the associated BsonClassMap and BsonMemberMap types never changing once constructed.

Also introduced is the FastSingleton class, which extends the existing runtime singleton logic as follows:

  1. Enables more static singleton binding both at compile time and when using generics.
  2. Enables lazy singleton instantiation
  3. Decouples singleton data structure logic from type-specific instantiation logic.
  4. Provides structural performance advantages by isolating runtime binding reader/writer locks and dictionaries by use-case.
    Other issues addressed:
  5. DefaultSerializers are now always lazy-bound. This simplified the implementation by ensuring singletons are only set once in the case where a user has specified a custom serializer.
  6. BsonSerializer.Serialize had a bug where a class implementing IBsonSerializable would always be serialized using the IBsonSerializable implemention, even if the implementation was overridden by a registered IBsonSerializer.

Deserialization CPU performance was suffering because both LookupDiscriminatorConvention and LookupSerializer were being called for nearly each class member. Both functions take out and release the global config read-lock. Profiling showed that around 25% of all CPU time was being spent in ReaderWriterLockSlim.TryEnterReadLock and ReaderWriterLockSlim.ExitReadLock due to these functions.

Fix makes the BsonClassMap and BsonMemberMap persist references to the appropriate IDiscriminatorConvention and IBsonSerializer singletons.  By doing this, lookups are avoided while maintaining semantic correctness due to the associated BsonClassMap and BsonMemberMap types never changing once constructed.

Also introduced is the FastSingleton class, which extends the existing runtime singleton logic as follows:
1.	Enables more static singleton binding both at compile time and when using generics.
2.	Enables lazy singleton instantiation
3.	Decouples singleton data structure logic from type-specific instantiation logic.
4.	Provides structural performance advantages by isolating runtime binding reader/writer locks and dictionaries by use-case.
Other issues addressed:
1.	DefaultSerializers are now always lazy-bound. This simplified the implementation by ensuring singletons are only set once in the case where a user has specified a custom serializer.
2.	BsonSerializer.Serialize had a bug where a class implementing IBsonSerializable would always be serialized using the IBsonSerializable implemention, even if the implementation was overridden by a registered IBsonSerializer.
@rstam
Copy link
Contributor

rstam commented Apr 23, 2012

I tried to reproduce your 22% speed improvement and did not see the same improvement. I ran the test program that you provided me separately 4 times each with the driver from master and from your fork. The results were:

master: 10.68, 10.71, 10.72, 11.30 (seconds for 10,000 iterations)
your fork: 10.49, 10.71, 10.82, 11.03

So I'm seeing essentially no speed difference at all. I'm not sure why.

@rstam
Copy link
Contributor

rstam commented Apr 24, 2012

At Alex's suggestion I have repeated the tests without a debugger attached and also using Release builds. Without a debugger attached I see 7% speed improvement in Debug builds and 9% improvement in Release builds. Maybe not the same improvement Alex is seeing, but certainly significant now.

@rstam
Copy link
Contributor

rstam commented Apr 24, 2012

I've poured over this pull request and I'm pretty convinced that the only thing contributing to any speed improvement is that BsonClassMap and BsonMemberMap are caching information about the corresponding serializer and discriminator convention. I don't think FastSingleton is contributing anything. I also think FastSingleton is incredibly confusing and would rather not adopt it. Check out this experimental commit which achieves even better performance than this pull request with very much simpler changes:

rstam@a4ac453

Would love to know if you see similar performance improvements on your machine using my experimental commit (or whether you see any bugs or thread safety issues in my experimental commit).

@rstam
Copy link
Contributor

rstam commented Apr 24, 2012

I pushed another experimental commit to my personal branch:

rstam@a6aaeca

that replaces the HashSet used in BsonClassMapSerializer.Deserialize with a linear array. I use a shrinking from/to range to avoid resizing the array, and the fact that elements probably appear in the same order as the memberMaps to assume that a linear search will be very fast.

This code needs to be cleaned up a bit before being incorporated into the master branch, but it should serve the purpose of proving that the speed improvements are real.

@rstam
Copy link
Contributor

rstam commented Apr 24, 2012

Created two JIRAs for the two components of speed improvements discussed above (caching serializers and discriminator conventions and not using HashSet to detect missing required elements):

https://jira.mongodb.org/browse/CSHARP-451
https://jira.mongodb.org/browse/CSHARP-452

While we won't be merging in this pull request verbatim I would like to give full credit to this pull request for finding these two opportunities for speed improvements and for the research and prototyping involved.

Thanks Alex!

@rstam rstam closed this Apr 24, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants