AVRO-1438: Cache schema to increase reader performance #1604
This reverts commit b8601c0.
How did you test it?
@martin-g I used the Avro.perf project. I ran it 10 times and averaged the timings, then updated the code, ran it 10 more times, and averaged those timings. I also did a third round using a HashSet, but the performance was not good.
Are there any potential thread-safety issues?
Instead of implementing the caching in the RecordSchema.CanRead function:
@KyleSchoonover Could you post your before and after measurements? Here is my diff I applied to
Before:
After:
You are definitely onto something with the CanRead function. E.g. Btw, However, there are other places where CanRead is called, so caching might make sense for the other types as well.
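The "last matched schema" idea discussed in this thread can be sketched roughly as follows. This is a minimal, language-neutral illustration written in Python, not the Avro C# implementation: the `RecordSchema` class here is a toy stand-in, and `can_read`, `_can_read_uncached`, `_last_matched`, and `expensive_checks` are hypothetical names chosen for the example.

```python
class RecordSchema:
    """Toy stand-in for a record schema (hypothetical, not the Avro C# class)."""

    def __init__(self, name, fields):
        self.name = name
        self.fields = tuple(fields)
        self.expensive_checks = 0   # counts full comparisons, for illustration only
        self._last_matched = None   # most recent writer schema that passed the check

    def _can_read_uncached(self, writer):
        # Stands in for the full (potentially recursive) compatibility check.
        self.expensive_checks += 1
        return self.name == writer.name and self.fields == writer.fields

    def can_read(self, writer):
        # Fast path: identity comparison against the last schema that matched.
        if writer is self._last_matched:
            return True
        ok = self._can_read_uncached(writer)
        if ok:
            # Remember only successful matches; failures are re-checked each time.
            self._last_matched = writer
        return ok


reader = RecordSchema("User", ["id", "name"])
writer = RecordSchema("User", ["id", "name"])
for _ in range(1000):
    reader.can_read(writer)
print(reader.expensive_checks)  # prints 1
```

In this sketch the cache is a single reference, so a lost or stale write under concurrent use would only cost one extra full check. Whether that holds for the real reader classes depends on details not shown in this thread.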
Run against master. Note that these are the averages of 10 runs of the perf project.
Same run with the updated code; this one includes the percentage change against the master run.
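The comparison method described here (average of 10 runs, then percentage change against the master average) can be sketched as below. This is an illustrative Python sketch only, not the Avro.perf project; `average_seconds` and `percent_change` are hypothetical helper names, and `workload` stands for whatever benchmark body is being timed.

```python
import statistics
import time


def average_seconds(workload, runs=10):
    """Run `workload` `runs` times and return the mean wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def percent_change(baseline, candidate):
    """Percentage change of candidate relative to baseline; negative means faster."""
    return (candidate - baseline) / baseline * 100.0
```

A mean over 10 runs hides tail behaviour, which is why percentile outputs such as P95 and P99 would be a useful addition to a benchmark like this.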
Beside |
Doing a third run with both the CanRead move and the last matched schema. Will have the results in a bit.
Here are the new results:
What are the average numbers with CanRead in the constructor only?
I will run it. I may have to wait until morning to see the results.
CanRead moved to constructor
At some point I may have to revisit this perf project to output P99 and P95.
@martin-g @zcsizmadia Closing this out. I will also put a note in the Jira that this is fixed with AVRO-3474.
The original intent of this improvement was to increase the performance of the Generic / Specific Reader classes with regard to RecordSchema. I tested on both netcore 3.1 and .NET 6.0. The performance increase is around 15% on netcore 3.1 and around 7% on .NET 6.0.
In addition, I tested with a HashSet, assuming multiple repeating records within a large schema, but it actually decreased performance.