Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-1438: Cache schema to increase reader performance #1604

Closed
wants to merge 14 commits into from

Conversation

KyleSchoonover
Copy link
Contributor

The original intent of this improvement was to increase the performance of the Generic / Specific Reader classes in regards to RecordSchema. I tested for both netcore 3.1 and .Net 6.0. The performance increase is around 15% of netcore 3.1. For .Net 6.0 it is around 7%.

In addition, I tested with a hashset assuming multiple repeating records within a large schema, but it actually decreased performance.

Jira

Tests

  • My PR does not need testing for this extremely good reason: Performance improvement. Ran manually through perf test project

Commits

  • My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain Javadoc that explain what it does

@github-actions github-actions bot added the C# label Mar 17, 2022
@martin-g
Copy link
Member

The original intent of this improvement was to increase the performance of the Generic / Specific Reader classes in regards to RecordSchema. I tested for both netcore 3.1 and .Net 6.0. The performance increase is around 15% of netcore 3.1. For .Net 6.0 it is around 7%.

How did you test it ?
AFAIK there are two ways to test performance in the C# SDK. Which one did you use ?

@KyleSchoonover
Copy link
Contributor Author

@martin-g I used the Avro.perf project. Ran it 10 times and took an average of the timings. Then updated the code and ran it 10 times again and averaged the timings there. I actually did a 3rd round using a HashSet, but the performance was not good.

@RyanSkraba RyanSkraba changed the title AVRO-1438 Cache schema to increase reader performance AVRO-1438: Cache schema to increase reader performance Mar 18, 2022
@zcsizmadia
Copy link
Contributor

Any potential issue for thread safety?

@zcsizmadia
Copy link
Contributor

Instaed of implementing the caching at in the RecordScheme.CanRead function:

  1. Can the caching be implemented at the level where the CanRead is called? If the CanRead is called many times, all the other CanRead functions are affected by this performance gain.

  2. Did you investigate what causes the CanRead being called many times? E.g. as much as I can tell by looking at GenericReader code , IMO the CanRead shoudl be done in the constructor and not in every Read, is it possible just simply moving the CanRead check to the constructor will fix this issue without the caching?

@zcsizmadia
Copy link
Contributor

@KyleSchoonover Could you post your before and afetr measurements?

Here is my diff I applied to master:

diff --git a/lang/csharp/src/apache/main/Generic/GenericReader.cs b/lang/csharp/src/apache/main/Generic/GenericReader.cs
index f42e572d..45df3b29 100644
--- a/lang/csharp/src/apache/main/Generic/GenericReader.cs
+++ b/lang/csharp/src/apache/main/Generic/GenericReader.cs
@@ -121,6 +121,9 @@ namespace Avro.Generic
         {
             this.ReaderSchema = readerSchema;
             this.WriterSchema = writerSchema;
+
+            if (!ReaderSchema.CanRead(WriterSchema))
+                throw new AvroException("Schema mismatch. Reader: " + ReaderSchema + ", writer: " + WriterSchema);
         }

         /// <summary>
@@ -134,9 +137,6 @@ namespace Avro.Generic
         /// <returns>Object read from the decoder.</returns>
         public T Read<T>(T reuse, Decoder decoder)
         {
-            if (!ReaderSchema.CanRead(WriterSchema))
-                throw new AvroException("Schema mismatch. Reader: " + ReaderSchema + ", writer: " + WriterSchema);
-
             return (T)Read(reuse, WriterSchema, ReaderSchema, decoder);
         }

Before:

$ dotnet run -c Release -f net6.0
type    impl    action  total_items     batches batch_size      time(ms)
simple  default_specific        serialize       1000000 1000    1000    500
simple  default_specific        deserialize     1000000 1000    1000    1188
simple  preresolved_specific    serialize       1000000 1000    1000    312
simple  preresolved_specific    deserialize     1000000 1000    1000    328
simple  default_generic serialize       1000000 1000    1000    391
simple  default_generic deserialize     1000000 1000    1000    1109
simple  preresolved_generic     serialize       1000000 1000    1000    250
simple  preresolved_generic     deserialize     1000000 1000    1000    469
complex default_specific        serialize       1000000 1000    1000    3015
complex default_specific        deserialize     1000000 1000    1000    7391
complex preresolved_specific    serialize       1000000 1000    1000    2438
complex preresolved_specific    deserialize     1000000 1000    1000    3406
complex default_generic serialize       1000000 1000    1000    2937
complex default_generic deserialize     1000000 1000    1000    5735
complex preresolved_generic     serialize       1000000 1000    1000    2031
complex preresolved_generic     deserialize     1000000 1000    1000    2641
narrow  default_specific        serialize       1000000 1000    1000    203
narrow  default_specific        deserialize     1000000 1000    1000    547
narrow  preresolved_specific    serialize       1000000 1000    1000    156
narrow  preresolved_specific    deserialize     1000000 1000    1000    157
narrow  default_generic serialize       1000000 1000    1000    203
narrow  default_generic deserialize     1000000 1000    1000    562
narrow  preresolved_generic     serialize       1000000 1000    1000    141
narrow  preresolved_generic     deserialize     1000000 1000    1000    219
wide    default_specific        serialize       1000000 1000    1000    2610
wide    default_specific        deserialize     1000000 1000    1000    6593
wide    preresolved_specific    serialize       1000000 1000    1000    2297
wide    preresolved_specific    deserialize     1000000 1000    1000    2141
wide    default_generic serialize       1000000 1000    1000    2109
wide    default_generic deserialize     1000000 1000    1000    6235
wide    preresolved_generic     serialize       1000000 1000    1000    1343
wide    preresolved_generic     deserialize     1000000 1000    1000    2766

After:

$ dotnet run -c Release -f net6.0
type    impl    action  total_items     batches batch_size      time(ms)
simple  default_specific        serialize       1000000 1000    1000    531
simple  default_specific        deserialize     1000000 1000    1000    891
simple  preresolved_specific    serialize       1000000 1000    1000    328
simple  preresolved_specific    deserialize     1000000 1000    1000    344
simple  default_generic serialize       1000000 1000    1000    453
simple  default_generic deserialize     1000000 1000    1000    844
simple  preresolved_generic     serialize       1000000 1000    1000    281
simple  preresolved_generic     deserialize     1000000 1000    1000    547
complex default_specific        serialize       1000000 1000    1000    3579
complex default_specific        deserialize     1000000 1000    1000    7218
complex preresolved_specific    serialize       1000000 1000    1000    2875
complex preresolved_specific    deserialize     1000000 1000    1000    3969
complex default_generic serialize       1000000 1000    1000    3266
complex default_generic deserialize     1000000 1000    1000    4734
complex preresolved_generic     serialize       1000000 1000    1000    2109
complex preresolved_generic     deserialize     1000000 1000    1000    2797
narrow  default_specific        serialize       1000000 1000    1000    219
narrow  default_specific        deserialize     1000000 1000    1000    406
narrow  preresolved_specific    serialize       1000000 1000    1000    141
narrow  preresolved_specific    deserialize     1000000 1000    1000    187
narrow  default_generic serialize       1000000 1000    1000    204
narrow  default_generic deserialize     1000000 1000    1000    375
narrow  preresolved_generic     serialize       1000000 1000    1000    140
narrow  preresolved_generic     deserialize     1000000 1000    1000    250
wide    default_specific        serialize       1000000 1000    1000    2688
wide    default_specific        deserialize     1000000 1000    1000    4781
wide    preresolved_specific    serialize       1000000 1000    1000    1969
wide    preresolved_specific    deserialize     1000000 1000    1000    2078
wide    default_generic serialize       1000000 1000    1000    2094
wide    default_generic deserialize     1000000 1000    1000    4640
wide    preresolved_generic     serialize       1000000 1000    1000    1422
wide    preresolved_generic     deserialize     1000000 1000    1000    2953

You are definetely onto somethng with the CanRead function. E.g. wide default_generic deserialize 1000000 1000 1000
improved from 6235ms ->4640ms = ~35%, which is massive,

Btw, PreresolvingDatumReader does the if (!ReaderSchema.CanRead(WriterSchema)) check in the constructor and not in the Read function. Which seems to be the correct path.

However there are other places where CanRead is called, so caching might make sense there as well for the other types as well.

@KyleSchoonover
Copy link
Contributor Author

Run against Master. Realize this is the averages of 10 runs of the perf project.

type	impl	action	.Net 6 Avg	.Net 5 Avg	.Net Core 3.1 Avg
simple	default_specific	serialize	851.4	873.6	1037.6
simple	default_specific	deserialize	2317.1	2454.7	2637.5
simple	preresolved_specific	serialize	576.6	629.5	709.3
simple	preresolved_specific	deserialize	639.1	687.4	847.1
simple	default_generic	serialize	900.1	953.4	1068.6
simple	default_generic	deserialize	2196.9	2282.6	2564.1
simple	preresolved_generic	serialize	489.2	539.2	609.2
simple	preresolved_generic	deserialize	895.1	915.5	1182.9
complex	default_specific	serialize	7063.8	7190.7	8598.4
complex	default_specific	deserialize	16762.6	17534.4	20399.9
complex	preresolved_specific	serialize	5159.3	5429.9	6175.1
complex	preresolved_specific	deserialize	7609.6	8395.1	9604.8
complex	default_generic	serialize	6720.2	7015.6	7895.3
complex	default_generic	deserialize	13140.6	13942.3	15473.5
complex	preresolved_generic	serialize	4417.2	4970.3	5346.6
complex	preresolved_generic	deserialize	6300.1	6287.5	7490.7
narrow	default_specific	serialize	457.6	475	546.8
narrow	default_specific	deserialize	1233	1243.6	1409.6
narrow	preresolved_specific	serialize	323.4	322	351.6
narrow	preresolved_specific	deserialize	395.2	395.3	540.7
narrow	default_generic	serialize	473.6	521.8	598.4
narrow	default_generic	deserialize	1173.2	1218.9	1549.8
narrow	preresolved_generic	serialize	281.2	295.2	339.1
narrow	preresolved_generic	deserialize	479.8	490.6	689.2
wide	default_specific	serialize	5209.4	5379.7	6717
wide	default_specific	deserialize	13739.2	14301.4	16419
wide	preresolved_specific	serialize	3745.1	4072	4760.8
wide	preresolved_specific	deserialize	4589.3	4906.4	6351.4
wide	default_generic	serialize	4948.3	5198.3	6065.6
wide	default_generic	deserialize	12432.9	13159.5	14436.1
wide	preresolved_generic	serialize	2837.4	3092.1	3448.3
wide	preresolved_generic	deserialize	5551.6	5914	7287.5

Same run, but with updated code, but this will include the percentage change against the mast run.

type	impl	action	.Net 6 Avg	% Change	.Net 5 Avg	% Change	.Net Core 3.1 Avg	% Change
simple	default_specific	serialize	953.1	10.67%	940.9	7.15%	1256.5	17.42%
simple	default_specific	deserialize	1868.7	-24.00%	1893.6	-29.63%	2206.2	-19.55%
simple	preresolved_specific	serialize	596.9	3.40%	676.5	6.95%	854.5	16.99%
simple	preresolved_specific	deserialize	681.2	6.18%	725	5.19%	1003.1	15.55%
simple	default_generic	serialize	943.8	4.63%	1040.8	8.40%	1175.2	9.07%
simple	default_generic	deserialize	1720.3	-27.70%	1782.7	-28.04%	2059.3	-24.51%
simple	preresolved_generic	serialize	529.8	7.66%	579.5	6.95%	657.6	7.36%
simple	preresolved_generic	deserialize	940.5	4.83%	1020.6	10.30%	1326.8	10.85%
complex	default_specific	serialize	7189.1	1.74%	7848.6	8.38%	9432.9	8.85%
complex	default_specific	deserialize	13609.5	-23.17%	15517	-13.00%	18128.1	-12.53%
complex	preresolved_specific	serialize	5313.9	2.91%	5904.8	8.04%	7086	12.85%
complex	preresolved_specific	deserialize	7828.3	2.79%	9107.8	7.83%	10529.7	8.78%
complex	default_generic	serialize	6898.4	2.58%	7665.7	8.48%	8350.1	5.45%
complex	default_generic	deserialize	10679.6	-23.04%	11453.1	-21.73%	13190.4	-17.31%
complex	preresolved_generic	serialize	4623.4	4.46%	5196.9	4.36%	5729.6	6.68%
complex	preresolved_generic	deserialize	6281.3	-0.30%	6623.5	5.07%	8136	7.93%
narrow	default_specific	serialize	454.6	-0.66%	492.1	3.47%	579.5	5.64%
narrow	default_specific	deserialize	867.3	-42.17%	950.2	-30.88%	1079.8	-30.54%
narrow	preresolved_specific	serialize	307.8	-5.07%	368.7	12.67%	397	11.44%
narrow	preresolved_specific	deserialize	378	-4.55%	431.2	8.33%	550.1	1.71%
narrow	default_generic	serialize	484.4	2.23%	540.6	3.48%	623.3	3.99%
narrow	default_generic	deserialize	865.8	-35.50%	951.7	-28.08%	1151.5	-34.59%
narrow	preresolved_generic	serialize	276.4	-1.74%	332.8	11.30%	351.5	3.53%
narrow	preresolved_generic	deserialize	475.1	-0.99%	551.5	11.04%	748.5	7.92%
wide	default_specific	serialize	5165.6	-0.85%	5954.7	9.66%	6662.4	-0.82%
wide	default_specific	deserialize	10800	-27.21%	11476.3	-24.62%	13759.2	-19.33%
wide	preresolved_specific	serialize	4065.5	7.88%	4661.2	12.64%	5074.9	6.19%
wide	preresolved_specific	deserialize	4831.2	5.01%	5445.2	9.89%	6486	2.08%
wide	default_generic	serialize	5139	3.71%	5996.9	13.32%	5997.1	-1.14%
wide	default_generic	deserialize	9175	-35.51%	10412.8	-26.38%	11568.8	-24.78%
wide	preresolved_generic	serialize	2987.7	5.03%	3551.1	12.93%	3976.5	13.28%
wide	preresolved_generic	deserialize	5792.1	4.15%	6469	8.58%	8153	10.62%

@zcsizmadia
Copy link
Contributor

Beside complex default_specific deserialize the number really match up wuth the numbers generated by moving the CanRead to the constructor.

@KyleSchoonover
Copy link
Contributor Author

Doing a third run with the move of CanRead and the last matched schema. Will have the results in a bit.

@KyleSchoonover
Copy link
Contributor Author

Here are the new results:

type	impl	action	.Net 6 Avg	% Change	.Net 5 Avg	% Change	.Net Core 3.1 Avg	% Change
simple	default_specific	serialize	907.8	6.21%	937.6	6.83%	1174.9	11.69%
simple	default_specific	deserialize	1767.3	-31.11%	1798.3	-36.50%	2064.3	-27.77%
simple	preresolved_specific	serialize	579.7	0.53%	665.6	5.42%	865.5	18.05%
simple	preresolved_specific	deserialize	667.2	4.21%	712.6	3.54%	1025.1	17.36%
simple	default_generic	serialize	949.9	5.24%	964	1.10%	1181.3	9.54%
simple	default_generic	deserialize	1692.1	-29.83%	1739.1	-31.25%	1998.4	-28.31%
simple	preresolved_generic	serialize	520.3	5.98%	582.7	7.47%	637.6	4.45%
simple	preresolved_generic	deserialize	947	5.48%	998.5	8.31%	1223.2	3.29%
complex	default_specific	serialize	7506.4	5.90%	7478.1	3.84%	9273.3	7.28%
complex	default_specific	deserialize	13662.4	-22.69%	14518.8	-20.77%	18382.8	-10.97%
complex	preresolved_specific	serialize	5164	0.09%	5496.8	1.22%	6704.6	7.90%
complex	preresolved_specific	deserialize	7761.2	1.95%	8278.1	-1.41%	10847.1	11.45%
complex	default_generic	serialize	6754.5	0.51%	6826.5	-2.77%	9006.1	12.33%
complex	default_generic	deserialize	10596.9	-24.00%	10389.2	-34.20%	13548.6	-14.21%
complex	preresolved_generic	serialize	4501.7	1.88%	4882.8	-1.79%	5988.9	10.72%
complex	preresolved_generic	deserialize	6135.8	-2.68%	6406.1	1.85%	8595.3	12.85%
narrow	default_specific	serialize	470.5	2.74%	473.6	-0.30%	620.2	11.83%
narrow	default_specific	deserialize	879.6	-40.18%	837.3	-48.53%	1075.2	-31.10%
narrow	preresolved_specific	serialize	326.5	0.95%	347.2	7.26%	416.9	15.66%
narrow	preresolved_specific	deserialize	396.9	0.43%	410.7	3.75%	587.7	8.00%
narrow	default_generic	serialize	512.4	7.57%	504.8	-3.37%	645.3	7.27%
narrow	default_generic	deserialize	892.2	-31.50%	848.3	-43.69%	1128	-37.39%
narrow	preresolved_generic	serialize	289.1	2.73%	317.3	6.97%	395.3	14.22%
narrow	preresolved_generic	deserialize	484.4	0.95%	506.2	3.08%	826.7	16.63%
wide	default_specific	serialize	5111	-1.93%	5439.1	1.09%	7100	5.39%
wide	default_specific	deserialize	10374.9	-32.43%	10540.5	-35.68%	13585.9	-20.85%
wide	preresolved_specific	serialize	3806.3	1.61%	4164.1	2.21%	5267.2	9.61%
wide	preresolved_specific	deserialize	4673.5	1.80%	4848.3	-1.20%	7131.3	10.94%
wide	default_generic	serialize	5015.5	1.34%	4970.5	-4.58%	6353.1	4.53%
wide	default_generic	deserialize	9137.7	-36.06%	8968.5	-46.73%	11631.2	-24.12%
wide	preresolved_generic	serialize	3059.3	7.25%	3253.4	4.96%	3820.4	9.74%
wide	preresolved_generic	deserialize	5642.1	1.60%	6148.4	3.81%	7823.2	6.85%

@zcsizmadia
Copy link
Contributor

What are the average numbers with the CanRead in the constructor only?

@KyleSchoonover
Copy link
Contributor Author

What are the average numbers with the CanRead in the constructor only?

I will run it. May have to wait until morning for me to see the results.

@KyleSchoonover
Copy link
Contributor Author

CanRead moved to constructor

type	impl	action	.Net 6 Avg	% Change	.Net 5 Avg	% Change	.Net Core 3.1 Avg	% Change
simple	default_specific	serialize	937.6	9.19%	920.3	5.07%	1163.9	10.85%
simple	default_specific	deserialize	1812.6	-27.83%	1842.3	-33.24%	2023.5	-30.34%
simple	preresolved_specific	serialize	606.3	4.90%	637.3	1.22%	756.2	6.20%
simple	preresolved_specific	deserialize	671.7	4.85%	700	1.80%	923.5	8.27%
simple	default_generic	serialize	1003.2	10.28%	961.1	0.80%	1137.6	6.07%
simple	default_generic	deserialize	1711	-28.40%	1726.5	-32.21%	1903	-34.74%
simple	preresolved_generic	serialize	517.1	5.40%	573.2	5.93%	654.8	6.96%
simple	preresolved_generic	deserialize	923.4	3.06%	962.7	4.90%	1274.9	7.22%
complex	default_specific	serialize	7173.5	1.53%	7512.5	4.28%	8832.6	2.65%
complex	default_specific	deserialize	14051.6	-19.29%	15001.5	-16.88%	17001.7	-19.99%
complex	preresolved_specific	serialize	5398.4	4.43%	5556.2	2.27%	6448.6	4.24%
complex	preresolved_specific	deserialize	8032.8	5.27%	8443.9	0.58%	10178	5.63%
complex	default_generic	serialize	6804.6	1.24%	7175	2.22%	8478.2	6.88%
complex	default_generic	deserialize	10753.2	-22.20%	10893.7	-27.98%	13512.4	-14.51%
complex	preresolved_generic	serialize	4501.6	1.87%	4951.6	-0.38%	5745.3	6.94%
complex	preresolved_generic	deserialize	6276.7	-0.37%	6320.3	0.52%	7959.4	5.89%
narrow	default_specific	serialize	448.2	-2.10%	468.7	-1.34%	595.3	8.15%
narrow	default_specific	deserialize	834.5	-47.75%	879.7	-41.37%	1046.9	-34.65%
narrow	preresolved_specific	serialize	309.5	-4.49%	339	5.01%	378.2	7.03%
narrow	preresolved_specific	deserialize	387.4	-2.01%	387.7	-1.96%	548.4	1.40%
narrow	default_generic	serialize	496.9	4.69%	504.7	-3.39%	575.1	-4.05%
narrow	default_generic	deserialize	836	-40.33%	867.1	-40.57%	1034.1	-49.87%
narrow	preresolved_generic	serialize	282.7	0.53%	303	2.57%	328.2	-3.32%
narrow	preresolved_generic	deserialize	482.7	0.60%	492.2	0.33%	687.6	-0.23%
wide	default_specific	serialize	5279.7	1.33%	5461.1	1.49%	6376.5	-5.34%
wide	default_specific	deserialize	10711.2	-28.27%	10834.1	-32.00%	12629.7	-30.00%
wide	preresolved_specific	serialize	4054.5	7.63%	4165.9	2.25%	4809.3	1.01%
wide	preresolved_specific	deserialize	4912.5	6.58%	4954.7	0.97%	6445.4	1.46%
wide	default_generic	serialize	5189.1	4.64%	5218.6	0.39%	5956.1	-1.84%
wide	default_generic	deserialize	9153.1	-35.83%	9490.6	-38.66%	11014	-31.07%
wide	preresolved_generic	serialize	2926.6	3.05%	3231.1	4.30%	3642.4	5.33%
wide	preresolved_generic	deserialize	5907.7	6.03%	5906.3	-0.13%	7786.1	6.40%

@KyleSchoonover
Copy link
Contributor Author

At some point I may have to visit this perf project to output P99 and P95.

@zcsizmadia
Copy link
Contributor

  1. I would use benchmark for that. it has all the bells and whistles for statistical analysis.
  2. Do you think moving the CanRead to the constructor is enough? Instead of of implementing the caching?

@KyleSchoonover
Copy link
Contributor Author

@martin-g @zcsizmadia Closing this out. I will put a note in the jira as well. That this is fixed with AVRO-3474.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants