Add specialized TupleCoders #3350
Conversation
scio-core/src/main/scala/com/spotify/scio/coders/instances/ScalaCoders.scala
Force-pushed from 47fa3b6 to 77e8b9c
Codecov Report

@@            Coverage Diff             @@
##           master    #3350      +/-   ##
==========================================
- Coverage   72.71%   69.10%    -3.61%
==========================================
  Files         234      233        -1
  Lines        7710     7431      -279
  Branches      347      326       -21
==========================================
- Hits         5606     5135      -471
- Misses       2104     2296      +192

Continue to review full report at Codecov.
jto left a comment
I'm pretty sure this will actually be slower than the current version. Did you run benchmarks?
scio-core/src/main/scala-2.12/com/spotify/scio/coders/instances/TupleCoders.scala
Hi @jto! Good points, I should have made them clear in the PR description.
To emphasise: the main goal of this PR is to reduce serde time. The quick benchmarks below show improvements between 20% and 50%, which can translate into sizeable savings when using cogroup ops.
master
[info] Benchmark                    Mode  Cnt    Score    Error  Units
[info] CoderBenchmark.tuple3Decode  avgt    5   53.833 ±  0.326  ns/op
[info] CoderBenchmark.tuple3Encode  avgt    5   86.729 ± 10.622  ns/op
[info] CoderBenchmark.tuple4Decode  avgt    5   69.513 ±  1.937  ns/op
[info] CoderBenchmark.tuple4Encode  avgt    5  106.868 ±  1.345  ns/op

PR
[info] Benchmark                    Mode  Cnt    Score    Error  Units
[info] CoderBenchmark.tuple3Decode  avgt    5   26.200 ±  0.136  ns/op
[info] CoderBenchmark.tuple3Encode  avgt    5   65.883 ±  2.037  ns/op
[info] CoderBenchmark.tuple4Decode  avgt    5   40.348 ±  0.209  ns/op
[info] CoderBenchmark.tuple4Encode  avgt    5   72.218 ±  0.524  ns/op
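For readers wondering where the serde win comes from, here is a minimal self-contained sketch. The `SimpleCoder` trait and `IntSC` instance are hypothetical stand-ins for illustration only (the actual PR targets Beam/Scio `Coder` types); the idea is that a hand-written tuple coder just writes the fields back-to-back, with no generic derivation machinery on the hot path:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream,
  DataOutputStream, InputStream, OutputStream}

// Hypothetical stand-in for a coder interface, for illustration only.
trait SimpleCoder[T] {
  def encode(value: T, os: OutputStream): Unit
  def decode(is: InputStream): T
}

// Element coder for Int: 4 big-endian bytes.
object IntSC extends SimpleCoder[Int] {
  def encode(v: Int, os: OutputStream): Unit = new DataOutputStream(os).writeInt(v)
  def decode(is: InputStream): Int = new DataInputStream(is).readInt()
}

// Specialized Tuple3 coder: encodes the three components back-to-back
// with their element coders, and decodes them in the same order.
class Tuple3SC[A, B, C](a: SimpleCoder[A], b: SimpleCoder[B], c: SimpleCoder[C])
    extends SimpleCoder[(A, B, C)] {
  def encode(v: (A, B, C), os: OutputStream): Unit = {
    a.encode(v._1, os); b.encode(v._2, os); c.encode(v._3, os)
  }
  def decode(is: InputStream): (A, B, C) =
    (a.decode(is), b.decode(is), c.decode(is))
}
```

A round trip through a `ByteArrayOutputStream`/`ByteArrayInputStream` pair recovers the original tuple; the real coders do the same work against Beam's stream-based `Coder.encode`/`Coder.decode` contract.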
Regarding compile time, I'm aware there is in fact some degradation (I'm not sure how much in user code bases, but there is some in scio). As mentioned, that's not the main focus here, and perhaps it can be alleviated with PR #3170.
We could however try a few approaches:
1) Apply this only to tuples up to arity 4/5; above that, fall back to Coder.gen.
2) If possible, see whether nested Coder.transform can be optimised in a separate PR.
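Approach 1 can be sketched with standard implicit prioritization: specialized instances live in the companion object and win over a low-priority derived fallback. The toy `Show` typeclass below stands in for `Coder` (all names here are hypothetical; in Scio the fallback role would be played by Coder.gen):

```scala
// Toy typeclass standing in for Coder; everything here is illustrative.
trait Show[T] { def show(t: T): String }

trait LowPriorityShow {
  // Low-priority generic fallback, playing the role of the derived coder
  // used for arities without a specialized instance.
  implicit def fallback[T]: Show[T] = t => s"derived(${t.toString})"
}

object Show extends LowPriorityShow {
  implicit val intShow: Show[Int] = _.toString
  // Specialized instance: being defined directly on the companion object,
  // it is resolved before the inherited fallback.
  implicit def tuple3[A, B, C](implicit a: Show[A], b: Show[B], c: Show[C]): Show[(A, B, C)] =
    t => s"(${a.show(t._1)}, ${b.show(t._2)}, ${c.show(t._3)})"
}

object TupleCoderPriority {
  def describe[T](t: T)(implicit s: Show[T]): String = s.show(t)
}
```

With this layering, `(Int, Int, Int)` resolves to the specialized instance while a `Tuple5` silently falls through to the generic one, which is exactly the cut-off behaviour proposed for arities above 4/5.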
I was also talking about serde time. I think I remember fiddling with nested …
Force-pushed from 3a567d3 to 0d19b9a
Force-pushed from 0d19b9a to 51aee2d
So I was looking at what impacted compile time by 2.8% and found out that it was the removal of …
I recently noticed that Tuple3 / Tuple4 are not that uncommon, so I decided to specialize them to save on serialization. Users should rarely rely on arities above that, but this PR adds specialized coders for the other arities as well (bonus).
We should at some point ditch these Python scripts and use either paiges or scalafix.