Encode to ByteBuffer in Circe #717

Merged
merged 1 commit into from Jan 18, 2017

Conversation

@vkostyukov
Member

vkostyukov commented Jan 12, 2017

As discussed in #676 (and implemented in Circe 0.7.0-M2), encoding should be performed in terms of bytes to avoid the extra to-string conversion and hence reduce allocations.
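The difference between the two paths can be sketched in plain Scala (helper names here are illustrative, not Finch's or circe's actual API):

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets

// Hypothetical sketch of the two encoding paths.

// String path: render JSON to a String first, then copy it a second
// time into bytes (the extra allocation the change avoids).
def encodeViaString(render: () => String): ByteBuffer =
  ByteBuffer.wrap(render().getBytes(StandardCharsets.UTF_8))

// Byte path: let the printer write straight into a byte sink,
// skipping the intermediate String entirely.
def encodeViaBytes(print: ByteBuffer => Unit, capacity: Int): ByteBuffer = {
  val buf = ByteBuffer.allocate(capacity)
  print(buf)
  buf.flip()
  buf
}
```

Both produce the same bytes; the byte path just never materializes the intermediate `String`.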

A new benchmark confirms that printing to bytes directly cuts allocations roughly in half and improves throughput by ~15-20%.

circe-core (string)

[info] Benchmark                                               Mode  Cnt        Score        Error   Units
[info] ToServiceBenchmark.foos                                thrpt   20      463.981 ±     24.913   ops/s
[info] ToServiceBenchmark.foos:·gc.alloc.rate.norm            thrpt   20  5002135.084 ±   3252.704    B/op
[info] ToServiceBenchmark.ints                                thrpt   20     2118.919 ±     49.846   ops/s
[info] ToServiceBenchmark.ints:·gc.alloc.rate.norm            thrpt   20   873272.466 ±    106.930    B/op

circe-core (bytes)

[info] Benchmark                                               Mode  Cnt        Score        Error   Units
[info] ToServiceBenchmark.foos                                thrpt   20      523.730 ±     21.871   ops/s
[info] ToServiceBenchmark.foos:·gc.alloc.rate.norm            thrpt   20  2602833.269 ±     76.806    B/op
[info] ToServiceBenchmark.ints                                thrpt   20     2234.631 ±    111.156   ops/s
[info] ToServiceBenchmark.ints:·gc.alloc.rate.norm            thrpt   20   545837.968 ±  43806.910    B/op

circe-jackson (string)

[info] Benchmark                                               Mode  Cnt        Score        Error   Units
[info] ToServiceBenchmark.foos                                thrpt   20      420.095 ±     15.935   ops/s
[info] ToServiceBenchmark.foos:·gc.alloc.rate.norm            thrpt   20  5158946.227 ±   8930.739    B/op
[info] ToServiceBenchmark.ints                                thrpt   20     3398.087 ±    244.404   ops/s
[info] ToServiceBenchmark.ints:·gc.alloc.rate.norm            thrpt   20   691384.267 ±    114.019    B/op

circe-jackson (bytes)

[info] Benchmark                                               Mode  Cnt        Score        Error   Units
[info] ToServiceBenchmark.foos                                thrpt   20      475.034 ±     16.369   ops/s
[info] ToServiceBenchmark.foos:·gc.alloc.rate.norm            thrpt   20  3520947.809 ±  36299.420    B/op
[info] ToServiceBenchmark.ints                                thrpt   20     4046.914 ±    185.765   ops/s
[info] ToServiceBenchmark.ints:·gc.alloc.rate.norm            thrpt   20   342040.224 ±      0.011    B/op
@codecov-io

codecov-io commented Jan 12, 2017

Current coverage is 78.74% (diff: 100%)

Merging #717 into master will increase coverage by 0.93%

@@             master       #717   diff @@
==========================================
  Files            34         34          
  Lines           649        654     +5   
  Methods         625        630     +5   
  Messages          0          0          
  Branches         24         24          
==========================================
+ Hits            505        515    +10   
+ Misses          144        139     -5   
  Partials          0          0          

Powered by Codecov. Last update 58d4f03...92b3389

@vkostyukov

Member

vkostyukov commented Jan 12, 2017

@travisbrown Out of curiosity, any idea why printing int arrays/lists could be slower when targeting byte buffers in circe-core? I'm wondering if there's anything we could do on the Circe side to improve it. I'm happy to file a ticket if you think it deserves attention.

@travisbrown

Member

travisbrown commented Jan 12, 2017

@vkostyukov No, not off the top of my head. If you don't mind filing a ticket, it'd be good to have.

@travisbrown

Member

travisbrown commented Jan 12, 2017

@vkostyukov I just put together some simple benchmarks to try to track this down, but they're not very helpful. I don't really understand how you could see so much of a difference in that one case specifically, since they're both using the same Appendable interface.
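For reference, a byte sink can be put behind `java.lang.Appendable` so the printing loop itself stays identical for both targets; a minimal sketch (illustrative only, not circe's actual implementation, and ASCII-only for brevity):

```scala
import java.nio.ByteBuffer

// Minimal ByteBuffer-backed Appendable (illustrative only, not circe's
// actual sink; assumes ASCII content, so each Char maps to one byte).
final class ByteBufferAppendable(capacity: Int) extends Appendable {
  private val buf = ByteBuffer.allocate(capacity)

  def append(c: Char): Appendable = { buf.put(c.toByte); this }

  def append(cs: CharSequence): Appendable = append(cs, 0, cs.length)

  def append(cs: CharSequence, start: Int, end: Int): Appendable = {
    var i = start
    while (i < end) { buf.put(cs.charAt(i).toByte); i += 1 }
    this
  }

  // Return a flipped view positioned over what was written so far.
  def result(): ByteBuffer = {
    val out = buf.duplicate()
    out.flip()
    out
  }
}
```

With both a `StringBuilder` and a sink like this behind the same interface, the printing code paths are identical, which makes a large throughput gap in just the ints case surprising.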

@vkostyukov

Member

vkostyukov commented Jan 14, 2017

@travisbrown I think the benchmark results are just volatile enough (at least on my machine) to not capture the difference precisely. Let me try to make it more stable and see if we can get those numbers aligned. I think by reducing the size of the data set (reducing allocations) we should get fewer GC pauses and more predictable results.

@vkostyukov vkostyukov merged commit fb758d9 into master Jan 18, 2017

4 checks passed

codecov/patch: 100% of diff hit (target 77.81%)
codecov/project: 78.74% (+0.93%) compared to 58d4f03
continuous-integration/travis-ci/pr: The Travis CI build passed
continuous-integration/travis-ci/push: The Travis CI build passed

@vkostyukov vkostyukov deleted the vk/no-string-in-enc branch Jan 18, 2017

@vkostyukov vkostyukov removed the in progress label Jan 18, 2017
