Improve performance of reading large result sets #835

mpilquist · 2023-04-02T13:40:05Z

Fixes #833

codecov · 2023-04-02T13:43:10Z

Codecov Report

Merging #835 (3a868c3) into main (2a87b75) will increase coverage by 0.17%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #835      +/-   ##
==========================================
+ Coverage   84.83%   85.01%   +0.17%     
==========================================
  Files         126      126              
  Lines        1728     1742      +14     
  Branches      202      147      -55     
==========================================
+ Hits         1466     1481      +15     
+ Misses        262      261       -1

Impacted Files	Coverage Δ
...core/shared/src/main/scala/net/MessageSocket.scala	`100.00% <ø> (ø)`
...re/shared/src/main/scala/net/BitVectorSocket.scala	`100.00% <100.00%> (ø)`
...re/shared/src/main/scala/net/message/RowData.scala	`93.33% <100.00%> (+13.33%)`	⬆️

... and 1 file with indirect coverage changes

armanbilge · 2023-04-02T15:23:38Z

modules/core/shared/src/main/scala/net/BitVectorSocket.scala

-      val withTimeout: F[Chunk[Byte]] => F[Chunk[Byte]] = readTimeout match {
+      private val withTimeout: F[Option[Chunk[Byte]]] => F[Option[Chunk[Byte]]] = readTimeout match {
        case _: Duration.Infinite   => identity
        case finite: FiniteDuration => _.timeout(finite)
      }


.timeout(...) now supports Duration.Infinite so this helper shouldn't be needed anymore.

Yeah, I tried that, but it benchmarked a bit slower: the short circuit happens once with the current implementation, but on each call with the new CE support.

I'm amazed that makes such a difference 😅
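
To make the "once versus on each call" point concrete, here is a minimal sketch using only the Scala standard library. It is not the code from this PR: `Action`, `timed`, `wrapper`, and the `log` buffer are hypothetical stand-ins, and `timed` only records that the timeout branch was taken rather than enforcing a real timeout. The idea it illustrates is that the `match` on the duration runs a single time when the wrapper is built, so every later read just applies an already-selected function.

```scala
import scala.concurrent.duration._

object TimeoutWrapper {
  // Hypothetical stand-in for an effect: a thunk that produces a value when run.
  type Action[A] = () => A

  // Hypothetical stand-in for `_.timeout(finite)`. It only records that the
  // timeout branch was chosen; it does not enforce a real timeout.
  def timed[A](action: Action[A], limit: FiniteDuration, log: StringBuilder): Action[A] =
    () => { log.append(s"timeout=${limit.toMillis}ms;"); action() }

  // The match on the duration happens once, here. The returned function is
  // reused for every read, so the infinite case costs nothing per call.
  def wrapper[A](readTimeout: Duration, log: StringBuilder): Action[A] => Action[A] =
    readTimeout match {
      case _: Duration.Infinite   => identity
      case finite: FiniteDuration => action => timed(action, finite, log)
    }
}
```

With `Duration.Inf` the wrapper is plain `identity`, so wrapped reads run exactly as unwrapped ones; with a finite duration each read goes through `timed`.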

matthughes · 2023-04-03T12:36:15Z

modules/core/shared/src/main/scala/net/BitVectorSocket.scala

+
+      private def readUntilN(nBytes: Int, carry: Chunk[Byte]): F[BitVector] =
+        if (carry.size < nBytes) {
+          withTimeout(socket.read(8192)).flatMap {


Should this parameter be customizable?

I played with increasing it and it seemed to make no difference fwiw.

Yeah, I tried a bunch of values and settled on the same value fs2 uses for reads.

Might also need to play with SO_RCVBUF option to see an effect.

Did that too :)

I hope it's OK that I add my comments just as a visitor/user.
I noticed a performance improvement only when setting the SND/RCV buffers to a really huge value (8 MB). This assumes that the session pool is reasonably small (8-64, for instance). And it's probably a good idea to make it configurable.

Comments very welcome! The socket options are configurable now. The read size here isn’t but we could make it so. I’d like to show a case where it matters though before breaking compatibility by introducing a new param. Same for the background queue size in BufferedMessageSocket, which is hardcoded to 256 and doesn’t seem to fill.

Also, I suspect the receive buffer size may have more influence on performance when connecting to a remote Postgres database than when connecting to localhost.
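
For readers following the `readUntilN` hunk in this thread, the read-with-carry idea can be sketched roughly as below. This is an assumption-laden illustration, not the PR's implementation: it uses a blocking `java.io.InputStream` and `Array[Byte]` instead of the socket, `Chunk[Byte]`, and `BitVector` types in the diff, and the return shape (requested bytes plus leftover carry) and the `readSize` parameter are hypothetical. Only the 8192 default is taken from the diff.

```scala
import java.io.{ByteArrayInputStream, EOFException, InputStream}
import scala.annotation.tailrec

object ReadUntilN {
  // Hypothetical helper: returns exactly `nBytes` bytes plus any surplus that
  // was already read, which the caller passes back as `carry` next time.
  @tailrec
  def readUntilN(
      in: InputStream,
      nBytes: Int,
      carry: Array[Byte],
      readSize: Int = 8192 // same fixed read size as in the diff above
  ): (Array[Byte], Array[Byte]) =
    if (carry.length < nBytes) {
      // Not enough buffered yet: read up to `readSize` more bytes and retry.
      val buf = new Array[Byte](readSize)
      val n   = in.read(buf)
      if (n < 0)
        throw new EOFException(s"needed $nBytes bytes, stream ended with ${carry.length}")
      readUntilN(in, nBytes, carry ++ buf.take(n), readSize)
    } else {
      // Enough buffered: hand back the requested prefix and keep the rest.
      carry.splitAt(nBytes)
    }
}
```

The point of the carry is that one large read can satisfy many small requests, so a result set made of many short messages needs far fewer socket reads than one read per message.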

matthughes · 2023-04-03T12:37:24Z

modules/core/shared/src/main/scala/net/MessageSocket.scala

          bvs.read(5).flatMap { bits =>
-            val (tag, len) = header.decodeValue(bits).require
+            val tag = bits.take(8).toByte()


Are you just inlining the header decoder? Or why this change?

Avoids allocating a bunch of intermediate objects. Makes a small but consistent improvement on the benchmark.
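
As a rough illustration of decoding the 5-byte header by hand, the sketch below splits it into a one-byte tag and a four-byte big-endian length. It is an assumption, not the PR's code: the diff works on a `BitVector` (`bits.take(8).toByte()`), whereas this version uses `java.nio.ByteBuffer` from the standard library, and the `Header.decode` name is hypothetical. In the PostgreSQL wire protocol the length field counts its own four bytes but not the tag byte.

```scala
import java.nio.ByteBuffer

object Header {
  // Hypothetical helper: splits a 5-byte message header into (tag, length).
  // ByteBuffer reads big-endian by default, which matches network byte order.
  def decode(header: Array[Byte]): (Byte, Int) = {
    require(header.length == 5, s"expected 5 header bytes, got ${header.length}")
    val tag = header(0)                             // first byte: message type
    val len = ByteBuffer.wrap(header, 1, 4).getInt() // next four bytes: Int32 length
    (tag, len)
  }
}
```

Compared with running a composed codec, this reads the two fields directly and returns them without building intermediate decode results, which is the kind of saving described in the reply above.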

matthughes · 2023-04-03T12:42:41Z

modules/tests/jvm/src/test/scala/skunk/LargeResponseTest.scala

+// This software is licensed under the MIT License (MIT).
+// For more information see LICENSE or https://opensource.org/licenses/MIT
+
+package test


I think this should be tests, right?

It’s copied from QueryTest, but yeah, not sure why we have both test and tests.

Maybe this is too daring: BitVector is used to decode messages in the code above (and of course elsewhere). Maybe it would improve performance using ByteVector. To my knowledge the PG wire protocol's smallest unit is one byte. And ByteVector is really highly optimized. It would require a huge refactoring, though. The refactored code would be a lot simpler dealing with bytes instead of bits.

There shouldn’t be a significant difference and scodec’s Codec abstraction is built on BitVector.

Improve performance of reading large result sets

3a868c3

armanbilge reviewed Apr 2, 2023

matthughes reviewed Apr 3, 2023

matthughes approved these changes Apr 5, 2023

mpilquist merged commit f218281 into typelevel:main Apr 8, 2023
14 checks passed

mpilquist deleted the bugfix/833 branch April 8, 2023 12:33
