Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of reading large result sets #835

Merged
merged 1 commit into from Apr 8, 2023

Conversation

mpilquist
Copy link
Member

Fixes #833

@codecov
Copy link

codecov bot commented Apr 2, 2023

Codecov Report

Merging #835 (3a868c3) into main (2a87b75) will increase coverage by 0.17%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #835      +/-   ##
==========================================
+ Coverage   84.83%   85.01%   +0.17%     
==========================================
  Files         126      126              
  Lines        1728     1742      +14     
  Branches      202      147      -55     
==========================================
+ Hits         1466     1481      +15     
+ Misses        262      261       -1     
Impacted Files Coverage Δ
...core/shared/src/main/scala/net/MessageSocket.scala 100.00% <ø> (ø)
...re/shared/src/main/scala/net/BitVectorSocket.scala 100.00% <100.00%> (ø)
...re/shared/src/main/scala/net/message/RowData.scala 93.33% <100.00%> (+13.33%) ⬆️

... and 1 file with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Comment on lines -46 to 43
val withTimeout: F[Chunk[Byte]] => F[Chunk[Byte]] = readTimeout match {
private val withTimeout: F[Option[Chunk[Byte]]] => F[Option[Chunk[Byte]]] = readTimeout match {
case _: Duration.Infinite => identity
case finite: FiniteDuration => _.timeout(finite)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.timeout(...) now supports Duration.Infinite so this helper shouldn't be needed anymore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I tried that but it benchmarked a bit slower since the short circuit happens once with current implementation but on each call with the new CE support.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm amazed that makes such a difference 😅


private def readUntilN(nBytes: Int, carry: Chunk[Byte]): F[BitVector] =
if (carry.size < nBytes) {
withTimeout(socket.read(8192)).flatMap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this parameter be customizeable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played with increasing it and it seemed to make no difference fwiw.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I tried a bunch of values and settled on same value fs2 uses in reads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might also need to play with SO_RCVBUF option to see an effect.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did that too :)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope it's ok that I add my comments just as a visitor/user.
I noticed a performance improvement only when setting SND/RCV buffers to a really huge value (8mb). This assumes that the session pool is reasonably small (8-64, for instance.). And it's probably a good idea to have it configurable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments very welcome! The socket options are configurable now. The read size here isn’t but we could make it so. I’d like to show a case where it matters though before breaking compatibility by introducing a new param. Same for the background queue size in BufferedMessageSocket, which is hardcoded to 256 and doesn’t seem to fill.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I suspect receive buffer size may be more influential on performance when connecting to a remote postgres database as opposed to local host.

bvs.read(5).flatMap { bits =>
val (tag, len) = header.decodeValue(bits).require
val tag = bits.take(8).toByte()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you just inlining the header decoder? Or why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoids allocations of a bunch of intermediate objects. Makes a small bit consistent improvement on the benchmark.

// This software is licensed under the MIT License (MIT).
// For more information see LICENSE or https://opensource.org/licenses/MIT

package test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be tests right? A

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s copied from QueryTest but yeah not sure why we have both test and tests.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is too daring: BitVector is used to decode messages in the code above (and of course elsewhere). Maybe it would improve performance using ByteVector. To my knowledge the PG wire protocol's smallest unit is one byte. And ByteVector is really highly optimized. It would require a huge refactoring, though. The refactored code would be a lot simpler dealing with bytes instead of bits.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn’t be a significant difference and scodec’s Codec abstraction is built on BitVector.

@mpilquist mpilquist merged commit f218281 into typelevel:main Apr 8, 2023
14 checks passed
@mpilquist mpilquist deleted the bugfix/833 branch April 8, 2023 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Retrieval performance
4 participants