Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: fix 1ms per async CopyAPI (regression since 42.2.5) #1314

Merged
merged 1 commit into from Oct 25, 2018

Conversation

Projects
None yet
3 participants
@vlsi
Copy link
Member

commented Oct 18, 2018

Let's see what breaks

@vlsi vlsi force-pushed the vlsi:fix_nonblocking_copy branch 6 times, most recently from 0586a6e to ea0ac40 Oct 18, 2018

@davecramer

This comment has been minimized.

Copy link
Member

commented Oct 19, 2018

@vlsi the reason that replication is failing now is that SslMode.of https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/jdbc/SslMode.java#L63-L71

is returning Prefer and setting up an SSL connection which your network timeout code attempted to fix.

Thoughts on how to fix this ?

@vlsi

This comment has been minimized.

Copy link
Member Author

commented Oct 19, 2018

Well, the idea of prefer is to use encrypted connection when possible.

In other words, even if we flip the switch and stick with non-encrypted by default (why would we?), the clients that use SSL would suffer.

I have not reviewed "replication" tests, however the tests might suffer from the similar problem: async read might always immediately return, so replication stream should likely use timed-polling.

@davecramer

This comment has been minimized.

Copy link
Member

commented Oct 19, 2018

I'm not suggesting flipping the switch.

But in the test case I did not ask for an ssl replication connection, but received one anyway.
replication protocol uses the copy protocol so yes, it suffers from the same problem. The question is how to solve this without the performance regression in Copy ?

I expect we need a background thread reading ..

@vlsi

This comment has been minimized.

Copy link
Member Author

commented Oct 19, 2018

  1. Replication stream sends periodic heartbeats to the server:


    and we might force boolean replyRequired=true when readPending is called.
    That would force server to generate some traffic, and it would eventually push out relevant messages that are sitting in the queues.

  2. We could augment org.postgresql.replication.LogicalReplicationTest#receiveMessageWithoutBlock to prevent "non-blocking" mode. For instance it could do a true .pick() call once per 1 second or something like that. We might want to add that once per 1second (or once per 10sec) check to the PGStream.available.

  3. It might be relevant to the way we configure SSL.

@davecramer

This comment has been minimized.

Copy link
Member

commented Oct 19, 2018

regarding 3: no, it's the implementation.. https://bugs.openjdk.java.net/browse/JDK-7134707
One thought it to use your code only if the wrapped connection is an SSLConnection ?

@vlsi vlsi force-pushed the vlsi:fix_nonblocking_copy branch 2 times, most recently from 810f446 to 2e0db8a Oct 21, 2018

perf: fix 1ms per async CopyAPI (regression since 42.2.5)
PGStream#hasMessagePending should not perform network requests otherwise it causes per row delay in async copy

fixes #1312

@vlsi vlsi force-pushed the vlsi:fix_nonblocking_copy branch from 2e0db8a to e6228a5 Oct 24, 2018

@codecov-io

This comment has been minimized.

Copy link

commented Oct 24, 2018

Codecov Report

Merging #1314 into master will increase coverage by 0.04%.
The diff coverage is 88.88%.

@@             Coverage Diff              @@
##             master    #1314      +/-   ##
============================================
+ Coverage     68.54%   68.58%   +0.04%     
- Complexity     3880     3883       +3     
============================================
  Files           178      178              
  Lines         16199    16205       +6     
  Branches       2645     2646       +1     
============================================
+ Hits          11104    11115      +11     
+ Misses         3863     3859       -4     
+ Partials       1232     1231       -1
@vlsi

This comment has been minimized.

Copy link
Member Author

commented Oct 24, 2018

@davecramer , this passes the tests at a cost of:

  1. Once per second it performs blocking 1ms call for regular copy ops
  2. replication "non-blocking" calls always perform "1ms reads" behind the scene

Any thoughts?

@vlsi vlsi changed the title Fix nonblocking copy perf: fix 1ms per async CopyAPI (regression since 42.2.5) Oct 24, 2018

@vlsi vlsi added this to the 42.2.6 milestone Oct 24, 2018

@vlsi vlsi merged commit e2623d6 into pgjdbc:master Oct 25, 2018

2 checks passed

codecov/project 68.58% (+0.04%) compared to e44e4e8
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.