Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #638 Implement support for get/setNetworkTimeout(). #849

Merged
merged 1 commit into from Nov 23, 2017

Conversation

brettwooldridge
Copy link
Contributor

@brettwooldridge brettwooldridge commented Jun 24, 2017

HikariCP logs a warning at connection pool startup when a driver is found to not support get/setNetworkTimeout() -- due to vulnerability to getting stuck during a network disruption or partition event.

Given our user-base of several hundred thousand users, the warning message has turned into a support burden, with users concerned about its meaning. One solution would be to do away with the warning, but that doesn't really serve the user.

So, here is the implementation of said feature -- I should have done this a year ago out of self-preservation.

@brettwooldridge
Copy link
Contributor Author

@vlsi I am not sure why this was originally flagged as "concurrency hard". Ultimately, this call just boils down to a Socket.setSoTimeout() call. Setting the timeout affects the next blocking operation.

The primary purpose of the addtion to the JDBC specification was to prevent connections hung by unacknowledged TCP packets due to a network disruption. It is no substitue for a query timeout, and it is ultimately easy for a user to shoot themselves in the foot. Fortunately, most users rarely if ever call this directly (clearly pgjdbc users have never called it), but infrastructure libraries like HikariCP make use of it in a controlled manner.

@codecov-io
Copy link

codecov-io commented Jun 24, 2017

Codecov Report

Merging #849 into master will increase coverage by 0.01%.
The diff coverage is 71.42%.

@@             Coverage Diff              @@
##             master     #849      +/-   ##
============================================
+ Coverage     65.94%   65.95%   +0.01%     
- Complexity     3561     3567       +6     
============================================
  Files           166      166              
  Lines         15244    15264      +20     
  Branches       2465     2467       +2     
============================================
+ Hits          10052    10067      +15     
- Misses         4022     4025       +3     
- Partials       1170     1172       +2

@vlsi
Copy link
Member

vlsi commented Jun 24, 2017

@brettwooldridge , the question is what would you do in case of TimeoutException?
Do you expect the connection to automatically terminate?

@brettwooldridge
Copy link
Contributor Author

brettwooldridge commented Jun 26, 2017

@vlsi Ah, that is actually a good question. I assumed the existing code paths would already handle socket-level errors. According to the JavaDoc:

Sets the maximum period a Connection or objects created from the Connection will wait for the database to reply to any one request. If any request remains unanswered, the waiting method will return with a SQLException, and the Connection or objects created from the Connection will be marked as closed. Any subsequent use of the objects, with the exception of the close, isClosed or Connection.isValid methods, will result in a SQLException.

If you want, I can investigate the various code paths to ensure that the exception is handled properly.

EDIT: I kind of thought this handling was already in-place, as the driver supports socketTimeout. Both the socketTimeout and setNetworkTimeout() result in setting Socket.setSoTimeout(), so the error/exception should be identical in both cases from the socket-level.

@brettwooldridge
Copy link
Contributor Author

brettwooldridge commented Jun 26, 2017

@vlsi Well ouch, that was not a very happy string to pull. 😭

I thought I would throw in some unit tests. But this one is failing:

try {
   _conn.setNetworkTimeout(null, (int) TimeUnit.SECONDS.toMillis(1));
   stmt = _conn.createStatement();
   stmt.execute("SELECT pg_sleep(2)");
   fail("Connection.setNetworkTimeout() did not throw expected exception");
}
catch (SQLException e) {
   assertTrue(stmt.isClosed());       <--- HERE
   assertTrue(_conn.isClosed());
}

If that line is removed, the test passes.

My assumption was that the Connection would be closed by a socket-level exception -- and it is, which is Goodtm.

So, then I added the assertion that the Statement had been closed as well. It is not. This is Badtm...

According to the JDBC Specification:

All Statement objects will be closed when the connection that created them is closed. However, it is good coding practice for applications to close statements as soon as they have finished processing them.

So looking at PgConnection:1190 I was expecting to find some kind of Statement tracking; I have to do something similar in HikariCP. However, it appears that none of the create/prepareStatement() calls track the Statement objects in any way -- they create and forget.

The lack of resource tracking precludes honoring the JDBC specification contracts, not only with regard to Connections closing Statements, but also Statements closing ResultSets. A Connection.close() should cascade down, closing open statements, which in turn close open result sets. Again, from the specification:

Closing a Statement object will close and invalidate any instances of ResultSet produced by that Statement object.
...
These comments about closing Statement objects apply to PreparedStatement and CallableStatement objects as well.

What I propose is this:

  • Remove the assertion relating to the Statement state (unclosed), it is unrelated to this pull request.
  • Open an issue to track the um...lack of tracking of Connection managed resources.

EDIT: As a side note, it would be nice if pgjdbc supported the specific JDBC exception types such as SQLTimeoutException and SQLNonTransientConnectionException.

@kscaldef
Copy link

This PR seems to be in limbo. Are there any plans to push forward with it?

@davecramer
Copy link
Member

This is sort of up to @brettwooldridge ?

@kscaldef
Copy link

@davecramer Should I interpret that to mean that the project owners approve of this change and will merge it once:

  1. it's been rebased
  2. the indentation is fixed to make checkstyle happy
  3. the tests are migrated to the jdbc41 test suite, since they are using

@davecramer
Copy link
Member

I'm going to say we won't consider it without it. That said I see no reason to not accept it. Although there was an outstanding question of how to handle the timeout exception.

@brettwooldridge
Copy link
Contributor Author

@davecramer @kscaldef I have rebased and moved the NetworkTimeoutTest to the JDBC 4.1 suite.

The issue of proper resource cleanup is independent of this pull request and is being tracked separately by #855. I do have a branch with changes that address that issue, but it is currently unfinished.

@kscaldef
Copy link

@brettwooldridge can you also adjust the indentation so that checkstyle is happy? I'm guessing the other failure was a heisentest, but I'm not sure. Sorry to pester you about this, but this issue (failure to recover from a network outage) has bitten us multiple times, and I'd really love to see it fixed.

@brettwooldridge
Copy link
Contributor Author

@kscaldef Sure, I'll take a look this week; 5am here in Tokyo now.

@vlsi vlsi added the triage/needs-review Issue that needs a review - remove label if all is clear label Sep 25, 2017
@brettwooldridge
Copy link
Contributor Author

I think such a check is pointless.

SecurityException is a perfectly reasonable exception to be thrown; better in fact than a SQLEception that wraps such an exception. Checked exceptions by their nature must be wrapped, but that is not true of unchecked exceptions.

What should definitely not occur is to return from this method without setting the timeout and without throwing an exception.

@jorsol
Copy link
Member

jorsol commented Oct 2, 2017

@brettwooldridge , well it's probably pointless, but the jdbc spec/javadoc requires it:

This method checks to see that there is an SQLPermission object before allowing the method to proceed. If a SecurityManager exists and its checkPermission method denies calling setNetworkTimeout, this method throws a java.lang.SecurityException.

A simple check like this should do the trick:

SecurityManager secMgr = System.getSecurityManager();
if (secMgr != null) {
  secMgr.checkPermission(new SQLPermission("setNetworkTimeout"));
}

@davecramer
Copy link
Member

davecramer commented Oct 2, 2017 via email

@jorsol
Copy link
Member

jorsol commented Oct 2, 2017

won't it throw one anyway if you don't have permission ?

No, the driver has to explicitly check for it.

@davecramer
Copy link
Member

davecramer commented Oct 2, 2017 via email

@jorsol
Copy link
Member

jorsol commented Oct 2, 2017

@davecramer , I'm not aware of the internals of the Security Manager, and in most applications a Security Manager policy is not used or enabled, but the JDBC spec requires that some methods do this check.

There is a SQLPermission class that contains more info.

@brettwooldridge
Copy link
Contributor Author

@jorsol @davecramer There is no need to check this permission. The SecurityManager will throw a SecurityException whether we check for it or not; that is the point of having a SecurityManager.

The only reason we would check if such a permission exists is in the case where the code would somehow need to behave differently upon denial. That is not the case with setNetworkTimeout() -- we have no fallback logic -- and therefore such a check is unnecessary. Trust me, I do have code in another project whose behavior is altered in the presence of a SecurityManager, but again that is not the case here.

@brettwooldridge
Copy link
Contributor Author

Checkstyle issues fixed for NetworkTimeoutTest.java.

@jorsol
Copy link
Member

jorsol commented Oct 2, 2017

@brettwooldridge, I may have misunderstood how the SecurityManager works, but AFAIK this is an implementation code, not a consumer code, so in your case your fallback logic is as consumer code.

I have checked some others drivers and they implement the check, and even the DriverManager wich is a class of the JDK has it.
See http://hg.openjdk.java.net/jdk9/jdk9/jdk/file/a9a5d14bfbb4/src/java.sql/share/classes/java/sql/DriverManager.java#l385

As I said, I have not yet tested it with a SecurityManager enabled, but I just want to be sure that the code works in compliance with what the JDBC spec says, so an additional check that might not be needed is better than an missing check that is needed.

@brettwooldridge
Copy link
Contributor Author

@jorsol Ok, I'll buy that. Check added, consummate with checks done by other drivers.

Copy link
Member

@jorsol jorsol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brettwooldridge
Copy link
Contributor Author

@davecramer @kscaldef I believe this pull request is ready to go.

Copy link

@johnou johnou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@vlsi vlsi removed the triage/needs-review Issue that needs a review - remove label if all is clear label Oct 20, 2017
@jorsol
Copy link
Member

jorsol commented Oct 30, 2017

Is there any reason to hold this on? Is there a tentative milestone to merge this?

@johnou
Copy link

johnou commented Nov 9, 2017

ping @vlsi any changes required to get this pulled in?

@dominicgunn
Copy link

+1 one for this, caused some concern in my team.

@davecramer
Copy link
Member

davecramer commented Nov 10, 2017 via email

@johnou
Copy link

johnou commented Nov 22, 2017

@davecramer I believe the integration test covers it well enough.

@davecramer davecramer merged commit 8a30044 into pgjdbc:master Nov 23, 2017
@brettwooldridge
Copy link
Contributor Author

🎉

openstack-gerrit pushed a commit to openstack-infra/gerrit that referenced this pull request Aug 8, 2018
Besides the usual bug fixes and optimizations, the more recent versions
implement missing features like being able to set network timeouts to
avoid connections hung due to network disruptions [1].

[1] pgjdbc/pgjdbc#849

Change-Id: Idb8b0376fc24b1d050b72057303c551d36fa931b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants