Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SqlgGraph do not close data source in case of error in open #322

Closed
vermaharsh opened this issue Dec 17, 2018 · 4 comments
Closed

SqlgGraph do not close data source in case of error in open #322

vermaharsh opened this issue Dec 17, 2018 · 4 comments
Labels

Comments

@vermaharsh
Copy link
Contributor

We are using SqlgGraph against a HA setup of postgres with one primary and another secondary (read only slave).

We have setup observers to notify when postgres fails-over and we re-construct the SqlgGraph using SqlgGraph.open(configuration).

Though, it is possible that secondary instance is still not ready to accept write traffic and SqlgGraph.open(configuration) fails. It can also happen when we are building the sqlg graph for the first time. So there are many paths in the code which can fail because postgres is not accepting write traffic.

In these scenarios, we retry opening the SqlgGraph if it fails. Though we are noticing that after few tries, postgres runs out of connections and fails with following error.

SEVERE: Connection error:
org.postgresql.util.PSQLException: FATAL: remaining connection slots are reserved for non-replication superuser connections

This is because SqlgDataSource is not closed on error paths and holds on to the connection pool until garbage collector sweeps it. Since it can be a while before garbage collector kicks in, we are seeing these no connection left error.

I tried a fix locally to explicitly close SqlgDataSource on error paths, but I do not correct setup locally and all tests are failing (as well few packages failed to build) with error.

SEVERE: Connection error: 
org.postgresql.util.PSQLException: Connection to localhost:9700 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
	at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:245)
	at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
	at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:195)
	at org.postgresql.Driver.makeConnection(Driver.java:452)
	at org.postgresql.Driver.connect(Driver.java:254)
	at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
	at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
	at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
	at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
	at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
	at com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
	at com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
	at com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
	at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:589)
	at org.postgresql.core.PGStream.<init>(PGStream.java:69)
	at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:156)
	... 13 more

Can you please look into this issue yourself or document development guide for this project?

@vermaharsh
Copy link
Contributor Author

@pietermartin I have created a pull request with the fix, but I am not able to run the end-to-end tests since I am not sure what are the steps for it. Can you please pull this branch and run end-to-end to verify nothing is broken before we can merge it it?

@pietermartin
Copy link
Owner

You should change the postgresql port to be able to test on your side.
The default port is 9700 and 9701/2/3 for running the CitusDB tests.

That said I think it will be better if I make CitusDB not run by default and use the regular postgresql port. That way the test will run without any configuring.

@pietermartin
Copy link
Owner

I merged the PR and ran the tests. However I have not yet written a test to cover this scenario so please test it on your side.
The part I am not fully understanding is what causes SqlgGraph to not open properly as it should open fine on a readOnly database. Maybe there is some special scenario with replication slots?

@vermaharsh
Copy link
Contributor Author

Thanks for merging it in. I have verified on my side that collections are no longer leaking in this scenario.

SqlgGraph is failing to open on readonly database when it is opened for the first time when it needs to create some schema and tables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants