PgJDBC can experience client/server deadlocks during batch execution #194
PgJDBC can encounter client/server deadlocks during batch execution, where the server is waiting for the client and the client is waiting for the server. Neither can progress and one must be terminated.
The client cannot continue until the server consumes some input from the server's receive buffer (the client's send buffer).
The server cannot continue until the client consumes some input from the client's receive buffer (the server's send buffer).
Each is blocked trying to send to the other. Neither can receive until the other sends.
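The buffer-filling half of this can be demonstrated in isolation. The following is a hedged sketch (not the driver's code): it writes into a loopback connection whose peer never reads, using a non-blocking channel so that the full send buffer shows up as `write() == 0` instead of the indefinitely blocked `write()` that produces the actual deadlock.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class SendBufferFillDemo {
    /**
     * Write into a loopback connection whose peer never reads, until the
     * local send buffer (plus the peer's receive buffer) is full. Returns
     * the number of bytes queued before the socket stalled. The channel is
     * non-blocking, so a full buffer appears as write() == 0 rather than a
     * blocked write; in the real deadlock, both ends are stuck in exactly
     * this state with blocking sockets.
     */
    public static long fillUntilStall() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()
                     .bind(new InetSocketAddress("127.0.0.1", 0));
             SocketChannel sender = SocketChannel.open(server.getLocalAddress());
             SocketChannel receiver = server.accept()) {   // never read from
            sender.configureBlocking(false);
            ByteBuffer chunk = ByteBuffer.allocate(64 * 1024);
            long total = 0;
            int wrote;
            do {
                chunk.clear();
                wrote = sender.write(chunk);
                total += wrote;
            } while (wrote > 0);
            return total;
        }
    }
}
```

The returned total is roughly the sender's SO_SNDBUF plus the receiver's SO_RCVBUF, which is the "400kb of buffer space" referred to below.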
PgJDBC tries to prevent this case from arising with some heuristics in its batch facilities where it attempts to limit the number of queries that may be queued; see
The main reason that deadlocks are rare is that the 64 kB buffer size is now unrealistically small; on my Linux system, default buffers are 200 kB for both send and receive, giving us 400 kB of buffer space to work with.
I've produced a very artificial test case showing that a deadlock can still occur; see
The client's stack looks like:
The server's stack looks something like:
To reduce the likelihood of tripping this bug, PgJDBC doesn't queue batches that return result sets, such as a
One possible option for making deadlocks impossible is covered in brief by issue #163 - using non-blocking sockets with java.nio. However, it's likely to be intrusive.
An alternative to completely changing the data exchange mechanism is to instead get PgJDBC to manage its send buffer properly. PgJDBC currently ignores its send buffer and tries to manage the server's buffer. This is backwards.
The only buffer PgJDBC can completely control is its own send buffer. So what we really need to do is avoid blocking on writing to that if we know that there's already a pending query response. (If there's no pending query it's fine to block; the server will continue consuming our input even if there's an error.)
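The accounting this implies could look something like the following sketch. This is hypothetical, not the driver's actual API; the class, method names, and the ">= buffer size" trigger are all illustrative:

```java
/**
 * Hypothetical sketch of send-buffer accounting: track bytes queued since
 * the last Sync, and report when queuing one more message could fill our
 * own send buffer while a query response is already pending (the only
 * situation in which a blocking write can deadlock).
 */
public class SendQueueGuard {
    private final int sendBufferSize;     // e.g. from Socket.getSendBufferSize()
    private int queuedSinceLastSync = 0;
    private boolean responsePending = false;

    public SendQueueGuard(int sendBufferSize) {
        this.sendBufferSize = sendBufferSize;
    }

    /** True if writing msgSize more bytes might block on a full send buffer
     *  while the server also has a response queued for us. */
    public boolean mustSyncBefore(int msgSize) {
        return responsePending && queuedSinceLastSync + msgSize >= sendBufferSize;
    }

    public void recordSend(int msgSize, boolean expectsResponse) {
        queuedSinceLastSync += msgSize;
        if (expectsResponse) responsePending = true;
    }

    /** Called after sending Sync and draining all pending responses. */
    public void syncCompleted() {
        queuedSinceLastSync = 0;
        responsePending = false;
    }
}
```

When `mustSyncBefore` fires, the driver would sync, drain the server's responses, and only then continue queueing.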
Using Non-blocking reads/writes with java.nio streams?
Java doesn't expose any API to query the available space in the TCP send buffer, and there's no portable way to query it from the underlying platform. You need Linux-specific hacks like SIOCOUTQ.
In java.nio (since Java 1.4) there's now the option of creating a non-blocking
We could guarantee that it's safe to read from the receive stream by forcing the server to send more data, by writing a
Even if we solved the SSL issue and got a guaranteed non-blocking input stream too, we'd have to muck around with a control loop that select()s the next readable/writeable socket and pipelines more data. This is complicated by the fact that the output socket might still be writable, just not with the message size we want. So doing this with a non-blocking approach would require a pretty major change to the driver.
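To give a feel for the control loop that would be required, here is a minimal, self-contained select() loop. It pumps a payload through a `java.nio.channels.Pipe` rather than a PostgreSQL connection, and it does not handle the "writable, but not by a whole message" problem beyond tolerating partial writes; a real driver loop would be considerably more involved:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

public class SelectLoopSketch {
    /**
     * Pump `payload` through a pipe with a select() loop: write when the
     * sink is writable, read when the source is readable. Partial writes
     * are retried on the next writable notification. Returns the number
     * of bytes received.
     */
    public static int pump(byte[] payload) throws IOException {
        Pipe pipe = Pipe.open();
        pipe.sink().configureBlocking(false);
        pipe.source().configureBlocking(false);

        Selector selector = Selector.open();
        pipe.sink().register(selector, SelectionKey.OP_WRITE);
        pipe.source().register(selector, SelectionKey.OP_READ);

        ByteBuffer out = ByteBuffer.wrap(payload);
        ByteBuffer in = ByteBuffer.allocate(8192);
        int received = 0;

        while (received < payload.length) {
            selector.select();
            for (SelectionKey key : selector.selectedKeys()) {
                if (key.isWritable() && out.hasRemaining()) {
                    pipe.sink().write(out);                 // may be partial
                    if (!out.hasRemaining()) key.cancel();  // done sending
                } else if (key.isReadable()) {
                    in.clear();
                    received += Math.max(0, pipe.source().read(in));
                }
            }
            selector.selectedKeys().clear();
        }
        selector.close();
        pipe.sink().close();
        pipe.source().close();
        return received;
    }
}
```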
Writing up to the send buffer size, then syncing and flushing
Instead, we can just avoid blocking on the socket by never filling the send buffer without ending it in a
This is deadlock-proof, but greatly limits the number of big queries that PgJDBC can pipeline in a batch. Currently with an assumed 250 byte reply and 64k buffer PgJDBC assumes it can safely pipeline 64000 / 250 = 256 queries before needing to sync and consume input.
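The same arithmetic with the real, larger buffer sizes discussed above gives correspondingly more headroom. A trivial sketch of the heuristic (the 250-byte assumed reply and the buffer sizes come from this discussion):

```java
public class PipelineDepth {
    /**
     * How many queries the driver assumes it can queue before it must
     * Sync and consume input, given a buffer size and an assumed
     * worst-case per-query reply size.
     */
    static int maxQueued(int bufferBytes, int assumedReplySize) {
        return bufferBytes / assumedReplySize;
    }

    public static void main(String[] args) {
        System.out.println(maxQueued(64000, 250));   // current assumption: 256
        System.out.println(maxQueued(200_000, 250)); // with a 200 kB buffer: 800
    }
}
```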
If we instead use the real send buffer size on a typical system, as determined by poking in the driver's guts reflectively, e.g.:
I can see that my default is
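For reference, the configured capacity of a socket's send buffer is available through the standard `java.net.Socket` API without reflection (reflection would only be needed to reach the driver's internal socket instance). Note this is the buffer's total size, not its free space, which Java cannot report:

```java
import java.io.IOException;
import java.net.Socket;

public class SendBufSize {
    /**
     * SO_SNDBUF on a fresh socket: the send buffer's total capacity.
     * Java exposes no way to query the *free* space in that buffer,
     * which is why the driver must do its own accounting.
     */
    public static int defaultSendBufferSize() throws IOException {
        try (Socket s = new Socket()) {   // unconnected is fine for this
            return s.getSendBufferSize();
        }
    }
}
```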
That's still a lot of sanely sized queries, and bigger queries will be less affected by round trip costs anyway. So we should consider moving deadlock prevention logic from attempting to control the server's send buffer to trying to control our own send buffer. That's much safer, and lets us safely batch prepared statements that request generated keys.
We had this problem with Maven artifact org.postgresql:postgresql:9.4.1208.jre6. The stacktrace on the client is:
The server just shows in pg_stat_activity that it is executing some SQL statement that is part of the batch, with waiting == false. In one case, that statement was even one that was supposed to return zero rows (because the corresponding table was empty).
We only seem to have the problem when the client runs on the same host as the server (connection using TCP from some IP address to the same IP address, not necessarily 127.0.0.1).
We only ever had this at our clients that use Windows (but it might also occur on other operating systems).
We are trying to find a way to tweak the settings so as to avoid this problem.
Thx for the report.
I would suggest upgrading, as rewriteInsert in 1209 makes batch inserts much
On 16 September 2016 at 04:51, Nicolas Barbier email@example.com
Agreed that it's not resolved.
I looked into using a separate thread, but couldn't find much clarity on how threads interact with JDBC drivers and what the rules are there: for example, how would we reliably ensure our receive-pumping thread is terminated when the connection is GC'd and closed? I expect we can rely on the shared TCP socket for that, though.
It's probably not that hard, and likely the sensible solution. Java is already so heavily threaded that nobody's going to get upset if we spawn a thread. Some care will be required to make sure the new thread gets the same classloader as the spawning thread to work properly in containerized environments, but that's well established.
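A sketch of what such a receive-pumping thread might look like (hypothetical, not the driver's implementation): a daemon thread drains the connection's input stream so the server's send buffer can't fill while we are busy writing. Reading until EOF ties the thread's lifetime to the socket, as suggested above: closing the connection ends the stream and the thread exits.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

public class ReceivePump implements Runnable {
    private final InputStream in;
    private final AtomicLong drained = new AtomicLong();

    public ReceivePump(InputStream in) { this.in = in; }

    public long drainedBytes() { return drained.get(); }

    @Override public void run() {
        byte[] buf = new byte[8192];
        try {
            int n;
            while ((n = in.read(buf)) != -1) {  // EOF when the socket closes
                drained.addAndGet(n);
                // A real pump would parse protocol messages and queue
                // results for the querying thread here.
            }
        } catch (IOException closed) {
            // Socket closed underneath us: the expected shutdown path.
        }
    }

    public Thread spawn(ClassLoader driverLoader) {
        Thread t = new Thread(this, "receive-pump");
        t.setDaemon(true);                      // never blocks JVM exit
        t.setContextClassLoader(driverLoader);  // per the classloader concern above
        t.start();
        return t;
    }
}
```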
I'm a bit unsure why I dismissed a threaded solution when I looked into this before.