
Batch returning support #204

Merged 5 commits into pgjdbc:master on Dec 1, 2014


@ringerc ringerc commented Oct 10, 2014

Remove the restriction that a batch that returns generated keys is executed internally as individual queries.

With this change PgJDBC now executes batches that return generated keys in internal sub-batches separated by periodic Sync, flush, and input-consume pauses, just like it does for batches that don't return keys.

This is safe (or almost as safe as it ever was given github #194) because we force a Describe, then estimate the result row size and adjust the batch size accordingly.

This approach does increase the deadlock risk if a batch executes statements where each statement returns many generated keys (e.g. a big multi-entry VALUES clause or an INSERT INTO ... SELECT ...), as it assumes each execution will only return one generated key. That's a fairly reasonable assumption, given that the lack of intra-statement ordering guarantees means you can't reliably associate generated keys to the values that generated them unless you run one statement per generated result. I don't expect this to be an issue in practice. In any case, anyone who's doing this is likely to be doing so as an attempt to work around the very limitation this commit fixes.

Fixes github #195.
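From application code, the path this change speeds up is the standard JDBC generated-keys API. A minimal usage sketch (the connection URL, credentials, and table are hypothetical, and running it requires a live PostgreSQL server):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class BatchReturningExample {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; adjust for your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "secret")) {
            // One inserted row per execution keeps the mapping from
            // generated keys to input rows unambiguous, per the
            // discussion above.
            PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO items(name) VALUES (?)",
                    Statement.RETURN_GENERATED_KEYS);
            for (String name : new String[] {"a", "b", "c"}) {
                ps.setString(1, name);
                ps.addBatch();
            }
            // Previously sent as individual queries; now batched on the wire.
            ps.executeBatch();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                while (keys.next()) {
                    System.out.println(keys.getLong(1));
                }
            }
        }
    }
}
```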

ringerc added 5 commits Oct 10, 2014
The tests for batches didn't cover series of DML statements well.
Per GitHub issue #194 and the comments above MAX_BUFFERED_QUERIES,
we're using a pretty rough heuristic for receive buffer management.

This approach can't account for the data in prepared queries that
return generated keys; it assumes a flat 250 bytes per query.

Change that: count the buffer in bytes, up to an estimated MAX_BUFFERED_RECV_BYTES
(still 64k, same as before), at an estimated NODATA_QUERY_RESPONSE_SIZE_BYTES
of 250 bytes per query.

Behaviour is not changed; we're just counting bytes instead of queries. This
means a subsequent patch can adjust the baseline
NODATA_QUERY_RESPONSE_SIZE_BYTES for the size of any columns returned by a
query with a RETURNING clause.
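The byte-based accounting described in this commit can be sketched as a small calculation. The constant names mirror those in the commit message, but the method below is an illustrative sketch, not PgJDBC's actual code:

```java
// Hypothetical sketch of the byte-counting heuristic described above.
// Constant names follow the commit message; values are as stated there.
public class BatchBufferBudget {
    static final int MAX_BUFFERED_RECV_BYTES = 65536;        // ~64k, same as before
    static final int NODATA_QUERY_RESPONSE_SIZE_BYTES = 250; // flat per-query estimate

    // How many queries may be queued before a Sync, given an estimated
    // per-query response size. Counting bytes rather than queries lets a
    // later patch substitute a larger estimate for RETURNING queries.
    static int maxQueriesBeforeSync(int estimatedResponseBytes) {
        int perQuery = Math.max(estimatedResponseBytes, 1);
        return Math.max(1, MAX_BUFFERED_RECV_BYTES / perQuery);
    }

    public static void main(String[] args) {
        // With the flat 250-byte estimate this matches the old
        // query-count heuristic: 65536 / 250 = 262 queries per sub-batch.
        System.out.println(maxQueriesBeforeSync(NODATA_QUERY_RESPONSE_SIZE_BYTES));
    }
}
```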
With this change, PgJDBC calculates the estimated return size for the result
row, assuming that each batch item returns one row for getGeneratedKeys() use.

Instead of forcing batching off and sending individual queries, it then queues
enough binds and executes to fit safely within the receive buffer and server's
send buffer before sending a sync and reading the result buffer.

Batching is still forced off when the result set size cannot be estimated (e.g.
due to unbounded 'text' columns, use of arrays in the result set, etc).

This is a minimal workaround for github issue #195, though it doesn't prevent
deadlocks that can already occur due to #194.
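The estimate-or-disable decision described in this commit might look roughly like the following. Method and constant names are hypothetical, and the per-field overhead is an assumption for illustration:

```java
import java.util.List;

// Hypothetical sketch of the estimate-or-disable decision described above.
// A max length of -1 stands in for an unbounded column type (e.g. 'text'
// or an array), for which no safe size estimate exists.
public class ReturningSizeEstimate {
    static final int UNBOUNDED = -1;
    static final int PER_FIELD_OVERHEAD = 4; // assumed per-field length word

    // Returns the estimated bytes for one generated-keys result row, or
    // UNBOUNDED if any column's size cannot be bounded — in which case
    // the caller falls back to executing the batch as individual queries.
    static int estimateRowBytes(List<Integer> columnMaxLengths) {
        int total = 0;
        for (int len : columnMaxLengths) {
            if (len == UNBOUNDED) {
                return UNBOUNDED;
            }
            total += PER_FIELD_OVERHEAD + len;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(estimateRowBytes(List.of(8)));            // bigint key: 12
        System.out.println(estimateRowBytes(List.of(8, UNBOUNDED))); // unbounded: -1
    }
}
```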
ringerc added a commit that referenced this pull request Dec 1, 2014
Add limited support for returning generated columns in batches

See individual commits for notes on the limitations. This isn't a perfect solution to issues #194 and #195, but it'll work well for most people.
@ringerc ringerc merged commit 83c8c3b into pgjdbc:master Dec 1, 2014
1 check passed
continuous-integration/travis-ci The Travis CI build passed