Performance: Two database queries are executed for each invocation of query() #14

jprafael · 2015-04-20T17:09:51Z

When invoking the AbstractDatabaseEngine#query(Expression expr) method two database queries are executed.

The first call is invoked with the following call stack:

AbstractDatabaseEngine#checkConnection()
AbstractDatabaseEngine#getConnection()
AbstractDatabaseEngine#iterator(String query, int fetchSize)

This query is only used to check the aliveness of the connection. Because of this call each query() incurs in twice the round-trip time. It is also somewhat useless since it doesn't guarantee that the intended operation will succeed because the connection might be dropped meanwhile.

The text was updated successfully, but these errors were encountered:

diogoeag · 2015-04-21T10:11:42Z

The idea behind this check is that the probability of a failure during the execution of a query (not only iterators) should be very similar to the probability of the failure of the check connection. With this we could retry the connection before starting executing the desired operation.

It is true that the rtt is doubled, the question is that if that impacts for real the performance of your application

rpvilao · 2015-04-21T10:17:20Z

For me it seems that makes sense to add an option to control whether you
want or not for that check to be performed. That might some scenarios where
reconnection is not needed and it's not important to recover.

On 21 April 2015 at 12:11, diogoeag notifications@github.com wrote:

The idea behind this check is that the probability of a failure during the
execution of a query (not only iterators) should be very similar to the
probability of the failure of the check connection. With this we could
retry the connection before starting executing the desired operation.

It is true that the rtt is doubled, the question is that if that impacts
for real the performance of your application

—
Reply to this email directly or view it on GitHub
#14 (comment).

defer · 2015-04-21T11:40:17Z

I think it could be optimistic, try the query. If it fails, just do the check, retries and try again.

Then again if the rtt is a problem, there's probably something wrong with the client design anyway because it should be either batching for writes or reading asynchronously while other work completes.

jprafael · 2015-04-21T12:44:44Z

@defer, @diogoeag PDB has no (truly) asynchronous methods. We are mimicking this using a separate thread-pool. But note that query() is not a cheap call. Our profiling shows that on a total of 16 seconds spent inside the checkConnection() function, only 4s are spent blocking for IO. The remaining 12 seconds are spent on the driver's internal functions and will keep a CPU thread busy. This prevents us from simply raising the number of processing threads until it is no longer a problem as it would degrade the performance on other components.

More importantly, the response latency is doubled which is a problem when PDB is used in a realtime environment. In this scenario batching is also sub-optimal because you either use very small batches and the RTT becomes a problem, or you use larger batches and the batch fill time itself becomes a problem.

I agree we should be optimistic and use the available connection. I would actually go a bit further and say that reconnects / retries should not be the responsibility of query() but of its caller (which might have specialized recovery/failure behavior).

diogoeag · 2015-04-21T12:53:24Z

@jprafael these 16s are during what period of profilling? If its 20seconds its one thing, if its one day, I wouldn't care about it. Note that some of our HA capabilities (the ones related to database connections) are strictly tied to the feature of retry / recover of PDB. That will not change for now.

Can you elaborate on the real weight of this on the overall process?

defer · 2015-04-21T13:02:46Z

I'm not entirely sure you should be depending on pdb for latency sensitive workloads. However, if completion of the task is not a requirement for the caller to finish then yeah, either pdb or the client should behave asynchronously.

That being said, 12s of internal work seems odd and should probably be looked at. Do you have more granular data on what is taking that long?

jprafael · 2015-04-21T14:36:00Z

@diogoeag the JVM was running for 8 mins but the application was running only on the last 4 mins (wall time) of profiling on the entire application. Total thread time amounts to 15000s but is mostly irrelevant since most threads are blocking/waiting. Total executing time of the application however is 914s. So this amounts to roughly 1.64% of the application time.

@defer It is mostly oracle.jdbc.driver.* related. I can upload an HTML export of the measurements of this particular function. Where should I send it to?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance: Two database queries are executed for each invocation of query() #14

Performance: Two database queries are executed for each invocation of query() #14

jprafael commented Apr 20, 2015

diogoeag commented Apr 21, 2015

rpvilao commented Apr 21, 2015

defer commented Apr 21, 2015

jprafael commented Apr 21, 2015

diogoeag commented Apr 21, 2015

defer commented Apr 21, 2015

jprafael commented Apr 21, 2015

Performance: Two database queries are executed for each invocation of query() #14

Performance: Two database queries are executed for each invocation of query() #14

Comments

jprafael commented Apr 20, 2015

diogoeag commented Apr 21, 2015

rpvilao commented Apr 21, 2015

defer commented Apr 21, 2015

jprafael commented Apr 21, 2015

diogoeag commented Apr 21, 2015

defer commented Apr 21, 2015

jprafael commented Apr 21, 2015