
Connection.execute quietly hangs without calling the callback #481

Closed
balabanshik opened this issue Jul 25, 2016 · 7 comments
@balabanshik

TL;DR In some cases, connection.execute() can just hang indefinitely without calling the callback.

Slightly more details:
We have code that inserts a bunch of records into a bunch of tables. Some time ago, we noticed an unpleasant number of orphan records. So we wrapped the whole process into a transaction. A few days after that, we noticed that autoCommit was set in the options, so we set that to false. That's when the whole process started to hang. Vigorous debugging led to a failing query (a dependent record being created outside of a transaction, causing a foreign key constraint failure) that simply does not call its callback at all.

Process:
The business logic is isolated by a model layer, so the query execution code is fairly generic. The overall sequence of events is roughly as follows:
1. Create a transaction connection (T).
2. Insert a master object inside T; let's call it 'user'.
3. Insert a few other rows (a dozen or so) into a couple of other tables, all using the user ID acquired above, and all but one operating inside T.
4. One of the inserts uses an incorrect connection (i.e. it attempts to use the main connection, not the transaction); this should produce an integrity constraint error, but the query never finishes and never returns to the callback. A sketch of this sequence follows the list.
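To make the sequence concrete, here is a minimal sketch of it (the `pool` variable, table names and bind values are invented for illustration; the real code goes through the generic model-layer helper shown further down):

```js
// Minimal sketch of the failing sequence; names are illustrative only.
pool.getConnection(function (err, txConn) {            // transaction connection T
  if (err) return console.error(err)
  txConn.execute(
    'INSERT INTO users (id, name) VALUES (:id, :name)',
    { id: 1, name: 'alice' },
    { autoCommit: false },                             // parent row stays uncommitted in T
    function (err) {
      if (err) return console.error(err)
      // The bug being described: the dependent insert runs on a different
      // connection, outside T, so it cannot see the uncommitted parent row.
      pool.getConnection(function (err, otherConn) {
        if (err) return console.error(err)
        otherConn.execute(
          'INSERT INTO user_settings (user_id, name, value) VALUES (:uid, :n, :v)',
          { uid: 1, n: 'theme', v: 'dark' },
          { autoCommit: false },
          function (err, result) {
            // In the failing case, execution never reaches this callback.
            console.log('dependent insert finished', err, result)
          })
      })
    })
})
```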

Technical specs:
Initially, this was observed using a fairly old set of software:
OS X,
oracle binaries macos.x64-11.2.0.4.0,
oracledb module 1.2.0
Then we updated to oracledb 1.10.1 and more recent binaries, but that didn't change the behaviour.

Code snippet (some parts removed for simplicity):

```js
// options, isTransaction and the connection helpers are defined elsewhere
// (removed for simplicity).
function query (sql, values, fn) {
  var queryParts = sql.split(' ')
  var queryWord = queryParts[0]
  getOracleConnection(function (error, connection) {
    if (error) {
      process.log.warn('getOracleConnection failed with: ' + JSON.stringify(error))
      return fn(error)
    }
    process.log.info(queryWord + ' QUERY executing')  /// LINE 1
    connection.execute(sql, values, options, function (error, result) {
      process.log.info(queryWord + ' QUERY resolved')  /// LINE 2
      if (!error) {
        return fn(undefined, result)
      }
      process.log.warn('SQL Statement Failed: \n' + sql + '\n' + JSON.stringify(values), error)
      if (isTransaction && isConnectionError(error.message)) {
        // Our transaction has permanently failed.
        closeOracleConnection(function (error) {
          if (error) {
            process.log.warn('Failed to close Oracle connection.', error)
          }
        })
        return fn(error)
      }
      return fn(error)
    })
  })
}
```
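For context, a call site looks roughly like this (the statement and binds here are hypothetical, just to show the shape of the callback):

```js
// Hypothetical caller; fn receives (error, result) exactly as in the helper above.
query('INSERT INTO user_settings (user_id, name, value) VALUES (:uid, :n, :v)',
      { uid: 42, n: 'theme', v: 'dark' },
      function (error, result) {
        if (error) return process.log.warn('insert failed', error)
        process.log.info('rows affected: ' + result.rowsAffected)
      })
```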

So what happens is that there would be a bunch of console lines about queries executing (LINE 1) and resolving (LINE 2), but the failing query only produces the first of them, never the second. Therefore, for whatever reason, the callback is not called. Further, in some cases Node itself becomes unstable and does not respond to ^C. This hints at the snag being somewhere in the native code.

@cjbj
Member

cjbj commented Jul 25, 2016

Are you locking up all the threads? See https://github.com/oracle/node-oracledb/blob/master/doc/api.md#numberofthreads
If you're using a pool, what does _logStats() show?
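If it helps, here is a quick sketch of how to dump those numbers (statistics need to be enabled when the pool is created, and _logStats() is an internal, undocumented method in these releases; credentials below are placeholders):

```js
// Sketch: enable and print pool queue statistics (internal API).
oracledb.createPool(
  {
    user: 'scott',              // placeholder credentials
    password: 'tiger',
    connectString: 'localhost/XE',
    _enableStats: true          // collect queue statistics
  },
  function (err, pool) {
    if (err) throw err
    // ... run the workload, then, while it hangs:
    pool._logStats()            // prints a statistics block like the one below
  })
```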

@cjbj cjbj added the question label Jul 25, 2016
@balabanshik
Author

balabanshik commented Jul 25, 2016

Yes, we are using a pool. Here are the stats, just after running the last (hanging) query:

Pool statistics:
...total up time (milliseconds): 40398
...total connection requests: 2
...total requests enqueued: 0
...total requests dequeued: 0
...total requests failed: 0
...total request timeouts: 0
...max queue length: 0
...sum of time in queue (milliseconds): 0
...min time in queue (milliseconds): 0
...max time in queue (milliseconds): 0
...avg time in queue (milliseconds): 0
...pool connections in use: 2
...pool connections open: 2
Related pool attributes:
...queueRequests: true
...queueTimeout (milliseconds): 60000
...poolMin: 1
...poolMax: 16
...poolIncrement: 1
...poolTimeout (seconds): 30
...stmtCacheSize: 30
Related environment variables:
...process.env.UV_THREADPOOL_SIZE: undefined

The zeroes are not very inspiring. Did I do something wrong?

Changing UV_THREADPOOL_SIZE doesn't seem to affect anything; setting it to 20, for example, makes no difference.
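For reference, this is roughly how I'm setting it (from the shell before starting Node, since libuv reads the variable when its thread pool is first created):

```js
// From the shell, before starting the process:
//   UV_THREADPOOL_SIZE=20 node app.js
//
// Setting it at the very top of the script, before anything touches the
// libuv thread pool, may also work on some platforms:
process.env.UV_THREADPOOL_SIZE = '20'
var oracledb = require('oracledb')
```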

@cjbj
Member

cjbj commented Jul 26, 2016

If you are getting blocked with only 2 connections, perhaps you are simply deadlocking yourself? Sounds like a chance to do some more debugging.
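One quick check while it is hanging (a sketch, assuming an account that can read V$SESSION): see whether one session is simply waiting on a lock held by the other.

```js
// Sketch: while the insert is hanging, look for blocked sessions.
// Requires privileges to query V$SESSION.
connection.execute(
  'SELECT sid, serial#, blocking_session, event ' +
  'FROM v$session WHERE blocking_session IS NOT NULL',
  [], {},
  function (err, result) {
    if (err) return console.error(err)
    // Non-empty rows here mean an ordinary lock wait, which hangs
    // indefinitely, rather than an ORA-00060 deadlock.
    console.log(result.rows)
  })
```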

@balabanshik
Author

  1. None of the queries involved touch the same tables
  2. None of the queries use locking of any kind. The only queries are inserts and selects (without 'for update')
  3. Wouldn't it fail with ORA-00060 after a little bit, instead of hanging indefinitely?
  4. Even if no error is encountered, shouldn't there be a timeout of some sort, to ensure the callback gets called?

@dmcghan

dmcghan commented Jul 26, 2016

@balabanshik I know it's a pain, but could you try to put together a reproducible test case? Without that we have to wonder about how you might have written your application and where the issue really exists.

Chris' comments on deadlocking may be spot on. Try setting your UV_THREADPOOL_SIZE to a higher number, like 20 or 30. Test and let us know how that goes.

See #395 for more details.

@cjbj
Member

cjbj commented Jul 26, 2016

@balabanshik You could also experiment with turning autocommit on and see what effect that has.
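For example, either globally or on the individual execute() call (a quick sketch; `sql` and `binds` are whatever statement you are running):

```js
// Global default for all executions:
oracledb.autoCommit = true

// ... or per statement:
connection.execute(sql, binds, { autoCommit: true }, function (err, result) {
  // ...
})
```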

@balabanshik
Author

Well, I've spent some time trying to find the suggested deadlock and couldn't find any. In the meantime, we refactored some things and the problem somehow stopped. Wish I had a more enlightening resolution.
