Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document how to keep a pool connection healthy #300

Closed
wmertens opened this issue Dec 4, 2015 · 18 comments
Closed

Document how to keep a pool connection healthy #300

wmertens opened this issue Dec 4, 2015 · 18 comments

Comments

@wmertens
Copy link

wmertens commented Dec 4, 2015

I am trying to use a pool as follows:

  1. at application start, create a Promise "PoolP" for a pool
  2. for every query:
    1. use PoolP to get the pool and request a connection
    2. with the connection, perform a query
    3. always release the connection

This has the following issues:

  • After a suspend and resume the following day, I got the following error when trying to query:

    Error: ORA-03113: end-of-file on communication channel
    Process ID: 5051
    Session ID: 69 Serial number: 140733193389459
    at Error (native)
    

    I suppose I should make a wrapper that intercepts this error and creates a new pool? Is this expected behavior, and are there other error conditions to handle?

  • if a connection is active, the application will keep running in the background when killed, holding open ports etc. I tried adding a SIGINT handler that terminates the pool but that just errors out if a connection is active

Thoughts? Should I just create a connection for every query and not bother with the pool?

@fedulovivan
Copy link

Had faced with similar issue but exposed with another exception - ORA-03135: connection lost contact. Probably the following will be helpful.

Our failing scenario:

  • Initialize connection pool with poolMin=0, poolMax=5, poolTimeout=0, which means oracledb driver will never terminate already initialized connections, and keep them in pool infinitely.
  • Execute any query using connection from pool, release it. Connection is initialized from now, but not busy.
  • After some time (usually next working day) make attempt to execute next query. Node application is still running.
  • Node oracledb driver attempts to utilize connection from pool and gains ORA-03135 error, since connection is already not valid due to some network-related issues.

Also in our setup connection could not be terminated on the side of Oracle DB, since options connection_timeout or connection_expires are not used.

Fix:
Remove zero 'poolTimeout' value from pools initialization code, and using default one - 60 seconds.

@cjbj
Copy link
Member

cjbj commented Dec 7, 2015

Theoretically apps should be coded to handle net dropouts which could happen at any point of execution.

@wmertens your case seems a bit specialized. I assume you are suspending/resuming a laptop? Using a non-default pool timeout recommended by @fedulovivan could help. Or you may want to restart the app, or even use non-pooled connections, depending on the app requirements.

The way the connection pool would work in the case where the net drops out is that previously established connections returned from the pool will give an error at use, such as the ORA errors quoted. Until the connection is used, it isn't possible to tell whether it is still valid. On error, you can release it back to the pool, which will fully clean up the connection. See https://github.com/oracle/node-oracledb/blob/v1.4.0/src/dpi/src/dpiConnImpl.cpp#L678 and https://github.com/oracle/node-oracledb/blob/v1.4.0/src/dpi/src/dpiConnImpl.cpp#L514

Next, call pool.getConnection to get a new connection. But beware that each connection in the pool will be invalid. If you get an existing (invalid) connection you have to repeat the release/getConnection sequence until you get a newly created connection.

We have thought about introducing a 'ping' mechanism to check whether pool.getConnection connections are still valid before they are returned to the app. It might work something like http://php.net/manual/en/oci8.configuration.php#ini.oci8.ping-interval The downside is that pinging reduces overall system efficiency because it adds a roundtrip. And since apps really needs to be able to handle arbitrary net drop-outs, is the added simplicity worth it? Many people would say yes!

@sagiegurari
Copy link

I think its a very important feature.
a connection may get invalid during a flow due to net issues, but to expect the app to do getConnection over and over and to check if the connection is valid for each time until it finds a valid connection doesn't seem logical. seems like a must for a pool to provide such functionality.

@steunix
Copy link

steunix commented Dec 7, 2015

Wait, do you mean that getConnection could return an invalid connection? That's awkward, to say the least...

@cjbj
Copy link
Member

cjbj commented Dec 7, 2015

@steunix only from a pool. And as I tried to explain, the use case is no different from the pool giving you a valid connection and then the network dropping out before you use the connection.

@steunix
Copy link

steunix commented Dec 7, 2015

I see your point... just it looks a bit illogic to me: delegating the connection phase to a pool should indeed free the consumer at least from the initial "problems" that may occur in the connection. Just my 2 cents.

@sagiegurari
Copy link

I think it is far from being the same.
in case of an issue in the middle of some exection, you might decide to cancel the entire flow.
but getconnection is a starting point, and we expect to get a valid connection (unless the pool can't at all) at the end and not some invalid connection.
otherwise, whats the difference between the native oracle pool and just using node generic-pool module? what is the added value?

@cjbj
Copy link
Member

cjbj commented Dec 7, 2015

@sagiegurari a failure at the start or the middle has the same end result: the action/code/block/app has failed and the app developer needs to decide whether to re-run or display an error message. But let's not argue. I've already marked this as an enhancement request.

The OCI session pool (which is used for the node-oracledb connection pool) already has advantages over hand-rolled pools, for instance the session pool can handle FAN events when a RAC node fails and will kill idle pool sessions so invalid sessions won't be returned by getConnection. (Note FAN would need to be enabled).

@sagiegurari
Copy link

I think its different because if i do some operation and there is an error, i should get an error object.
in this case, no error object but there is a connection object.
so obviously this means that everything is ok, and connection is valid.
its confusing otherwise, and I hope it will be implemented as part of the OCI as it makes sense.

anyhow in meantime I updated the simple-oracledb wrapper to provide this functionality because I had it in the past and I assumed that moving to oracledb i would get it automatically.

https://github.com/sagiegurari/simple-oracledb#usage-getconnection

@wmertens
Copy link
Author

wmertens commented Dec 8, 2015

@cjbj how about providing a convenience helper that implements an app-wide pool for the lifetime of the application?

One would simply require it and get connections from it. Only a severe error (like timeout when trying to reconnect to Oracle) would make it to the connection handler.

You could also make it easy to release a connection: Provide a callback to be called when the connection is no longer needed.

So the api could be

import {singlePool} from 'oracledb'

singlePool.getConnection((err, conn, cb) => {
  // Do something with conn
  // Done
  cb()
})

or even, with Promises, it would take a function that gets a connection and returns a Promise for when it is done using the connection.

import {promisePoolConn} from 'oracledb'

promisePoolConn(conn => {
  // Do something with conn and return a result or a Promise for a result
})
.then(
  result => console.log(result),
  err => console.error('database or query handler issue', err)
)

Alternatively, the Connection returned from the Pool could be smarter and handle the connectivity failure by replacing itself with a fresh connection. This doesn't require pinging and behaves like I would expect. I suppose this is the best solution but I just typed all the above so I'm keeping that too 😉.

@wmertens
Copy link
Author

wmertens commented Dec 8, 2015

Hoisting last comment part up for better exposure:

Alternatively, the Connection returned from the Pool could be smarter and handle the connectivity failure by replacing itself with a fresh connection. This doesn't require pinging and behaves like I would expect.

@cjbj
Copy link
Member

cjbj commented Dec 8, 2015

I'm going to get myself in knots if I keep trying to explain; and since a configurable ping is on the todo list we're all going to end up happy (specially if someone submits a PR!).

@wmertens the connection validity of a once-used-and-now-released pooled connection returned by a subsequent getConnection isn't known until it is first (re)used so you do need to ping: we don't waste packets by default (FAN is different) on checking the status. The assumption is that the connection is going to be valid most of the time, and your app anyway needs to be able to handle errors that occur at any point of its execution. If you want to impact scalability by pinging, then you need to factor the impact into your application performance (& manager's budget) expectations.

@sagiegurari
Copy link

can't stand c++ so can't do an official PR, that's why I implemented it on top via js.

but you are aware that oracle weblogic does enable to check connections before returning from the pool when doing jndi lookup to a datasource.
http://docs.oracle.com/cd/E23943_01/apirefs.1111/e13952/pagehelp/JDBCjdbcdatasourcesjdbcdatasourceconfigconnectionpooltitle.html
See items: Test Table Name and Test Connections On Reserve
Enables WebLogic Server to test a connection before giving it to a client. (Requires that you specify a Test Table Name.)

@xpiwo
Copy link

xpiwo commented Dec 8, 2015

i agree that the 'low level' error should stay, and allow the app to decide what it wants to do.

but i would like to see a convenience flag of 'retry N time' (with brand new connection) if failed to make a first connection (that way you don't need pinging).

because right now if you have a pool with 10 connections, and the the db is reset, you will need to cycle all 10 until you get a valid one.

such errors can still be logged, in the node way oracle.on('error', function(err){}).

@sagiegurari
Copy link

agree, thats why in my wrapper i allow you to define the retry count.
otherwise you will have a lot of errors in big pools and its a big code duplication to do it everywhere in app level

@jeffm13
Copy link

jeffm13 commented Jun 20, 2016

@cjbj, adding a ping as an option might be interesting, but of limited value to relatively high-volume use cases. Although I thought we might be able to remove our wrapper once 1.9 was released, I'm reluctant. We're seeing a number of ORA-03113 errors on production servers that are satisfying hundreds of thousands of requests a day. When we hit that error, we retry the operation. That seems to be a feasible solution although it's too early to tell. Maybe a 'retry on failure' option would be of value?

@cjbj
Copy link
Member

cjbj commented Jun 21, 2016

@jeffm13 if you're seeing those on active pools, the issue sounds different. Are you getting trace files?

@cjbj
Copy link
Member

cjbj commented Dec 3, 2016

I jus released node-oracledb 1.12.0-dev to GitHub (not npm). It has the timed connection pool pinging we planned - this has worked well elsewhere. Note that there is an even better quality-of-service if you link node-oracledb with Oracle 12.2 client libraries; this extra benefit is something the team has championed to be included in 12.2's OCI session pool. Oracle 12.2 is currently available on Oracle Database Cloud.

Check the node-oracledb doc out for the feature details!

Also, since this issue was opened, we have the Pool queue, which makes scaling easier and pool usage more resilient. The only important part for apps to do is to make sure to release connections back to the pool in all code paths (including error handlers)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants