Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection problem when a lot of connection requests comes up in milliseconds. #1493

Closed
alichampion opened this issue Apr 19, 2022 · 23 comments
Labels
inactive This was marked by stalebot as inactive question

Comments

@alichampion
Copy link

  1. What versions are you using?
  • process.platform win32
  • process.version v16.14.0
  • process.arch x64
  • require("oracledb").versionString 5.3.0
  • require("oracledb").oracleClientVersionString 19c
  1. Is it an error or a hang or a crash?
    Yes it's an error and hang as well.

  2. What error(s) or behavior you are seeing?

Hi, @dmcghan I'm working on a project where 30-40 requests come up within milliseconds.

And here's the flow at a high level...

  • Create pools on the application bootstrap/start phase.
  • For every request, we get a pool and then get the connection.
  • We perform a query and close the connection.

This all works fine till a point when a lot of new connection open requests pop up connection gets timed out and the connection open takes some 15-90 seconds and sometimes a few milliseconds.

Now my belief in the above explanation is, that DB is unable to handle such an amount of connections and gets timed out but the DBA does not accept this explanation.

So what I did now, is I created a singleton class for both pools and used it, so in this case, a connection will open once and we will not close It from the code now with this implementation I know the connection will be expired or can be broken due to various reasons so singleton class should be so flexible that it can update the connection in singleton class and it's being done using oracledb.connection.ping().

But in this approach, I feel that query execution is slow for each request what do you think?
And when 250+ subsequent requests lands on a single connection it gives ORA-01000 error.

Can you check how can I optimize the queries?

I have attached the OralceDB.txt file which holds JS code for my singleton OracleDB class.

OracleDb.txt

@cjbj
Copy link
Member

cjbj commented Apr 19, 2022

@alichampion can you give the pool size values and also what size you have set for UV_THREADPOOL_SIZE? Also show how you set this environment variable.

What error numbers do you get?

What do your pool statistics show at the time your app has a problem?

Review all the node-oracledb documentation on pooling for some good information.

From a quick look at your code, you seem to be setting pool aliases but not using the aliases to access the pools themselves. This is probably OK (e.g. for logging) but it seems to be adding complexity. With pools you must close connections when you don't need them. Your original statement said you were doing this but your later experiment with "a connection will open once and we will not close It from the code now" isn't going to be usable or scalable.

@cjbj cjbj added question and removed bug labels Apr 19, 2022
@alichampion
Copy link
Author

Hi, @cjbj
As you can see it's being set in app.js line no 7

Screenshot 2022-04-19 at 12 24 15 PM

When working with a single connection upon a lot of requests in a millisecond I get an ORA-01000 error and working with multiple connections upon a lot of requests I get an ORA-03135 error and sometimes we get an ORA-2396 error.

And yes there is another file from where we create pools and open them on application startup.

I had concluded that the error is due to the connection problems with the DB as I told in my question, so I decided to open only one connection and don't close it but later I realized that queries are taking much time so IDK now what to do.

I have also thought of using Database Resident Connection Pool (DRCP) but we have some limitations from the DBA side so we can not do this.

Thanks.

@cjbj
Copy link
Member

cjbj commented Apr 19, 2022

Since you're on Windows setting UV_THREADPOOL_SIZE inside the app has no effect. Even on Linux it's better to set it before the apps starts. See the doc: https://oracle.github.io/node-oracledb/doc/api.html#numberofthreads

Can you give the pool creation parameter values?

Have you reviewed how the pool alias is used in https://github.com/oracle/node-oracledb/blob/main/examples/connectionpool.js and https://github.com/oracle/node-oracledb/blob/main/examples/webapp.js ?

@alichampion
Copy link
Author

Hi, @cjbj

let me try setting it up before the app start.

Here are the pool creation parameters and unfortunately, I can't share the values.

However, the poolMax value is 150.

Screenshot 2022-04-19 at 1 06 20 PM

And yes I have reviewed the documentation and I'm using pool aliases same as mentioned there.

Thanks

@alichampion
Copy link
Author

Hi, @cjbj is it possible for you to have a quick call with me?

@alichampion
Copy link
Author

Hey, @cjbj I have also noticed that when 250 concurrent requests are completed successfully and I send another 250 concurrent requests my docker container exits with error code 137 which means Not enough Ram is available.

I'm unsure if the docker exit problem is related to the open cursor.

Can you assist me, please?

Thanks in advance.

@cjbj
Copy link
Member

cjbj commented Apr 20, 2022

Did you sort out the UV_THREADPOOL_SIZE setting? How are you now setting it?

Did you have solid architectural reasons for using dynamic pool sizes (where poolMin is not the same as poolMax) and the (small) pool increment size? See Connection Pool Sizing and try using fixed size pools.

Why do you need two pools?

Regarding cursors, are you using bind variables properly? I haven't seen anyone mention ORA-1000 for a long time so my first thoughts are that the app isn't pooling properly. For what it's worth, you can check out the old post https://blogs.oracle.com/opal/post/node-oracledb-avoiding-ora-01000-maximum-open-cursors-exceeded

Please share the pool stats I asked for before (Note that the stats show the UV_THREADPOOL_SIZE environment variable value but the Node.js thread pool may have started with a lower size - see the doc)

You might like to read https://cjones-oracle.medium.com/always-use-connection-pools-and-how-909bc2c65444

@alichampion
Copy link
Author

Thanks, @cjbj I'll check your recommendations and let you know.

@alichampion
Copy link
Author

Hi, @cjbj check this out.
{
"gatheredDate": 1650513341362,
"upTime": 57495868,
"upTimeSinceReset": 57482531,
"connectionRequests": 750,
"requestsEnqueued": 312,
"requestsDequeued": 0,
"failedRequests": 0,
"rejectedRequests": 0,
"requestTimeouts": 312,
"maximumQueueLength": 63,
"currentQueueLength": 0,
"timeInQueue": 125141,
"minimumTimeInQueue": 139,
"maximumTimeInQueue": 1110,
"averageTimeInQueue": 401,
"connectionsInUse": 55,
"connectionsOpen": 150,
"connectString": "",
"edition": "",
"events": false,
"externalAuth": false,
"homogeneous": true,
"poolAlias": "prime",
"poolIncrement": 1,
"poolMax": 150,
"poolMaxPerShard": 0,
"poolMin": 150,
"poolPingInterval": 60,
"poolTimeout": 60,
"queueMax": 500,
"queueTimeout": 60,
"sodaMetaDataCache": false,
"stmtCacheSize": 30,
"user": "",
"threadPoolSize": "350"
}

I have followed your blog recommendations except idle connection killing, I feel I can not do this as it has to be done by DBA.

@cjbj
Copy link
Member

cjbj commented Apr 21, 2022

Hard to comment without knowing the load profile.

Do some debugging and check you are closing connections. The stats show queuing but no dequeueing. So what is holding onto connections? Or is UV_THREADPOOL_SIZE not set.

Overall, reduce all the sizes (pool size, application load) to simplify the scenario and try again.

@alichampion
Copy link
Author

I can not understand what you mean by load profile?

I'm releasing the connection as soon as the query is completed rather than closing it.

I have set the UV_THREADPOOL_SIZE in Dockerfile before starting the node process I believe that's the best way right?

@pvenkatraman
Copy link

The connectString, user fields are empty. It should show some valid values.

@alichampion
Copy link
Author

I did it myself before sharing the stats.

@alichampion
Copy link
Author

Hi, @cjbj I did some modifications like used OC-21, close connection after query is completed.

And here is the stats logs.

{
"gatheredDate": 1650875761037,
"upTime": 109012,
"upTimeSinceReset": 88087,
"connectionRequests": 250,
"requestsEnqueued": 100,
"requestsDequeued": 0,
"failedRequests": 0,
"rejectedRequests": 0,
"requestTimeouts": 100,
"maximumQueueLength": 69,
"currentQueueLength": 0,
"timeInQueue": 34456,
"minimumTimeInQueue": 228,
"maximumTimeInQueue": 822,
"averageTimeInQueue": 345,
"connectionsInUse": 150,
"connectionsOpen": 150,
"connectString": "*****",
"edition": "",
"events": false,
"externalAuth": false,
"homogeneous": true,
"poolAlias": "",
"poolIncrement": 1,
"poolMax": 150,
"poolMaxPerShard": 0,
"poolMin": 150,
"poolPingInterval": 60,
"poolTimeout": 60,
"queueMax": 500,
"queueTimeout": 60,
"sodaMetaDataCache": false,
"stmtCacheSize": 30,
"user": "
",
"threadPoolSize": "350"
}

I don't understand why 100 requests were requestsEnqueued and 100 requests were requestTimeouts.

To me I think Oracle should maintain the queue it self.

Let me know what should I do?

@alichampion
Copy link
Author

Hi, @cjbj check this out u can see 150 is poolMin size but opened connections are 131 only and u can also see thread pool size.

{
"gatheredDate": 1650888622747,
"upTime": 625843,
"upTimeSinceReset": 604582,
"connectionRequests": 250,
"requestsEnqueued": 0,
"requestsDequeued": 0,
"failedRequests": 0,
"rejectedRequests": 0,
"requestTimeouts": 0,
"maximumQueueLength": 0,
"currentQueueLength": 0,
"timeInQueue": 0,
"minimumTimeInQueue": 0,
"maximumTimeInQueue": 0,
"averageTimeInQueue": 0,
"connectionsInUse": 0,
"connectionsOpen": 131,
"connectString": "",
"edition": "",
"events": false,
"externalAuth": false,
"homogeneous": true,
"poolAlias": "
",
"poolIncrement": 1,
"poolMax": 150,
"poolMaxPerShard": 0,
"poolMin": 150,
"poolPingInterval": 60,
"poolTimeout": 90,
"queueMax": 500,
"queueTimeout": 0,
"sodaMetaDataCache": false,
"stmtCacheSize": 30,
"user": "
**",
"threadPoolSize": "350"
}

@alichampion
Copy link
Author

alichampion commented Apr 25, 2022

Hi, @cjbj

I also got this weird ORA-3156 error code.

error on 1234567_222 [Error: DPI-1080: connection was closed by ORA-3156] { errorNum: 0, offset: 0 }

@alichampion
Copy link
Author

Hi, @cjbj

I also got this weird ORA-3156 error code.

error on 1234567_222 [Error: DPI-1080: connection was closed by ORA-3156] { errorNum: 0, offset: 0 }

@cjbj just a nudge on this one.

@cjbj
Copy link
Member

cjbj commented May 8, 2022

@alichampion I've was on vacation. How are you going with this?

@alichampion
Copy link
Author

Can you go through the last four questions/explanations?

@stale
Copy link

stale bot commented Jun 12, 2022

This issue has been automatically marked as inactive because it has not been updated recently. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the inactive This was marked by stalebot as inactive label Jun 12, 2022
@stale
Copy link

stale bot commented Jun 20, 2022

This issue has been automatically closed because it has not been updated for a month.

@stale stale bot closed this as completed Jun 20, 2022
@jatinseth2007
Copy link

Hello @cjbj,

I am facing the same issue on linux machine - DPI-1080: connection was closed by ORA-2396,
"oracledb": "^5.4.0"

connection settings ->
poolMax: 25, poolMin: 5, poolIncrement: 5, poolTimeout: 1800, poolPingInterval: 300

One thing, I wanted to understand, if we have poolMin as 5, poolTimeout as 1800, and poolPingInterval 300, in documentation, it's written as after poolTimeout seconds it will kill ideal connections and bring them back to poolMin but what about the connections it's holding as poolMin, will it kill those as well and recreates new to match poolMin?

Our Oracle has a setting to kill connections that are ideal for more than an hour, that's why we kept the poolTimeout as half of their time but still connections went to this error state.

@cjbj
Copy link
Member

cjbj commented Apr 14, 2023

@jatinseth2007 I opened a new issue for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
inactive This was marked by stalebot as inactive question
Projects
None yet
Development

No branches or pull requests

4 participants