Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARTEMIS-2955 commons-dbcp2 performance issue with Derby Embedded DBMS #3308

Merged
merged 1 commit into from Oct 20, 2020

Conversation

franz1981
Copy link
Contributor

This commit is fixing:

  • a missing commit that can make leak a connection
  • restricting default specific commons-dbcp2 to the default data source
  • setting poolPreparedStatements true by default
  • configured embedded Derby to be in-memory to speedup tests

This commit is fixing:
- a missing commit that can make leak a connection
- restricting default specific commons-dbcp2 to the default data source
- setting poolPreparedStatements true by default
- configured embedded Derby to be in-memory to speedup tests
@franz1981
Copy link
Contributor Author

@gtully @uomik That's it
Opened for discussion

@gtully
Copy link
Contributor

gtully commented Oct 20, 2020

looks good

@gtully gtully merged commit 28d9038 into apache:master Oct 20, 2020
@uomik
Copy link
Contributor

uomik commented Oct 20, 2020

Looks good to me, too.

One thought, though. With dbcp2, are poolPreparedStatements and maxTotal such properties that should always set to default values unless explicitely defined in datasource properties?

@franz1981
Copy link
Contributor Author

@uomik

One thought, though. With dbcp2, are poolPreparedStatements and maxTotal such properties that should always set to default values unless explicitely defined in datasource properties?

Yes exactly, to avoid mixing default with configured ones.

@uomik
Copy link
Contributor

uomik commented Oct 20, 2020

One thought, though. With dbcp2, are poolPreparedStatements and maxTotal such properties that should always set to default values unless explicitely defined in datasource properties?

Yes exactly, to avoid mixing default with configured ones.

Now those are only configured if no custom properties are set. But if any custom property is set these will not be applied, unless one adds them to broker.xml

Just a sidenote. I was never quite happy with the solution I had to do for setting custom properties for datasource in broker.xml. It's just too complicated, but had to do in this way to maintain backwards compatibility.

@franz1981
Copy link
Contributor Author

@uomik this seems a nice future enhancement to ease configuring it :)
You know that PRs are welcome and you already did a nice job ;)
In case I will do it or help on this 🙂

@uomik
Copy link
Contributor

uomik commented Oct 22, 2020

Setting poolPreparedStatements = true has proven to be a little problematic. The random database connection losses have made a comeback. The nodes randomly reboot themselves due to connection issues when renewing live locks. There always seem to be a batch of large messages going through the failing node when this happens.

It seems that the eviction runs don't work as expected when this setting is used, or there is some undocumented limit for open cursors for a db connection, and the connection is exhausted with too many open prepared statements. Anyway, this is not a general issue, but most likely caused by our MySQL server that has aggressive connection reset procedures.

I will experiment with datasource configurations and will get back to this

@uomik
Copy link
Contributor

uomik commented Oct 30, 2020

Setting poolPreparedStatements = true has proven to be a little problematic. The random database connection losses have made a comeback.

This indeed was again an issue with our database. It seems – quite obviously – that open prepared statements keep the connection active and prevent them to appear on idle connection eviction runs. The fix was to simply set maximum live time for db connections. Now everything works like a charm :)

@franz1981
Copy link
Contributor Author

franz1981 commented Oct 30, 2020

HI @uomik! Sorry I've missed your first comment on

It seems – quite obviously – that open prepared statements keep the connection active and prevent them to appear on idle connection eviction runs

We're talking about default pool configuration right?
It means maxTotal is infinite, testWhileIdle is false, poolPreparedStatements is true and maxIdle is 8.

My expectations are:

  1. if a page/large-message load would happen, the number of connections could grow unbounded (along with the related prepared statements)
  2. Given that testWhileIdle is false, there is no object evictor that would test any idle connection until borrowed again

But

that open prepared statements keep the connection active and prevent them to appear on idle connection eviction runs

That seems more a pool "hidden" logic: given that a connection has some prepared statement it cannot be assumed as idle, despite none is checking that the connection is still valid.
That's correct? I'm going to run some test to verify that.

I don't see yet the correlation with lock renew failures if you're using the latest master: see https://issues.apache.org/jira/browse/ARTEMIS-2941 for more info.

@uomik
Copy link
Contributor

uomik commented Oct 30, 2020

Correct. With default pool configuration, we had issues. But by using more aggressive settings everything works fine. And all this because our database closes connections after one hour, being active or not. This is not a general issue, I suppose. The extra pool settings are:

  • timeBetweenEvictionRunsMillis: "900000"
  • numTestsPerEvictionRun: "3"
  • testWhileIdle: "true"
  • testOnReturn: "true"
  • testOnBorrow: "true"
  • maxWaitMillis: "60000"
  • minIdle: "0"
  • maxConnLifetimeMillis: "1800000"
  • removeAbandonedOnMaintenance: "true"

And yes, the snapshot we are using was made before ARTEMIS-2941. Unfortunately we will have to wait until the official 2.16.0 release until update, for "organisational" reasons.

@franz1981
Copy link
Contributor Author

franz1981 commented Oct 30, 2020

@uomik
what's your maxTotal? Are you caching prepared statements?

Because with the default value we set (ie infinite) maxWaitMillis shouldn't be necessary.
The doc says

...
when there are no available connections

But I think it means that it await that time before giving up using some of the connections in the pool, not to create new ones...not sure TBH

And yes, the snapshot we are using was made before ARTEMIS-2941. Unfortunately we will have to wait until the official 2.16.0 release until update, for "organisational" reasons.

Got it: just wanna be sure that is fine for default users of 2.16 to have
poolPreparedStatements true and maxOpenPreparedStatements infinite .
If not we're still in time to send a PR with some better default configuration...

@uomik
Copy link
Contributor

uomik commented Oct 30, 2020

Accidentally removed maxTotal: "50" from above configuration snippet. Setting maxTotal and maxOpenPreparedStatements to finite values is just because there is a connection and cursor limit for each user set on our database. Again, not a general requirement. We did run the cluster with default values for a while and had no issues, these are there just to be sure. We are caching prepared statement, poolPreparedStatements is set to true

Sorry for creating this much confusion. My attempt here was just to clarify and confirm that this change had no negative impact on us. In our case we just needed some special configurations to keep the eviction runs working

@franz1981
Copy link
Contributor Author

@uomik

We did run the cluster with default values for a while and had no issues

Hence without object eviction? I'm still running some test now :)

Sorry for creating this much confusion. My attempt here was just to clarify and confirm that this change had no negative impact on us. In our case we just needed some special configurations to keep the eviction runs working

you've helped and you're helping a lot, it's fine :)

@uomik
Copy link
Contributor

uomik commented Oct 30, 2020

Hence without object eviction? I'm still running some test now :)

Actually I was just referring to default values for maxTotal and maxOpenPreparedStatements. Without eviction runs we need to restart the Artemis instaces once in an hour. Not that practical ;)

@franz1981
Copy link
Contributor Author

@uomik another couple of quick questions..

maxWaitMillis: "60000"

that should be applied in case of an empty (limited) pool, but it means that a borrow operation could take 1 minute awaiting a new connection to be released in the pool, that seems quite bad for the lock renew, that should happen on timely periods...it's correct?
Consider that in the new version I've forced a query timeout of renew to be within the renew period before failing, to get a timely check...

maxConnLifetimeMillis: "1800000"

That should be sufficient to save the broker to get random failures (if we assume that a connection will be used in less then 30 minutes); beside some reactivity needs I don't see a reason to use a background object eviction, it's correct?

I'm trying to collect some of these info into some docs to share it :)

@uomik
Copy link
Contributor

uomik commented Oct 30, 2020

maxWaitMillis: "60000"

that should be applied in case of an empty (limited) pool, but it means that a borrow operation could take 1 minute awaiting a new connection to be released in the pool, that seems quite bad for the lock renew, that should happen on timely periods...it's correct?
Consider that in the new version I've forced a query timeout of renew to be within the renew period before failing, to get a timely check...

This could definitely be lower, especially with a low maxTotal value. I've yet not thought this through, didn't just want it to be the default that's wait indefinitely :)

maxConnLifetimeMillis: "1800000"

That should be sufficient to save the broker to get random failures (if we assume that a connection will be used in less then 30 minutes); beside some reactivity needs I don't see a reason to use a background object eviction, it's correct?

That's true. Setting testOnReturn = true and configuring background eviction runs are there just to keep the pool clean of expired connections. But these are not needed, the lifetime expiration is checked when connections are handed out from the pool.

@franz1981
Copy link
Contributor Author

thanks @uomik and I'm going to send soon some PRs to improve things there...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants