
Performance regression in checking out connections between Rails 4.2 and 6.1 #1140

skunkworker opened this issue Jan 30, 2024 · 1 comment



skunkworker commented Jan 30, 2024

I have noticed a large performance regression between Rails 4.2 and 6.1 when quickly checking out a large number of connections.

https://github.com/skunkworker/jruby_activerecord_checkout_regression

The included benchmark creates connection pools, randomly selects a pool, opens N connections, runs `SELECT 1=1`, and then checks the connections back in. It connects to the `postgres` database (which should be empty).
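For reference, here is a minimal sketch of that loop, assuming a plain ActiveRecord setup rather than the exact code in the linked repo (it uses a single pool and `with_connection` for checkout/check-in, whereas the real benchmark spreads the work across several randomly chosen pools):

```ruby
require "active_record"
require "benchmark"

# Assumed configuration; the linked repo creates multiple pools and picks one at random.
ActiveRecord::Base.establish_connection(
  adapter:  "postgresql",   # "jdbcpostgresql" under JRuby via activerecord-jdbc-adapter
  database: "postgres"
)

run_count = Integer(ENV.fetch("RUN_COUNT", "1000"))

elapsed = Benchmark.realtime do
  run_count.times do
    # Check a connection out of the pool, run a trivial query, then check it back in.
    ActiveRecord::Base.connection_pool.with_connection do |conn|
      conn.execute("SELECT 1=1")
    end
  end
end

puts format("%d checkouts in %.2fs", run_count, elapsed)
```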

When the Rails log level is set to :debug, Rails 4.2 is up to 5x faster than 6.1. When using MRI, Rails 6.1 returns to a normal amount of time.
When the Rails log level is set to :info, Rails 4.2 is 1.5x faster than 6.1.
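The two log levels can be toggled between runs via the standard Rails setting (shown here as an environment-config line; where the benchmark repo actually sets it may differ):

```ruby
# config/environments/<env>.rb — switch between the two cases measured above
config.log_level = :debug   # verbose SQL logging; where the 4.2 vs 6.1 gap is largest (~5x)
# config.log_level = :info  # quieter logging; the gap shrinks to ~1.5x
```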

I'll be doing further investigation into multi-database and multi-schema setups with large tables to see if I can reproduce some of the behavior we have been seeing.


skunkworker commented Jan 30, 2024

I've gone back and added more tests and a README to the above repo, along with a schema of approximately 1700 columns.

Preliminary results:

JRuby - Rails 4.2

| Run count | Time |
| --- | --- |
| 1000 | 2.94 |
| 5000 | 8.71 |
| 10000 | 9.23 |
| 20000 | 14.13 |

JRuby - Rails 6.1 with `databaseMetadataCacheFieldsMiB=0`

| Run count | Time |
| --- | --- |
| 1000 | 18.39 |
| 5000 | 86.0 |
| 10000 | 172.79 |
| 20000 | 328.26 |

JRuby - Rails 6.1 with `databaseMetadataCacheFieldsMiB=5`

| Run count | Time |
| --- | --- |
| 1000 | 4.95 |
| 5000 | 17.74 |
| 10000 | 29.07 |
| 20000 | 58.48 |

JRuby - Rails 6.1 with `databaseMetadataCacheFieldsMiB=1`

| Run count | Time |
| --- | --- |
| 1000 | 5.12 |
| 5000 | 18.29 |
| 10000 | 28.59 |
| 20000 | 43.98 |
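For anyone reproducing these numbers: `databaseMetadataCacheFieldsMiB` is a pgjdbc driver property (default 5 MiB; 0 disables the field cache). With activerecord-jdbc-adapter it can usually be passed through the connection config; the `properties` key below is an assumption about the adapter's pass-through mechanism, not something taken from the repo:

```ruby
# Hedged sketch: pass a pgjdbc driver property through the ActiveRecord connection config.
ActiveRecord::Base.establish_connection(
  adapter:    "postgresql",                          # resolves to the JDBC adapter on JRuby
  database:   "postgres",
  properties: { databaseMetadataCacheFieldsMiB: 0 }  # 0 = disable pgjdbc's field metadata cache
)
```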

I will be adding more tests with hundreds of schemas to see if I can stress-test the database metadata cache under that load.

If we take, for example, 1000 schemas (shards), then the metadata cache would be almost useless in its current state: there could be 1700 * 1000 = 1,700,000 columns to cache, and since the cache is per-connection rather than shared across the entire adapter, it will quickly fill up and become irrelevant.
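A rough back-of-the-envelope check of why that overflows, under an assumed per-field entry size (the real pgjdbc entry size varies):

```ruby
# Assumed numbers: 1700 columns per schema, 1000 schemas, ~100 bytes per cached field entry.
columns_per_schema = 1_700
schemas            = 1_000
bytes_per_field    = 100                                  # illustrative assumption only

total_fields  = columns_per_schema * schemas              # => 1_700_000
estimated_mib = total_fields * bytes_per_field / (1024.0 * 1024)

puts estimated_mib.round(1)  # ~162 MiB, far above the 5 MiB default, and duplicated per connection
```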
