Apply schema cache dump when creating connections #17632
Does this improve the performance of creating a new connection? If so, could you share the benchmark?
Simple benchmark: pool size set to 20, PostgreSQL and MySQL databases on localhost, different apps:

`Benchmark.measure { 19.times { ActiveRecord::Base.connection_pool.checkout } }`

The first result in both tests is without any schema caching. The second run is with the schema cache but no patch. The third is with the patch.

This is primarily an optimisation/bugfix of the existing schema cache functionality, with some non-obvious benefits. For example, in MySQL the description of a table is obtained by a

In a heavily trafficked Rails application with a reasonable pool size, you're pretty much guaranteed to run most (if not all) of those queries at the same time for your core tables, seriously impacting database performance as it locks the tables to answer those description requests. This is where the most benefit from this PR is felt.
Do we need to worry about making |
@thedarkone the schema cache is populated with every table in the database before it's dumped, so in theory it should be complete and never need to be updated. However, my intention was to make a copy anyway, since relying on that assumption seems like a bad idea; I just misunderstood how I've added an implementation of

@rafaelfranca this patch doesn't speed up creating a connection, but it does speed up the first query to each table from a new connection. I'll put together a benchmark to demonstrate this in the next few days.
Here's a benchmark: https://gist.github.com/eugeneius/f2ccf1694d8759df7884#file-benchmark When you boot a multithreaded server and start sending it traffic, many threads will all need to check out new connections and query tables at the same time. That's what this benchmark is simulating, by clearing the models' column caches and removing all connections from the pool at the start of each iteration. Without the schema cache dump, this patch has no effect:
With the schema cache dump in place, the benchmark runs significantly faster:
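The scenario the benchmark simulates can be illustrated without a database. The following is a minimal sketch, not the linked gist: `simulate_boot`, `TABLES`, and the plain hashes standing in for schema caches are all hypothetical, and it counts simulated schema queries rather than measuring time.

```ruby
# Hypothetical simulation: counts simulated schema queries; no real database is involved.
TABLES = %w[users posts comments].freeze

def simulate_boot(threads:, dump:)
  queries = 0
  lock = Mutex.new

  workers = Array.new(threads) do
    Thread.new do
      # Each thread gets a fresh "connection" whose cache is seeded from the dump, if any.
      cache = dump ? dump.dup : {}
      TABLES.each do |table|
        next if cache.key?(table)

        lock.synchronize { queries += 1 } # stands in for one schema query
        cache[table] = [:id]
      end
    end
  end
  workers.each(&:join)

  queries
end

dump = TABLES.to_h { |table| [table, [:id]] }
without_dump = simulate_boot(threads: 20, dump: nil)
with_dump = simulate_boot(threads: 20, dump: dump)

puts "schema queries without a dump: #{without_dump}"
puts "schema queries with the dump applied per connection: #{with_dump}"
```

With an empty cache, every thread queries every table; when each new connection starts from a copy of the dump, none of them do.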
@tenderlove you reviewed this feature originally; what do you think of this change?
The `db:schema:cache:dump` rake task dumps the database schema structure to `db/schema_cache.dump`. If this file is present, the schema details are loaded into the currently checked out connection by a railtie while Rails is booting, to avoid having to query the database for its schema.

The schema cache dump is only applied to the initial connection used to boot the application though; other connections from the same pool are created with an empty schema cache, and still have to load the structure of each table directly from the database.

With this change, a copy of the schema cache is associated with the connection pool and applied to connections as they are created.
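The idea described above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToySchemaCache`, `ToyConnection`, and `ToyPool` are hypothetical stand-ins, not Active Record classes, and the code is not the implementation in this patch.

```ruby
# Toy model only: these classes are hypothetical stand-ins, not Active Record's API.
class ToySchemaCache
  attr_reader :columns

  def initialize(columns = {})
    @columns = columns
  end

  # Copy the table map so each connection can change its cache independently.
  def initialize_copy(other)
    super
    @columns = other.columns.dup
  end
end

class ToyConnection
  attr_reader :schema_cache, :schema_queries

  def initialize(schema_cache)
    @schema_cache = schema_cache
    @schema_queries = 0
  end

  # Return cached columns, falling back to a (counted) simulated schema query.
  def columns(table)
    @schema_cache.columns.fetch(table) do
      @schema_queries += 1
      @schema_cache.columns[table] = ["id"] # pretend result of a schema query
    end
  end
end

class ToyPool
  attr_accessor :schema_cache

  # Every new connection starts from a copy of the pool's cache, if one is set.
  def new_connection
    ToyConnection.new(schema_cache ? schema_cache.dup : ToySchemaCache.new)
  end
end

pool = ToyPool.new
pool.schema_cache = ToySchemaCache.new("users" => ["id", "email"])

conn = pool.new_connection
conn.columns("users") # served from the copied cache
conn.columns("posts") # not in the dump, so one simulated schema query is issued
puts conn.schema_queries
```

Giving each connection its own copy means a connection that discovers a table missing from the dump updates only its own cache, leaving the pool's copy untouched.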
I've rebased this branch and removed the CHANGELOG entry, since it was the only thing that conflicted. For posterity, here's what I had written:
Does anyone have feedback on whether this patch is likely to be accepted, or what changes I need to make for that to happen? I've been running this as a monkey patch in production for over a year now; deploying our app without it causes CPU usage on the database to spike as every new process fetches the schema at once. We can't deploy without it, but I'd like to not have to maintain the monkey patch.
This looks good to me, but I want to get feedback from @tenderlove or @matthewd before merging.