New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase performance for table_exists? #3867
Conversation
At New Relic, we have hundreds of thousands of tables, and our migrations took 30 minutes without this similar patch. This cuts it down to a more reasonable amount of time. The rescue false part is ugly, but necessary as far as I can tell. I don't know of a cross-database statement you can make that will work without trapping errors.
+1 |
Isn't the I'm hesitant to add this patch because it could mask actual errors coming from the adapter. |
The list is cached, but that doesn't help if you truly have a lot of tables, or if you have tables being generated dynamically. Migrations, for example, do not get the caching benefit. And this would be a performance improvement for anyone. One possibility is the adapters could override with their own, database-dependent way of determining whether a table exists. That would preserve the performance improvement but allow you to avoid swallowing errors (which is ugly). |
The problem is doing a "SHOW TABLES". Takes up to 30 minutes in production, per shard. Different adapters might have specific exception classes that we could look for, but then it wouldn't work for all adapters. |
@jadeforrest can we push this down to the mysql adapter(s)? I'd be more comfortable / happy with it if we rescued only expected exceptions. Also, is this technique faster on SQLite3 when it has many tables? |
@tenderlove It will be faster on any database, by a huge amount. We can push it into the mysql adapters if you prefer, but in my opinion the base case is pretty broken. Imagine a database with a million tables. Do you want to ask the database if that one table exists, or pull down the entire list of a million tables, and then store that whole list of tables, and query against it? That said, I totally understand your distaste for swallowing the exceptions. I see two possibilities: either we could add a comment encouraging the database adapters to override with a db specific version that does not swallow the exception, or we could just only implement this in the adapters. Which sounds better to you? |
@jadeforrest good point. Let's leave your pull request as is for now. If people complain, we can revert or fix it. |
Increase performance for table_exists?
i've confirmed this is fine for postgres (9.0+ anyway) ... the query planner applies the false condition filter before it ever scans any rows, so execution time should always be extremely fast (<0.1ms in my tests) |
Thank you for testing it on Postgres! Sent from my iPhone On Dec 5, 2011, at 8:06 PM, Jon Jensenreply@reply.github.com wrote:
|
At New Relic, we have hundreds of thousands of tables, and our migrations took 30 minutes without a similar patch. This cuts it down to a more reasonable amount of time. The more tables you have, the more efficiency gain you derive from the patch.
The rescue false part is ugly, but necessary as far as I can tell. I don't know of a cross-database statement you can make that will work without trapping and relying on errors.
Tested on MySQL and SQLite, but I believe this should work across any database.