Increase performance for table_exists? #3867

Merged
merged 1 commit into rails:master on Dec 5, 2011

Conversation

jadeforrest
Contributor

At New Relic, we have hundreds of thousands of tables, and our migrations took 30 minutes without a similar patch. This cuts it down to a more reasonable amount of time. The more tables you have, the more efficiency gain you derive from the patch.

The rescue false part is ugly, but necessary as far as I can tell. I don't know of a cross-database statement you can make that will work without trapping and relying on errors.

Tested on MySQL and SQLite, but I believe this should work across any database.
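
For context, the idea in the patch is roughly the following; a minimal sketch at the abstract-adapter level, not the verbatim diff, and the exact SQL text is an assumption:

```ruby
# Instead of loading and scanning the full table list, probe the one table
# with a query that can never return rows; any error from the database is
# treated as "table does not exist".
def table_exists?(table_name)
  select_value("SELECT 1 FROM #{quote_table_name(table_name)} WHERE 1=0")
  true
rescue
  false
end
```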

@jfernandez

+1

@tenderlove
Member

Isn't the tables list cached? If it's a problem of the array being too long, can we just convert it to a hash and do tables.key?(table_name)?

I'm hesitant to add this patch because it could mask actual errors coming from the adapter.
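
For illustration, the hash-lookup alternative being suggested would look something like this; `@table_lookup` is a hypothetical instance variable, not existing Rails code:

```ruby
# Build a name -> true index once from the cached table list, then do O(1)
# membership checks instead of scanning the array on every call.
def table_exists?(table_name)
  @table_lookup ||= Hash[tables.map { |t| [t, true] }]
  @table_lookup.key?(table_name.to_s)
end
```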

@jadeforrest
Contributor Author

The list is cached, but that doesn't help if you truly have a lot of tables, or if you have tables being generated dynamically. Migrations, for example, do not get the caching benefit.

And this would be a performance improvement for anyone.

One possibility is the adapters could override with their own, database-dependent way of determining whether a table exists. That would preserve the performance improvement but allow you to avoid swallowing errors (which is ugly).
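
As an illustration of that override idea, a PostgreSQL-flavoured version could consult the catalog directly and never need a rescue; the query below is an assumption, not code from this pull request:

```ruby
# In the PostgreSQL adapter: ask information_schema whether the table exists,
# so no exception has to be swallowed.
def table_exists?(table_name)
  select_value(<<-SQL).to_i > 0
    SELECT COUNT(*)
    FROM information_schema.tables
    WHERE table_name = #{quote(table_name.to_s)}
  SQL
end
```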

@rkbodenner

The problem is doing a "SHOW TABLES", which takes up to 30 minutes in production, per shard. Different adapters might have specific exception classes that we could look for, but then it wouldn't work for all adapters.

@tenderlove
Member

@jadeforrest can we push this down to the mysql adapter(s)? I'd be more comfortable / happy with it if we rescued only expected exceptions. Also, is this technique faster on SQLite3 when it has many tables?
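
A sketch of what pushing it down might look like in the mysql2 adapter, rescuing only the driver's error class; illustrative, not the code that was merged:

```ruby
# mysql2 adapter: same single-table probe, but only the driver's own error
# class is rescued, so unrelated failures still surface.
def table_exists?(table_name)
  select_value("SELECT 1 FROM #{quote_table_name(table_name)} WHERE 1=0")
  true
rescue Mysql2::Error
  false
end
```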

@jadeforrest
Contributor Author

@tenderlove It will be faster on any database, by a huge amount.

We can push it into the mysql adapters if you prefer, but in my opinion the base case is pretty broken. Imagine a database with a million tables. Do you want to ask the database whether that one table exists, or pull down the entire list of a million tables, store that whole list, and query against it?

That said, I totally understand your distaste for swallowing the exceptions. I see two possibilities: either we add a comment encouraging the database adapters to override this with a db-specific version that does not swallow the exception, or we implement this only in the adapters.

Which sounds better to you?

@tenderlove
Member

@jadeforrest good point. Let's leave your pull request as is for now. If people complain, we can revert or fix it.

tenderlove added a commit that referenced this pull request on Dec 5, 2011
Increase performance for table_exists?
@tenderlove merged commit 988061d into rails:master on Dec 5, 2011
@jenseng
Contributor

jenseng commented Dec 6, 2011

I've confirmed this is fine for Postgres (9.0+ anyway) ... the query planner applies the false condition filter before it ever scans any rows, so execution time should always be extremely fast (<0.1ms in my tests)
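
One way to reproduce that check from a Rails console; the table name below is a placeholder and the expected plan shape is only indicative:

```ruby
# Ask PostgreSQL for the plan of the probe query. With a constant-false
# predicate, the plan should show a one-time false filter and no scan of the
# table (e.g. a "Result" node with "One-Time Filter: false").
puts ActiveRecord::Base.connection.select_values(
  "EXPLAIN SELECT 1 FROM some_large_table WHERE 1=0"
)
```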

@jadeforrest
Contributor Author

Thank you for testing it on Postgres!

