ActiveRecord 3.1.2 is 10 times slower than 3.1.1 #3678

Closed
Psli opened this Issue Nov 18, 2011 · 44 comments


Psli commented Nov 18, 2011

Mac OS X 10.6.8
MySQL 5.5.15

rails new blog -d mysql
rails generate scaffold Post name:string title:string content:text

rails 3.1.1

Started GET "/posts/1" for 127.0.0.1 at 2011-11-18 13:24:59 +0800
  Processing by PostsController#show as HTML
  Parameters: {"id"=>"1"}
  Post Load (0.3ms)  SELECT `posts`.* FROM `posts` WHERE `posts`.`id` = 1 LIMIT 1
Rendered posts/show.html.erb within layouts/application (14.7ms)
Completed 200 OK in 29ms (Views: 21.8ms | ActiveRecord: 4.8ms)

rails 3.1.2

Started GET "/posts/1" for 127.0.0.1 at 2011-11-18 13:27:33 +0800
  Processing by PostsController#show as HTML
  Parameters: {"id"=>"1"}
  Post Load (0.3ms)  SELECT `posts`.* FROM `posts` WHERE `posts`.`id` = 1 LIMIT 1
Rendered posts/show.html.erb within layouts/application (15.7ms)
Completed 200 OK in 309ms (Views: 24.0ms | ActiveRecord: 282.5ms)

hooopo commented Nov 18, 2011

Yeah, same in my development environment. But my production environment on Heroku, with a PostgreSQL db, is OK.

Contributor

christos commented Nov 18, 2011

I can confirm that ActiveRecord on 3.1.2 is dog slow on Mac OS X 10.7.2 with MySQL 5.5.14.

Using git bisect I arrived at this commit: fb73423, which has a high probability of being the culprit.

I am investigating further.

/cc @kennyj

Contributor

kennyj commented Nov 18, 2011

I've investigated, but I couldn't reproduce this issue in my environment (MySQL 5.0.77).
Is this specific to MySQL 5.5.x?

Psli commented Nov 18, 2011

ruby 1.9.3
mysql 5.1.x

also slow

Contributor

christos commented Nov 18, 2011

Could be 5.5.x. Launching this from the console:

SELECT t.constraint_type, k.column_name
FROM information_schema.table_constraints t
JOIN information_schema.key_column_usage k
USING (constraint_name, table_schema, table_name)
WHERE t.table_schema = DATABASE()
AND t.table_name   = 'posts';

...takes 200ms vs ~0ms for a simple DESCRIBE

Using EXPLAIN, it seems the query does a lot of work:

+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                                                                  |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------------------------------+
|  1 | SIMPLE      | t     | ALL  | NULL          | NULL | NULL    | NULL | NULL | Using where; Open_full_table; Scanned all databases                    |
|  1 | SIMPLE      | k     | ALL  | NULL          | NULL | NULL    | NULL | NULL | Using where; Open_full_table; Scanned all databases; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+------------------------------------------------------------------------+

Could it have something to do with the number of databases each of us has? (I have about 20 in my installation.)

Contributor

christos commented Nov 18, 2011

OK. This is not a database version issue. It most definitely has to do with the number of databases in the server...

I just tested it against the shared DB where we host our production code, which probably has hundreds of databases and is running version 5.0.87-d10-ourdelta65-log. It took 16s (yes, seconds, not milliseconds).

Contributor

tjh commented Nov 18, 2011

I could not reproduce the issue on MySQL 5.1.59, REE 1.8.7 p253.

/cc @kennyj

Contributor

christos commented Nov 18, 2011

This is also happening in the master branch.

Pull requests #3525 and #3533, which resolve issue #3440, are the culprits.

/cc'ing @jonleighton who might be able to shed some more light on this

Contributor

kennyj commented Nov 18, 2011

@christos @tjh @Psli thanks for your reports. I'm thinking about how we can speed up this SQL.
If you have a good idea, then please let me know.

Contributor

christos commented Nov 18, 2011

@kennyj I can't think of a way that wouldn't involve creating indexes for information_schema (is that even possible?)

A different way to attack it would be to roll back the change and then use an alternative version of pk_and_sequence_for, based on your code, only for the schema dumper.

Contributor

kennyj commented Nov 18, 2011

@christos Agreed, we can't create indexes for information_schema. We need another way to correctly determine the unique key.

pk_and_sequence_for is called by the primary_key method, so it ends up being used in many places.

@christos christos pushed a commit to christos/rails that referenced this issue Nov 18, 2011

Christos Zisopoulos Only used detailed schema introspection when doing a schema dump.
Fixes #3678
280b2b7
Contributor

christos commented Nov 18, 2011

@kennyj See pull request above. The only side-effect is that the schema_dumper_test.rb runs rather slowly.

Contributor

kennyj commented Nov 18, 2011

@christos thanks for your fix. So quick!

BTW, I have one idea.
Please execute a `show index from ...` statement in your environment (mine has only a few databases).

mysql> create table foo (
     -> id int(11) auto_increment,
     -> primary key (id)
     -> );

mysql> show index from foo;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| foo   |          0 | PRIMARY  |            1 | id          | A         |           0 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

mysql> create table fooo (
    -> id1 int(11) not null,
    -> id2 int(11) not null,
    -> primary key (id1, id2)
    -> );
Query OK, 0 rows affected (0.02 sec)

mysql> show index from fooo;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| fooo  |          0 | PRIMARY  |            1 | id1         | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| fooo  |          0 | PRIMARY  |            2 | id2         | A         |           0 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

mysql> create table foooo (
    -> bar varchar(20) not null unique
    -> );

mysql> show index from foooo;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| foooo |          0 | bar      |            1 | bar         | A         |           0 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

So I assume Ruby code like the following.

      def pk_and_sequence_for(table)
        execute_and_free("SHOW INDEX FROM #{quote_table_name(table)} WHERE Key_name = 'PRIMARY' ", 'SCHEMA') do |result|
          keys = each_hash(result).map { |row| row[:Column_name] }
          keys.length == 1 ? [keys.first, nil] : nil
        end
      end
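
The mapping from SHOW INDEX rows to a primary key can be exercised in plain Ruby, without a live adapter. This is only a sketch; the helper name and the hash-shaped rows below are my assumptions, not Rails API:

```ruby
# Hypothetical helper mirroring the row-mapping above: given the hashes that
# `SHOW INDEX FROM t WHERE Key_name = 'PRIMARY'` would yield, pick out the
# primary-key column. A composite key produces several rows, so we return nil,
# matching pk_and_sequence_for's behaviour for unsupported composite keys.
def primary_key_from_index_rows(rows)
  keys = rows.map { |row| row[:Column_name] }
  keys.length == 1 ? [keys.first, nil] : nil
end

# Rows shaped like the `foo` and `fooo` examples above:
single    = [{ Key_name: "PRIMARY", Column_name: "id" }]
composite = [{ Key_name: "PRIMARY", Column_name: "id1" },
             { Key_name: "PRIMARY", Column_name: "id2" }]

primary_key_from_index_rows(single)    # => ["id", nil]
primary_key_from_index_rows(composite) # => nil
```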
Contributor

kenn commented Nov 18, 2011

+1 to the last patch from @kennyj - information_schema should never be used anywhere, as it involves re-opening all the database files and/or full directory scans, and I don't think there's any way to make it faster when one of the databases/tables happens to have a large dataset.

http://dev.mysql.com/doc/refman/5.1/en/information-schema-optimization.html

It's also critical for development iteration - it could take 20-30 seconds to run rake db:migrate or rake db:rollback if you have a large dataset on your desktop.

@kennyj kennyj added a commit to kennyj/rails that referenced this issue Nov 18, 2011

@kennyj kennyj Use `show index from`. We could fix `pk_and_sequence_for` method's performance problem (GH #3678)
f9b9179

@kennyj kennyj added a commit to kennyj/rails that referenced this issue Nov 18, 2011

@kennyj kennyj Use `show index from`. We could fix `pk_and_sequence_for` method's performance problem (GH #3678)
04cc3fa
Contributor

christos commented Nov 18, 2011

@kennyj, using show index from seems to be even faster than describe (I get 0.2ms vs. 1.3ms for describe)

You should go ahead and create the pull request...

Wicked!

Contributor

kennyj commented Nov 19, 2011

@christos, thank you!
I'll send a pull request.

Thanks for the help, guys; it looks like we'll resolve this issue.

@ghost

ghost commented Nov 19, 2011

Rails 3.1.2
Started POST "/users" for 127.0.0.1 at 2011-11-19 18:31:33 +0700
Rendered users/new.haml within layouts/application (24.0ms)
Completed 200 OK in 1267ms (Views: 97.6ms | ActiveRecord: 1092.9ms)
Rails 3.1.1
Started POST "/users" for 127.0.0.1 at 2011-11-19 18:48:53 +0700
Rendered users/new.haml within layouts/application (89.7ms)
Completed 200 OK in 178ms (Views: 96.6ms | ActiveRecord: 4.7ms)
@ghost

ghost commented Nov 20, 2011

How can I install this version now? Do I need to fork it, build a gem, and put it in my gem folder?

Contributor

asanghi commented Nov 20, 2011

Does anyone else think this is enough of a problem to push a Rails 3.1.3 ASAP? I would imagine having lots of MySQL databases is not uncommon for developers, and neither is development mode.

Member

jonleighton commented Nov 20, 2011

Yes. I am doing it now, had some other stuff to fix first.

I've just upgraded to 3.1.2 (then 3.1.3) from 3.0.12, and am also using Resque, which forks a process for each job, with MySQL 5.0.51. I've got a large database (over 80 million rows in one table, 30+ million in another, etc.), and SHOW INDEX can take up to 10 seconds per table.
This is not great on deploy, or when running the console (which also slows down MySQL), but the real problem is with Resque, which as far as I can tell is doing this for each job. I think the problem is that SHOW INDEX has to work out the cardinality, which #pk_and_sequence_for doesn't need. An alternative might be SHOW CREATE TABLE table_name, which doesn't fetch the cardinality and is therefore pretty much instantaneous, even for very large tables.
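
A minimal sketch of that alternative, assuming the primary key is pulled out of the SHOW CREATE TABLE DDL string with a regex; the helper name and sample table are invented for illustration:

```ruby
# Hypothetical sketch: extract the primary-key column from the DDL string
# that `SHOW CREATE TABLE` returns, avoiding the cardinality work SHOW INDEX
# does. Composite primary keys yield nil, as elsewhere in this thread.
def primary_key_from_create_table(ddl)
  return nil unless ddl =~ /PRIMARY KEY\s+\((.+)\)/
  keys = $1.split(",").map { |key| key.delete("`").strip }
  keys.length == 1 ? keys.first : nil
end

ddl = <<~SQL
  CREATE TABLE `animals` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `name` char(30) NOT NULL,
    PRIMARY KEY (`id`)
  ) ENGINE=InnoDB
SQL

primary_key_from_create_table(ddl)  # => "id"
```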

Contributor

kennyj commented Nov 22, 2011

I'm testing now, but I can't reproduce @CountCulture's problem (10 million rows in one table...).
I'll look into @CountCulture's suggestion after work.

CREATE TABLE animals (
  id INT(11) NOT NULL AUTO_INCREMENT,
  name CHAR(30) NOT NULL,
  PRIMARY KEY (id)
);

# insert 10 million rows

mysql> show index from animals where Key_name = 'PRIMARY';
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| animals |          0 | PRIMARY  |            1 | id          | A         |    10000000 |     NULL | NULL   |      | BTREE      |         |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
1 row in set (0.00 sec)

Contributor

kennyj commented Nov 22, 2011

p.s. 30 million rows

mysql> show index from animals where Key_name = 'PRIMARY';
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table   | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| animals |          0 | PRIMARY  |            1 | id          | A         |    30000000 |     NULL | NULL   |      | BTREE      |         |
+---------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
1 row in set (0.00 sec)

I suspect this will be a tricky one to reproduce as an isolated test case. I'm not that great with the internals of MySQL, but I suspect it's got to do with the load on the server (quite heavy), the memory available, and the size of the index (which may need to be loaded into memory when running the query). I'll post a question on StackOverflow and the MySQL forums to see how the query runs internally and under what conditions it becomes very slow.

Contributor

kennyj commented Nov 22, 2011

@CountCulture

BTW, does the following commit implement your first suggestion?
kennyj/rails@3517437

(This is a backup plan...)

Yes, that would fix it. Many thanks. Have posted a question on the MySQL forums and will post here as soon as I get a response.

Contributor

kenn commented Nov 23, 2011

I was able to reproduce and confirm what @CountCulture said, with a table with 400M rows.

> SHOW INDEX FROM details WHERE Key_name = 'PRIMARY';
1 row in set (1.44 sec)

> SHOW CREATE TABLE details;
1 row in set (0.00 sec)

I tried this 5 times, and confirmed that SHOW INDEX consistently took 1.x second, while SHOW CREATE TABLE always returned instantaneously. The load average on the production server was not so high at the time of this test.

HTH

Contributor

kennyj commented Nov 23, 2011

@kenn thanks for your report.

@CountCulture
/cc @christos @jonleighton @Psli

Now I think we should use `show create table`.
What do you think?

If we use `show create table`, we should test it more before sending a pull request:

  1. with large datasets.
  2. with lots of other databases.
    ...

P.S. @CountCulture, I'll also wait for your response.

Thanks, Kenny. I posted on the MySQL forums but haven't had any responses yet, so I've just posted on StackOverflow too.

Contributor

kennyj commented Nov 23, 2011

  2. with lots of other databases.

@christos @Psli
Could you please test SHOW CREATE TABLE performance?

Member

jonleighton commented Nov 23, 2011

I am beginning to regret that we ever changed this method in the first place, for a minor release.

My current position is that the current slowness is unlikely to affect the majority of users, and I am worried about causing further regressions in the stable branch.

So I'd be happy to accept a patch against master to use SHOW CREATE TABLE but would need more persuading to put it in 3-1-stable.

cc @tenderlove, do you have a view?

I've also posted on the Resque mailing list to see if anyone there is feeling the pain or has any suggestions.

At the moment I'm faced with either downgrading to a version of Rails prior to this change, or moving from Resque back to DelayedJob (which has its own problems at the scale we're using it for, hence the move to Resque)

Contributor

christos commented Nov 25, 2011

@jonleighton The safest bet would be to either:

  • in 3-1-stable, revert the original commit fb73423 and the subsequent fixes (mine and @kennyj's), or...
  • ...revert to 280b2b7, which, although it slows down the tests when the many-databases situation occurs, still uses the original DESCRIBE table_name for runtime primary-key introspection, code that is guaranteed not to slow anything down.

For master, you could try and apply @yaaule's commit 6020df5 and let people see how it works for them.

Still no response to my queries on the MySQL forums and StackOverflow, but I suspect Thanksgiving has taken many people offline. I'm not sure of the best way to try out the patch in production. Can I monkey patch it... and in that case, will it be OK to put it in an initializer, or is that too late in the loading cycle?

Contributor

kennyj commented Nov 26, 2011

@CountCulture

A monkey patch in an initializer seems to work fine.

for rails-3.1.3
$ vim config/initializers/monkey_patch.rb

require 'active_record'
require 'active_record/base'

module ActiveRecord
  module ConnectionAdapters
    class Mysql2Adapter < AbstractAdapter
      def pk_and_sequence_for(table) #:nodoc:
        result = execute("SHOW CREATE TABLE #{quote_table_name(table)}", 'SCHEMA')
        ct = ''
        result.each(:symbolize_keys => true, :as => :hash) { |row| ct << row[:"Create Table"] }
        if ct =~ /PRIMARY KEY\s+\((.+)\)/
          keys = $1.split(",").map { |key| key.gsub(/`/, "") }
          keys.length == 1 ? [keys.first, nil] : nil
        end
      end
    end
  end
end
Owner

jeremy commented Dec 2, 2011

@jonleighton 👍 to going back to square one with this issue, just revert everything, then use SHOW CREATE TABLE in 3.2.

For what it's worth, I've been using the monkey patch in development and production for a week now, and there have been no problems at all with it (and it has made a huge difference to performance).

Member

jonleighton commented Dec 5, 2011

Backported: 8d55a6d

Contributor

kenn commented Jan 25, 2012

Any idea as to when 3.1.4 will be released?

We still see a 50-times-slower-than-usual restart at deploy with 3.1.3. WTF.

apolzon commented Mar 6, 2012

We were experiencing very slow queries after upgrading to rails 3.1.2 from 3.0.11.

In particular, a simple model.association.find_by_id(x) went from ~85ms to an average of ~285ms.

Upgrading to Rails 3.1.4 did NOT alleviate the issue (neither did adding any of the patches on 3.1.2). Our average time on that query is still around 260ms.

UPDATE:

After making a couple of optimizations, we have gotten our 3.1 performance CLOSE to 3.0. Our specific sore spot is still a Resque worker that is just inserting rows using Model.create!

In 3.0, on a local box, we can process about 4.5 events per second.
In 3.1, on a local box, we can process about 3 events per second.

Tangent: we also identified that a call to self.columns in 3.1 is MUCH slower than in 3.0. We changed that call to use a static list until the performance of that call can be investigated.
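
As a rough illustration of that workaround (the class and column names are invented, not from the thread), a static list can stand in for the expensive per-call schema lookup:

```ruby
# Hypothetical stand-in for the slow Model.columns call described above:
# a frozen, precomputed column list avoids hitting the database entirely.
# A real patch would more likely memoize ActiveRecord's own result instead.
class EventRecord
  COLUMN_NAMES = %w[id name payload created_at].freeze

  def self.column_names
    COLUMN_NAMES
  end
end

EventRecord.column_names  # => ["id", "name", "payload", "created_at"]
```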

@ghost

ghost commented Mar 13, 2012

@apolzon: Any optimizations to Rails, or just app code? We're also finding Rails 3.1 to be very slow.

apolzon commented Mar 13, 2012

We had a composite key. Something in ActiveRecord around this area changed and caused performance to tank when using this unsupported feature. We added the composite keys gem and were more or less back to our previous performance (with the primary-key patch from above). A couple of other guys on the team worked on this and had to make some modifications to get it working in 3.1.

Additionally, if you use Resque, you really need to prime the schema cache. We did this in a before_first_fork hook (just call #columns on any ActiveRecord models you need).
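
A sketch of that priming, assuming Resque's documented before_first_fork hook; the model names are placeholders, not from this thread:

```ruby
# config/initializers/resque.rb
# Hypothetical sketch: warm the schema cache once in the parent process so
# every forked Resque worker inherits the column data instead of re-running
# schema queries per job. Replace the model list with your own models.
require 'resque'

Resque.before_first_fork = Proc.new do
  [User, Event, Payment].each(&:columns)
end
```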

Overall, we observed ActiveRecord create calls to actually be faster in 3.1 than in 3.0.

@ghost

ghost commented Mar 14, 2012

Thanks, @apolzon, I appreciate the response!
