Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

case_sensitive_equality_operator including BINARY is slow on MySQL #1399

Closed
kenn opened this issue May 29, 2011 · 3 comments
Closed

case_sensitive_equality_operator including BINARY is slow on MySQL #1399

kenn opened this issue May 29, 2011 · 3 comments
Assignees

Comments

@kenn
Copy link
Contributor

kenn commented May 29, 2011

It's about time we should fix this - right now ActiveRecord::ConnectionAdapters::MysqlAdapter has this code block:

def case_sensitive_equality_operator
  "= BINARY"
end

which renders validates_uniqueness_of prohibitively slow. With 7M records, it takes 3+ seconds in our case. It really should be:

def case_sensitive_equality_operator
  "="
end

That way, we benefit from MySQL's collation for case sensitivity in the index (like utf8_unicode_ci or utf8_bin), which is FAST. Or we could add :case_sensitive => nil as a new option particularly for that case.

It's a known problem for a long time actually, and patches have been around:

@sikachu
Copy link
Member

sikachu commented Jun 1, 2011

I think I was the one who working on this patch. The blog posts are quite unrelated because the ability to validates uniqueness with case sensitive for MySQL was lost between 2.3.x and 3.0.x transition.

I bought back the = BINARY() because there is no way you can do case sensitive testing in MySQL without having to alter your database. It was broken in 3.0.x and nobody noticed it.

If you're going to do update your schema to something that will make MySQL always do case sensitive, I think you can already do :case_sensitive => false on the validate_uniqueness_of. I don't think there's a good fix for it, since telling people to migrate to utf8_unicode_ci is also impossible.

@tenderlove, what is your thought on this ticket?

@kenn
Copy link
Contributor Author

kenn commented Jun 1, 2011

Oh, I use both utf8_unicode_ci and utf8_bin in the same app and there's absolutely no way to ask people to change column/index collation, that's my point too.

I use utf8_unicode_ci for email and username, because I want to match KennEjima with kennejima, and I use utf8_bin for secret tokens, because I don't want to let AyBV6hja1 accidentally match with AYBV6HJA1 (token is BASE62 representation of SecureRandom) - both collations exist in the same users table.

In the very early days after releasing the app, we removed every line of validate_uniqueness_of from our code and made a custom validator to do just that.

class BetterUniquenessValidator < ActiveModel::EachValidator
  def validate_each(record, attribute, value)
    if record.send("#{attribute}_changed?")
      target = options[:with_type] ? record.class : record.class.base_class
      if target.where(attribute => record.read_attribute(attribute)).exists?
        record.errors[attribute] << "already in use"
      end
    end
  end
end

# In User.rb
validates :username, :better_uniqueness => true
validates :email, :better_uniqueness => true

IIRC, I found the noticeable performance degradation when we only have 10K users, so the problem should be pretty common, and people must be doing either:

  1. Don't use validate_uniqueness_of and instead use a custom validator just like mine, from the beginning
  2. Don't use validate_uniqueness_of at all and let database raise exception when duplicate (this is actually the fastest if duplicate case is very rare (e.g. UUID), we use this strategy in some places)
  3. Monkey patch case_sensitive_equality_operator
  4. Don't care performance

And the current implementation only benefits people like 4 (and Postgres, maybe? but I guess :case_sensitive => false could kill Postgres too), so I think the reason nobody noticed the regression in 3.0.x is because nobody is using it. :P

Anyway, my suggestion for change is either:

  1. In case :case_sensitive => nil is explicitly given (options.has_key?(:case_sensitive) && options[:case_sensitive].nil?), the equality operator should be just "IS" or "=". Period. (Backward compatible)
  2. Or better yet, do not set :case_sensitive => true by default, but let people explicitly set it for "= BINARY". (Desired)

@ghost ghost assigned sikachu Jun 1, 2011
jake3030 pushed a commit to jake3030/rails that referenced this issue Jun 28, 2011
…ils#1399 state:resolved]

Signed-off-by: Joshua Peek <josh@joshpeek.com>
@sikachu
Copy link
Member

sikachu commented Apr 25, 2012

It's already '=' in current master. Closing this.

@sikachu sikachu closed this as completed Apr 25, 2012
kamipo added a commit that referenced this issue Feb 21, 2019
…collation issues

In MySQL, the default collation is case insensitive. Since the
uniqueness validator enforces case sensitive comparison by default, it
frequently causes mismatched collation issues (performance, weird
behavior, etc) to MySQL users.

https://grosser.it/2009/12/11/validates_uniqness_of-mysql-slow/
#1399
#13465
gitlabhq/gitlabhq@c1dddf8
huginn/huginn#1330 (comment)

This extracts `default_uniqueness_comparison` to ease to handle the
mismatched collation issues on the connection.
kamipo added a commit to kamipo/rails that referenced this issue Mar 4, 2019
In MySQL, the default collation is case insensitive. Since the
uniqueness validator enforces case sensitive comparison by default, it
frequently causes mismatched collation issues (performance, weird
behavior, etc) to MySQL users.

https://grosser.it/2009/12/11/validates_uniqness_of-mysql-slow/
rails#1399
rails#13465
gitlabhq/gitlabhq@c1dddf8
huginn/huginn#1330 (comment)

I'd like to deprecate the implicit default enforcing since I frequently
experienced the problems in code reviews.

Note that this change has no effect to sqlite3, postgresql, and
oracle-enhanced adapters which are implemented as case sensitive by
default, only affect to mysql2 adapter (I can take a work if sqlserver
adapter will support Rails 6.0).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants