Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP

Loading…

Correctly apply inflection rules on multi-word strings. (Fixes #7132) #7134

Closed
wants to merge 2 commits into from

5 participants

Godfrey Chan José Valim Michael Koziarski David Celis Steve Klabnik
Godfrey Chan
Owner

Edit: see my comment below for a more general version of this problem/fix.

Currently, apply_inflections does not take into account that the input might be a multi-word string separated by underscore. (tableize etc relies on this behavior to work correctly.) This bug affects only uncountable multi-word strings separated by underscores:

   "funky jeans".singularize # => "funky jeans"
   "client information".pluralize # => "client information"

   "funky_jeans".singularize # => "funky_jean"
   "client_information".pluralize # => "client_informations"

It's also worth pointing out that "funky_jeans".singularize used to work correctly before 9b4622a, which was merged almost 2 years ago so this is essentially broken since Rails 3.

On the other hand, it appears that "client_information".pluralize has always been broken, so this does change the behavior of tableize and could potentially break some older apps out there.

Closes #7132.

Godfrey Chan chancancode Correctly apply inflection rules on uncountable multi-word strings.
Currently, apply_inflections does not take into account that the input might be a multi-word string separated by underscore. (tableize etc relies on this behavior to work correctly.) This bug affects only uncountable multi-word strings separated by underscores:

   "funky jeans".singularize # => "funky jeans"
   "client information".pluralize # => "client information"

   "funky_jeans".singularize # => "funky_jean"
   "client_information".pluralize # => "client_informations"

It's also worth pointing out that "funky_jeans".singularize used to work correctly before 9b4622a, which was merged almost 2 years ago so this is essentially broken since Rails 3.

On the other hand, it appears that "client_information".pluralize has always been broken, so this does change the behavior of tableize and could potentially break some older apps out there.

Closes #7132.
fba8a6a
José Valim
Owner

Thanks for the detailed reported. Although I don't think this is the proper solution. After all, after removing the boundary character from the regexp \b, we could match words that contain just part of the uncountable word. Perhaps we want to add [\b\_\-] (or even break \b apart) explicitly instead:

1.9.3-p0-perf :001 > "fish" =~ /\bfish/
=> 0
1.9.3-p0-perf :002 > "sawfish" =~ /\bfish/
=> nil
1.9.3-p0-perf :003 > "saw_fish" =~ /\bfish/
=> nil
1.9.3-p0-perf :004 > "saw_fish" =~ /[\b_-]fish/
=> 3

ps: I am not sure if sawfish is a valid word or not, but the idea is just to show a scenario where we could possibly generate invalid matches.

Godfrey Chan
Owner

@josevalim Actually, the fix I included here works by breaking \w apart, thus removing the need for \b. \w expands to (I think) [a-zA-Z0-9_], which is basically what we want here except we don't care about A-Z (because of the i flag) and we don't want _. Therefore, [a-z0-9]+$ would always match the last word in the string without the need of \b.

José Valim
Owner

Right, thanks. Looks good to me.

Godfrey Chan
Owner

Actually, I just noticed the problem is more wide-spread than I think. Consider this:

'optical mouse' # => "optical mouses" (instead of "optical mice")

This is broken since 16e7f2f (see #5177). The root of the problem is we don't have a clear spec on what arguments are considered "fair game" for the inflectors.

I did a little bit of digging. Pretty much since the beginning of time (dc3d6eb), pluralize was expected to support multi-word strings separated by underscores. However, this is not really documented explicitly so everyone forgot about it. And then we started adding inflector rules that expects the input to be a single-word strings (note the /^...$/i). Overtime more and more such rules were added. And then at some point, we started assumeing pluralize/singularize should take on CamelCase strings as well. Eventually everyone forget about pluralize's double duty.

So basically, no one really knows what singularize/pluralize should do, and the fact that tableize worked okay all along (on most inputs anyways) is more or less a miracle.

I can update the PR to address this, but we will have to first agree on which of the followings is the "right" fix:

  1. "singularize/pluralize is supposed to take phrases": Fix the inflector rules to drop all usage of /^.../i and replace them with a lookahead of something (because we already know \b doesn't work).

  2. "singularize/pluralize is supposed to take phrases, but inflector rules are supposed to operate on a single word": Then change apply_inflection to match the rules on the last word in the input.

  3. "singularize/pluralize is supposed to take a single word only": Then change tableize, update the docs and drop the unnecessary tests.

  4. "Admit that the inflector is broken/inconsistent, but is too big to fail at this point": Do nothing.

None of these options is ideal. 4 might be most backwards-compatible, but we already broke backwards compatibility in 16e7f2f, we might as well fix this once and for all in Rails 4.

You could, in theory, fix all of these problems by adding custom inflector rules, but it's not trivial. For example, let's say you need an 'Admin::CustomerInformation' model. To work around this bug one might try:

ActiveSupport::Inflector.inflections do |inflect|
  inflect.uncountable "information" # Wrong. The inflector already know this.

  inflect.uncountable "customer_information" # Also wrong. tableize('admin_customer_information') doesn't work.

  inflect.plural /customer_information$/i, "customer_information" # Almost. But 'CustomerInformation'.pluralize won't work, if you ever need to do that.

  inflect.plural /(information)$/i, "\1" # Bingo! (Actually, this has the suffix problem that @josevalim mentioned...)
end

And things like "stinky fish", "optical mouse" would each require pretty different hacks to make them work. My point is, working around this requires a far amount of knowledge to the internals of the Inflector, and offloading this to the users seems like asking a little bit too much IMO.

Perfect example of hydra code?

Godfrey Chan
Owner

/cc @jeremy

Godfrey Chan chancancode Correctly handle phrases in singularize/pluralize.
Currently, the inflector is not set up correctly to handle phrases
(multi-word strings) as input. This ocasionally works right now, but it
is more or less a coincidence. Since Rails internally depend on this
singularize/pluraize being able to handle phrases (see tableize), this
commit fixes the bug and correctly handle phrases in singularize and
pluralize. The following should work as expected after this patch:

   'MyClass'.pluralize # => 'MyClasses'
   'some-slug'.pluralize # => 'some-slugs'
   'saw_fish'.pluralize # => 'saw_fish'
   'optical mouse'.pluralize # => 'optical mice'
   'funky jeans'.singularize # => 'funky jeans'

See #7134 for discussions.
ca856d6
Godfrey Chan
Owner

I updated the PR, assuming option 2 is the way to go. I didn't squash them because I think they deserve to each live inside their own commit to make things more traceable.

Godfrey Chan
Owner

@NZKoz Can I get some feedback from you as well?

Michael Koziarski
Owner

My opinion is that both this change, and the one which broke backwards compatibility previously, are not worth the can of worms they'll undoubtably open.

If I was forced to pick here, then 3 would be my choice, but honestly, this way madness lies

David Celis

Option 4 seems silly. There are change logs and upgrade guides for a reason; if developers are haphazardly updating their Rails versions without reading those, do they have anybody to blame but themselves? Rails is a huge framework; updates shouldn't be taken lightly. Being this concerned about backwards compatibility with a major release (4.0) seems silly to me, especially considering inflections such as these can be easily altered by the user.

Steve Klabnik
Collaborator

This change will break tons of apps, and we're not interested in breaking changes. Furthermore, the inflector is generally considered frozen. I agree with @NZKoz .

David Celis

Interesting that you should link to @wycats' comment on that thread, because isn't it stating that Rails versioning actually does pretty much work like I think? Wouldn't this paragraph be an argument against @chancancode's 4th point?

Rails major versions usually make substantial breaking changes. The primary difference between minor and major releases in Rails is the magnitude of the breaking changes. Changes made in minor versions are small and should not require large-scale application changes, unless required for security fixes. Changes made in major releases often do require large-scale application changes (think XSS protection in Rails 3.0 and the Ruby 1.9 requirement in Rails 4.0).

Just finding it interesting that, in a topic where many claim Rails doesn't follow Semver, you link directly to @wycats' comment stating that it does actually follow a form of Semver where major upgrades introduce breaking changes of a high order of magnitude (such as breaking changes that would be brought in via fixing the inflector)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Jul 23, 2012
  1. Godfrey Chan

    Correctly apply inflection rules on uncountable multi-word strings.

    chancancode authored
    Currently, apply_inflections does not take into account that the input might be a multi-word string separated by underscore. (tableize etc relies on this behavior to work correctly.) This bug affects only uncountable multi-word strings separated by underscores:
    
       "funky jeans".singularize # => "funky jeans"
       "client information".pluralize # => "client information"
    
       "funky_jeans".singularize # => "funky_jean"
       "client_information".pluralize # => "client_informations"
    
    It's also worth pointing out that "funky_jeans".singularize used to work correctly before 9b4622a, which was merged almost 2 years ago so this is essentially broken since Rails 3.
    
    On the other hand, it appears that "client_information".pluralize has always been broken, so this does change the behavior of tableize and could potentially break some older apps out there.
    
    Closes #7132.
  2. Godfrey Chan

    Correctly handle phrases in singularize/pluralize.

    chancancode authored
    Currently, the inflector is not set up correctly to handle phrases
    (multi-word strings) as input. This ocasionally works right now, but it
    is more or less a coincidence. Since Rails internally depend on this
    singularize/pluraize being able to handle phrases (see tableize), this
    commit fixes the bug and correctly handle phrases in singularize and
    pluralize. The following should work as expected after this patch:
    
       'MyClass'.pluralize # => 'MyClasses'
       'some-slug'.pluralize # => 'some-slugs'
       'saw_fish'.pluralize # => 'saw_fish'
       'optical mouse'.pluralize # => 'optical mice'
       'funky jeans'.singularize # => 'funky jeans'
    
    See #7134 for discussions.
This page is out of date. Refresh to see the latest.
22 activesupport/lib/active_support/inflector/methods.rb
View
@@ -312,12 +312,28 @@ def const_regexp(camel_cased_word) #:nodoc:
# apply_inflections("posts", inflections.singulars) # => "post"
def apply_inflections(word, rules)
result = word.to_s.dup
+ prefix, last_word = split_phrase(result)
- if word.empty? || inflections.uncountables.include?(result.downcase[/\b\w+\Z/])
+ if word.empty? || inflections.uncountables.include?(last_word.downcase)
result
else
- rules.each { |(rule, replacement)| break if result.sub!(rule, replacement) }
- result
+ rules.each { |(rule, replacement)| break if last_word.sub!(rule, replacement) }
+ prefix + last_word
+ end
+ end
+
+ # Split a phrase into two chunks for +apply_inflections+:
+ # split_phrase('word') # => ['', 'word']
+ # split_phrase('two words') # => ['two ', 'words']
+ # split_phrase('two_words') # => ['two_', 'words']
+ # split_phrase('two-words') # => ['two-', 'words']
+ # split_phrase('myWord') # => ['my','Word']
+ # split_phrase('many many words') # => ['many many ','words']
+ def split_phrase(phrase)
+ if last_word = underscore(phrase)[/[a-z0-9]+\Z/i]
+ [phrase[0...-last_word.length], phrase[-last_word.length..-1]]
+ else
+ [phrase, '']
end
end
end
12 activesupport/test/inflector_test.rb
View
@@ -17,6 +17,18 @@ def test_pluralize_empty_string
assert_equal "", ActiveSupport::Inflector.pluralize("")
end
+ def test_pluralize_camel_case_string
+ assert_equal "camelCases", ActiveSupport::Inflector.pluralize("camelCase")
+ assert_equal "OpticalMice", ActiveSupport::Inflector.pluralize("OpticalMouse")
+ assert_equal "SawFish", ActiveSupport::Inflector.pluralize("SawFish")
+ end
+
+ def test_singularize_camel_case_string
+ assert_equal "camelCase", ActiveSupport::Inflector.singularize("camelCases")
+ assert_equal "OpticalMouse", ActiveSupport::Inflector.singularize("OpticalMice")
+ assert_equal "SawFish", ActiveSupport::Inflector.singularize("SawFish")
+ end
+
ActiveSupport::Inflector.inflections.uncountable.each do |word|
define_method "test_uncountability_of_#{word}" do
assert_equal word, ActiveSupport::Inflector.singularize(word)
5 activesupport/test/inflector_test_cases.rb
View
@@ -15,6 +15,7 @@ module InflectorTestCases
"jeans" => "jeans",
"funky jeans" => "funky jeans",
"my money" => "my money",
+ "your mouse" => "your mice",
"category" => "categories",
"query" => "queries",
@@ -62,6 +63,9 @@ module InflectorTestCases
"old_news" => "old_news",
"news" => "news",
+ "funky_jeans" => "funky_jeans",
+ "stinky_fish" => "stinky_fish",
+
"series" => "series",
"species" => "species",
@@ -81,7 +85,6 @@ module InflectorTestCases
"status" => "statuses",
"status_code" => "status_codes",
"mouse" => "mice",
-
"louse" => "lice",
"house" => "houses",
"octopus" => "octopi",
Something went wrong with that request. Please try again.