Skip to content
This repository

Correctly apply inflection rules on multi-word strings. (Fixes #7132) #7134

Closed
wants to merge 2 commits into from

5 participants

Godfrey Chan José Valim Michael Koziarski David Celis Steve Klabnik
Godfrey Chan
Collaborator

Edit: see my comment below for a more general version of this problem/fix.

Currently, apply_inflections does not take into account that the input might be a multi-word string separated by underscore. (tableize etc relies on this behavior to work correctly.) This bug affects only uncountable multi-word strings separated by underscores:

   "funky jeans".singularize # => "funky jeans"
   "client information".pluralize # => "client information"

   "funky_jeans".singularize # => "funky_jean"
   "client_information".pluralize # => "client_informations"

It's also worth pointing out that "funky_jeans".singularize used to work correctly before 9b4622a, which was merged almost 2 years ago so this is essentially broken since Rails 3.

On the other hand, it appears that "client_information".pluralize has always been broken, so this does change the behavior of tableize and could potentially break some older apps out there.

Closes #7132.

Godfrey Chan Correctly apply inflection rules on uncountable multi-word strings.
Currently, apply_inflections does not take into account that the input might be a multi-word string separated by underscore. (tableize etc relies on this behavior to work correctly.) This bug affects only uncountable multi-word strings separated by underscores:

   "funky jeans".singularize # => "funky jeans"
   "client information".pluralize # => "client information"

   "funky_jeans".singularize # => "funky_jean"
   "client_information".pluralize # => "client_informations"

It's also worth pointing out that "funky_jeans".singularize used to work correctly before 9b4622a, which was merged almost 2 years ago so this is essentially broken since Rails 3.

On the other hand, it appears that "client_information".pluralize has always been broken, so this does change the behavior of tableize and could potentially break some older apps out there.

Closes #7132.
fba8a6a
José Valim
Owner

Thanks for the detailed reported. Although I don't think this is the proper solution. After all, after removing the boundary character from the regexp \b, we could match words that contain just part of the uncountable word. Perhaps we want to add [\b\_\-] (or even break \b apart) explicitly instead:

1.9.3-p0-perf :001 > "fish" =~ /\bfish/
=> 0
1.9.3-p0-perf :002 > "sawfish" =~ /\bfish/
=> nil
1.9.3-p0-perf :003 > "saw_fish" =~ /\bfish/
=> nil
1.9.3-p0-perf :004 > "saw_fish" =~ /[\b_-]fish/
=> 3

ps: I am not sure if sawfish is a valid word or not, but the idea is just to show a scenario where we could possibly generate invalid matches.

Godfrey Chan
Collaborator

@josevalim Actually, the fix I included here works by breaking \w apart, thus removing the need for \b. \w expands to (I think) [a-zA-Z0-9_], which is basically what we want here except we don't care about A-Z (because of the i flag) and we don't want _. Therefore, [a-z0-9]+$ would always match the last word in the string without the need of \b.

José Valim
Owner

Right, thanks. Looks good to me.

Godfrey Chan
Collaborator

Actually, I just noticed the problem is more wide-spread than I think. Consider this:

'optical mouse' # => "optical mouses" (instead of "optical mice")

This is broken since 16e7f2f (see #5177). The root of the problem is we don't have a clear spec on what arguments are considered "fair game" for the inflectors.

I did a little bit of digging. Pretty much since the beginning of time (dc3d6eb), pluralize was expected to support multi-word strings separated by underscores. However, this is not really documented explicitly so everyone forgot about it. And then we started adding inflector rules that expects the input to be a single-word strings (note the /^...$/i). Overtime more and more such rules were added. And then at some point, we started assumeing pluralize/singularize should take on CamelCase strings as well. Eventually everyone forget about pluralize's double duty.

So basically, no one really knows what singularize/pluralize should do, and the fact that tableize worked okay all along (on most inputs anyways) is more or less a miracle.

I can update the PR to address this, but we will have to first agree on which of the followings is the "right" fix:

  1. "singularize/pluralize is supposed to take phrases": Fix the inflector rules to drop all usage of /^.../i and replace them with a lookahead of something (because we already know \b doesn't work).

  2. "singularize/pluralize is supposed to take phrases, but inflector rules are supposed to operate on a single word": Then change apply_inflection to match the rules on the last word in the input.

  3. "singularize/pluralize is supposed to take a single word only": Then change tableize, update the docs and drop the unnecessary tests.

  4. "Admit that the inflector is broken/inconsistent, but is too big to fail at this point": Do nothing.

None of these options is ideal. 4 might be most backwards-compatible, but we already broke backwards compatibility in 16e7f2f, we might as well fix this once and for all in Rails 4.

You could, in theory, fix all of these problems by adding custom inflector rules, but it's not trivial. For example, let's say you need an 'Admin::CustomerInformation' model. To work around this bug one might try:

ActiveSupport::Inflector.inflections do |inflect|
  inflect.uncountable "information" # Wrong. The inflector already know this.

  inflect.uncountable "customer_information" # Also wrong. tableize('admin_customer_information') doesn't work.

  inflect.plural /customer_information$/i, "customer_information" # Almost. But 'CustomerInformation'.pluralize won't work, if you ever need to do that.

  inflect.plural /(information)$/i, "\1" # Bingo! (Actually, this has the suffix problem that @josevalim mentioned...)
end

And things like "stinky fish", "optical mouse" would each require pretty different hacks to make them work. My point is, working around this requires a far amount of knowledge to the internals of the Inflector, and offloading this to the users seems like asking a little bit too much IMO.

Perfect example of hydra code?

Godfrey Chan
Collaborator

/cc @jeremy

Godfrey Chan Correctly handle phrases in singularize/pluralize.
Currently, the inflector is not set up correctly to handle phrases
(multi-word strings) as input. This ocasionally works right now, but it
is more or less a coincidence. Since Rails internally depend on this
singularize/pluraize being able to handle phrases (see tableize), this
commit fixes the bug and correctly handle phrases in singularize and
pluralize. The following should work as expected after this patch:

   'MyClass'.pluralize # => 'MyClasses'
   'some-slug'.pluralize # => 'some-slugs'
   'saw_fish'.pluralize # => 'saw_fish'
   'optical mouse'.pluralize # => 'optical mice'
   'funky jeans'.singularize # => 'funky jeans'

See #7134 for discussions.
ca856d6
Godfrey Chan
Collaborator

I updated the PR, assuming option 2 is the way to go. I didn't squash them because I think they deserve to each live inside their own commit to make things more traceable.

Godfrey Chan
Collaborator

@NZKoz Can I get some feedback from you as well?

Michael Koziarski
Owner
NZKoz commented

My opinion is that both this change, and the one which broke backwards compatibility previously, are not worth the can of worms they'll undoubtably open.

If I was forced to pick here, then 3 would be my choice, but honestly, this way madness lies

David Celis

Option 4 seems silly. There are change logs and upgrade guides for a reason; if developers are haphazardly updating their Rails versions without reading those, do they have anybody to blame but themselves? Rails is a huge framework; updates shouldn't be taken lightly. Being this concerned about backwards compatibility with a major release (4.0) seems silly to me, especially considering inflections such as these can be easily altered by the user.

Steve Klabnik
Collaborator

This change will break tons of apps, and we're not interested in breaking changes. Furthermore, the inflector is generally considered frozen. I agree with @NZKoz .

Steve Klabnik steveklabnik closed this
David Celis

Interesting that you should link to @wycats' comment on that thread, because isn't it stating that Rails versioning actually does pretty much work like I think? Wouldn't this paragraph be an argument against @chancancode's 4th point?

Rails major versions usually make substantial breaking changes. The primary difference between minor and major releases in Rails is the magnitude of the breaking changes. Changes made in minor versions are small and should not require large-scale application changes, unless required for security fixes. Changes made in major releases often do require large-scale application changes (think XSS protection in Rails 3.0 and the Ruby 1.9 requirement in Rails 4.0).

Just finding it interesting that, in a topic where many claim Rails doesn't follow Semver, you link directly to @wycats' comment stating that it does actually follow a form of Semver where major upgrades introduce breaking changes of a high order of magnitude (such as breaking changes that would be brought in via fixing the inflector)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Showing 2 unique commits by 1 author.

Jul 23, 2012
Godfrey Chan Correctly apply inflection rules on uncountable multi-word strings.
Currently, apply_inflections does not take into account that the input might be a multi-word string separated by underscore. (tableize etc relies on this behavior to work correctly.) This bug affects only uncountable multi-word strings separated by underscores:

   "funky jeans".singularize # => "funky jeans"
   "client information".pluralize # => "client information"

   "funky_jeans".singularize # => "funky_jean"
   "client_information".pluralize # => "client_informations"

It's also worth pointing out that "funky_jeans".singularize used to work correctly before 9b4622a, which was merged almost 2 years ago so this is essentially broken since Rails 3.

On the other hand, it appears that "client_information".pluralize has always been broken, so this does change the behavior of tableize and could potentially break some older apps out there.

Closes #7132.
fba8a6a
Godfrey Chan Correctly handle phrases in singularize/pluralize.
Currently, the inflector is not set up correctly to handle phrases
(multi-word strings) as input. This ocasionally works right now, but it
is more or less a coincidence. Since Rails internally depend on this
singularize/pluraize being able to handle phrases (see tableize), this
commit fixes the bug and correctly handle phrases in singularize and
pluralize. The following should work as expected after this patch:

   'MyClass'.pluralize # => 'MyClasses'
   'some-slug'.pluralize # => 'some-slugs'
   'saw_fish'.pluralize # => 'saw_fish'
   'optical mouse'.pluralize # => 'optical mice'
   'funky jeans'.singularize # => 'funky jeans'

See #7134 for discussions.
ca856d6
This page is out of date. Refresh to see the latest.
22  activesupport/lib/active_support/inflector/methods.rb
@@ -312,12 +312,28 @@ def const_regexp(camel_cased_word) #:nodoc:
312 312
     #  apply_inflections("posts", inflections.singulars) # => "post"
313 313
     def apply_inflections(word, rules)
314 314
       result = word.to_s.dup
  315
+      prefix, last_word = split_phrase(result)
315 316
 
316  
-      if word.empty? || inflections.uncountables.include?(result.downcase[/\b\w+\Z/])
  317
+      if word.empty? || inflections.uncountables.include?(last_word.downcase)
317 318
         result
318 319
       else
319  
-        rules.each { |(rule, replacement)| break if result.sub!(rule, replacement) }
320  
-        result
  320
+        rules.each { |(rule, replacement)| break if last_word.sub!(rule, replacement) }
  321
+        prefix + last_word
  322
+      end
  323
+    end
  324
+
  325
+    # Split a phrase into two chunks for +apply_inflections+:
  326
+    #  split_phrase('word') # => ['', 'word']
  327
+    #  split_phrase('two words') # => ['two ', 'words']
  328
+    #  split_phrase('two_words') # => ['two_', 'words']
  329
+    #  split_phrase('two-words') # => ['two-', 'words']
  330
+    #  split_phrase('myWord') # => ['my','Word']
  331
+    #  split_phrase('many many words') # => ['many many ','words']
  332
+    def split_phrase(phrase)
  333
+      if last_word = underscore(phrase)[/[a-z0-9]+\Z/i]
  334
+        [phrase[0...-last_word.length], phrase[-last_word.length..-1]]
  335
+      else
  336
+        [phrase, '']
321 337
       end
322 338
     end
323 339
   end
12  activesupport/test/inflector_test.rb
@@ -17,6 +17,18 @@ def test_pluralize_empty_string
17 17
     assert_equal "", ActiveSupport::Inflector.pluralize("")
18 18
   end
19 19
 
  20
+  def test_pluralize_camel_case_string
  21
+    assert_equal "camelCases", ActiveSupport::Inflector.pluralize("camelCase")
  22
+    assert_equal "OpticalMice", ActiveSupport::Inflector.pluralize("OpticalMouse")
  23
+    assert_equal "SawFish", ActiveSupport::Inflector.pluralize("SawFish")
  24
+  end
  25
+
  26
+  def test_singularize_camel_case_string
  27
+    assert_equal "camelCase", ActiveSupport::Inflector.singularize("camelCases")
  28
+    assert_equal "OpticalMouse", ActiveSupport::Inflector.singularize("OpticalMice")
  29
+    assert_equal "SawFish", ActiveSupport::Inflector.singularize("SawFish")
  30
+  end
  31
+
20 32
   ActiveSupport::Inflector.inflections.uncountable.each do |word|
21 33
     define_method "test_uncountability_of_#{word}" do
22 34
       assert_equal word, ActiveSupport::Inflector.singularize(word)
5  activesupport/test/inflector_test_cases.rb
@@ -15,6 +15,7 @@ module InflectorTestCases
15 15
     "jeans"       => "jeans",
16 16
     "funky jeans" => "funky jeans",
17 17
     "my money"    => "my money",
  18
+    "your mouse"  => "your mice",
18 19
 
19 20
     "category"    => "categories",
20 21
     "query"       => "queries",
@@ -62,6 +63,9 @@ module InflectorTestCases
62 63
     "old_news"    => "old_news",
63 64
     "news"        => "news",
64 65
 
  66
+    "funky_jeans" => "funky_jeans",
  67
+    "stinky_fish" => "stinky_fish",
  68
+
65 69
     "series"      => "series",
66 70
     "species"     => "species",
67 71
 
@@ -81,7 +85,6 @@ module InflectorTestCases
81 85
     "status"      => "statuses",
82 86
     "status_code" => "status_codes",
83 87
     "mouse"       => "mice",
84  
-
85 88
     "louse"       => "lice",
86 89
     "house"       => "houses",
87 90
     "octopus"     => "octopi",
Commit_comment_tip

Tip: You can add notes to lines in a file. Hover to the left of a line to make a note

Something went wrong with that request. Please try again.