Description: Ruby on Rails
Homepage: http://rubyonrails.org
Clone URL: git://github.com/rails/rails.git
Flesh out the parameterize method to support non-ascii text and underscores.
NZKoz (author)
Thu Sep 11 07:03:38 -0700 2008
commit  1ddde91303883b47f2215779cf45d7008377bd0d
tree    a1a389cba8c08504bad19fcf4d8340cc554e8cee
parent  46bac29de7e39bd2af6ed6cfba0498a921b5213e
@@ -257,7 +257,7 @@ module ActiveSupport
     #   <%= link_to(@person.name, person_path %>
     #   # => <a href="/person/1-donald-e-knuth">Donald E. Knuth</a>
     def parameterize(string, sep = '-')
-      string.gsub(/[^a-z0-9]+/i, sep).downcase
+      string.chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]+/, '').gsub(/[^a-z0-9_\-]+/i, sep).downcase
     end
 
     # Create the name of a table like Rails does for models to table names. This method
@@ -144,7 +144,10 @@ module InflectorTestCases
 
   StringToParameterized = {
     "Donald E. Knuth"                     => "donald-e-knuth",
-    "Random text with *(bad)* characters" => "random-text-with-bad-characters"
+    "Random text with *(bad)* characters" => "random-text-with-bad-characters",
+    "Malmö"                               => "malmo",
+    "Garçons"                             => "garcons",
+    "Allow_Under_Scores"                  => "allow_under_scores"
   }
 
   UnderscoreToHuman = {
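
For readers without the ActiveSupport of that era at hand, the new one-liner can be approximated in plain Ruby. This is only a sketch, not the committed code: it assumes Ruby 2.2 or later and UTF-8 input, swaps the standard library's String#unicode_normalize(:nfkd) in for string.chars.normalize(:kd), and uses the hypothetical name parameterize_sketch.

```ruby
# Hypothetical stand-alone approximation of the patched method.
# Assumes Ruby >= 2.2 (String#unicode_normalize) and UTF-8 input.
def parameterize_sketch(string, sep = '-')
  string
    .unicode_normalize(:nfkd)      # decompose accented letters into base letter + combining mark
    .gsub(/[^\x00-\x7F]+/, '')     # drop everything outside ASCII, including the combining marks
    .gsub(/[^a-z0-9_\-]+/i, sep)   # replace each run of other characters with the separator
    .downcase
end

puts parameterize_sketch("Donald E. Knuth")
puts parameterize_sketch("Allow_Under_Scores")
```

The order matters: the non-ASCII strip has to run after normalization, otherwise an accented letter would be removed whole instead of being reduced to its base letter.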

Comments

henrik Thu Sep 11 09:44:18 -0700 2008

Nice. Shouldn’t the to_s go right after “string”, though?

tarmo Thu Sep 11 13:27:59 -0700 2008

to_s is to convert the Multibyte::Chars back to a string after normalization.

henrik Thu Sep 11 16:33:43 -0700 2008

tarmo: Ah, right. A to_s after “string” would make it more robust for input like nil or numbers, but that might not be desired.

NZKoz Fri Sep 12 02:28:34 -0700 2008

I’m not sure the nil safety is warranted. 99.999% of people will call this with String#parameterize, not Inflector.parameterize…

tomstuart Fri Sep 12 02:41:20 -0700 2008

This method should also collapse multiple occurrences of the separator (‘foo---bar’ => ‘foo-bar’) and strip leading/trailing occurrences (‘-foo-bar-’ => ‘foo-bar’).
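
The two clean-up steps suggested here could look roughly like the following. It is a minimal sketch under stated assumptions, not code from Rails or from any plugin: tidy_separators is a hypothetical helper meant to be applied to an already-parameterized string.

```ruby
# Hypothetical helper: squeeze repeated separators into one and
# trim separators from both ends of the slug.
def tidy_separators(slug, sep = '-')
  escaped = Regexp.escape(sep)                        # make the separator safe inside a regexp
  slug
    .gsub(/(?:#{escaped}){2,}/, sep)                  # 'foo---bar' becomes 'foo-bar'
    .gsub(/\A(?:#{escaped})+|(?:#{escaped})+\z/, '')  # remove leading/trailing separators
end

puts tidy_separators('foo---bar')
puts tidy_separators('-foo-bar-')
```

Escaping the separator keeps the helper correct for separators that are regexp metacharacters, such as '.' or '+'.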

Manfred Fri Sep 12 05:07:22 -0700 2008

A couple of considerations. When $KCODE isn’t set to UTF-8 in Ruby <= 1.8.6 this will break because normalize isn’t defined on String. Also, parameterizing a string that contains no ASCII characters results in a blank string: ‘おはよ’.parameterize => ‘’. I know that none of the other inflector methods support non-ASCII characters, so what’s the verdict on this?
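
The blank result follows from the first two steps of the new line alone. As a rough illustration (assuming Ruby 2.2 or later and UTF-8 strings, with String#unicode_normalize standing in for chars.normalize(:kd); ascii_only_sketch is a hypothetical name):

```ruby
# Hypothetical reduction of the patched method to its first two steps.
# Assumes Ruby >= 2.2 and UTF-8 strings.
def ascii_only_sketch(string)
  string.unicode_normalize(:nfkd).gsub(/[^\x00-\x7F]+/, '')
end

latin    = ascii_only_sketch("Malm\u00F6")            # the accented letter keeps its ASCII base letter
hiragana = ascii_only_sketch("\u304A\u306F\u3088")    # hiragana has no ASCII base letters at all

puts latin.inspect
puts hiragana.inspect
```

Decomposition only helps when a character has an ASCII base letter to fall back on; scripts without one are removed entirely by the strip step.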

henrik Fri Sep 12 07:26:54 -0700 2008

I updated Slugalizer based on some of the code traded in the parameterize comments. The biggest change is that it now turns e.g. “foo@bar.com” into “foo-bar-com” instead of “foobarcom” – but it still squeezes multiple separators and removes leading/trailing separators, so " ! foo--dash@bar.com ! " becomes “foo-dash-bar-com”.

I think the current version of Slugalizer has no downsides compared to the current version of parameterize, and it also handles the cases tomstuart mentioned. As far as I can tell, it also works with $KCODE values other than ‘u’.

While I do think it’s good to keep it lean, if this method should be present at all, it might as well be as good as it can be – at least as long as it’s just a matter of another short line or two of code.

Regarding the blank string, I think that’s perfectly reasonable. It would certainly be more useful if Japanese etc were transcribed, but I think then we’re firmly in plugin country (see Stringex).

karmi Fri Sep 12 08:07:06 -0700 2008

Thanks, NZKoz!

Also check this ticket