
Added Inflector#parameterize for easy slug generation ("Donald E. Knuth".parameterize => "donald-e-knuth") #713 [Matt Darby]
commit b8e8be83e952163e225f9b38bd7251cba9c44f38 1 parent b141624
@dhh dhh authored
2  activesupport/CHANGELOG
@@ -1,5 +1,7 @@
*Edge*
+* Added Inflector#parameterize for easy slug generation ("Donald E. Knuth".parameterize => "donald-e-knuth") #713 [Matt Darby]
+
* Changed cache benchmarking to be reported in milliseconds [DHH]
* Fix Ruby's Time marshaling bug in pre-1.9 versions of Ruby: utc instances are now correctly unmarshaled with a utc zone instead of the system local zone [#900 state:resolved] [Luca Guidi, Geoff Buesing]
19 activesupport/lib/active_support/core_ext/string/inflections.rb
@@ -87,6 +87,25 @@ def demodulize
Inflector.demodulize(self)
end
+ # Replaces special characters in a string so that it may be used as part of a 'pretty' URL.
+ #
+ # ==== Examples
+ #
+ # class Person
+ # def to_param
+ # "#{id}-#{name.parameterize}"
+ # end
+ # end
+ #
+ # @person = Person.find(1)
+ # # => #<Person id: 1, name: "Donald E. Knuth">
+ #
+ # <%= link_to(@person.name, person_path(@person)) %>
+ # # => <a href="/person/1-donald-e-knuth">Donald E. Knuth</a>
+ def parameterize
+ Inflector.parameterize(self)
+ end
+
# Creates the name of a table like Rails does for models to table names. This method
# uses the +pluralize+ method on the last word in the string.
#
19 activesupport/lib/active_support/inflector.rb
@@ -240,6 +240,25 @@ def humanize(lower_case_and_underscored_word)
def demodulize(class_name_in_module)
class_name_in_module.to_s.gsub(/^.*::/, '')
end
+
+ # Replaces special characters in a string so that it may be used as part of a 'pretty' URL.
+ #
+ # ==== Examples
+ #
+ # class Person
+ # def to_param
+ # "#{id}-#{name.parameterize}"
+ # end
+ # end
+ #
+ # @person = Person.find(1)
+ # # => #<Person id: 1, name: "Donald E. Knuth">
+ #
+ # <%= link_to(@person.name, person_path(@person)) %>
+ # # => <a href="/person/1-donald-e-knuth">Donald E. Knuth</a>
+ def parameterize(string, sep = '-')
+ string.gsub(/[^a-z0-9]+/i, sep).downcase
+ end
# Create the name of a table like Rails does for models to table names. This method
# uses the +pluralize+ method on the last word in the string.
6 activesupport/test/inflector_test.rb
@@ -98,6 +98,12 @@ def test_tableize
end
end
+ def test_parameterize
+ StringToParameterized.each do |some_string, parameterized_string|
+ assert_equal(parameterized_string, ActiveSupport::Inflector.parameterize(some_string))
+ end
+ end
+
def test_classify
ClassNameToTableName.each do |class_name, table_name|
assert_equal(class_name, ActiveSupport::Inflector.classify(table_name))
5 activesupport/test/inflector_test_cases.rb
@@ -142,6 +142,11 @@ module InflectorTestCases
"NodeChild" => "node_children"
}
+ StringToParameterized = {
+ "Donald E. Knuth" => "donald-e-knuth",
+ "Random text with *(bad)* characters" => "random-text-with-bad-characters"
+ }
+
UnderscoreToHuman = {
"employee_salary" => "Employee salary",
"employee_id" => "Employee",

22 comments on commit b8e8be8

@tilsammans

Would have been even better with stringex, since that catches foreign characters much better (i.e. at all).

http://github.com/rsl/stringex/tree/master

@henrik

stringex looks very cool, but seems like it could be overkill here. Slugalizer is very little code but handles accented characters as well as some corner cases that parameterize doesn’t:

http://github.com/henrik/slugalizer/tree/master

@karmi

The idea is great, but unfortunately this is of no use for any accented characters (Czech, Polish, other alphabets).

Example:

puts parameterize('Žluťoučký kůň skákal přes rozpálené koleje')
# => -lu-ou-k-k-sk-kal-p-es-rozp-len-koleje
puts parameterize('Garçons')
# => gar-ons
puts parameterize('Malmö')
# => malm-

Stringex has a very good implementation.

In a Czech context, we have been using Iconv for this with good results:

puts Iconv.new('ascii//translit', 'utf-8').iconv("Žluťoučký kůň skákal v tůňce na Öresündu ©").tr(' ', '-').downcase.gsub(/[^0-9a-z-]/, '') 
=> zlutoucky-kun-skakal-v-tunce-na-oresundu-c

An even better solution is this one from http://workingwithrails.com/person/12298-adam-cig-nek:

(See the “cig-nek” in URL? That should be “ciganek”. )

string.chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]/, '')

See http://forum.rubyonrails.cz/forums/1/topics/9?page=2#posts-227 for explanation if you can read Czech, otherwise write here.

For this purpose, the implementation is certainly insufficient. In the case of names (see “cig-nek” above), it may be downright insulting :)

@karmi

Sorry about the train-wreck with the pre tags.

And one more thing to clarify the point: to me, Rails is above all about best practices in web development. Dropping letters from people’s names with accented characters, as you see on so many Rails-based websites (WWR, Slideshare, etc.), is certainly not a best practice.

@masterkain

I vote for slugalizer

@bumi

Rails should be as lightweight as possible; that’s why I think this implementation is great, and it is good enough for the 80% of people using it. The rest of us can use plugins like Slugalizer or stringex, which are really awesome, too!

@mdarby

I agree. 80% rule applies here.

@karmi

Of course, for an English text, it’s almost 100% accurate. The problem with such an implementation is that it promises functionality which is insufficient in principle, and which is broken for every non-English text.

@djanowski

Please remember we are not talking about L18N here. Even in the US and UK (is that what others regard as 80%?) there are people with names from different cultures. I don’t see why you would make any application so English-centric and not allow foreign names on it (or make them look ugly).

@chuyeow

I think karmi hit the nail on the head – the documentation promises some functionality that is insufficient.

Perhaps a simple change of documentation to make a note about non-English characters not being parameterized properly is sufficient? Maybe a recommendation to use Slugalizer or some suitable plugin too.

@NZKoz
Owner

If we can make something less dumb using the String#chars stuff we already have, then we should. Otherwise we can just update the documentation. I don’t want to depend on iconv.

FWIW, the irony of this changeset mangling david’s new home town isn’t lost on me ;)

@karmi

Even in the US and UK (…) there are people with names from different cultures…

Precisely. It still hurts me that WWR drops two letters from my name so I end up being “karel-mina-k” there. Many more examples can be found in almost every Rails application on the web.

I vote for the documentation caveat.

Moreover, as stated on Lighthouse, we use this slightly cryptic but working code with very good results:

string.chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]/, '')

Try this in script/console:

>> "Malm\303\266".chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]/, '')
=> "Malmo"

(Note: Mac Terminal escapes non-ASCII)
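
The decomposition step in the one-liner above can be tried outside Rails. This is a minimal sketch, assuming a Ruby version that provides `String#unicode_normalize` (2.2 or later) as a stand-in for ActiveSupport's `chars.normalize(:kd)`; the helper name `strip_accents` is hypothetical and is not part of ActiveSupport.

```ruby
# Hypothetical helper (not an ActiveSupport method): decompose accented
# characters, then drop every character outside the ASCII range.
def strip_accents(string)
  # NFKD splits "ö" into a plain "o" followed by a combining diaeresis (U+0308).
  decomposed = string.unicode_normalize(:nfkd)
  # The combining mark is non-ASCII, so this removes it and keeps the base letter.
  decomposed.gsub(/[^\x00-\x7F]/, '')
end

# "\u00F6" is the precomposed "ö"; after NFKD the last two codepoints are o + U+0308.
p "Malm\u00F6".unicode_normalize(:nfkd).codepoints  # => [77, 97, 108, 109, 111, 776]
p strip_accents("Malm\u00F6")                        # => "Malmo"
p strip_accents("Gar\u00E7ons")                      # => "Garcons"
```

The base letter survives only because normalization runs first; the same `gsub` applied to the unnormalized string would remove the whole accented character.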

@NZKoz
Owner

Karmi, why wouldn’t we change the parameterize implementation to use your example above?

@NZKoz
Owner

Perhaps something like this: http://gist.github.com/10227

@NZKoz
Owner

gist appears to be determined to wreck the non-ascii characters in that patch, but you get the gist.

@henrik

While stringex is huge, note that Slugalizer is a one-liner, if you ignore the whitespace-for-readability and validating the argument. With all that, it’s about ten short lines. Most of slugalizer.rb is tests.

I think Slugalizer strikes a good balance between lightweight and best practice.

@karmi

NZKoz, that would go much better with accented chars, although I don’t think the first line of the method body is needed? Ie.:

>> "All the gar\303\247ons from Malm\303\266".chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]+/, '').gsub(/[^a-z0-9_\-]+/i, sep).downcase
=> "all-the-garcons-from-malmo"

(Note please that I am not the author of the code.)
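
The chained call above can be wrapped in a method with the same signature as the committed `parameterize(string, sep = '-')`. The following is only a sketch under stated assumptions: `parameterize_sketch` is a hypothetical name, `String#unicode_normalize(:nfkd)` (Ruby 2.2+) stands in for `chars.normalize(:kd)`, and it is not the code that was eventually merged into Rails.

```ruby
# Hypothetical standalone sketch, not the ActiveSupport implementation.
def parameterize_sketch(string, sep = '-')
  string.unicode_normalize(:nfkd)      # split accented letters into base letter + combining mark
        .gsub(/[^\x00-\x7F]+/, '')     # drop the combining marks and any other non-ASCII characters
        .gsub(/[^a-z0-9_\-]+/i, sep)   # collapse every remaining run of unwanted characters into sep
        .downcase
end

p parameterize_sketch("All the gar\u00E7ons from Malm\u00F6")    # => "all-the-garcons-from-malmo"
p parameterize_sketch("Donald E. Knuth")                         # => "donald-e-knuth"
p parameterize_sketch("Random text with *(bad)* characters")     # => "random-text-with-bad-characters"
```

Letters that have no Unicode decomposition (for example "ø") are still removed entirely by the second step, so this approach narrows the problem raised in this thread without fully solving it.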

@karmi

> … don’t think the first line of the method body is needed?

Eh, I am sorry. Apparently I need to slow down a bit to be able to read colorized (!) diff at least :)

@henrik

Now handles accented characters – further discussion here: http://github.com/rails/rails/commit/1ddde91303883b47f2215779cf45d7008377bd0d#comments

@Bounga

I’ve got a plugin that does the same, but the string sanitizing seems better.

You should take a look at http://github.com/Bounga/acts_as_nice_url/

@henrik

Bounga: There have been more commits since this discussion. See my link above.

I see your plugin uses iconv. That has some issues. See the README of Slugalizer, linked above.
