Added Inflector#parameterize for easy slug generation ("Donald E. Knu…
…th".parameterize => "donald-e-knuth") #713 [Matt Darby]
dhh committed Sep 10, 2008
1 parent b141624 commit b8e8be8
Showing 5 changed files with 51 additions and 0 deletions.
2 changes: 2 additions & 0 deletions activesupport/CHANGELOG
@@ -1,5 +1,7 @@
*Edge*

* Added Inflector#parameterize for easy slug generation ("Donald E. Knuth".parameterize => "donald-e-knuth") #713 [Matt Darby]

* Changed cache benchmarking to be reported in milliseconds [DHH]

* Fix Ruby's Time marshaling bug in pre-1.9 versions of Ruby: utc instances are now correctly unmarshaled with a utc zone instead of the system local zone [#900 state:resolved] [Luca Guidi, Geoff Buesing]
19 changes: 19 additions & 0 deletions activesupport/lib/active_support/core_ext/string/inflections.rb
@@ -87,6 +87,25 @@ def demodulize
  Inflector.demodulize(self)
end

# Replaces special characters in a string so that it may be used as part of a 'pretty' URL.
#
# ==== Examples
#
#   class Person
#     def to_param
#       "#{id}-#{name.parameterize}"
#     end
#   end
#
#   @person = Person.find(1)
#   # => #<Person id: 1, name: "Donald E. Knuth">
#
#   <%= link_to(@person.name, person_path(@person)) %>
#   # => <a href="/person/1-donald-e-knuth">Donald E. Knuth</a>
def parameterize
  Inflector.parameterize(self)
end

# Creates the name of a table like Rails does for models to table names. This method
# uses the +pluralize+ method on the last word in the string.
#
19 changes: 19 additions & 0 deletions activesupport/lib/active_support/inflector.rb
@@ -240,6 +240,25 @@ def humanize(lower_case_and_underscored_word)
def demodulize(class_name_in_module)
  class_name_in_module.to_s.gsub(/^.*::/, '')
end

# Replaces special characters in a string so that it may be used as part of a 'pretty' URL.
#
# ==== Examples
#
#   class Person
#     def to_param
#       "#{id}-#{name.parameterize}"
#     end
#   end
#
#   @person = Person.find(1)
#   # => #<Person id: 1, name: "Donald E. Knuth">
#
#   <%= link_to(@person.name, person_path(@person)) %>
#   # => <a href="/person/1-donald-e-knuth">Donald E. Knuth</a>
def parameterize(string, sep = '-')
  string.gsub(/[^a-z0-9]+/i, sep).downcase
end

# Create the name of a table like Rails does for models to table names. This method
# uses the +pluralize+ method on the last word in the string.
6 changes: 6 additions & 0 deletions activesupport/test/inflector_test.rb
@@ -98,6 +98,12 @@ def test_tableize
    end
  end

  def test_parameterize
    StringToParameterized.each do |some_string, parameterized_string|
      assert_equal(parameterized_string, ActiveSupport::Inflector.parameterize(some_string))
    end
  end

  def test_classify
    ClassNameToTableName.each do |class_name, table_name|
      assert_equal(class_name, ActiveSupport::Inflector.classify(table_name))
5 changes: 5 additions & 0 deletions activesupport/test/inflector_test_cases.rb
@@ -142,6 +142,11 @@ module InflectorTestCases
    "NodeChild" => "node_children"
  }

  StringToParameterized = {
    "Donald E. Knuth" => "donald-e-knuth",
    "Random text with *(bad)* characters" => "random-text-with-bad-characters"
  }

  UnderscoreToHuman = {
    "employee_salary" => "Employee salary",
    "employee_id" => "Employee",
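
To make the behavior of the committed one-liner concrete, here is a minimal standalone sketch. The method name `parameterize_sketch` is hypothetical and chosen so the snippet runs without ActiveSupport; the body is the same expression as in the diff above. The last call is an extra illustration, not one of the commit's test cases.

```ruby
# Standalone copy of the expression added in this commit.
# `parameterize_sketch` is a hypothetical name used only for illustration.
def parameterize_sketch(string, sep = '-')
  # Collapse every run of characters outside a-z / 0-9 into one separator,
  # then lowercase the result.
  string.gsub(/[^a-z0-9]+/i, sep).downcase
end

puts parameterize_sketch("Donald E. Knuth")
# => donald-e-knuth
puts parameterize_sketch("Random text with *(bad)* characters")
# => random-text-with-bad-characters
puts parameterize_sketch("Donald E. Knuth", "_")
# => donald_e_knuth

# A run of non-matching characters at the end of the input is also replaced,
# so the result can end with a separator.
puts parameterize_sketch("Hello, world!")
# => hello-world-
```

Because the character class only admits ASCII letters and digits, anything else, including accented letters, is treated as a separator.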

22 comments on commit b8e8be8

@tilsammans
Contributor

Would have been even better with stringex, since that catches foreign characters much better (i.e. at all).

http://github.com/rsl/stringex/tree/master

@henrik
Contributor

@henrik henrik commented on b8e8be8 Sep 10, 2008

stringex looks very cool, but seems like it could be overkill here. Slugalizer is very little code but handles accented characters as well as some corner cases that parameterize doesn’t:

http://github.com/henrik/slugalizer/tree/master

@karmi
Contributor

@karmi karmi commented on b8e8be8 Sep 10, 2008

The idea is great, but unfortunately this is of no use for any accented characters (Czech, Polish, other alphabets).

Example:

puts parameterize('Žluťoučký kůň skákal přes rozpálené koleje')
# => -lu-ou-k-k-sk-kal-p-es-rozp-len-koleje
puts parameterize('Garçons')
# => gar-ons
puts parameterize('Malmö')
# => malm-

Stringex has a very good implementation.

We have been using Iconv for this, with good results, in a Czech context:

puts Iconv.new('ascii//translit', 'utf-8').iconv("Žluťoučký kůň skákal v tůňce na Öresündu ©").tr(' ', '-').downcase.gsub(/[^0-9a-z-]/, '')
# => zlutoucky-kun-skakal-v-tunce-na-oresundu-c

An even better solution is this one from http://workingwithrails.com/person/12298-adam-cig-nek:

(See the “cig-nek” in URL? That should be “ciganek”. )

string.chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]/, '')

See http://forum.rubyonrails.cz/forums/1/topics/9?page=2#posts-227 for an explanation if you can read Czech; otherwise, ask here.

The implementation is certainly insufficient for this purpose. In the case of names (see “cig-nek” above), it may be downright insulting :)

@karmi
Contributor

@karmi karmi commented on b8e8be8 Sep 10, 2008

Sorry about the train-wreck with the pre tags.

And one more thing to clarify the point: to me, Rails is above all about best practices in web development. Dropping letters from people’s names with accented characters, as you see on so many Rails-based websites (WWR, Slideshare, etc.), is certainly not best practice.

@masterkain
Contributor

I vote for slugalizer

@bumi bumi commented on b8e8be8 Sep 10, 2008

Rails should be as lightweight as possible. That’s why I think this implementation is great: it is good enough for 80% of the people using it.
The rest of us can use plugins like Slugalizer or stringex, which are really awesome, too!

@mdarby mdarby commented on b8e8be8 Sep 10, 2008

I agree. 80% rule applies here.

@karmi
Contributor

@karmi karmi commented on b8e8be8 Sep 10, 2008

Of course, for English text it’s almost 100% accurate. The problem with such an implementation is that it promises functionality which is insufficient in principle and broken for every non-English text.

@djanowski
Contributor

Please remember we are not talking about L18N here. Even in the US and UK (is that what others regard as 80%?) there are people with names from different cultures. I don’t see why you would make any application so English-centric and not allow foreign names on it (or make them look ugly).

@chuyeow
Contributor

I think karmi hit the nail on the head: the documentation promises some functionality that is insufficient.

Perhaps a simple documentation change, noting that non-English characters are not parameterized properly, would be sufficient? Maybe with a recommendation to use Slugalizer or some other suitable plugin, too.

@NZKoz
Member

@NZKoz NZKoz commented on b8e8be8 Sep 11, 2008

If we can make something less dumb using the String#chars stuff we already have, then we should. Otherwise we can just update the documentation. I don’t want to depend on iconv.

FWIW, the irony of this changeset mangling david’s new home town isn’t lost on me ;)

@karmi
Contributor

@karmi karmi commented on b8e8be8 Sep 11, 2008

> Even in the US and UK (…) there are people with names from different cultures…

Precisely. It still hurts me that WWR drops two letters from my name, so I end up being “karel-mina-k” there. There are many more examples in almost every Rails application on the web.

I vote for the documentation caveat.

Moreover, as stated on Lighthouse, we use this slightly cryptic but working code with very good results:

string.chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]/, '')

Try this in script/console:

>> "Malm\303\266".chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]/, '')
=> "Malmo"

(Note: Mac Terminal escapes non-ASCII)
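
The decomposition step in that snippet can be tried without ActiveSupport. The sketch below assumes Ruby 2.2 or later and uses the core method `String#unicode_normalize(:nfkd)` as a stand-in for the `chars.normalize(:kd)` call from the thread; the helper name `strip_accents_sketch` is hypothetical.

```ruby
# Assumption: String#unicode_normalize (Ruby 2.2+) replaces the
# ActiveSupport call string.chars.normalize(:kd) used in the thread.
# `strip_accents_sketch` is a hypothetical helper name.
def strip_accents_sketch(string)
  # NFKD splits a precomposed letter into a base letter plus combining marks.
  decomposed = string.unicode_normalize(:nfkd)
  # Dropping everything outside ASCII removes the combining marks.
  decomposed.gsub(/[^\x00-\x7F]/, '')
end

malmo = "Malm\u00F6" # "Malmö", written with an escape to avoid source-encoding issues

puts malmo.unicode_normalize(:nfkd).codepoints.map { |c| format('U+%04X', c) }.join(' ')
# => U+004D U+0061 U+006C U+006D U+006F U+0308
puts strip_accents_sketch(malmo)
# => Malmo

# Letters without a Unicode decomposition, such as U+00F8 (o with stroke),
# are removed entirely rather than mapped to a base letter.
puts strip_accents_sketch("S\u00F8ren")
# => Sren
```

This handles accented Latin letters that decompose into a base letter and a mark, but it is not a transliteration table, so some names still lose characters.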

@NZKoz
Member

@NZKoz NZKoz commented on b8e8be8 Sep 11, 2008

Karmi, why wouldn’t we change the parameterize implementation to use your example above?

@NZKoz
Member

@NZKoz NZKoz commented on b8e8be8 Sep 11, 2008

Perhaps something like this: http://gist.github.com/10227

@NZKoz
Member

@NZKoz NZKoz commented on b8e8be8 Sep 11, 2008

gist appears to be determined to wreck the non-ascii characters in that patch, but you get the gist.

@henrik
Contributor

@henrik henrik commented on b8e8be8 Sep 11, 2008

While stringex is huge, note that Slugalizer is a one-liner if you ignore the whitespace added for readability and the argument validation. With all that, it’s about ten short lines. Most of slugalizer.rb is tests.

I think Slugalizer strikes a good balance between lightweight and best practice.

@karmi
Contributor

@karmi karmi commented on b8e8be8 Sep 12, 2008

NZKoz, that would work much better with accented characters, although I don’t think the first line of the method body is needed? I.e.:

>> "All the gar\303\247ons from Malm\303\266".chars.normalize(:kd).to_s.gsub(/[^\x00-\x7F]+/, '').gsub(/[^a-z0-9_\-]+/i, sep).downcase
=> "all-the-garcons-from-malmo"

(Please note that I am not the author of the code.)
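
The chain above can be wrapped into a method with the same `(string, sep)` signature as the committed `parameterize`. This is only a sketch under stated assumptions: it substitutes Ruby's core `String#unicode_normalize(:nfkd)` (Ruby 2.2+) for `chars.normalize(:kd)`, the method name is hypothetical, and it is not the code from NZKoz's gist.

```ruby
# Hypothetical sketch of an accent-aware variant. Assumes Ruby 2.2+ and
# uses String#unicode_normalize in place of ActiveSupport's chars.normalize(:kd).
def parameterize_with_accents_sketch(string, sep = '-')
  string.unicode_normalize(:nfkd)       # split accented letters into base + mark
        .gsub(/[^\x00-\x7F]+/, '')      # drop the non-ASCII combining marks
        .gsub(/[^a-z0-9_\-]+/i, sep)    # replace remaining unsafe runs with sep
        .downcase
end

puts parameterize_with_accents_sketch("All the gar\u00E7ons from Malm\u00F6")
# => all-the-garcons-from-malmo

# "Žluťoučký kůň", written with escapes to avoid source-encoding issues
puts parameterize_with_accents_sketch("\u017Dlu\u0165ou\u010Dk\u00FD k\u016F\u0148")
# => zlutoucky-kun
```

Normalizing before the ASCII filter is what keeps the base letters; running the original regex first would already have turned each accented letter into a separator.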

@karmi
Contributor

@karmi karmi commented on b8e8be8 Sep 12, 2008

> … don’t think the first line of the method body is needed?

Eh, I am sorry. Apparently I need to slow down a bit to be able to read even a colorized (!) diff :)

@henrik
Contributor

@henrik henrik commented on b8e8be8 Sep 12, 2008

Now handles accented characters – further discussion here: http://github.com/rails/rails/commit/1ddde91303883b47f2215779cf45d7008377bd0d#comments

@Bounga
Contributor

@Bounga Bounga commented on b8e8be8 Sep 23, 2008

I’ve got a plugin that does the same, but the string sanitizing seems better.

You should take a look at http://github.com/Bounga/acts_as_nice_url/

@henrik
Contributor

@henrik henrik commented on b8e8be8 Sep 24, 2008

Bounga: There have been more commits since this discussion. See my link above.

I see your plugin uses iconv. That has some issues. See the README of Slugalizer, linked above.
