Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Flesh out the parameterize method to support non-ascii text and under…
…scores.
- Loading branch information
Showing
2 changed files
with
5 additions
and
2 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
1ddde91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Shouldn’t the to_s go right after “string”, though?
1ddde91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_s is to convert the Multibyte::Chars back to a string after normalization.
1ddde91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tarmo: Ah, right. A to_s after “string” would make it more robust for input like nil or numbers, but that might not be desired.
1ddde91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m not sure the nil safety is warranted. 99.999% of people will call this with String#parameterize, not Inflector.parameterize…
1ddde91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method should also collapse multiple occurrences of the separator (‘foo—-bar’ => ‘foo-bar’) and strip leading/trailing occurrences (‘
foo-bar’ => ‘foo-bar’).1ddde91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of considerations. When $KCODE isn’t set to UTF-8 in Ruby <= 1.8.6 this will break because normalize isn’t defined on String. Parameterizing non-ASCII strings results in a blank string: ‘おはよ’.parameterize => ‘’. I know that non of the other inflector methods support non-ASCII characters, what’s the verdict on this?
1ddde91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated Slugalizer based on some of the code traded in the parameterize comments. The biggest change was that is now turns e.g. “foo@bar.com” into “foo-bar-com” instead of “foobarcom” – but it still squeezes multiple separators and removes leading/trailing separators, so " ! foo—dash@bar.com ! " becomes “foo-dash-bar-com”.
I think the current version of Slugalizer has no downsides compared to the current version of parameterize, but it also handles the stuff tomstuart mentioned. It also works with other $KCODEs than ‘u’, that I can tell.
While I do think it’s good to keep it lean, if this method should be present at all, it might as well be as good as it can be – at least as long as it’s just a matter of another short line or two of code.
Regarding the blank string, I think that’s perfectly reasonable. It would certainly be more useful if Japanese etc were transcribed, but I think then we’re firmly in plugin country (see Stringex).
1ddde91
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, NZKoz!
Also check this ticket