
fixes 'malformed utf8 character' crash

1 parent 1dcf0ba commit 9b4cf995d0a82ffa3f18013df9efd440906a89f5 @lloydpick committed Apr 1, 2010
Showing with 2 additions and 2 deletions.
  1. +2 −2 lib/friendly_id/slug_string.rb
4 lib/friendly_id/slug_string.rb
@@ -155,7 +155,7 @@ def initialize(string)
# @return String
def approximate_ascii!(*args)
@maps = (self.class.approximations + args + [:common]).flatten.uniq
- @wrapped_string = normalize_utf8(:c).unpack("U*").map { |char| approx_char(char) }.flatten.pack("U*")
+ @wrapped_string = tidy_bytes.normalize_utf8(:c).unpack("U*").map { |char| approx_char(char) }.flatten.pack("U*")
# Removes leading and trailing spaces or dashes, and replaces multiple
@@ -175,7 +175,7 @@ def downcase!
# Remove any non-word characters.
# @return String
def word_chars!
- @wrapped_string = normalize_utf8(:c).unpack("U*").map { |char|
+ @wrapped_string = tidy_bytes.normalize_utf8(:c).unpack("U*").map { |char|
case char
# control chars
when 0..31

1 comment on commit 9b4cf99


The problem with this change is that if the string contains invalid characters, it fails completely on Ruby 1.9, because tidy_bytes is completely broken for 1.9 in ActiveSupport.

When it doesn't raise an error, it still removes any non-ASCII characters, which makes this unusable for sites that want URLs with Russian or Chinese strings (for example).

I have a library in progress that I will probably add as a dependency to FriendlyId or patch into ActiveSupport; in the meantime, you may want to install it and use it to preprocess your strings before passing them into FriendlyId.

The library is incomplete but usable, and works with Ruby 1.8.6 through 1.9.1.
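
To illustrate the kind of preprocessing described above, here is a minimal sketch. It is not the library mentioned in this comment, and `tidy_bytes_sketch` is a hypothetical name. It assumes Ruby 2.1 or later (for `String#scrub`), which is newer than the versions discussed here, and it assumes that stray invalid bytes are ISO-8859-1. Valid UTF-8, including Russian or Chinese text, passes through unchanged.

```ruby
# Hypothetical helper, not part of FriendlyId or ActiveSupport.
# Assumes Ruby 2.1+ (String#scrub) and that invalid bytes are ISO-8859-1.
def tidy_bytes_sketch(string)
  # Work on a copy so the caller's string is never mutated.
  utf8 = string.dup.force_encoding(Encoding::UTF_8)
  # Reinterpret each invalid byte run as ISO-8859-1 and transcode it to UTF-8.
  utf8.scrub do |bad_bytes|
    bad_bytes.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
  end
end

# "caf" followed by a lone 0xE9 byte (Latin-1 "é") is not valid UTF-8.
latin1_input = "caf\xE9"

begin
  latin1_input.unpack("U*")
rescue ArgumentError => e
  # This is the kind of failure the commit is trying to avoid.
  puts "unpack failed: #{e.message}"
end

cleaned = tidy_bytes_sketch(latin1_input)
p cleaned.unpack("U*")

# Already-valid non-ASCII text is left as-is.
russian = "привет"
p tidy_bytes_sketch(russian) == russian
```

Calling a helper like this on the input before it reaches FriendlyId means `unpack("U*")` only ever sees valid UTF-8. Guessing ISO-8859-1 for the bad bytes is a design choice for this sketch; another fallback encoding may suit your data better.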
