Documents with extended UTF-8 URIs causes regexp error #29

Closed
electrum opened this issue Mar 18, 2011 · 4 comments

@electrum

Any idea how to work around this issue?

ruby-1.9.2-p136 :002 > Loofah.document("<a href=\"\u5927\">").scrub!(:strip)
Encoding::CompatibilityError: incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string)
    from /Users/dphillips/.rvm/gems/ruby-1.9.2-p136/gems/loofah-1.0.0/lib/loofah/html5/scrub.rb:20:in `gsub'
@ippa

ippa commented Apr 12, 2011

I got this too through feedzirra / ruby 1.9.2p180 (2011-02-18 revision 30909) [x86_64-linux].

While parsing http://www.cagepotato.com/feed/ (a UTF-8 feed):

/usr/local/lib/ruby/gems/1.9.1/gems/loofah-1.0.0/lib/loofah/html5/scrub.rb:20:in `gsub': incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string) (Encoding::CompatibilityError)
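
For anyone trying to narrow this down, the same class of error can be reproduced without Loofah. The sketch below only illustrates the encoding mismatch named in the message; it is not the code at scrub.rb:20, and the helper `match_error_for` and both patterns are made up for this example.

```ruby
# Hypothetical helper (not part of Loofah): runs gsub and returns the
# error message if the regexp and string encodings are incompatible,
# or nil if the substitution went through.
def match_error_for(pattern, text)
  text.gsub(pattern, "")
  nil
rescue Encoding::CompatibilityError => e
  e.message
end

utf8_text      = "\u5927"   # UTF-8 string containing a non-ASCII character
binary_pattern = /\xff/n    # /n plus a high byte: regexp fixed to ASCII-8BIT
utf8_pattern   = /\u00ff/   # \u escape: regexp fixed to UTF-8

# Binary regexp against non-ASCII UTF-8 text: raises, so a message comes back.
puts match_error_for(binary_pattern, utf8_text)

# Same regexp against pure-ASCII text: no conflict.
p match_error_for(binary_pattern, "plain ascii")  # nil

# UTF-8 regexp against the UTF-8 text: no conflict.
p match_error_for(utf8_pattern, utf8_text)        # nil
```

The last call suggests one possible direction, a pattern in the same encoding as the input, but that is an assumption about the general problem rather than a description of any fix in Loofah itself.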

@jmatraszek

Has anyone managed to work around this issue?

@flavorjones
Owner

I have a fix for this issue. Will be released early next week. Sorry for the delay.

@flavorjones
Owner

This fix is in 1.1.0, just released.
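
To check whether a given install already includes the fix, comparing version strings with RubyGems' `Gem::Version` is one option. This is a minimal sketch: the helper name `loofah_has_fix?` is made up for this example, and the 1.1.0 threshold is taken from the comment above.

```ruby
require "rubygems"

# Release that contains the fix, per this thread.
FIXED_IN = Gem::Version.new("1.1.0")

# Hypothetical helper: true if the given version string is 1.1.0 or later.
def loofah_has_fix?(version_string)
  Gem::Version.new(version_string) >= FIXED_IN
end

p loofah_has_fix?("1.0.0")  # false
p loofah_has_fix?("1.1.0")  # true
```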
