encode(string, :named) is re-encoding valid entries and breaking the HTML #7

Closed
3den opened this issue May 11, 2011 · 3 comments

Comments

@3den

3den commented May 11, 2011

Hi,
First of all, I would like to thank you for this awesome gem. However, I found a bug while trying to sanitize a string that contains both valid entities and unencoded characters. I explain the problem below:

require 'htmlentities'

coder = HTMLEntities.new
string = "> Car &amp; Bike <"
encoded = coder.encode(string)  # BUG => "&gt; Car &amp;amp; Bike &lt;"
encoded_twice = coder.encode(encoded)  # BUG => "&amp;gt; Car &amp;amp;amp; Bike &amp;lt;"

A workaround for this problem would be to "decode" before "encode", but this hack is too slow...
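For illustration, the decode-before-encode workaround can be sketched without the gem. This is a minimal sketch, not HTMLEntities' own code: `BASIC_ENTITIES`, `encode_basic`, `decode_basic`, and `normalize_then_encode` are hypothetical names, and only four basic entities are handled.

```ruby
# Hypothetical stand-ins for the gem's encode/decode, limited to four entities.
BASIC_ENTITIES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze
BASIC_CHARS = BASIC_ENTITIES.invert.freeze

# Encode every special character, including "&" inside existing entities.
def encode_basic(text)
  text.gsub(/[&<>"]/) { |char| BASIC_ENTITIES[char] }
end

# Decode the four known entities in a single pass.
def decode_basic(text)
  text.gsub(/&(?:amp|lt|gt|quot);/) { |entity| BASIC_CHARS[entity] }
end

# Decode first so that existing entities are not encoded a second time.
def normalize_then_encode(text)
  encode_basic(decode_basic(text))
end

once = normalize_then_encode('> Car &amp; Bike <')
twice = normalize_then_encode(once)
puts once           # &gt; Car &amp; Bike &lt;
puts twice == once  # true
```

The extra pass over the string is the cost mentioned above. It also collapses text that was meant to display a literal "&amp;", so whether it is safe depends on the source.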

@3den
Author

3den commented May 11, 2011

I solved this problem with the following code:

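class HTMLEntities
  class Encoder #:nodoc:
    # Match "&" only when it is not followed by a word character,
    # so that named entities such as "&amp;" are not encoded again.
    def basic_entity_regexp
      @basic_entity_regexp ||= (
        case @flavor
        when /^html/
          /[<>"]|(\&(?!\w))/
        else
          /[<>'"]|(\&(?!\w))/
        end
      )
    end
  end
end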
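The behaviour of the patched pattern can be checked in isolation. The sketch below copies the non-HTML regexp from the patch and applies it with a hypothetical `REPLACEMENTS` table and hypothetical helpers `encode_with_patch` and `encode_strict`; none of these are part of the gem. `STRICT_REGEXP` is an assumed alternative that only skips an "&" when a complete entity follows it.

```ruby
# Regexp copied from the patch above (non-HTML flavor).
PATCHED_REGEXP = /[<>'"]|(\&(?!\w))/

# Assumed variant: skip "&" only when a full named or numeric entity follows.
STRICT_REGEXP = /[<>'"]|&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#\d+|#x\h+);)/

# Hypothetical replacement table, for illustration only.
REPLACEMENTS = {
  '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&apos;', '&' => '&amp;'
}.freeze

def encode_with_patch(text)
  text.gsub(PATCHED_REGEXP) { |match| REPLACEMENTS[match] }
end

def encode_strict(text)
  text.gsub(STRICT_REGEXP) { |match| REPLACEMENTS[match] }
end

puts encode_with_patch('> Car &amp; Bike <')  # &gt; Car &amp; Bike &lt;
puts encode_with_patch('Fish & Chips')        # Fish &amp; Chips
puts encode_with_patch('AT&T')                # AT&T (bare "&" left unencoded)
puts encode_with_patch('&#39;')               # &amp;#39; (numeric entity re-encoded)
puts encode_strict('AT&T')                    # AT&amp;T
puts encode_strict('&#39; &amp; &#x27;')      # &#39; &amp; &#x27;
```

The last two patched cases show the trade-off: the `(?!\w)` lookahead guesses from a single character, so it skips a bare "&" before a letter and still re-encodes numeric entities.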

@3den 3den closed this as completed May 11, 2011
@threedaymonk
Owner

This is the intended behaviour: the goal of HTMLEntities is to encode and decode in a predictable manner. It's not supposed to fix invalid or incoherent sources, and it doesn't make any attempt to try to understand the intent of inconsistencies in the source, as this depends on the application.

@3den
Author

3den commented May 14, 2011

Indeed, HTMLEntities is pretty good at encoding and decoding, but in my case I needed a simple and fast sanitizer to clean inconsistent sources, so I used HTMLEntities to create https://github.com/3den/ruby-sanitizer
