Add option to escape non-whitelisted elements instead of stripping them #1

Closed
rgrove opened this issue Sep 14, 2009 · 5 comments

@rgrove
Owner

rgrove commented Sep 14, 2009

Feature request from Ævar Arnfjörð Bjarmason:

My use case is that I'm calling Sanitize like this:

def clean(html)
  Sanitize.clean(html,
    :elements   => ['a'],
    :attributes => {'a' => ['href']},
    :protocols  => {'a' => {'href' => ['http', 'https']}})
end

And this is the output I really want:

clean("<b>") ==> &lt;b&gt;
clean("<b><a href=\"http://example.com\">example</a>") ==> &lt;b&gt;<a href="http://example.com">example</a>

Instead, Sanitize will completely strip the unacceptable HTML tag.

We should implement a setting to make the stripping optional.
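
For illustration only, here is a minimal sketch of the output requested above, written against the Ruby standard library rather than Sanitize. The names `ALLOWED_TAGS` and `escape_disallowed_tags` are hypothetical, and the regex-based matching is an assumption made to keep the example short. It does not parse HTML and does not check attributes or protocols, so it is not a sanitizer.

```ruby
require 'erb'

# Hypothetical whitelist for this sketch; not part of Sanitize.
ALLOWED_TAGS = %w[a].freeze

# Escape any tag whose name is not whitelisted and leave the rest untouched.
# This is a naive regex pass, not a real HTML parser: it only shows the
# desired output shape and must not be used to sanitize untrusted input.
def escape_disallowed_tags(html)
  html.gsub(/<\/?([A-Za-z][A-Za-z0-9]*)[^>]*>/) do |tag|
    ALLOWED_TAGS.include?($1.downcase) ? tag : ERB::Util.html_escape(tag)
  end
end

puts escape_disallowed_tags('<b>')
puts escape_disallowed_tags('<b><a href="http://example.com">example</a>')
```

Under those assumptions the two calls print `&lt;b&gt;` and `&lt;b&gt;<a href="http://example.com">example</a>`, matching the output asked for in the request. A bare `<` in text such as `a < b` is not handled by this sketch.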

@rgrove
Owner Author

rgrove commented Sep 17, 2009

Turns out this is likely to be quite a bit more complicated to implement than I originally thought, unless we escape the tag and all of its contents, regardless of whether the contents include legal tags.

This is going to take some more thought.

@cluesque

+1. My example is simpler:

clean("a < b") ==> "a &lt; b"

@rgrove
Owner Author

rgrove commented Jan 25, 2010

Added an :escape_only config setting. If set to true, Sanitize will escape non-whitelisted elements and their contents instead of removing them. Closed by 5bbd6d3

Not a perfect solution, but it's the best that can be done without adding unwarranted complexity.

@iangreenleaf

Looks like this feature was removed in 122c29f. How come?

@rgrove
Owner Author

rgrove commented Mar 22, 2011

Basically for the reason described above. Escaping an element means you must also escape all its children, even if some of them would otherwise be whitelisted. In many cases, double-escaping can result as well. So far I haven't found a solution I'm happy with, and I'd rather have no feature than a buggy feature.
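
The two problems described above can be reproduced with a small standard-library sketch. It assumes that "escaping an element" means entity-escaping its whole serialized markup. The helper name `escape_element` is hypothetical and this is not how Sanitize implemented `:escape_only`.

```ruby
require 'erb'

# Hypothetical stand-in for "escape this element and everything inside it".
def escape_element(markup)
  ERB::Util.html_escape(markup)
end

# <b> is not whitelisted, <a> is; the link text already contains an entity.
input   = '<b><a href="http://example.com">a &amp; b</a></b>'
escaped = escape_element(input)

puts escaped

# Problem 1: the whitelisted <a> child is escaped along with its parent.
puts escaped.include?('<a ')        # false
# Problem 2: the existing &amp; entity is escaped a second time.
puts escaped.include?('&amp;amp;')  # true
```

Avoiding both effects would mean escaping only the disallowed element's own start and end tags while still sanitizing its children, which is the extra complexity mentioned earlier in the thread.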

This issue was closed.