Closed
Description
Hello. I'm not sure whether my implementation is incorrect or I've found a bug, but I've noticed that ampersands within anchor href attributes are being converted to &amp;, their HTML entity equivalent.
Here is how I'm sanitizing:
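# Imports assumed for the pre-1.0 html5lib API, where HTMLSanitizer is a tokenizer.
import html5lib
from html5lib.sanitizer import HTMLSanitizer
from html5lib.serializer import HTMLSerializer

def escape_text(self, text):
    """Use html5lib to escape evil HTML tags."""
    parser = html5lib.HTMLParser(tokenizer=HTMLSanitizer)
    walker = html5lib.treewalkers.getTreeWalker('etree')
    stream = walker(parser.parseFragment(text))
    serializer = HTMLSerializer(quote_attr_values=True, omit_optional_tags=False,
                                alphabetical_attributes=True)
    return serializer.render(stream)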
Input: <p><a class="linked-url" target="_blank" href="https://sprint.ly/?one=1&two=2">https://sprint.ly/?one=1&two=2</a></p>
Output: <p><a class="linked-url" href="https://sprint.ly/?one=1&amp;two=2" target="_blank">https://sprint.ly/?one=1&two=2</a></p>
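A minimal sketch, using only the Python standard library rather than html5lib, of a quick check on whether the escaped ampersand changes the link target. It parses an anchor with a raw `&` in the href and one with `&amp;`, then compares the decoded href values. The names `HrefCollector` and `parsed_hrefs` are hypothetical helpers invented for this illustration, and the standard-library parser stands in for whatever parser a browser or html5lib would use.

```python
from html import escape
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collect the decoded href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives with character references already decoded.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


def parsed_hrefs(markup):
    """Hypothetical helper: return the decoded href values found in markup."""
    collector = HrefCollector()
    collector.feed(markup)
    collector.close()
    return collector.hrefs


url = "https://sprint.ly/?one=1&two=2"

# Anchor as written in the input, with a raw ampersand in the attribute.
raw = '<a href="%s">link</a>' % url

# Anchor with the ampersand escaped, as a serializer might emit it.
escaped = '<a href="%s">link</a>' % escape(url, quote=True)

# Both forms should decode to the same URL once parsed.
print(parsed_hrefs(raw) == parsed_hrefs(escaped))
```

If the two parsed values match, the `&amp;` only affects the serialized markup, not the URL a parser recovers from it.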
Thanks!