Python bindings to the ammonia HTML sanitization library.
pip install nh3
Use clean() to sanitize HTML fragments:
>>> import nh3
>>> nh3.clean("<unknown>hi")
'hi'
>>> nh3.clean("<b><img src='' onerror='alert(\\'hax\\')'>XSS?</b>")
'<b><img src="">XSS?</b>'
clean() has many options for customizing the sanitization, as documented below. For example, to allow only <b> tags:
>>> nh3.clean("<b><a href='https://example.com'>Hello</a></b>", tags={"b"})
'<b>Hello</b>'
nh3
ALLOWED_TAGS
The default set of tags allowed by clean(). Useful for customizing the default to add or remove some tags:
>>> tags = nh3.ALLOWED_TAGS - {"b"}
>>> nh3.clean("<b><i>yeah</i></b>", tags=tags)
'<i>yeah</i>'
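The example above only removes a tag. As a minimal sketch of adding and removing in one step, the helper below builds a new set with plain set operations. The name customize_tags is hypothetical and not part of nh3, and default_tags is a small stand-in for nh3.ALLOWED_TAGS so that the snippet runs without nh3 installed:

```python
def customize_tags(defaults, add=(), remove=()):
    """Return a new set: `defaults` plus `add`, minus `remove`.

    `defaults` itself is never modified. If a tag appears in both
    `add` and `remove`, removal wins.
    """
    return (set(defaults) | set(add)) - set(remove)


# Stand-in for nh3.ALLOWED_TAGS (an assumption made for illustration only)
default_tags = {"b", "i", "em"}

tags = customize_tags(default_tags, add={"mark"}, remove={"b"})
```

With nh3 installed, passing nh3.ALLOWED_TAGS as the first argument should give a set that can be handed to clean() through the tags parameter.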
ALLOWED_ATTRIBUTES
The default mapping of tags to allowed attributes for clean(). Useful for customizing the default to add or remove some attributes:
>>> from copy import deepcopy
>>> attributes = deepcopy(nh3.ALLOWED_ATTRIBUTES)
>>> attributes["img"].add("data-invert")
>>> nh3.clean("<img src='example.jpeg' data-invert=true>", attributes=attributes)
'<img src="example.jpeg" data-invert="true">'
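The deepcopy step matters because the mapping's values are sets, so a shallow copy would still share them with the original. A small sketch of wrapping that pattern in a reusable helper follows. The name with_extra_attributes is hypothetical and not part of nh3, and defaults is a stand-in for nh3.ALLOWED_ATTRIBUTES so that the snippet runs without nh3 installed:

```python
from copy import deepcopy


def with_extra_attributes(defaults, extra):
    """Return a copy of `defaults` with the attributes in `extra` merged in.

    Both arguments map a tag name to a set of attribute names.
    Neither argument is modified.
    """
    merged = deepcopy(defaults)  # copies the inner sets, not just the dict
    for tag, attrs in extra.items():
        # setdefault covers tags that have no entry in the defaults yet
        merged.setdefault(tag, set()).update(attrs)
    return merged


# Stand-in for nh3.ALLOWED_ATTRIBUTES (an assumption made for illustration only)
defaults = {"img": {"src", "alt"}, "a": {"href"}}

attributes = with_extra_attributes(
    defaults, {"img": {"data-invert"}, "span": {"title"}}
)
```

With nh3 installed, calling the helper with nh3.ALLOWED_ATTRIBUTES and {"img": {"data-invert"}} should produce the same mapping as the deepcopy example above, ready to pass to clean() through the attributes parameter.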