Adjust sanitization code to allow HTML5 'data-' attributes through. #19

Open
wants to merge 1 commit into from

1 participant

@nathan-osman

I have made two minor modifications to files in the planet/vendor folder.

Both planet/vendor/feedparser.py and planet/vendor/html5lib/sanitizer.py (which sanitize the HTML / XHTML encountered in a feed) strip out HTML5 data- attributes. Browsers attach no built-in behavior to these attributes, but third-party scripts added to the planet page may rely on them, and stripping them out causes those scripts to fail.

Therefore, I have modified the two files above to let these attributes pass through the filtering / sanitization unaltered.

Please let me know if you have any questions or concerns.

@nathan-osman Modified feed parser and html5lib to allow data- HTML5 attributes through because they have no meaning to a browser or SGML parser and may contain meta-data or other attributes that scripts added to the planet page may require.
cf07846
Commits on May 25, 2012
  1. @nathan-osman committed: Modified feed parser and html5lib to allow data- HTML5 attributes through because they have no meaning to a browser or SGML parser and may contain meta-data or other attributes that scripts added to the planet page may require.
Showing with 3 additions and 1 deletion.
  1. +2 −0 planet/vendor/feedparser.py
  2. +1 −1 planet/vendor/html5lib/sanitizer.py
2 planet/vendor/feedparser.py
@@ -2518,6 +2518,8 @@ def unknown_starttag(self, tag, attrs):
             elif key=='style':
                 clean_value = self.sanitize_style(value)
                 if clean_value: clean_attrs.append((key,clean_value))
+            elif key.startswith('data-'):
+                clean_attrs.append((key, value))
         _BaseHTMLProcessor.unknown_starttag(self, tag, clean_attrs)
 
     def unknown_endtag(self, tag):
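To make the effect of this hunk easier to follow, here is a minimal standalone sketch of the same branching logic. It is not the feedparser code itself: `ACCEPTABLE_ATTRIBUTES`, `sanitize_style`, and `clean_attributes` are hypothetical stand-ins, and the style check is a deliberately crude placeholder for the real sanitizer.

```python
# Hypothetical allowlist; the list in feedparser is much longer.
ACCEPTABLE_ATTRIBUTES = {"href", "title", "class", "alt", "src"}


def sanitize_style(value):
    """Crude placeholder (assumption) for feedparser's style sanitizer."""
    lowered = value.lower()
    if "url(" in lowered or "expression" in lowered:
        return ""
    return value


def clean_attributes(attrs):
    """Keep allowlisted attributes, sanitized styles, and data-* attributes."""
    clean_attrs = []
    for key, value in attrs:
        if key in ACCEPTABLE_ATTRIBUTES:
            clean_attrs.append((key, value))
        elif key == "style":
            clean_value = sanitize_style(value)
            if clean_value:
                clean_attrs.append((key, clean_value))
        elif key.startswith("data-"):
            # The branch this pull request adds: pass data-* through unaltered.
            clean_attrs.append((key, value))
    return clean_attrs


if __name__ == "__main__":
    print(clean_attributes([
        ("href", "/post"),
        ("onclick", "evil()"),
        ("data-id", "42"),
    ]))
```

In this sketch an event handler such as `onclick` is still dropped because it matches none of the three branches, while `data-id` survives with its value unchanged.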
2 planet/vendor/html5lib/sanitizer.py
@@ -169,7 +169,7 @@ def sanitize_token(self, token):
         if token.has_key("data"):
             attrs = dict([(name,val) for name,val in
                           token["data"][::-1]
-                          if name in self.allowed_attributes])
+                          if name in self.allowed_attributes or name.startswith('data-')])
             for attr in self.attr_val_is_uri:
                 if not attrs.has_key(attr):
                     continue
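The changed condition in this hunk can be tried in isolation. The sketch below assumes a token shaped like a dict with a `"data"` list of `(name, value)` pairs, as the diff suggests; `ALLOWED_ATTRIBUTES` and `filter_token_attributes` are hypothetical names, and the code is written for Python 3 rather than the Python 2 idioms (`has_key`) seen in the vendored file.

```python
# Hypothetical allowlist standing in for self.allowed_attributes.
ALLOWED_ATTRIBUTES = {"href", "title", "class"}


def filter_token_attributes(token):
    """Return the attributes that would survive the modified filter."""
    if "data" not in token:
        return {}
    # Iterating in reverse means that, for a duplicated attribute name,
    # the first occurrence in the markup is the one left in the dict.
    return dict(
        (name, val)
        for name, val in token["data"][::-1]
        if name in ALLOWED_ATTRIBUTES or name.startswith("data-")
    )


if __name__ == "__main__":
    token = {
        "data": [
            ("data-x", "first"),
            ("onclick", "evil()"),
            ("data-x", "second"),
            ("href", "/a"),
        ]
    }
    print(filter_token_attributes(token))
```

Note that only the attribute name is checked here; the value of a `data-` attribute is passed through as-is, which matches the intent stated in the description above.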