Partially revert 9efe1a7, strip_tags improvements

The new regex does not seem stable enough to be released. Regex-based
stripping might need to be reevaluated for the next release.
Refs #19237.
commit 20ac33100cd20c17d1ceab075d96698974cc4778 1 parent 3b95212
@claudep claudep authored
Showing with 1 addition and 2 deletions.
  1. +1 −1  django/utils/html.py
  2. +0 −1  tests/regressiontests/utils/html.py
2  django/utils/html.py
@@ -33,7 +33,7 @@
html_gunk_re = re.compile(r'(?:<br clear="all">|<i><\/i>|<b><\/b>|<em><\/em>|<strong><\/strong>|<\/?smallcaps>|<\/?uppercase>)', re.IGNORECASE)
hard_coded_bullets_re = re.compile(r'((?:<p>(?:%s).*?[a-zA-Z].*?</p>\s*)+)' % '|'.join([re.escape(x) for x in DOTS]), re.DOTALL)
trailing_empty_content_re = re.compile(r'(?:<p>(?:&nbsp;|\s|<br \/>)*?</p>\s*)+\Z')
-strip_tags_re = re.compile(r'</?\S([^=]*=(\s*"[^"]*"|\s*\'[^\']*\'|\S*)|[^>])*?>', re.IGNORECASE)
+strip_tags_re = re.compile(r'<[^>]*?>', re.IGNORECASE)
def escape(text):
1  tests/regressiontests/utils/html.py
@@ -65,7 +65,6 @@ def test_strip_tags(self):
('<f', '<f'),
('</fe', '</fe'),
('<x>b<y>', 'b'),
- ('a<p onclick="alert(\'<test>\')">b</p>c', 'abc'),
('a<p a >b</p>c', 'abc'),
('d<a:b c:d>e</p>f', 'def'),
)
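
To illustrate why the `onclick` test case is dropped together with the regex, here is a minimal sketch that runs both patterns from the diff over that one input. The variable names (`old_re`, `new_re`, `sample`) are chosen for illustration and are not taken from Django. The attribute-aware pattern should consume the quoted `<test>` as part of the attribute value, while the restored pattern stops at the first `>` and leaves part of the attribute behind.

```python
import re

# Attribute-aware pattern removed by this commit (copied from the diff).
old_re = re.compile(r'</?\S([^=]*=(\s*"[^"]*"|\s*\'[^\']*\'|\S*)|[^>])*?>', re.IGNORECASE)
# Simpler pattern restored by this commit (copied from the diff).
new_re = re.compile(r'<[^>]*?>', re.IGNORECASE)

# Input from the test case this commit removes.
sample = 'a<p onclick="alert(\'<test>\')">b</p>c'

old_result = old_re.sub('', sample)
new_result = new_re.sub('', sample)

print(repr(old_result))  # the removed test expected 'abc'
print(repr(new_result))  # the simpler pattern ends the tag at the first '>'
```

This only shows the one case the removed test covered; it says nothing about the inputs on which the attribute-aware pattern was found to be unstable, which are tracked in the referenced ticket.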

2 comments on commit 20ac331

@RichardBronosky

It would be great to see some test text that fails on the old way but passes on the new way.

import re
re_old = re.compile(r'</?\S([^=]*=(\s*"[^"]*"|\s*\'[^\']*\'|\S*)|[^>])*?>', re.IGNORECASE)
re_new = re.compile(r'<[^>]*?>', re.IGNORECASE)

test_text = '<a href="https://github.com/django/django/commit/20ac33" id="the_ticket" class="not_working needs_help regex_wtf">Partially revert 9efe1a721, strip_tags improvements - 20ac331 - django/django</a>'
expected_result = 'Partially revert 9efe1a721, strip_tags improvements - 20ac331 - django/django'

result_new = re_new.sub('', test_text)
result_old = re_old.sub('', test_text)
print(result_new==expected_result, result_new)
print(result_old==expected_result, result_old)

Please replace test_text and expected_result with the breaky breaky stuff. Thanks.

@claudep
Collaborator

Yes, we really need a good set of tests. Please follow up on the ticket instead (https://code.djangoproject.com/ticket/19237).
