[5.0.x] Fixed CVE-2024-27351 -- Prevented potential ReDoS in Truncator.words().

Thanks Seokchan Yoon for the report.

Co-Authored-By: Mariusz Felisiak <felisiak.mariusz@gmail.com>
shaib and felixxm committed Mar 4, 2024
1 parent 80761c3 commit 3394fc6
Showing 5 changed files with 105 additions and 2 deletions.
57 changes: 55 additions & 2 deletions django/utils/text.py
@@ -23,8 +23,61 @@ def capfirst(x):
    return x[0].upper() + x[1:]


# Set up regular expressions
re_words = _lazy_re_compile(r"<[^>]+?>|([^<>\s]+)", re.S)
# ----- Begin security-related performance workaround -----

# We used to have, below
#
# re_words = _lazy_re_compile(r"<[^>]+?>|([^<>\s]+)", re.S)
#
# But it was shown that this regex, in the way we use it here, has some
# catastrophic edge-case performance features. Namely, when it is applied to
# text with only open brackets "<<<...". The class below provides the services
# and correct answers for the use cases, but in these edge cases does it much
# faster.
re_notag = _lazy_re_compile(r"([^<>\s]+)", re.S)
re_prt = _lazy_re_compile(r"<|([^<>\s]+)", re.S)


class WordsRegex:
    @staticmethod
    def search(text, pos):
        # Look for "<" or a non-tag word.
        partial = re_prt.search(text, pos)
        if partial is None or partial[1] is not None:
            return partial

        # "<" was found, look for a closing ">".
        end = text.find(">", partial.end(0))
        if end < 0:
            # ">" cannot be found, look for a word.
            return re_notag.search(text, pos + 1)
        else:
            # "<" followed by a ">" was found -- fake a match.
            end += 1
            return FakeMatch(text[partial.start(0) : end], end)


class FakeMatch:
    __slots__ = ["_text", "_end"]

    def end(self, group=0):
        assert group == 0, "This specific object takes only group=0"
        return self._end

    def __getitem__(self, group):
        if group == 1:
            return None
        assert group == 0, "This specific object takes only group in {0,1}"
        return self._text

    def __init__(self, text, end):
        self._text, self._end = text, end


# ----- End security-related performance workaround -----

# Set up regular expressions.
re_words = WordsRegex
re_chars = _lazy_re_compile(r"<[^>]+?>|(.)", re.S)
re_tag = _lazy_re_compile(r"<(/)?(\S+?)(?:(\s*/)|\s.*?)?>", re.S)
re_newlines = _lazy_re_compile(r"\r\n|\r") # Used in normalize_newlines
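
Reviewer sketch (not part of the commit): the replacement can be tried outside Django by swapping `_lazy_re_compile` for plain `re.compile`, which is an assumption made here for self-containment. The helper `scan` and the name `re_words_old` are hypothetical; they only walk a string the way a caller might, calling `search(text, pos)` repeatedly, so the old pattern and the new `WordsRegex` can be compared on a few inputs. Agreement on these samples is not a proof of equivalence in general.

```python
import re

# Assumption: plain re.compile stands in for Django's _lazy_re_compile.
re_words_old = re.compile(r"<[^>]+?>|([^<>\s]+)", re.S)  # pre-fix pattern
re_notag = re.compile(r"([^<>\s]+)", re.S)
re_prt = re.compile(r"<|([^<>\s]+)", re.S)


class FakeMatch:
    """Minimal match-like object exposing only end(0), [0] and [1]."""

    __slots__ = ["_text", "_end"]

    def __init__(self, text, end):
        self._text, self._end = text, end

    def end(self, group=0):
        assert group == 0
        return self._end

    def __getitem__(self, group):
        if group == 1:
            return None
        assert group == 0
        return self._text


class WordsRegex:
    @staticmethod
    def search(text, pos):
        # Look for "<" or a non-tag word.
        partial = re_prt.search(text, pos)
        if partial is None or partial[1] is not None:
            return partial
        # "<" was found; a single str.find replaces the regex backtracking.
        end = text.find(">", partial.end(0))
        if end < 0:
            return re_notag.search(text, pos + 1)
        end += 1
        return FakeMatch(text[partial.start(0):end], end)


def scan(search, text):
    """Hypothetical helper: collect (match, word group, end) for each hit."""
    found, pos = [], 0
    while True:
        m = search(text, pos)
        if m is None:
            return found
        found.append((m[0], m[1], m.end(0)))
        pos = m.end(0)


samples = [
    "<p>I &lt;3 python, what about you?</p>",
    '<i style="margin: 5%; font: *;">Hello, my dear lady!</i>',
    "hello >< world",
    "<" * 2_000,
]
for sample in samples:
    same = scan(re_words_old.search, sample) == scan(WordsRegex.search, sample)
    print(f"{sample[:30]!r}: old and new agree = {same}")
```

The design point is that the only unbounded scan left after a lone `<` is `str.find(">")`, which is linear in the remaining text, instead of the lazy `[^>]+?` repetition that the old pattern retried from every `<`.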
8 changes: 8 additions & 0 deletions docs/releases/3.2.25.txt
@@ -7,6 +7,14 @@ Django 3.2.25 release notes
Django 3.2.25 fixes a security issue with severity "moderate" and a regression
in 3.2.24.

CVE-2024-27351: Potential regular expression denial-of-service in ``django.utils.text.Truncator.words()``
=========================================================================================================

``django.utils.text.Truncator.words()`` method (with ``html=True``) and
:tfilter:`truncatewords_html` template filter were subject to a potential
regular expression denial-of-service attack using a suitably crafted string
(follow up to :cve:`2019-14232` and :cve:`2023-43665`).

Bugfixes
========

8 changes: 8 additions & 0 deletions docs/releases/4.2.11.txt
@@ -7,6 +7,14 @@ Django 4.2.11 release notes
Django 4.2.11 fixes a security issue with severity "moderate" and a regression
in 4.2.10.

CVE-2024-27351: Potential regular expression denial-of-service in ``django.utils.text.Truncator.words()``
=========================================================================================================

``django.utils.text.Truncator.words()`` method (with ``html=True``) and
:tfilter:`truncatewords_html` template filter were subject to a potential
regular expression denial-of-service attack using a suitably crafted string
(follow up to :cve:`2019-14232` and :cve:`2023-43665`).

Bugfixes
========

8 changes: 8 additions & 0 deletions docs/releases/5.0.3.txt
@@ -7,6 +7,14 @@ Django 5.0.3 release notes
Django 5.0.3 fixes a security issue with severity "moderate" and several bugs
in 5.0.2.

CVE-2024-27351: Potential regular expression denial-of-service in ``django.utils.text.Truncator.words()``
=========================================================================================================

``django.utils.text.Truncator.words()`` method (with ``html=True``) and
:tfilter:`truncatewords_html` template filter were subject to a potential
regular expression denial-of-service attack using a suitably crafted string
(follow up to :cve:`2019-14232` and :cve:`2023-43665`).

Bugfixes
========

26 changes: 26 additions & 0 deletions tests/utils_tests/test_text.py
@@ -183,6 +183,32 @@ def test_truncate_html_words(self):
        truncator = text.Truncator("<p>I &lt;3 python, what about you?</p>")
        self.assertEqual("<p>I &lt;3 python,…</p>", truncator.words(3, html=True))

        # Only open brackets.
        test = "<" * 60_000
        truncator = text.Truncator(test)
        self.assertEqual(truncator.words(1, html=True), test)

        # Tags with special chars in attrs.
        truncator = text.Truncator(
            """<i style="margin: 5%; font: *;">Hello, my dear lady!</i>"""
        )
        self.assertEqual(
            """<i style="margin: 5%; font: *;">Hello, my dear…</i>""",
            truncator.words(3, html=True),
        )

        # Tags with special non-latin chars in attrs.
        truncator = text.Truncator("""<p data-x="א">Hello, my dear lady!</p>""")
        self.assertEqual(
            """<p data-x="א">Hello, my dear…</p>""",
            truncator.words(3, html=True),
        )

        # Misplaced brackets.
        truncator = text.Truncator("hello >< world")
        self.assertEqual(truncator.words(1, html=True), "hello…")
        self.assertEqual(truncator.words(2, html=True), "hello >< world")

    @patch("django.utils.text.Truncator.MAX_LENGTH_HTML", 10_000)
    def test_truncate_words_html_size_limit(self):
        max_len = text.Truncator.MAX_LENGTH_HTML
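
Reviewer sketch (not part of the commit): the "only open brackets" test above feeds the pathological input to the fixed code. To see why the pre-fix pattern was a problem, one can time it directly on strings of `<` using only the standard library. The function name `time_old_search` and the chosen sizes are hypothetical. Since the lazy `[^>]+?` is retried from every `<`, doubling the input length is expected to roughly quadruple the time, though actual timings depend on the machine and Python version.

```python
import re
import time

# Pre-fix pattern from django/utils/text.py, compiled with plain re.compile
# (assumption: stands in for Django's _lazy_re_compile).
old_re_words = re.compile(r"<[^>]+?>|([^<>\s]+)", re.S)


def time_old_search(n):
    """Hypothetical helper: time one search over n open brackets."""
    text = "<" * n
    start = time.perf_counter()
    result = old_re_words.search(text, 0)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Sizes kept small so the sketch finishes quickly; the test above uses 60_000.
for n in (2_000, 4_000, 8_000):
    result, elapsed = time_old_search(n)
    print(f"n={n:>5}: match={result!r}, {elapsed:.4f}s")
```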