Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Markup.striptags: comments now get replaced with a space #417

Closed
kholmanskikh opened this issue Jan 24, 2024 · 1 comment · Fixed by #418
Closed

Markup.striptags: comments now get replaced with a space #417

kholmanskikh opened this issue Jan 24, 2024 · 1 comment · Fixed by #418
Milestone

Comments

@kholmanskikh
Copy link

kholmanskikh commented Jan 24, 2024

In 2.1.4, if Markup.striptags is called, the comment gets replaced by a space. In versions before that it's completely removed from the output.

Test case to reproduce:

import unittest

from markupsafe import Markup

class MarkupSafeTest(unittest.TestCase):
    def test_striptags(self):
        value = 'x <!-- comment -->'
        self.assertEqual(Markup(value).striptags(), 'x')


if __name__ == '__main__':
    unittest.main()

With markupsafe 2.1.4 it fails, with 2.1.3 - it passes.

I'm running with Python 3.11

The issue was originally found by running the test_striptags jinja test case with markupsafe 2.1.4:

https://github.com/pallets/jinja/blob/3fd91e4d11bdd131d8c12805177dbe74d85e7b82/tests/test_filters.py#L94

@dairiki
Copy link
Contributor

dairiki commented Jan 25, 2024

I've just noticed this change, too. I think the issue is that the old version of Markup.striptags removed comments and tags, and then coalesced whitespace:

# Use two regexes to avoid ambiguous matches.
value = _strip_comments_re.sub("", self)
value = _strip_tags_re.sub("", value)
value = " ".join(value.split())

The new version has reversed this order and coalesces whitespace before stripping comments and tags:

def striptags(self) -> str:
""":meth:`unescape` the markup, remove tags, and normalize
whitespace to single spaces.
>>> Markup("Main &raquo;\t<em>About</em>").striptags()
'Main » About'
"""
# collapse spaces
value = " ".join(self.split())
# Look for comments then tags separately. Otherwise, a comment that
# contains a tag would end early, leaving some of the comment behind.
while True:
# keep finding comment start marks
start = value.find("<!--")

(I think the old order of operations is more correct.)

[eta: I've just created PR #418 that addresses this issue.]

dairiki added a commit to dairiki/markupsafe that referenced this issue Jan 25, 2024
bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this issue Jan 29, 2024
https://build.opensuse.org/request/show/1142212
by user dgarcia + anag+factory
- Disable broken test with latest version of MarkupSafe (2.1.4)
  (gh#pallets/jinja#1930, gh#pallets/markupsafe#417)
@davidism davidism added this to the 2.1.5 milestone Feb 2, 2024
@davidism davidism closed this as completed Feb 2, 2024
bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this issue Feb 7, 2024
https://build.opensuse.org/request/show/1142212
by user dgarcia + anag+factory
- Disable broken test with latest version of MarkupSafe (2.1.4)
  (gh#pallets/jinja#1930, gh#pallets/markupsafe#417)
bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this issue Feb 7, 2024
https://build.opensuse.org/request/show/1142212
by user dgarcia + anag+factory
- Disable broken test with latest version of MarkupSafe (2.1.4)
  (gh#pallets/jinja#1930, gh#pallets/markupsafe#417)
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants