Skip to content

Conversation

miss-islington
Copy link
Contributor

@miss-islington miss-islington commented Oct 6, 2025

"] ]>" and "]] >" no longer end the CDATA section.

Make CDATA section parsing context depending.
Add private method HTMLParser._set_support_cdata() to change the context.
If called with True, "<[CDATA[" starts a CDATA section which ends with "]]>".
If called with False, "<[CDATA[" starts a bogus comments which ends with ">".
(cherry picked from commit 0cbbfc4)
(cherry picked from commit dcf2476)

Co-authored-by: Serhiy Storchaka storchaka@gmail.com

…onGH-135665) (pythonGH-137774)

"] ]>" and "]] >" no longer end the CDATA section.

Make CDATA section parsing  context depending.
Add private method HTMLParser._set_support_cdata() to change the context.
If called with True, "<[CDATA[" starts a CDATA section which ends with "]]>".
If called with False, "<[CDATA[" starts a bogus comments which ends with ">".
(cherry picked from commit 0cbbfc4)
(cherry picked from commit dcf2476)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Copy link
Member

@ezio-melotti ezio-melotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3.9 seems to have two additional tests at the end of test_htmlparser:

  • test_invalid_keyword_error_exception
  • test_invalid_keyword_error_pass

These are missing in 3.10+ and the former is currently failing:

======================================================================
FAIL: test_invalid_keyword_error_exception (test.test_htmlparser.AttributesTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/cpython/cpython/Lib/test/test_htmlparser.py", line 1118, in test_invalid_keyword_error_exception
    parser.feed('<![invalid>')
AssertionError: InvalidMarkupException not raised

@bedevere-app
Copy link

bedevere-app bot commented Oct 6, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be poked with soft cushions!

@ambv
Copy link
Contributor

ambv commented Oct 7, 2025

3.9 seems to have two additional tests at the end of test_htmlparser:

Let me check how they got to be removed from 3.10 and if this isn't problematic, we'll do the same here.

@ambv
Copy link
Contributor

ambv commented Oct 7, 2025

@serhiy-storchaka the additional tests were added in #32256. Do you think they are valid?

@serhiy-storchaka
Copy link
Member

Originally, the HTML parser used the code from _markupbase to parse comments, CDATA sections and declarations. The _markupbase changes were not backported for compatibility, this is why we only have these tests in 3.9. In newer versions we moved away from using _markupbase because it is not compliant with HTML5. And now we are backporting the HTML parser changes. This is technically a breaking change -- the parser used to call the error() method for some HTML, and now it handles it in conformance with the HTML5 specs and never calls error(). All previous backports had a similar effect, but this was not covered by tests.

These tests were correct for existing code, but are no longer correct for the new code. If we decide to accept these changes in 3.9, then the tests should be removed (and there is no replacement, the tested code no longer exist). If we decided to give up on the backport, they remain.

@ambv ambv merged commit ed904d5 into python:3.9 Oct 7, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type-security A security issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants