gh-102153: fix CVE-2023-24329 #102470

xiaoge1001 · 2023-03-06T14:49:40Z

fix CVE-2023-24329

Issue: urllib.parse space handling CVE-2023-24329 appears unfixed #102153

cpython-cla-bot · 2023-03-06T14:50:06Z

All commit authors signed the Contributor License Agreement.

xiaoge1001 · 2023-03-07T02:29:21Z

Before applying #102470:

>>> from urllib.parse import urlparse, urlsplit
>>> urlparse(" https://example.com")
ParseResult(scheme='', netloc='', path=' https://example.com', params='', query='', fragment='')

After applying #102470:

>>> from urllib.parse import urlparse, urlsplit
>>> urlparse(" https://example.com")
ParseResult(scheme='https', netloc='example.com', path='', params='', query='', fragment='')
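
To see which of the two behaviours above a given interpreter shows, a small probe along these lines may help. This is only a sketch: the function name `strips_leading_space` is hypothetical and not part of this PR, and the result depends on the Python version and any downstream patches.

```python
from urllib.parse import urlsplit


def strips_leading_space() -> bool:
    """Return True if this interpreter's urlsplit ignores a leading space.

    Hypothetical helper for illustration only; it is not part of urllib.
    """
    # On an unpatched interpreter the scheme comes back empty, as in the
    # "before" output; on a patched one it is 'https', as in the "after".
    return urlsplit(" https://example.com").scheme == "https"


if __name__ == "__main__":
    print("leading space stripped by urlsplit:", strips_leading_space())
```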

Misc/NEWS.d/next/Security/2023-03-06-22-48-08.gh-issue-102153.eiaVrE.rst

CharlieZhao95 · 2023-03-07T07:17:52Z

BTW, this solution is mentioned at the end of the issue finder's article, see: https://pointernull.com/security/python-url-parse-problem.html for more details.

xiaoge1001 · 2023-03-07T07:22:16Z

> BTW, this solution is mentioned at the end of the issue finder's article, see: https://pointernull.com/security/python-url-parse-problem.html for more details.

Yes, I read that article too. The leading-whitespace problem can be solved with lstrip() even without reading it.

xiaoge1001 · 2023-03-07T12:16:45Z

#102153 (comment)

As described by @illia-v, I think it is necessary to remove the leading blanks from the URL in urllib.parse.

illia-v · 2023-03-07T19:13:03Z

> #102153 (comment)
>
> As described by @illia-v, I think it is necessary to remove the leading blanks from the URL in urllib.parse.

Right, but a few characters not covered by a simple .lstrip() have to be stripped too. I created an alternative: #102508.
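
A minimal sketch of the difference described here, assuming "C0 control and space" (the wording in the title of #102508) means code points U+0000 through U+0020. The constant name `C0_CONTROL_OR_SPACE` is made up for this example and is not claimed to match what #102508 actually uses.

```python
# Assumption: the wider set is every code point from U+0000 to U+0020.
# The constant name is hypothetical, chosen only for this illustration.
C0_CONTROL_OR_SPACE = "".join(chr(i) for i in range(0x21))

url = "\x01 https://example.com"

# str.lstrip() with no argument removes only whitespace; U+0001 is not
# whitespace, so nothing is removed here.
plain = url.lstrip()

# Passing the wider character set removes the control character and the space.
wider = url.lstrip(C0_CONTROL_OR_SPACE)

print(repr(plain))
print(repr(wider))
```

The first line printed still starts with the control character, while the second starts directly with the scheme.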


zhuofeng6 · 2023-03-08T08:13:51Z

I think the PR from @xiaoge1001 is a good idea. Could you review it, @gpshead?

xiaoge1001 · 2023-03-09T01:36:07Z

Hello, is there anyone concerned about this problem?

hugovk · 2023-03-09T06:46:05Z

Thank you for the PRs. Someone will review them; let's be patient in the meantime.

See also #102153 (comment)

gpshead · 2023-03-09T09:43:35Z

#102508 is more along the lines of what we'll want to consider. I'm closing this one in favor of that.

xiaoge1001 · 2023-03-09T13:30:03Z

@gpshead I'm sorry to bother you, but I don't know urllib very well. Can you answer my question?

If I use this solution (https://github.com/python/cpython/pull/102470) to fix the CVE without worrying about compatibility, will the CVE be completely fixed? Will it introduce other problems?

gpshead · 2023-03-09T21:23:33Z

We aren't currently confident that we can safely backport a fix to older Python versions as something likely depends on some of the behavior around spaces. That said, this solution likely does work for a lot of code.

It is a reasonable recommendation (as I think the blog post suggested?) for people to call .lstrip() themselves on their url before passing it to urlparse from their own code rather than assuming they're running on top of a Python without the issue.

The other PR I closed this in favor of will have potentially even more side effects that could impact other uses of urlparse as it removes a wider/different set of characters than lstrip() considers from anywhere in the entire url string.

If you do go with this or any of the PRs in your own patched Python interpreter, please let us know how that went. Useful details include the relative size of the overall Python code base running on your patched interpreter and how many issues, if any, around the behavior change in the API came up after some period of testing and deploying with it. Those are all good data to have to inform decisions. (I'll be trying some testing of that nature myself.)
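
The recommendation above, calling .lstrip() on the URL in your own code before handing it to urlparse, could look roughly like this. The helper name `parse_stripped` is hypothetical. Note that this sketch only removes leading whitespace, which is narrower than the character set discussed for #102508.

```python
from urllib.parse import ParseResult, urlparse


def parse_stripped(url: str) -> ParseResult:
    """Hypothetical wrapper: drop leading whitespace, then parse.

    Doing the strip in application code means the result does not depend
    on whether the running interpreter already strips the URL itself.
    """
    return urlparse(url.lstrip())


result = parse_stripped(" https://example.com")
print(result.scheme, result.netloc)
```

Because the strip happens before parsing, the scheme and netloc are populated the same way on patched and unpatched interpreters.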

illia-v · 2023-03-10T09:20:21Z

> The other PR I closed this in favor of will have potentially even more side effects that could impact other uses of urlparse as it removes a wider/different set of characters than lstrip() considers from anywhere in the entire url string.

@gpshead yes, it may have more side effects. But it removes the characters only at the beginning and the end of a URL string (and a scheme if one is passed separately.)

xiaoge1001 · 2023-03-30T06:59:00Z

> We aren't currently confident that we can safely backport a fix to older Python versions as something likely depends on some of the behavior around spaces. That said, this solution likely does work for a lot of code.
>
> It is a reasonable recommendation (as I think the blog post suggested?) for people to call .lstrip() themselves on their url before passing it to urlparse from their own code rather than assuming they're running on top of a Python without the issue.
>
> The other PR I closed this in favor of will have potentially even more side effects that could impact other uses of urlparse as it removes a wider/different set of characters than lstrip() considers from anywhere in the entire url string.
>
> If you do go with this or any of the PRs in your own patched Python interpreter, please let us know how that went. Useful details include the relative size of the overall Python code base running on your patched interpreter and how many issues, if any, around the behavior change in the API came up after some period of testing and deploying with it. Those are all good data to have to inform decisions. (I'll be trying some testing of that nature myself.)

I plan to fix the CVE by adding a constraint: "The first character of the URL-scheme string must be an ASCII alpha." I then ran a check of this constraint with users for two weeks and did not receive any reports of impact, so I think the impact of fixing the CVE this way should be small.
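
One possible reading of that constraint, as a rough sketch only: treat the text before the first ":" as the scheme and require it to begin with an ASCII letter. The function name `scheme_starts_with_ascii_alpha` and the choice to reject strings with no ":" are assumptions made for this example; the comment above does not say how the check was actually implemented.

```python
def scheme_starts_with_ascii_alpha(url: str) -> bool:
    """Hypothetical check: the scheme must begin with an ASCII letter.

    Assumption: the scheme is whatever precedes the first ':'; a string
    with no ':' or an empty scheme is rejected.
    """
    scheme, sep, _rest = url.partition(":")
    if not sep or not scheme:
        return False
    first = scheme[0]
    # isascii() excludes non-ASCII letters that isalpha() alone would accept.
    return first.isascii() and first.isalpha()


for candidate in ("https://example.com", " https://example.com", "1http://example.com"):
    print(repr(candidate), scheme_starts_with_ascii_alpha(candidate))
```

Under this reading a URL with a leading space is rejected rather than silently repaired, which is a different trade-off from stripping.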

xiaoge1001 added 3 commits March 6, 2023 21:42

fix CVE-2023-24329

954be4d

add test for CVE-2023-24329

c546775

doc

dccba70

bedevere-bot mentioned this pull request Mar 6, 2023

urllib.parse space handling CVE-2023-24329 appears unfixed #102153

Closed

bedevere-bot added the awaiting review label Mar 6, 2023

xiaoge1001 added 2 commits March 6, 2023 22:54

Merge branch 'main' into fix-cve

f15ec24

Merge branch 'main' into fix-cve

7e4b434

CharlieZhao95 suggested changes Mar 7, 2023

Misc/NEWS.d/next/Security/2023-03-06-22-48-08.gh-issue-102153.eiaVrE.rst

bedevere-bot added awaiting core review and removed awaiting review labels Mar 7, 2023

arhadthedev added the type-security and stdlib labels Mar 7, 2023

xiaoge1001 added 3 commits March 8, 2023 07:16

use strip() replace lstrip()

b32d74d

revert use strip() replace lstrip()

ebcd46c

Merge branch 'main' into fix-cve

43b1b8b

xiaoge1001 mentioned this pull request Mar 8, 2023

gh-102153: Start stripping C0 control and space chars in urlsplit #102508

Merged

update doc

b00c0ea

Thanks CharlieZhao95

xiaoge1001 requested a review from CharlieZhao95 March 8, 2023 03:52

gpshead self-assigned this Mar 8, 2023

gpshead added the DO-NOT-MERGE label Mar 9, 2023

gpshead closed this Mar 9, 2023
