Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-42967: only use '&' as a query string separator #24297

Merged
merged 13 commits into from Feb 14, 2021

Conversation

AdamGold
Copy link
Contributor

@AdamGold AdamGold commented Jan 22, 2021

bpo-42967: [security] urllib.parse.parse_qsl(): Web cache poisoning -
; as a query args separator

https://bugs.python.org/issue42967

Co-authored-by: Ken Jin 28750310+Fidget-Spinner@users.noreply.github.com

@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@AdamGold

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

bpo-42967: [security] urllib.parse.parse_qsl(): Web cache poisoning -
`;` as a query args separator
@Fidget-Spinner
Copy link
Member

Fidget-Spinner commented Jan 23, 2021

@AdamGold I went ahead and re-ran the CLA bot and it seems you signed the CLA here :).

BTW, what are your thoughts about combining our PRs into one? I already added you as a Co-author to PR-24271, meaning the final commit will show us both as commit authors. I also wanted to use some really nice ideas from this PR if you're okay with it.

@AdamGold
Copy link
Contributor Author

@Fidget-Spinner Sounds like a good idea, do you want me to checkout to your branch and push a commit?

@Fidget-Spinner
Copy link
Member

@AdamGold sorry I changed my mind, I decided to close my PR in favor of this one, because I think 2 open PRs might split core devs' attention. I'll just review this one instead.

Copy link
Member

@Fidget-Spinner Fidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things:

  • cgi.py should expose these new arguments IMO so that users can switch. Most of cgi.py's arguments are already passed directly into urllib.parse anyways (the docs also mention that behavior here). You can see the affected functions in my PR here https://github.com/python/cpython/pull/24271/files#
  • Following that,FieldStorage should also accept the separator switch IMO, and you may have to pass the separator in a few more places (you can refer to my PR for that too, should be mostly painless :) ).

Other than those nits, and waiting for decision on the bpo about whether to boolean switch or allow user selection, I think this PR is quite well done :).

Lib/test/test_urlparse.py Show resolved Hide resolved
@AdamGold
Copy link
Contributor Author

@Fidget-Spinner Thanks! much appreciated.
Regarding CGI and FieldStorage, isn't that used for POST requests?

Lib/test/test_urlparse.py Outdated Show resolved Hide resolved
@Fidget-Spinner
Copy link
Member

@Fidget-Spinner Thanks! much appreciated.
Regarding CGI and FieldStorage, isn't that used for POST requests?

Yup, however it uses parse_qsl internally somewhere, so I'm guessing we probably still have to pass it in (most of its __init__ arguments are already directly passed to parse_qsl anyways).

@AdamGold
Copy link
Contributor Author

@Fidget-Spinner Thanks! much appreciated.
Regarding CGI and FieldStorage, isn't that used for POST requests?

Yup, however it uses parse_qsl internally somewhere, so I'm guessing we probably still have to pass it in (most of its __init__ arguments are already directly passed to parse_qsl anyways).

From the docs:

The script’s input is connected to the client too, and sometimes the form data is read this way; at other times the form data is passed via the “query string” part of the URL

I honestly don't know enough about CGI to tell - how is the query string being used there?

Copy link
Member

@Fidget-Spinner Fidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly don't know enough about CGI to tell - how is the query string being used there?

So I didn't know this either, but a POST form can contain a query string too... 😮; so FieldStorage has to parse query strings: https://stackoverflow.com/questions/14710061/is-it-valid-to-combine-a-form-post-with-a-query-string/14710450

Lib/cgi.py Outdated Show resolved Hide resolved
Lib/cgi.py Outdated Show resolved Hide resolved
Lib/cgi.py Outdated Show resolved Hide resolved
Lib/cgi.py Show resolved Hide resolved
Copy link
Member

@merwok merwok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefered the other version that always split on & and had a boolean param to also parse on ;. Waiting on Senthil’s and others’ opinion on the bug report.

@AdamGold
Copy link
Contributor Author

AdamGold commented Jan 25, 2021

So I didn't know this either, but a POST form can contain a query string too..

@Fidget-Spinner implemented.

@merwok
Copy link
Member

merwok commented Jan 25, 2021

Just a process note: github’s user experience is not great for reviewers when a PR has its history rewritten.
We prefer that people add new commits, and the final merge is always a squash merge. Thanks!
See https://devguide.python.org/pullrequest/#submitting

@AdamGold
Copy link
Contributor Author

@merwok Just saw this comment - noted, will be following this from now on.

@orsenthil
Copy link
Member

I prefered the other version that always split on & and had a boolean param to also parse on ;.

@merwok - This is a good thinking, and I originally thought along the lines of stick with '&' or just '&' and ';' in some form.

Taking werkzeug package example (used in Flask and other projects) made it think that it's very reasonable to with a separator argument, with default as & pallets/werkzeug@6503519#diff-2d06ddf3f490f8cbcda42ae788a0586f08d5c96e67d89784a785c8295759e60dR30 (changed since 2009)

  1. The W3C recommendation -

Let strings be the result of strictly splitting the string payload on U+0026 AMPERSAND characters (&).

is taken care by this patch.

  1. Using a bool to accommodate an older unrecommended separator will confuse people. But to accommodate the old un-changed practice, like you showed the example of debian bugs, providing the option to override separator made more sense, since it was explicit.

The breaking change is disallowing both & and ; as separator together, and due to attack-vector of cache-poisoning exposed as a vulnerability, the primary breaking change is about disallowing both these characters together. I think, seems reasonable thing to do, even as it potential break some software that were relying on it.

Copy link
Member

@orsenthil orsenthil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, using the separator with '&' as default seems much better than not providing any option change and strictly use '&' as this moment.

We will only need a documentation that highlights this behavior and change in behavior from past and this patch can be merged.

@AdamGold
Copy link
Contributor Author

@orsenthil Where should I write this documentation?

ned-deily pushed a commit that referenced this pull request Feb 15, 2021
…H-24532)

bpo-42967: [security] Address a web cache-poisoning issue reported in
urllib.parse.parse_qsl().

urllib.parse will only us "&" as query string separator by default
instead of both ";" and "&" as allowed in earlier versions. An optional
argument seperator with default value "&" is added to specify the
separator.

Co-authored-by: Éric Araujo <merwok@netwok.org>
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Co-authored-by: Adam Goldschmidt <adamgold7@gmail.com>
gentoo-bot pushed a commit to gentoo/cpython that referenced this pull request Mar 4, 2021
…4297)  (pythonGH-24532)

bpo-42967: [security] Address a web cache-poisoning issue reported in
urllib.parse.parse_qsl().

urllib.parse will only us "&" as query string separator by default
instead of both ";" and "&" as allowed in earlier versions. An optional
argument seperator with default value "&" is added to specify the
separator.

Co-authored-by: Éric Araujo <merwok@netwok.org>
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Co-authored-by: Adam Goldschmidt <adamgold7@gmail.com>

Rebased for Python 2.7 by Michał Górny
kaxil added a commit to astronomer/airflow that referenced this pull request Mar 10, 2021
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache@061cd23
- apache@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.
kaxil added a commit to apache/airflow that referenced this pull request Mar 10, 2021
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- 061cd23
- 49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.
adorilson pushed a commit to adorilson/cpython that referenced this pull request Mar 13, 2021
bpo-42967: [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl().

urllib.parse will only us "&" as query string separator by default instead of both ";" and "&" as allowed in earlier versions. An optional argument seperator with default value "&" is added to specify the separator.


Co-authored-by: Éric Araujo <merwok@netwok.org>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Co-authored-by: Éric Araujo <merwok@netwok.org>
kaxil added a commit to apache/airflow that referenced this pull request Mar 19, 2021
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- 061cd23
- 49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

(cherry picked from commit ffe3bd2)
raspbian-autopush pushed a commit to raspbian-packages/python3.5 that referenced this pull request Apr 8, 2021
ashb pushed a commit to apache/airflow that referenced this pull request Apr 15, 2021
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- 061cd23
- 49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

(cherry picked from commit ffe3bd2)
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Sep 16, 2021
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

(cherry picked from commit ffe3bd29574d62a0a692cd8f63995856bbff8c0b)

GitOrigin-RevId: 4033041ab9a8806c139c6dc3e9b77f3818aca962
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Sep 17, 2021
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Sep 23, 2021
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
raspbian-autopush pushed a commit to raspbian-packages/python3.5 that referenced this pull request Nov 4, 2021
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Nov 27, 2021
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Mar 10, 2022
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Jun 4, 2022
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Jul 9, 2022
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Aug 27, 2022
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Oct 4, 2022
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
aglipska pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Oct 7, 2022
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Dec 7, 2022
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this pull request Jan 27, 2023
python/cpython#24297 change was included in
Python 3.8.8 to fix a vulnerability (bpo-42967)

Depending on which Base Python Image is run in our CI, two of the tests
can fail or succeed.

Our Previous two attempts:

- apache/airflow@061cd23
- apache/airflow@49952e7

We might for a while get different base python version depending on the changes of a PR (whether or not it includes a change to dockerfiler).
a) when you have PR which do not have changes in the Dockerfile, they will use the older python version as base (for example Python 3.8.7)
b) when you have PR that touches the Dockerfile and have setup.py changes in master, it should pull Python 3.8.8 first.

GitOrigin-RevId: ffe3bd29574d62a0a692cd8f63995856bbff8c0b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants