New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct query loss when using parse_qsl to dict #622

Merged
merged 1 commit into from Aug 17, 2016

Conversation

Projects
None yet
4 participants
@dhermes
Contributor

dhermes commented Aug 11, 2016

NOTE: I noticed this when doing a fairly comprehensive source check to find the httplib2 footprint (for #128).

The issue is that repeated params could just be dropped:

>>> from six.moves import urllib
>>> params_list = urllib.parse.parse_qsl('a=1&a=2')
>>> params_list
[('a', '1'), ('a', '2')]
>>> dict(params_list)
{'a': '2'}

@googlebot googlebot added the cla: yes label Aug 11, 2016

@theacodes

This comment has been minimized.

Show comment
Hide comment
@theacodes

theacodes Aug 11, 2016

Member

This LGTM, but I wonder if there's not already something that handles this well?

Member

theacodes commented Aug 11, 2016

This LGTM, but I wonder if there's not already something that handles this well?

@dhermes

This comment has been minimized.

Show comment
Hide comment
@dhermes

dhermes Aug 11, 2016

Contributor

I just did git grep parse_q. It's crazy it was used incorrectly in 4 different places.

Contributor

dhermes commented Aug 11, 2016

I just did git grep parse_q. It's crazy it was used incorrectly in 4 different places.

@dhermes

This comment has been minimized.

Show comment
Hide comment
@dhermes

dhermes Aug 11, 2016

Contributor

@jonparrott I want to hold off on the merge until @nathanielmanistaatgoogle weighs in

Contributor

dhermes commented Aug 11, 2016

@jonparrott I want to hold off on the merge until @nathanielmanistaatgoogle weighs in

@theacodes

This comment has been minimized.

Show comment
Hide comment
@theacodes

theacodes Aug 11, 2016

Member

@dhermes good plan. This seems like hackery I'm not sure I like.

Member

theacodes commented Aug 11, 2016

@dhermes good plan. This seems like hackery I'm not sure I like.

@dhermes

This comment has been minimized.

Show comment
Hide comment
@dhermes

dhermes Aug 11, 2016

Contributor

This seems like hackery I'm not sure I like.

What is "this"?

Contributor

dhermes commented Aug 11, 2016

This seems like hackery I'm not sure I like.

What is "this"?

@theacodes

This comment has been minimized.

Show comment
Hide comment
@theacodes

theacodes Aug 11, 2016

Member

Doing urlparsing and such.

Member

theacodes commented Aug 11, 2016

Doing urlparsing and such.

@dhermes

This comment has been minimized.

Show comment
Hide comment
@dhermes

dhermes Aug 11, 2016

Contributor

Yeah I dunno. You can file an issue to follow-up?

Contributor

dhermes commented Aug 11, 2016

Yeah I dunno. You can file an issue to follow-up?

@theacodes

This comment has been minimized.

Show comment
Hide comment
@theacodes

theacodes Aug 11, 2016

Member

Just more to think about before we close #128 I think.

Member

theacodes commented Aug 11, 2016

Just more to think about before we close #128 I think.

@dhermes

This comment has been minimized.

Show comment
Hide comment
@dhermes

dhermes Aug 11, 2016

Contributor

Currently typing one-handed w/baby in my arms, can you add a checkbox about it in the long desc. for #128 for me?

Contributor

dhermes commented Aug 11, 2016

Currently typing one-handed w/baby in my arms, can you add a checkbox about it in the long desc. for #128 for me?

@theacodes

This comment has been minimized.

Show comment
Hide comment
@theacodes

theacodes Aug 11, 2016

Member

I read that as "typing with baby arms". :D

Will do.

Member

theacodes commented Aug 11, 2016

I read that as "typing with baby arms". :D

Will do.

@dhermes dhermes referenced this pull request Aug 11, 2016

Closed

Enable BYO http library #128

2 of 7 tasks complete
def update_query_params(uri, params):
"""Updates a URI with new query parameters.
If a given key from ``params`` is repeated in the ``uri``, then

This comment has been minimized.

@nathanielmanistaatgoogle

nathanielmanistaatgoogle Aug 13, 2016

Contributor

Sorcerery!

These are really strange semantics; is there some resource to which you can link that explains why this is correct or useful? (Yes, there's a lot I don't know about the web.)

@nathanielmanistaatgoogle

nathanielmanistaatgoogle Aug 13, 2016

Contributor

Sorcerery!

These are really strange semantics; is there some resource to which you can link that explains why this is correct or useful? (Yes, there's a lot I don't know about the web.)

This comment has been minimized.

@dhermes

dhermes Aug 15, 2016

Contributor

I don't have an opinion on what should happen, so just tell me what you want. The issue is that casting the results of parse_qsl (a list of tuples, with repeats allowed) directly to a dict is that query params can be lost.

It's possible / probable that all call sites assumed the same thing that parse_unique_urlencoded does and then meant to replace (if existing) or add the param.

@dhermes

dhermes Aug 15, 2016

Contributor

I don't have an opinion on what should happen, so just tell me what you want. The issue is that casting the results of parse_qsl (a list of tuples, with repeats allowed) directly to a dict is that query params can be lost.

It's possible / probable that all call sites assumed the same thing that parse_unique_urlencoded does and then meant to replace (if existing) or add the param.

@dhermes

This comment has been minimized.

Show comment
Hide comment
@dhermes

dhermes Aug 15, 2016

Contributor

@nathanielmanistaatgoogle PTAL.

In particular, we should resolve what we think the actual goal of all those dict(parse_qsl(...)) calls were.

Contributor

dhermes commented Aug 15, 2016

@nathanielmanistaatgoogle PTAL.

In particular, we should resolve what we think the actual goal of all those dict(parse_qsl(...)) calls were.

@nathanielmanistaatgoogle

This comment has been minimized.

Show comment
Hide comment
@nathanielmanistaatgoogle

nathanielmanistaatgoogle Aug 17, 2016

Contributor

Agreed that this needs a deeper look; probably site-by-site (luckily there are few). Replace-except-if-the-value-is-a-list-of-two-or-more-then-add is just weird semantics.

If it's the case that all sites in the library are correct with replace semantics, then our library-private _helpers module should make available a helper function that implements replace semantics.

If it's the case that all sites in the library are correct with add-to semantics, then our library-private _helpers module should make available a helper function that implements add-to semantics.

If it's the case that some sites in the library are correct with replace semantics and some are correct with add-to semantics, then our library-private _helpers module should make available a helper function that implements replace semantics and a helper function that implements add-to semantics.

I seriously doubt that it's the case that there's a site in the library where the correct semantics are replace-except-if-the-value-is-a-list-of-two-or-more-then-add.

Am I making sense?

Contributor

nathanielmanistaatgoogle commented Aug 17, 2016

Agreed that this needs a deeper look; probably site-by-site (luckily there are few). Replace-except-if-the-value-is-a-list-of-two-or-more-then-add is just weird semantics.

If it's the case that all sites in the library are correct with replace semantics, then our library-private _helpers module should make available a helper function that implements replace semantics.

If it's the case that all sites in the library are correct with add-to semantics, then our library-private _helpers module should make available a helper function that implements add-to semantics.

If it's the case that some sites in the library are correct with replace semantics and some are correct with add-to semantics, then our library-private _helpers module should make available a helper function that implements replace semantics and a helper function that implements add-to semantics.

I seriously doubt that it's the case that there's a site in the library where the correct semantics are replace-except-if-the-value-is-a-list-of-two-or-more-then-add.

Am I making sense?

@dhermes

This comment has been minimized.

Show comment
Hide comment
@dhermes

dhermes Aug 17, 2016

Contributor

Yes. But the issue with replace is if the original had multiple values, then what does replace mean? Which is why I suggested we first verify that each parameter shows up exactly once, then replace.

i.e. a=1&b=2 replace b=3 (or c=3) is easy but what if the original was b=1&b=2

Contributor

dhermes commented Aug 17, 2016

Yes. But the issue with replace is if the original had multiple values, then what does replace mean? Which is why I suggested we first verify that each parameter shows up exactly once, then replace.

i.e. a=1&b=2 replace b=3 (or c=3) is easy but what if the original was b=1&b=2

@nathanielmanistaatgoogle

This comment has been minimized.

Show comment
Hide comment
@nathanielmanistaatgoogle

nathanielmanistaatgoogle Aug 17, 2016

Contributor

For private helper methods, replace means what we implement it to mean. I could see it going one of two ways: either "replace" semantics mean "associate this key with this value, no matter whether zero, one, two, or more values were associated with the key previously" or "replace" semantics mean "associate this key with this value, and if there were zero values with the key previously, that's fine, and if there was one value with the key previously, that's also fine, but if there were two or more values associated with the key previously, that's a problem so raise an exception". While either of these choices would be acceptable to be blessed as correct, I suspect it's the first that is going to be more useful and applicable in our codebase. Is my suspicion correct?

Contributor

nathanielmanistaatgoogle commented Aug 17, 2016

For private helper methods, replace means what we implement it to mean. I could see it going one of two ways: either "replace" semantics mean "associate this key with this value, no matter whether zero, one, two, or more values were associated with the key previously" or "replace" semantics mean "associate this key with this value, and if there were zero values with the key previously, that's fine, and if there was one value with the key previously, that's also fine, but if there were two or more values associated with the key previously, that's a problem so raise an exception". While either of these choices would be acceptable to be blessed as correct, I suspect it's the first that is going to be more useful and applicable in our codebase. Is my suspicion correct?

@dhermes

This comment has been minimized.

Show comment
Hide comment
@dhermes

dhermes Aug 17, 2016

Contributor

@nathanielmanistaatgoogle Hopefully this works. Don't forget to remind me to squash on LGTM

Contributor

dhermes commented Aug 17, 2016

@nathanielmanistaatgoogle Hopefully this works. Don't forget to remind me to squash on LGTM

@nathanielmanistaatgoogle

This comment has been minimized.

Show comment
Hide comment
@nathanielmanistaatgoogle

nathanielmanistaatgoogle Aug 17, 2016

Contributor

Looks good (after out-of-band discussion); squash and merge.

Contributor

nathanielmanistaatgoogle commented Aug 17, 2016

Looks good (after out-of-band discussion); squash and merge.

@dhermes dhermes merged commit 51ae876 into googleapis:master Aug 17, 2016

3 checks passed

cla/google All necessary CLAs are signed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage remained the same at 100.0%
Details

@dhermes dhermes deleted the dhermes:allow-repeated-params branch Aug 17, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment