Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

query_values regressed with latest release #77

Closed
crawlik opened this issue Jul 25, 2012 · 24 comments
Closed

query_values regressed with latest release #77

crawlik opened this issue Jul 25, 2012 · 24 comments
Labels

Comments

@crawlik
Copy link

crawlik commented Jul 25, 2012

Hi guys,
According this doc http://addressable.rubyforge.org/api/Addressable/URI.html#query_values-instance_method query_values has regressed with latest release.

Here is what get calling it
jruby-1.6.2 :001 > require "addressable/uri" => true jruby-1.6.2 :002 > Addressable::URI.parse( jruby-1.6.2 :003 > "?one[two][three][]=four&one[two][three][]=five" jruby-1.6.2 :004?> ).query_values => {"one[two][three][]"=>"five"}
2.2.8 version returns expected results
ruby-1.6.2 :001 > require "addressable/uri" => true jruby-1.6.2 :002 > require "addressable/uri" => false jruby-1.6.2 :003 > Addressable::URI.parse( jruby-1.6.2 :004 > "?one[two][three][]=four&one[two][three][]=five" jruby-1.6.2 :005?> ).query_values => {"one"=>{"two"=>{"three"=>["four", "five"]}}}

@sporkmonger
Copy link
Owner

This is correct. I haven't been able to update the docs due to a weird SSH error from RubyForge that I haven't figured out yet, but yes, this is a breaking change. The old style of parsing query parameters was non-standard, excessively complex, and worked very poorly in the case of repeated parameters. It's been abandoned in favor of deleting and simplifying the code drastically.

On the one hand, it's good to get the feedback that this functionality was getting used, but I feel very strongly at this point that this is a query parameter format that introduces a great deal of unnecessary complexity. I would recommend trying to transition to simpler query parameters and avoiding complex structured parameters if at all possible.

I believe Rails originally introduced this format, and I think it was a great disservice to everyone to enable this kind of thing instead of forcing the developer to simplify.

The original parsing code is still in source control, and there's no reason why you couldn't borrow from it if you determine that this is something you absolutely have to have, but honestly, I don't recommend it. Yes, it's very mature code, and extremely well-tested, but it's also slow, overly complex, and it causes a lot of problems and interface inconsistencies for much more important scenarios like repeated parameters.

@crawlik
Copy link
Author

crawlik commented Jul 26, 2012

@sporkmonger thanks for explanation. There is one more thing I would like to mention. Take a look an this query string ?one[]=two&one[]=three. query_values returns {"one[]"=>"three"} Note [] is part of returned parameter name. Wonder if it is expected behavior. Used to be returned without []. Please let me know.
BTW is there an RFC that describes use of [] in URI query string. I checked RFC 3986 but could not find any specifics on [] use except that they are reserved. Thanks.

@sporkmonger
Copy link
Owner

This is intended behavior now. The use of [] was, I believe, introduced initially by Rails (and copied by Addressable for compatibility, but then also expanded somewhat) as a way to make URI query parameters more "ruby-like", but I believe this was a counter-productive objective, which is why this functionality was removed.

@Xylakant
Copy link

The use of [] as delimiter for nested query parameters predates rails for quite a while. PHP at least uses the same format for nested query parameters and has done so as far as I can remember (at least back to version 3). Nested query params are understood by rack itself, so any rack-based framework can handle that format.

I'd really love that functionality to stick around since it makes interoperability so much easier.

@sporkmonger
Copy link
Owner

I suspect this is going to continue being a controversial decision, but so far I haven't heard a sufficiently strong argument to reverse it. For absolute certain I will never accept "PHP did it!" as a good reason to follow suit. Happy to be corrected on my history of the usage, but the decision still stands. Nested query parameters are a mistake and Addressable will not support them natively, because there's no way to do so without either massively polluting the interface and causing developer confusion on which method to use, or overloading a single set of methods with a complex parser that ends up being awkward and ambiguous for assignment.

I am far, far more concerned about supporting this use case:

?thing=1&thing=2

Than I am with this use case:

?thing[]=1&thing[]=2

The reason is simple. One is a perfect superset of the other. I can implement the latter using the former, but not the other way around. So I will choose the more expressive option every single time, even if it occasionally makes the developer do a little more work. And especially since I have a great deal of reason at this point to believe that the developer shouldn't be doing this in the first place.

@Xylakant
Copy link

Doing as PHP did is not a reason in itself, but wouldn't it be nice for interoperability if everyone did query parameters in the same way? And PHP has been around a long time, so that it certainly had an influence on how other web-languages/frameworks are built. (like rack for example)

In any case: What's the argument against nested query parameters? I'm the author of https://github.com/Asquera/warden-hmac-authentication/ which uses an hmac signature to authenticate requests between servers. I need to add various parameters to the query string. To prevent polluting the parameter namespace, I use a single parameter name with nested sub-keys. I could do without, but I don't want to accidentally overwrite random parameters. I use addressable to generate the final URIs, and that's now broken for me in the latest release. I certainly can do without or use flat parameter names, but that means that on the receiving side, the parameter namespace is filled with a set of authentication-related parameters that are only of interest for my warden-authenticator. I can certainly clean that up, but that's not nice either.

I've seen similar ways of passing a set of related parameters in other use-cases. In general: Nested parameters are useful, since they allow grouping of query parameters. The use of [] as delimiter is not forbidden by the respective RFC. I'd love to see them supported in a way that's in line with how other frameworks and languages support them

(edit: clarified statement about RFC).

@sporkmonger
Copy link
Owner

Not sure what you mean that [] are covered by the RFC.

@Xylakant
Copy link

They're allowed as separators in the RFC, i.e. not reserved characters. Sorry about the confusion.

@bascht
Copy link

bascht commented Sep 14, 2012

One quick question. I've stumbled across this thread, as I'm both using the Addressable as well as warden-hmac-authentication gem.

Like @sporkmonger said: »this is a breaking change.« - why does it come covert in a minor 2.2.8 to 2.2.9 version bump?

@sporkmonger
Copy link
Owner

@bascht That's 100% my fault, and mea culpa. To be honest, I didn't actually expect it'd even be noticed, and clearly I was wrong. For absolute sure, that should not have happened and won't happen again.

@sporkmonger
Copy link
Owner

Off topic and out of curiosity, why are people today picking warden-hmac-authentication over either OAuth 1 or OAuth 2?

@Xylakant
Copy link

It's a different use-case. HMAC auth works fine if you have two servers that can share one secret, such as signed links to a CDN-Server from a website. So you can generate the link on a page that's protected by login that points to a protected download and all you actually need to share between the servers is one secret. Basically it's the website telling the CDN "I know that guy and he's trustworthy".

@sporkmonger
Copy link
Owner

Eh? That description sounds like a bearer token, which is pretty different from what I saw in the README, so I think I'm still lost.

@Xylakant
Copy link

As far as I understand the usage of bearer tokens I guess it's similar. But using an HMAC does not require you to go through the whole OAUTH token exchange process. You can just calculate it from the URI and the shared secret. AWS S3 for example uses HMACS for the signed URLs to protected files. So it's way simpler to actually implement and sufficient for a lot of use-cases where OAUTH would be overblown.

@flipsasser
Copy link

I'm not sure if this is the most helpful contribution, as I agree strongly with Bob's reasoning, but I went ahead and added nested hash support to Addressabler. Addressabler is an add-on library for Addressable::URI meant to add missing functionality (like TLD or subdomain parsing). Since this appears to be considered "missing" functionality by the community, it's now supported via Rack::Utils.build_nested_query.

I encourage people not to go down this road, but if you want to, I've made it easy. More details here:

http://github.com/flipsasser/addressabler

@sporkmonger
Copy link
Owner

@flipsasser, I'm not at a computer right now, so I can't easily check, but does this add support by monkey-patching the query_values method? If so, I'd like to ask that you use some other method name.

@sporkmonger
Copy link
Owner

I take that back. Apparently I can easily check. Looks good to me.

@flipsasser
Copy link

@sporkmonger it's designed never to get in the way of Addressable. Respect.

@sporkmonger
Copy link
Owner

Much appreciated. Not everyone is so careful. :-)

@pencil
Copy link

pencil commented Mar 28, 2013

What's the status here? I use addressable to remove some query_values:

service_uri.query_values = service_uri.query_values.except('service', 'ticket', 'gateway', 'renew')

But this also removes multiple occurrences with the same name. How can I avoid that?

@sporkmonger
Copy link
Owner

@pencil I'm pretty sure your question is totally unrelated to this issue...

Also you may find it easier to work with an Array and a select filter: service_uri.query_values(Array).select { |k,v| # your code here }, given what it sounds like you're trying to do.

JonRowe pushed a commit to JonRowe/webmock that referenced this issue Jan 9, 2014
@runa
Copy link

runa commented Jan 13, 2014

Sorry, not sure if you are still interested in this regression, but I just found it and I want to comment on the ?thing=1&thing=2 vs ?thing[]=1&thing[]=2 discussion.

  • AFAIK in ?thing=1&thing=2 the 2nd thing overrides the first one, so it ends up being thing=2
  • Also, how would you differentiate between a simple string parameter and a 1 element array using the non-bracketed version?

I'm now using Rack::Utils.parse_nested_query which seems to do the trick

[5] pry(main)> Addressable::URI.parse("?one[two][three]=four").query_values
=> {"one[two][three]"=>"four"}
[6] pry(main)> Rack::Utils.parse_nested_query("one[two][three]=four")
=> {"one"=>{"two"=>{"three"=>"four"}}}

@sporkmonger
Copy link
Owner

Ultimately, the default result for ?thing=1&thing=2 should be {"thing" => [1, 2]}.

@krainboltgreene
Copy link

Sorry to dredge this up, but could this possibly be helped by having Addressable::URI return a Addressable::Query object that can have it's own set of parse logic? For backward compatibility you could make it a Hash-like object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants