Passing a string that contains a semi-colon doesn't properly get encoded #9224

Closed
bobber205 opened this Issue Sep 23, 2014 · 10 comments


@bobber205

Discovered this today in an app with Rails as the backend (since it totally barfs on unescaped semicolons).

I had a string such as "test;test"

I passed it in params to $http post() and put(), and the request contained

test;test

so the ; was sent as-is instead of being encoded as %3B.

Shouldn't it encode?

@caitp
Contributor
caitp commented Sep 23, 2014

https://github.com/angular/angular.js/blob/master/src/Angular.js#L1128 the comment explains why, but if it breaks rails, hmm --- maybe an option around this would be good
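For readers who don't want to open the link, here is a minimal sketch of the behavior being reported. It assumes the linked helper runs `encodeURIComponent` and then restores some characters, including `;`. The function name `encodeUriQuerySketch` is hypothetical and this is not a copy of the Angular source; the real helper may restore other characters as well.

```javascript
// Hypothetical stand-in for the helper linked above (NOT the Angular source).
// Assumption: it percent-encodes with encodeURIComponent, then puts ';' back.
function encodeUriQuerySketch(val) {
  return encodeURIComponent(val).replace(/%3B/gi, ';');
}

// What the reporter expected (plain encodeURIComponent):
console.log(encodeURIComponent('test;test'));

// What the reporter observed (semicolon restored after encoding):
console.log(encodeUriQuerySketch('test;test'));
```

The first call encodes the semicolon as `%3B`; the second leaves it as a literal `;`, which is what a backend that treats `;` as a delimiter would trip over.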

@caitp caitp added this to the 1.3.0 milestone Sep 23, 2014
@bobber205

My workaround was to stop sending my JSON data as a param.
Doesn't look like ; is a valid url separator?
http://stackoverflow.com/questions/3867460/valid-url-separators

@jeffbcross
Contributor

Semicolon is a valid delimiter for some parts of a URL (at least search, I think path as well).

@jeffbcross jeffbcross modified the milestone: 1.3.0-rc.5, 1.3.0, Backlog Sep 29, 2014
@DanielHeath

From the duplicate issue: http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.2.2 is the relevant spec - semicolons should definitely be escaped.

@DanielHeath

RE https://github.com/angular/angular.js/blob/9474ec120acc425103516f1018e63ae39d7d629b/src/Angular.js#L1109

What's so bad about encoding things that might not need to be encoded (to justify a parallel implementation in angular with its own bugs, tests, support etc)?

@pkozlowski-opensource

I must say that I tend to agree with @DanielHeath here. The way I read this comment from RFC 3986 is that a query can consist of pchars, which in turn can contain unreserved / pct-encoded / sub-delims / ":" / "@". But IMO it doesn't say anything about %-encoding (or not encoding) sub-delims used in a non-delimiting role.

In the mentioned RFC we can read (2.2):

URI producing applications should percent-encode data octets that
correspond to characters in the reserved set unless these characters
are specifically allowed by the URI scheme to represent data in that
component. If a reserved character is found in a URI component and
no delimiting role is known for that character, then it must be
interpreted as representing the data octet corresponding to that
character's encoding in US-ASCII.

So much for the spec... Then, when we look at what is going on "in the wild" we can easily see that there are multiple gen-delims and sub-delims used commonly un-encoded (ex.: foo[]=bar, foo+bar).

We are not the only ones struggling with the spec / reality interpretation here, ex.:
http://www.456bereastreet.com/archive/201008/what_characters_are_allowed_unencoded_in_query_strings/

To sum up: we are in a grey area here... The spec seems to suggest that we should %-encode all the gen-delims and sub-delims when used in their non-delimiting role, but this is not what most software does :-/ So it seems that whatever we do, we might break some implementations.

Coming back to the issue at hand: we could %-encode ; and this would keep some back-ends happy while it might break others. I guess we might try it in 1.4 and see what gets broken in practice, or advise people to %-encode things on their own to match their backend. What we should do in 1.4 is extract the URL-generation logic and let people easily override it.

Putting as 1.4 suggestion.

@petebacondarwin
Member

See #8377

It is a grey area. The spec says that encoding (or not encoding) is application specific.

Perhaps the best 1.4 solution (as part of the $http refactoring) would be to provide a configuration for whether to encode this...

@pkozlowski-opensource

Yeah, this is definitely a grey area, and the spec really leaves this open to interpretation... This is one more reason for me to push for a dedicated service that is responsible for request params serialisation: it would be much easier to test / configure this as a separate service.

@petebacondarwin petebacondarwin modified the milestone: Backlog, 1.4.x Dec 15, 2014
@pkozlowski-opensource pkozlowski-opensource added a commit to pkozlowski-opensource/angular.js that referenced this issue Jan 24, 2015
@pkozlowski-opensource pkozlowski-opensource feat($httpUrlParams): introduce new service abstracting params serialization

Closes #7429
Closes #9224
6aae996
@pkozlowski-opensource pkozlowski-opensource added a commit to pkozlowski-opensource/angular.js that referenced this issue Jan 24, 2015
@pkozlowski-opensource pkozlowski-opensource feat($httpUrlParams): introduce new service abstracting params serialization

Closes #7429
Closes #9224
6728785
@schmod
Contributor
schmod commented Feb 11, 2015

I work with a platform that has a (partial) implementation of Matrix URIs.

Admittedly, Matrix URIs never became a standard, and are not in widespread use. However, there are definitely existing implementations that rely on unescaped semicolons being in the path.

The query seems to be much more of a grey area, although the older HTML specifications encouraged CGI developers to support semicolon-delimited query strings.

IMO, there's enough precedent to make me apprehensive about automatically escaping semicolons in any URI component. If a developer is interacting with an idiosyncratic server implementation, it seems to make the most sense to fix the server, or write an interceptor to "sanitize" URLs before they are sent to the server.

HOWEVER

@bobber205's issue is a very specific and isolated use case: Putting semicolons inside of a single query parameter value. In this exact situation (and only this situation), I actually think that it is appropriate to escape semicolons, as server implementations might misinterpret an unescaped semicolon as a delimiter.

In summary:

  • path: Don't escape semicolons. Breaks matrix parameters, and no evidence that Angular's current behavior is problematic.
  • query: Escaping the entire query string would be unwise, as it breaks an established (albeit archaic) expectation for servers to be able to handle semicolon-delimited query strings.
  • query values: Probably safe to escape, especially if the server interprets semicolons as query delimiters. Passing complex data types (ie. JSON) via query parameters is a bad idea, but "real-world" string values often contain semicolons, and it seems like a good idea to escape them.
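The three-way split above can be illustrated with a small sketch: leave the path alone (so matrix-style `;` segments survive) and percent-encode only the individual query keys and values. `appendParams` is a hypothetical helper written for this illustration; it is not part of Angular, and a real application might call something like it from a request interceptor or a custom serializer.

```javascript
// Hypothetical helper (not an Angular API): appends params to a URL,
// encoding each key and value but never touching the path itself.
function appendParams(url, params) {
  var pairs = Object.keys(params).sort().map(function (key) {
    return encodeURIComponent(key) + '=' + encodeURIComponent(String(params[key]));
  });
  if (pairs.length === 0) {
    return url;
  }
  // Use '?' for the first parameter, '&' if the URL already has a query.
  return url + (url.indexOf('?') === -1 ? '?' : '&') + pairs.join('&');
}

// The ';' in the path is preserved; the ';' inside the value becomes %3B.
console.log(appendParams('/map;zoom=3/tiles', { q: 'test;test' }));
```

With this approach a server that treats `;` as a query delimiter never sees a raw semicolon that came from user data, while a server that relies on semicolons in the path is unaffected.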
@pkozlowski-opensource pkozlowski-opensource added a commit to pkozlowski-opensource/angular.js that referenced this issue Mar 4, 2015
@pkozlowski-opensource pkozlowski-opensource feat($httpUrlParams): introduce new service abstracting params serialization

Closes #7429
Closes #9224
f6f00f0
@pkozlowski-opensource pkozlowski-opensource added a commit to pkozlowski-opensource/angular.js that referenced this issue Mar 4, 2015
@pkozlowski-opensource pkozlowski-opensource feat($httpUrlParams): introduce new service abstracting params serialization

Closes #7429
Closes #9224
16b3a33
@pkozlowski-opensource pkozlowski-opensource added a commit to pkozlowski-opensource/angular.js that referenced this issue Apr 1, 2015
@pkozlowski-opensource pkozlowski-opensource feat($http): support custom params serializers
Closes #3740
Closes #7429
Closes #9224
12f261e
@pkozlowski-opensource pkozlowski-opensource added a commit to pkozlowski-opensource/angular.js that referenced this issue Apr 2, 2015
@pkozlowski-opensource pkozlowski-opensource feat($http): support custom params serializers
Closes #3740
Closes #7429
Closes #9224
292c757
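
Editor's sketch of how the "custom params serializers" feature named in these commits might be used to get strict encoding. The `paramSerializer` option name is taken from the AngularJS 1.4+ `$http` documentation as I understand it, not from this thread, so treat the wiring as an assumption; `strictParamSerializer` is a hypothetical function written for this example.

```javascript
// Hypothetical strict serializer: every key and value goes through
// encodeURIComponent, so ';' is always sent as %3B.
function strictParamSerializer(params) {
  if (!params) {
    return '';
  }
  return Object.keys(params)
    .filter(function (key) {
      // Skip null/undefined values rather than sending "null" or "undefined".
      return params[key] !== null && params[key] !== undefined;
    })
    .sort()
    .map(function (key) {
      return encodeURIComponent(key) + '=' + encodeURIComponent(String(params[key]));
    })
    .join('&');
}

// Assumed wiring in AngularJS 1.4+ (left as a comment so this runs without Angular):
// $http.get('/search', { params: { q: 'test;test' }, paramSerializer: strictParamSerializer });

console.log(strictParamSerializer({ q: 'test;test', page: 2 }));
```

This sketch only handles flat string and number values; arrays, dates, and nested objects would need their own handling depending on what the backend expects.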
@bobber205

Thanks for all the hard work! ❤️

@netman92 netman92 added a commit to netman92/angular.js that referenced this issue Aug 8, 2015
@pkozlowski-opensource @netman92 pkozlowski-opensource + netman92 feat($http): support custom params serializers
Closes #3740
Closes #7429
Closes #9224
Closes #11461
91cf48f
@ljagiello ljagiello added a commit to ljagiello/grafana that referenced this issue Apr 6, 2016
@ljagiello ljagiello Fix issue #4533: InfluxDB 0.11.1+ Multiple queries
InfluxDB now properly supports semi-colons and throws an error if a
semi-colon is missing.

When joining multiple queries, Grafana was using only "\n" as a
separator between queries.

My patch fixes this by adding semi-colons between queries.

As a side effect I hit a bug with angular.js not correctly URL-encoding
semi-colons (there is a longer discussion about it:
angular/angular.js#9224).

I hope it's ok to remove that regex and keep semi-colon encoded in URL.

Tested with InfluxDB 0.10, 0.11 and 0.12 (don't have any 0.9).
15b911b