Support Matrix parameters #181

Open
cowwoc opened this issue Nov 7, 2014 · 23 comments

@cowwoc

cowwoc commented Nov 7, 2014

Please add support for matrix parameters: http://www.w3.org/DesignIssues/MatrixURIs.html

@rodneyrehm
Member

What kind of API would you expect this to have?

I wonder if @annevk came across any matrix params in the wild in his URL tests.

@annevk

annevk commented Nov 8, 2014

This seems like it's just an idea from timbl, not something that actually happened.

@rodneyrehm
Member

Yeah, I thought so. I came across this (or something similar?) a couple of years ago, but never saw anyone actually use it. Thanks @annevk!

@cowwoc
Author

cowwoc commented Nov 8, 2014

@rodneyrehm I first heard of matrix parameters in O'Reilly's RESTful Web Services book and have used them in production for the past 4-5 years. They are well supported by JAX-RS and most REST frameworks I've come across.

I am expecting to be able to get/set matrix parameters the same way as I set query parameters, but in addition to adding/removing parameters against the last path segment (as query parameters currently do) I want additional methods to be able to add/remove matrix parameters against non-terminal path segments.

@rodneyrehm
Member

Ok, then let's toy with this for a bit. As MatrixURIs is a bit vague at times, I'll quickly re-iterate what I understood.

The character used to delimit keys and values is =, and the character used to delimit key-value pairs is ;. Both are reserved characters, more precisely members of the subset sub-delims. Because sub-delims is permitted in every path segment production (segment, segment-nz, segment-nz-nc) as per the Collected ABNF for URI, the notation fits general URI rules.

For the most part, the matrix notation fits the encoding/decoding of x-www-form-urlencoded, with these exceptions:

  • spaces being encoded as %20 instead of +
  • ; instead of & being the key-value-pair delimiter
  • keys are unique (for compatibility, last declaration wins)
  • ;key, unlike &key does not convey a null value (;key= and &key= behave the same, though) - ;key has to be removed from the segment during normalization
  • order of keys does not matter, so by convention we'll sort them alphabetically
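A minimal sketch of how these rules could be applied to a single path segment. The helper name parseMatrixSegment is hypothetical (it is not part of URI.js), and dropping a parameter with an empty key is an assumption of the sketch, not something the MatrixURIs note spells out:

```javascript
// Hypothetical helper, not URI.js API: split one raw (still encoded) path
// segment into its name and its matrix parameters.
function parseMatrixSegment(rawSegment) {
  const [rawName, ...rawParams] = rawSegment.split(';');
  const matrix = {};
  for (const rawParam of rawParams) {
    const eq = rawParam.indexOf('=');
    // ";key" (no "=") carries no value and a trailing ";" is empty: skip both.
    if (eq === -1) continue;
    const key = decodeURIComponent(rawParam.slice(0, eq));
    if (key === '') continue; // assumption: ";=value" is dropped
    // decodeURIComponent leaves "+" alone, so only %20 becomes a space.
    // Keys are unique: a later declaration overwrites an earlier one.
    matrix[key] = decodeURIComponent(rawParam.slice(eq + 1));
  }
  return { segment: decodeURIComponent(rawName), matrix };
}

parseMatrixSegment('two;some=m%C3%B6re;data=;bla');
// { segment: 'two', matrix: { some: 'möre', data: '' } }
```

Sorting the keys is left out here because it only matters when the segment is serialized again.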

Supporting the MatrixURI notation has implications on the following existing methods

  • .normalizePath() to add normalizePathMatrix() to sort and unique-ify matrix params and remove trailing ;
  • .relativeTo() and .absoluteTo() because URI(';foo=bar').absoluteTo('//foo.com/one;key=val') should yield '//foo.com/one;key=val;foo=bar'
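As a rough illustration of the normalization step, here is a sketch for one raw segment. The function name normalizeMatrixSegment is hypothetical, and the sketch compares keys in their encoded form, so it would treat k and %6B as different keys:

```javascript
// Hypothetical sketch of what a normalizePathMatrix() step could do for a
// single raw segment; name and behaviour are assumptions, not URI.js API.
function normalizeMatrixSegment(rawSegment) {
  const [name, ...rawParams] = rawSegment.split(';');
  const params = new Map();
  for (const rawParam of rawParams) {
    const eq = rawParam.indexOf('=');
    if (eq <= 0) continue; // drops ";key", ";=value" and the empty trailing ";"
    // Map#set overwrites, so the last declaration of a key wins.
    params.set(rawParam.slice(0, eq), rawParam.slice(eq + 1));
  }
  const sortedKeys = [...params.keys()].sort();
  return [name, ...sortedKeys.map((key) => `${key}=${params.get(key)}`)].join(';');
}

normalizeMatrixSegment('two;some=b;data=;bla;some=a;');
// 'two;data=;some=a'
```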

Supporting the MatrixURI notation calls for additional methods

  • .matrix() analogous to .segmentCoded() for access to individual components and the entire path
URI('//foo.com/one;key=val/two;some=m%C3%B6re;data=;bla/thr%3Fe').matrix(true);
// would return the following construct:
[
  {
    segment: 'one',
    matrix: {
      key: 'val',
    },
  },
  {
    segment: 'two',
    matrix: {
      data: '',
      // note: decoded value
      some: 'möre',
    },
  },
  {
    // note: decoded value
    segment: 'thr?e',
    matrix: {},
  },
];

@cowwoc
Author

cowwoc commented Nov 9, 2014

@rodneyrehm

Great analysis. This brings up some questions:

  1. Since keys are unique, I guess it's up to the user to decide how to pack multiple values per key-value pair?
  2. You said that ;key= does not convey a null value. What does it convey then? :)

Also, you might want to post a new answer to http://stackoverflow.com/q/401981/14731 with some of these points.

@rodneyrehm
Member

I might get back to the SO question later, for now let's first figure out what exactly is going on and is expected to happen.

Interesting fact: MatrixURIs do seem to be used in various places:

  • Ruby on Rails uses them to denote page state like some/resource;edit
  • Piwik "supports" them
  • AngularJS seems to support them
  • RFC 3986 Section 3.3 explains the concept without calling it MatrixURI in the last paragraph

Some more resources to (re)visit later:


Since keys are unique, I guess it's up to the user to decide how to pack multiple values per key-value pair?

After looking around for a bit, it seems that the comma (,, also in sub-delims) was intended for just that. Something similar has not been defined for x-www-form-urlencoded because it does not mandate keys being unique, so multiple values could be provided simply by repeating the key over and over again.

You said that ;key= does not convey a null value. What does it convey then? :)

It conveys the empty string, which is to be interpreted as "key exists without value".


Adding the comma to my example above, we end up with:

URI('//foo.com/one;key=val/two;some=m%C3%B6re;data=;list=1,2,3;bla/thr%3Fe').matrix(true);
// would return the following construct:
[
  {
    segment: 'one',
    matrix: {
      key: 'val',
    },
  },
  {
    segment: 'two',
    matrix: {
      data: '',
      // note: "," delimits multiple values within a key-value component
      list: ['1', '2', '3'],
      // note: decoded value
      some: 'möre',
      // note: key 'bla' is removed because it was null
    },
  },
  {
    // note: decoded value
    segment: 'thr?e',
    matrix: {},
  },
];
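A sketch of a function that would produce the construct above. It is a stand-in for the proposed .matrix(true), which does not exist in URI.js; the name parseMatrixPath is hypothetical, and the sketch takes only the path part of the URL and ignores empty segments:

```javascript
// Hypothetical stand-in for the proposed .matrix(true); not URI.js API.
function parseMatrixPath(rawPath) {
  return rawPath
    .split('/')
    .filter((rawSegment) => rawSegment !== '') // skip the empty leading segment
    .map((rawSegment) => {
      const [rawName, ...rawParams] = rawSegment.split(';');
      const found = new Map();
      for (const rawParam of rawParams) {
        const eq = rawParam.indexOf('=');
        if (eq <= 0) continue; // ";bla" has no value and is dropped
        // Split on the raw "," before decoding so an encoded %2C stays
        // inside a single value.
        const values = rawParam
          .slice(eq + 1)
          .split(',')
          .map((value) => decodeURIComponent(value));
        const key = decodeURIComponent(rawParam.slice(0, eq));
        found.set(key, values.length > 1 ? values : values[0]);
      }
      // By convention, keys are emitted in alphabetical order.
      const matrix = {};
      for (const key of [...found.keys()].sort()) matrix[key] = found.get(key);
      return { segment: decodeURIComponent(rawName), matrix };
    });
}

parseMatrixPath('/one;key=val/two;some=m%C3%B6re;data=;list=1,2,3;bla/thr%3Fe');
```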

@cowwoc
Author

cowwoc commented Nov 10, 2014

You said that ;key= does not convey a null value. What does it convey then? :)

It conveys the empty string, which is to be interpreted as "key exists without value".

If that's the case, then I question your earlier statement:

;key has to be removed from the segment during normalization

It seems to me that this contradicts:

there must be a syntax for removing an attribute, hopefully distinguishing a removed attribute from one whose value if the empty string

found at http://www.w3.org/DesignIssues/MatrixURIs.html. Where did you read that keys with empty values should be removed during normalization?

@rodneyrehm
Member

See the section on resolving relative URLs at the end of that document.

@cowwoc
Author

cowwoc commented Nov 10, 2014

@rodneyrehm Okay, I see now that a relative path of ";roads" causes the key to be removed whereas ";roads=" does not. I thought the two forms were equivalent, but they are not.
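The difference between the two forms can be sketched as follows. The helper resolveMatrixReference is hypothetical (not URI.js API); it assumes the relative reference starts with ";" and applies only to the last segment of the base path, and it keeps the base's parameter order instead of sorting:

```javascript
// Hypothetical helper: apply a relative reference such as ";foo=bar" to one
// base segment. Neither the name nor the behaviour is URI.js API.
function resolveMatrixReference(baseSegment, relative) {
  const [name, ...baseParams] = baseSegment.split(';');
  const params = new Map();
  for (const rawParam of baseParams) {
    const eq = rawParam.indexOf('=');
    if (eq > 0) params.set(rawParam.slice(0, eq), rawParam.slice(eq + 1));
  }
  // The relative reference starts with ";", so the first piece is empty.
  for (const rawParam of relative.split(';').slice(1)) {
    if (rawParam === '') continue;
    const eq = rawParam.indexOf('=');
    if (eq === -1) {
      params.delete(rawParam); // ";roads" removes the key
    } else if (eq > 0) {
      params.set(rawParam.slice(0, eq), rawParam.slice(eq + 1)); // ";roads=" keeps it, with an empty value
    }
  }
  const pairs = [...params].map(([key, value]) => `${key}=${value}`);
  return [name, ...pairs].join(';');
}

resolveMatrixReference('one;key=val', ';foo=bar'); // 'one;key=val;foo=bar'
resolveMatrixReference('map;lat=50;roads=main', ';roads'); // 'map;lat=50'
resolveMatrixReference('map;lat=50;roads=main', ';roads='); // 'map;lat=50;roads='
```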

@dario-liberman-jav

What is the status of this feature, guys?
I'm quite eager to use Matrix Parameters in my URIs, and your library would be a perfect fit if it supported them :)

@rodneyrehm
Member

The status is: we have figured out what "Matrix Parameters" are supposed to be. There is no clear specification, only a thought document.

As far as I know, nobody has started implementing anything yet.

@dario-liberman-jav

Could the matrix function take a few optional configuration options for the contrived edge cases, so that the user decides whether, for example, the last parameter wins or repeated keys are treated as a list, the same way query parameters are? There could be similar options for the other edge cases; there can't be that many, can there?

@rodneyrehm
Member

The problem with that is that two users would interpret the same string as different resources, something that we already see happening with query strings. You want to avoid these situations like the plague. (That said, I'd go with the proposed comma and be done with it.)

You're welcome to try implementing this. I'll not get around to it for quite some time.

@Ladyrainy90

What is the website of URL Matrix? Someone recommended it to me so that I can check my site http://vuanhhospital.com.vn/detal/kham-suc-khoe-tong-quat-svvmmqpnxv , but I cannot find the site. :(

@tom10271

Angular 2 uses Matrix URL notation.

https://angular.io/docs/ts/latest/guide/router.html

Angular is very popular, and Angular 2 is sure to be another game changer. It would be really great if URI.js could support Matrix notation.

@doriantaylor

doriantaylor commented May 27, 2016

As the author of one of the documents referenced by @rodneyrehm I figured I'd weigh in. (Thanks @Laurian for alerting me.)

It is true, per @annevk that path parameters (aka matrix parameters) look like something thought up by timbl but never really put into practice. We probably have the early-90s work on CGI and cgi-lib.pl (and its descendants) to thank for that. Consider:

  1. Before there was a key-value QUERY_STRING environment variable, the query parameters were the argv of the script. This behaviour still exists (e.g. ?one+two+three generates ["one", "two", "three"].)
  2. Query strings have been amenable to manipulation by HTML forms (with method="GET") since forever.
  3. There is a clean mapping between argv and QUERY_STRING; not so for path parameters, because there's one list of them for every path segment. (Moreover the / delimiter supersedes sub-delimiters, so you can't have a literal / in a path parameter.)
  4. So what does it mean for, e.g., the second-to-last path segment to have parameters?
  5. Moreover, what does it mean to have a system of parameters in general which is completely orthogonal to the system of query parameters?

(Aside: For the longest time, there was no effective distinction, from the point of view of the CGI-and-its-descendants API, between parameters which came in from the URI, and parameters which came in from the request body. I am happy to see that many frameworks have finally teased these back apart.)

URI query parameters are historically meant to manipulate a resource's response. We imagine the resource to be a function (hopefully with no side effects under GET) and the parameters tell the function how to produce representations. This way it's possible (in theory, less in practice) to conceive of a 1:1 mapping of the domain (the set of query strings) to the range (the set of, e.g., byte segments) of the resource.

So that position is filled. What role should path parameters play, then? There's a set for each path segment, they aren't manipulable by ordinary HTML, and there is already a perfectly good infrastructure around query strings.

In my own work, I've used path parameters to represent successive operations over resource representations. So I already have a resource which generates some representation by whatever indiscriminate means (could be a file, script, framework, etc.), and the path parameters represent functions which take these representations as input, along with a list of arguments (unlike query parameters which represent lists of arguments by repeating the parameter key). In other words, I used the fact that these parameters are relatively untouched by standard or convention and the fact that there's already a perfectly good parametrization mechanism in query strings to come up with a completely orthogonal semantics and processing model.

Here is an example of a picture being turned into a black and white avatar:

/a-picture;crop=100,100,400,400;scale=32;desaturate

We can imagine the picture's pixel data passing through the crop function with the given parameters, say x1 y1 x2 y2 (defined elsewhere), then scale with w (an optional h being omitted because the scaling is square), and finally desaturate, which takes no additional arguments. Not only does the URI path segment indicate that the resource is a derivation, but it also tells the story of what the derivation is. Furthermore, you would be able to clip off the individual parameters at ; and see the penultimate state of the resource's representation, in a manner analogous to clipping off path segments at the / to see the "directory" under the given resource.
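Read this way, a segment is an ordered list of operations rather than an unordered set of unique keys. A minimal sketch of that reading follows; the function name parseOperations and the shape of the result are hypothetical, chosen only for illustration:

```javascript
// Hypothetical reading of one raw segment as a resource followed by an
// ordered pipeline of operations, each with a list of arguments.
function parseOperations(rawSegment) {
  const [rawResource, ...rawSteps] = rawSegment.split(';');
  const steps = rawSteps
    .filter((rawStep) => rawStep !== '') // ignore a trailing ";"
    .map((rawStep) => {
      const eq = rawStep.indexOf('=');
      // A bare name such as "desaturate" is an operation without arguments.
      if (eq === -1) return { name: decodeURIComponent(rawStep), args: [] };
      return {
        name: decodeURIComponent(rawStep.slice(0, eq)),
        args: rawStep
          .slice(eq + 1)
          .split(',')
          .map((arg) => decodeURIComponent(arg)),
      };
    });
  return { resource: decodeURIComponent(rawResource), steps };
}

parseOperations('a-picture;crop=100,100,400,400;scale=32;desaturate');
// {
//   resource: 'a-picture',
//   steps: [
//     { name: 'crop', args: ['100', '100', '400', '400'] },
//     { name: 'scale', args: ['32'] },
//     { name: 'desaturate', args: [] },
//   ],
// }
```

Clipping off the last parameter then corresponds to dropping the last entry of steps.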

Anyway, that was the best I could come up with for path parameters.

@Laurian

Laurian commented May 27, 2016

Similarly to @doriantaylor, I'm looking into matrix params as a sequence of operations in two specific cases:

  1. composition of Media Fragment URIs into a playlist: /base/video1.mp4;t=0,10/video2.mp4;t=23,56, since the query string format would limit me to one fragment per URL http://www.w3.org/TR/media-frags/#standardisation-URI-queries
  2. parameters for a Capability URL segment which would enable some operations on the resource past that segment: /edit;key=ab;ttl=100;hmac=cafebabe/folder/file.extension, similar to the Tahoe-LAFS example from http://w3ctag.github.io/capability-urls/#tahoe-lafs

@rodneyrehm
Member

As far as I understand Parsing matrix URLs, @doriantaylor's image manipulation example is in conflict with at least two of three implications cited:

/a-picture;crop=100,100,400,400;scale=32;desaturate
  1. attributes can only occur once - not sure
  2. there must be a syntax for removing an attribute - violated by ;desaturate
  3. attributes are unordered - I'm pretty sure ;crop=…;scale=… generates a different image than ;scale=…;crop=…

Correct me if I'm wrong, but this seems to be using the syntax without the semantics. Which is fine, I'm not judging. It does, however, point out how different interpretations make this a tough topic.


Regardless of Matrix URLs, I have problems reconciling @Laurian's playlist example (/base/video1.mp4;t=0,10/video2.mp4;t=23,56) with the hierarchical nature of the path. Technically this is not an illegal use of the path segment, but it does feel wrong. (I'd probably have preferred : as the file delimiter, mostly because it's the delimiter in Linux, e.g. PATH=some-path:$PATH)

@doriantaylor

doriantaylor commented May 27, 2016

@rodneyrehm indeed if I saw that document (I can't remember), it would have been over eight years ago when I wrote mine, and I would have disagreed with it just as much then. ;)

When I reread timbl's Matrix URIs document, I'm basically seeing a description of query string parameters as they already are, with the exception of constraints 1 and 2.

Constraint 1, that an attribute ought only occur once, actually contradicts TBL himself when he wrote that URIs should only ever need to be compared lexically (source forthcoming; see also Fielding). Constraint 3, as well as query parameters, which can vary as n! of the number of keys (to say nothing of semantically-equivalent lexical perturbations in the values), also already violate this other constraint.

Constraint 2, that there must be syntax for removing an attribute, is presumably to be analogous to ../ in relative path URIs. My comment there is: good luck retrofitting that into $EVERYTHING. You can already do relative URIs with query strings; you just have to supply the entire query string, à la href="?foo=bar". In other words, it isn't clear to me what is gained by the ability to prune a path parameter through a relative URI, especially considering what it would cost to realize it.

When I did my original path-parameter work, my motivation stemmed from the fact that I had a bunch of resources that were related in ways that were both purely deterministic, and parametrized, so I was initially looking for a reasonable, less ad-hoc way to generate an addressing scheme. I later came up with the idea that they could represent functions over representations (including things like @Laurian's time slicing). The other cool thing would be that you could develop these filter functions separately from the application.

A side note on parameter sequence: If you really wanted to, you could treat the sequence of query parameters as significant—it's just an (almost universally-held) implementation convention that they aren't. Indeed web browsers append query parameters to form target URIs in the order they appear in the document. (Of course the HTML specs have admonished us for aeons not to bank on that.)

So, I may have read that document 8-9 years ago, I may not have. (I did read several TBL dispatches around then). It's definitely interesting, but in the 20 years since it was slapdashed out, a lot of code has been written.

If the world survived for two decades without matrix parameters as described in that document, then there probably wasn't a serious need for them—as described. The syntax, however, has been sitting patiently all this time waiting for a useful semantics. But then, I suppose to paraphrase Andreessen, "rough consensus and running code wins".

@Laurian

Laurian commented May 30, 2016

Anyway, for my immediate purposes I will play with URI.parseQuery on uri.segment() and I will make a transcoding operation from matrix params to query string format. I will report back.
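A rough sketch of what such a transcoding could look like. The helper matrixToQuery is hypothetical, and the sketch uses the standard URLSearchParams as a stand-in for URI.parseQuery so that it runs without URI.js:

```javascript
// Hypothetical helper: rewrite the matrix part of one raw segment as a query
// string so that an ordinary query parser can read it.
function matrixToQuery(rawSegment) {
  const [name, ...rawParams] = rawSegment.split(';');
  const query = rawParams
    .filter((rawParam) => rawParam !== '')
    // "&" and "+" are legal literal characters in a path segment but have a
    // special meaning in a query string, so escape them before joining.
    .map((rawParam) => rawParam.replace(/&/g, '%26').replace(/\+/g, '%2B'))
    .join('&');
  return { name, query };
}

const clip = matrixToQuery('video1.mp4;t=0,10');
// clip.name  is 'video1.mp4'
// clip.query is 't=0,10'
const range = new URLSearchParams(clip.query).get('t').split(',');
// range is ['0', '10']
```

Note that a query parser keeps repeated keys, so the "keys are unique" rule would still have to be enforced separately.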

@duaraghav8

Hi guys,
I recently had to use matrix URIs at my internship. I found this particular thread extremely informative. Seeing that there wasn't much support out there for Matrix URIs in Node, I've developed an express-compliant npm module (matrix-parser) for this purpose.

It's fully functional (although not optimized, but I'm working on it) and I've tried my best to implement the rules discussed by @rodneyrehm and @cowwoc. Since I'm not so experienced with JS, I'd appreciate it if you could have a look at the module and give me feedback.

I've written lots of tests to cover edge cases; you can read their descriptions to understand which rules I'm following.

If you think that this is something URI.js could use, please let me know. I'd like to make a contribution to this repo as well :)

Here's the repo

@onacit

onacit commented Jul 17, 2017

One of my colleagues is trying to pass an ISO timestamp value to the server that I developed.

For paths like the following:

.../someThings;min=yyyy-MM-ddTHH:mm:ssZ/someOtherThings

the colons of the time part, HH:mm:ss, are being escaped.

Can anybody please help me to help him?
