Url struct layout makes processing URIs in servers difficult #12

mikedilger · 2014-08-20T23:36:14Z

Coming from a URI perspective (RFC 3986) and a webserver perspective, there are strings which contain (path,query,fragment) but omit (scheme,username,password,host,port). These are relative to a web server. I'm working on an OAuth2 library and a REST API library, both of which are server side or have server-side components, and work with such strings.

The way the URL struct is broken down (path being way down in RelativeSchemeData) makes it difficult to parse/represent (path,query,fragment) type strings. I'm having brief moments of insanity where I'm on the verge of writing another, differently structured URI Rust library to address this. Could rust-url address my concerns... or is the data structure layout pretty much settled?
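
To make the (path,query,fragment) shape concrete, here is a minimal sketch using only the Rust standard library. The `RelativeRef` struct and the `split_relative_ref` function are hypothetical names made up for illustration; they are not part of rust-url, and the sketch assumes the RFC 3986 delimiters only (first `#` starts the fragment, first `?` before it starts the query), with no percent-decoding or validation.

```rust
/// Hypothetical type for illustration only; not part of rust-url.
/// Holds the three components of a server-relative reference.
#[derive(Debug, PartialEq)]
struct RelativeRef<'a> {
    path: &'a str,
    query: Option<&'a str>,
    fragment: Option<&'a str>,
}

/// Split a string such as "/a/b?x=1#top" into path, query and fragment.
/// Assumption: the input has no scheme or authority part.
/// No percent-decoding or validation is attempted.
fn split_relative_ref(input: &str) -> RelativeRef<'_> {
    // The fragment starts at the first '#'; a '?' after it belongs to the fragment.
    let (before_fragment, fragment) = match input.split_once('#') {
        Some((head, frag)) => (head, Some(frag)),
        None => (input, None),
    };
    // In what remains, the query starts at the first '?'.
    let (path, query) = match before_fragment.split_once('?') {
        Some((path, query)) => (path, Some(query)),
        None => (before_fragment, None),
    };
    RelativeRef { path, query, fragment }
}
```

Calling `split_relative_ref("/a/b?x=1#top")` yields a path of `/a/b`, a query of `x=1` and a fragment of `top`. Keeping the query and fragment as `Option` distinguishes an absent component from an empty one, which a flat struct of strings would not.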

As an aside, what is the status/acceptance of http://url.spec.whatwg.org/ ? Is it really the standard that the majority of software providers aspire to for URLs? And whither went URIs? It seems a drastic break from the time when mention of URLs nearly disappeared from IETF standards, as this one almost fails to mention URIs.

SimonSapin · 2014-08-21T12:56:55Z

Regarding rust-url: nothing is set in stone. Quite the opposite, it’s very much a work in progress. I’m aware that some use cases are not covered yet, and interested in what kind of data structure and API you think would help. I’d definitely ask that we try to figure something out together before you go off and start a competing library.

Regarding the URL standard: "acceptance" pretty much depends on who you ask. It’s designed to reflect how browsers actually work, and to properly define error handling (where RFCs can be quite hand-wavy).

Regarding URL vs. URI: first, is it useful to have two distinct concepts? In practice, the difference doesn’t really matter to software. If that distinction is removed, we need to pick a single term. URI might be "more correct", but in practice everybody outside of IETF uses URL. See what happened with TLS: it’s been around for 15 years, but we still talk about SSL, which only existed for 4 years before that.

mikedilger · 2014-08-21T13:39:55Z

Thanks. Ignore my URI/URL madness. I'll make a more concrete suggestion (hopefully pull req) in the coming days.

SimonSapin · 2014-08-21T14:42:01Z

Related: #2

mikedilger · 2014-08-21T15:26:17Z

After reading more than half the code and re-reading both standards, I've recognized why URL parsing needs to be this way, grouping the scheme-relative URL rather than putting Authority and Path in the top struct: an absolute URL needs at least a host in the authority section, and other schemes define the whole scheme-relative URL. So I'll probably just use it as-is, parsing with a dummy base URL and leaving the unused fields blank. Closing.

mikedilger closed this as completed Aug 21, 2014
