Please document migration path for "url" crate #73

Closed
joshtriplett opened this Issue Jul 31, 2017 · 17 comments

@joshtriplett

joshtriplett commented Jul 31, 2017

Is http::Uri intended as a complete replacement for the url crate? If so, it might make sense to document that, and perhaps work with its author to deprecate it and point users towards the http crate. And, for that matter, I'd suggest adding "uri" and "url" as tags, and mentioning URIs and URLs in the crate description (along with requests and responses), so that it's really obvious this is the crate to use. (And hopefully it'll start showing up prominently in the search results for "uri" and "url".)

@theduke

theduke commented Jul 31, 2017

I'd also be interested in why the url crate is not used (too complex?) and what the intended scope for the type is, as in, is it a conscious decision not to have any modifiers on Uri and will it stay this way, etc.

@seanmonstar

Member

seanmonstar commented Jul 31, 2017

(For context, the issue originally mentioned the uri crate, not url.)

I had never heard of that crate before just now. I personally feel it's fine to continue to assume it doesn't exist. Crates with all sorts of names can show up. Any type that a user must use to interact with the http crate will be in the docs. Other types can be assumed to be unrelated. :)

@seanmonstar

Member

seanmonstar commented Jul 31, 2017

I'd also be interested in why the url crate is not used

So, this issue mentioned uri, with an i instead of L, but I'll answer. The definition of URL is too strict for all use cases of HTTP. All URLs are valid URIs, but the reverse is not true. For instance: https://hyper.rs/guides is valid as both. /guides is a valid URI, invalid URL. Same with *. Same with hyper.rs.
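As a rough illustration of the four shapes listed above, here is a std-only sketch that classifies a request target by shape alone. The `TargetForm` enum and `classify` function are hypothetical names for this example, not part of the http or url crates, and the check does not validate characters.

```rust
/// The four request-target shapes mentioned in the comment above.
#[derive(Debug, PartialEq)]
enum TargetForm {
    Absolute,  // e.g. https://hyper.rs/guides (also a valid URL)
    Origin,    // e.g. /guides
    Authority, // e.g. hyper.rs
    Asterisk,  // exactly "*"
}

/// Classifies by shape only; a real parser would also validate each part.
fn classify(target: &str) -> Option<TargetForm> {
    if target.is_empty() {
        None
    } else if target == "*" {
        Some(TargetForm::Asterisk)
    } else if target.starts_with('/') {
        // Checked before "://" so a query such as /a?next=http://x stays Origin.
        Some(TargetForm::Origin)
    } else if target.contains("://") {
        Some(TargetForm::Absolute)
    } else {
        Some(TargetForm::Authority)
    }
}
```

Only the `Absolute` case would also be accepted by a parser that requires an absolute URL.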

not to have any modifiers on Uri and will it stay this way, etc.

We started conservatively. It's more complicated to get setters to work correctly, especially since in most libraries, setters just allow passing any arbitrary string (such as uri.set_host("https://woops")). Instead, for now, we've opted to have manipulation done elsewhere. Whenever you think you have a valid Uri, you can parse it as one.
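To make the setter problem concrete, here is a minimal sketch of what a validating setter might look like. `UriParts`, `InvalidHost`, and `set_host` are hypothetical and do not exist in the http crate; the accepted character set is deliberately conservative and would reject legitimate inputs such as IPv6 literals or a host with a port.

```rust
/// Hypothetical stand-in for a URI builder; not a type from the http crate.
#[derive(Debug, Default)]
struct UriParts {
    host: Option<String>,
}

/// Error returned when the candidate host fails validation.
#[derive(Debug, PartialEq)]
struct InvalidHost;

impl UriParts {
    /// Accepts only ASCII letters, digits, '-' and '.', so passing a
    /// whole URL such as "https://woops" is rejected instead of stored.
    fn set_host(&mut self, host: &str) -> Result<(), InvalidHost> {
        let ok = !host.is_empty()
            && host
                .bytes()
                .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'.');
        if ok {
            self.host = Some(host.to_owned());
            Ok(())
        } else {
            Err(InvalidHost)
        }
    }
}
```

Every component would need its own rule of this kind, which is part of why leaving manipulation elsewhere and re-parsing the result is the simpler starting point.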

@theduke

theduke commented Jul 31, 2017

So, this issue mentioned uri, with an i instead of L

I noticed, but I thought I'd just piggy back here.

Otherwise, thanks for the answer, makes sense to me.

@joshtriplett

joshtriplett commented Jul 31, 2017

Er, oops. Yes, I mean the url crate. I should have noticed the rather low usage count and lack of dependent crates. Retitling and editing the issue, for the case I actually care about.

@joshtriplett joshtriplett changed the title from Please document migration path for "uri" crate to Please document migration path for "url" crate Jul 31, 2017

@seanmonstar

Member

seanmonstar commented Aug 1, 2017

I've commented about the url crate in this comment. I don't believe this crate is meant to replace it. The url crate does much more parsing, and is meant to be used in Firefox (slightly stalled, but eventually).

My opinion is to just ignore that it exists. A user can turn any Url into a string, and http::Uri can parse strings.

@withoutboats

Collaborator

withoutboats commented Aug 1, 2017

A large number of crates already depend on Url, though, so we could possibly provide conversions (fallible in one direction). Something like the automatic features RFC would make this more appealing. I don't think this is a 0.1 blocker, though.

@domenic

domenic commented Aug 1, 2017

Given that several open issues are about URL handling, I'd suggest delegating to the url crate, given that it's based on the modern battle-tested standard interoperable with other ecosystems (such as browsers, or HTTP servers that need to accept URLs from browsers.)

As for (U|I)R(|I|L|N), see https://url.spec.whatwg.org/#goals for why we've moved away from the URI term.

It would be unfortunate if a popular HTTP library for Rust goes in a different direction than Servo (or other web browsers) for URL handling.

@carllerche

Collaborator

carllerche commented Aug 1, 2017

The url crate and the Uri type provided by this crate handle different scenarios.

  • The url crate explicitly only handles absolute URLs
  • http::Uri does not assume that the URL is stored as a string representation in contiguous memory.
  • http::Uri is able to accept Bytes as input, allowing for zero-copy construction of values.

The url crate is already at a 1.0 version and handling these issues would definitely be breaking. For example, http::Uri does not provide an as_str() fn.

That said, I think it would be plausible to provide conversions between http::Uri and the url crate.

@jimmycuadra

jimmycuadra commented Aug 1, 2017

Previous discussions on the hyper issue tracker regarding ergonomics and confusion regarding Uri and Url:

@pyfisch

Contributor

pyfisch commented Aug 5, 2017

The Uri type has different variants. In HTTP/1.x it will often only contain the path but in HTTP/2 most often the complete request Uri. In many cases clients and middleware will want to manipulate requests based on the request Uri. What is the recommended way to do these things?

  • On a server: only set a response header if the request was sent with HTTPS? For HTTP/2 the scheme is part of the Uri, but in HTTP/1.x it is not part of the header.
  • Do routing with multiple hosts: Does every middleware need to implement its own logic or use some third-party crate to get the host? Both the uri.authority() and the Host header may provide this information.
  • On a client: If the same function handles HTTP/1.x and HTTP/2, must all requests use the correct Uri/Host combination for the version, or is the client expected to convert from a Uri with :scheme, :authority and :path to a Uri with an absolute path and a Host header, and vice versa?

@carllerche

Collaborator

carllerche commented Oct 10, 2017

Yes, I wonder if server implementations should provide some sort of Uri normalization. Specifically, with HTTP/1.1, using the Host header as the authority component. This would mirror how most HTTP client lib APIs look (you make a request to a URL, not a path + host header).
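One way such normalization could be sketched, using only std types: prefer the authority carried in the request target and fall back to the Host header otherwise. The function name `effective_authority` and its signature are assumptions for illustration; no such helper exists in the http crate.

```rust
/// Picks the authority a server might treat as authoritative for a request.
/// `uri_authority` is what the request target carried (if anything);
/// `host_header` is the value of the Host header (if present).
fn effective_authority<'a>(
    uri_authority: Option<&'a str>,
    host_header: Option<&'a str>,
) -> Option<&'a str> {
    uri_authority
        // An empty authority is treated the same as a missing one.
        .filter(|a| !a.is_empty())
        // Fall back to the trimmed Host header value, if it is non-empty.
        .or_else(|| host_header.map(str::trim).filter(|h| !h.is_empty()))
}
```

With a helper like this, middleware that routes by host would not need to care whether the request arrived with an origin-form target plus Host header or with a full authority in the target.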

@softprops

softprops commented Jan 9, 2018

Sorry if this was brought up before, but something I'd love to see is something like url’s query_pairs, which gives you an iterator over the decoded set of query params. Also something like the form url encoded serializer type for appending params. I suggest these as a convenience but also for ergonomics. It's very awkward to have to pull in another crate and wire two crate APIs together to exercise a common action: parameterizing an http request.
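For reference, the kind of iterator being asked for can be approximated with std alone. This is a minimal sketch, not the url crate's implementation: `decode_component` and this `query_pairs` are hypothetical free functions, invalid UTF-8 is replaced lossily, and malformed percent escapes are passed through unchanged.

```rust
/// Converts one ASCII hex digit to its value, if it is one.
fn hex_val(b: u8) -> Option<u8> {
    (b as char).to_digit(16).map(|d| d as u8)
}

/// Decodes one form-urlencoded component:
/// '+' becomes a space and %XX becomes the byte 0xXX.
fn decode_component(input: &str) -> String {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        let b = bytes[i];
        if b == b'%' && i + 2 < bytes.len() {
            if let (Some(hi), Some(lo)) = (hex_val(bytes[i + 1]), hex_val(bytes[i + 2])) {
                out.push(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        out.push(if b == b'+' { b' ' } else { b });
        i += 1;
    }
    String::from_utf8_lossy(&out).into_owned()
}

/// Iterates over decoded (key, value) pairs of a query string
/// (the part after '?', without the '?' itself).
fn query_pairs(query: &str) -> impl Iterator<Item = (String, String)> + '_ {
    query.split('&').filter(|part| !part.is_empty()).map(|part| {
        // A key without '=' gets an empty value.
        let (key, value) = part.split_once('=').unwrap_or((part, ""));
        (decode_component(key), decode_component(value))
    })
}
```

A caller would pass the query portion of a Uri (for example, the result of `uri.query()` when it is present) and collect or iterate the pairs.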

@mitsuhiko

mitsuhiko commented Mar 6, 2018

I also want to chime in on this. It's pretty clear that the entire world is moving away from URI/IRI as terms to just URL. With the url crate we have a pretty damn good (new) standard-compliant crate that is very popular. Right now I use this everywhere, and then when doing requests to hyper I need to convert them, which is not great ergonomics- or performance-wise. I understand there are cases the url crate does not cover, but maybe at least transparent conversion from url would be possible in cases where we are dealing with an actual url?

@aturon

Collaborator

aturon commented Sep 19, 2018

I'm astonished that no-one on this thread has reached out to the url crate authors. @SimonSapin, can you weigh in on the discussion here?

@SimonSapin

SimonSapin commented Sep 19, 2018

As far as I understand it’s not a replacement, they’re for different things.

As already mentioned, url::Url represents an absolute URL record. Parsing a relative URL string such as "../foo.html" requires providing a base URL to resolve against. The result always has a URL scheme/protocol, etc.

In my understanding, http::Uri is something different. It is really the target of an HTTP request. This can be an absolute URL (compatible with url::Url) but the more typical case is a host-relative URL like /foo.html (starting with a slash). There are other variants, like the single asterisk * for some types of requests, that only make sense in the context of HTTP. Perhaps it should be named something like RequestTarget instead.

I expect that a high-level HTTP client with a make_get_request(url: &url::Url) -> Something API would not convert the entire url to an absolute http::Uri but instead extract the path and query components with something like &url[url::Position::BeforePath..url::Position::AfterQuery] (after looking at other components to extract a SocketAddr and a protocol).
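The extraction described above can be approximated without either crate. The following std-only sketch takes an absolute URL string and returns roughly what the `url::Position::BeforePath..url::Position::AfterQuery` slice would cover; `request_target` is a hypothetical helper, and it does no validation beyond looking for `://`.

```rust
/// Returns the path-and-query part of an absolute URL string, i.e. what a
/// client would typically send as the HTTP/1.x request target.
fn request_target(absolute_url: &str) -> Option<String> {
    // Everything after "scheme://" is authority + path + query + fragment.
    let (_scheme, rest) = absolute_url.split_once("://")?;
    // The fragment is never sent to the server.
    let rest = rest.split('#').next().unwrap_or(rest);
    // The authority ends at the first '/' or '?'.
    match rest.find(|c: char| c == '/' || c == '?') {
        Some(i) if rest[i..].starts_with('/') => Some(rest[i..].to_owned()),
        // A query with no path: supply the implied root path.
        Some(i) => Some(format!("/{}", &rest[i..])),
        // No path and no query at all.
        None => Some("/".to_owned()),
    }
}
```

The authority that was skipped over here is what a client would use separately to resolve a socket address and to fill in the Host header.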

@carllerche

Collaborator

carllerche commented Sep 19, 2018

@SimonSapin thanks for chiming in.

At this time, I do not believe that there are any steps to take regarding this issue. As such, I will close it. Feel free to explain why it should stay open and I can open it up again.

@carllerche carllerche closed this Sep 19, 2018
