Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(ffi): add hyper_request_set_uri_parts #2581

Merged

Conversation

divergentdave
Copy link
Contributor

Add a second FFI interface for setting the URI of a request with three separate schema, authority, and path/query strings, rather than one URI string.

I'd like to add this new API to simplify integration with programs like curl where the URL has already been parsed to some degree by the client. This new function, backed by Url::builder() will allow passing more components through as-is, rather than concatenating them into a URL string in C and then parsing it again in Rust.

Copy link
Member

@seanmonstar seanmonstar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool! Thanks for writing this up! A couple thoughts come to my mind:

  • Would it be likely that someone would use this function but expect to be able to pass NULL if only setting some of the parts (like authority-only, or path-only)? If so, from_raw_parts says the pointer has to be non-null. It just seems like a more likely case than calling hyper_request_set_uri(req, NULL, 0). We could either have a null check, or document that it is UB to pass NULL.
  • Not so much a change needed here, I think it's valuable to be able to set them all at once with 1 less allocation. But: we could also consider exposing a hyper_uri *, and then be able to set each part with a different function call. It'd allow someone to better know which part was bad... Like I said, I'm just thinking out loud here.

@divergentdave
Copy link
Contributor Author

It occurs to me that we could check each argument for null, and skip calling the corresponding method on the builder when given a null pointer. I'll take another look at the Url struct's validation criteria and see if that would make sense here.

@divergentdave
Copy link
Contributor Author

Uri::from_parts will allow four different combinations of the builder's internal state:

  • All three parts are present
  • Only an authority is given
  • Only path_and_query is given
  • All three parts are None

Based on this, I think it would make sense to allow each of the three string pointers to be null, and skip calling the relevant builder method when they are null.

@divergentdave
Copy link
Contributor Author

Okay, I rewrote this to add the null checks, and expanded the docstring with requirements on the arguments.

Add a second FFI interface for setting the URI of a request with three
separate schema, authority, and path/query strings, rather than one URI
string.
@divergentdave
Copy link
Contributor Author

Could I get another review of this PR? I would like to get this merged in so it can be used elsewhere. Thanks!

@bagder
Copy link
Contributor

bagder commented Sep 7, 2021

If I understand this proposed new API call, it will make a set of pseudo headers get sent in the request? It seems like a new behavior of the library's behalf as so far we've set all the headers "manually". Would this then mean that hyper would set all psuedo-headers itself?

@divergentdave
Copy link
Contributor Author

This new API is intended as an equivalent alternative to hyper_request_set_uri, which curl is currently using. The difference is that hyper_request_set_uri takes one string, and tries to parse it as a URI, while the new hyper_request_set_uri_parts takes three strings, so it doesn't have to parse apart the scheme, etc. Calling either will result in setting some or all pseudo-headers, depending on the arguments.

@bagder
Copy link
Contributor

bagder commented Sep 7, 2021

This really needs more and clear documentation. Which psuedo-headers are set when.

@seanmonstar
Copy link
Member

In hyper, the request's Uri is probably best described as the request-target from the RFCs. So in HTTP/1, whatever is set will get sent as-is in the first line (GET $uri HTTP/1.1). It supports the 4 defined variants, origin-form, absolute-form, authority-form, and asterisk-form.

The underlying type was built to efficiently support HTTP/2 where the request-target is split over :scheme, :authority, and :path. As such, each part can be set explicitly, or the type can parse a single contiguous string and if a scheme is found, that slot is "set". If the string just starts with a path, only the path portion is set. All pseudo headers that have been parsed/set are sent when the connection type is HTTP/2.


I believe this API suggestion is to allow setting each piece if you already have them separated, instead of needing to combine to a single string first, set in hyper, and then throw away that string.

@bagder
Copy link
Contributor

bagder commented Sep 9, 2021

That's a good beginning of documenting what these hyper functions do. The fact that hyper sets a bunch of headers in one case but not in the other is far from obvious to a user like me.

As I don't want to "accidentally" use the absolute form for with HTTP/1.1 (unless speaking to a proxy), this sounds like I have to set the absolute uri for for h2 (to get the scheme and authority set - and make sure not to include Host:), but for h1 I need to set the relative "URI" only and set the Host: header.

@seanmonstar
Copy link
Member

Good point, I'll add that to the docs of these methods in a follow-up today.

Thanks so much @divergentdave!

@seanmonstar seanmonstar merged commit a54689b into hyperium:master Sep 16, 2021
@divergentdave divergentdave deleted the hyper_request_set_uri_parts branch November 22, 2021 05:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants