Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we combine .host and .socketPath on HTTPClient.Request? #229

Open
dimitribouniol opened this issue May 28, 2020 · 11 comments
Open

Should we combine .host and .socketPath on HTTPClient.Request? #229

dimitribouniol opened this issue May 28, 2020 · 11 comments

Comments

@dimitribouniol
Copy link
Contributor

While refactoring the unix domain socket implementation in #228, I noticed that we could probably combine the host and (new) socketPath properties on HTTPClient.Request.

From what I could tell:

  • The socket path would end up in the Host header for the request (currently it is "")
  • TLS would use the socket path on certs as the common name/hostname (currently it is checking for "" as opposed to nil, which it's doing for IPs)
    • IPs are explicitly forbidden from the SNI spec. It says:

Literal IPv4 and IPv6 addresses are not permitted in "HostName".

    • The same spec does not forbid using socket paths however:

Currently, the only server names supported are DNS hostnames; however, this does not imply any dependency of TLS on DNS, and other name types may be added in the future (by an RFC that updates this document). TLS MAY treat provided server names as opaque data and pass the names and types to the application.

    • However, we should still check the common name in cert validation for matching hostnames
  • Responses would propagate the socket path as their host as well (currently, it's also set to "" for socket domain responses)
  • We would need better test coverage for these scenarios.
@Lukasa
Copy link
Collaborator

Lukasa commented May 28, 2020

I don't think trying to validate TLS certificates with the socket path in them is likely to work well: so far as I know, nobody does that. The same applies to SNI, given that again, socket paths are essentially never used there. It's possible that the right thing to do for those two cases is to allow an override, such that users can say what Host they want to treat the request as being for, irregardless of what the URL actually says.

curl does this with --resolve, which works nicely.

@Lukasa
Copy link
Collaborator

Lukasa commented May 28, 2020

Actually, in general I wonder if that's not the correct solution to all of this. Essentially, to say that there are some subset of hostnames that are "fake", and resolve to some other underlying socket address. Then the client will essentially map the request to that other socket address, but for things like SNI, cert verification, and other purposes we'll behave as though the host was the one provided directly to us.

In fact, this might be something we can do with apple/swift-nio#1487.

@dimitribouniol
Copy link
Contributor Author

I agree that being able to pass in a custom hostname in any scenario is probably the better way to go about this. If one is not provided though, should it fall back to the socket path? Is it ok to make a request with a Host that is just ""?

@Lukasa
Copy link
Collaborator

Lukasa commented May 29, 2020

Yeah, I don't think it's worth passing the socket path as Host. I don't think any server will ever expect it.

@dimitribouniol
Copy link
Contributor Author

If I were to add this to Configuration, I can do it in two ways:

  • Add a .hostnameOverride: String?, which would always override the hostname with the specified string in SNI and Host headers;
  • Optionally also add a .hostnameFallback: String?, which would only be used for SNI and Host headers if connecting to IP addresses or socket connections.

Note that it seems like this would apply towards the entire HTTPClient at the moment, since a configuration can't seem to be attached to Requests, but would be extra useful once they can.

A Swift-ier solution could be to have one .presentedHostname property on the Configuration, but then to have its value be an enum of one of three cases:

  • .urlHost: use the hostname from the URL
  • .urlHost(fallback host: String): use the fallback host when an IP address or socket path is used
  • .override(_ host: String): always use the override in every situation.

Do you have any thoughts regarding this?

@Lukasa
Copy link
Collaborator

Lukasa commented Jun 2, 2020

In general I’m inclined to say this probably wants to be a mapping of the hostname override from string hostname to string hostname. These string “hostnames” can then also match the hostnames in UDS socket paths and IP addresses. That allows a much more substantial override. Does that make sense?

@dimitribouniol
Copy link
Contributor Author

Ah, that makes sense. Let me draft out something along those lines.

@dimitribouniol
Copy link
Contributor Author

dimitribouniol commented Jun 3, 2020

Here is something I came up with: master...dimitribouniol:hostname-overrides (still missing tests)

Specifically, the HostnameRepresentationResolver here:
https://github.com/dimitribouniol/async-http-client/blob/ebdca3abed0934f4128fbd13386d6f2b1b483ac8/Sources/AsyncHTTPClient/HTTPClient.swift#L658-L742

Essentially, I add a new HostnameRepresentationResolver to the configuration that can be used to register representations for regular host names and socket paths, which will then be replaced accordingly in SNI and Host header situations.

Thoughts before I start writing tests against this?

@Lukasa
Copy link
Collaborator

Lukasa commented Jun 4, 2020

I think that approach looks reasonable to me. @weissi thoughts?

@weissi
Copy link
Contributor

weissi commented Jun 4, 2020

Yes, looks reasonable to me. A few questions:

  • I'm assuming we don't need to actually make the enum public and we can replace this by a struct with static funcs. enums can't be extended without breaking public API.
  • Do we strictly need that much new public API? I'd reduce it to the absolute minimum.
  • In general I find it's often quite limiting to do configuration on HTTPClient itself. Often, one HTTPClient instance is shared quite widely, for example in a whole Vapor webserver. I would expect the SNI overrides to be necessary per request and not just per client. See also redirect configuration is only available top-level on HTTPClient #196 .

@dimitribouniol
Copy link
Contributor Author

I'm assuming we don't need to actually make the enum public and we can replace this by a struct with static funcs. enums can't be extended without breaking public API.

I'm assuming you are referring to RepresentationKey? Yeah, this can be made private. The only place it's useful is for initializing the resolver with a mix of host and socket path overrides, but we can either:

  • A. Add an initializer to account for both a mapping for hosts and a mapping for socket paths in one go (or add a third option any(String) that is always checked).
  • B. Add a third case any(String) that is always checked, so initialization can be mixed.
  • C. Remove the namespacing entirely between hosts and socket paths.

Do we strictly need that much new public API? I'd reduce it to the absolute minimum.

Which methods are you referring to? I'm happy to remove/make private/change anything you think is too much 🙂

In general I find it's often quite limiting to do configuration on HTTPClient itself. Often, one HTTPClient instance is shared quite widely, for example in a whole Vapor webserver. I would expect the SNI overrides to be necessary per request and not just per client. See also #196.

I would also love to do this on a per Request level, I just figured this would be a good first step. Ideally, it could be nice to assign a Configuration when execute()-ing a Request, so the config is not tied to the request on creation by the user per se, but is attached to the copy of it that gets submitted. I don't know how this would mess up the existing connection pool logic though, which doesn't seem to be well set up for this approach quite yet (especially if there are different proxy server requirements and the like).

As always, I'm open to thoughts on the topic and happy to help out where I can 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants