Addressing HTTP servers over Unix domain sockets #577
Comments
It seems you don't need just addressing for this, but some kind of protocol as well. I recommend using https://wicg.io/ to see if there's interest to turn this into something more concrete. |
I'm not sure I understand why any additional protocol would be necessary. It's just HTTP over a stream socket. The server accepts connections and speaks HTTP just like it would for a TCP socket. Indeed, I can set up such a server today, and it works fine provided that the client provides a way to specify the socket, e.g., |
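As a rough illustration of the point above (plain HTTP spoken over a stream socket that happens to be AF_UNIX), here is a minimal Python sketch. The names `HelloHandler`, `UnixHTTPConnection`, `start_server` and `fetch` are made up for this example, and the client takes the socket path as a separate argument precisely because there is no agreed URL syntax for it.

```python
import http.client
import http.server
import os
import socket
import socketserver
import tempfile
import threading


class HelloHandler(http.server.BaseHTTPRequestHandler):
    """Answers every GET with a fixed plain-text body."""

    def do_GET(self):
        body = b"hello over a unix socket\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # The default logger expects a (host, port) peer address, which an
        # AF_UNIX peer does not have, so logging is disabled in this sketch.
        pass


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a socket path instead of host:port."""

    def __init__(self, socket_path, timeout=5.0):
        # "localhost" is only used to fill in the Host header.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def start_server(socket_path):
    """Bind an HTTP handler to a Unix domain socket and serve in a thread."""
    server = socketserver.UnixStreamServer(socket_path, HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(socket_path, resource="/"):
    """Issue a GET for `resource` over the socket at `socket_path`."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", resource)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.sock")
    server = start_server(path)
    try:
        print(fetch(path, "/resource"))
    finally:
        server.shutdown()
        server.server_close()
        os.unlink(path)
```

Nothing in the HTTP exchange itself changes; only the `connect` step differs from the TCP case.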
I don't even understand how this is not a thing yet. Especially now that Windows started supporting AF_UNIX sockets natively, it seems to be the best, cross-platform way to connect web and native apps without consuming a TCP port. |
Let me take a step back, what exactly is the ask from the URL Standard here? |
The ask is for the URL standard to specify a syntax for referring to a page served via HTTP over a UNIX domain socket. Currently, applications that want to support connecting to an HTTP service have to pick from one of the following three:
None of these are ideal. Deciding on a standardized URL syntax allows different implementations to implement the functionality in a common, standards-compliant way. |
I see, https://wicg.io/ is the place for that. The URL standard defines the generic syntax. If you want to define the syntax for a particular URL scheme as well as behavior, you would do that in something that builds upon the URL standard. E.g., https://fetch.spec.whatwg.org/#data-urls for |
Let me rephrase: the specific ask for the URL standard is to provide an allowance in the URL syntax for specifying a UNIX domain socket, either in lieu of the port (e.g., |
I recommend using something like |
It's the same protocol over a stream socket, just a different address (i.e. authority part). Ok, so it's a different protocol in the sense of IP, but so are IPPROTO_IP and IPPROTO_IPV6, and the URL standard doesn't treat those as different. The relevant comparison, I think, is between address families for stream sockets, like AF_INET, AF_INET6 and AF_UNIX. Once the stream socket has been established (as specified by the authority part of the URL), HTTP software shouldn't care or even know how the stream is transported. Most invented, non-standard approaches for HTTP-over-unix-sockets seem to gravitate to something like a different scheme, like http+unix or https+unix (since, from what I can see, the authority part can't really be disambiguated from a hostname if relative socket paths are allowed), and then percent-encoding the socket path into the authority part; everything seems to work naturally from there. I've also seen (and used) enclosing the socket path in [] in the authority part and keeping the scheme as http or https, but I think that namespace clashes with IPv6 style numeric addresses like [::1]:80. RFC 3986 (in section 3.2.2) kind of leaves space for this by anticipating future formats within the [], and providing a version prefix to disambiguate them. Overall I like this approach the best: it extends into the error space so it doesn't change the interpretation of any valid existing URL, lives in an extension space envisioned by the standard, minimally extends just the appropriate part of the standard (authority part), and keeps the schemes http and https to mean "this is a resource we talk to this authority using the http(s) protocol for", and so preserves compatibility for software that uses the scheme to know what protocol to speak with the authority over the socket. |
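For comparison, the http+unix style with a percent-encoded socket path in the authority can be sketched in a few lines of Python. This is only an illustration of that non-standard convention: `build_unix_http_url` and `split_unix_http_url` are hypothetical helper names, and the scheme is not registered anywhere as far as this thread establishes.

```python
from urllib.parse import quote, unquote, urlsplit

SCHEME = "http+unix"  # non-standard scheme, used here only for illustration


def build_unix_http_url(socket_path, resource="/"):
    """Percent-encode the socket path (including "/") into the authority."""
    return SCHEME + "://" + quote(socket_path, safe="") + resource


def split_unix_http_url(url):
    """Return (socket_path, request_target) for an http+unix URL."""
    parts = urlsplit(url)
    if parts.scheme != SCHEME:
        raise ValueError("not an %s URL: %r" % (SCHEME, url))
    socket_path = unquote(parts.netloc)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return socket_path, target


if __name__ == "__main__":
    url = build_unix_http_url("/path/to/socket.sock", "/resource")
    print(url)
    print(split_unix_http_url(url))
```

Because every "/" in the socket path becomes %2F, a generic parser sees an ordinary (if ugly) authority, which is why this form tends to work with existing URL libraries.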
Changing the syntax of URLs is not really something we're willing to do. That has a substantive cost on the overall ecosystem. The benefits would have to be tremendous. |
Syntax in
|
The strongest argument I can think of for this is: http(s) URLs have special parsing quirks which don't apply if the scheme is http+unix. So for a perfect 1:1 behaviour match, UDSs would need to use an actual http URL, not a custom scheme (similar to IP addresses). That said, I'm also not a fan of adding yet another kind of host (file paths). My preference would be to use a combination of:
This is a perfectly valid HTTP URL, and should be capable of representing any HTTP request target. Alternatively, you could try to get (Note: this would also mean that all UDS URLs have the same origin, although that could be remedied by adding a discriminator to the fake hostname to make your own zones of trust, e.g. |
I'm not sure using the fragment is really tenable for these use cases (and local web dev, especially). Many web applications use the fragment for their own purposes in JavaScript, whereas the host (at least in my experience) tends to be handled more opaquely. What would be the main drawback for allowing additional characters within [] for the host portion of an HTTP URL? |
Ah yes, you're right, it wouldn't work for local web development. I was thinking more about generic HTTP servers. The main drawbacks IMO are:
|
Yes, I think the place for the UDS socket is in the authority portion - that's the bit that has the responsibility for describing the endpoint of the stream socket to talk to for this resource. Putting it elsewhere feels like an abuse and likely to cause unforeseen problems (HTTP client software will certainly have the host portion of the URL available in the portion of the code that establishes the stream socket, but may not have the fragment). I think the namespace collision with IPv6 literals and syntax validation for UDS paths can be solved by:
It's up to the host to decode and translate the path into whatever native scheme that OS uses (just as it is for the path portion of the URI). For me the motivation for supporting HTTP over UDS goes way beyond web browsers (and I would see that as a minor use case for this) - for better or worse HTTP has become a lingua franca protocol for anything that wants to communicate on the Internet (consider websockets for some of the forces that drive this), and that is increasingly machine to machine. For example: we run an online marketplace that serves about 10 million requests a day over HTTP (excluding static resources offloaded to a CDN), but each of those involve several HTTP interactions with other services to construct the response: Elasticsearch queries, S3 to fetch image sources that are resized, etc, a whole host of REST services for shipping estimates, geocoding, ratings and reviews, federated authentication providers etc. So, by volume, the overwhelming majority of HTTP requests our webservers are party to are between them and other servers, and aren't transporting web pages. As the trend toward microservices and containerization continues this will only increase, and it's particularly there that I see HTTP-over-UDS being useful:
The other trend is for UIs to be implemented in HTML rather than some OS-native widget set (Android, iOS, GTK, QT, MacOS native controls, Windows native controls, etc), even when the application is entirely local on the user's device. There are very good reasons for this:
In this use case the hierarchical namespace issue is important and addresses a major downside to this pattern - choosing a port from the flat, system-wide shared namespace (ok, so the listening socket can specify 0 and have the OS pick a random unused port on some systems, but that's a bit ugly). Much nicer to use Finally, consider things like headless Chrome in an automated CI/CD pipeline - the software managing the tests being run on the deployment candidate version could start a number of headless chrome instances and run tests in parallel, easily addressing the websocket each provides with a UDS path like The tech already exists to make these obvious next steps in application provisioning and inter-service communication happen (even Windows supports Local sockets aka UDS), and the scope of the change for existing HTTP client software should be small and of limited scope (URL parsing, name resolution and stream socket establishment steps) but it can't happen unless there is a standardised way to address these sockets. |
What exactly is wrong with #577 (comment)? @karwa |
You ask the IETF, just like Personally, I'd go with something like:
Yes, the escaping is ugly, but it's much cleaner than overloading IPV6 in URLs. Alternatively, you might be able to get away with: |
@mnot any update on this? Was it implemented? Should this ticket be reopened? I'm also interested in this. |
I just left a comment with some context; I don't know that anything else has happened. |
I haven't read anything here that seems to justify breaking with the familiar pattern, "<protocol>://<domain>/<filepath>" or injecting a lot of special characters into the URL, or mimicking an IPv6 address. The protocol is simply "http". The domain is right there in the name, "Unix Domain Socket". Like any other top level domain - net, com, org - the domain is simply "unix". I don't know any reason that a web browser application cannot parse the domain from a URL, recognize a nonstandard domain name, and invoke a special handler for a non-network socket. The difficulty seems to be in distinguishing the path to the socket from the path to the resource file. The "HTTP with socket path as the port" option, above, makes the most sense. And since a special handler must already be invoked for this "unix domain", I expect that colons - ":" - can continue to be used as the "port" separator for the socket path. Altogether, that suggests a straightforward URL, as in: "http://unix:/var/run/server/ht.socket:/path/to/resource.html". Is there any reason that those repeating ":/" character sequences would pose a problem in a URL? This approach would not impose any limitation on the use of ":" in the resource path name, since a "unix domain" must be followed by a socket path, and that path will always be delimited by ":/". Any subsequent colons must then be part of the resource path name. And, of course, this URL format still supports specifying any arbitrary protocol, served through a unix domain socket. And there is nothing redundant or misleading in the URL, as would be the case with any format requiring the name "localhost" or involving special parameter passing. |
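To make the proposed ":/" delimiting concrete, here is a small Python sketch of how a purpose-built handler might split such a URL. It is an assumption-laden illustration, not a proposal for any real parser: `split_colon_delimited_url` is a hypothetical name, and the rule "the first ':/' after the prefix ends the socket path" is taken directly from the comment above.

```python
PREFIX = "http://unix:"  # the "unix domain" form suggested in the comment


def split_colon_delimited_url(url):
    """Split 'http://unix:<socket path>:<resource path>' into its two paths.

    The first ":/" after the prefix is treated as the end of the socket path,
    so later colons remain part of the resource path.
    """
    if not url.startswith(PREFIX):
        raise ValueError("expected a URL starting with " + PREFIX)
    rest = url[len(PREFIX):]
    socket_path, delimiter, resource = rest.partition(":/")
    if not delimiter or not socket_path:
        raise ValueError("no ':/' between socket path and resource path")
    return socket_path, "/" + resource


if __name__ == "__main__":
    print(split_colon_delimited_url(
        "http://unix:/var/run/server/ht.socket:/path/to/resource.html"))
```

One consequence worth noting: a socket path that itself contains ":/" cannot be expressed this way without some escaping, and a generic URL parser that knows nothing about this convention would most likely see `unix` as the host and everything after it as the path.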
http+uds:///path/to/socket? |
@michael-o, that doesn't provide any means to specify the resource path, as it is putting the path to the socket where the resource path should go. |
But one thing I sort of need to point out in this context is still the fact that the URL standard is still an "interface". Which means that we need to differentiate between attempts to modify the "interface" of a URL by changing the URL standard, and attempts to modify the "implementation" of URLs by changing individual implementations, the former of which is merely a means of abstractly expressing a Unix domain socket or similar string in a URL with few constraints on the actual implementation, and the latter of which is the actual means of connecting to a Unix domain socket i.e. All of the above syntax proposals would effectively be modifying the "interface" of a URL to support an extra method of connecting to a Unix domain socket. In many cases, interface implementations are not required to implement every possible method, if it is known that users of the interface will not use that method, and that is certainly true in other contexts (such as the Java Collections API with immutable or unmodifiable collections used with functions that only attempt to read from the collection). Sorry, but the Liskov substitution principle is not very applicable, otherwise every client that implements URLs would have to support every single URL scheme, and that is simply infeasible. Which means that even if we do have a standard for encoding a Unix domain socket path or similar string in a URL string, we cannot guarantee that every implementation of URLs (such as in browsers) will honor it. This effectively means that many of the linked issues regarding non-support of Unix domain sockets in various clients that take in URLs might be considered to be wishful thinking, that is, even if a scheme for encoding a Unix domain socket path is devised, there is no guarantee that every app will end up supporting it. 
On the other hand, my and @randomstuff's proposals of proxy servers or LD_PRELOAD libraries to support the connection of clients to Unix domain sockets are means of changing the implementation of URLs. It is similar to adding support for a new filesystem in an operating system kernel: the new filesystem can be used by applications transparently, by referencing paths on that filesystem in file access APIs, without having to change the application, because the different filesystems all share the same interface. This is generally much more feasible to accomplish. LD_PRELOAD might not be possible for Go binaries at this moment, but this is being worked on. Ultimately, this means that the mere act of connecting to a Unix domain socket is not necessarily something that requires changing the URL standard, if it is possible to shoehorn it into some existing interface. It may seem very hacky or unsightly, but the major advantage is that client applications do not need to be changed, considering how many HTTP clients or web browsers exist out in the wild. A similar issue exists in issue #392 where there is discussion on encoding an IPv6 link local zone identifier in a URL. The mere act of connecting to an IPv6 link local address is something that can simply be done by changing the implementation. For example, interpreting subdomains of the A more relatable example is the fact that the URL syntax did not need to be changed in order for connections to domain names to go over IPv6. If the URL standard did not have the square bracket notation, then it would have still been possible to connect to IPv6 websites on the network layer; the only limitation would have been that it would have required the use of a domain name to do so. The main reason why the URL standard ultimately did need to be changed in that case is because of the legitimate interest in connecting to IPv6 literal websites in the same way we could have done it with IPv4. |
Why was this closed? I found this issue based on searching for a solution to this exact problem, and the discussion mirrored and built on my own thoughts wonderfully. In fact, HOW was it closed? The GitHub UI isn't even showing a Closed event of any kind in the timeline. |
Ah thank you, I completely missed that. I've never in my life seen this many comments on a closed issue AFTER it was closed. :) It does seem like there's still a lot of discussion here about whether a different URL scheme is the best solution to the problem - #577 (comment) points out there are at least four different alternatives. Given that I'd argue this issue should be reopened, unless there is another active issue or RFC out there tackling the same problem. |
I should also mention that like several other commenters here, I'm willing to invest some of my own time into a standard solution, as I'd rather contribute there than build my own local standard solution. |
Hmm - reading back through this all:
As I discussed above, reading in RFC 3986, I have argued that the Unix Domain Socket, UDS, must be "the 'port' subcomponent of authority of an Address Family AF_UNIX socket" within the meaning of RFC 3986. In reference to that, I have no idea what you mean by "unix domain sockets have a completely different authority". Please explain. Are you arguing that the UDS is not an "authority"? If so, how so? And, what do you mean by "have" an authority, rather than being an authority - and, "different" from what? The simplest revision to RFC 3986, as I have suggested here, is to generalize the definition of "port" to include a UDS filesystem path, rather than be restricted to being exclusively an AF_INET number, and to also extend the use of the port ":" delimiter, as a kind of "toggle" delimiter using also a trailing ":", optional after an AF_INET number and required after an AF_UNIX socketpath, as in ":some/path:" or ":/some/path:", with the trailing ":" explicitly preceding the "path" component of the "authority" component of the "hier-part" of the URI in RFC 3986. Thus revising the original:
to express the URI in more plain and explicit format:
where "path" is already defined explicitly and separately in Section "3.3. Path":
and extend the definition of the authority, only changing the description, and not the actual definition, to become instead:
and extend the definition of "port", keeping in mind that "path" is already defined in the RFC, to become instead:
Note that this approach still allows the use of a ":" in a "hier-part path", distinct from the "hier-part authority", though it would prohibit the use of a ":" in the socketpath itself: "https://example.com:socket/path:/some:path/with/a:colon". The second ":" in the authority terminates the socketpath. Of course, that prohibition would be true for any delimiter that is used for the socketpath. Still, a literal "::" would terminate any "hier-part authority", without actually specifying a port. Incidentally, from Section "3.3. Path":
I also note that, following my rant about the "://", reference to 'a double slash ("//")' in RFC 3986 Section "3.2. Authority" must not be confused with Section "4.2. Relative Reference", referring to a URI "hier-part path", following a URI "hier-part authority":
This "relative reference" idea just seems to me to be a needless complication of the URI. I don't know that I've ever seen anyone actually use a "relative reference". And, while most browsers seem to accept a URL with a missing "scheme" and elided "authority", which is to say, the "https://" prefix deleted, that is not an example of a "relative reference" as defined in the RFC. It would be useful for people to please argue whether this "socketpath as port" interpretation of the UDS, as outlined here, is, or is not: effective, "minimally invasive" to existing standards, and/or not counterintuitive. |
@annevk |
I do think there's a good case here for making authority specification more flexible. I find @thx1111's proposal roughly appealing, as I agree that the current numeric port section is overly biased for specific protocols. At the same time, extending it to only support socket paths as well (or at least a path-like value) feels like it fails to prevent future instances of the same problem for other host schemes. My preference would be to ignore the port for protocols where it isn't useful and instead focus on scheme-specific authority syntax refinement. As a data point, though, I thought I would go with the I suspect a supplementary RFC proposing standard ways of embedding alternative kinds of authorities in the existing URI host syntax will be more useful in practice. This could be referenced by extensions to existing schemes like FWIW, in my POC I'm finding something building on the |
Yes. Something like |
@thx1111 it's indeed not completed, but I don't think rejection was an option back in 2021. We can't change the URL parser for each new use case that comes along. As Robin suggests above (and mnot explained earlier) there's plenty of room to innovate within the constraints of the existing syntax. |
For comparison, OpenLDAP uses a different scheme for ldap-over-UNIX-sockets: |
Please provide or contrive an example. Also, please clarify - the words "host" and "scheme" are specific and distinct terms-of-art as used in the RFC 3986 definition of a "URI", where "host" is a component of the "authority". By "host scheme", did you mean a "host", or a "scheme" - or something else? You can find a list of registered "schemes" under "Uniform Resource Identifier (URI) Schemes" at: https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml We may note that many of the "schemes" registered there, as seen in their description templates, define no "authority" component, and consist of only a "scheme" and "path". Obviously, without an "authority", there would be no "port". Even a "localhost" AF_UNIX URI would have to specify a "host" component, which is not optional if there is an "authority" in the URI, as for instance, literally "localhost". But then, either a "scheme" needs an "authority" - or it doesn't. Nonetheless, any RFC 3986 defined URI is going to use the exact same definition of "authority", as defined in the RFC. The "authority" does not change with - is not a function of - the type of "scheme" selected. These are two distinct things, the "scheme" and the "authority", within the RFC. I have not even proposed changing the definition of the "authority". I have only proposed extending the definition of the optional "port" component of the "authority", to allow for an AF_UNIX "path", as distinct from the "url-path" or "urlpath", as it was termed originally in RFC 1738:
Hmm - rhetorically, how does a hostname of the form which you suggest, having the top-level domain "localhost", get routed around the Internet? I do not understand what you are describing there. Can you be more specific? You could use the "userinfo" component of the "authority", preceding the "@" delimiter, to customize the effect of the URI. From the RFC, Section "3.2.1. User Information":
Did you mean something like that?
In terms of RFC 3986, that format also places the UDS socketpath into the URI "authority". But, where is the "host" component? The URI "host" is required by the RFC when any "authority" is given. So this OpenLDAP ldapi URI scheme does not comply with RFC 3986, and neither is it registered with the IANA, as is. But, same as above, "@ host" could be appended to that socketpath, to make it RFC compliant. Still, then URI "userinfo" will have been lost. Still, the socketpath could be added as a "password" for the "user", which is being deprecated anyway. But, strictly speaking, the RFC says:
So, the user could not visually discover or verify the actual UDS socketpath in a RFC compliant application using "socketpath as password".
"Plenty of room to innovate" here seems like "code words" for "stuffing a square peg into a round hole." And, please explain how "room to innovate" is something that does not also mean "change the URL parser"? The classic method used to make an easy problem seem difficult and complicated is to have more parameters than abstract variables. This then becomes a game of "musical chairs", and something always gets left out, no matter how the parts are rearranged. That's a "fool's errand". |
@thx1111 - mea culpa, "host schemes" was a poor choice of words. Perhaps "kinds of authority" is a better phrase: the current URI syntax seems to be biased towards TCP, and your proposal tries to at least extend it to more naturally support UDS. I do have another concrete example in mind: for my use case I also want to be able to do HTTP over VSOCKs. The address for an AF_VSOCK socket is a 32-bit port number together with a 32-bit context ID. I'm not sure how I'd encode a VSOCK address into an authority with your port proposal: something like My point being that making the port either a number or a path is also not as future-proof as it could be: IF we want to make the authority syntax less of a round hole for other kinds of authority, I'd be inclined to make the authority more opaque at the generic URI syntax level, so specific URI schemes can encode the kinds of authority they support more naturally and with less need for percent-encoding. |
Casual inspection of this description from the man page reveals the terms "port number" and "host". Then, referring to the current version of RFC 3986, I see:
Rhetorically, what would happen if the VSOCK "host" were a URI "host" and a VSOCK "port number" were a URI "port" number? Clearly, I'm not appreciating the actual problem there, since "host:port" is already defined in the RFC. What is the actual problem? |
You're correct that it's also possible to encode the VSOCK information in the URI as By the same argument, a UDS authority at I'm trying to say that your proposal makes progress by allowing a path as a port as well as a number, but I think IF we're going to propose a URI syntax change, it should have deeper impact and provide enough flexibility to support other address families nicely as well. (And to be clear I'm personally not enthusiastic at this point about a URI syntax change, given the |
No, it could not be misinterpreted as an INET authority. And, "1234567" is not a valid top-level domain name. From RFC 3986:
This reference to a "first-match-wins" algorithm is just another tacit assumption of RFC 3986 which presumes the crafting of an appropriate heuristic. The application author is on-their-own for crafting that algorithm. We could pretty much apply this same reasoning in the RFC - just presuming some heuristic provided by an application to recognize the URI authority "host" - to the interpretation of the URI authority "port", to distinguish a "*DIGIT" from a socketpath, as in ' path ":" '. There is nothing "novel" in that. It would simply be polite to articulate such a presumption in the RFC itself, if it were to be adopted as a de facto standard in applications. |
@randomstuff's "soxidier" tool maps domain names to unix domain sockets, such that if the user went to http://name.username.users.uds.localhost/foo/bar then it would be effectively connecting to e.g. In this way, the syntax of the URL is not changed, as it looks like a normal URL with a domain name. Only the interpretation of the URL in terms of the network stack is changed. Mapping domain names or IP addresses to unix domain sockets may be a bit hacky, but the mappings can be automatically generated and managed with scripts, SQL tables, and/or similar. This basically shows that it is not necessary to modify the URL syntax to merely connect to a Unix domain socket, because as I've said, there are many URL parsers which would have to be updated, and it's not really practical to modify every one of them. But modifying the URL syntax still has a few advantages. For example, you might have an app that may print out something like this on the console:
With both my and @randomstuff's solutions, one would have to register the Unix domain socket in the respective tools. But embedding the unix path in this manner does not have this restriction. |
Well, fine. But, as I said, that is not an RFC 3986 compliant URI. A list of IANA recognized top-level Domain Names can be found at: https://data.iana.org/TLD/tlds-alpha-by-domain.txt Of course, you could configure your own local dns server to resolve "localhost" to some server on your local network, but that is not a general solution. It would be useless for accessing a host on the global Internet. To be clear, with respect to the solution which I have proposed, the pros and cons come down to a question of a single ":" character. Seriously - the placement of a single ":" character in the context of the entire RFC 3986. |
Section 3.2.2:
Unix domain socket URIs are not intended to have global scope, due to the local nature of Unix domain sockets. Sure, you might not be able to connect to sockets whose name contains a colon, but one could place a symlink at a file path which does not contain a colon, and point it to a file path that does contain a colon. |
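The symlink workaround mentioned above could look roughly like this in Python. It is a sketch under the assumption that the client resolves the socket path through the filesystem in the usual way; `colon_free_alias` is a hypothetical helper, and replacing ":" with "_" in the alias name is an arbitrary choice made for the example.

```python
import os
import tempfile


def colon_free_alias(socket_path, alias_dir):
    """Create a symlink whose name has no ':' and that points at socket_path.

    Returns the path of the new symlink. The caller is responsible for
    choosing an alias_dir whose own path contains no ':'.
    """
    alias_name = os.path.basename(socket_path).replace(":", "_")
    alias_path = os.path.join(alias_dir, alias_name)
    os.symlink(socket_path, alias_path)
    return alias_path


if __name__ == "__main__":
    target_dir = tempfile.mkdtemp()
    alias_dir = tempfile.mkdtemp()
    # Stand-in for a real socket file whose name contains a colon.
    target = os.path.join(target_dir, "web:8080.sock")
    open(target, "w").close()
    alias = colon_free_alias(target, alias_dir)
    print(alias, "->", os.readlink(alias))
```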
Allowing use of DNS to resolve a URI to a Unix Domain Socket path sounds like a wonderful gift to hand to malevolent actors. Convincing a user to click a link that resolves to a well-known UDS-based service would become commonplace. Regardless of what any RFC says, a URI referring to local resources should look significantly different from one referring to external resources, so that no person and no legacy or naive code could be confused about what transport mechanism is involved in accessing the resource in question. |
And this is why my solution involves the use of IPv6 link locals without scope id, as opposed to domain names (but domain names may still point to the IPv6 link local addresses), because at least on Linux, they fail to resolve in the network stack by default, however they can still be intercepted and made useful/usable by LD_PRELOAD libraries or eBPF programs. And just to be clear with my last example with "/var/run/whatever.sock", that is not at all suggestive of the intended URI syntax. What I was suggesting was that if there were a URI syntax then an app would have some means of statelessly encoding the Unix domain socket path, whether it's directly embedding the path string, or substituting, escaping, or percent encoding special characters. |
Could the e.g.:
|
Sure, you could have a different TLD for the unix domain socket namespace, but it's still not intrinsically part of the url standard. If you go to https://jsdom.github.io/whatwg-url/#url=aHR0cHM6Ly9uYW1lLnVzZXJuYW1lLnVzZXJzLnVkcy5sb2NhbGhvc3QvZm9vL2Jhcg==&base=YWJvdXQ6Ymxhbms= you see that the hostname portion of the URL http://name.username.users.uds.localhost/foo/bar is The ability to connect to a unix socket in this manner is realized because As I've said before, the interpretation of DNS names at the DNS level is operating system dependent and is modifiable by various subsystems. And, it is still necessary to parse the hostname as a domain name to determine that it has a tld of .alt in the first place. So it's not really the issue of which TLD the unix domain socket namespace is "mounted" on. |
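The point that such a URL is just an ordinary hostname as far as the parser is concerned can be illustrated with a short Python sketch. Note the assumptions: Python's `urllib.parse` is not the WHATWG URL parser discussed in this issue, `labels_under_suffix` is a hypothetical helper, and the `uds.localhost` suffix is simply the one used in the example URL above.

```python
from urllib.parse import urlsplit


def labels_under_suffix(url, suffix="uds.localhost"):
    """Return the host labels in front of `suffix`, or None if it is absent.

    The URL is parsed as a completely ordinary URL; deciding that the
    suffix means "Unix domain socket" is left to whatever resolves the name.
    """
    host = (urlsplit(url).hostname or "").rstrip(".")
    labels = host.split(".")
    suffix_labels = suffix.split(".")
    if labels[-len(suffix_labels):] != suffix_labels:
        return None
    return labels[:-len(suffix_labels)]


if __name__ == "__main__":
    print(labels_under_suffix(
        "http://name.username.users.uds.localhost/foo/bar"))
```

How those labels map to a socket path is the resolver's business and is deliberately not shown here.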
To review, the topic of this discussion is "Addressing HTTP servers over Unix domain sockets". And now, addressing that goal, I have proposed interpreting an RFC 3986 "URI" - which is what is implied by the phrase "Addressing HTTP servers" in the title - more generally, so as to successfully resolve an AF_UNIX socket "filesystem pathname", as defined in It is important to note that an AF_UNIX "address" has a format different from other socket address families, where RFC 3986 is especially biased in support of AF_INET and AF_INET6 sockets. Also,
where "efficiently" generally implies side-stepping use otherwise of the local machine's IP stack. And, while use of an AF_UNIX socket is "to communicate between processes on the same machine", at the same time, the meaning of "Addressing HTTP servers", again, implies use of an RFC 3986 "URI", which itself is subject to additional standards recommendations. For instance, RFC 4007 extends the Textual Representation of an IPv6 address to include a "zone_id", using the "%" character as a delimiter, and RFC 6874 extends the definition of the RFC 3986 "URI" to address the use of this "%" delimiter, otherwise always treated as an escape character in a URI, such that the "%" character is replaced by its equivalent "escaped" representation, "%25":
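The RFC 6874 escaping just described, where the "%" before a zone identifier is written as "%25" inside the bracketed literal, can be sketched as follows. `ipv6_zone_literal_for_uri` is a hypothetical helper name, and the address and interface name in the usage lines are placeholders.

```python
from urllib.parse import quote


def ipv6_zone_literal_for_uri(address_with_zone):
    """Format 'addr%zone' as a bracketed URI host in the RFC 6874 style.

    The "%" delimiter becomes "%25"; any characters in the zone identifier
    outside the unreserved set are percent-encoded as well.
    """
    address, delimiter, zone = address_with_zone.partition("%")
    if not delimiter:
        return "[" + address + "]"
    return "[" + address + "%25" + quote(zone, safe="") + "]"


if __name__ == "__main__":
    print("http://" + ipv6_zone_literal_for_uri("fe80::1%eth0") + "/")
```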
Of course any delimiter, for whatever purpose, might be represented using an escaped character sequence, but RFC 6874 only applies to an IPv6 "ZoneID", and is of no use in IPv4 addresses, to otherwise describe an AF_UNIX "Pathname" address as a kind of "ZoneID". More recently, 2020 June, RFC 8820, "URI Design and Ownership" contains some rather pointed comments warning about overly rigid, difficult, or presumptuous interpretations of the RFC 3986 "URI". For example, RFC 8615, "Well-Known Uniform Resource Identifiers (URIs)", extends the interpretation of the URI "path":
But, then it goes on to say:
So, a UDS pathname could not, itself, be used as a "/.well-known/" registered name in a URI. Still, a UDS pathname could be modified in other ways, such as using the escaped "/" character, "%2F". But "/.well-known/" registered names seem an unnecessarily complicated approach to accommodating every different address style in every different address family to be found in More generally, the goal in addressing the question raised in this discussion is to provide a simple URI mechanism to accommodate the varied "address" formats used in the many Now, to your point, you seem to be suggesting that some security conscious System Administrator, for example, such as yourself, would configure a publicly accessible server to provide open access to a UDS, and thereby, leave the system vulnerable to some kind of security breach. Is that correct? Please explain - Why would you do that? Can you provide an example? There is no standards specification which requires any computer system, networked or not, to provide open access to a UDS. In fact, there are quite a number of mechanisms, already in use, to allow a System Administrator to restrict access to system resources, especially including any networked computer system. What prevents a System Administrator from making use of those security measures with respect to a UDS, as compared to any other socket family, and especially, as compared to AF_INET and AF_INET6 sockets? |
Another example where changing the interpretation of domain names allows connections to Unix domain sockets is the Nginx "upstream" module: https://nginx.org/en/docs/http/ngx_http_upstream_module.html Here, we use the URL |
It is often desirable to run various HTTP servers that are only locally connectable. These could be local daemons that expose an HTTP API and/or web GUI, a local dev instance of a web server, et cetera.
For these use cases, using Unix domain sockets provides two major advantages over TCP on localhost:
Indeed, due to these advantages, many servers/services already provide options for listening via a Unix domain socket rather than a local TCP port. Unfortunately, there is not currently an agreed-upon way to address such a service in a URL. As a result, clients that choose to support it end up creating their own bespoke approach (e.g., a special command-line flag, or a custom URL format), while others choose not to support it so as not to bring their URL parsing out-of-spec (among other potential concerns).
Here are some of the various URL formats I've seen used or suggested:
- unix:/path/to/socket.sock. This lacks both the protocol and resource path, so it can only be used for clients that already know they'll be speaking to a specific HTTP API, and is not generally usable.
- http://localhost:[/path/to/socket.sock]/resource. Only allowed when host is localhost. Paths containing ] could either be disallowed or URL encoded.
- http+unix://%2Fpath%2Fto%2Fsocket.sock/resource. Distinct scheme allows existing http URL parsing to stay the same. URL encoding reduces read- and type-ability.
- http+unix://[/path/to/socket.sock]/resource or just http://[/path/to/socket.sock]/resource. (The latter would require using the leading / of the socket path to disambiguate from an IPv6 address.)
References:
Archived Google+ post suggesting the socket-as-port approach:
https://web.archive.org/web/20190321081447/https://plus.google.com/110699958808389605834/posts/DyoJ6W6ufET
My request for this functionality in Firefox, which sent me here:
https://bugzilla.mozilla.org/show_bug.cgi?id=1688774
Some previous discussion that was linked in the Firefox bug:
https://daniel.haxx.se/blog/2008/04/14/http-over-unix-domain-sockets/
https://bugs.chromium.org/p/chromium/issues/detail?id=451721