-
-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proxy Protocol Support #1270
Comments
|
Depending on Elixir code is definitely a big no, it's just not practical for non-Elixir developers. I'm not sure this feature should be in Cowboy, I think the ranch_proxy_protocol approach works well to be honest. Perhaps you could give more details as to what issues you ran into when using it for Websocket? I don't really see what problem you could have encountered considering Websocket is on top of HTTP (you never connect to Websocket directly). |
|
I'm going to work on a minimal app that demos the issue that I'm running into but the gist of it is when I swap out the default |
|
Disable the proxy protocol and try against a plain Cowboy, chances are your problem is unrelated to ranch_proxy_protocol. |
|
I have and it works in that case. My original thought was that the issue was not it ranch_proxy_protocol but everything in my testing points to that being the issue. |
|
Oh ranch_proxy_protocol is more complex than I remembered. Considering the PROXY stuff is just a connection header I don't think there should be any need to override transports at all. A better solution would be to have a protocol module that will read and parse the PROXY header and then call the real protocol module (similar to what cowboy_clear/cowboy_tls do). Of course it would need to fill in some options that Cowboy can then read from. Yeah I can see the case for having this built-in I guess. I just won't have free time for that for some time. |
|
Yeah that was my original thought as well (why I wrote the elixir PoC). If you could point me in the right direction I believe I should be able to put together a PR for this fairly soon. Translating the code I have previously written shouldn't be that hard. I just would need to know where/how to integrate it into Cowboy. |
|
I'll need time to think about it and I got no time right now. Feel free to experiment a little and show what you get, that would help. |
|
Would a transport wrapper work better? It would "lie" to ranch about the peer. Whitelisting upstream proxy esp, by cidr notation would be a pain. I had to do something similar recently (whitelisting |
|
I don't think that's a good idea, better not lose information. |
|
It's not really lost, just default to the client's address. |
|
I don't see much point in having a special transport for something that occurs only at the beginning of the connection. Best have a special protocol module that gets the info then switches to h1/h2 (or any other protocol that implements the same interface) passing along the information from the PROXY header. And then in Cowboy you could have something like |
|
It's mostly a wrapper, you pass |
|
I understand how ranch_proxy_protocol works and I'm mostly fine with it, but it's not the kind of solution I would go for if it was to be in Cowboy or Ranch themselves. Note that the solution I (vaguely) propose would also work for non-HTTP traffic too. |
|
Sry, I misread what you posted. You are saying providing something like |
|
No I'm saying a full fledged protocol module that receives the socket/options/etc. normally and eventually passes on to another protocol. Something like cowboy_clear/cowboy_tls except it'd have a receive loop to parse the PROXY header and then on completion call another protocol module. |
|
I translated my original elixir version of the proxy protocol into erlang.[0] This is my first time just writing Erlang directly so any pointers on idiomatic Erlang would be appreciated. The library is fully tested and everything should work as you would expect. My next goal would be to integrate this code into Cowboy. The easiest way is to likely publish this as a package and then write a small module that extends cowboy in order to parse the protocol where necessary. I'm not as familiar with the internals of Cowboy so I will defer to your expertise of the best way to handle this. Let me know what you think. |
|
Basically what you want is something that will run between cowboy_clear/cowboy_tls and cowboy_http/cowboy_http2. It'll start reading from the socket to get the PROXY header and when it's done will run cowboy_http/cowboy_http2 and pass the extracted info to them. |
|
If anyone is curious, the patch to cowboy was not that intrusive to get this to work master...jimdigriz:proxy ...didn't say it was not offensive though! :) |
|
Note that I am starting work on adding PROXY support to Ranch directly, and after that Cowboy. The plan is to have a special protocol module that then gives control to the normal protocol module along with the proxy information in the options. Cowboy will pass that information forward via the Req. |
|
Noted and much oblig'd, thanks! I guess we can follow this work in ninenines/ranch#204 |
|
Yes, thanks for linking. |
|
@essen Feel free to use the following code as a guide to parse the proxy_protocol as it should already be spec compliant and agnostic for v1 and v2. https://github.com/tomciopp/proxy_protocol |
|
Thanks. I'm probably not going to make it this far today, I'm mostly figuring out how to integrate this into Ranch, Cowboy and RabbitMQ. That said, I probably wouldn't be able to reuse it as-is, because your code expects the full proxy header to be given when trying to parse it, which might not necessarily be the case. I'll definitely steal some test cases though. |
|
Actually my assumptions might be wrong there because of: That however says nothing of the Erlang implementation on top of that. I wonder if we can safely expect the first read to actually contain everything. |
|
Based on my understanding of the protocol it should contain everything since it prepends the info before the rest of the request. You should be good to use it as-is, but I'm not as familiar with all of the edge cases, so I will defer to your knowledge. |
|
My lack of knowledge rather. A quick experiment shows that it's likely true for Erlang as well. I'll therefore retract my initial observation. |
|
From my experience, it has not been the receiver that causes this per-say, it is because of the SSL framing by the sender. For example, old Opera sends one byte (error codes on socket writes are easier to reason about), followed by the rest of the payload behind it. OpenSSL decodes on frames and so when you read, you see the first frame, which is one byte, and the second frame is the rest. I would be surprised if there was a proxy implementing this protocol out there that pushed a single byte rather than the whole header though. |
|
The SSL example was misguided because the PROXY header precedes the TLS part (and I edited it out of my comment because it was not a good example). The protocol seems to require sending the whole header in one packet, for example: And: So there's no problem doing a single |
|
Yeah the risky bit though is that there's no guarantee that something on the network did not bundle part of the TLS headers/hello along with the proxy header in the same packet. If you do a recv, either be sure to properly deal with the re-buffering of TLS data, or make sure to use line packet parsing for the headers and switch back to raw after that. |
|
Right. We don't have peek[1] but we do have unrecv[2] so all that's really necessary is that unrecv be documented and/or peek gets added (the NIF rewrite might be a good opportunity to request this). Then you just unrecv what remains from the buffer before initiating the TLS handshake (or read the consumed data if you peeked). [1] erlang/otp#335 |
|
There's at least one custom TLV: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/load-balancer-target-groups.html#proxy-protocol Don't want to support parsing it, but definitely should make the data available. |
|
I got most of the parser done now, about to add tests. I reused some of your code, though most of it is probably unrecognizable by now. I've carefully reimplemented with a close eye on the spec and the haproxy implementation. I'll probably push that part on Monday. |
|
@tomciopp Your v2 implementation seems to be missing one byte: Noticed while running your tests against my implementation. EDIT: do the tests even pass? Some lengths in the tests are incorrect. Doesn't matter, correcting and reusing. |
|
Yep, I must have glossed over that section while implementing. Luckily the fix for this should be quite simple by inserting the correct byte for each case. I will make a note to fix it while I have some time later today. |
|
Just pushed my initial parser implementation: ninenines/ranch@1d6d695 It supports everything from the sections 2.1 and 2.2 of the specification, although I need to write many more tests especially for the TLVs. Please review and provide feedback! |
|
I forgot to say in the commit: @tomciopp I reused your find_ip/port functions and the tests. I looked into ranch_proxy_protocol to reuse some tests too but they're complicated. |
|
Afraid I'll drop OTP-18 since it doesn't have map type syntax. It's old anyway. |
|
@essen I will take a look later today and give my thoughts |
|
The ipv6 address parsing is incorrect, it expects the full size but that's not what the spec says (it allows double colon for example) and that's not what the haproxy code says either. So getting until |
|
@essen Are you referring to the human readable (version 1) protocol? If so you should be correct I am only handling the naive case. Unless I am mistaken, I believe that the ipv6 parser for v2 should handle all cases since you can't represent an address more efficiently without losing information. |
|
Yes. Afraid I imported that flaw into my first implementation. But what I will push later today fixes it. |
|
I've just pushed ninenines/ranch@1cc7de1 which contains functions for building the PROXY header and tests for the TLVs. I'll add the ranch_tcp:recv_proxy_header function tomorrow. |
|
I've just pushed ninenines/ranch@4b9970b which adds the function |
|
Reviews and feedback welcome! I'll do the Cowboy side of things in a week or two. |
|
Ok, the only question I have is a hypothetical. Say I'm using another cloud provider that has defined their own custom TLV (other than AWS) and I need the information defined within? How should I proceed? My general thoughts are that you should copy the current |
|
See Just find the right TLV data by using its number, and parse it yourself. Note that Ranch does not implement the Amazon TLV, you have to parse it yourself (it has a fake one in one of the tests but it doesn't parse its contents). That's all we can do to support custom TLVs, different providers could reuse the same codes for different things, it's not workable otherwise. |
|
For TLS connections I'm considering the possibility of grabbing the TCP socket from the The I'm open to suggestions or warnings that this would be a huge mistake, and otherwise will probably go down that path as this gives us a lot of advantages and simplifies the code that needs to read the PROXY protocol header immensely. If Ranch 2.0 ends up removing the socket from the protocol start_link call, a separate function will be necessary to receive the socket before doing the handshake, but I suppose that's OK. But we can figure that out when the time comes and this extra function may be useful regardless of the PROXY protocol concerns. |
|
The other bonus is that the implementation ends up tiny, outside of course of the parsing code (which is big but only because it supports the full spec). |
|
How about having something like this in the protocol code for when you want to get proxy headers (forward compatible with planned Ranch 2.0 changes): {ok, Socket} = ranch:wait_for_socket(Ref),
{ok, ProxyInfo} = Transport:recv_proxy_header(Socket, 1000),
{ok, Socket} = ranch:handshake(Ref),In this example the sockets returned are the same, but they don't have to be when using custom transports. The first line just waits for the socket to be received (implying controlling_process was updated) and returns it. It then queues the message again. The second line reads the proxy header and returns the proxy info (using the TCP socket, unrecv if necessary) and the third line does the TLS handshake. The code would be compatible with both TCP and TLS so no need to do anything special to have a protocol support both TCP and TLS. Ultimately this uses two undocumented features: the unrecv (there will be a peek feature in OTP 22 to replace it) and retrieving the TCP socket from the |
|
Only the second line can return error tuples so even if you need to handle errors it's not a terrifying prospect. |
I think this is risky for the reasons that you've already listed. I don't necessarily understand why we're optimizing for the failure condition when we shouldn't expect it to happen that often. I get that the implementation would be smaller, but shouldn't we be optimizing for robustness? What is the worst case for the alternative? |
|
The failure condition in question is due to misconfiguration, which for projects like RabbitMQ is happening often enough to warrant guarding the input as much as possible as early as possible. I don't think the solution I propose is less robust, it should be equivalent. It might have a slightly higher maintenance cost if the ssl application changes drastically. It's also more consistent since you can just configure your TLS listener as normal and not have to worry about this at all, just call the transport module's |
|
I have opened https://bugs.erlang.org/browse/ERL-757 with regard to the hack necessary for doing the recv on the ssl socket. |
|
OK so I changed things a bit and went for this: {ok, ProxyInfo} = ranch:recv_proxy_header(Ref, 1000),
{ok, Socket} = ranch:handshake(Ref),This way there's no confusion about having two Socket variables, since there's only one. And there's no confusing function name. That'll become 1.7.0 once I finish porting RabbitMQ and Cowboy to it. |
|
As far as RabbitMQ goes, the switch from ranch_proxy_protocol to this is trivial:
Cowboy's implementation is also expected to be short. Though for some projects switching will probably be a little more complex than this. |
|
I've got the start of a Cowboy implementation done. I will push it tomorrow. As far as configuration goes, you enable it via the |
|
I've pushed PROXY support in Cowboy in 122faed See previous comment for instructions. Please try it and provide feedback, if possible within one week as then I could release Ranch 1.7 next week at the very least. |
|
Hello, has anyone done any experiments with this? Any negative comments? I'll wait a few more days and if no feedback assume it's all good and start releasing new versions with this feature. |
|
I've added some short documentation locally for Cowboy, and will release Ranch 1.7.0 later today. Considering this fixed. Thanks! |
Motivation
I'm running a phoenix app that needs to support HAProxy's proxy protocol which I'm currently doing with help from
https://github.com/heroku/ranch_proxy_protocolHowever I'm currently running into issues surrounding websocket support for that library and thought it would be better if this was directly built into cowboy with support via a configuration or as some sort of plugin. Based on some quick googling it looks like this has been requested before but no one has really taken the initiative to properly build out the feature. I would like to remedy that.I wrote a parser for the protocol in elixir which can be found here[0]. The api is pretty basic you feed it a packet and it returns a buffer and a struct containing the relevant information. My idea was that you could place this code at the beginning of your parser, pull out any relevant info and then continue on with the rest of the buffer as though it weren't there. As of right now the project supports both versions of the protocol out of the box and is fully tested.
Caveats
I don't know the architecture of cowboy very well, and while I've read a few erlang programs I would not call myself an erlang programmer. I would likely need some help translating this code unless you wanted to call elixir from erlang[1] What are your thoughts on how to best add support for this feature?
[0] https://github.com/tomciopp/proxy-protocol
[1] https://joearms.github.io/published/2017-12-18-Calling-Elixir-From-Erlang.html
The text was updated successfully, but these errors were encountered: