-
Notifications
You must be signed in to change notification settings - Fork 75
Proxy protocol support for tcp input #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This would resolve elastic/logstash#4418 for the tcp input at least. |
|
@jordansissel ping |
|
Will review. I am going to read up on http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt first |
|
Ok I've studied the spec a bit. Before I give some recommendations, I'll explain whatever goal I hope the recommendation achieves. I will also leave some inline comments in the patch. For starters, I am +1 supporting the haproxy proxy protocol. Thought 1: I wonder if we can have the
Thought 2: If you reject
|
lib/logstash/inputs/tcp.rb
Outdated
|
|
||
| # Proxy protocol version, only 1 is supported at this time | ||
| # http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt | ||
| config :proxy_protocol, :validate => :number, :default => -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments on the main PR about possibly removing the need for this setting entirely or an alternative of making it a boolean.
lib/logstash/inputs/tcp.rb
Outdated
| # PROXY proto clientip proxyip clientport proxyport | ||
| if pp_info[0] != "PROXY" | ||
| @logger.error("invalid proxy protocol header label", :hdr => pp_hdr) | ||
| tbuf = orig_buf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I read the spec, we should terminate this connection if we have an invalid header.
The receiver MUST NOT start processing the connection before it receives a
complete and valid PROXY protocol header
and
The receiver MUST be configured to only receive the protocol described in this
specification and MUST not try to guess whether the protocol header is present
or not.
and
Any sequence which does not exactly match the protocol must be discarded and
cause the receiver to abort the connection. It is recommended to abort the
connection as soon as possible so that the sender gets a chance to notice the
anomaly and log it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm completely open to that - figured y'all would rather it be tolerant of user config error. Will adjust.
| first_read = true | ||
| while !stop? | ||
| codec.decode(read(socket)) do |event| | ||
| tbuf = read(socket) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible that read() returns a payload that has too-few bytes in it to form a complete header.
However, given the following part of the specification, I am willing to accept that this is unlikely my experience with modern networks:
Both formats are designed to fit in the smallest TCP segment that any TCP/IP
host is required to support (576 - 40 = 536 bytes). This ensures that the whole
header will always be delivered at once when the socket buffers are still empty
at the beginning of a connection. The sender must always ensure that the header
is sent at once, so that the transport layer maintains atomicity along the path
to the receiver. The receiver may be tolerant to partial headers or may simply
drop the connection when receiving a partial header. Recommendation is to be
tolerant, but implementation constraints may not always easily permit this
lib/logstash/inputs/tcp.rb
Outdated
| while !stop? | ||
| codec.decode(read(socket)) do |event| | ||
| tbuf = read(socket) | ||
| if @proxy_protocol == 1 && first_read |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing this check every time we read may have negative performance impacts. I am thinking that this header parsing should be done before entering the while loop in this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured the overhead of a single variable comparison would be dwarfed by the time taken for a read.
|
I think I implied that proxy protocol v2 support is needed. I don't think it's needed in this PR, and I don't want to add additional work to the scope of this PR. V1 is just fine for now, and I would still prefer to have users (as mentioned above) not need to configure which protocol version is used. |
|
This change adds a significant feature, and so I ask that you add test coverage for the proxy protocol to make sure it works correctly. |
|
Boolean makes sense to me. Will apply that (and your other comments) here and to the other two PR's (log4j and syslog). |
|
Can you give me an example in the test case code equivalent to something like: insist { connection was dropped } Otherwise I think the updated PR covers your request to switch to boolean and the request for test case. |
|
Made the same boolean and test case addition on the PR for syslog input plugin. |
|
On the suggestion to pull the check out of the loop - I think that will require doing single byte reads from the socket in advance of the loop. That will probably be required anyway if we add v2 support later. |
|
This is on my todo list to review. Sorry for the delays! |
|
Thank you (for this one and the other two - syslog and log4j as well). |
|
LGTM |
|
Jordan Sissel merged this into the following branches!
|
This adds capability for tcp input to support the PROXY protocol - optionally generated by Amazon ELB and HAProxy. Intent is to do the same for log4j and syslog inputs if the implementation is acceptable. (I'm not a ruby programmer, so if there is better syntax/approach for any of this, please indicate.)
The purpose is to allow having logstash sitting behind a proxy without losing details of the connecting client host+port.