Failed TLS handshake #91
Comments
Never mind, I think this was a data issue; we had some date mismatch issues. Thanks.
FWIW, I'm getting this for random log shippers as well. If I restart Logstash (with a Lumberjack input), they'll be able to reconnect again.
I'm also seeing lumberjack throw these errors intermittently. When this happens, sometimes (but not always) I see these errors in the logstash log once every couple of seconds:
But these errors don't always occur -- and when they do, the lumberjack errors continue long after these errors stop. After restarting logstash, the lumberjack agents will reconnect... for a while. I'm running lumberjack 0.3.1 and logstash 1.1.12
I'm also running into this issue. lumberjack 0.3.1 and logstash 1.2.2.
Hi, this is normally a time/date related issue; ntpdate normally fixes it for me. It seems that the TLS handshake is very sensitive to time changes.
2013/12/09 17:47:54.260621 Read error looking for ack: read tcp 192.168.51.61:5043: i/o timeout
service ntp stop
2013/12/09 17:48:37.053861 Read error looking for ack: read tcp 192.168.51.61:5043: i/o timeout
Has anyone managed to find a workaround for this issue? It seems to strike randomly, and fiddling with ntp is not resolving the problem for me when it happens.
Unfortunately, for me the "fix" is to restart Logstash with a cron job every so often. The logs get queued by Lumberjack anyway, so no logs get lost. I haven't had the time to do any further investigating on my end.
I am getting loads of these from lots of servers, and lots of logs are not getting through. I don't think it is time related, as the error is coming from the lumberjack code, not the SSL connection, as far as I can tell. The best "fix" I have so far is changing the end of the server.rb file in the library that logstash is using so that it sends an ack for every line (I think that is what it does), but even then I am still getting loads of timeouts.
Hi @choffee, is your pipeline in logstash very busy? That might be the cause. I implemented partial ACKs and separated communication from the pipeline in #180 to fix this. You could try that? It requires copying server.rb into logstash and also patching the forwarder. I have that and other major statefile/prospector issues fixed in my repo too. Jason
Hi @driskell, thanks. I am trying that out now and it seems to be working great. I don't think I am sending a massive amount of logs through (/me notes: should graph that), but when the main logstash process has been down a while I think there is a bit of a flood, which seemed to break things. Is there any chance of this making it into master soon? It would be great to have this working while we wait for zeromq, and to have an option for those who still want encryption in the future. john
Good to hear @choffee. Unfortunately I'm not a project maintainer, so I wouldn't have a say; I hope it gets in. They might not like the new data frame version I added, though -- but that's the only way to fix what was a broken implementation.
I have this issue too, and I still can't figure out how to fix it.
@driskell partial acks don't solve stalls.
I'm not sure what this ticket is about. The original filing was about TLS handshake timeouts. It was later reopened by @YummyLogs reporting a different problem (read ack timeout). @mrteera Your problem reports yet another. Can we close this and open new tickets focused on each kind of error?
Partial acks solve read ack timeouts and solve event duplication (the forwarder resends ALL logs on timeout). If logstash does not push all 1024 spool events (if it's a full spool) through its pipeline within the network timeout of 15 seconds, it will always time out and resend all 1024. I've seen a logstash continually duplicate logs for hours just because a 100 MB log file was added and sent all at once. Brokers alleviate this, but only until the broker gets full too. I admit I've confused the read ack timeout and TLS handshake timeout in other places, though. The TLS handshake timeout is unrelated to acks. I did just recall a stall issue in log courier, same as @myatu's; I'll raise a ticket for you to look into. It causes a failed handshake after a time, but restarting logstash fixes it.
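The arithmetic above can be modeled directly: with a full 1024-event spool and a 15-second timeout, any pipeline slower than roughly 15 ms per event misses the single end-of-spool ack, while partial acks reset the client's deadline as progress is made. A toy Go model of that behavior (a hypothetical illustration, not logstash-forwarder code):

```go
package main

import "fmt"

// resendsSpool models the forwarder's ack wait: the pipeline needs
// perEventMs per event, the server acks every ackEvery events, and the
// client resends the whole spool if no ack arrives within timeoutMs.
// Without partial acks the only ack comes after the entire spool.
func resendsSpool(perEventMs, ackEvery, spoolSize, timeoutMs int) bool {
	sinceLastAckMs := 0
	for i := 1; i <= spoolSize; i++ {
		sinceLastAckMs += perEventMs
		if sinceLastAckMs > timeoutMs {
			return true // client gives up and resends the entire spool
		}
		if i%ackEvery == 0 {
			sinceLastAckMs = 0 // ack received; read deadline resets
		}
	}
	return false
}

func main() {
	// 20 ms per event * 1024 events = ~20.5 s of pipeline work.
	fmt.Println(resendsSpool(20, 1024, 1024, 15000)) // true: single final ack misses the 15 s window
	fmt.Println(resendsSpool(20, 128, 1024, 15000))  // false: partial acks keep resetting the deadline
}
```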
I have a strange issue wherein my logstash forwarder (as a docker container) connects to the logstash server (as a docker container too) only when I reboot the instance running the forwarder. Both are running on Rackspace cloud. When I run the container the first time, I just get the following. Then when I issue a reboot and run the same container again, it connects and can ship logs smoothly. Is it anything related to the firewall or iptables?
Please, please, please fix it. The problem has been around for about 2 years. It's disgusting!
@juise Such comments are unhelpful and unwelcome
Just for the record, the following problem: ... Read error looking for ack: read tcp x.x.x.x:xxxx: i/o timeout ... may appear when you are using a Ruby module, for example one without try/catch blocks, when an exception occurs. No message about it appears in the log. It's very, very bad.
This occurs when the downstream server doesn't acknowledge within the defined timeout period. This is not an error.
I'm not sure what this means.
If it's not an error, why is the only solution to make it work after this "not an error" to restart? I can reproduce it without any problems. The problem may be not on the logstash-forwarder side but on the logstash side.
A restart of what? logstash or logstash-forwarder?
Both: restart logstash and logstash-forwarder. The order doesn't matter.
The problem usually occurs when logstash is overwhelmed by the load logstash-forwarder is giving it, causing the forwarder to give up waiting for a response (i/o timeout) and try to connect to another logstash in case that helps. In almost all circumstances, the problem is due to flow problems in logstash or something downstream of logstash itself (for example, an output's dependency being down, offline, or unavailable). This ticket, about "Failed TLS handshake", doesn't seem related to the 'Read error looking for ack: read tcp x.x.x.x:xxxx: i/o timeout' message you report.
Got this issue without load... (1 req / 10 mins)
@sledorze could it possibly be issues with logstash itself? I have had the issue without load myself, and it is usually a result of logstash's state.
Maybe yes, and what was the fix? (It could be useful for logstash-forwarder.)
It was a number of reasons, usually plugin errors (in my case tcp or elasticsearch) that would cause logstash to halt and have the service just spin it up again. Also, if the forwarder was inhaling a file with a great number of lines, it would sometimes fill up the queue and throw this error even though there were not many events at the time.
Thanks @choffee, modifying the server.rb fixed my issue :)
Faced with this two times in last month. Only restart of
Try modifying server.rb as @choffee suggested. This works in all versions, including the latest 1.5. It looks like the elastic folks aren't interested in this problem.
@juise I'm still sometimes facing the same problem and have to restart.
We are working on solutions that include better messaging on downstream
There's so much discussion of unrelated problems in this ticket that I'm going to close it, because I cannot follow it easily. If you are still affected by something in this ticket, please open a new ticket and provide whatever details you can. Thanks!
Hello,
I suspect this is something dumb, but I can't for the life of me figure out why lumberjack has started to throw these errors.
2013/09/19 12:36:37.093618 Failed to tls handshake with 192.168.51.61:5043 read tcp 192.168.51.61:5043: i/o timeout
2013/09/19 12:36:38.093839 Connecting to 192.168.51.61:5043
2013/09/19 12:36:53.094597 Failed to tls handshake with 192.168.51.61:5043 read tcp 192.168.51.61:5043: i/o timeout
2013/09/19 12:36:54.094838 Connecting to 192.168.51.61:5043
We had been using it for a while with no issues; then Elasticsearch ran out of disk space. We cleaned up the disk and now we are seeing these errors.
Restarting logstash, Elasticsearch, the indexer, and lumberjack made no difference.
thanks
Luke