Read error looking for ack: EOF #293
Comments
driskell
Oct 24, 2014
Contributor
How are you connecting to Logstash - it says elk-cluster - is it via AWS ELB or something else? Or just direct?
robin13
Oct 24, 2014
Member
Connecting direct to logstash (1.4.2) lumberjack input.
Naming for the certificate is perhaps not optimal... :)
robin13
Oct 24, 2014
Member
We're on the same physical network too. I had read-error problems before when load was too high and the logstash agents weren't able to process the events fast enough, but at the moment the cluster is healthy, and this time the oddity is that the slowest machines are experiencing the error, not the ones pumping hundreds or thousands of events per second...
driskell
Oct 24, 2014
Contributor
Have you got many servers connecting? It might be that issue where a client that fails to connect due to a certificate problem can randomly cause other clients to be disconnected. And because it retries the connection every second, it repeatedly throws loads of other clients off.
You can patch for the above using the current version:
https://github.com/elasticsearch/logstash-forwarder/blob/master/lib/lumberjack/server.rb
It sits in logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.20/lib if I remember correctly. The gem hasn't been updated, so Logstash is still using the old one.
If you can, try running Logstash in debug mode; it might tell you what's going on. But I don't think the current gem does any logging, so maybe look at #180 in that case, as it will log who's connecting and who's failing.
robin13
Oct 24, 2014
Member
Hi Jason,
That patch worked! :) You're the bomb! Beers are on me if we ever meet IRL!
I still have to identify the clients which have invalid certificates - there are ~110 logstash-forwarder agents connecting to 10 logstash agents.
How could we get that patch pushed so it is included in the next logstash package?
Cheers,
-Robin-
robin13
closed this
Oct 24, 2014
caseydunham
Dec 19, 2014
I am getting this on CentOS 6.4 with a recent update of logstash, and I built logstash-forwarder from master.
Is this related to an issue with the SSL cert on my logstash server?
jordansissel
Dec 19, 2014
Contributor
"Read error looking for ack: EOF" means likely that the remote (logstash,
or whatever is receiving the conncetion from lsf) closed the connection. By
the time lsf gets to sending data and waiting for acks, I'm reasonably
certain it's already successfully done a tls handshake and is not showing
you a cert error.
jmreicha
Dec 29, 2014
Seeing this issue as well on CentOS 5.9 using LSF 3.1.1 and Logstash 1.4.2. I think I read there will be a fix for this in the newest release of Logstash? I see that there is a beta release out, would it be worth switching over to this release to correct the issue?
rabidscorpio
Feb 12, 2015
I'm seeing this as well with Ubuntu 14.04, LSF 3.1.1 and Logstash 1.4.2. I upgraded to 1.5.0.beta1 but no luck with that either so @jmreicha I don't know if that's going to change.
rabidscorpio
Feb 12, 2015
I also just tried logstash-forwarder 0.4.0 with logstash 1.4.2 to no avail (I made a mistake above; the version should have been LSF 0.3.1).
@rabidscorpio Can you provide the full LSF log output?
richardgoetz
Feb 12, 2015
I also have this problem -
2015/02/12 15:24:04.401619 Read error looking for ack: read tcp x.x.x.x:5000: i/o timeout
I have thought this is an issue with the config: the cluster name in elasticsearch.yml
cluster.name: SOMENAME
not matching the one in 30-lumberjack-output.conf:
output {
elasticsearch {
host => localhost
}
cluster => SOMENAME
stdout { codec => rubydebug }
}
This is keeping data from loading.
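If a cluster-name mismatch like the one above is the culprit, the fix is to make the two names agree. A corrected output sketch (an assumption about the intended config, since the cluster option belongs inside the elasticsearch block; SOMENAME is the placeholder from the comment above):

```
output {
  elasticsearch {
    host => localhost
    cluster => "SOMENAME"   # must match cluster.name in elasticsearch.yml
  }
  stdout { codec => rubydebug }
}
```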
nornagon
Mar 3, 2015
It seems like LSF 0.4.0 (built from master a week or so ago) is not compatible with Logstash 1.4.2?
jmreicha
Mar 5, 2015
I have been troubleshooting this again recently and can see packets from the client (logstash-forwarder) going through the firewall, sending Syn packets endlessly.
Is there anything that I can look at on the logstash server side to see why it is closing the connection (or isn't accepting the syn)?
It is strange; for at least a small period of time, logs are being forwarded correctly.
tuukkamustonen
Mar 6, 2015
I also upgraded to LSF 0.4.0 (with Logstash 1.4.2) and am facing this issue now. Could someone re-open this ticket or is there a new one somewhere?
brendangibat
Mar 6, 2015
I see this issue with Logstash 1.5.0beta1 (also saw it with 1.4.2) and a docker container for LSF from https://registry.hub.docker.com/u/digitalwonderland/logstash-forwarder/
One thing to note is that the connections tend to eventually become stable; however, I have not been able to discern any pattern in which one connects correctly, since a forwarder can fail to connect to one logstash server, then attempt to reconnect and join successfully another time.
jordansissel
Mar 6, 2015
Contributor
"read error looking for ack: EOF"
Let me describe what I believe this message to mean:
- "Looking for ack" - This is a phase of the lumberjack protocol when logstash-forwarder is waiting for a response from Logstash saying, "I got your messages! Send more!"
- "EOF" - This means that, while waiting for, and instead of receiving, an ackknowledgement, the socket connection was terminated. EOF == end of file, meaning the network read was terminated due to a connection closure, probably on the Logstash side.
EXPERIENCING THIS ISSUE?
Please put the following on https://gist.github.com/ and link it here:
- The last 500 lines of the logstash-forwarder log during the time this problem occurs
- The last 500 lines of the receiving Logstash agent's log during the time this problem occurs
- Your logstash config
- Your logstash-forwarder config
- What version of logstash
- What version of logstash-forwarder
- A brief description of the networking scenario between your two servers, for example: Are you using a proxy? Can you also attach the proxy's configuration?
If you aren't using Logstash 1.4.2 and logstash-forwarder 0.4.0, please upgrade and let me know if you are still having problems.
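The EOF condition described above, a read terminating because the peer closed the connection, can be illustrated with a minimal Python sketch (this is not LSF or Logstash code, just an analogy): a server accepts a connection and immediately closes it, the way a crashed or restarted input would, and the client's blocking read returns end-of-file.

```python
import socket
import threading

# Hypothetical local server that accepts and immediately hangs up.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]

def accept_and_close():
    conn, _ = srv.accept()
    conn.close()  # the remote side closes the connection

t = threading.Thread(target=accept_and_close)
t.start()

# The "forwarder" side: connect, then block waiting for a response.
client = socket.socket()
client.connect(("127.0.0.1", port))
data = client.recv(1024)  # returns immediately once the peer closes
t.join()
client.close()
srv.close()

print(repr(data))  # b'' -- end of file: the peer closed the connection
```

An empty read (`b''`) is exactly what a client sees when the server side goes away between handshake and acknowledgement.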
jordansissel
Mar 6, 2015
Contributor
Reopening -
I need more information to make a determination. Please read my above comment and provide as much information as you can. Thank you!
jordansissel
reopened this
Mar 6, 2015
brendangibat
Mar 6, 2015
Thanks for getting back to me so quickly! I'm testing updating to rc2 currently, I'll bring back all the info you need once I complete some changes in the configuration for rc2 deployment.
jmreicha
Mar 6, 2015
Hey @jordansissel thanks for reopening. I am going to put what I have so far in to the following gist. I will add on to this as much as I can.
https://gist.github.com/jmreicha/dc25c3790793ae4a163f
Logstash version is 1.5.0.beta1
logstash-forwarder is/was 0.4.0 built from master a week or so ago.
Everything is running in Docker containers. The logstash stack (ELK) is running on its own server; the containers running the clients are set up in a few different ways. The stuff I am having trouble with is running in Kubernetes, so I don't know if that is maybe partially causing this issue? As far as I know there isn't any proxying. Logs shipped by logstash-forwarder from outside Kubernetes seem to work fine for the most part so far, but I haven't been able to narrow anything down yet.
Here's a few things I've tested so far:
- I am able to hit the public logstash server ports from inside the kubernetes containers with netcat.
- If I run just one container in Kubernetes with the logstash-forwarder shipping logs, things seem to work for the most part. But after I add in another client then that is when I start seeing the connection errors.
- I have tried running log-courier with the same results inside of Kubernetes. One container running the client works but after adding in another, it starts reporting connection errors.
Definitely let me know what other details you need. I have been playing around with tcpdump but don't know if any of the debugs from it would be useful or not.
jordansissel
Mar 6, 2015
Contributor
@jmreicha we fixed a bug on the lumberjack input in 1.5.0 RC2, btw. Not sure if it's your bug, but may be.
jordansissel
Mar 6, 2015
Contributor
running in Kubernetes
I have two hypotheses at this time:
- A bug in the lumberjack input causing connections to be terminated silently in Logstash. This would manifest as an EOF on the logstash-forwarder side.
- A network configuration problem causing connections to be interrupted or terminated (a stateful firewall with a low idle timeout, or something similar)
brendangibat
Mar 6, 2015
@jordansissel For me bumping up to 1.5.0rc2 seems to have corrected the issue; I've not seen the issue repeat today since updating and using the new logstash plugins.
jmreicha
Mar 6, 2015
Oh sheesh, I didn't realize there was a new release out :) I will get that put in place and see if it helps.
The extent of my Kubernetes networking knowledge is limited. I know it uses a network overlay (flannel) for distributed containers to communicate, and I also know it does some things in the background with iptables; I'm just not sure exactly what.
Interestingly enough, with tcpdump running I can see logstash-forwarder traffic coming in to the logstash server.
Maybe my issue is a combination of the two?
jordansissel
Mar 6, 2015
Contributor
I can see logstash-forwarder traffic coming in
When logstash-forwarder says things like Registrar: processing 6 events, it means that it has successfully sent 6 events to Logstash and Logstash has acknowledged all 6. It means your network connectivity worked for one round-trip of a log payload.
If you see these messages, and then later see "read error looking for ack", it means something interrupted the previously-healthy connection.
jordansissel
Mar 6, 2015
Contributor
1.5.0rc2 seems to have corrected the issue
@brendangibat This is very exciting. Thank you for the report!
jmreicha
Mar 6, 2015
Ah that makes sense, thanks for the clarification.
driskell
Jun 25, 2015
Contributor
Ah interesting. The json codec decode actually succeeds - it takes the float at the beginning of the line and returns that, rather than actually failing. This creates an event that is not a set of key-value pairs; it is just a float.
This should be raised in https://github.com/logstash-plugins/logstash-codec-json I think - it really should check that the decoded result is a set of key-value pairs; otherwise it crashes the input, when it should just mark the event as a decode failure.
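The failure mode described above, a lenient JSON decode of a line beginning with a number "succeeding" and yielding a bare float instead of key-value pairs, can be illustrated with Python's prefix-tolerant decoder (an analogy, not the codec's actual implementation):

```python
import json

# A log line that begins with a number but is not a JSON object,
# e.g. a corrupted or plain-text line fed to a JSON decoder.
line = '3.14 some plain-text log message'

# A prefix-tolerant decode consumes the leading float and "succeeds",
# returning a float (and the offset where parsing stopped) rather
# than failing outright.
value, end = json.JSONDecoder().raw_decode(line)

print(value)                    # 3.14
print(isinstance(value, dict))  # False: not a set of key-value pairs
```

Downstream code that assumes every decoded event is a hash of key-value pairs then breaks on such a float, which matches the input-thread crash reported later in this thread.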
vqng
Jun 25, 2015
Correct me if I am wrong, but I thought that, based on the configs, only the grok pattern decoder should be invoked, not the json decoder?
Anyway, I raised an issue at the other repo. Thank you.
vqng referenced this issue Jun 25, 2015: json decoder should raise error when input is invalid #11 (closed)
TristanMatthews
Jun 29, 2015
I seem to be running into this same problem, followed the same DO tutorial as naviehuynh, I have almost exactly the same config file as him from the 22nd.
Are other people able to reproduce this error consistently? I'm seeing it as an intermittent problem where everything works fine and then it breaks. Non-json log statements seem to cause the EOF, but most of the time processing picks up and starts again; only sometimes does it go into the repeating error state that strzelecki-maciek reported.
driskell
Jun 29, 2015
Contributor
I would say don't use the codec if some lines are not json. Use the filter and put a conditional on it to ensure you only decode valid json logs. This is my own method with Log Courier.
Though I have heard people say the filter has issues. But with the plugin separation, hopefully fixing issues is faster, and it's easy to update individual filters in 1.5.
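A sketch of the filter-with-conditional approach described above (the message field name and the leading-brace regex guard are illustrative assumptions, not the exact setup from the comment):

```
filter {
  # only attempt JSON decoding on lines that look like a JSON object
  if [message] =~ /^\{/ {
    json {
      source => "message"
    }
  }
}
```

Lines that fail the guard simply pass through unparsed instead of crashing the input.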
TristanMatthews
Jun 29, 2015
@driskell Thanks, I'll try that. Is your filter actually validating the JSON? Or just filtering for files that should have valid json?
I found the issue because I had a log file with corrupted text (not sure how, but there was weird corrupted text in the middle of a log for a python Traceback of an error.), so I can't assume that I will always have valid json in my logs.
On Friday I was seeing the issue intermittently; over the weekend, with the corrupted log file, I could put the system into the EOF loop state every time I moved a copy of the file into the path, but my system seems to be in a different state today and only sometimes goes into the EOF loop. Just so that I understand, do logs like:
Jun 29 06:20:06 ip-10-0-0-152 logstash-forwarder[16258]: 2015/06/29 06:20:06.542950 Read error looking for ack: EOF
Jun 29 06:20:06 ip-10-0-0-152 logstash-forwarder[16258]: 2015/06/29 06:20:06.543015 Setting trusted CA from file: /etc/pki/tls/certs/logstash-forwarder.crt
Jun 29 06:20:06 ip-10-0-0-152 logstash-forwarder[16258]: 2015/06/29 06:20:06.586836 Connecting to [X.X.X.X]:5000 (logs.example.com)
Jun 29 06:20:06 ip-10-0-0-152 logstash-forwarder[16258]: 2015/06/29 06:20:06.644742 Connected to X.X.X.X
indicate that something went wrong on the logstash server and the connection was closed? Should a properly working system never see that, even if you try to ingest a corrupted file?
driskell
Jun 29, 2015
Contributor
EOF usually means a network problem or a crashed input. Checking the logstash error log will tell you which it was, as it will normally log why it failed.
frutik
Jul 14, 2015
I had the same "Read error looking for ack: EOF" issue. I've started logstash in debug mode
/opt/logstash/bin/logstash agent -f /etc/logstash/conf.d --debug
And noticed that logstash could not connect to ES using default protocol.
Changing it to http resolved my issue
elasticsearch {
host => localhost
index => 'logstash-%{+YYYY.MM}'
+ protocol => http
}
So this is probably one of the possible causes of this famous issue.
pmoust
Jul 29, 2015
Member
I can confirm that explicitly declaring a protocol for elasticsearch output resolved this for us.
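For reference, the explicit-protocol form being confirmed here is along these lines (a sketch based on the earlier comment in this thread; the default in these Logstash versions was the node/transport protocol, which could fail where HTTP succeeds):

```
output {
  elasticsearch {
    host => localhost
    protocol => http   # explicit, rather than relying on the default transport
  }
}
```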
strzelecki-maciek
Aug 5, 2015
[endless loop of]
2015/08/05 13:58:42.609958 Connected to 10.20.0.249
2015/08/05 13:58:42.611478 Read error looking for ack: EOF
2015/08/05 13:58:42.611615 Setting trusted CA from file: /etc/pki/tls/certs/logstash-forwarder.crt
2015/08/05 13:58:42.612001 Connecting to [10.20.0.249]:5517 (10.20.0.249)
[...]
after updating the plugin (bin/plugin update logstash-input-lumberjack)
in the logstash logs (with logstash_forwarder <--> logstash constantly disconnecting) I can see
{:timestamp=>"2015-08-05T13:58:42.173000+0200", :message=>"Exception in lumberjack input thread", :exception=>#<ArgumentError: comparison of String with 1 failed>, :level=>:error}
This only happens for the redis and redis sentinel logs. Every other logfile works fine.
strzelecki-maciek
Aug 5, 2015
On the logstash side, removing codec => json fixed the issue.
kuduk
Aug 6, 2015
Same problem with nxlog on Windows. I resolved it by adding a newline to the end of the certificate.
suyograo added the bug label Aug 10, 2015
Some of the comments here (the recent ones) make me believe this is related to #293
rejia
Aug 14, 2015
Logstash-forwarder: 0.4.0 (client on diff server, collecting logs)
Logstash: 1.5.3 (dedicated server)
Elasticsearch: 1.7.1 (on same server as Logstash)
OS: RHEL 6.6
Java: 1.8
2015/08/14 12:28:29.137179 Socket error, will reconnect: write tcp 10.200.106.101:5043: broken pipe
2015/08/14 12:28:30.137494 Setting trusted CA from file: /etc/pki/tls/certs/logstash-forwarder.crt
2015/08/14 12:28:30.137864 Connecting to [10.200.106.101]:5043 (server107)
2015/08/14 12:28:45.138399 Failed to tls handshake with 10.200.106.101 read tcp 10.200.106.101:5043: i/o timeout
2015/08/14 12:28:46.138739 Connecting to [10.200.106.101]:5043 (server107)
2015/08/14 12:29:00.253962 Connected to 10.200.106.101
2015/08/14 12:29:00.269056 Registrar: processing 1024 events
....
2015/08/14 12:29:06.067480 Registrar: processing 1024 events
2015/08/14 12:29:14.601041 Read error looking for ack: EOF
2015/08/14 12:29:14.601172 Setting trusted CA from file: /etc/pki/tls/certs/logstash-forwarder.crt
2015/08/14 12:29:14.601588 Connecting to [10.200.106.101]:5043 (server107)
2015/08/14 12:29:29.602059 Failed to tls handshake with 10.200.106.101 read tcp 10.200.106.101:5043: i/o timeout
I had the same issue with all the same error messages on the forwarder and indexer. I was able to verify that there were no firewall issues and could see that the tcp connections were working... checked access... tried stopping Logstash, but it wouldn't shut down... finally killed it :-( ... tried bringing it back up... it came up, but with the same "broken pipe" and other connection errors...
After reading github issues 415, 416, and 293, and hearing that some folks had success, I thought I would give it a try. Although not the best option, I stopped all services/connections to Logstash... stopped all the logstash-forwarder instances... stopped Elasticsearch... THEN :-) brought back Logstash first, then all the other services... NO further errors in either of the logs, and Logstash + logstash-forwarder is working fine... so is Elasticsearch, and I can see the messages in Kibana.
rejia
Aug 20, 2015
Back to square one... the moment I touch anything in Logstash, the "broken pipe" happens and I cannot shut down Logstash gracefully:
[root@sever107 software]# service logstash stop
Killing logstash (pid 22234) with SIGTERM
Waiting logstash (pid 22234) to die...
Waiting logstash (pid 22234) to die...
Waiting logstash (pid 22234) to die...
Waiting logstash (pid 22234) to die...
Waiting logstash (pid 22234) to die...
logstash stop failed; still running.
izenk
Sep 8, 2015
Hi.
I'm getting this issue with:
logstash 1.5.4
logstash-forwarder 0.4.0
Env:
logstash and logstash-forwarder are on the same VM
OS: Ubuntu 14.04.2
Logstash config:
input {
stdin {}
lumberjack {
port => 5043
ssl_certificate => "/etc/ssl/certs/server.crt"
ssl_key => "/etc/ssl/private/server.key"
codec => rubydebug {
metadata => true
}
}
}
output {
elasticsearch {
host => localhost
cluster => "log"
index => "logstash-test-%{+YYYY.MM.dd}"
workers => 1
}
}
logstash-forwarder config:
{
"network": {
"servers": [ "my_server:5043" ],
"ssl ca": "/etc/ssl/certs/server.ca.crt",
"timeout": 15
},
"files": [
{
"paths": [ "-" ]
}
]
}
logstash-forwarder starts normally:
2015/09/08 18:00:59.490129 --- options -------
2015/09/08 18:00:59.490280 config-arg: ./logstash-forwarder.conf
2015/09/08 18:00:59.490337 idle-timeout: 5s
2015/09/08 18:00:59.490397 spool-size: 1024
2015/09/08 18:00:59.490425 harvester-buff-size: 16384
2015/09/08 18:00:59.490458 --- flags ---------
2015/09/08 18:00:59.490540 tail (on-rotation): false
2015/09/08 18:00:59.490588 log-to-syslog: false
2015/09/08 18:00:59.490608 quiet: false
2015/09/08 18:00:59.490673 {
"network": {
"servers": [ "my_server:5043" ],
"ssl ca": "/etc/ssl/certs/server.ca.crt",
"timeout": 15
},
"files": [
{
"paths": [ "-" ]
}
]
}
2015/09/08 18:00:59.491422 Waiting for 1 prospectors to initialise
2015/09/08 18:00:59.491510 harvest: "-" (offset snapshot:0)
2015/09/08 18:00:59.491768 All prospectors initialised with 0 states to persist
2015/09/08 18:00:59.492290 Setting trusted CA from file: /etc/ssl/certs/server.ca.crt
2015/09/08 18:00:59.493863 Connecting to [10.1.199.116]:5043 (<my_server>)
2015/09/08 18:00:59.651542 Connected to 10.1.199.116
After that (since the input is STDIN), I try to enter some text and the error appears:
test
2015/09/08 18:02:59.513948 Read error looking for ack: EOF
2015/09/08 18:02:59.514135 Setting trusted CA from file: /etc/ssl/certs/server.ca.crt
2015/09/08 18:02:59.514895 Connecting to [10.1.199.116]:5043 (my_server)
2015/09/08 18:02:59.587461 Connected to 10.1.199.116
2015/09/08 18:02:59.592947 Read error looking for ack: EOF
2015/09/08 18:02:59.593078 Setting trusted CA from file: /etc/ssl/certs/server.ca.crt
2015/09/08 18:02:59.593807 Connecting to [10.1.199.116]:5043 (my_server)
2015/09/08 18:02:59.663463 Connected to 10.1.199.116
2015/09/08 18:02:59.670076 Read error looking for ack: EOF
In logstash in the same time I see errors too:
config LogStash::Codecs::RubyDebug/@metadata = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"111", :method=>"config_init"}
Lumberjack input: unhandled exception {:exception=>#<RuntimeError: Not implemented>, :backtrace=>[
"/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-codec-rubydebug-1.0.0/lib/logstash/codecs/rubydebug.rb:24:in `decode'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-lumberjack-1.0.5/lib/logstash/inputs/lumberjack.rb:77:in `run'",
"org/jruby/RubyProc.java:271:in `call'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-lumberjack-1.0.5/lib/logstash/inputs/lumberjack.rb:105:in `invoke'",
"org/jruby/RubyProc.java:271:in `call'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.24/lib/lumberjack/server.rb:264:in `data'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.24/lib/lumberjack/server.rb:246:in `read_socket'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.24/lib/lumberjack/server.rb:190:in `data_field_value'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.24/lib/lumberjack/server.rb:101:in `feed'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.24/lib/lumberjack/server.rb:206:in `compressed_payload'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.24/lib/lumberjack/server.rb:101:in `feed'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.24/lib/lumberjack/server.rb:239:in `read_socket'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/jls-lumberjack-0.0.24/lib/lumberjack/server.rb:224:in `run'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-input-lumberjack-1.0.5/lib/logstash/inputs/lumberjack.rb:104:in `invoke'",
"org/jruby/RubyProc.java:271:in `call'",
"/opt/logstash/vendor/bundle/jruby/1.9/gems/concurrent-ruby-0.9.1-java/lib/concurrent/executor/executor_service.rb:515:in `run'",
"Concurrent$$JavaExecutorService$$Job_1733551258.gen:13:in `run'"
], :level=>:error, :file=>"logstash/inputs/lumberjack.rb", :line=>"117", :method=>"invoke"}
About my network:
I get this error when running both LS and LSF on the same VM.
my_server points to its IP through the /etc/hosts file.
I also get this error when running LS and LSF on different VMs.
ph
Sep 8, 2015
Member
Can you try using the rubydebug codec in a stdout output instead of the lumberjack input?
This config might help:
input {
stdin {}
lumberjack {
port => 5043
ssl_certificate => "/etc/ssl/certs/server.crt"
ssl_key => "/etc/ssl/private/server.key"
}
}
output {
stdout {
codec => rubydebug { metadata => true }
}
elasticsearch {
host => "localhost"
cluster => "log"
index => "logstash-test-%{+YYYY.MM.dd}"
workers => 1
}
}
izenk
Sep 8, 2015
Yes, that works, thank you. The problem is that the rubydebug codec does not implement the decode method, and that method is called when a codec is added in the input section.
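A minimal sketch of that failure mode (the class and method bodies here are illustrative, not the actual plugin source): an encode-only codec raises as soon as an input calls decode on it, matching the `RuntimeError: Not implemented` in the backtrace above. The fix, as suggested in the previous comment, is to keep rubydebug on an output, where only encode is needed.

```ruby
# Hypothetical encode-only codec, mirroring the behaviour described above:
# rubydebug is meant for outputs (encode), so decode is not implemented.
class EncodeOnlyCodec
  def decode(data)
    # An input section calls decode to turn incoming bytes into events.
    raise "Not implemented"
  end

  def encode(event)
    event.inspect # roughly what rubydebug does: pretty-print the event
  end
end

codec = EncodeOnlyCodec.new
puts codec.encode({ "message" => "test" }) # encoding works fine

begin
  codec.decode("test\n")                   # decoding raises, killing the input thread
rescue RuntimeError => e
  puts "decode failed: #{e.message}"
end
```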
ruflin added the filebeat label Sep 16, 2015
Salmani
Sep 29, 2015
Hi All,
FYI for anyone potentially googling to this bug (like me):
In my case, the "Read error looking for ack: EOF" in the logstash-forwarder log was not a bug in the code: I was getting it because the Logstash server could not connect to the output Elasticsearch servers, due to a configuration error in Logstash/ES/firewall on my part. Judging from the logs, this prevented Logstash from flushing events to the output server and kept disconnecting the logstash-forwarder client, with errors logged in both logstash.log and the logstash-forwarder log.
This error in the logstash logs pointed to the real problem:
{:timestamp=>"2015-09-29T04:04:35.592000+0000", :message=>"Got error to send bulk of actions: connect timed out", :level=>:error}
{:timestamp=>"2015-09-29T04:04:35.592000+0000", :message=>"Failed to flush outgoing items"
Once I fixed the Elasticsearch server settings in my Logstash config (+ ES + firewall), it started to work fine.
LogStash : 1.5.4
Logstash Forwarder : 0.4.0
ES Version : 1.7.2
Forwarder log:
2015/09/29 14:03:16.495311 Connected to x.x.x.x
2015/09/29 14:03:21.499879 Read error looking for ack: EOF
2015/09/29 14:03:21.499981 Setting trusted CA from file: x.crt
2015/09/29 14:03:21.513262 Connecting to x.x.x.x
2015/09/29 14:03:21.571949 Connected to x.x.x.x
2015/09/29 14:03:21.574897 Read error looking for ack: EOF
Logstash log:
{:timestamp=>"2015-09-29T04:04:13.661000+0000", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2015-09-29T04:04:14.162000+0000", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
{:timestamp=>"2015-09-29T04:04:14.662000+0000", :message=>"Lumberjack input: the pipeline is blocked, temporary refusing new connection.", :level=>:warn}
Thanks,
Salman
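To illustrate the failure chain Salmani describes (the hostname below is a placeholder): if the elasticsearch output cannot reach its target, events cannot flush, the pipeline blocks, the lumberjack input starts refusing connections, and the forwarder sees "Read error looking for ack: EOF".

```
output {
  elasticsearch {
    host => "es.example.com"   # if this host is unreachable: "Failed to flush outgoing items"
    cluster => "log"
    index => "logstash-test-%{+YYYY.MM.dd}"
  }
}
```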
jstangroome referenced this issue Oct 5, 2015: High CPU on forwarder when the server disconnects quickly #535 (Closed)
jamesblackburn referenced this issue Dec 10, 2015: Logstash 2.1.1 immediate OOM when running simple lumberjack input (regression) #4333 (Open)
added a commit to jamesblackburn/logstash-issue that referenced this issue Dec 11, 2015
jamesblackburn
Dec 11, 2015
I've managed to reproduce this reliably with logstash-forwarder 0.4.0 and logstash 2.1.1:
https://github.com/jamesblackburn/logstash-issue/blob/master/logstash-issue.zip
What I've seen is that copy-truncate can produce a very large Java GC log.
This log starts with a bunch of NULL bytes. Running logstash-forwarder on it fails to make any progress:
Logstash
$ LOGSTASH_VERSION=1.5.5 logstash --config=./logstash-receiver.conf
Logstash startup completed
Logstash-forwarder
$ /apps/research/tools/logstash-forwarder/0.4.0/logstash-forwarder -config=logstash-forwarder.conf
2015/12/11 09:46:20.716506 --- options -------
2015/12/11 09:46:20.716600 config-arg: logstash-forwarder.conf
2015/12/11 09:46:20.716617 idle-timeout: 5s
2015/12/11 09:46:20.716623 spool-size: 1024
2015/12/11 09:46:20.716629 harvester-buff-size: 16384
2015/12/11 09:46:20.716633 --- flags ---------
2015/12/11 09:46:20.716638 tail (on-rotation): false
2015/12/11 09:46:20.716643 log-to-syslog: false
2015/12/11 09:46:20.716648 quiet: false
2015/12/11 09:46:20.716976 {
"network": {
"servers": [ "logs.dev.man.com:8999"
],
"timeout": 30,
"ssl key": "/ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.key",
"ssl ca": "/ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt"
},
"files": [
{
"paths": [ "/users/is/jblackburn/logstash-issue/logs/*.log"
],
"fields": { "type": "lumberjack", "service": "%%SVC_HOST%%" }
}
]
}
2015/12/11 09:46:20.717926 Waiting for 1 prospectors to initialise
2015/12/11 09:46:20.718315 Launching harvester on new file: /users/is/jblackburn/logstash-issue/logs/elasticsearch-gc.log
2015/12/11 09:46:20.718714 harvest: "/users/is/jblackburn/logstash-issue/logs/elasticsearch-gc.log" (offset snapshot:0)
2015/12/11 09:46:20.729481 All prospectors initialised with 0 states to persist
2015/12/11 09:46:20.729652 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:46:20.731115 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:46:21.105701 Connected to 10.220.53.55
2015/12/11 09:46:36.549103 Read error looking for ack: EOF
2015/12/11 09:46:36.549171 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:46:36.549920 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:46:36.634797 Connected to 10.220.53.55
2015/12/11 09:46:40.746815 Read error looking for ack: EOF
2015/12/11 09:46:40.746883 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:46:40.747473 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:46:40.888935 Connected to 10.220.53.55
2015/12/11 09:46:44.560995 Read error looking for ack: EOF
2015/12/11 09:46:44.561062 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:46:44.563795 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:46:44.697986 Connected to 10.220.53.55
2015/12/11 09:46:48.477937 Read error looking for ack: EOF
2015/12/11 09:46:48.478011 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:46:48.478623 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:46:48.616089 Connected to 10.220.53.55
2015/12/11 09:46:52.778987 Read error looking for ack: EOF
2015/12/11 09:46:52.779054 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:46:52.779699 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:46:52.910973 Connected to 10.220.53.55
2015/12/11 09:46:57.065908 Read error looking for ack: EOF
2015/12/11 09:46:57.065978 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:46:57.066617 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:46:57.198152 Connected to 10.220.53.55
2015/12/11 09:47:01.254964 Read error looking for ack: EOF
2015/12/11 09:47:01.255034 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:47:01.255556 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:47:01.391207 Connected to 10.220.53.55
2015/12/11 09:47:05.499504 Read error looking for ack: EOF
2015/12/11 09:47:05.499574 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:47:05.500155 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:47:05.589293 Connected to 10.220.53.55
2015/12/11 09:47:09.039561 Read error looking for ack: EOF
2015/12/11 09:47:09.039629 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:47:09.040237 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:47:09.169342 Connected to 10.220.53.55
2015/12/11 09:47:13.254280 Read error looking for ack: EOF
2015/12/11 09:47:13.254347 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:47:13.254948 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:47:13.384358 Connected to 10.220.53.55
2015/12/11 09:47:17.729627 Read error looking for ack: EOF
2015/12/11 09:47:17.729698 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:47:17.730253 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:47:17.860438 Connected to 10.220.53.55
2015/12/11 09:47:22.081924 Read error looking for ack: EOF
2015/12/11 09:47:22.081992 Setting trusted CA from file: /ahlinfra/monit/bin/../conf/logs.dev.man.com/../common/logs.dev.man.com.crt
2015/12/11 09:47:22.082687 Connecting to [10.220.53.55]:8999 (logs.dev.man.com)
2015/12/11 09:47:22.211463 Connected to 10.220.53.55
ruflin
Dec 27, 2015
@jamesblackburn Interesting issue. I didn't test it yet, but I assume this issue also exists in filebeat (https://github.com/elastic/beats/tree/master/filebeat). On the one hand it seems like a log rotation issue, but on the other hand the log crawler should be able to "overcome" it. Did you find a solution to the problem, or is it still an issue? If it is, would you mind testing it with filebeat and, in case the issue persists, opening an issue there?
jamesblackburn
Jan 12, 2016
Thanks @ruflin
It's still an issue with these files. The workaround is to disable logrotate on the Java GC out files.
The biggest issue we have is actually on the JRuby Logstash end. I'm using exactly the config above - the receiver just pipes data directly to files, no filtering or parsing. Unfortunately the receivers are very heavyweight and can't keep up with the write load. I've got 25 hosts with forwarders, and I'm running 8 receivers. The problems I've seen:
- if there are too few receivers, you end up exhausting the number of threads as forwarders connect, time out, and reconnect; in the meantime the Java process leaks threads for a long period of time
- receiving input and output are single-threaded
- the receivers need to be restarted periodically due to some leak, or otherwise they eventually grind to a halt
At this point I'm not sure that logstash scales, I'm considering Heka which claims to handle 10Gb/s/node(!): https://blog.mozilla.org/services/2013/04/30/introducing-heka/
Looking at the history on my receiver box it's only doing 3MB/s (~20Mb/s) which is orders of magnitude less
ruflin
Jan 12, 2016
@jamesblackburn As you mentioned yourself, it looks more like a Logstash issue. Perhaps it's better to start a discussion here: https://discuss.elastic.co/c/logstash
ph
Jan 12, 2016
Member
He created this issue elastic/logstash#4333
And since the test case is provided, I will check what I can do to improve the situation. (The log is pretty big!)
> In the mean time the java process leaks threads for a long period of time
> receiving input and output are single threaded
The receiving input is multi-threaded; only the TCP accept loop is not - each connection has its own thread. But you are correct that the output to file is single-threaded.
jamesboorn
Feb 8, 2016
I have hit the dreaded 'Read error looking for ack: EOF' and 'Connecting to' loop of death. We tracked it down to what looks like a parse error on the Logstash side, caused by a log message that begins with a number followed by whitespace.
I reproduced the issue with a simple Logstash setup (lumberjack input and file output, no filtering and no codec specified). When the line in the file logstash-forwarder is watching begins with a number followed by whitespace, the non-recoverable error condition occurs.
For example, this will cause the error to occur:
echo '3 you are going down' >> file_forwarder_is_watching
This will not cause the error to occur:
echo '3you are still up' >> file_forwarder_is_watching
Using logstash-1.5.4 and logstash-forwarder-0.4.0.
robin13 commented Oct 24, 2014
Since upgrading to [0.3.1 / 0632ce](https://github.com/elasticsearch/logstash-forwarder/commit/0632ce3952fb4e941ec520d9fad05a3e10955dc4) I've been getting this error a lot, but strangely only on the boxes with relatively little activity. The boxes sending hundreds of events per second never have this error, but where fewer events are being sent it looks like this:
As you can see, it is always exactly 10 seconds after the last log event, even though I've set the timeout to 30s in logstash-forwarder.conf.
Any ideas what could be going on here?
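For reference, the timeout in question lives in the network section of logstash-forwarder.conf; a minimal example (hostname and paths are placeholders):

```
{
  "network": {
    "servers": [ "logstash.example.com:5043" ],
    "ssl ca": "/etc/pki/tls/certs/logstash-forwarder.crt",
    "timeout": 30
  },
  "files": [
    { "paths": [ "/var/log/messages" ] }
  ]
}
```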