
logstash stop reading new lines #205

Closed
ThomasLau opened this issue Sep 4, 2018 · 7 comments
@ThomasLau

ThomasLau commented Sep 4, 2018

Hi guys, here is the system info.

  • Version: logstash:5.6.0, logstash-input-file: 4.0.3 or 4.1.5
  • Operating System: centos 7
  • Config File (if you have sensitive info, please remove it):
input {
    file {
        path => "/opt/logs/data_null.*"
        start_position => "beginning"
        type => "dc-zx"
        sincedb_write_interval => "1"
        sincedb_path => "/data/logstash/.sincedb"
        discover_interval => "1"
    }
}
filter {
    metrics {
        meter => "events@%{[type]}@%{[path]}"
        meter => "events@%{[type]}@sum"
        add_tag => "metric"
        add_field => { "group" => "xxx" }
        add_field => { "collect_time" => "%{+YYYY.MM.dd HH:mm:ss}" }
        add_field => { "host" => "%{message}" }
        remove_field => ["message"]
        flush_interval => 60
        clear_interval => 86401
    }
}
output {
    if "metric" in [tags] {
        stdout { codec => rubydebug }
    } else {
        kafka {
            ...
            compression_type => "snappy"
            codec => plain {
                format => "%{message}"
            }
        }
    }
}
  • Sample Data:
    each line is a JSON string; each line may be 2 kB-2 MB in size
  • Steps to Reproduce:
    occurs occasionally, about every 3-6 days

I changed the log level to DEBUG and found only this:

.....

[2018-09-04T12:59:16,989][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2018-09-04T12:59:16,990][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2018-09-04T12:59:16,991][DEBUG][filewatch.sincedbcollection] writing sincedb (delta since last write = 1)
[2018-09-04T12:59:17,006][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2018-09-04T12:59:17,007][DEBUG][filewatch.sincedbcollection] writing sincedb (delta since last write = 1)
[2018-09-04T12:59:18,021][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2018-09-04T12:59:18,023][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2018-09-04T12:59:18,023][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2018-09-04T12:59:18,023][DEBUG][filewatch.sincedbcollection] writing sincedb (delta since last write = 1)
[2018-09-04T12:59:18,038][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk
[2018-09-04T12:59:18,940][DEBUG][logstash.agent] Reading config file {:config_file=>"/data/logstash/logstash-5.6.0_fix/config/dc-zx.conf"}
[2018-09-04T12:59:18,940][DEBUG][logstash.agent] no configuration change for pipeline {:pipeline=>"main"}
[2018-09-04T12:59:21,254][DEBUG][logstash.pipeline] Pushing flush onto pipeline
[2018-09-04T12:59:21,262][DEBUG][logstash.pipeline] Flushing {:plugin=>#<LogStash::FilterDelegator:0x75f79e76 @metric_events_out

....

The sincedb was last updated at 2018-09-04T12:59:18, but new lines tailed from this file were ignored after that.

From the log above, it seems that the logstash filewatch.tailmode handlers found the new lines but didn't write them to the output.

However, everything works fine after I restart logstash, until it hits the same log pattern as above again.

@ThomasLau
Author

ThomasLau commented Sep 5, 2018

Hi, I found that the problem may be inode number reuse, but I'm not sure about that; I'll look at it later...

@ThomasLau
Author

ThomasLau commented Sep 8, 2018

It's not an inode-number-reuse problem: logstash still stopped reading new lines nearly 46 hours later, even with "sincedb_clean_after => 1.5" set.

@urizaiden

Hey guys, I'm having the same problem with logstash 6.4.0 on AMAZON LINUX 2

-bash-4.2# cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
-bash-4.2# /usr/share/logstash/bin/logstash --version
logstash 6.4.0

-bash-4.2# cat /etc/logstash/conf.d/servicename.conf
input {
  file {
    path => "/path/to/log/servicename/_logs/servicename.log"
  }
}

filter{
  json{
    source => "message"
  }
}

output {
  elasticsearch {
    hosts => ["elasticsearch.endpoint.dns:9200"]
    index => "logstash-servicename-%{+YYYY.MM.dd}"
    document_type => "logs"
  }
}

-bash-4.2# cat /etc/logstash/logstash.yml
node.name: service-name
path.data: /var/lib/logstash
pipeline.workers: 18
pipeline.output.workers: 4
pipeline.batch.size: 450
path.config: /etc/logstash/conf.d
path.logs: /var/log/

I changed the log level to trace, and all I get are these two lines repeating themselves:

[2018-10-29T08:16:42,838][TRACE][filewatch.tailmode.handlers.grow] reading... {"iterations"=>323627, "amount"=>32768, "filename"=>"service-name.log"}
[2018-10-29T08:16:42,838][DEBUG][filewatch.tailmode.handlers.grow] read_to_eof: get chunk

@guyboertje
Contributor

@ThomasLau

I know what this bug is.

Are you using logrotate? The copy/truncate strategy can cause this: the file is unexpectedly truncated while the file input is still looping through the remaining unread content, and it gets stuck in that loop.
The fix to exit the stuck loop is not hard, but note that because the copied content has a new filename and a new inode, the file input can't know that it is the old content from before and finish reading it - those unread bytes are lost.
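For reference, this is roughly what a logrotate copy/truncate setup looks like (the path and option choices here are illustrative, not taken from this thread); copytruncate truncates the live file in place, which is exactly the operation that can race with the file input's read loop:

```
# Illustrative logrotate stanza - not from this thread.
/opt/logs/data_null.log {
    daily
    rotate 7
    compress
    # copytruncate copies the file and then truncates the original in
    # place; the truncation can happen while logstash-input-file is
    # mid-read, triggering the stuck loop described above.
    copytruncate
}
```

Using create instead of copytruncate (the old file is renamed and a fresh one created) avoids in-place truncation, at the cost of requiring the logging application to reopen its log file after rotation.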

guyboertje pushed a commit to guyboertje/logstash-input-file that referenced this issue Nov 26, 2018
The processing is lagging behind, i.e. bytes read is less than the file size.

1. The grow handler begins a read of the remaining bytes from an offset - as a loop B inside loop A.
2. While the code is in loop B, the file is truncated and sysread raises an EOF error that breaks out of loop B, but the watched_file thinks it still has more to read (the truncation has not been detected at this time), so loop A retries loop B again without success.
Fixes logstash-plugins#205
guyboertje pushed a commit that referenced this issue Nov 27, 2018
…or. (#221)

* Fix endless outer loop when inner loop gets an error.

The processing is lagging behind, i.e. bytes read is less than the file size.

1. The grow handler begins a read of the remaining bytes from an offset - as a loop B inside loop A.
2. While the code is in loop B, the file is truncated and sysread raises an EOF error that breaks out of loop B, but the watched_file thinks it still has more to read (the truncation has not been detected at this time), so loop A retries loop B again without success.

Fixes #205

* change to flag_read_error
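To make the two-loop failure mode concrete, here is a minimal, self-contained Ruby sketch of the bug and the flag_read_error-style fix. This is not the plugin's actual code - WatchedFile and read_to_eof are simplified stand-ins; the point is that an EOFError from a truncated file must set a flag, or the outer loop keeps retrying the inner read forever because recorded size still exceeds bytes read.

```ruby
require 'stringio'

# Simplified stand-in for the plugin's watched_file state: it remembers
# how large the file was last seen and how many bytes have been read.
class WatchedFile
  attr_accessor :bytes_read, :recorded_size, :read_error

  def initialize(recorded_size)
    @bytes_read = 0
    @recorded_size = recorded_size
    @read_error = false
  end

  # Without the @read_error check, a truncated file (actual size smaller
  # than recorded_size) keeps this true forever - the endless loop A.
  def more_to_read?
    !@read_error && @bytes_read < @recorded_size
  end

  def flag_read_error
    @read_error = true
  end
end

# Loop A retries while the watched file believes bytes remain; loop B
# syreads chunks until EOFError, which is what a truncation produces.
def read_to_eof(watched_file, io, chunk_size: 4)
  attempts = 0
  while watched_file.more_to_read?        # loop A
    attempts += 1
    begin
      loop do                             # loop B
        chunk = io.sysread(chunk_size)
        watched_file.bytes_read += chunk.bytesize
      end
    rescue EOFError
      # The fix: record the error so loop A stops retrying loop B.
      watched_file.flag_read_error
    end
  end
  attempts
end

# The file was recorded at 20 bytes, then truncated down to 6 bytes.
wf = WatchedFile.new(20)
attempts = read_to_eof(wf, StringIO.new('short!'))
```

With the flag, the outer loop makes a single pass, reads the 6 surviving bytes, and exits; without it, more_to_read? would stay true (6 < 20) and loop A would spin indefinitely.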
@guyboertje
Contributor

@ThomasLau
v4.1.8 has been released. Please run bin/logstash-plugin install --version 4.1.8 logstash-input-file to verify that the fix works for you.

@brydoncheyney-slc

brydoncheyney-slc commented Nov 30, 2018

@guyboertje we're observing exactly the same behaviour, occurring daily.

  • Version: logstash:6.5.1, logstash-input-file: 4.1.6
  • Operating System: RHEL6

Since switching to logstash-file-input plugin 4.1.8 two days ago, we haven't seen this happen again.

edit: I hadn't noticed that this issue had been closed - apologies!

@guyboertje
Contributor

@brydoncheyney-slc

It is not only OK to confirm the fix in a closed issue, it is good because future readers get to see the confirmation. Thanks.
