Logstash lost data during log rotate #214

Closed
Tsukiand opened this issue Sep 17, 2018 · 4 comments

Comments

@Tsukiand

commented Sep 17, 2018

I am using logstash-input-file (4.1.4) to ingest data from a file, and I found data loss during log rotation.
I have configured logrotate to keep 3 rotated files, and a file is rotated once it exceeds 1k.

My logrotate configuration:
{
    missingok
    size 1k
    notifempty
    sharedscripts
    rotate 3
}

My script to generate log and rotate:
for (( i=1 ; i <= 100000; i++ ))
do
    echo "$i this is a bunch of test data blah blah" >> /tmp/log/test

    # Pause for a second after every 1000 lines.
    if ! ((i % 1000)); then
        sleep 1
    fi

    # Force a rotation in the background every 30000 lines.
    if ! ((i % 30000 || i == 100000)); then
        /usr/sbin/logrotate -f /etc/logrotate.d/test &
    fi
done

My configuration of logstash:
input {
  file {
    path => "/tmp/log/test*"
  }
}

output {
  file {
    path => "/tmp/output.txt"
    codec => line { format => "custom format: %{message}" }
  }
}

The data loss happens as follows:
I found that the creation of the new "log" file causes the data loss. I checked the source code and found that the new "log" file loses some lines at its beginning (the seek operation in create_initial.rb causes this). This means that lines written to the "log" file during rotation are lost.

Please give me some advice on this issue.

Thanks,
Tsukiand

@Tsukiand Tsukiand closed this Sep 17, 2018

@Tsukiand Tsukiand reopened this Sep 17, 2018

@lrbsunday


commented Sep 26, 2018

I got the same issue, any ideas?

@Tsukiand

Author

commented Sep 26, 2018

@lrbsunday I have tested with the input path set to "/tmp/log/test" and to "/tmp/log/test*". Data is lost in both cases.
With "/tmp/log/test", filewatch only monitors the "test" file, and Logstash loses data during file rotation.
With "/tmp/log/test*", filewatch monitors "test", "test.1" and "test.2". Logstash does not lose data from the rotated files, but data is still lost. Here is how the loss happens:

  1. We have test, test.1, test.2 and test.3.
  2. File rotation happens.
    2.1 Rotation step 1: test.2 becomes test.3 (the new test.3 rotates from the old test.2, no data loss)
    2.2 Rotation step 2: test.1 becomes test.2 (the new test.2 rotates from the old test.1, no data loss)
    2.3 Rotation step 3: test becomes test.1 (the new test.1 rotates from the old test, no data loss)
    2.4 Rotation step 4: a new test is created (because the old test became test.1, the watched_file changed, so the new test is rotated as an initial file and seeks to the current size. This seek operation results in the data loss)
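
The effect of the seek in step 2.4 can be reproduced outside Logstash. The following is a minimal sketch, not filewatch code: `read_lines_from` and `simulate_new_file_after_rotation` are hypothetical helpers, and the temporary file stands in for the new "test" file. It assumes the writer appends two lines before the watcher notices the file, then compares reading from the size at discovery with reading from offset 0.

```ruby
require "tempfile"

# Hypothetical helper (not part of filewatch): read the lines of `path`
# starting at byte `offset`.
def read_lines_from(path, offset)
  File.open(path, "r") do |io|
    io.seek(offset)
    io.each_line.map(&:chomp)
  end
end

# Hypothetical simulation of the new "test" file created by a rotation.
def simulate_new_file_after_rotation
  Tempfile.create("test") do |log|
    # Lines appended before the watcher discovers the new file.
    log.write("1 written before discovery\n2 written before discovery\n")
    log.flush
    size_at_discovery = File.size(log.path)

    # A line appended after discovery.
    log.write("3 written after discovery\n")
    log.flush

    {
      from_size: read_lines_from(log.path, size_at_discovery), # seek to current size
      from_zero: read_lines_from(log.path, 0)                  # start at the beginning
    }
  end
end

result = simulate_new_file_after_rotation
puts "seek to size at discovery: #{result[:from_size].inspect}"
puts "seek to 0:                 #{result[:from_zero].inspect}"
```

Under these assumptions, the reader that seeks to the size at discovery never sees the first two lines, which matches the loss described in step 2.4.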

I have added a flag (:rotate_flag) in "watched_file.rb" to avoid the data loss, but I am not sure whether my change will cause other issues. Maybe you can give me some advice.

attr_reader :bytes_read, :state, :file, :buffer, :recent_states, :bytes_unread, :rotate_flag
attr_reader :path, :accessed_at, :modified_at, :pathname, :filename
attr_reader :listener, :read_loop_count, :read_chunk_size, :stat
attr_reader :loop_count_type, :loop_count_mode
attr_accessor :last_open_warning_at

def initialize(pathname, stat, settings)
  @settings = settings
  @pathname = Pathname.new(pathname)
  @path = @pathname.to_path
  @filename = @pathname.basename.to_s
  full_state_reset(stat)
  watch
  set_standard_read_loop
  set_accessed_at
  # added: no rotation has been recorded for this file yet
  @rotate_flag = false
end

def flag?
  @rotate_flag
end

def set_flag
  @rotate_flag = true
end

def rotate_from(other)
  # move all state from other to this one
  set_standard_read_loop
  file_close
  @bytes_read = other.bytes_read
  @bytes_unread = other.bytes_unread
  @listener = nil
  @initial = false
  @recent_states = other.recent_states
  @accessed_at = other.accessed_at
  if !other.delayed_delete?
    # we don't know if a file exists at the other.path yet
    # so no reset
    other.full_state_reset
    # added: mark other as having been rotated away from
    other.set_flag
  end
  set_stat PathStatClass.new(pathname)
  ignore
end

def rotate_as_initial_file
  # rotation, when no sincedb record exists for new inode - we have never seen this inode before.
  rotate_as_file
  if !flag?
    @initial = true
  end
  #@initial = true
end

@guyboertje

Contributor

commented Oct 1, 2018

The temporary workaround is to use start_position => "beginning", as this forces processing to start at the beginning of the latest file.
However, it is a bug that this should be necessary. The docs say "If you have old data you want to import, set this to 'beginning'", but clearly this is not old data.
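
Applied to the input from the original report, the workaround would look roughly like the fragment below. This is a sketch: only the start_position line is new, and the path is the reporter's.

```
input {
  file {
    path => "/tmp/log/test*"
    # Workaround: read newly discovered files from their first byte.
    start_position => "beginning"
  }
}
```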

@Tsukiand

Author

commented Oct 8, 2018

@guyboertje

Thanks for your reply. I have tested with start_position => "beginning" and it works. But as you said, it is a temporary workaround. Maybe we need a fix.

guyboertje added a commit to guyboertje/logstash-input-file that referenced this issue Oct 25, 2018
guyboertje added a commit that referenced this issue Oct 29, 2018
Force all files under rotation to start at 0 or at the sincedb record. (#217)

* Force all files under rotation to start at 0 or at the sincedb record.
* Update travis.yml to update versions.

Fixes #214
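
The rule named in the commit message ("start at 0 or at the sincedb record") can be illustrated with a small model. This is a sketch under stated assumptions and not the code from the commit: `start_offset` is a hypothetical function, the sincedb is modelled as a plain Hash from an inode key to the last byte offset read, and the `rotated` flag is assumed to be known by the caller.

```ruby
# Hypothetical model of choosing where to start reading a file.
# sincedb: Hash mapping an inode key to the last byte offset read (assumption).
def start_offset(sincedb, inode, rotated:, file_size:, start_position: "end")
  # A known inode resumes from its recorded position.
  return sincedb[inode] if sincedb.key?(inode)
  # An unknown file that appeared through rotation is read from the first byte,
  # as is any unknown file when start_position is "beginning".
  return 0 if rotated || start_position == "beginning"
  # Otherwise an unknown file is tailed from its current size.
  file_size
end

sincedb = { "inode-1" => 120 }
puts start_offset(sincedb, "inode-1", rotated: true,  file_size: 500)
puts start_offset(sincedb, "inode-2", rotated: true,  file_size: 500)
puts start_offset(sincedb, "inode-2", rotated: false, file_size: 500)
```

In this model the new "test" file created by a rotation has no sincedb record, so it starts at 0 instead of at its current size, whatever start_position is set to.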
3 participants