Skip to content
This repository has been archived by the owner on Apr 2, 2024. It is now read-only.

Logstreamer doesn't notice new file in a stream since previous empty one was deleted #1199

Closed
validname opened this issue Nov 20, 2014 · 9 comments

Comments

@validname
Copy link
Contributor

Hello!

I've found a bug. I've checked it on the last commit 8941404 without our patches.

How to reproduce:

  1. Test configuration
[in]
type="LogstreamerInput"
log_directory="/data/logs/"
file_match='z\.in'
rescan_interval="1000ms"

[PayloadEncoder]

[out]
type="FileOutput"
message_matcher="TRUE"
encoder="PayloadEncoder"
path="/data/logs/z.out"
  1. Actions
# start heka, check the files:
root@frontend6:/etc/heka# ls -la  /data/logs/z.*
-rw-r--r-- 1 root root 0 2014-11-20 14:55 /data/logs/z.out
# so, there is no input file and empty output one

# check which files heka is holding (sorry, it's a Linux and I'm reading procfs)
root@frontend6:/etc/heka# ls -la /proc/`pgrep hekad`/fd| grep '/data/logs/z\.'
l-wx------ 1 root root 64 2014-11-20 14:55 7 -> /data/logs/z.out
# still only output file, OK

# create new input file
root@frontend6:/etc/heka# touch /data/logs/z.in
root@frontend6:/etc/heka# ls -la /proc/`pgrep hekad`/fd| grep '/data/logs/z\.'
lr-x------ 1 root root 64 2014-11-20 14:56 3 -> /data/logs/z.in
l-wx------ 1 root root 64 2014-11-20 14:55 7 -> /data/logs/z.out
# heka found and opened it

# delete it (imitate log rotation)
root@frontend6:/etc/heka# rm /data/logs/z.in
root@frontend6:/etc/heka# ls -la /proc/`pgrep hekad`/fd| grep '/data/logs/z\.'
lr-x------ 1 root root 64 2014-11-20 14:56 3 -> /data/logs/z.in (deleted)
l-wx------ 1 root root 64 2014-11-20 14:55 7 -> /data/logs/z.out

# create new file
root@frontend6:/etc/heka# touch /data/logs/z.in
root@frontend6:/etc/heka# ls -la /proc/`pgrep hekad`/fd| grep '/data/logs/z\.'
lr-x------ 1 root root 64 2014-11-20 14:56 3 -> /data/logs/z.in (deleted)
l-wx------ 1 root root 64 2014-11-20 14:55 7 -> /data/logs/z.out
# but heka didn't notice it

# add content to the new file
root@frontend6:/etc/heka# date >> /data/logs/z.in
root@frontend6:/etc/heka# ls -la /proc/`pgrep hekad`/fd| grep '/data/logs/z\.'
lr-x------ 1 root root 64 2014-11-20 14:56 3 -> /data/logs/z.in (deleted)
l-wx------ 1 root root 64 2014-11-20 14:55 7 -> /data/logs/z.out
# but heka still didn't notice it. It's a bug.

root@frontend6:/etc/heka# ls -la  /data/logs/z.*
-rw-r--r-- 1 root root 30 2014-11-20 14:56 /data/logs/z.in
-rw-r--r-- 1 root root  0 2014-11-20 14:55 /data/logs/z.out
# and the output file still empty

# would heka notice the new file after restart?
root@frontend6:/etc/heka# /etc/init.d/ngs.heka restart
Stopping hekad: OK.
Starting hekad: OK.
root@frontend6:/etc/heka# ls -la  /data/logs/z.*
-rw-r--r-- 1 root root 30 2014-11-20 15:23 /data/logs/z.in
-rw-r--r-- 1 root root 31 2014-11-20 15:23 /data/logs/z.out
# yes, heka noticed it. It's a normal file and I have workable configuration.
@validname validname changed the title Logstreamer doesn't notice new file since previous empty one was deleted Logstreamer doesn't notice new file in a stream since previous empty one was deleted Nov 20, 2014
@validname
Copy link
Contributor Author

It's seems that we have erroneous loop in this case:

  • on new file function SeekInFile() returns file descriptor (aka "hash match") because of stored seekPosition=0,
  • hence, FileHashMismatch() returns false ("hash match") further,
  • and NewerFileAvailable() returns false, too.
    And this lasts forever.

@validname
Copy link
Contributor Author

I've tried to look deeper at code, but it's too complicated for me. Too scary to touch hashing. So I just wrote a 'special-case-fix'. It works for me.

--- dir1/logstreamer/reader.go  2014-10-14 10:28:30.000000000 +0700
+++ dir2/logstreamer/reader.go  2014-10-14 10:37:44.000000000 +0700
@@ -178,10 +178,14 @@
        return "", false
    }

+   prior_was_empty := false
    // 1. If our size is greater than the file at this filename, we're not the
    // same file
    if currentInfo.Size() > fInfo.Size() {
        ok = true
+   } else if (currentInfo.Size() < fInfo.Size() && currentInfo.Size()==0 )  {
+       prior_was_empty = true
+       ok = true
    } else if l.FileHashMismatch() {
        // Our file-hash didn't verify, not the same file
        ok = true
@@ -234,6 +238,11 @@
        return l.logfiles[fileIndex+1].FileName, true
    }

+   // So filename did't change, but currently opened file is empty and existing file with our filename is bigger. Maybe is it a newer one?
+   if prior_was_empty == true {
+       return l.position.Filename, true
+   }
+
    return
 }

@markabey
Copy link

This also affects me when using a program that runs mv on the tracked log files to a different directory and then deletes them after being copied. New ones are then made at the original path.
Heka keeps hold of the old deleted file handle (in a different directory) if the new log is not in place when the old one is deleted, so when viewed in lsof, it shows a bunch of logs marked as "(deleted)". This means logs get silently dropped from tracking.
The only workaround seems to be having a cron job that looks for heka holding deleted file handles and restarting it. I've not tried recompiling it with the patch above.
Using heka 0.8

@markabey
Copy link

And the old log file doesn't have to be empty for this to occur. At least if I mv the tracked log file to a new directory, and then delete it, heka loses track of that log stream until it is restarted.

@validname
Copy link
Contributor Author

@markabey Did you try to view positions in journal files? I'll say banality, but maybe heka just couldn't read your files as fast as you expect.

@jestan
Copy link

jestan commented Dec 11, 2015

I can reproduce this issue (in heka 0.10.b1) with rolling empty files with same name.
I am just wondering why @validname 's patch is not merged.

@jestan
Copy link

jestan commented Dec 30, 2015

This bug is a show stopper for me. @validname fix works properly.
For time being, I have to use a forked version of heka https://github.com/jestan/heka/commit/c9366b1dd7b18d865dbdb5479445972d251a68d5

@rafrombrc
Copy link
Contributor

@markabey In the case you're describing, Heka will hold on to the deleted file when no replacement file has been created, but it should notice that there's a new file as soon as one is created in the original location.

@jestan
Copy link

jestan commented Dec 31, 2015

@rafrombrc Thanks for fix the issue.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants