Add follow_inodes flag to tail files by inodes, not by names #1660

OleksiiDuzhyi · 2017-08-08T14:10:45Z

Use case

We have a lot of servers with up to 300 application deployed on each server (mostly Java/Scala). Each application writes log files using log4j library. Each log file is rotated daily in the same directory, with up to 10 back up files possible. Events are fed to realtime processing system(critical to time delays up to 1 second) and to HBase. We want to replace current heavy log collector with lightweight solution.

Problem

We tried to use default tail plugin. Most basic configuration will look like this:

<source>
  @type tail
  read_from_head true
  path /var/log/*/app*.log  
  tag app.log.*
  format none
  refresh_interval 10
  pos_file /var/log/pos/app.log.pos
  rotate_wait 3600
  path_key path
  limit_recently_modified 86400
</source>

According to fluentd in_tail plugin documentation - every time file is rotated for such configuration, it will be treated by in_tail plugin as new one, and will be read from it's head, even though file inode has not changed, and in position file we have latest read position (but just for last line before rotation; during rotate_wait time position stored in memory only - https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/in_tail.rb#L554)

This doesn't suit our needs, because:

we will generate a lot of unneeded traffic, which will be irrelevant to realtime part
disk I/O increase
longer ingestion into HBase

One of the possible solution proposed by @repeatedly in #1613 is to avoid '*' in file pattern discovery. This also doesn't suit us, because we'd like to reliably store events with no or minimum loss because they are very crucial for business analysis. Possible data loss scenario:

Fluentd goes down at 23:45 UTC
Log file is rotated at 00:00:00 UTC, app.log is renamed to app.2017-08-02.log (just for example)
Fluentd goes up at 00:15 UTC
Tail plugin reads all data from in new app.log file
Data from app.2017-08-02.log file from 23:45 - 00:00 UTC is lost. No way for us to find out at least latest offset in app.2017-08-02.log because it has been overwritten in position file.

Proposed solution

Add ability to track inodes inside fluentd in_tail plugin to enable combination of '*' in path configuration with rotation inside same directory and read_from_head set to true. In proposed solution we add follow_inodes flag (set to false by default to preserve current behavior). When follow_inodes is set to true and file rotation is detected - we do not reread file, instead we are checking if such inode is present inside pos file - if yes, we don't recreate TailWatcher. TailWatcher is closed after rotate_wait period and file marked as unwatched inside position file after limit_recently_modified time passed. Pos file is cleaned on startup as it was before, but also delete entries for non-existing files.

Please review, and let me know what do you think of such approach?

repeatedly · 2017-08-15T12:14:26Z

Thanks for the patch!
I will check it later.

repeatedly · 2017-08-29T11:18:20Z

lib/fluent/plugin/in_tail.rb

@@ -213,8 +219,30 @@ def expand_paths
          paths << path
        end
      }
-      excluded = @exclude_path.map { |path| path = date.strftime(path); path.include?('*') ? Dir.glob(path) : path }.flatten.uniq
-      paths - excluded
+      excluded = @exclude_path.map {|path| path = date.strftime(path); path.include?('*') ? Dir.glob(path) : path}.flatten.uniq


Don't change syntax/indent.

repeatedly · 2017-08-29T11:18:51Z

lib/fluent/plugin/in_tail.rb

+      excluded = @exclude_path.map {|path| path = date.strftime(path); path.include?('*') ? Dir.glob(path) : path}.flatten.uniq
+      to_return = paths - excluded
+      # filter out non existing files, so in case pattern is without '*' we don't do unnecessary work
+      to_return = to_return.select {|path|


(paths - excluded).to_return.select {|path| seems enough

repeatedly · 2017-08-29T11:20:56Z

lib/fluent/plugin/in_tail.rb

+      to_return = paths - excluded
+      # filter out non existing files, so in case pattern is without '*' we don't do unnecessary work
+      to_return = to_return.select {|path|
+        Pathname.new(path).exist?


FileTest#exist? is more lightweight because no object instantiation.

Cool, thx, didn't know about that option

repeatedly · 2017-08-29T11:23:07Z

lib/fluent/plugin/in_tail.rb

+      to_return = to_return.select {|path|
+        Pathname.new(path).exist?
+      }
+      Hash[to_return.map {|path|


Long lines in Hash[] seems hard to read.
This map can be moved to above line. {}.select {}.map {}.

repeatedly · 2017-08-29T11:25:10Z

lib/fluent/plugin/in_tail.rb

+    end
+
+    def existence_path
+      Hash[@tails.keys.map {|path_ino|


With lots of files, e.g. 1000+ files, this map creates lots of object.
So avoiding map is better for the performance.

Of course, expand_paths's map is also same.

repeatedly · 2017-08-29T11:30:48Z

lib/fluent/plugin/in_tail.rb

@@ -910,5 +996,26 @@ def read_inode
        @inode
      end
    end
+
+    class PathInodeTuple


I didn't check eql? and other method behaviour of struct object but can't use struct instead of class for this case?

I've checked eql? of struct - it work as expected, will change to Struct

repeatedly · 2017-08-29T11:31:47Z

BTW, all most tests are failed.
So the something makes tests unstable.

OleksiiDuzhyi · 2017-09-06T07:42:33Z

I can't figure why tests are failing, maybe you could give me a hand in this?

In my test I do the following:

File.open("#{TMP_DIR}/tail.txt", "wb") {|f|
    f.puts "test1"
    f.puts "test2"
}
............
cleanup_file("#{TMP_DIR}/tail.txt")
File.open("#{TMP_DIR}/tail.txt", "wb") {|f|

cleanup_file code:

  def cleanup_file(path)
    FileUtils.rm_f(path, secure: true)
    if File.exist?(path)
      # ensure files are closed for Windows, on which deleted files
      # are still visible from filesystem
      GC.start(full_mark: true, immediate_mark: true, immediate_sweep: true)
      FileUtils.remove_entry_secure(path, true)
    end
  end

But for some reason I can't open a file for writing after I've deleted it. Even though I was able to write this file few lines above. I got this exception:

Errno::EACCES: Permission denied @ rb_sysopen - C:/projects/fluentd/test/plugin/../tmp/tail/tail.txt

I have no clue why I can't reopen file when I've deleted it. Should I remove file in a different way?

esselius · 2017-12-15T14:39:43Z

Will this be merged anytime soon?

repeatedly · 2017-12-18T13:42:54Z

@esselius in_tail plugin is widely used so we want to test this patch on several environment before merging.

OleksiiDuzhyi · 2017-12-18T14:16:34Z

We decided to move away from fluentd for log collection, also relying on inodes - is not 100% reliable, inodes could be reused which could cause issues

repeatedly · 2017-12-18T16:31:59Z

@OleksiiDuzhyi

We decided to move away from fluentd for log collection

What the tool do you use?
And how to solve rotation issue with your tools?

OleksiiDuzhyi · 2017-12-19T08:53:30Z

@repeatedly
We decided to go with Apache Flume. From our tests - it shows bigger memory footprint but 2-3 times less CPU usage, which was critical for us.

As input to Flume we wrote a program in C which does what I tried to implement in this PR - tracks inodes, keep state of inode, etc.

repeatedly · 2017-12-20T11:38:54Z

@OleksiiDuzhyi Thanks! So inode issue is still remained but flume resolves CPU concern.
Hmm... we hope fluent-bit resovles such case with lower CPU and memory resource.

OleksiiDuzhyi · 2017-12-20T13:02:21Z

@repeatedly
Memory usage of fluentd is small as in documentation - in our case for around 1 mln events/min - it was around 80 mb.

metayd · 2018-04-04T06:52:45Z

@repeatedly Hello, I have this problem too, is there any plan to merge this patch into master?

sanjuicgithub · 2020-04-09T12:37:18Z

@repeatedly , is this patch still in consideration or does fluentd already has any such feature to address this issue ?

repeatedly · 2020-04-17T17:44:25Z

is this patch still in consideration

Current patch is incomplete because using only inode has a problem for inode reuse case.
So inode and path or something is needed for more reliable file trace.
If we can ignore such situation, we can accept this patch approach.

Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.

Based on closed PR #1660 which is posted by OleksiiDuzhyi because of incomplete test cases. Co-authored-by: alexeyd <alexeyd@wix.com> Signed-off-by: Hiroshi Hatake <hatake@clear-code.com>

Add follow_inodes flag to tail files by inodes, not by names

44d14b2

repeatedly added feature request v0.14 labels Aug 15, 2017

Make file deletion based on existing path in watching directory

38aa868

repeatedly mentioned this pull request Aug 29, 2017

how to avoid push duplicate log to es when rename file? #1673

Closed

repeatedly reviewed Aug 29, 2017

View reviewed changes

OleksiiDuzhyi added 3 commits August 29, 2017 15:35

Code review changes

4db222b

Fix test failing on Windows systems

f7dd8b7

Fix test failing on Windows systems

168af6e

Merge branch 'master' into master

ec3014c

OleksiiDuzhyi closed this Dec 18, 2017

kenhys added a commit to kenhys/fluentd that referenced this pull request Sep 29, 2020

in_tail: allow follow_inodes flag to tail files by inodes

adef9c5

Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.

cosmo0920 pushed a commit to cosmo0920/fluentd that referenced this pull request Nov 11, 2020

in_tail: allow follow_inodes flag to tail files by inodes

d34e39c

Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.

cosmo0920 pushed a commit to cosmo0920/fluentd that referenced this pull request Nov 13, 2020

in_tail: allow follow_inodes flag to tail files by inodes

df3d2ab

Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.

cosmo0920 pushed a commit to cosmo0920/fluentd that referenced this pull request Nov 17, 2020

in_tail: allow follow_inodes flag to tail files by inodes

b00f1de

Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.

cosmo0920 pushed a commit to cosmo0920/fluentd that referenced this pull request Nov 20, 2020

in_tail: allow follow_inodes flag to tail files by inodes

6cb1e45

Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.

cosmo0920 pushed a commit to cosmo0920/fluentd that referenced this pull request Nov 24, 2020

in_tail: allow follow_inodes flag to tail files by inodes

d672e85

Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.

cosmo0920 pushed a commit to cosmo0920/fluentd that referenced this pull request Nov 24, 2020

in_tail: allow follow_inodes flag to tail files by inodes

12fe793

Based on closed PR fluent#1660 which is posted by OleksiiDuzhyi because of incomplete test cases.

cosmo0920 mentioned this pull request Nov 25, 2020

Add follow_inodes parameter on in_tail #3182

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add follow_inodes flag to tail files by inodes, not by names #1660

Add follow_inodes flag to tail files by inodes, not by names #1660

OleksiiDuzhyi commented Aug 8, 2017

repeatedly commented Aug 15, 2017

repeatedly Aug 29, 2017

repeatedly Aug 29, 2017

repeatedly Aug 29, 2017

OleksiiDuzhyi Aug 29, 2017

repeatedly Aug 29, 2017

repeatedly Aug 29, 2017

repeatedly Aug 29, 2017

OleksiiDuzhyi Aug 29, 2017

repeatedly commented Aug 29, 2017

OleksiiDuzhyi commented Sep 6, 2017

esselius commented Dec 15, 2017

repeatedly commented Dec 18, 2017

OleksiiDuzhyi commented Dec 18, 2017

repeatedly commented Dec 18, 2017

OleksiiDuzhyi commented Dec 19, 2017

repeatedly commented Dec 20, 2017

OleksiiDuzhyi commented Dec 20, 2017

metayd commented Apr 4, 2018 •

edited

sanjuicgithub commented Apr 9, 2020

repeatedly commented Apr 17, 2020 •

edited

Add follow_inodes flag to tail files by inodes, not by names #1660

Add follow_inodes flag to tail files by inodes, not by names #1660

Conversation

OleksiiDuzhyi commented Aug 8, 2017

Use case

Problem

Proposed solution

repeatedly commented Aug 15, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

repeatedly commented Aug 29, 2017

OleksiiDuzhyi commented Sep 6, 2017

esselius commented Dec 15, 2017

repeatedly commented Dec 18, 2017

OleksiiDuzhyi commented Dec 18, 2017

repeatedly commented Dec 18, 2017

OleksiiDuzhyi commented Dec 19, 2017

repeatedly commented Dec 20, 2017

OleksiiDuzhyi commented Dec 20, 2017

metayd commented Apr 4, 2018 • edited

sanjuicgithub commented Apr 9, 2020

repeatedly commented Apr 17, 2020 • edited

metayd commented Apr 4, 2018 •

edited

repeatedly commented Apr 17, 2020 •

edited