Agent 'flows' currently only support following a single log file at any given time... #45

Open
diranged opened this issue Jul 18, 2016 · 2 comments

@diranged
Contributor

The problem

We found this out the fun way ... Let's imagine you have a fairly normal syslog-ng config like this:

destination d_mnt_log {
  file(
    "/mnt/log/$FACILITY.log"
    perm(0644) owner(root) group(root) dir_perm(0755) create_dirs(yes)
  );
};
log { source(s_src); destination(d_mnt_log); };

This creates a log directory like this:

[root@us1-scheduler-...:~:2]# ls -la /mnt/log
total 281788
drwxr-xr-x 2 root root       121 Jul 15 20:47 .
drwxr-xr-x 8 root root        74 Jul 15 20:38 ..
-rw-r--r-- 1 root root    284829 Jul 18 15:17 authpriv.log
-rw-r--r-- 1 root root    158226 Jul 18 15:17 cron.log
-rw-r--r-- 1 root root   1549178 Jul 18 15:22 daemon.log
-rw-r--r-- 1 root root       176 Jul 15 20:38 kern.log
-rw-r--r-- 1 root root 198713506 Jul 18 15:23 local0.log
-rw-r--r-- 1 root root  14755528 Jul 18 15:23 syslog.log
-rw-r--r-- 1 root root    327945 Jul 18 15:21 user.log

Now imagine you want to tail all of these files and send all the data into Kinesis. This makes sense, right?

{
... 
  "flows": [
    {
      "filePattern": "/mnt/log/*.log*",
      "kinesisStream": "log-pipeline-<%= @environment %>-syslog",
      "partitionKeyOption": "RANDOM"
    }
  ]
}

Wrong.

This is a dangerous configuration because the Agent will watch every file that matches the glob pattern (all of them, in this case). Each time one of those files is modified, the agent gets confused and jumps to that file to start reading log events. It never follows all of the files at once; it follows only one file at a time.
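To make the scope of the pattern concrete, here is a small Python check against the file names from the directory listing above. It is only an illustration: Python's `fnmatch` is assumed to approximate the agent's glob matching, which may differ in detail.

```python
from fnmatch import fnmatch

# File names copied from the `ls -la /mnt/log` listing above.
files = ["authpriv.log", "cron.log", "daemon.log", "kern.log",
         "local0.log", "syslog.log", "user.log"]
pattern = "/mnt/log/*.log*"

# Assumption: fnmatch-style globbing stands in for the agent's own matcher.
matched = [f for f in files if fnmatch(f"/mnt/log/{f}", pattern)]
print(f"{len(matched)} of {len(files)} files match a single flow")
```

Every facility file is being written concurrently and every one of them matches, so a single flow that follows one file at a time has seven candidates to jump between.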

The solution

I don't have a real solution here, but ideally we want to be able to tell the agent that there is a whole directory of files: follow them all, and keep track of all of their inodes. If any file is rotated, that's fine, just keep following it by inode. If a new file is created, then start reading that file as well.
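A rough sketch of what "follow them all by inode" could look like, written in Python purely for illustration. The class name `DirectoryTailer` and its `poll` method are hypothetical and are not part of the agent; the sketch also assumes rotated files keep matching the same glob (for example `kern.log` renamed to `kern.log.1`).

```python
import glob
import os


class DirectoryTailer:
    """Follow every file matching a glob, keyed by inode (hypothetical sketch)."""

    def __init__(self, pattern):
        self.pattern = pattern
        self.offsets = {}  # (device, inode) -> number of bytes already consumed

    def poll(self):
        """Return (path, line) pairs for complete lines written since the last poll."""
        out = []
        for path in sorted(glob.glob(self.pattern)):
            try:
                with open(path, "rb") as fh:
                    st = os.fstat(fh.fileno())
                    key = (st.st_dev, st.st_ino)
                    offset = self.offsets.get(key, 0)
                    if st.st_size < offset:
                        offset = 0  # file was truncated in place, start over
                    fh.seek(offset)
                    data = fh.read()
            except FileNotFoundError:
                continue  # file vanished between glob() and open()
            # Emit only complete lines; a partial trailing line is re-read next poll.
            end = data.rfind(b"\n") + 1
            for line in data[:end].splitlines():
                out.append((path, line.decode("utf-8", "replace")))
            self.offsets[key] = offset + end
        return out
```

Because the offsets are keyed by inode rather than by path, a renamed file continues from where it left off, and a newly created file with the old name starts at offset zero. The `offsets` map grows with every file ever seen, which is the resource cost mentioned in this comment.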

I know that this can lead to resource problems if you are writing out a lot of log files, but I believe that tradeoff is up to the end user to weigh when they set this up.

@chaochenq
Contributor

chaochenq commented Jul 19, 2016

We can probably have a higher-level configuration that could match all flows in a directory. For example, "/mnt/log/${flow}.log.*" could create flows tailing /mnt/log/kern.log.* and /mnt/log/syslog.log.*
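As a sketch of how such a `${flow}` template might be expanded, the Python below groups existing paths by the captured flow name and emits one flow entry per name. Everything here is an assumption for illustration: the helper names are hypothetical, the stream name `log-pipeline-example-syslog` is a placeholder, flow names are assumed to contain no dots, and the template is written as `.log*` (rather than `.log.*`) so that the live, unrotated file also matches.

```python
import re


def template_to_regex(template):
    """Turn a path template containing one ${flow} placeholder into a regex."""
    head, sep, tail = template.partition("${flow}")
    if not sep:
        raise ValueError("template must contain ${flow}")

    def part(text):
        # '*' matches within one path component; everything else is literal.
        return "[^/]*".join(re.escape(piece) for piece in text.split("*"))

    # Assumption: a flow name has no '/' or '.' in it (e.g. 'kern', 'syslog').
    return re.compile(part(head) + r"(?P<flow>[^/.]+)" + part(tail) + r"\Z")


def flows_for(template, paths, stream):
    """Build one flow entry per distinct ${flow} value found in paths."""
    rx = template_to_regex(template)
    names = set()
    for path in paths:
        m = rx.match(path)
        if m:
            names.add(m.group("flow"))
    return [
        {
            "filePattern": template.replace("${flow}", name),
            "kinesisStream": stream,  # placeholder stream name
            "partitionKeyOption": "RANDOM",
        }
        for name in sorted(names)
    ]


paths = ["/mnt/log/kern.log", "/mnt/log/kern.log.1",
         "/mnt/log/syslog.log", "/mnt/log/readme.txt"]
flows = flows_for("/mnt/log/${flow}.log*", paths, "log-pipeline-example-syslog")
```

Each resulting entry covers one facility file plus its rotations, so every flow only ever has to follow a single active file.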

To save resources (by not tracking too many files), we could add another option, "ignoreOlder", which stops tracking any file flow older than a given period, in case too many files get added to the directory without rotation or deletion.
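One way the proposed "ignoreOlder" cutoff could behave, sketched in Python. The option does not exist in the agent at the time of this comment; the function names, the use of modification time as the age measure, and the unit of seconds are all assumptions.

```python
import os
import time


def is_stale(path, ignore_older_seconds, now=None):
    """True if the file was last modified more than ignore_older_seconds ago."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime > ignore_older_seconds


def files_to_track(paths, ignore_older_seconds, now=None):
    """Keep only the files that are recent enough to be worth tailing."""
    return [p for p in paths if not is_stale(p, ignore_older_seconds, now)]
```

A file that starts receiving writes again gets a fresh modification time, so it would be picked up on the next scan rather than being ignored forever.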

@diranged
Contributor Author

@chaochenq any further thoughts on this one?
