-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] in_tail plugin isn't continue watch log file after logrotate was ran on k8s logs file. #375
Comments
note that when a third-party tool rotate a file Fluent Bit catch this event (which is a file rename), and what it does is to keep monitoring the rotated file for the next 5 seconds (Rotate_Wait option), after that is not longer monitored. Now when a file is rotated, likely the original application that create the logs will re-create the file (same name), but in order to let Fluent Bit catch that file creation it needs to re-scan the path, this operation is handled by the Refresh_Interval option, by default it re-scan every 60 seconds, I suggest to keep this value low as 5 seconds. If the issue mentioned do not address the problem explained above, please provide detailed steps to try to reproduce the problem. |
@duythinht is there any pending question/issue on your side ? |
I'm still troubleshoot this issue. Sometime tail keep working, sometime it's not working (after logrotate running) |
I'm also with same issue. |
Changed the refresh-interval didn't helped.. when file rotated fluent-bit didn't monitored it anymore, needed to restart the fluent container. |
Hello @edsiper, i upgraded fluent-bit but even though same issue, when file rotates its read anymore by fluent-bit and stays in loop trying to read the file. i've turned on the debug log level to post here the behaviour, if it helps. [2017/11/06 22:03:07] [debug] [task] destroy task=0x7fca0023c0e0 (task_id=0) |
@hdiass what kind of rotation mode are you using, copytruncate ?
why the rotated file have the same name ?
that means that a file was promoted for inotify but then it failed, mostly because it was deleted. Looks like your file are being rotated faster than the refresh_interval, please set a refresh_interval of 5 seconds |
@edsiper, the application that i want to monitor handles the log file itself, not using logrotate from the system. What the app does for what i can see is create a "backup" file with the old log file and recreates a new log file with the same name. Ok i'll set the refresh interval for that value and test again, Thanks a lot, |
@edsiper I was checking and i already had refresh interval option set on 5, so that will not help you can find the the config file i'm using below.
also maybe good for you to know, the timestamp between old file last log is really like miliseconds difference from the first timestamp on the new log file. old log file last line time stamp : "@timestamp":"2017-11-06T22:03:06.198+00:00" If you can somehow tell me what is the best config here to fluent-bit correcty follow the log after the rotation |
I challenge the similar behaviour. We discovered it's related to logrotate "copytruncate" option. copytruncate
Fluent bit should recognize number of lines in file, and if that is < then the previous value, it should re-read the file from scratch + reset it's position (whatever to get un-blocked). |
Can confirm the issue using Fluent-Bit v0.12.13. sqlite3 db keeps the counter even when the log file itself was logrotated ans reset to 0 bytes. Only workaround I was able to come up with is not to use the DB option. 😒 |
Copytruncate mode is dangerous and should be avoided in this scenario, in general it leads to data loss. Counting the number of lines is not a solution since that will mean: for every read(2) go to the beginning of the file and count the number of line breaks (\n). The other solution would be to check for the file size on every read using stat(2), again ..it will be performance killer and a constant pain. It's times better to use a different log rotation mode than copytruncate. Note that also copytruncate is done by a third party tool, so there is high chances that truncation is done when the application is writing data to the file, there is no "sync". A workaround would be to let Docker handle rotation. There is relevant discussion on this topic on Kubernetes repo: |
We're using fluent-bit outside of kubernetes/docker. Usually "logrotate" is responsible for logrotation (Debian/Ubuntu). In some cases we're still using "remote_syslog2" which claims to handle this scenario https://github.com/papertrail/remote_syslog2#log-rotation-and-the-behavior-of-remote_syslog - maybe an inspiration? |
@rmoriz thanks for the hints. Actually the papertrail client does specifically the workaround mentioned above: "stat(2) the file when some 'write' operation was done": Note that the workaround will only work if the tool that generated the original log file did not open the file using O_APPEND mode. I will try an enhancement |
Signed-off-by: Eduardo Silva <eduardo@treasure-data.com>
Signed-off-by: Eduardo Silva <eduardo@treasure-data.com>
Confirm 0.13 Dev, tested for a while and seems it really works with logrotate and the above options. No freezes yet. |
Deployed + tested one week. looks good so far. |
thanks everyone for helping on this issue. Closing as Fixed. |
Awesome @edsiper 🏆 Will this be released in the 0.12.x line? It would be very helpful! |
Released :) |
@edsiper Awesome, thanks! 💯 |
Signed-off-by: Eduardo Silva <eduardo@treasure-data.com>
ping @edsiper The official documentation here
Is the documentation outdated or is there still an issue with logrotate and |
Documentation needs to be updated, in the other side the note the following requirement:
|
@edsiper FYI the documentation (even for 1.0: https://docs.fluentbit.io/manual/input/tail) still mentions "Rotation with truncation (e.g. logrotate's copytruncate mode) is not supported." Also, regarding your remark that it "will only work if the tool that generated the original log file did not open the file using O_APPEND mode": does that mean we can expect logs rotated through logrotate's copytruncate to work or not? |
Updating the docs now, thanks for catching that. |
…fluent#375) * out_stackdriver: fix incorrect format of local_resource_id, add ending dot Signed-off-by: Jeff Luo <jeffluoo@google.com> * out_stackdriver: add missing option severity_key Signed-off-by: Jeff Luo <jeffluoo@google.com> Co-authored-by: Eduardo Silva <eduardo@treasure-data.com>
What about the copied file, would it be consume from start? |
I have run fluent-bit for k8s, but after run logrotate, in_tail is not watch log file, which has been rotated.
My fluentbit config:
My logrotate config
I thinks something was wrong after logs file has changed outside container, how I reproduce: I run a fluent-bit containers in docker, mount volume
[current_folder]:/log
Trying start, restart sometime.
The text was updated successfully, but these errors were encountered: