logrotate does not ever recover from a corrupted statefile #45

Closed
khays-biamp opened this Issue Jul 22, 2016 · 7 comments

Discovered in 3.8.7, confirmed by inspection in 3.9.1.

If the statefile becomes corrupted, for any reason, logrotate immediately exits, and log rotation stops. This is particularly bad in embedded devices that depend on limiting the sizes of logfiles.

To reproduce, edit the statefile with vi and either set a date to a bad value or corrupt the first line.

On the next execution, logrotate exits with status 1 and makes no attempt to recover, which means it will continue to error out forever.

We patched our local version to recreate the statefile with only the version header when this occurs, which seems to work around the problem in a simple fashion.

It might be better to simply remove the statefile if it is corrupt.

Since the patch is neither elegant, nor against the latest version, I have not attached it.
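For readers who want to see the idea, here is a minimal sketch of the workaround described above: recreate the statefile with only the version header when the header is damaged. It is not the patch mentioned in this report; it is an external Python illustration. The header string and the function name `reset_statefile_if_corrupt` are assumptions, so check the header against what your logrotate version writes. The sketch only inspects the first line and would not catch a bad date further down.

```python
# Assumed header line; verify it against a statefile written by your logrotate.
STATE_HEADER = "logrotate state -- version 2\n"


def reset_statefile_if_corrupt(path, header=STATE_HEADER):
    """Rewrite the statefile with only the header if its first line is damaged.

    Returns True if the file was reset, False if it was left untouched.
    """
    try:
        # errors="replace" keeps undecodable bytes from raising while we peek.
        with open(path, "r", errors="replace") as f:
            first_line = f.readline()
    except FileNotFoundError:
        return False  # nothing to repair
    if first_line == header:
        return False
    # Drop all recorded rotation times and keep only the header.
    with open(path, "w") as f:
        f.write(header)
    return True
```

Resetting the file discards the recorded rotation times, so the next run starts without any history for the affected logs.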

Owner

kdudka commented Aug 2, 2016

Thanks for the report! There has already been an attempt to fix it: r3-7-8~

Unfortunately, it did not work correctly in all cases. So it was reverted a few releases later: r3-8-5~

So we need to write a better patch to fix this properly...

Do you have any idea why the file gets corrupted in your case?

khays-biamp commented Aug 2, 2016

It appears the metadata for the file was written before the file contents. The file has the appropriate length, but no data sectors were written, so it has no extents and its contents are returned as nulls, which is correct behavior for sparse files.

As for why, we believe it to be a bug in the underlying file system, which is UBIFS.

We’re still chasing it.
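The failure mode described above (correct length, contents all nulls) can be detected from outside logrotate. The following is a rough sketch; the function name `is_null_filled` and the chunk size are illustrative assumptions, not part of logrotate.

```python
def is_null_filled(path, chunk_size=4096):
    """Return True if the file is non-empty and every byte in it is NUL."""
    seen_data = False
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            seen_data = True
            # Stripping NULs leaves something only if a non-NUL byte exists.
            if chunk.strip(b"\x00"):
                return False
    return seen_data
```

An empty file returns False here, since a zero-length statefile is a different condition from one whose data was never written.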

Please consider recovering automatically from a corrupted statefile. The manual recovery process is invariably to delete the statefile and re-run logrotate, but we find out about the failure by being paged in the middle of the night because a disk has filled.
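Until a fix is available, the manual procedure above (delete the statefile, re-run logrotate) could be automated with a small wrapper. This is only a sketch: the function name `rotate_with_recovery` and the injectable `run` parameter are hypothetical, and it assumes logrotate is invoked with its `--state` option pointing at the statefile in question.

```python
import os
import subprocess


def rotate_with_recovery(config, statefile, run=subprocess.call):
    """Run logrotate; on failure, delete the statefile and retry once.

    Returns the exit status of the last run.
    """
    cmd = ["logrotate", "--state", statefile, config]
    status = run(cmd)
    if status == 0:
        return 0
    # Mirror the manual procedure: drop the statefile, then re-run.
    try:
        os.remove(statefile)
    except FileNotFoundError:
        pass
    return run(cmd)
```

Note that this retries on any nonzero exit, including failures unrelated to the statefile such as a configuration error, so it trades lost rotation history for robustness.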

If you're not completely comfortable with this solution, would you consider making it an opt-in via config or cmdline arg?

thanks!

Since others are having the same problem, and in hopes of a better fix, I've attached our kludgy patch.
biamp_robust_statefile.txt

kdudka added a commit that referenced this issue Feb 8, 2017

do not treat failure of readState() as fatal
This also prevents an empty state file from being written when running
with the --debug option.

Closes #45

Owner

kdudka commented Feb 8, 2017

Please have a look at my proposal in the recover-state branch:
https://github.com/logrotate/logrotate/compare/recover-state

Any feedback is appreciated!

Owner

kdudka commented Mar 7, 2017

@khays-biamp @mattghali is there anything I can do to help you with testing the proposal?

khays-biamp commented Mar 7, 2017

kdudka closed this in b9d8200 Apr 7, 2017
