This repository has been archived by the owner on Jan 30, 2024. It is now read-only.

Make logrotate aware of disk space #11671

Merged
merged 1 commit into main from logrotate on May 9, 2022

Conversation

rtrinque
Contributor

@rtrinque rtrinque commented May 9, 2022

Delete log files above a size threshold, in chronological order,
while disk space usage is above a given threshold.

@rtrinque rtrinque force-pushed the logrotate branch 2 times, most recently from 1c8401e to 10ebb2f Compare May 9, 2022 09:24
Contributor

@ChrisBAshton ChrisBAshton left a comment


LGTM 👍 but would like a bit more commit message detail before merge.

for file in $(find <%= @matches %>* -type f -size +$SIZE -exec ls -tr {} +); do
  USE=$(df /var/log --output=pcent | sed -n '2p' | tr -d %)
  if [ $USE -gt $MAX_USE ]; then
    rm $file
  fi
done
Contributor


If I understand this correctly, we loop over every log file (that is more than 300M), and check if disk space is at >85%. If it is, remove the file. Once we reach <=85%, we won't be removing any more, but we'll still be looping over the rest of the files.

I think it would be logically easier to follow by inverting it like this:

while (not enough disk space)
  remove the oldest file that is >300mb

But this is quite nitpicky 😅 also not sure how slow the find query is, in which case my suggestion would be less efficient. So happy to leave as-is 👍
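A hypothetical shell sketch of that inversion, with MAX_USE, SIZE, and the /var/log path assumed from the snippet above (this is not the merged code):

```shell
# Sketch only: invert the loop so disk usage drives deletion.
# Assumes GNU find/df; thresholds borrowed from the snippet above.
MAX_USE=85
SIZE=300M

while [ "$(df /var/log --output=pcent | sed -n '2p' | tr -d ' %')" -gt "$MAX_USE" ]; do
  # Pick the single oldest file over the size threshold.
  oldest=$(find /var/log -type f -size +"$SIZE" -printf '%T@ %p\n' | sort -n | head -n 1 | cut -d' ' -f2-)
  # No candidate files left: stop rather than loop forever.
  [ -z "$oldest" ] && break
  rm -- "$oldest"
done
```

Note the explicit break when no candidate files remain, which is the extra control logic this shape requires.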

Contributor Author


Yes, that's the idea, and I agree the logic of the loop is a bit backwards.
The reason is mainly to keep it simple: find gives us a nice list to iterate over in a straightforward way. Using the remaining disk space to control the loop would make more sense, but it would introduce more control work, since we would need some logic to break out of the loop if we run out of files to delete.
As it is, we just get a list of files we think we can do something about, work on them, and then we've done what we could to alleviate the space issue.

Delete log files above a size threshold, in chronological order,
while disk space usage is above a given threshold.
We've had several issues with machines running out of disk space
because of oversized log files. This is an attempt to prevent
that from happening; as such, it is a blanket change applying to
all rotated logs on all classes.
The thresholds were initially picked somewhat arbitrarily: SIZE is
set at 300M to avoid deleting small files that would not save much
space, and MAX_USE is 85% because our nagios thresholds for disk
space are 80% warning, 90% critical.
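The usage check from the script can be exercised on its own; a minimal sketch of how the percentage is extracted (the /var/log path and 85% threshold are taken from the commit message above):

```shell
# Print disk usage of /var/log as a bare integer.
# df emits a header line plus e.g. "  85%"; sed takes line 2,
# tr strips the leading space and the % sign.
USE=$(df /var/log --output=pcent | sed -n '2p' | tr -d ' %')
MAX_USE=85

if [ "$USE" -gt "$MAX_USE" ]; then
  echo "usage ${USE}% is above ${MAX_USE}%"
fi
```

The `--output=pcent` flag is GNU coreutils df; on non-GNU systems the percentage column would need to be parsed from the default df output instead.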
Contributor

@ChrisBAshton ChrisBAshton left a comment


Thanks 👍

@rtrinque rtrinque merged commit 5b5e249 into main May 9, 2022
@rtrinque rtrinque deleted the logrotate branch May 9, 2022 10:28