Ticket26937: check for free disk space #241
pastly
left a comment
I kind of like the idea but I don't understand the size assumptions made in sbws_required_disk_space().
I also don't know why two specific log messages were picked as places to catch an exception when we think that maybe we ran out of disk space. Couldn't this happen every time we log? Shouldn't we therefore catch this exception at every log message? (Please do not write code to do that)
If I understood the logic in calculating the necessary disk space, agreed that it is reasonable logic, and sbws only checked at the top of main for having enough space, then I would like this PR better.
    """
    # Number of relays per line average size in Bytes
    size_v3bw_file = 7500 * 220
    num_v3bw_files = int(conf['general']['data_period'])
data_period is the number of days into the past that we consider measurements valid. Why does this line assume that the number of v3bw files will be the same as this value?
I think I intended to calculate the size of the minimum number of results files to keep, not v3bw files.
For v3bw files, it would be enough to keep only the last 2, which should have been generated in the last 2 hours.
This is going to depend a lot on the crontab that generates/cleans files, which would be provided separately, but at least this would give an estimate of the minimum space required.
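As a rough illustration of the estimate described above, the sketch below adds up results files for the whole data period, the last 2 v3bw files, and one unrotated log file. The function name `estimate_required_mib`, the parameter `result_file_bytes_per_day`, and the assumption of one results file per day are hypothetical and not taken from the PR; only the `7500 * 220` v3bw size comes from the diff.

```python
MIB = 1024 ** 2  # bytes per MiB


def estimate_required_mib(data_period_days, result_file_bytes_per_day,
                          v3bw_file_bytes=7500 * 220, num_v3bw_files=2,
                          log_file_bytes=0):
    """Return a rough minimum disk requirement in MiB (hypothetical helper)."""
    # Assumption: one results file per day is kept for the whole data period.
    space_result_files = data_period_days * result_file_bytes_per_day
    # Only the most recent v3bw files need to be kept.
    space_v3bw_files = num_v3bw_files * v3bw_file_bytes
    total_bytes = space_result_files + space_v3bw_files + log_file_bytes
    return total_bytes / MIB
```

Keeping results and v3bw files as separate terms avoids assuming that they take the same space.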
    # not counted rotated files and assuming that when it is not rotated the
    # size will be aproximately 10MiB
    size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \
        if conf['logging']['to_stdout'] == 'yes' else 0
Shouldn't this be to_file, not to_stdout? Also, you should do if conf.getboolean('logging', 'to_file'), since there are more ways to indicate a true value than the word "yes": https://docs.python.org/3/library/configparser.html#configparser.ConfigParser.getboolean
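A small, self-contained sketch of the difference, using an inline config string rather than the real sbws configuration (the values `on` and `no` are made up for the example):

```python
from configparser import ConfigParser

conf = ConfigParser()
# Illustrative config only; not the real sbws config file.
conf.read_string("""
[logging]
to_file = on
to_stdout = no
""")

# Comparing against the literal string misses other accepted spellings.
literal_match = conf['logging']['to_file'] == 'yes'

# getboolean accepts 1/yes/true/on and 0/no/false/off, case-insensitively.
to_file = conf.getboolean('logging', 'to_file')
to_stdout = conf.getboolean('logging', 'to_stdout')
```

Here `literal_match` is False even though `to_file` is enabled, while `getboolean` returns True for it.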
    space_v3bw_files = size_v3bw_file * num_v3bw_files
    # not counted rotated files and assuming that when it is not rotated the
    # size will be aproximately 10MiB
    size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \
conf.getint('logging', 'to_file_max_bytes') instead of wrapping with int()
Since 10 MiB is specified in config.default.ini and that file should not be edited by anyone, I do not think we should hardcode a fallback of 10 MiB here.
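A minimal sketch of that approach, assuming the defaults are always loaded before any user config. The inline strings stand in for config.default.ini and a user file, and the helper name `log_file_bytes` is hypothetical:

```python
from configparser import ConfigParser

# Stand-in for config.default.ini (contents assumed for the example).
DEFAULTS = """
[logging]
to_file = yes
to_file_max_bytes = 10485760
"""
# Stand-in for a user config that only overrides one option.
USER = """
[logging]
to_file = no
"""

conf = ConfigParser()
conf.read_string(DEFAULTS)
conf.read_string(USER)


def log_file_bytes(conf):
    """Return the space to reserve for the log file, in bytes."""
    if conf.getboolean('logging', 'to_file'):
        # No hardcoded fallback: the default config always supplies a value.
        return conf.getint('logging', 'to_file_max_bytes')
    return 0
```

Because the defaults are read first, `getint` always finds a value and the `or 10485760` fallback becomes unnecessary.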
def df(path):
    """Return space left on device where path is."""
Please note that the units are MiB
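For illustration, one way to make the unit explicit in both the name and the docstring. This is a sketch built on `shutil.disk_usage` from the standard library, not the implementation in the PR; the name `df_mib` is hypothetical:

```python
import shutil

MIB = 1024 ** 2  # bytes per MiB


def df_mib(path):
    """Return the free space, in MiB, on the device where path is."""
    # disk_usage reports total, used and free space in bytes.
    return shutil.disk_usage(path).free / MIB
```

A caller can then compare the result directly against a requirement that is also expressed in MiB.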
    size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \
        if conf['logging']['to_stdout'] == 'yes' else 0
    # roughly...
    space_result_files = space_v3bw_files
How do we know this will be close to true?
Not sure now why I did that :). I should probably calculate the space they take separately, in the way described in #241 (comment).
    disk_avail_mb = df(conf['paths']['sbws_home'])
    if disk_avail_mb < disk_required_mb:
        log.warn("The space left on the device (%s MiB) is less than "
                 "the minimum recommented to run sbws (%s MiB)."
Also replace conf[''] with conf.getint/conf.getboolean
and add comment about MiB
In the concrete cases where I ran out of space, it happened there.
Yes
Maybe, but we would need to change all logs, so we probably do not want that. What about just leaving the logs in this PR?
Fixed the issues you pointed out when calculating the space.
pastly
left a comment
One bug remaining, see inline comment.
I don't like the 2 logging changes where we catch OSError+print and where we make a guess in the message that we might have run out of disk space during a completely unrelated exception.
I'd merge this with the fixed bug and the 2 logging changes removed.
    # not counted rotated files and assuming that when it is not rotated the
    # size will be aproximately 10MiB
    size_log_file = conf.getint('logging', 'to_file_max_bytes') or 10485760 \
        if conf.getboolean('logging', 'to_stdout') else 0
In 72b45d4#diff-814bf53a84b89545a05dd2761603efc4R68, we are trying to close a circuit that we cannot even get, so I don't think it's a totally unrelated exception. But I'm removing it.
If it's fine for you now, I can rebase to master and fix the conflicts. I'd do it in a different branch.
getpath is necessary so that ~ is expanded
GH: closes #241
trac: implements #26937