torproject / sbws Public archive
Ticket26937: check for free disk space #241
I kind of like the idea but I don't understand the size assumptions made in sbws_required_disk_space().
I also don't know why two specific log messages were picked as places to catch an exception when we think that maybe we ran out of disk space. Couldn't this happen every time we log? Shouldn't we therefore catch this exception at every log message? (Please do not write code to do that)
If I understood the logic in calculating the necessary disk space, agreed that it is reasonable logic, and sbws only checked at the top of main for having enough space, then I would like this PR better.
sbws/util/fs.py
Outdated
| """ | ||
| # Number of relays per line average size in Bytes | ||
| size_v3bw_file = 7500 * 220 | ||
| num_v3bw_files = int(conf['general']['data_period']) |
data_period is the number of days into the past that we consider measurements valid. Why does this line assume that the number of v3bw files will be the same as this value?
I think I intended to calculate the size of the minimum number of results files to keep, not of the v3bw files.
For v3bw files, it would be enough to keep only the last two, which should have been generated in the last 2 hours.
This will depend a lot on the crontab that generates and cleans files, which would be provided separately, but at least it gives an estimate of the minimum space required.
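The estimate described here could be sketched roughly as follows. The 7500 relays and 220 bytes per line come from the diff; the function name, the per-file result size, and the assumption of one results file per day of data_period are hypothetical placeholders, not values taken from sbws.

```python
# Figures taken from the diff under review.
NUM_RELAYS = 7500
V3BW_LINE_BYTES = 220
# From the comment above: keeping the last two v3bw files is enough.
NUM_V3BW_FILES = 2


def estimate_required_bytes(data_period_days, result_file_bytes, log_bytes=0):
    """Rough lower bound on the disk space needed, in bytes.

    Assumes one results file per day of data_period (an assumption; the
    real number depends on the crontab that generates and cleans files).
    """
    space_v3bw = NUM_RELAYS * V3BW_LINE_BYTES * NUM_V3BW_FILES
    space_results = data_period_days * result_file_bytes
    return space_v3bw + space_results + log_bytes
```

Keeping the v3bw and results terms separate makes it obvious which assumption to revisit if the cleanup schedule changes.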
sbws/util/fs.py
Outdated
| # not counted rotated files and assuming that when it is not rotated the | ||
| # size will be aproximately 10MiB | ||
| size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \ | ||
| if conf['logging']['to_stdout'] == 'yes' else 0 |
Shouldn't this be to_file, not to_stdout? Also, you should use if conf.getboolean('logging', 'to_file'), since there are more ways to indicate a true value than the word "yes": https://docs.python.org/3/library/configparser.html#configparser.ConfigParser.getboolean
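A small standalone illustration of the difference, using only the standard library. The config values below are made up for the example; only the section and option names come from the diff.

```python
from configparser import ConfigParser

conf = ConfigParser()
conf.read_string("""
[logging]
to_file = on
to_stdout = no
""")

# A plain string comparison misses valid true values such as "on".
naive = conf['logging']['to_file'] == 'yes'

# getboolean accepts 1/yes/true/on and 0/no/false/off, case-insensitively,
# and raises ValueError for anything else.
to_file = conf.getboolean('logging', 'to_file')
to_stdout = conf.getboolean('logging', 'to_stdout')
```

Here the naive comparison evaluates to False even though the option is enabled, which is the failure mode the comment warns about.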
sbws/util/fs.py
Outdated
| space_v3bw_files = size_v3bw_file * num_v3bw_files | ||
| # not counted rotated files and assuming that when it is not rotated the | ||
| # size will be aproximately 10MiB | ||
| size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \ |
Use conf.getint('logging', 'to_file_max_bytes') instead of wrapping with int().
Since 10 MiB is specified in config.default.ini, and that file should not be edited by anyone, I do not think we should hardcode a fallback of 10 MiB here.
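As a sketch of why the hardcoded fallback is redundant: if the defaults are always loaded before the user config, getint always finds a value. The two strings below are stand-ins for config.default.ini and a user config; their contents are illustrative, not copied from sbws.

```python
from configparser import ConfigParser

# Stand-in for config.default.ini (illustrative contents).
DEFAULTS = """
[logging]
to_file = yes
to_file_max_bytes = 10485760
"""
# Stand-in for a user config that does not override the log size.
USER = """
[logging]
to_file = yes
"""

conf = ConfigParser()
conf.read_string(DEFAULTS)  # defaults are loaded first
conf.read_string(USER)      # user values override where present

# getint does the conversion; no int() wrapper or "or 10485760" is needed.
max_bytes = conf.getint('logging', 'to_file_max_bytes')
size_log_file = max_bytes if conf.getboolean('logging', 'to_file') else 0
```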
sbws/util/fs.py
Outdated
| def df(path): | ||
| """Return space left on device where path is.""" |
sbws/util/fs.py
Outdated
| size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \ | ||
| if conf['logging']['to_stdout'] == 'yes' else 0 | ||
| # roughly... | ||
| space_result_files = space_v3bw_files |
Not sure now why I did that :). I should probably calculate the space they take separately, in the way described in #241 (comment).
sbws/util/fs.py
Outdated
| disk_avail_mb = df(conf['paths']['sbws_home']) | ||
| if disk_avail_mb < disk_required_mb: | ||
| log.warn("The space left on the device (%s MiB) is less than " | ||
| "the minimum recommented to run sbws (%s MiB)." |
Also replace conf[''] with conf.getint/conf.getboolean
and add a comment about MiB.
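For reference, one way to write the check with the unit spelled out. The body of the PR's df() is not shown in this thread, so this is a stand-in built on shutil.disk_usage, and the names df_mib and has_enough_space are hypothetical.

```python
import shutil

MIB = 1024 * 1024  # 1 MiB = 2**20 bytes (not 10**6, which would be 1 MB)


def df_mib(path):
    """Return the free space, in MiB, on the device that holds path."""
    return shutil.disk_usage(path).free / MIB


def has_enough_space(path, required_mib):
    """True if the device holding path has at least required_mib MiB free."""
    return df_mib(path) >= required_mib
```

Doing the conversion in one place keeps the log message and the comparison in the same unit.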
In the concrete cases where I ran out of space, they happened there.
Yes
Maybe, but we would need to change all logs, so probably we do not want that. What about just leaving the logs in this PR?
Fixed the issues you pointed out when calculating the space.
One bug remaining, see inline comment.
I don't like the 2 logging changes where we catch OSError+print and where we make a guess in the message that we might have run out of disk space during a completely unrelated exception.
I'd merge this with the fixed bug and the 2 logging changes removed.
sbws/util/fs.py
Outdated
| # not counted rotated files and assuming that when it is not rotated the | ||
| # size will be aproximately 10MiB | ||
| size_log_file = conf.getint('logging', 'to_file_max_bytes') or 10485760 \ | ||
| if conf.getboolean('logging', 'to_stdout') else 0 |
In 72b45d4#diff-814bf53a84b89545a05dd2761603efc4R68, we are trying to close a circuit that we cannot even get, so I don't think it's a totally unrelated exception. But I'm removing it.
If it's fine for you now, I can rebase onto master and fix the conflicts. I'd do it in a different branch.
getpath is necessary so that ~ is expanded GH: closes #241 trac: implements #26937
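A minimal sketch of how a getpath accessor can expand ~, assuming it is registered through ConfigParser's converters argument. How sbws actually defines getpath is not shown in this thread, and the sbws_home value below is illustrative.

```python
import os
from configparser import ConfigParser

# Registering a "path" converter makes ConfigParser provide getpath().
conf = ConfigParser(converters={'path': os.path.expanduser})
conf.read_string("""
[paths]
sbws_home = ~/.sbws
""")

# Plain item access returns the literal string, tilde included.
raw = conf['paths']['sbws_home']
# getpath runs the converter, so ~ becomes the user's home directory.
expanded = conf.getpath('paths', 'sbws_home')
```

Passing the raw value to a disk-usage call would fail because no directory is literally named "~", which is why the expanded form is needed.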