Skip to content
This repository has been archived by the owner. It is now read-only.

Ticket26937: check for free disk space #241

Closed
wants to merge 15 commits into from
Closed

Conversation

Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
@juga0
Copy link
Contributor

@juga0 juga0 commented Jul 25, 2018

No description provided.

Copy link
Member

@pastly pastly left a comment

I kind of like the idea but I don't understand the size assumptions made in sbws_required_disk_space().

I also don't know why two specific log messages were picked as places to catch an exception when we think that maybe we ran out of disk space. Couldn't this happen every time we log? Shouldn't we therefore catch this exception at every log message? (Please do not write code to do that)

If I understood the logic in calculating the necessary disk space, agreed that it is reasonable logic, and sbws only checked at the top of main for having enough space, then I would like this PR better.

sbws/util/fs.py Outdated
"""
# Number of relays per line average size in Bytes
size_v3bw_file = 7500 * 220
num_v3bw_files = int(conf['general']['data_period'])
Copy link
Member

@pastly pastly Aug 1, 2018

data_period is the number of days into the past that we consider measurements valid. Why does this line assume that the number of v3bw files will be the same as this value?

Copy link
Contributor Author

@juga0 juga0 Aug 2, 2018

I think that i intended to calculate size of the minimum number of results files to keep, not v3bw files.
For v3bw files, it would be enough to keep only the last 2 ones that should have been generated in the last 2 hours.
This is going to depend a lot on the crontab that will generate/clean files and would be provided separately, but at least this would give an estimation on the minimum space required

sbws/util/fs.py Outdated
# not counted rotated files and assuming that when it is not rotated the
# size will be aproximately 10MiB
size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \
if conf['logging']['to_stdout'] == 'yes' else 0
Copy link
Member

@pastly pastly Aug 1, 2018

Shouldn't this be to_file, not to_stdout? Also, you should do if conf.getboolean('logging', 'to_file') since there are more was to indicate a true value than the word "yes" https://docs.python.org/3/library/configparser.html#configparser.ConfigParser.getboolean

Copy link
Contributor Author

@juga0 juga0 Aug 2, 2018

you're right, mistake

sbws/util/fs.py Outdated
space_v3bw_files = size_v3bw_file * num_v3bw_files
# not counted rotated files and assuming that when it is not rotated the
# size will be aproximately 10MiB
size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \
Copy link
Member

@pastly pastly Aug 1, 2018

conf.getint('logging', 'to_file_max_bytes') instead of wrapping with int()

Since 10 MiB is specified in the config.default.ini and that file should not be edited by anyone, I do not think we should hardcode a fallback of 10 MiB here

Copy link
Contributor Author

@juga0 juga0 Aug 2, 2018

makes sense

sbws/util/fs.py Outdated


def df(path):
"""Return space left on device where path is."""
Copy link
Member

@pastly pastly Aug 1, 2018

Please note that the units are MiB

Copy link
Contributor Author

@juga0 juga0 Aug 2, 2018

good point

sbws/util/fs.py Outdated
size_log_file = (int(conf['logging']['to_file_max_bytes']) or 10485760) \
if conf['logging']['to_stdout'] == 'yes' else 0
# roughly...
space_result_files = space_v3bw_files
Copy link
Member

@pastly pastly Aug 1, 2018

How do we know this will be close to true?

Copy link
Contributor Author

@juga0 juga0 Aug 2, 2018

not sure now why i did that :), probably i should calculate the space they take separately, in the way commented in #241 (comment)

sbws/util/fs.py Outdated
disk_avail_mb = df(conf['paths']['sbws_home'])
if disk_avail_mb < disk_required_mb:
log.warn("The space left on the device (%s MiB) is less than "
"the minimum recommented to run sbws (%s MiB)."
Copy link
Member

@pastly pastly Aug 1, 2018

"recommended"

@pastly pastly changed the title Ticket26937 Ticket26937: check for free disk space Aug 1, 2018
@juga0
Copy link
Contributor Author

@juga0 juga0 commented Aug 3, 2018

I also don't know why two specific log messages were picked as places to catch an exception when we think that maybe we ran out of disk space.

In concrete cases where i run out of space, they happened there.

Couldn't this happen every time we log?

Yes

Shouldn't we therefore catch this exception at every log message? (Please do not write code to do that)

Maybe, but we would need to change all logs, so probably we do not want that.

What about just leaving the logs in this PR?

@juga0
Copy link
Contributor Author

@juga0 juga0 commented Aug 3, 2018

Fixed the issues you pointed out when calculating the space

Copy link
Member

@pastly pastly left a comment

One bug remaining, see inline comment.

I don't like the 2 logging changes where we catch OSError+print and where we make a guess in the message that we might have run out of disk space during a completely unrelated exception.

I'd merge this with the fixed bug and the 2 logging changes removed.

sbws/util/fs.py Outdated
# not counted rotated files and assuming that when it is not rotated the
# size will be aproximately 10MiB
size_log_file = conf.getint('logging', 'to_file_max_bytes') or 10485760 \
if conf.getboolean('logging', 'to_stdout') else 0
Copy link
Member

@pastly pastly Aug 3, 2018

This is supposed to be to_file.

Copy link
Contributor Author

@juga0 juga0 Aug 4, 2018

sorry, forgot, fixed

juga0 added 3 commits Aug 4, 2018
This reverts commit ede9fed.
An exception trying to write to log file when there is not disk
space can happen anywhere in the code
@juga0
Copy link
Contributor Author

@juga0 juga0 commented Aug 4, 2018

I don't like the 2 logging changes where we catch OSError+print and where we make a guess in the message that we might have run out of disk space during a completely unrelated exception.

In 72b45d4#diff-814bf53a84b89545a05dd2761603efc4R68, we are trying to close a circuit that we can not even get, i don't think it's totally unrelated exception. But removing it.

@juga0
Copy link
Contributor Author

@juga0 juga0 commented Aug 4, 2018

If now it's fine for you, i can rebase to master and fix the conflicts. I'd do it in a different branch.

@pastly pastly closed this in 5e363d8 Sep 5, 2018
pastly added a commit that referenced this issue Sep 5, 2018
getpath is necessary so that ~ is expanded

GH: closes #241
trac: implements #26937
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.