-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[percona-xtrabackup] prevent High memory usage for no reason (IO_CLOSE) #1724
[percona-xtrabackup] prevent High memory usage for no reason (IO_CLOSE) #1724
Conversation
Hi, why did you close this PR? |
Hi @sduehr, |
Sure, I've set the on-hold label. We have a systemtest for the plugin, see
The systemtest is probably too specific to be used outside of the build environment, but Thanks again, waiting for your notification start review. Regards, |
3382006
to
8171df0
Compare
Hi @sduehr, is there any way to rerun tests? |
8171df0
to
2e34195
Compare
Thanks, build and tests are automatically triggered when new commits are pushed. |
2e34195
to
051ac71
Compare
I rewrote the commit message and added details and explanations about the importance of the changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider trying self.stream.terminate()
first, and only if that fails use self.stream.kill()
?
core/src/plugins/filed/python/percona-xtrabackup/bareos-fd-percona-xtrabackup.py
Outdated
Show resolved
Hide resolved
core/src/plugins/filed/python/percona-xtrabackup/bareos-fd-percona-xtrabackup.py
Outdated
Show resolved
Hide resolved
Good idea. Honestly, I wrote a separate function for the dump process termination with timeouts and TERM signal instead of killing it straightaway. |
6a2b699
to
b9d2b5f
Compare
Thanks, normally an additional commit is ok. But in this case it's only one file, so force push is ok, too. |
Thanks, looks good so far. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check my suggestions. Are you able to reproduce the case where end_dumper()
would go in the second timeout, where the process didn't terminate even after sending SIGTERM?
core/src/plugins/filed/python/percona-xtrabackup/bareos-fd-percona-xtrabackup.py
Outdated
Show resolved
Hide resolved
core/src/plugins/filed/python/percona-xtrabackup/bareos-fd-percona-xtrabackup.py
Show resolved
Hide resolved
core/src/plugins/filed/python/percona-xtrabackup/bareos-fd-percona-xtrabackup.py
Outdated
Show resolved
Hide resolved
core/src/plugins/filed/python/percona-xtrabackup/bareos-fd-percona-xtrabackup.py
Outdated
Show resolved
Hide resolved
I'm not sure that is 100% reproducible on my side. I saw an empty return code straight after sending SIGTERM which is why I put an additional 30 seconds waiting for the correct termination. I agree that that is possible not to wait 2nd 30 sec and jump to the else block with setting 9 to the stream's return code, but prefer to wait a bit more seconds. |
Fix non-terminating subprocess behaviour in case of bareos-storage unavailable during a Backup job. This commit terminates the subprocess in case bareos-storage is unavailable for some reason during a Backup job. This is a bug fix because the previous behaviour causes high memory usage that may cause OOM in some cases. Move the logic of terminating the xtrabackup(dumper) subprocess in case of IO_CLOSE to separate function end_dumper(). Add timeouts and change the kill signal to terminate. Add convenient debug and job messages.
Code formatted using black and corrected copyright yeyr to comply with our coding guidelines.
Co-authored-by: sduehr <stephan.duehr@bareos.com>
Agree, however, I'm not sure that the 9 exit code is the right one here. I've never been there or seen such a scenario, but potentially it may happen. Co-authored-by: sduehr <stephan.duehr@bareos.com>
Co-authored-by: sduehr <stephan.duehr@bareos.com>
3c72f85
to
c712412
Compare
OK, thanks. Yes, setting 9 to the stream's return isn't perfect, but ok for now. |
In the case of IO_CLOSE inside the plugin, we don't need to wait for the subprocess to finish, because it trying to get the data from xtrabackup, but the destination for such data to backup is unavailable.
It causes high memory usage and OOM in some cases.
Please check
If you have any questions or problems, please give a comment in the PR.
Helpful documentation and best practices
Checklist for the reviewer of the PR (will be processed by the Bareos team)
Make sure you check/merge the PR using
devtools/pr-tool
to have some simple automated checks run and a proper changelog record added.General
Source code quality