Innobackupex stops streaming after 3.2GB #617
Backtrace when I interrupt:
@barttenbrinke The pipeline error comes from Backup and is raised by the interrupt you sent. It looks like the process was either still running or hanging; it was at least still busy inside the Backup pipeline. To debug that you would really have to open the pipeline up: https://github.com/meskyanichi/backup/blob/f2228011c33006d2802002c8c27ea3bfe14acf3c/lib/backup/pipeline.rb#L54-L66. I'm creating a DB of at least 3.2 GB to see if I can reproduce the issue. Anything else I should know about your configuration? What compression, encryption, storage, etc. are you using?
Storage: SSD RAID 10, XFS, dedicated hardware. MySQL is Percona, so nothing really special. What am I looking for when I open up the pipeline?
It works when I run the pipeline command manually from the command line, btw. Running under Ruby 2.1.2p95.
Sorry for the late response @barttenbrinke. I attempted to create a similar setup, but after spending a couple of hours on it I've given up. To figure out what is wrong: try disabling all other actions in your backup model (compression, encryption, etc.) and back up to a local directory, so only the database itself remains. Then run the pipeline command through plain backticks from a test script:

```ruby
# my_example_file.rb
puts `#{pipeline_command_here}`
```

If it also hangs in this setup, it's probably running so long, or pushing so much data through, that the Ruby process detaches from the command it's executing. If that's the case there's unfortunately not a lot we can do. If it does work, the hang might be in the Open4 implementation: https://github.com/meskyanichi/backup/blob/f2228011c33006d2802002c8c27ea3bfe14acf3c/lib/backup/pipeline.rb#L54-L66. Copy the basic setup from there and try running it:

```ruby
# my_example_file_open4.rb
require "open4"

pipeline = "YOUR COMMAND"

Open4.popen4(pipeline) do |pid, stdin, stdout, stderr|
  puts pid
  puts stdout.read
  puts stderr.read
end
```

I would suggest trying the above options and posting the results. If it's in the actual Backup pipeline rather than the Ruby implementation, I'll take another look. innobackupex is not a "core" supported feature; it was added fairly recently by a community PR, and I have no in-depth knowledge of how it should work. Maybe @chesterbr knows more, since he implemented it.
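One failure mode consistent with a silent hang: if the child process writes more to its stderr pipe than the OS buffer holds (typically 64 KB on Linux) while the parent is still blocked in `stdout.read`, both processes deadlock. A minimal sketch of draining both streams concurrently, using the stdlib `Open3` in place of the open4 gem; the noisy child command here is a stand-in, not innobackupex:

```ruby
require "open3"

# Stand-in for the real pipeline: a child that writes far more than a
# pipe buffer (~64 KB on Linux) to *stderr* before touching stdout.
pipeline = %q(ruby -e '$stderr.write("x" * 200_000); $stdout.write("done")')

out = err = nil
Open3.popen3(pipeline) do |stdin, stdout, stderr, wait_thr|
  stdin.close
  # Drain stdout and stderr concurrently; reading them one after the
  # other can deadlock once the unread pipe's buffer fills up.
  t_out = Thread.new { stdout.read }
  t_err = Thread.new { stderr.read }
  out = t_out.value
  err = t_err.value
  wait_thr.value # reap the child
end

puts out          # => "done"
puts err.bytesize # => 200000
```

If the sequential `stdout.read` / `stderr.read` pattern from the open4 snippet hangs on your pipeline but this threaded version does not, a filled stderr pipe buffer is the likely culprit.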
I will try that. Thanks!
This is really odd. I use it with a >250 GB database with no hanging or process detachment (it is set up to compress, split, and upload to S3). Ruby is still 1.9.3, but there should be no reason for it to work differently on 2.x. Everything points to innobackupex (or, to be exact, to xtrabackup, which is the one that actually does the work), but the fact that it runs fine from the shell makes it pretty confusing. I'm assuming it was run as the same user that runs the backup gem, so we can rule out OS permissions; although, to be fair, whenever I had such problems the error would appear right at the beginning. Out of options, I'd check whether the ownership and permissions on the database files are all the same (they typically belong to mysql:mysql), and other than that I really don't have a clue 😕 Good luck!
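The ownership check suggested above can be scripted. A hedged sketch: the `mysql:mysql` expectation is the conventional default, and the datadir path is an assumption, not something confirmed in this thread; check your my.cnf for the real one.

```ruby
require "etc"

# Return entries of `dir` whose owner:group differ from the expected
# pair (mysql:mysql for a typical MySQL datadir).
def wrong_owners(dir, user: "mysql", group: "mysql")
  Dir.glob(File.join(dir, "*")).reject do |path|
    st = File.stat(path)
    Etc.getpwuid(st.uid).name == user && Etc.getgrgid(st.gid).name == group
  end
end

# On a healthy setup this should print an empty list:
# p wrong_owners("/var/lib/mysql")  # path is an assumption; check my.cnf
```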
Backup script:
Command from `ps aux | grep backup`:
This also hangs.
Via backticks it works (calling `ruby test.rb`). Now trying the `Open4.popen4` one.
The `Open4.popen4` version stops after 4 GB, without any notification whatsoever.
open4 1.3.4 (https://github.com/ahoward/open4) doesn't help either :(
@barttenbrinke Alright, we at least know where it is. It's strange, however, that @chesterbr has no issues pushing a 250 GB database through it while for you it hangs at ~4 GB. Since I can't run innobackupex, I'll ask you: does your innobackupex configuration output a lot of data on stdout or stderr when you run it directly? If so, that might cause problems for open4; try to disable as much output as possible. There might be other differences in the config causing problems as well, so please share any non-default settings you've changed. Other than that I can't think of a reason it would work on one machine and not the other (excluding hardware). If we can't find the cause we might need to swap out the use of open4 just to be sure, but that's a time-consuming overhaul I'd like to avoid.
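Until the output can be disabled in the tool itself, one low-effort workaround along these lines is to redirect the command's stderr to a log file inside the pipeline string, so the Ruby side never has to drain that pipe. A sketch with a stand-in noisy command; the log path and the command are placeholders, not the real innobackupex invocation:

```ruby
# Stand-in for a command that is noisy on stderr, like a verbose
# innobackupex run; "2>" sends that noise to a file instead of a pipe.
log   = "/tmp/pipeline_stderr.log"
noisy = %q(ruby -e '$stderr.write("progress " * 20_000); $stdout.write("ok")')

output = `#{noisy} 2>#{log}`

puts output         # => "ok"
puts File.size(log) # => 180000
```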
Maybe put this workaround somewhere in the documentation and leave it at that for now.
I was having this same issue. Turning `verbose = false` seems to have solved the problem for me.
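The comment doesn't show where that flag lives; assuming it is set on the database section of the Backup model, a hypothetical fragment (everything except the `verbose = false` line is illustrative, not taken from this thread):

```ruby
# Hypothetical Backup model fragment; only `verbose = false` comes from
# the comment above, the surrounding names are illustrative.
Backup::Model.new(:my_backup, 'MySQL via innobackupex') do
  database MySQL do |db|
    db.verbose = false # silence innobackupex progress output on stderr
  end
end
```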
The command executed by backup is this:
If I do this in a bash console, it works as expected. If I trigger this via a backup action, the backup process grinds to a halt while performing the snapshot:
It hangs when about 3.2 GB has been transferred, and I see no progress in file size, top, or anywhere else. Is there some way for me to debug this further?