Innobackupex stops streaming after 3.2GB #617

Closed
barttenbrinke opened this issue Dec 9, 2014 · 14 comments

@barttenbrinke
Contributor

The command executed by backup is this:

  sh -c "{ { /usr/bin/innobackupex --user=root --password='REPLACED' --host='localhost' --port='3306'  --no-timestamp /var/rails/backup/shared/tmp/core_mysql_snapshot/databases/MySQL-core2.bkpdir  && /usr/bin/innobackupex --apply-log /var/rails/backup/shared/tmp/core_mysql_snapshot/databases/MySQL-core2.bkpdir --use-memory=4G   && /bin/tar --remove-files -cf -  -C /var/rails/backup/shared/tmp/core_mysql_snapshot/databases MySQL-core2.bkpdir 2>&4 ; echo \"0|$?:\" >&3 ; } | { /bin/gzip 2>&4 ; echo \"1|$?:\" >&3 ; } | { /bin/cat > '/var/rails/backup/shared/tmp/core_mysql_snapshot/databases/MySQL-core2.tar.gz' 2>&4 ; echo \"2|$?:\" >&3 ; } } 3>&1 1>&2 4>&2"

If I do this in a bash console, it works as expected. If I trigger this via a backup action, the backup process grinds to a halt while performing the snapshot:

  /usr/bin/innobackupex --user=root --password='REPLACED' --host='localhost' --port='3306'  --no-timestamp /var/rails/backup/shared/tmp/core_mysql_snapshot/databases/MySQL-core2.bkpdir

It hangs when about 3.2 GB is transferred and I see no progress in file size/top/whatsoever. Is there some way for me to debug this further?

@barttenbrinke
Contributor Author

Backtrace when I interrupt:

  [2014/12/09 15:43:14][info] Database::MySQL (core2) Started...
  ^C[2014/12/09 15:50:07][error] Model::Error: Backup for Snapshot and backup the entire mysql databases for core (core_mysql_snapshot) Failed!
  [2014/12/09 15:50:07][error] --- Wrapped Exception ---
  [2014/12/09 15:50:07][error] Pipeline::Error: Pipeline failed to execute
  [2014/12/09 15:50:07][error] --- Wrapped Exception ---
  [2014/12/09 15:50:07][error] Interrupt:
  [2014/12/09 15:50:07][error]
  [2014/12/09 15:50:07][error] Backtrace:
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/pipeline.rb:55:in `read'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/pipeline.rb:55:in `block in run'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/open4-1.3.0/lib/open4.rb:107:in `call'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/open4-1.3.0/lib/open4.rb:107:in `do_popen'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/open4-1.3.0/lib/open4.rb:30:in `popen4'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/pipeline.rb:54:in `run'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/database/mysql.rb:93:in `perform!'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/model.rb:269:in `each'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/model.rb:269:in `block in perform!'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/model.rb:268:in `each'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/model.rb:268:in `perform!'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/lib/backup/cli.rb:156:in `perform'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/thor-0.18.1/lib/thor/command.rb:27:in `run'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/thor-0.18.1/lib/thor/invocation.rb:120:in `invoke_command'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/thor-0.18.1/lib/thor.rb:363:in `dispatch'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/thor-0.18.1/lib/thor/base.rb:439:in `start'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/gems/backup-4.1.4/bin/backup:5:in `<top (required)>'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/bin/backup:23:in `load'
  [2014/12/09 15:50:07][error]   /var/rails/backup/vendor/bundle/ruby/2.1.0/bin/backup:23:in `<main>'

@tombruijn
Member

@barttenbrinke The pipeline error comes from Backup; it's raised because of the interrupt you sent. The process was either still running or hanging at that point. At least, it was still busy inside the Backup pipeline.

In order to debug that you would really have to open up the pipeline: https://github.com/meskyanichi/backup/blob/f2228011c33006d2802002c8c27ea3bfe14acf3c/lib/backup/pipeline.rb#L54-L66

I'm busy creating a DB at least 3.2GB in size to see if I can reproduce the same issue. Anything else I should know about your configuration? What compression, encryption, storage, etc. are you using?

@barttenbrinke
Contributor Author

Storage: SSD RAID 10, XFS, dedicated hardware. Percona MySQL, so nothing really special. What am I looking for when I open up the pipeline?

@barttenbrinke
Contributor Author

It works when I run the pipeline command manually from the command line, by the way. Running under Ruby 2.1.2p95.

@tombruijn
Member

Sorry for the late response @barttenbrinke; I attempted to create a similar setup, but after spending a couple of hours on it I've given up.

What you can do to figure out what is wrong:

Try disabling all other actions in your backup model: disable compression, encryption, etc. and back up to a local directory, so that only the database MySQL and store_with Local configuration options remain. If it still hangs, copy the command that the pipeline receives and execute it from a simple Ruby file.

# my_example_file.rb

pipeline_command_here = "PASTE THE PIPELINE COMMAND HERE"
puts `#{pipeline_command_here}`

If it also hangs in this setup, it's probably that the command takes so long, or pushes so much data through, that the Ruby process detaches from the command it's executing. If that's the case there's unfortunately not a lot we can do. If it does work, the hang might be in the Open4 implementation: https://github.com/meskyanichi/backup/blob/f2228011c33006d2802002c8c27ea3bfe14acf3c/lib/backup/pipeline.rb#L54-L66. Copy the basic setup from there and try running it.

# my_example_file_open4.rb

require "open4"
pipeline = "YOUR COMMAND"
Open4.popen4(pipeline) do |pid, stdin, stdout, stderr|
  puts pid
  puts stdout.read
  puts stderr.read
end
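One caveat worth noting about the sketch above: `stdout.read` blocks until the child closes stdout, so if the child fills the stderr pipe buffer in the meantime, both processes deadlock. A hypothetical variant (using Ruby's stdlib Open3 rather than Open4, so this is not Backup's actual code) that drains both streams in separate threads:

```ruby
# my_example_file_open3.rb -- a sketch, not part of Backup itself.
# Reads stdout and stderr concurrently so a full pipe buffer on
# either stream can never block the child process.
require "open3"

def run_draining(cmd)
  Open3.popen3(cmd) do |stdin, stdout, stderr, wait_thr|
    stdin.close
    out = Thread.new { stdout.read }  # reader threads keep both
    err = Thread.new { stderr.read }  # pipes drained at all times
    [out.value, err.value, wait_thr.value.exitstatus]
  end
end

out, err, status = run_draining("echo hello; echo oops 1>&2")
puts out     # "hello"
puts err     # "oops"
puts status  # 0
```

If the command only hangs under Open4's sequential reads but runs fine with concurrent draining, that would point at a pipe-buffer deadlock rather than at innobackupex itself.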

I would suggest trying the above options and posting the results. If the problem is in the actual Backup pipeline rather than the Ruby implementation, I'll take another look.

innobackupex is not a "core" supported feature; it was added rather recently by a community PR, and I have no in-depth knowledge of how it should work. Maybe @chesterbr knows more, since he implemented it.

@barttenbrinke
Contributor Author

I will try that. Thanks!

@chesterbr
Contributor

This is really odd. I use it with a >250GB database with no hanging or process detachment (it is set up to compress, split and upload to S3). Ruby is still 1.9.3, but there should be no reason for it to behave differently on 2.x.

Everything points to innobackupex, or, to be exact, to xtrabackup (which is what actually does the work). But the fact that it runs fine from the shell makes this pretty confusing. I'm assuming it was run as the same user that runs the backup gem, so we can rule out OS permissions (although, to be fair, whenever I had permission problems an error appeared right at the beginning).

Out of other options, I'd check whether file ownership/permissions on the database files are all the same (they typically belong to mysql:mysql). Beyond that, I really don't have a clue 😕

Good luck!

@barttenbrinke
Contributor Author

Backup script:

Model.new(:core_mysql_snapshot_test, 'Snapshot and backup the entire mysql databases for core') do
  database MySQL, :core2 do |db|
    db.backup_engine      = :innobackupex
    db.name               = "production"
    db.username           = "root"
    db.password           = "OBFUSCATED"
    db.host               = "localhost"
    db.port               = 3306
    db.verbose            = true
  end

  store_with Local do |local|
    local.path = '/tmp/test/'
    local.keep = 5
  end
end

Command from ps aux | grep backup:

sh -c { { /usr/bin/innobackupex --user=root --password='OBFUSCATED' --host='localhost' --port='3306'  --no-timestamp   /tmp/core_mysql_snapshot_test/databases/MySQL-core2.bkpdir  && /usr/bin/innobackupex --apply-log   /tmp/core_mysql_snapshot_test/databases/MySQL-core2.bkpdir    && /bin/tar --remove-files -cf -  -C   /tmp/core_mysql_snapshot_test/databases MySQL-core2.bkpdir 2>&4 ; echo "0|$?:" >&3 ; } | { /bin/cat > '  /tmp/core_mysql_snapshot_test/databases/MySQL-core2.tar' 2>&4 ; echo "1|$?:" >&3 ; } } 3>&1 1>&2 4>&2

This also hangs.

@barttenbrinke
Contributor Author

Via backticks it works (calling ruby test.rb). Now trying the Popen4 one.

@barttenbrinke
Contributor Author

Popen4 stops after 4GB, without any notification whatsoever.
I've made the following workaround:

Model.new(:core_mysql_snapshot, 'Snapshot and backup the entire mysql databases for core') do
  innodb_dump_path = "#{Config.tmp_path}/innodbdump"

  before do
    system("rm -rf #{innodb_dump_path}")
    result = system("/usr/bin/innobackupex --user=root --host='localhost' --port='3306' --password='PASSWORD' --no-timestamp #{innodb_dump_path}")
    raise 'terror' unless result
  end

  archive :core2 do |archive|
    archive.add innodb_dump_path
  end

  after do |_exit_status|
    system("rm -rf #{innodb_dump_path}")
  end

  compress_with Custom
  store_with S3
  notify_by Slack
end
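A side note on why this workaround sidesteps the hang (my reading, not confirmed in the thread): Kernel#system does not create pipes for the child's output; the child inherits the parent's stdout and stderr, so innobackupex's output never sits in a pipe waiting for Ruby to drain it. A standalone illustration (the child process and sizes here are made up for the demo):

```ruby
# Kernel#system gives the child our own file descriptors (no pipes),
# so a child that writes far more than a pipe buffer (~64 KB on Linux)
# can never block waiting for the parent to read.
# err: File::NULL only keeps the demo quiet; the key point is that
# there is no parent-drained pipe, unlike with popen4.
ok = system("ruby", "-e", "warn 'x' * 200_000", err: File::NULL)
puts "child exited cleanly: #{ok}"  # true when the command exits 0
```

The trade-off is that system() can't capture the output for logging, which is why Backup's pipeline uses popen4 in the first place.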

@barttenbrinke
Contributor Author

https://github.com/ahoward/open4 1.3.4 doesn't help either :(

@tombruijn
Copy link
Member

@barttenbrinke alright, at least we now know where it is. It's strange, however, that @chesterbr has no issues pushing a 250GB database through it while for you it hangs at ~4GB.

Since I can't run innobackupex, I'll ask you: does innobackupex with your config output a lot of data on stdout or stderr when you run it directly? If so, that might cause problems for open4. Try to disable as much output as possible.

There might be other differences in the config that cause problems as well. Please share any non-default config options you might have changed. Other than that, I can't think of a reason it would work on one machine and not the other (hardware aside).

If we can't find the cause we might need to swap out the use of open4 just to be sure, but that's a time-consuming overhaul I'd like to avoid.

@barttenbrinke
Contributor Author

Maybe put this workaround somewhere in the documentation and leave it at that for now.

@taf2
Copy link

taf2 commented Jun 1, 2015

I was having this same issue. Setting verbose = false seems to have solved the problem for me.
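That fix is consistent with a pipe-buffer explanation: with verbose on, innobackupex writes a lot to stderr, and a pipe only holds a fixed amount of data (commonly 64 KB on Linux) before the writing side blocks. If the parent never drains that pipe, the child hangs mid-stream. A quick standalone way to see the limit (not related to Backup's code):

```ruby
# pipe_buffer_demo.rb -- measures how much data a pipe accepts
# before the writing side would block.
r, w = IO.pipe
written = 0
begin
  # write_nonblock raises an IO::WaitWritable error once the
  # kernel's pipe buffer is full, instead of blocking.
  loop { written += w.write_nonblock("x" * 1024) }
rescue IO::WaitWritable
end
puts "pipe buffer filled after #{written} bytes"  # typically 65536 on Linux
w.close
r.close
```

Anything the child writes beyond that, on a pipe nobody is reading, blocks it indefinitely, which would match a backup that stalls partway through with no error.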
