
Allow exporting/importing containers (for backups) #3730

Closed
stgraber opened this Issue Aug 28, 2017 · 33 comments

Comments

@stgraber
Member

stgraber commented Aug 28, 2017

This is a placeholder issue for us to think about and implement a better, simpler backup mechanism for those who want offline backups of their containers.

@stgraber stgraber added this to the later milestone Aug 28, 2017

@Kramerican

Kramerican commented Aug 28, 2017

I have been doing this with a script adapted from https://github.com/vistoyn/backup-tools, where you essentially dump the container config and then tar-gzip it along with the rootfs, excluding /proc, /sys and /tmp.
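In shell terms that amounts to roughly the following (a sketch, not the actual script; container name and paths are illustrative and assume the default /var/lib/lxd layout):

lxc config show --expanded mycontainer > /backup/mycontainer.config.yaml
tar --exclude='rootfs/proc/*' --exclude='rootfs/sys/*' --exclude='rootfs/tmp/*' \
    -czpf /backup/mycontainer.tar.gz -C /var/lib/lxd/containers/mycontainer rootfs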

I've also written a number of CLI tools to restore existing containers from such backups, or to provision new ones (fixing up the network config, etc.). However, this is not very generalizable.

Call me old-fashioned, but I simply do not trust the snapshot functionality as-is, or backup chains in general. I've lived through one too many horror stories on that front. So in my mind, a wholesale portable snapshot is the way to go.

It would be neat if this were baked in, as being able to take live snapshots of containers this way is a strong "selling point", and it would put newcomers more at ease, since getting this working as-is takes a bit of research and trial and error.

So yeah, huzzah!

One thing that could be improved: as-is, I have to import my tarball as an image, do lxc init from it, and then nuke the temporary image again, which seems a bit ... indirect.

@huegelc

huegelc commented Aug 28, 2017

I'm following this topic with great interest. I've tried the different methods of backing up LXD containers described in the LXD backup strategy, and for now I'm sticking with doing this at the filesystem level (btrfs). I know it's not 100% reliable and maybe not the "right way", but in the absence of other options it's my best choice.

@davidfavor

davidfavor commented Aug 29, 2017

@Kramerican, it would be great if you could post your CLI restore tools, in a zip file.

I'm in the same situation, where I have 100s of LXD containers running across many machines + require a mechanism to restore a container on any machine, any time + have the container work.

This was very easy with LXC + ridiculously difficult with LXD.

I'll take a look at the above link.

Thanks for posting it.

@Kramerican

Kramerican commented Aug 30, 2017

@davidfavor our tools are very much baked into our management tooling, so it would be a bit difficult and time-consuming to split out the relevant parts for you. However, feel free to shoot me any specific questions you might have and I'll help as best I can.

@Kramerican

Kramerican commented Sep 15, 2017

I have encountered an issue in connection with this recently, which is a bit of a catch-22 for me.

As mentioned, I want portable off-site backups of containers - that's why I don't use snapshotting.

This worked fine until we had a container with hundreds of thousands of files on an HDD setup. That container would take about 10 hours to "snapshot" instead of the usual couple of minutes.

This is clearly not acceptable in our setup, so now I am investigating alternatives. I have looked at something like https://zpool.org/zfs-snapshots-and-remote-replication/ - but we are unfortunately limited by our remote storage, which is a dumb FTP server.

The ultimate scenario would be the ability to quickly create portable snapshots of containers and copy them off-site for storage, like you can with so many other traditional VM solutions.

@tomposmiko

tomposmiko commented Sep 15, 2017

AFAIK you can dump ZFS snapshots and upload them to the ftp server.

@Kramerican

Kramerican commented Sep 15, 2017

@tomposmiko got anything you can link to on this?

@stgraber

Member

stgraber commented Sep 17, 2017

zfs send blah@snapshot > file.img

Then when you want to restore: zfs receive blah < file.img
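For the dumb-FTP-server case mentioned above, a rough end-to-end sketch (dataset, snapshot and FTP details are all illustrative):

zfs snapshot lxd/containers/blah@backup
zfs send lxd/containers/blah@backup | gzip > /tmp/blah-backup.img.gz
curl -T /tmp/blah-backup.img.gz ftp://ftp.example.com/backups/ --user backupuser:secret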

@Kramerican

Kramerican commented Sep 17, 2017

@stgraber I have stumbled on this myself these last few days, and have had wonderfully mixed, but always far superior, results with zfs send / receive. Mixed in the sense that zfs send fluctuates wildly in speed for me, but I've reached out to the zfs mailing list to diagnose this.

The remainder here is probably a bit off topic, but I'll leave it here for your comments and if others might find this useful ... :

It is now clear to me that it is crazy to consider descending into the filesystem with tar and making tarballs if your objective is snapshots of running containers. That is simply not the correct tool for the job when you have zfs at your disposal.

I have now updated all my snapshotting and restoration workflows for my portable backups to use zfs send / receive. This is not in production yet, however, as I am doing things a bit gung-ho with regards to restoration. Let me explain:

Rolling back a running container, I do something like: receive the backup into a temporary dataset, stop the container, destroy the original dataset (and its snapshots), then rename the temporary dataset into place to minimize downtime:

pigz -c -d /backup/testsnap.gz | zfs recv -F lxd/containers/temppool
lxc stop mycontainer
zfs destroy -r lxd/containers/mycontainer
zfs rename lxd/containers/temppool lxd/containers/mycontainer
lxc start mycontainer

Provisioning a new container based on a snapshotted dataset: init an "empty" Ubuntu container, destroy its storage dataset, import the backed-up dataset, start the container, then update the network config and whatever else needs doing:

lxc init ubuntu:xenial testnewcontainer --profile webdock
zfs destroy -r lxd/containers/testnewcontainer
pigz -c -d /backup/testsnap.gz | zfs recv -F lxd/containers/testnewcontainer
lxc start testnewcontainer
... Update network config and whatever else needs doing

This approach is something I came up with as a test, just to see if it would work, and for now it seems to work just fine. I haven't seen anybody else doing things this way, however (not that I could find), which leads me to suspect there may be more to this kettle of fish. Is this a crazy way of doing it, i.e. just replacing the dataset wholesale like I do here? I can't think of anything that would theoretically be wrong with it, but then again I'm not well versed in the internals of LXD.

@stgraber

Member

stgraber commented Sep 17, 2017

What you're doing should be fine and I in fact intend for us to support something like this natively with #3730. Effectively letting you export a container (and its snapshots) as a tarball which either contains the raw fs tree (good old tarball) or contains the storage backend optimized format (result of zfs send, btrfs send, ...).

What format you'd export as would therefore depend on what you expect to restore your container on.

@Kramerican

Kramerican commented Sep 17, 2017

@stgraber Thank you for setting my mind at ease, and awesome to hear this will be supported natively so I can clean up my scripts a bit 😃

@stgraber stgraber modified the milestones: later, lxd-2.19 Sep 20, 2017

@stgraber stgraber modified the milestones: lxd-2.19, lxd-2.20 Oct 17, 2017

@davidfavor

davidfavor commented Oct 24, 2017

Regarding ZFS... I've had to fall back to using ext4, because ZFS speed fluctuates wildly + the client sites I host in LXD containers are all high-traffic sites... so the speed fluctuations make ZFS a no-go.

@davidfavor

davidfavor commented Oct 24, 2017

@Kramerican, when you do the LXD restore, could you pass along how you handle regenerating /dev + /sys + /proc? This seems like the only sticking point in the restore process.

@stgraber, maybe you can comment on this also.

Looking at https://github.com/vistoyn/backup-tools, it looks like this person's approach is to lxc init $container, which is a bit simplistic. Taking this approach likely requires init'ing the container from an exact match of the image the backed-up container was originally created from.

That approach ensures correct creation of /dev + /proc + /sys, but seems like huge overkill.
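As far as I can tell, that restore boils down to something like the following sketch (purely illustrative; paths assume the rootfs is directly accessible on the host, e.g. a dir-backed pool under /var/lib/lxd):

lxc init ubuntu:xenial restored        # image should match the one the backup was taken from
tar -xzpf /backup/mycontainer.tar.gz -C /var/lib/lxd/containers/restored
mkdir -p /var/lib/lxd/containers/restored/rootfs/proc /var/lib/lxd/containers/restored/rootfs/sys
lxc start restored                     # /dev, /proc and /sys get mounted by LXD/liblxc at start, so empty directories are enough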

If there's a better/faster/lighter-weight approach, I'd love any suggestions.

Thanks.

@Kramerican

Kramerican commented Oct 24, 2017

@davidfavor I am no longer tar'ing the container. I had no problems with the method you refer to, except for the obvious issues of speed and impracticality vs. zfs.

The speed issues you have experienced with zfs: Is that when doing zfs send/receive, or just disk i/o in general?

I had tremendously varied results with zfs send/receive until I sparred with the nerds on the zfs mailing list. From that I gathered a bunch of excellent suggestions for optimizations suitable for LXD containers. I have been meaning to do a writeup of this and post it... somewhere, when I have time.

In essence: containers suffer from data fragmentation, which slows down disk i/o - especially on webservers with heavy DB access or a lot of file creation/deletion. You can essentially "defrag" the dataset with a couple of simple commands, and mitigate the fragmentation with some zfs settings.

If you are seeing bad disk i/o in general, I do not have any solutions for you - except for a few zfs optimization settings: make sure compression is on, do zfs set xattr=sa lxd, possibly zfs set sync=disabled lxd, and possibly zfs set atime=off lxd, as well as tweaking the amount of RAM zfs can use. PS: do not use these settings without reading up on what they do, as they can result in data loss in certain scenarios.
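Collected in one place, on a pool/dataset named lxd as in the commands above (again, read up on each before applying, especially sync=disabled):

zfs set compression=lz4 lxd    # cheap CPU-for-i/o trade, usually a win for container workloads
zfs set xattr=sa lxd           # store xattrs in the dnode rather than as hidden files
zfs set atime=off lxd          # skip access-time writes on every read
zfs set sync=disabled lxd      # fastest, but recent writes can be lost on power failure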

I'd suspect something other than zfs is giving you trouble maybe ...

@stgraber

Member

stgraber commented Mar 30, 2018

Just a few design notes on this:

API

  • New API endpoint at /1.0/containers/NAME/backups
    • POST lets you create backups using a struct with the following options (background operation):
      • name (Unique identifier for the backup)
      • expiry (when to delete the backup automatically)
      • container_only (if True, snapshots aren't included)
      • optimized_storage (if True, btrfs send or zfs send is used for container and snapshots)
    • GET lets you list existing backups for the container (supports recursive=1 to list the structs)
  • New API endpoint at /1.0/containers/NAME/backups/NAME
    • GET lets you retrieve the struct
    • DELETE lets you remove the backup
    • POST lets you rename the backup
  • New API endpoint at /1.0/containers/NAME/backups/NAME/export
    • GET lets you fetch the backup tarball
  • Extend POST /1.0/containers to allow reading a tarball sent during POST and create the container based on its content (including snapshots)

Database

  • New table containers_backups with at least those fields:
    • id
    • name
    • creation_date
    • expiry_date
    • container_only
    • optimized_storage

Client support

We'll keep client support for this pretty limited at the beginning with just two extra top-level commands:

  • lxc export [<remote>:]<container> [target] [--container-only] [--optimized-storage]
  • lxc import [<remote>:] <backup file>

So export would effectively ask LXD to create a backup, setting a short expiry for it, then retrieve it once made and delete the backup. Import would read a backup file and POST it to LXD to have it restored.
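For illustration, the raw API interaction and the proposed CLI would look roughly like this (using lxc query; field names follow the design above and all values are made up):

lxc query --wait -X POST -d '{"name": "backup0", "expiry": "2018-04-30T00:00:00Z", "container_only": false, "optimized_storage": true}' /1.0/containers/c1/backups
lxc query /1.0/containers/c1/backups/backup0
lxc export c1 c1-backup.tar.gz --optimized-storage
lxc import c1-backup.tar.gz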

@brauner @monstermunchkin assigning this to the two of you.
@monstermunchkin you can try to do all the API addition bits, structs, API extension, ... and then sync with @brauner for the actual generation and consumption of the tarball.

@l00mi

l00mi commented Apr 23, 2018

We are excited to see this feature coming.

For the last two years we have been using our own scripts, running something like this:
zfs send $zfssrc/$container@snapshot-$id | lz4 > $zfslcl/$container-$id.zfs.lz4

We also use the zfs copy to another lxd daemon as a "stand-by" backup.

For both we would love to see an incremental (#3326) option, as the backup load on the network is slowly reaching a critical size for us.
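Until then, the incremental variant with plain zfs looks roughly like this (assuming $previd and $newid name the previous and new snapshots; the receiving side needs the $previd snapshot already present):

zfs snapshot $zfssrc/$container@snapshot-$newid
zfs send -i @snapshot-$previd $zfssrc/$container@snapshot-$newid | lz4 > $zfslcl/$container-$newid.incr.zfs.lz4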

@huegelc

huegelc commented Apr 23, 2018

Same here: we have been using https://github.com/digint/btrbk (almost) since the beginning. It works really well with incremental backups.


@l00mi

l00mi commented Apr 23, 2018

@davidfavor
Ha, good to know: Backup is easy, Restore is hard (-:

Unfortunately we can't stop our machines for the backup. Is there any way to instruct the database to save to disk before a backup?


@Kramerican

Kramerican commented Apr 23, 2018

@davidfavor On our ZFS-backed systems, we've been doing:

/sbin/zfs snapshot lxd/containers/$LXD_NAME@$BACKUP_TYPE

and then

/sbin/zfs send lxd/containers/$LXD_NAME@$BACKUP_TYPE | /usr/bin/mbuffer -q -m 500M | /usr/bin/pigz -c -p 6 | /usr/bin/mbuffer -q -m 500M > $LXD_BACKUP/$LXD_SNAPSHOT.gz

We've been doing this for 7-8 months now, on 300+ containers running MariaDB, without a single instance of corrupted database data.

Granted, the vast majority of these sites are not busy websites, so maybe we just haven't noticed some potential data loss? At the very least, we have seen no corruption of data or problems arising on the database side.

Maybe we have just consistently been getting lucky? :)

Edit: We have restored many a webserver from snapshots like these. I don't have a precise count, but including internal migrations on our network it's in the hundreds. Again, we have never had a server not come up, or had problems due to corrupt MariaDB data.

Second edit: If you are wondering what mbuffer is for, it's because we stream from one disk to another on the system, typically from an SSD to an HDD - so mbuffer is there to optimize the transfer by "borrowing" some RAM, in case pigz lags behind or the HDD sees high i/o and can't receive the stream quickly enough.


@Kramerican

Kramerican commented Apr 25, 2018

@davidfavor Maybe this doesn't need to be that complicated? Looking at this:

https://serverfault.com/questions/805257/backing-up-a-mysql-database-via-zfs-snapshots

There may be some configuration / commands which can be used to make sure memory buffers are flushed when doing the snapshot. I'm going to monkey around with this a bit and see what I find. If you've already been down this path and failed, it would be much appreciated to know :)

Edit: From further reading, as I am using InnoDB (the default for MariaDB) and since the innodb_flush_log_at_trx_commit setting is set to 1 by default (more info here), as far as I can see there should be no chance of any major badness happening, and no need for any FLUSH TABLES ... business before snapshotting. I guess MariaDB might emit some messages about recovering the last transaction commit, but as far as I understand it, it's all good :)

If I have misunderstood something here, or you have more practical experience with this @davidfavor I would love to know.

Second edit: (I should really complete my research before writing replies...) From further reading, there may be some chance of the flush not completing and hence being lost in the snapshot. However, just because a snapshot didn't grab the very last commit, I don't really see how that could corrupt the database. We're not talking about a power outage here, where the contents of memory are lost. There'd just be some data that didn't make it into that snapshot, but neither will any of the data that comes after (more info on InnoDB's auto-recovery magic). From my current understanding of things - and this syncs with my practical experience so far - there should be no issue here. I can't speak for MySQL specifically, but if you're using InnoDB you should be good.

@michacassola

michacassola commented May 6, 2018

So, from reading this @stgraber:
Will the new feature of more native backups address the mentioned database troubles?

Also, more importantly:
For larger containers, will there be an incremental option?

@stgraber

Member

stgraber commented May 8, 2018

@michacassola no, this feature lets you extract the filesystem as it currently is into a tarball. If that filesystem is inconsistent because your database server hasn't been told to write everything out, then the backup will be inconsistent. Database servers should usually be special-cased in backups, either by turning them off, by doing a separate dump of the databases, or by syncing to a remote DB server.
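For the separate-dump route, a pre-backup step could look roughly like this (container name, paths and the use of mysqldump are illustrative; lxc export is the command proposed earlier in this issue):

lxc exec c1 -- sh -c 'mysqldump --all-databases --single-transaction | gzip > /var/backups/pre-backup.sql.gz'
lxc export c1 /backups/c1.tar.gz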

For incremental type backups, #3326 is what we'll be looking at.

@michacassola

michacassola commented May 9, 2018

Thank you @stgraber !

I don't have any large sites right now, but it's always better to know in case I do.

@davidfavor, you wrote:

The best way to do this is just to do a...

  1. lxc stop container

  2. lxc copy container in background

  3. restart container

At this point the container copy will run till it finishes.

Then stop mysqld on both servers + do a 2nd rsync /var/lib/mysql from
original container to copied container.

Then restart mysqld on both servers.

I don't get it.
Why not stop mysql, then stop the container, then copy it and restart it again? Would the copy then not be corruption-free? And you would save a step? Please enlighten me.

Also, why is a dump by mysql itself not an option? One could copy the container and do a mysql dump, and when restoring, put the mysql dump back into the restored container.

Thanks in advance! :)

@Kramerican

Kramerican commented May 9, 2018

@michacassola Did you see my comments? If you are using MariaDB/InnoDB you have no need to worry. Just snapshot and off you go ...

@michacassola

michacassola commented May 10, 2018

@Kramerican Yes, of course I did. But with the many edits I was not sure what the end result was. Thanks for making that clear!
But I will have to set innodb_flush_log_at_trx_commit to 1 or 2.
And I guess, @stgraber, if the log buffer is continually flushed to disk, the future LXD native backup function will work just as a ZFS snapshot would?

@brauner brauner closed this May 15, 2018


@davidfavor

davidfavor commented May 15, 2018

@Kramerican - ZFS + BTRFS backing stores are incredibly slow compared to raw EXT4, so EXT4 is really the only option for high-IOPS databases.

@davidfavor

davidfavor commented May 15, 2018

This entire problem comes back to several existing tickets, which have requested:

  1. Pre + Post hooks for lxc copy, so database start/stop can be integrated into copy operations.

  2. Allow passing rsync exclude lists, to optimize copy operations.

If these two features existed, working with large data containers would be much simpler.
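Until such hooks exist, the manual equivalent is roughly this (container, remote and service names are illustrative):

lxc exec mycontainer -- systemctl stop mariadb     # "pre" hook: quiesce the database
lxc copy mycontainer backuphost:mycontainer-backup
lxc exec mycontainer -- systemctl start mariadb    # "post" hook: bring the database back up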

@brauner

Member

brauner commented May 16, 2018

Here's an illustration of the new feature:
asciicast

@uocxp

uocxp commented Oct 2, 2018

+1 for incremental backups
