Flag to avoid backing up long-stopped containers #94
Comments
This is an interesting proposal, however I am not sure if it's possible to do this in a configuration friendly way because of the following: right now, a So if we were to implement such a feature, we'd need to come up with an API that allows linking a volume and a service definition. Right now, I have a hard time coming up with something that would work here, but I'll play with it in the back of my head for a little while. Do you have a good idea what such an API could look like? Looking at your underlying use case, another solution could be: introduce an option like |
You're right, as I am using a single specific An option could be to make this a global config instead of a label in a container and apply the rule to all matching containers. The condition to avoid making a backup would then be: In my case
|
I think I prefer the second approach in the variation you describe. It's just very easy to understand and would cover a lot of cases. The option could be called something like
Yeah that's a very good point. It would be a lot more intuitive if it's working as you described, also more useful. One problem to solve still would be: how do we know about the "last" backup? Pruning has the
This is correct, but the backup mechanism already has to walk the entire file tree so it can create the tar archive here: docker-volume-backup/cmd/backup/archive.go Lines 66 to 71 in 1b1fc48
|
That's great! Ok, so if we don't want to reuse the I have a few ideas in mind, I'll list them here in no particular order:
Personally I'd go with option 1 as that is 100% backwards compatible and keeps the logic and filesystem pretty clean. What do you think? Note: if we go through with this, it could also be used for the pruning mechanism. |
Option 1 sounds enticing, but to be honest I'm not entirely sure how such a regexp that is derived from Maybe using the prefix is dumb, but predictable and lets people sleep well (which should be an explicit design goal for a backup tool). Not sure right now what's the best way to go. Right now I would lean towards soft-deprecating In any case I think having a I think option 2 and 3 would work well but have too many drawbacks. |
The regex is pretty simple to generate, but the problem is actually with the env expand now that I think about it.
Notice how the second regex will match any backup from the first instance. Probably your |
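The overlap problem can be illustrated with a small sketch. The templates below are invented for illustration and are not the examples from the comment above, which did not survive in this copy of the thread. `templateToRegexp` is a hypothetical helper: it treats `%Y`, `%m` and `%d` as digit groups and, since the expanded value of a `${VAR}` placeholder is not known, matches it with `.+`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// varPattern matches ${NAME}-style placeholders in a filename template.
var varPattern = regexp.MustCompile(`\$\{[A-Za-z_][A-Za-z0-9_]*\}`)

// templateToRegexp derives an anchored regular expression from a filename
// template. The supported placeholder set is an assumption of this sketch.
func templateToRegexp(template string) (*regexp.Regexp, error) {
	// Swap variables for a sentinel first so that QuoteMeta cannot mangle them.
	const sentinel = "\x00"
	withSentinels := varPattern.ReplaceAllLiteralString(template, sentinel)
	quoted := regexp.QuoteMeta(withSentinels)

	pattern := strings.NewReplacer(
		sentinel, `.+`, // unknown expanded value: match anything
		"%Y", `\d{4}`,
		"%m", `\d{2}`,
		"%d", `\d{2}`,
	).Replace(quoted)

	return regexp.Compile("^" + pattern + "$")
}

func main() {
	first, err := templateToRegexp("backup-web-%Y-%m-%d.tar.gz")
	if err != nil {
		panic(err)
	}
	second, err := templateToRegexp("backup-${NAME}-%Y-%m-%d.tar.gz")
	if err != nil {
		panic(err)
	}

	fileFromFirstInstance := "backup-web-2022-04-20.tar.gz"
	fmt.Println(first.MatchString(fileFromFirstInstance))  // the instance's own file
	fmt.Println(second.MatchString(fileFromFirstInstance)) // the other instance matches it too
}
```

Both checks succeed: the regex derived for the second instance also claims the first instance's file, which is the kind of collision a plain, explicitly configured prefix avoids.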
This is a brilliant idea considering it's a new option anyways. So what we'd need to do to implement this feature:
|
That looks reasonable. I'll start working on a PR as soon as I get a bit of spare time. I'll probably need some help when dealing with S3 storages, but I'll let you know when I get there. |
@MMauro94 Just to coordinate with you: I am planning to move things around in here a bit in the next few days so a. we can have an easier way to exclude files from the backup (this is being requested again and again) and also provide some saner structure so that we could maybe do #95 at some point. This would probably collide with what you are planning to do here (also provide synergies, e.g. making the list of files available upfront for filtering), so I wanted to check whether you have started working on this already? It would be a bit disappointing if you'd be close to opening that PR just to see a big merge conflict against my recent changes. |
Luckily I've been quite busy lately, so I haven't started working on a PR yet. I'd say go ahead with your changes, and I'll start working on this once you've finished. |
@MMauro94 I merged #100 so you can now a. decode Regular Expressions from configuration and b. access the list of files to be backed up outside of the archive mechanism which might be helpful for the task at hand here. All other refactoring I might be doing won't collide with your feature, so feel free to start whenever it works for you (no obligations, but I guess you know that). Thanks again! |
@m90 Great! Quick question: there is quite a bit of similar code for the different types of storages. Would you be open to generalizing the code a bit? In an OOP language I would create a common abstract class with abstract functions like I know that it's a bit out of scope for this issue, but I'd like to hear your thoughts on this. Also, there is the possibility to consider that a single storage jack-of-all-trades like rclone will be the way to go in the future (as per #65); in this case abstracting the current storages like this will probably be a waste of time. |
Definitely. I attempted to do this in this branch already, but never really finished it. The idea was to make each storage implement the same interface, something like this:

```go
type storage interface {
	id() storageID
	list(prefix string) ([]backupInfo, error)
	symlink(file string) error
	copy(file string) error
	delete(file string) error
}
```

and then the script just iterates over a list of configured storages, doing the same thing over and over again. IIRC the code on that branch works (all tests pass), but I never got around to testing it thoroughly enough. If you want to work on this, maybe this is a good starting point? This branch existing does not mean it cannot be improved btw, so if you think something can be done better, feel invited to do so.

About rclone I am not sure. It sounds tempting but right now it doesn't check all boxes it would need to check for me:
If there was an rclone like tool that is designed to be used as a library I would be open to look into it, but right now I don't know of any. |
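To make the "iterate over a list of configured storages" idea concrete, here is a minimal sketch with a toy in-memory backend. The `storage` interface is the one quoted above; `storageID`, `backupInfo`, `memoryStorage` and `copyToAll` are hypothetical stand-ins, since the thread does not show how the branch defines them.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

// storageID and backupInfo are assumed shapes for this sketch only.
type storageID string

type backupInfo struct {
	name    string
	modTime time.Time
}

// storage is the interface proposed in the thread.
type storage interface {
	id() storageID
	list(prefix string) ([]backupInfo, error)
	symlink(file string) error
	copy(file string) error
	delete(file string) error
}

// memoryStorage is a toy backend that only records file names in a map.
type memoryStorage struct {
	name  storageID
	files map[string]time.Time
}

func newMemoryStorage(name storageID) *memoryStorage {
	return &memoryStorage{name: name, files: map[string]time.Time{}}
}

func (m *memoryStorage) id() storageID { return m.name }

func (m *memoryStorage) list(prefix string) ([]backupInfo, error) {
	var result []backupInfo
	for name, modTime := range m.files {
		if strings.HasPrefix(name, prefix) {
			result = append(result, backupInfo{name: name, modTime: modTime})
		}
	}
	sort.Slice(result, func(i, j int) bool { return result[i].name < result[j].name })
	return result, nil
}

// symlink is a no-op here; a real backend would point a "latest" link at file.
func (m *memoryStorage) symlink(file string) error { return nil }

func (m *memoryStorage) copy(file string) error {
	m.files[file] = time.Now()
	return nil
}

func (m *memoryStorage) delete(file string) error {
	if _, ok := m.files[file]; !ok {
		return fmt.Errorf("%s: no such backup %q", m.name, file)
	}
	delete(m.files, file) // the builtin; method names do not shadow it
	return nil
}

// copyToAll performs the same step against every configured storage.
func copyToAll(storages []storage, file string) error {
	for _, s := range storages {
		if err := s.copy(file); err != nil {
			return fmt.Errorf("copying to %s: %w", s.id(), err)
		}
	}
	return nil
}

func main() {
	storages := []storage{newMemoryStorage("local"), newMemoryStorage("s3")}
	if err := copyToAll(storages, "backup-2022-04-20.tar.gz"); err != nil {
		panic(err)
	}
	for _, s := range storages {
		backups, _ := s.list("backup-")
		fmt.Println(s.id(), len(backups))
	}
}
```

With this shape, the skip check and pruning would only need `list(prefix)` and would not care which backend is behind it.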
This refactoring is now on |
It would be nice to be able to add a flag (either in the config or in the container labels) to avoid making a backup if the container is stopped AND it was last stopped before the latest backup.
My use case is this: I have a game server container that I start on demand. There may be periods in which it'll be used almost every day and periods of months when it will be permanently stopped. I would like to run backups only when there are actual changes to the state of the container. When used in conjunction with backup pruning this also avoids having N backups of the same exact thing and allows keeping the last N states.
By running `docker inspect` I found that there are the `StartedAt` and `FinishedAt` properties, so it should be relatively easy to pull this info up and compare it with the last backup date.
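The proposed rule could be sketched roughly as follows. This is an illustration, not code from the project: `containerState`, `mustParse` and `shouldSkipBackup` are hypothetical names, and the timestamps are assumed to be the RFC 3339 strings that `docker inspect` reports under `.State`. Only `FinishedAt` is needed for the comparison; `StartedAt` is left out.

```go
package main

import (
	"fmt"
	"time"
)

// containerState holds the fields of the `docker inspect` .State object
// that matter for the decision. The struct itself is hypothetical.
type containerState struct {
	Running    bool
	FinishedAt time.Time
}

// mustParse parses an RFC 3339 timestamp and panics on malformed input.
func mustParse(value string) time.Time {
	t, err := time.Parse(time.RFC3339Nano, value)
	if err != nil {
		panic(err)
	}
	return t
}

// shouldSkipBackup reports whether the backup can be skipped: the container
// is stopped AND it was last stopped before the latest backup was taken.
// Without a previous backup (zero time) nothing is skipped.
func shouldSkipBackup(state containerState, lastBackup time.Time) bool {
	if state.Running || lastBackup.IsZero() {
		return false
	}
	return state.FinishedAt.Before(lastBackup)
}

func main() {
	lastBackup := mustParse("2022-05-01T03:00:00Z")

	stoppedLongAgo := containerState{FinishedAt: mustParse("2022-04-20T18:30:00.5Z")}
	stoppedRecently := containerState{FinishedAt: mustParse("2022-05-01T21:15:00Z")}

	fmt.Println(shouldSkipBackup(stoppedLongAgo, lastBackup))  // true
	fmt.Println(shouldSkipBackup(stoppedRecently, lastBackup)) // false
}
```

A container that ran and stopped after the last backup is still backed up once, so its final state is captured before later runs are skipped.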