"Stop after current file" has various problems with partial backups #3982
Comments
Previously, a yellow warning icon would be displayed but the logs did not contain any details regarding the cause of the warning. This concerns issue duplicati#3982.
Pull request #3984 should address the issue with the empty warning that occurs when a backup is cancelled. |
When I have some time, I'll try and look into addressing some of these issues. @ts678, if I submit a pull request with the work in progress, would you be able to help test? |
Sorry I didn't see this post earlier. So your preference is to try to fix the current code?
Not readily, as build procedure and hardware/software/environment challenges exist. There are various options, including Fix pausing and stopping after upload #3712 PR. As you may know, I really want to get this fixed and a Beta out with or without Stop... |
I think my preference would be for a beta to ship sooner than later. If we are able to include fixes for these issues, then great. If shipping a beta requires that we hide all stop buttons so that we have more time to test, that's fine with me as well. I have a branch where I think the only code that really heeds the partial/full status is the delete handler and database recreate handler. However, much more testing is needed. |
@warwickmm I can help test a PR. This issue has a good recipe documented to reproduce the issue, so hopefully it won't be too difficult. |
If surrounded by full backups, partial backups are assumed to be no longer necessary. This concerns issue duplicati#3982.
Thanks @drwtsn32x and @ts678. Pull request #3992 contains the current work in progress. |
Ok great, first step for me is to reproduce the problem using @ts678's recipe without the PR changes applied... |
I agree on "sooner than later", but just hiding the button just means that the issue will be rarer. You can still provoke the same errors if you kill the process while a backup is active. |
I was able to reproduce all of the problems shown in @ts678's instructions when using the latest source code without @warwickmm's PR applied. (The only difference is that with the latest source code, the warning when stopping a backup correctly shows the reason, since PR 3984 is present.) Applied @warwickmm's PR and restarted all tests.
This is looking really good! Seeing 'partial' in test 6 is a different issue (if it is not intended behavior). Test 7 where regenerated dlists don't contain 'fileset' is also an unrelated issue. |
Previously, all filesets would be displayed as partial. Now, we obtain the partial/full type from the fileset file (not the fileset.json file) in the dlist file. If this file does not exist, the backup type defaults to full. Note that this requires downloading and uncompressing the dlist file, so the performance of populating the restore list may be impacted significantly. This concerns issue duplicati#3982.
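A minimal sketch of the lookup described in this commit message, not the actual Duplicati code. It assumes the dlist file is an already decrypted zip archive, that the entry is named `fileset`, and that its JSON carries the `IsFullBackup` value mentioned elsewhere in this issue. The helper `make_dlist` and the `filelist.json` entry it writes exist only to build test input.

```python
import io
import json
import zipfile


def is_full_backup(dlist_bytes: bytes) -> bool:
    """Return the backup type recorded in a decrypted dlist archive.

    Assumption: the archive may hold an entry named "fileset" whose JSON
    has an "IsFullBackup" boolean. If the entry is missing, default to full.
    """
    with zipfile.ZipFile(io.BytesIO(dlist_bytes)) as archive:
        if "fileset" not in archive.namelist():
            return True  # older dlist files have no fileset entry
        with archive.open("fileset") as entry:
            data = json.load(entry)
    return bool(data.get("IsFullBackup", True))


def make_dlist(fileset=None) -> bytes:
    """Hypothetical helper: build a tiny in-memory dlist-like archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("filelist.json", "[]")  # illustrative placeholder entry
        if fileset is not None:
            archive.writestr("fileset", json.dumps(fileset))
    return buffer.getvalue()
```

Because every dlist has to be fetched and opened to answer this question, the cost grows with the number of versions, which matches the slowdown discussed in the comments.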
I've pushed a commit that addresses the "partial" label in case 6 for my simple test case (1 full backup followed by 1 partial backup). Note that determining the partial/full status of the filesets from the remote files requires downloading and uncompressing the @drwtsn32x, can you check the partial/full labels in case 6 with the new changes? |
@drwtsn32x, can you clarify what "same result as before" means? I would expect that with the changes in the pull request, there now is a visible "1" version and that it would be deleted, leaving only the full backup. Thanks again for the help testing! EDIT: Sorry, I didn't set retention to keep 1 backup. Without the retention setting and a test case with an initial partial backup followed by a full backup, I was able to delete version 1 (the partial one). I believe this is the correct behavior. |
Yep I'll give it a test and let you know...
Agreed! |
I pushed a fix for case 8 (delete dlist test). Let me know if it works! |
OK, I merged your PR with my branch; going to test both 6 and 8 now... |
Tested 6 - looks good - I also tried it with a larger set of backup data I have (195 versions) and the time required increased from 1m0s to 1m59s. Backup data is local on a NAS. I like how it distinguishes partial from full but what do you all feel about the tradeoff in performance? It will be more pronounced with slower, remote backends. |
Tested 8 - looks good! Ran through multiple scenarios of full and partials, and the 'filelist' in the dlist was properly included with the regenerated dlist files every time now, with the correct boolean value. Nice! |
I think the partial/full distinction is key, especially if a user's goal is to restore all files. They would want to do so from a full backup. It's possible that my solution unnecessarily retrieves the dlist files twice. I'll try and take a look later when I have some time. |
Yes, good point. I'm ok with the performance tradeoff myself taking into account using 'direct restore' is probably a rare event for most users. |
This has now been addressed. Previously the The synthetic filelist was also missing the |
I think it should be merged as-is, with a note to check on this in the future. It's better to have this and other fixes, even if not yet fully optimized, than not have them at all.
I do not know..... I haven't looked into when synth filelists are used at all. Thanks for the continued improvements! |
If I understand this correctly, it is used with the UI and a direct restore. In this case, we need to build a local database, but since we only restore from the most recent version, the recreated database is marked as "incomplete" (to avoid using it for anything else by mistake) and contains only the minimal information required to run the restore.
Yes. Synthetic is intended to indicate that the |
One big question is how far we want to go to protect the users from themselves. Currently, is it the PR case that --keep-versions looks only at fulls, but --keep-time and --retention-policy treats them all the same, allowing the possibility to delete or thin partials and maybe prevent a full from ever happening? If a user picks "Smart backup retention" I think that's the same as custom What will possibly save them is that the file enumeration of the next partial might rerun previous path order, and additional version references lowers risk of compact thinking blocks are now wasted space. From a uniformity point of view, I'd sort of like all three retention ways to do the same, so maybe the --keep-time should just never delete partials (which does make me a bit nervous in some other ways). --retention-policy makes me nervous because the algorithm is a little complex, so needs some care... Simply hiding partials could hurt keeping of most recent backup, subject to several different settings. Order also matters. If a full backup finishes and is marked before ApplyRetentionPolicy runs, then the most recent backup would be that full, otherwise the full just run might show up marked as a partial. If partial runs then ApplyRetentionPolicy runs, then I'd say exempt partials from retention processing, which could probably be done by pulling these out of the list early if it can be reconciled with current special handling of most recent file. You need to look before and after the main loop to see the ends. If --keep-time gets changed, test should be pretty easy. --retention-policy has more cases to go into. |
Why would deleting a partial affect a full backup? I thought all filesets (backup versions) are independent from each other. Partial just means it didn't finish backing up all the identified data, right?
Aren't partials a side effect of stopping/interrupting a backup job? So shouldn't they be rare? Or is there some other scenario where they would happen often? Also even if a partial is deleted, I don't see it affecting the deletion of blocks any differently. Blocks should only be deleted if they are unreferenced by ANY job (partial OR full). At least this is how I understand it. |
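The reference rule as understood in this comment can be sketched as follows. This is an illustration of the idea only, with hypothetical fileset names and block identifiers, and it is not how Duplicati stores or compacts blocks.

```python
def unreferenced_blocks(filesets: dict[str, set[str]], deleted: str) -> set[str]:
    """Return the blocks that become garbage if `deleted` is removed.

    A block is only freed when no remaining fileset, partial or full,
    still refers to it.
    """
    still_referenced = set().union(
        *(blocks for name, blocks in filesets.items() if name != deleted)
    )
    return filesets[deleted] - still_referenced


# Hypothetical data: one partial and one full sharing block "a1".
example = {
    "partial-1": {"a1", "a2"},
    "full-0": {"a1", "b1"},
}
freed = unreferenced_blocks(example, "partial-1")
```

In this example only the block that the full backup does not reference is freed. Note that this also shows the other side of the argument: blocks that only a partial references are lost with it, so a later backup would have to upload them again.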
I don't know the design super-well, and I don't have a build with the changes. Here goes anyway: As a simple case, say you have --keep-time set to more than a file backup time but less than time between backups. Stop the backup when you see a dblock begin uploading, as in usual test steps. First partial stops after file A and contains file A. If it's possible for backed up file blocks to get deleted this way, the full might finish later or never. Getting this to fail in test might be harder than shown due to concurrency. A design overview here suggests third partial could process B in parallel with A, so more files might be needed to see this:
Pause reboot resume? is a recent use case where initial backup needs to be broken into pieces. Currently we tell people to slowly add folders, so that backup finishes, but an alternative might be to backup, stop when needed, continue the next chance. Say initial backup takes a month and you carry a laptop around. Would a user know to use "Keep all backups" awhile, in order to get to a full backup in the quickest way? Note that I'm not sure this is worth extra code risk to "improve" now. Above just explains risk of inaction. To any who know the design better than I do, does above make sense in at least a hypothetical manner? |
Sorry, I haven't been able to follow all the various scenarios, but I can try and explain how the approach in pull request #3992 attempts to deal with "keep a specific number of backups". Suppose that "keep a specific number of backups" is set to 1 and we do the following:
Let me know if there are any flaws with the above approach. I'm not sure that we can treat a "retention rule" ( I think to be conservative, we should not delete partial backups if it's not clear that they are superseded by a full backup. I think the logic above for "keep a specific number of backups" is reasonable. The current treatment of the "retention rule" case is to not delete partials. I think this isn't a bad starting point. As @drwtsn32x noted, partial backups should be a rarity. I wasn't aware of EDIT: It looks like currently all backups that are older than that specified by |
Yes, that's my understanding. Relevant code: |
Looking at PR 3992 (isn't that the relevant one?), I hadn't actually noticed that the delete-partials-when-superseded-by-a-full was only in --keep-versions, because I think I was advocating for that everywhere. Partials without later full are more useful. As described before, they're accumulating data towards a full. Giving such partials a blanket exemption could run the risk of someone building up partials forever, but they'd also have to do "Stop after current file" forever, so I don't think that's a scenario that's very likely. When I wrote earlier about --retention-policy, I was thinking partials-at-the-recent-end-if-at-all in mind. |
which sounds like an argument favoring specific configuration (even if inadvertent, through choosing "Smart backup retention") over consistency between the different retention methods. In original work involving hiding partials, my main worry was the breakage it caused, not disrespect for user intention, therefore I saw deleting rather than hiding superseded partials as a reasonable universal replacement. I'm still kind of leaning that way because it does respect user intention with respect to full backups, so maintains the previous behavior when there were only full backups. Partials seem rather unpredictable sources for restores, because it's not certain how far they got. I suppose one could just look and hope. The problem with letting partials get fully involved in --retention-policy without any distinction is that partial could have a full too close after it, so full gets deleted. If I look at what I think is the PR merged into context (using I wonder if there's a way to get GitHub web UI to show a diff between before-this-all-began to where things are proposed at the moment? I have most confidence in the Beta-tested code. If we'd reverted back to that for a restart, I might favor a different path, but maybe the current should just be shipped. There does need to be more testing at some point, then after that, maybe cut a Canary and test more. |
Correct. I can summarize the behaviors currently in pull request #3992 (as of cd08ca9):
Partial backups should be rare so I think the behavior in the "retention rule" case is reasonable. We should focus on fixing the bugs caused by hiding partials, so I'd be hesitant to change the (complicated) retention rule logic too much right now. This can always be improved after the beta release, especially if we make an effort to release more often. I might make a small change to the "keep time" case so that at least one full backup is always kept. For example, if the last full backup is older than the specified window, it should probably be kept just in case.
|
Revision 861f0ff ensures that we keep at least one full backup when removing versions according to the Are there any other remaining issues here that need to be addressed? If not, I will remove the do-not-merge label from the pull request for further review, testing, and potential merge. |
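A rough sketch of the "keep at least one full backup" idea for the keep-time case discussed above. It is an assumption-laden illustration, not the code in revision 861f0ff: the `Version` type and `versions_to_delete` function are hypothetical, and partials are treated here like any other version.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Version(NamedTuple):
    timestamp: datetime
    is_full: bool


def versions_to_delete(versions, now, keep_time):
    """Return versions older than the window, sparing one full if needed."""
    cutoff = now - keep_time
    expired = [v for v in versions if v.timestamp < cutoff]
    has_recent_full = any(v.is_full and v.timestamp >= cutoff for v in versions)
    if not has_recent_full:
        expired_fulls = [v for v in expired if v.is_full]
        if expired_fulls:
            # Keep the newest full backup even though it is outside the window.
            expired.remove(max(expired_fulls, key=lambda v: v.timestamp))
    return expired
```

With a full backup from a month ago followed only by partials and a one-week window, the old full survives and only the expired partial is returned.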
Nice work, @warwickmm! |
This worries me. The original design was explicitly made to avoid this scenario. To repeat the above it would be:
The downside to this approach is that we will be adding data forever, and only figuring out about deleted files if the backup ever runs to completion. Based on @ts678 scenario, the last step would somehow be "- Run backup, change A, |
We had to remove the keep-versions options since multiple retention options are not allowed. This concerns issue duplicati#3982.
I'm not sure why this would happen. @ts678, can you elaborate? Also, the logic for the |
If the most recent backups were partials (and not followed by a full backup), they should not be removed. A partial backup should only be removed if it is followed by a full backup. This concerns issue duplicati#3982.
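The rule in this commit message can be sketched as below. This is a minimal illustration under the assumption that versions are listed newest first, with version 0 as the most recent, and the function name is hypothetical.

```python
def superseded_partials(is_full_newest_first: list[bool]) -> list[int]:
    """Return version numbers of partials that have a newer full backup.

    Partials at the recent end, with no full backup after them, are kept.
    """
    removable = []
    seen_full = False
    for version, is_full in enumerate(is_full_newest_first):
        if is_full:
            seen_full = True
        elif seen_full:
            removable.append(version)
    return removable
```

For the sequence partial, full, partial, full (newest first), only version 2 qualifies, because version 0 has not yet been followed by a full backup.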
By the way, I fixed a bug in my earlier attempt at the |
Sorry, missed the earlier comment. I'm ignoring synthetic filelist because I think it's broken currently. Diagnosis: Warning: Expected there to be a temporary fileset for synthetic filelist #2329 Expected there to be a temporary fileset for synthetic filelist #2506 I talked about synthetic filelist in a staff topic (and original PR) as a way to avoid inventing new mechanisms. However, partials do have their appeal (and headaches), and I'm not sure if synthetic filelist on the way down would work, but if it could then that solves the partial-is-partial. Synthetic always has prior base. |
Fixed in pull request #3992. |
This test would have failed prior to the changes in revision 51b4ecf ("Include fileset file in dlist files after purge operation"). This concerns issue duplicati#3982.
This test would have failed prior to the changes in revision 0cfce6b ("Include fileset file when repairing missing dlist file"). This concerns issue duplicati#3982.
This test would have failed prior to the changes in revision 8ab6407 ("Display correct backup type when using direct restore"). Prior to revision 8ab6407, all filesets would have been marked as partial when performing a direct restore without the local database. This concerns issue duplicati#3982.
Environment info
Description
Various issues around partial backup. Rather than file separate issues now, this is a collection of issues.
"Stop after current file" with --usn-policy may never back up remaining files #3971 was separately filed, and possibly has a different cause than the ones here which are primarily on how backups are now split into full and partial, with record-keeping in the database and backup, and attempts to make things like retention work right by hiding partial backups after a full exists. This breaks things because some code is going by the view-after-hiding, some is going with the raw view, and translating between may break.
Fix 'stop after current file' #3836 has technical info. This issue is more of a hopefully-reproducible tour.
fileset file to show "partial" status.
One is probably in the number translator for "Unexpected difference in fileset" errors (not tested).
Steps to reproduce
Icon is a yellow exclamation point (Warning I think), and log has one
"ParsedResult": "Warning",
Not sure which I prefer, but if it's going to do a Warning, there should be some way to know why.
purge with arguments set to some file -- it doesn't seem to make a difference.
Not certain, but the id 1 would be the first partial backup which is now hidden and not user-visible.
Keeping an eye on the DB (especially the Fileset table) using an SQLite DB browser could be useful.
affected with arguments set to file name of oldest dblock file, i.e. from partial.
Proper report.
delete with arguments --version=1
Maybe actual was expected. --version=1 would be initial partial which exists but you can't get to it.
find with arguments * and --all-versions to see if "all versions" does just that.
Per the current manual, "Searches in all backup sets, instead of just searching the latest." No longer.
It should have released the space from the first partial, but that still remains and is hanging onto it.
The Duplicati user can't get to these to delete them, and it appears Duplicati retention can't either.
As additional backups happen and replace old blocks, these hidden partials become a storage leak.
Version 0 is the current full with 1 byte file, but is mis-called partial. Version 1 is the original partial.
Correct label of partial. Direct restore says that everything is partial, whether it actually is, or is not.
fileset file is there, delete dlists, then run a Database Repair.
dlist files are regenerated without fileset file.
dlist files are regenerated with correct fileset file with correct IsFullBackup variable in its JSON.
Recreate (delete and Repair), then look to see what Restore files will offer you.
Version 0 is the 1 byte file full backup. Version 1 is the original partial. Neither one is marked partial.
This test used the bad regenerated dlist files. Good ones work OK. Unhiding partials may be useful, should one desperately need to get to one, with no other way (but it's sure a strange workaround).
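For the suggestion in the steps above to keep an eye on the Fileset table, a small read-only dump can stand in for an SQLite DB browser. This is a sketch under assumptions: the path passed in is whatever local job database you are inspecting (ideally a copy), the function name is hypothetical, and no column names are assumed because they are read from the query result.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def dump_fileset_table(db_path):
    """Return (column names, rows) of the Fileset table, opened read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        cursor = conn.execute("SELECT * FROM Fileset")
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    return columns, rows
```

Running this before and after each step shows whether a hidden partial is still present in the table even when the UI no longer lists it.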
Screenshots
Debug log