
Fix 'stop after current file' #3836

Merged
merged 72 commits into duplicati:master on Sep 17, 2019

Conversation

@BlueBlock (Contributor) commented Jul 23, 2019

The current 'stop after upload' fails to actually stop the backup process. This PR changes the operation name to 'stop after the current file' and correctly stops the backup operation after the current file backup is complete.

A complementary PR, 'Fix pausing and stopping after upload' (#3712), will fix 'stop now', which stops the backup operation in the middle of the current file.

Together, these two PRs will provide the ability to stop and pause a backup cleanly.

BlueBlock added 5 commits Jul 23, 2019
fix stop after the current file
'stop after upload' is now 'stop after the current file' and the operation will cleanly allow the current file to complete.
@BlueBlock (Contributor, Author) commented Jul 23, 2019

The broken Travis tests should be rerun.

BlueBlock added 2 commits Jul 24, 2019
Also stop the file enum process when the user has cancelled
The file enumeration process should also stop when the user has cancelled the operation.
@seantempleton (Contributor) commented Jul 26, 2019

One thing I noticed while testing both ways of stopping: if I had a large number of files in my source folder and hit 'stop after current file', then when it stopped I would get a warning for every single file whose metadata failed to be processed. I can see that being useful, since you could tell which files were not backed up, but that isn't obvious from the warning itself. I would assume most people would be scared by it.

The Output channel in the FileEnumerationProcess doesn't get closed, so MetadataPreProcess continues in its infinite loop and fails on every Output.WriteAsync, as its channel is retired, and logs the failure.

@BlueBlock (Contributor, Author) commented Jul 26, 2019

> One thing I noticed while testing both ways of stopping: if I had a large number of files in my source folder and hit 'stop after current file', then when it stopped I would get a warning for every single file whose metadata failed to be processed. I can see that being useful, since you could tell which files were not backed up, but that isn't obvious from the warning itself. I would assume most people would be scared by it.
>
> The Output channel in the FileEnumerationProcess doesn't get closed, so MetadataPreProcess continues in its infinite loop and fails on every Output.WriteAsync, as its channel is retired, and logs the failure.

This was corrected in the last commit regarding FileEnum.

When you test, search the log for "cancel", which marks the point where the user cancels. All files listed in the log after that point are from the FileEnum process still running until it reaches a point where it checks the cancellation token.
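The fix amounts to something like the following minimal sketch, written with plain .NET types for illustration (the real code uses Duplicati's CoCoL channels and task reader; the writeToOutput delegate is only a stand-in for the Output channel):

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class FileEnumerationSketch
{
    // The writer delegate stands in for the Output channel used by the real
    // FileEnumerationProcess; the real fix also retires/closes that channel so
    // the downstream MetadataPreProcess loop terminates instead of failing on
    // every WriteAsync.
    public static async Task EnumerateAsync(
        IEnumerable<string> paths,
        Func<string, Task> writeToOutput,
        CancellationToken token)
    {
        foreach (var path in paths)
        {
            // Stop producing work as soon as the user has cancelled; without
            // this check the enumerator keeps listing files after "stop".
            if (token.IsCancellationRequested)
                break;

            await writeToOutput(path).ConfigureAwait(false);
        }
    }
}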

@seantempleton (Contributor) commented Jul 26, 2019

> This was corrected in the last commit regarding FileEnum.

Ah, excellent. Sorry I didn't check that there was an update first.

@BlueBlock (Contributor, Author) commented Jul 27, 2019

> This was corrected in the last commit regarding FileEnum.
>
> Ah, excellent. Sorry I didn't check that there was an update first.

I need to remember to place a comment after committing to a PR so everyone knows there was a change.

@warwickmm (Contributor) commented Jul 27, 2019

The changes to the translations will need to be fixed in Transifex (https://www.transifex.com/duplicati/duplicati/translate/). The .po files are auto-generated, so the modifications to them here will get overwritten the next time the translations are generated. Can you revert the changes to the .po files here?

@BlueBlock (Contributor, Author) commented Jul 27, 2019

Ah.. ok. Thanks.

@BlueBlock (Contributor, Author) commented Jul 28, 2019

The .po files have been reverted.

@warwickmm (Contributor) commented Aug 2, 2019

I started running some tests and noticed one thing. If I "stop after current file" during the initial backup, the operation does stop after the current uploads finish. The backup job lists "0 bytes / 0 versions":

(screenshot)

but the restore menu indicates that there are some files (but not all the source files) that can be restored:

(screenshot)

If I then run another backup and allow it to finish, the backup job lists 2 versions:

(screenshot)

and the restore menu indicates that there are 2 versions that can be restored from. One version is incomplete (from the interrupted initial backup like above), and the second version is complete:

(screenshot)

This leads me to a few questions:

  1. Is this behavior desired?
  2. Will the existence of the incomplete backup version mess with retention settings? In other words, is it possible that retention settings will discard a complete backup in favor of an incomplete one?
@BlueBlock (Contributor, Author) commented Aug 2, 2019

The first backup is incomplete as expected, but it remains because, while incomplete, it is still a valid backup since those files were backed up successfully. Removing the backup completely when cancelled doesn't seem desirable, since the backup can essentially be resumed just by running it again.

For retention, those files were backed up, so I would expect any retention rules to apply just as they would for any other backup.

The 0 bytes doesn't seem correct. I've been able to re-create this. I'll look at it further.

BlueBlock added 3 commits Aug 2, 2019
update the backup stats also when a backup is cancelled
PostBackupVerification() was being skipped when a backup was cancelled. PostBackupVerification() will now be called.
@BlueBlock (Contributor, Author) commented Aug 2, 2019

UpdateStorageStatsFromDatabase() and PostBackupVerification() are called when a backup runs to completion. They were being skipped when a backup was cancelled. The stats now display correctly.

This wasn't an issue previously, since the "stop" wasn't working and simply let the backup finish completely.
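A minimal sketch of the intended flow, with method names taken from this discussion but an otherwise illustrative structure (not the actual BackupHandler code):

using System.Threading;
using System.Threading.Tasks;

class BackupFlowSketch
{
    // Stand-ins for the operations named in this PR; signatures are illustrative.
    Task RunMainOperationAsync(CancellationToken token) => Task.CompletedTask;
    Task PostBackupVerification() => Task.CompletedTask;
    Task UpdateStorageStatsFromDatabase() => Task.CompletedTask;

    public async Task RunAsync(CancellationToken token)
    {
        try
        {
            await RunMainOperationAsync(token).ConfigureAwait(false);
        }
        finally
        {
            // Previously these steps were skipped when the backup was cancelled,
            // which left the job showing "0 bytes / 0 versions" after a partial backup.
            await PostBackupVerification().ConfigureAwait(false);
            await UpdateStorageStatsFromDatabase().ConfigureAwait(false);
        }
    }
}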

@warwickmm (Contributor) commented Aug 4, 2019

I'm now quite confused, and a little concerned that people use "stop after upload" with different intentions. For example, if one interrupts the backup during the first of thousands of files, most likely they wouldn't want this version to be considered for retention. But do they want this first file to be immediately restorable? Or do they think this is a long pause, where the next backup simply resumes the interrupted backup?

I've asked this in the forums with the hope of getting some more clarity on this.

https://forum.duplicati.com/t/stop-after-upload/3753/25

@BlueBlock (Contributor, Author) commented Aug 4, 2019

> I'm now quite confused, and a little concerned that people use "stop after upload" with different intentions. For example, if one interrupts the backup during the first of thousands of files, most likely they wouldn't want this version to be considered for retention. But do they want this first file to be immediately restorable? Or do they think this is a long pause, where the next backup simply resumes the interrupted backup?
>
> I've asked this in the forums with the hope of getting some more clarity on this.
>
> https://forum.duplicati.com/t/stop-after-upload/3753/25

The phrase is changed to "Stop after the current file". The backup would be a valid partial backup after the current file has completed, with no progress lost.

"Stop now" is being handled by another dev's PR.

@warwickmm (Contributor) commented Aug 4, 2019

My original thought was that both "Stop after the current file" and "Stop now" should behave like a "Pause", except that it allows for one to perform other tasks before resuming. For example one can stop a huge backup in favor of running a smaller faster backup, and later resume the huge backup while avoiding re-uploading blocks that were already uploaded. None of these would record a new backup version for a partial backup.

My fear with recording a backup version for an incomplete backup is the following:

  • A user's retention settings indicates that one backup version should be kept each month
  • A backup operation runs every day of the month
  • On the last day of the month, the user stops the backup for some reason
  • When the retention settings take effect, the last (partial) backup is kept, and no complete backups remain for that month.

Perhaps my interpretation is wrong, but I don't think partial backups should be recorded as a valid version. I don't see a scenario where a user is ok with a backup version containing only a subset of the source files. Is there a different use-case that I am not seeing?

@BlueBlock (Contributor, Author) commented Aug 4, 2019

I might be missing the strategy.. is the idea to be able to pause a backup, close Duplicati, and then resume a paused backup after restarting Duplicati?

If not..
Pausing is good to have but there must be a way for a user to stop/interrupt the backup cleanly. In fact we should probably be listening for a system shutdown event and then begin stopping the job.

Partial backups could be recorded as partial, and the retention policies could then take that into account. A partial backup would essentially belong to the next most-recent non-partial backup; when that backup is removed by retention, the partials are removed as well.

And on the Restore list we could also place "(partial)" to the right of the datetime.
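As a purely hypothetical sketch of that suggestion (types and names invented for the example, not Duplicati's actual classes):

using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative type only; Duplicati's fileset/retention classes look different.
class FilesetInfo
{
    public DateTime Time;
    public bool IsFullBackup;
}

static class RetentionSketch
{
    // Only full backups compete under the retention rules; a partial backup is
    // kept or deleted together with the non-partial backup it belongs to.
    public static IEnumerable<FilesetInfo> RetentionCandidates(IEnumerable<FilesetInfo> filesets)
        => filesets.Where(f => f.IsFullBackup);
}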


@ts678 (Contributor) commented Aug 5, 2019

FWIW, 'Stop after Upload' is my (partly code-level) comment on the revived forum discussion on stop.

@duplicatibot commented Aug 5, 2019

This pull request has been mentioned on Duplicati. There might be relevant details there:

https://forum.duplicati.com/t/suggestion-improve-upload-resilience-by-breaking-backups-into-sub-jobs/7711/2

@BlueBlock (Contributor, Author) commented Aug 5, 2019

> FWIW, 'Stop after Upload' is my (partly code-level) comment on the revived forum discussion on stop.

Does this functionality currently work? Is the stop-now supposed to allow a backup to be interrupted and later continued? When I test this I keep getting additional backups added to the restore dropdown and they don't seem to resume a previous backup.

@ts678 (Contributor) commented Aug 5, 2019

From the code section there (read further down), I think the intent was for an interrupted backup to be later continued (not automatically, but on the next run), starting with verification and repair (if required -- the user might be asked), doing the synthetic filelist (except a bug is blocking getting into that), then on to the usual backup. I might misread something. Can you read up from here instead of down from the other:

// Run the backup operation
if (await m_result.TaskReader.ProgressAsync)
    await RunMainOperation(sources, snapshot, journalService, db, stats, m_options, m_sourceFilter, m_filter, m_result, m_result.TaskReader, lastfilesetid).ConfigureAwait(false);

The forum topic "How does Duplicati Resume After Interruption?" talks about interruptions, and I assume that includes stop.

I forget which variety (if any) of stop works. I generally avoid them because I know it's a trouble point...

You could maybe turn on some modest level of logging (live or file) to see what path you're taking, e.g.

Logging.Log.WriteInformationMessage(LOGTAG, "PreviousBackupFilelistUpload", "Uploading filelist from previous interrupted backup");

The above will be one way to tell if your dropdown item was a synthetic filelist (despite the earlier bug -- or maybe the bug means it WILL be bad -- I'm not sure how well it works even if it somehow avoids that bug).

Later on maybe I'll try some testing (does it need this code or will regular canary do?) and study the forum.

@BlueBlock (Contributor, Author) commented Sep 9, 2019

> Do you think saving the backup type in the remote files is a blocker, or is it something that can be addressed later?

Maybe to be addressed first.

@BlueBlock (Contributor, Author) commented Sep 9, 2019

It looks like adding to the manifest file will not work easily and would be unnecessarily complex, since the manifest is created at the start of the backup. We want to record the fileset state at the end of the backup.

Like the "manifest" file, I'm adding a file to the dlist called "fileset". For now this file will just contain the backup state, IsFullBackup. It can also be used in the future for other data that is not available at the start of a backup.

Let me know if there is a better name than "fileset".
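As a rough sketch of the idea, using System.IO.Compression instead of Duplicati's compression abstraction and assuming a small JSON payload (the actual format in the commits may differ):

using System.IO;
using System.IO.Compression;
using System.Text;

static class FilesetEntrySketch
{
    // Writes a small JSON entry named "fileset" into the dlist zip, alongside
    // the existing "manifest" entry. Only the idea of a separate entry carrying
    // IsFullBackup is taken from the discussion above.
    public static void WriteFilesetEntry(ZipArchive dlist, bool isFullBackup)
    {
        var entry = dlist.CreateEntry("fileset");
        using (var stream = entry.Open())
        {
            var json = "{\"IsFullBackup\":" + (isFullBackup ? "true" : "false") + "}";
            var bytes = Encoding.UTF8.GetBytes(json);
            stream.Write(bytes, 0, bytes.Length);
        }
    }
}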

@BlueBlock (Contributor, Author) commented Sep 10, 2019

(Whatever I figure out to fix adding the file needed for this PR will likely fix the problem below as well.)

I'm running into a problem adding a file to the dlist zip file, and it might be an existing problem. There is a user option to add "extra" files to the dlist file; it might not be used much, if at all, since it seems to me that trying to do so causes the filelist in the zip to be invalid.

As seen here when using "control-files" to add a file:

(screenshots)

@BlueBlock (Contributor, Author) commented Sep 10, 2019

(FYI @ts678) Code in these commits also fixes "control-files", which was corrupting "filelist".

The stream for the filelist was kept open for writing, then "control-files" were added to the zip, and then the filelist continued to be written and was eventually closed. This caused the filelist entry to be split within the zip.

The solution is not to stream the filelist directly into the zip, but to write it to a MemoryStream and add it to the zip once the filelist is complete.
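A minimal sketch of that approach, again using System.IO.Compression in place of Duplicati's own compression layer (the entry name and signature are illustrative):

using System.IO;
using System.IO.Compression;

static class DlistWriterSketch
{
    // The filelist is built up in a MemoryStream and copied into the archive as
    // one contiguous entry only after it is complete, so entries added in
    // between ("control-files") can no longer split it.
    public static void AddCompletedFilelist(ZipArchive dlist, MemoryStream completedFilelist, string entryName)
    {
        var entry = dlist.CreateEntry(entryName);
        using (var target = entry.Open())
        {
            completedFilelist.Position = 0;
            completedFilelist.CopyTo(target);
        }
    }
}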

BlueBlock added 10 commits Sep 10, 2019
modify to pass the actual IsFullBackup state
changed here because the method was refactored
add the fileset file to dlist
we also just set PartialBackup=false to be clear and not assume the value
add method to add the filelist file
this also fixes "control-files" being added to dlist
add method to add filelist file
This is part of correcting the way the dlist zip file is created. We now have a method to take the MemoryStream and put the file into the dlist.
@BlueBlock (Contributor, Author) commented Sep 10, 2019

The commits are in for handling storing the IsFullBackup state on the backend.

For backward compatibility, if the dlist zip is missing "fileset" and the IsFullBackup value, then the default is IsFullBackup=true.
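Read back the same way, a hedged sketch of the backward-compatible default might look like this (System.Text.Json is used here only for brevity, and the entry layout follows the sketch above rather than the actual commits):

using System.IO.Compression;
using System.Text.Json;

static class FilesetReaderSketch
{
    // When recreating the database from the backend, read the "fileset" entry
    // if present; otherwise fall back to IsFullBackup = true so dlist files
    // written by older versions are treated as full backups.
    public static bool ReadIsFullBackup(ZipArchive dlist)
    {
        var entry = dlist.GetEntry("fileset");
        if (entry == null)
            return true; // backward compatibility: no entry means a full backup

        using (var stream = entry.Open())
        using (var doc = JsonDocument.Parse(stream))
        {
            return doc.RootElement.GetProperty("IsFullBackup").GetBoolean();
        }
    }
}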

For example, taking 2 partial backups and then 1 full backup results in the backend files shown in the screenshot. Deleting and recreating the database will then show 3 filesets correctly set as 2 partial and 1 full.

(screenshots)

@BlueBlock (Contributor, Author) commented Sep 10, 2019

@warwickmm Can Travis be kicked off again? Thanks.

@warwickmm (Contributor) commented Sep 11, 2019

I restarted the Travis build.

@BlueBlock (Contributor, Author) commented Sep 11, 2019

> I restarted the Travis build.

Thanks. It looks good now.

@warwickmm (Contributor) commented Sep 14, 2019

@BlueBlock, this looks good to me. Testing has gone well. Awesome job. Can you think of anything else that we should do before I merge?

@kenkendk kenkendk merged commit c5b9150 into duplicati:master Sep 17, 2019

2 of 3 checks passed

Codacy/PR Quality Review: Not up to standards. This pull request quality could be better.
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed