Implement --backup(-dir) #98

Closed
mkiesel opened this Issue Aug 14, 2015 · 53 comments

Comments

mkiesel commented Aug 14, 2015

rsync's --backup(-dir) moves all files on the destination that would be deleted or overwritten with newer data to a backup directory. This feature is very helpful for creating incremental backups.

ncw added the enhancement label Aug 16, 2015

Owner

ncw commented Aug 16, 2015

A nice idea and I didn't realise rsync had that feature

roms2000 commented Nov 20, 2015

+1. With this, rclone could be used for everyday server backups.
If you do implement it, --backup-dir is very useful in rsync because it keeps the directory structure intact and moves modified or deleted files into the backup dir.

dbcm commented Dec 11, 2015

+1

Owner

ncw commented Feb 10, 2016

This has some similar ideas to #18

ncw added this to the Unplanned milestone Feb 10, 2016

balazer commented Jul 19, 2016

A --backup-dir option would be a great addition. Server-side copying goes part of the way to providing versioned backups, but it takes a very long time when there are a lot of files, and it can make wasteful use of the remote storage, depending on how the remote storage deals with duplicate files.

I think all you need is a --backup-dir option, and not a --backup option with suffixes. Suffixes would be more complicated, because then you have to worry about what suffix to add, and what filters to add to exclude the suffixed files.

The logic for --backup-dir would be simple:

  • If a copy, sync, or move operation would replace or remove an existing destination file, first move that file to the backup-dir path.

Like with a normal move, the destination path and backup-dir path should not overlap. Some consideration must be made for what to do when a file to be moved already exists in the backup-dir path. For versioned backups, I'd argue that it makes sense to preserve the file that already exists in the backup-dir path. But in other circumstances, it might make more sense to replace the file in the backup-dir path. Maybe that should be an option.

Sample command:

rclone sync "c:\My Documents" "remote:backup/current/My Documents" --backup-dir "remote:backup/old versions/My Documents"

Relative paths should be preserved. So if remote:backup/current/My Documents/Folder/file.txt existed in the destination and would be replaced or removed, it would be moved to remote:backup/old versions/My Documents/Folder/file.txt.
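
The relative-path rule described above can be sketched as a small shell function. This only illustrates the mapping and is not rclone's implementation; the function name `backup_path` and its three arguments are hypothetical.

```bash
# Hypothetical helper: map a destination file to its location under the
# backup dir while keeping the path relative to the destination root.
backup_path() {
    local dest_root="${1%/}"    # e.g. remote:backup/current/My Documents
    local backup_root="${2%/}"  # e.g. remote:backup/old versions/My Documents
    local file="$3"             # full path of a file under dest_root

    case "$file" in
        "$dest_root"/*)
            # Strip the destination root, then graft the rest onto the backup root.
            printf '%s/%s\n' "$backup_root" "${file#"$dest_root"/}"
            ;;
        *)
            echo "backup_path: $file is not under $dest_root" >&2
            return 1
            ;;
    esac
}

backup_path "remote:backup/current/My Documents" \
            "remote:backup/old versions/My Documents" \
            "remote:backup/current/My Documents/Folder/file.txt"
# prints: remote:backup/old versions/My Documents/Folder/file.txt
```

Rejecting files that are not under the destination root keeps the helper from silently producing a path outside the backup tree.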

Under this backup scheme, the destination path would be a proper sync of the source, and old versions & deleted files could be found in some backup-dir path. In typical usage, the backup-dir path would contain a date, and it would be up to the user to remove old folders after sufficient time has passed. This seems simpler than some of the ideas in issue #18.

You might add a restriction that the backup-dir path be in the same remote as the destination path, if that makes things easier.

robjlg commented Aug 23, 2016

Hello, let me show how I use rsync's backup-dir, backup and suffix options. This would be a very good feature to add to rclone.

I use 3 directories:

RSYNC - the destination dir
RSYNC_BAK - where the changed files are stored
RSYNC_DEL - where the deleted files are stored

I use 2 commands. The first does not delete files on the destination (RSYNC) that were removed from the source; it just moves changed files to the backup-dir (RSYNC_BAK) and adds a date and time label to the end of each file name before storing the newer versions. To make it easy to find files from a specific date, we add a directory for each month to the backup-dir path.

The second command moves the deleted files to the backup-dir (RSYNC_DEL). This command runs just after the first one.

As an example, let's say a local file /drawing/ele/E1500025.dwg was changed many times before being removed, and the file /drawing/hid/H1500012.dwg was also changed but not removed. We would end up with these files on the remote:

RSYNC/drawing/hid/H1500012.dwg

RSYNC_BAK/2016_08/drawing/hid/H1500012.dwg__2016_08_17_224039
RSYNC_BAK/2016_08/drawing/hid/H1500012.dwg__2016_08_18_125240
RSYNC_BAK/2016_08/drawing/hid/H1500012.dwg__2016_08_22_224135

RSYNC_BAK/2016_08/drawing/ele/E1500025.dwg__2016_08_03_224007
RSYNC_BAK/2016_08/drawing/ele/E1500025.dwg__2016_08_04_125246
RSYNC_BAK/2016_08/drawing/ele/E1500025.dwg__2016_08_08_224054
RSYNC_BAK/2016_08/drawing/ele/E1500025.dwg__2016_08_11_223952
RSYNC_BAK/2016_08/drawing/ele/E1500025.dwg__2016_08_17_224039

RSYNC_DEL/drawing/ele/E1500025.dwg

The 2 commands I use are like the ones below:

rsync -av --acls --xattrs --stats --backup --backup-dir=/RSYNC_BAK/2016_08 --suffix=__2016_08_17_224039 /drawing /RSYNC

rsync -av --acls --xattrs --stats --backup --backup-dir=/RSYNC_DEL --delete /drawing /RSYNC

Or

BASE="/RSYNC"
BASE_DEL="/RSYNC_DEL"
BASE_BAK="/RSYNC_BAK"

dir="/drawing"

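# Pass 1: back up changed files with a timestamp suffix (no --delete)
rsync -av --acls --xattrs --stats --backup --backup-dir="${BASE_BAK}" --suffix="$(date +"__%Y_%m_%d_%H%M%S")" "${dir}" "${BASE}"

# Pass 2: move files deleted from the source into RSYNC_DEL
rsync -av --acls --xattrs --stats --backup --backup-dir="${BASE_DEL}" --delete "${dir}" "${BASE}"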
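
The monthly directory and the timestamp suffix shown in the file listing could both be derived from a single clock reading, as in this sketch. The function name `backup_names` and the variables it sets are hypothetical; the `date` format is the one used in the commands above, and taking the `%Y_%m` part of it for the monthly directory is an assumption based on the listing.

```bash
# Hypothetical helper: set bak_dir (monthly backup dir) and suffix (per-run
# label) from one timestamp, so the two cannot disagree when a run starts
# right at the end of a month.
backup_names() {
    stamp=$(date +"%Y_%m_%d_%H%M%S")   # e.g. 2016_08_17_224039
    bak_dir="$1/${stamp%_*_*}"         # drop day and time, e.g. /RSYNC_BAK/2016_08
    suffix="__${stamp}"                # e.g. __2016_08_17_224039
}

backup_names "/RSYNC_BAK"

# Print the command rather than running it, so the sketch is safe to try.
echo rsync -av --backup --backup-dir="$bak_dir" --suffix="$suffix" /drawing /RSYNC
```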

Thank you

stephenjamieson commented Sep 8, 2016

This would be awesome!

Owner

ncw commented Sep 12, 2016

Now that most remotes can do Copy or Move/Delete, this is a practical feature to implement.

It would also require #197 and #721 in an ideal world.

ncw modified the milestones: Soon, Unplanned / Help Wanted Sep 12, 2016

dsrbecky commented Sep 13, 2016

+1

jkaberg commented Sep 19, 2016

Might I suggest to at least consider/think about deduplication when this is implemented. Deduplication is a huge win, space wise :-)

Cadish commented Dec 27, 2016

Really looking forward to a backup feature like this! For me, this is the only thing missing in rclone.

MONKiCODE commented Jan 4, 2017

How's the "backup" feature coming along?

I assume this would also be working with encryption & ideally have a deduplication feature as well.

ncw modified the milestones: v1.36, Soon Jan 4, 2017

Owner

ncw commented Jan 16, 2017

I've implemented this now - please find it in this beta. Any feedback much appreciated!

http://beta.rclone.org/v1.35-33-g47ebd07/ (uploaded in 15-30 mins)

ncw closed this in 3745c52 Jan 16, 2017

Owner

ncw commented Jan 16, 2017

Here are the docs for --backup-dir

--backup-dir=DIR

When using sync, copy or move, any files which would have been
overwritten or deleted are moved in their original hierarchy into this
directory.

The remote in use must support server side move or copy and you must
use the same remote as the destination of the sync. The backup
directory must not overlap the destination directory.

For example

rclone sync /path/to/local remote:current --backup-dir remote:old

will sync /path/to/local to remote:current, but any files
which would have been updated or deleted will be stored in
remote:old.
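
The rule that the backup directory must not overlap the destination can be checked with a simple prefix test. The sketch below assumes that "overlap" means one path is equal to, or nested inside, the other; it is not rclone's actual check, `paths_overlap` is a hypothetical name, and a bare remote root such as `remote:` is not handled.

```bash
# Hypothetical check: succeed (exit 0) when two paths on the same remote
# overlap, i.e. they are equal or one is nested inside the other.
paths_overlap() {
    local a="${1%/}/"   # normalise to one trailing slash so that
    local b="${2%/}/"   # "remote:old" is not seen as a prefix of "remote:older"
    case "$a" in "$b"*) return 0 ;; esac
    case "$b" in "$a"*) return 0 ;; esac
    return 1
}

paths_overlap "remote:current" "remote:current/old" && echo "overlap"
paths_overlap "remote:current" "remote:old"         || echo "ok"
```

The first call reports an overlap because the backup dir sits inside the destination; the second pair is safe to use together.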

simnether commented Jan 16, 2017

Thanks a lot for this one! I'll download it as soon as I have a minute.
I am guessing files previously present in "backup-dir" will be overwritten?

So let's say I have Folder A with File1.txt and File2.txt. I copy it to Destination\A\File1.txt and File2.txt. Then I modify File1.txt and run the job with backup-dir. This will move Destination\A\File1.txt into the backup-dir.
Then I modify File1.txt again; this, I guess, will overwrite Destination\A\File1.txt?

Cadish commented Jan 16, 2017

Very nice, will test it when it's available for my platform (QNAP).

Thanks a lot!

dsrbecky commented Jan 16, 2017

@simnether Good question. My intention is to always set a different backup-dir based on date.

simnether commented Jan 16, 2017

@dsrbecky Actually, yours is a good point. I think I will do that too 👍
For the function itself, perhaps adding an _# would help. So, for instance, if File1.txt is already in the backup dir, it could create File1_1.txt, File1_2.txt and so on. Just a thought.

Owner

ncw commented Jan 16, 2017

@simnether wrote:

I am guessing files previously present in "backup-dir" will be overwritten?

Yes you are correct. I'll add that to the documentation.

The intention is that you'd make a new backup dir for each day, or each backup, so it is up to you how granular you want the old backups to be. I don't really want to rename the files - that would complicate the implementation.

Owner

ncw commented Jan 16, 2017

@dsrbecky wrote

My intention is to always set a different backup-dir based on date.

My plan is that rclone will eventually grow a backup command which automates the use of --backup-dir and does exactly that.

simnether commented Jan 17, 2017

@ncw - Seems to be working fine so far :)

unnfav commented Jan 17, 2017

ACD would complain about naming conflicts if the filenames in "backup-dir" are the same. But then we should be using new backup-dir names every time anyway.

robjlg commented Jan 17, 2017

Excellent, it is working like a charm. Perfect !

In DRIVE, every version in the backup-dir stays with the same name. This is very good.

Thank you, very very much

Owner

ncw commented Jan 17, 2017

@balazer wrote

One note for your documentation, on Google Drive at least, I found that if the file in the backup-dir already exists, it will remain there alongside the moved file. This is totally fine, and I actually prefer having duplicates instead of overwriting in this case.

Hmm, that is unexpected! That probably means I haven't thought through enough what happens if there is an existing file in the backup-dir. @unnfav - you are right ACD complains about naming conflicts here in my tests.

I don't really like rclone creating duplicate file names. Even though drive allows it, it causes trouble with practically everything else! I could allow this behavior for fses which allow duplicate files I suppose, but I foresee it causing problems!

So my preferred course of action would be to overwrite the files in the backup-dir.

As for the technical mechanism... The solution is quite simple - to pass in a dst Object to Move if one exists and it will delete it first.
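
On a local filesystem the same idea looks like the sketch below: if something already occupies the target path in the backup dir, delete it first, then move. This is a local stand-in built from `rm` and `mv`, not rclone's Go `Move` code, and `move_overwrite` is a hypothetical name.

```bash
# Hypothetical local stand-in for "pass the existing dst to Move so that
# it is deleted first".
move_overwrite() {
    local src="$1"
    local dst="$2"
    mkdir -p "$(dirname "$dst")"   # keep the original hierarchy
    if [ -e "$dst" ]; then
        rm -f "$dst"               # remove the old backup so no duplicate remains
    fi
    mv "$src" "$dst"
}

work=$(mktemp -d)
mkdir -p "$work/current/A" "$work/old/A"
echo "v1" > "$work/old/A/File1.txt"       # backup left by an earlier run
echo "v2" > "$work/current/A/File1.txt"   # version about to be replaced
move_overwrite "$work/current/A/File1.txt" "$work/old/A/File1.txt"
cat "$work/old/A/File1.txt"               # prints: v2
```

A plain local `mv` would overwrite the target anyway; the explicit delete matters on remotes such as drive, which would otherwise keep both files under the same name, or ACD, which reports a naming conflict.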

I'll re-open the ticket to remind me to fix this.

ncw reopened this Jan 17, 2017

robjlg commented Jan 17, 2017

@ncw wrote

I could allow this behavior for fses which allow duplicate files I suppose, but I foresee it causing problems! So my preferred course of action would be to overwrite the files in the backup-dir.

Is it very difficult to allow duplicate files on the backup-dir for DRIVE and others that allow it? As an option?

Could you consider this option?

Please note that it could be a very useful feature. In my case, I sync one particular folder every 30 minutes and there are a lot of changes. Being able to keep all those changes in the backup-dir is a huge advantage.

I was thinking of creating one backup-dir per month and keeping all file versions for the month there. Creating a new backup-dir for every sync is not a good solution.

rsync has a --suffix option that renames each file going to the backup-dir by appending the provided suffix. Implementing this could be another solution.

An option to allow duplicate files in the backup-dir might be simple, but it would not work for all storage systems supported by rclone.

unnfav commented Jan 18, 2017

@robjlg wrote

rsync has a --suffix option that renames each file going to the backup-dir by appending the provided suffix. Implementing this could be another solution.

That'd be great.

As for allowing duplicate naming, wouldn't it be a nightmare to do restores if various versions with the same filename exist in a directory?

robjlg commented Jan 18, 2017

@unnfav wrote

As for allowing duplicate naming, wouldn't it be a nightmare to do restores if various versions with the same filename exist in a directory?

For me it is OK! We can select the correct file based on its time-stamp.

In my rsync (bash shell) scripts, the --suffix option is configured as:

--suffix=$(date +"__%Y_%m_%d_%H%M%S")

The file FILE_TESTE.txt will be stored in the backup-dir as

FILE_TESTE.txt__2017_01_17_125337

Owner

ncw commented Jan 18, 2017

@unnfav wrote

As for allowing duplicate naming, wouldn't it be a nightmare to do restores if various versions with the same filename exist in a directory?

Yes it would.

So what does everyone think about this plan?

  • --backup-dir will overwrite existing files when storing new files in the DIR
  • you can set --suffix to give those files a new name - by default it will be empty so files will be stored with their original name

I don't intend to implement --suffix without --backup-dir for rclone - rsync has to jump through hoops of fire to make that work with automatic filters etc.

To fix the original overwrite issue, the most efficient thing to do will be to load the metadata for the objects in backup-dir into memory. This should be a small fraction of the objects in the destination dir, which are also loaded into memory.
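
The point of loading the backup-dir listing up front is to replace one lookup per file with a single listing followed by cheap in-memory checks. Here is a local sketch of that idea, using `find` as a stand-in for a remote listing and a newline-separated shell variable as the in-memory index. Both are assumptions for illustration (rclone itself is written in Go), the function names are hypothetical, and file names containing newlines are not handled.

```bash
# Build the index once: every file under the backup dir, relative to it.
load_index() {
    ( cd "$1" && find . -type f | sed 's|^\./||' )
}

# Look a relative path up in the index without touching the filesystem.
in_index() {
    printf '%s\n' "$1" | grep -Fxq -- "$2"
}

work=$(mktemp -d)
mkdir -p "$work/old/A"
echo "v1" > "$work/old/A/File1.txt"

index=$(load_index "$work/old")
for rel in A/File1.txt A/File2.txt; do
    if in_index "$index" "$rel"; then
        echo "$rel: already in backup-dir, delete before moving"
    else
        echo "$rel: no conflict"
    fi
done
```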

robjlg commented Jan 18, 2017

@ncw wrote

So what does everyone think about this plan?

  • --backup-dir will overwrite existing files when storing new files in the DIR
  • you can set --suffix to give those files a new name - by default it will be empty so files will be stored with their original name

For me this is the best solution. I will use --suffix with --backup-dir.

simnether commented Jan 18, 2017

@ncw - I would implement that as well to keep it simpler. As of right now the backups are stored in a separate folder, and then a subfolder is created with the current date. Using a suffix will probably make it easier when restoring:
It'll all be in the same folder, meaning I will know what file versions I have by looking at the same location (I will also use the date on the file).

I guess I can set --backup-dir to target the same (backupto) folder to keep everything in one place?

As I'm using it right now, I have a SYNC job towards ACL, then I do a COPY with --backup-dir from ACL to another location in ACL that'll keep all the stuff I delete. [For anyone who thinks it might be a pain because I need to download/re-upload: it's not, as a) I am backing up a NAS, so I would still need download traffic, and b) my bandwidth is higher than what Amazon provides me with, so I wouldn't see any improvement.]

ncw added a commit that referenced this issue Jan 19, 2017

Implement --suffix for use with --backup-dir only #98
This also makes sure we remove files we are about to override in the
--backup-dir properly.

ncw added a commit that referenced this issue Jan 19, 2017

Implement --suffix for use with --backup-dir only #98
This also makes sure we remove files we are about to override in the
--backup-dir properly.
@ncw

This comment has been minimized.

Show comment
Hide comment
@ncw

ncw Jan 19, 2017

Owner

OK Here is the next revision. It supports --suffix and won't duplicate files in drive or cause 409 errors with ACD

http://beta.rclone.org/v1.35-40-gb6848a3/ (uploaded in 15-30 mins)

Please test and let me know how you get on - thanks!

New docs

--backup-dir=DIR

When using sync, copy or move any files which would have been
overwritten or deleted are moved in their original hierarchy into this
directory.

If --suffix is set, then the moved files will have the suffix added
to them. If there is a file with the same path (after the suffix has
been added) in DIR, then it will be overwritten.

The remote in use must support server side move or copy and you must
use the same remote as the destination of the sync. The backup
directory must not overlap the destination directory.

For example

rclone sync /path/to/local remote:current --backup-dir remote:old

will sync /path/to/local to remote:current, but for any files
which would have been updated or deleted will be stored in
remote:old.

If running rclone from a script you might want to use today's date as
the directory name passed to --backup-dir to store the old files, or
you might want to pass --suffix with today's date.

--suffix=SUFFIX

This is for use with --backup-dir only. If this isn't set then
--backup-dir will move files with their original name. If it is set
then the files will have SUFFIX added on to them.

See --backup-dir for more info.
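
The scripting suggestion above (a dated --backup-dir, or a dated --suffix) could look roughly like this. It is only a sketch, assuming a POSIX shell: the paths and remote names are the placeholders from the example, the function names and the leading dot in the suffix are illustrative choices, and the commands are printed rather than run so they can be reviewed first.

```shell
#!/bin/sh
# Placeholders from the docs example; adjust to your own setup.
SRC=/path/to/local
DST=remote:current
OLD=remote:old

# Variant 1: one backup directory per run, named after the date stamp.
dated_dir_cmd() {
    stamp=$1
    printf 'rclone sync %s %s --backup-dir %s/%s\n' "$SRC" "$DST" "$OLD" "$stamp"
}

# Variant 2: a single backup directory, old files told apart by a dated suffix.
# The leading "." is only a readability choice, not something rclone requires.
dated_suffix_cmd() {
    stamp=$1
    printf 'rclone sync %s %s --backup-dir %s --suffix .%s\n' "$SRC" "$DST" "$OLD" "$stamp"
}

# Print both commands for today's date (YYYYMMDD).
today=$(date +%Y%m%d)
dated_dir_cmd "$today"
dated_suffix_cmd "$today"
```

With the first variant each run leaves its own directory of replaced files; with the second, all old versions sit side by side in remote:old under different suffixes.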


balazer Jan 19, 2017

@ncw, I just tested v1.35-40-gb6848a3 here with Google Drive. It's working fine. Existing files in the backup-dir get replaced. It doesn't keep the old revision, like a move would. Thanks again for your hard work. I already converted my backup scheme to use --backup-dir, and Google is thrilled about all of the data I have stored on their servers.


ncw Jan 20, 2017

Owner

@balazer thanks for testing :-)


robjlg Jan 20, 2017

@ncw wrote

OK Here is the next revision. It supports --suffix and won't duplicate files in drive or cause 409 errors with ACD

http://beta.rclone.org/v1.35-40-gb6848a3/ (uploaded in 15-30 mins)

Please test and let me know how you get on - thanks!

Hello, I tested this new version with DRIVE and it worked as stated. With --suffix, all versions remain in the backup-dir, each one with its own suffix.

With no --suffix provided, only one version is kept.

Thank you very much for this marvelous version.


simnether Jan 20, 2017

Tested with ACL and seems to be working as expected :)


ncw Jan 20, 2017

Owner

Whoohoo!

Thanks for testing @robjlg and @simnether .

I think I'll close this ticket now, which is another one ticked off for the 1.36 release :-)


@ncw ncw closed this Jan 20, 2017


robjlg Jan 23, 2017

Hi,

When do you intend to release the 1.36 version?


ncw Jan 24, 2017

Owner

@robjlg wrote:

When do you intend to release the 1.36 version?

The plan is by 19th Feb


robjlg Jan 25, 2017

Thanks, anxiously waiting.


fboyd Feb 2, 2017

Sorry if this is in the wrong spot; I'm new to posting in community resources.
I read through this ticket and downloaded the following betas:
rclone-v1.35-33-g47ebd07β
rclone-v1.35-40-gb6848a3β
rclone-v1.33-100-gcb40511β

rclone-v1.35-33-g47ebd07β had an issue where it moved changed files over, as the original for ACD...
rclone-v1.35-40-gb6848a3β seems to work perfectly.
rclone-v1.33-100-gcb40511β is missing the --backup-dir option completely.

But good work: from my testing on rclone-v1.35-40-gb6848a3β, everything is working nicely with ACD.


ncw Feb 2, 2017

Owner

@fboyd78 thanks for testing.


vb0 Feb 4, 2017

I always missed this option in rclone. It is EXTREMELY useful and neither easy nor completely possible to work around. It is fantastic for making simple (= reliable and hard to screw up) incremental backups: very easy to access, very easy to verify that everything is there (as the destination folder will always mirror the original while the --backup-dir directories will contain all the removed/changed files), very easy to prune (just delete all the --backup-dir directories from last year), etc.
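
The pruning idea could be sketched as below. This assumes a POSIX shell and that each --backup-dir directory is named with an eight-digit YYYYMMDD date, which is a naming convention chosen for this example and not something rclone imposes; old_backup_dirs is a hypothetical helper, and it only prints candidates, it deletes nothing.

```shell
#!/bin/sh
# Print the backup directory names that are dated before a cutoff.
# Usage: old_backup_dirs CUTOFF NAME...
# Names that are not exactly eight digits are skipped untouched.
old_backup_dirs() {
    cutoff=$1
    shift
    for name in "$@"; do
        case $name in
            [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])
                # Numeric comparison works because the names are YYYYMMDD.
                if [ "$name" -lt "$cutoff" ]; then
                    printf '%s\n' "$name"
                fi
                ;;
        esac
    done
}

# Example: everything before 1 Jan 2017 is a pruning candidate.
old_backup_dirs 20170101 20151231 20160704 20170119 notes
```

The names could come from listing the backup location (for example with rclone lsd), and each printed directory could then be removed with rclone purge once the list has been checked by hand.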

Now that we have it, rclone is a powerful, one-line, almost no-setup-needed (encrypted too, if desired!), all-included incremental backup system. No database needed, no big multi-GB archive files holding who-knows-what-where, everything just filesystem-based. You can set it up on multiple computers with minimal effort, and you can access your files from any machine with minimal setup (just get your config and the rclone binary).

For reference (maybe it helps somebody less familiar with Windows's peculiarities): on Windows I'm using %datetime% generated from Cygwin's date binary, and on Linux of course the same date command directly:

for /f %%i in ('c:\cygwin64\bin\date.exe +"%%Y%%m%%d%%H%%M%%S"') do set datetime=%%i
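
On Linux the equivalent is a single command substitution. A minimal sketch, assuming a POSIX shell; the source path and remote names are placeholders, and the rclone command is printed rather than run:

```shell
#!/bin/sh
# Same 14-digit YYYYMMDDHHMMSS timestamp as the batch line above,
# taken directly from date(1).
datetime=$(date +%Y%m%d%H%M%S)

# Print a sync command that keeps this run's replaced files in their own directory.
echo rclone sync /path/to/local remote:current --backup-dir "remote:old/$datetime"
```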

Two inconsequential details I've noticed testing this:

  • there's no "--backup" switch needed (or indeed accepted at all) like for rsync, just use --backup-dir
  • it doesn't seem to work for "copy", only "sync". That's perfectly fine, since we wanted it for sync anyway, but if you're coming from copy (like I was, because I actually wanted sync but was too afraid it would sync some accidentally removed folders) you might be surprised that it isn't working as expected.

In any case absolutely fantastic job. Thanks a lot.


balazer Feb 4, 2017

--backup-dir is supposed to work with sync, copy, and move. I've tested it with sync and copy to Google Drive, and it is working fine. @vb0, if it's not working for you, maybe you can provide details so it can be debugged.


vb0 Feb 4, 2017

My mistake, it does work as expected with "copy" - I was probably too quick and the web GUI wasn't showing the folders yet.


traynier Mar 8, 2017

I'm a very recent convert to rclone, and love it. Thanks for all your hard work on it! :-)

I have been testing this new functionality out, and just updated to the latest beta rclone v1.35-163-gc45c604β, and the following errors still happen:

Note that my remote is OneDrive, accessed through crypt with encrypted filenames (although using the OneDrive remote without crypt doesn't seem to make any difference).

I am copying from local to my remote, and specifying a backup-dir:

rclone sync test oneenc:Z/test --backup-dir oneenc:Zold

Note that if the backup-dir does not exist, it gets stuck at "Waiting for checks to finish". Digging into the --dump-bodies output shows it is in a retry loop, calling GET /v1.0/monitor/REDACTED over and over after the call to POST /v1.0/drive/items/REDACTED/action.copy

The monitor returns HTTP 500, which seems to make rclone think it's rate limited and makes the pacer retry continually, but the body of the HTTP 500 monitor response shows that the server side copy actually failed:

{"operation":"ItemCopy","percentageComplete":0.0,"status":"failed","statusDescription":"Completed 0/0 files; 0/85 bytes"}

If the backup-dir exists but a file with the same name already exists in it, the copy fails:

2017/03/08 18:35:42 ERROR : test.txt: Failed to copy: can't copy "002i3q2u20jlcd0ietvoa93ri8" -> "002i3q2u20jlcd0ietvoa93ri8" as are same name when lowercase

If I use --suffix ".old", but a file with the new filename (i.e. including the new suffix) already exists in the backup-dir, it gets stuck in a loop calling monitor, the same as when the directory doesn't exist.

[Edit] I can confirm that if the backup-dir exists and the file doesn't already exist in it, it does work as expected. :)
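
The distinction described above, between a transient error worth retrying and a monitor response whose body already reports a failed copy, could be drawn by looking at the body before retrying. This is an illustrative sketch only, assuming a POSIX shell and the JSON body exactly as quoted in this report; copy_failed is a hypothetical helper and not rclone's actual code.

```shell
#!/bin/sh
# Return success (0) when a monitor response body reports a failed copy.
# Only the "status":"failed" field quoted in the report is checked.
copy_failed() {
    case $1 in
        *'"status":"failed"'*) return 0 ;;
        *) return 1 ;;
    esac
}

# The body returned with the HTTP 500, as quoted in the report.
body='{"operation":"ItemCopy","percentageComplete":0.0,"status":"failed","statusDescription":"Completed 0/0 files; 0/85 bytes"}'

if copy_failed "$body"; then
    echo "copy failed: stop retrying and report the error"
else
    echo "no failure reported: keep polling"
fi
```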
