Unable to backup/restore files/dirs with same name #549

Closed
rwlodarc opened this issue Jul 26, 2016 · 28 comments · Fixed by #1494

Comments

@rwlodarc

Output of restic version

restic 0.1.0
compiled at 2016-07-20 12:42:43 with go1.6.3

Expected behavior

After restore, all directories should be restored.

Actual behavior

Only one directory is restored.

Steps to reproduce the behavior

  1. Create files

/tmp/restic/FILESNEW01/Dir01/Test01.txt
/tmp/restic/FILESNEW01/Dir01/Test02.txt
/tmp/restic/FILESNEW01/Dir01/Test03.txt

/tmp/restic/FILESNEW02/Dir01/Test01.txt
/tmp/restic/FILESNEW02/Dir01/Test02.txt
/tmp/restic/FILESNEW02/Dir01/Test03.txt

Content of the files:
cat /tmp/restic/FILESNEW01/Dir01/Test0*
Content file. /tmp/restic/FILESNEW01/Dir01/Test01.txt
Content file. /tmp/restic/FILESNEW01/Dir01/Test02.txt
Content file. /tmp/restic/FILESNEW01/Dir01/Test03.txt

cat /tmp/restic/FILESNEW02/Dir01/Test0*
Content file. /tmp/restic/FILESNEW02/Dir01/Test01.txt
Content file. /tmp/restic/FILESNEW02/Dir01/Test02.txt
Content file. /tmp/restic/FILESNEW02/Dir01/Test03.txt

I want to back up:

  • /tmp/restic/FILESNEW01/Dir01/
  • /tmp/restic/FILESNEW02/Dir01/

Commands:
Initialize the repository in the /tmp/restic/BACKUP directory

  • restic -r /tmp/restic/BACKUP/ init

Make backup

  • restic backup /tmp/restic/FILESNEW01/Dir01 /tmp/restic/FILESNEW02/Dir01 -r /tmp/restic/BACKUP/

scan [/tmp/restic/FILESNEW01/Dir01 /tmp/restic/FILESNEW02/Dir01]
scanned 2 directories, 6 files in 0:00
[0:00] 16.67% 0B/s 51B / 306B 0 / 8 items 0 errors ETA 0:00 duration: 0:00, 0.01MiB/s
snapshot 4d197b90 saved

Check that the backup exists in the repository

  • restic -r /tmp/restic/BACKUP/ snapshots

ID Date Host Directory

4d197b90 2016-07-26 14:14:43 nebss /tmp/restic/FILESNEW01/Dir01 /tmp/restic/FILESNEW02/Dir01

Restore backup

  • restic -r /tmp/restic/BACKUP/ restore 4d197b90 -t /tmp/restic/RESTORE/

restoring <Snapshot 4d197b90 of [/tmp/restic/FILESNEW01/Dir01 /tmp/restic/FILESNEW02/Dir01] at 2016-07-26 14:14:43.208840145 +0300 EEST> to /tmp/restic/RESTORE/

Check that the directories/files exist

  • ls /tmp/restic/RESTORE/
    Dir01
  • cat /tmp/restic/RESTORE/Dir01/Test0*
    Content file. /tmp/restic/FILESNEW01/Dir01/Test01.txt
    Content file. /tmp/restic/FILESNEW01/Dir01/Test02.txt
    Content file. /tmp/restic/FILESNEW01/Dir01/Test03.txt
@fd0
Member

fd0 commented Jul 26, 2016

Thanks for reporting this issue, I think this is a bug.

@mappu
Contributor

mappu commented Jul 27, 2016

This will probably happen whenever top-level directories have the same name, because the full path is not restored, only the top-level directory.

The solution is to reconstruct the full path upon restore, and restore each tree into that full path. The resulting path would then look like /tmp/restic/tmp/restic/FILESNEW0{1,2}/Dir01/. I think that's acceptable.

Does the patch need to be implemented as part of the restore?
Or, maybe it has to be done during backup by building a different top-level tree that includes the full path components?

@fd0
Member

fd0 commented Jul 28, 2016

I also suspect this is the case. At the moment, restic works like this:

When called as restic backup A/foo B/foo it creates a tree structure in the repository that looks like this:

├── foo
└── foo

So only the last path component of each argument to the backup command is kept, which leads to a problem when restoring such a snapshot.

In order to correct this, I propose implementing the same behavior as tar, which would in this case create the following tree:

.
├── A
│   └── foo
└── B
    └── foo

This will require some work in the archiver part of restic. I don't think we'll need to touch restore at all.
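The tar behavior fd0 proposes can be seen with tar itself. A quick demonstration (run in an empty temporary directory; the A/foo and B/foo names are just placeholders):

```shell
# Create two same-named directories under different parents,
# archive both, and list what tar actually stores.
mkdir -p A/foo B/foo
echo 1 > A/foo/x
echo 2 > B/foo/y
tar -cf archive.tar A/foo B/foo
tar -tf archive.tar
```

The listing shows A/foo/x and B/foo/y with their full leading path components intact, so nothing collides at the top level.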

@wscott

wscott commented Aug 26, 2016

I reported the same thing in #588, and that issue has a test case you can use.

@trbs
Contributor

trbs commented Aug 26, 2016

@fd0 I propose to also include an option (--store-full-path) to backup where it explicitly stores the full 'real' path of the backup target.

The reasoning is that, as in the tar case and with several other backup tools, you can end up with a somewhat convoluted restore tree. While this is a good, sane default, I personally would like my restores to resemble the entire layout of the original filesystem for host backups. (Even better if restore could also prefix the hostname to the restore location.)

@pdf

pdf commented Aug 27, 2016

@trbs I think the default needs to be storing full paths, with a switch for the special case of using relative paths. The reason is that relative paths can produce unexpected or undefined behaviour, but absolute paths can't. If you want prefixes or some other form of path mangling, I'd suggest that's an entirely separate issue.

@fd0
Member

fd0 commented Aug 27, 2016

I've thought about this, and I think we need to change the backup behavior so that the full path (as given on the command line) is always saved. That's what tar does, and it works very well. This is unfortunately a relic of a bad design decision early in restic's development.

fd0 changed the title from "Restore doesn't restore few directories" to "Unable to backup/restore files/dirs with same name" on Aug 27, 2016
@yhafri

yhafri commented Mar 21, 2017

+1 for --store-full-path

@zcalusic
Member

Hate to just +1, but I'm also very interested in a solution to this bug. I have several pending installations of restic where this bug is unfortunately a showstopper.

Thanks @fd0 for your work on this, I understand it's not easy to unwind now.

@middelink
Member

middelink commented Mar 21, 2017

-1 for --store-full-path. I would much rather see the full path always going in the backup and then having a --strip-components <N> to take parts away if you don't need them at restore time. This means the full data is always available in the backup and if the user strips too many components from the path at restore time and therefore combines subdirs, it becomes a recoverable user error.

As to prefixing the hostname to the backup location, this seems it can be easily done from the cmdline, as most people know from which host they are going to restore beforehand :)
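tar's --strip-components flag behaves the way middelink describes. A small demonstration (GNU or BSD tar, run in an empty temporary directory; the names are placeholders) of how stripping one component merges the two same-named subdirs, and why that stays a recoverable error when the full paths are in the archive:

```shell
mkdir -p src/A/foo src/B/foo
echo from-A > src/A/foo/x
echo from-B > src/B/foo/y
tar -C src -cf archive.tar A/foo B/foo
# Extract with one leading path component stripped:
# A/foo and B/foo both become ./foo, so their contents merge.
mkdir out
tar -C out -xf archive.tar --strip-components=1
ls out        # only "foo"
ls out/foo    # x and y, merged together
```

Because the archive still contains the full A/foo and B/foo paths, extracting again without --strip-components recovers the original layout.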

@mholt
Contributor

mholt commented Jun 10, 2017

Given that you're not at 1.0 yet, I vote that if a breaking change has to be made for the ideal fix, you go ahead and make it sooner rather than later.

@fd0
Member

fd0 commented Jun 10, 2017

@mholt I agree, I'm already working on this. As I said, this is caused by a bad design decision early on and needs to be corrected.

@zx2c4

zx2c4 commented Jul 5, 2017

Hey @fd0 -- just saw that 0.7 was released. Is this (and #910 and #909) on the map for 0.7.1?

@fd0
Member

fd0 commented Jul 5, 2017

Maybe not for 0.7.1, but for 0.8.0 or so. I've already started working on it though. Maybe a bit of background: This is caused by the archiver code, which is the oldest code present in restic. Unfortunately (as I was just beginning to learn Go back in 2013/2014) the archiver code is very complex and I made a lot of beginner mistakes (too much concurrency, too many channels). I also worried about things that turned out to be not a problem at all, and overlooked things that became a problem (e.g. the index handling).

So, I've already started on reimplementing the archiver code completely, using concurrency only when it makes sense (i.e. processing individual chunks) and not reading 20 files from the disk in parallel. This code also includes proper directory walking and will insert the complete paths into the repo.

Fortunately, this is really just the archiver that needs to be touched, the rest of the code will (thanks to the design of restic and the repo) just continue to work fine.

@mbiebl

mbiebl commented Aug 17, 2017

Will this change affect existing repositories, and if so, how?

@fd0
Member

fd0 commented Aug 18, 2017

"Affecting" in the sense that new backups will have a slightly different structure, yes, but that's about it. No migration or anything is needed.

@rvdh

rvdh commented Sep 11, 2017

@fd0 Any idea when we might expect snapshots that contain the full original path? We are currently working on automating backups and restores using restic.

When automating the restore, having the source path intact is essential.

If I have a server with two 'data' directories being backed up (and this is not theoretical, we have a number of servers with Confluence and JIRA 'data' directories that need to be backed up), the restore process needs to know which data directory belongs to Confluence and which belongs to JIRA. Names like 'data' and 'data-1' obviously don't cut it here.

I think the best workaround for now is backing up the data directories in separate snapshots and tagging them with 'JIRA' or 'Confluence'?

@fd0
Member

fd0 commented Sep 11, 2017

There's no timeline, sorry.

fd0 added this to the 0.8.0 milestone on Sep 11, 2017
@bherila

bherila commented Sep 12, 2017

I think the best workaround for now is backing up the data directories in separate snapshots and tagging them with 'JIRA' or 'Confluence'?

Yes, but per #1225 you won't be able to easily merge them into one repo later.

@willemw12

Regarding option --store-full-path: rsync has this option: -R, --relative.
Maybe use the same option name for restic?

@fd0
Member

fd0 commented Feb 20, 2018

For full-system backups I've described a workaround here: https://forum.restic.net/t/full-system-restore/126/8 It's not pretty but will do the job until #1494 is done.

@tbluemel

This bug worried me a bit, but I can't reproduce it in 0.8.3 with the steps provided. Is this still an open issue?

@fd0
Member

fd0 commented Mar 20, 2018

Yes, unfortunately this is still an issue.

@tbluemel

Hm, I somehow can't replicate the issue, so I'm not sure what I'm doing differently. I've attached my test script.

test_restic_549.zip

@fd0
Member

fd0 commented Mar 21, 2018

You can reproduce it like this:

$ mkdir -p dir1/subdir
$ echo foo > dir1/subdir/foo

$ mkdir -p dir2/subdir
$ echo bar > dir2/subdir/bar

$ restic backup dir1/subdir dir2/subdir
password is correct
scan [/home/user/dir1/subdir /home/user/dir2/subdir]
scanned 2 directories, 2 files in 0:00
/home/user/dir2: name collision for "subdir", renaming to "subdir-1"
[...]
snapshot f6138d06 saved

For the two subdirs, restic uses the basename of each subdir as the top-level dir in the repo, so both dir1/subdir and dir2/subdir become subdir; that's what causes the collision.

Listing the latest snapshot shows it:

$ restic ls latest
password is correct
snapshot f6138d06 of [/home/user/dir1/subdir /home/user/dir2/subdir] at 2018-03-21 20:38:33.58232292 +0100 CET):
/subdir
/subdir/foo
/subdir-1
/subdir-1/bar

In your test case, the basenames of $TESTDIR/dir1 and $TESTDIR/dir2 are different (dir1 vs. dir2), so the bug does not occur.
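The collision fd0 describes comes straight from taking the basename of each target; you can see it without restic at all (the paths below are just the ones from the transcript above):

```shell
# Both backup targets reduce to the same final path component,
# so both would claim "subdir" as the top-level name in the repo.
basename /home/user/dir1/subdir   # prints: subdir
basename /home/user/dir2/subdir   # prints: subdir
```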

@flo82

flo82 commented May 25, 2018

From the release notes of version 0.9:

The first backup with this release of restic will likely result in all files being re-read locally, so it will take a lot longer. The next backup after that will be fast again.

I just want to give you some statistics:

first backup:

-------------------------------------------------------------
Start: Do 24. Mai 05:15:01 CEST 2018
437 snapshots

Files:           0 new,     0 changed, 40524 unmodified
Dirs:            0 new,     0 changed,     2 unmodified
Added:      0 B

processed 40524 files, 14.805 GiB in 1:38
snapshot f724ff21 saved

Files:         556 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Added:      719 B

processed 556 files, 914.493 GiB in 2:15:29
snapshot 3c0e0f1b saved

Files:       11570 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Added:      719 B

processed 11570 files, 66.044 GiB in 16:21
snapshot 312fd29c saved

Files:        2309 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Added:      719 B

processed 2309 files, 163.332 GiB in 24:13
snapshot 2baab573 saved

Files:         312 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Added:      719 B

processed 312 files, 1.503 TiB in 4:48:23
snapshot 02dfe40c saved

Files:       743172 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Added:      84.927 MiB

processed 743172 files, 89.131 GiB in 2:48:59
snapshot dcee3e70 saved

Files:         441 new,     0 changed,     0 unmodified
Dirs:            2 new,     0 changed,     0 unmodified
Added:      719 B

processed 441 files, 727.575 GiB in 1:56:36
snapshot 676adc45 saved
End:   Do 24. Mai 17:46:46 CEST 2018
Duration: 12h:31m:45s
-------------------------------------------------------------

second one:

-------------------------------------------------------------
Start: Fr 25. Mai 05:15:01 CEST 2018
444 snapshots

Files:           0 new,     0 changed, 40524 unmodified
Dirs:            0 new,     0 changed,     2 unmodified
Added:      0 B

processed 40524 files, 14.805 GiB in 1:42
snapshot 9c7cf320 saved

Files:           0 new,     0 changed,   556 unmodified
Dirs:            0 new,     0 changed,     2 unmodified
Added:      0 B

processed 556 files, 914.493 GiB in 0:15
snapshot 533e2155 saved

Files:           0 new,     0 changed, 11570 unmodified
Dirs:            0 new,     0 changed,     2 unmodified
Added:      0 B

processed 11570 files, 66.044 GiB in 0:17
snapshot 1c1235c3 saved

Files:           0 new,     0 changed,  2309 unmodified
Dirs:            0 new,     0 changed,     2 unmodified
Added:      0 B

processed 2309 files, 163.332 GiB in 0:13
snapshot d5ef168d saved

Files:           0 new,     0 changed,   312 unmodified
Dirs:            0 new,     0 changed,     2 unmodified
Added:      0 B

processed 312 files, 1.503 TiB in 0:16
snapshot 76e94946 saved

Files:         292 new,     0 changed, 743172 unmodified
Dirs:            0 new,     2 changed,     0 unmodified
Added:      32.790 MiB

processed 743464 files, 89.163 GiB in 1:06
snapshot 12fa66e8 saved

Files:           0 new,     0 changed,   441 unmodified
Dirs:            0 new,     0 changed,     2 unmodified
Added:      0 B

processed 441 files, 727.575 GiB in 0:15
snapshot ab2d29bb saved
End:   Fr 25. Mai 05:19:12 CEST 2018
Duration: 0h:4m:11s
-------------------------------------------------------------

So "a lot longer" really does mean a lot longer :-)
Keep up the great work! 👍

@tbluemel

@fd0, awesome work! Thanks so much! Your backup tool has become my favorite for all my off-site backups (using b2) :-)
