Live backup preview (mount) #636

Open. davidrios wants to merge 21 commits into master.

@davidrios davidrios commented Oct 20, 2022

This work is inspired by !628, but it's a more complete implementation.

I'm using cgofuse so it works on all platforms, including Windows.

There are two commands:

  • mount <MOUNT_POINT>: must be run from inside a repository directory; it uses the repository's preferences to load snapshot info.
  • mount-storage <STORAGE_URL> <MOUNT_POINT>: mounts a storage directly, without the need for a repository. Note that it may still need a repository under some circumstances, so there's a -repository option for that.

Snapshot/revision information is loaded only when you try to browse the containing folder. By default it'll create a base tree organizing snapshot revisions by date using the format /%YYYY/%MM/%DD/%HH%mm.%REV. For instance, let's say you have this output from duplicacy list:

$ duplicacy list
Snapshot Users revision 1 created at 2022-10-19 20:09 -hash
Snapshot Users revision 2 created at 2022-10-19 21:38
Snapshot Users revision 3 created at 2022-10-20 10:38
Snapshot Users revision 4 created at 2022-11-14 13:44

The following base tree will be created:

2022/
    10/
        19/
            2009.1/
            2138.2/
        20/
            1038.3/
    11/
        14/
            1344.4/
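
As a rough illustration of the mapping above (not the code from this PR), the sketch below turns one line of duplicacy list output into its directory in the base tree. The function name revisionDir and the way the timestamp is passed in are assumptions made for this example.

```go
package main

import (
	"fmt"
	"time"
)

// revisionDir maps a revision's creation time and number to its directory in
// the date-based tree (YYYY/MM/DD/HHmm.REV). createdAt uses the
// "2006-01-02 15:04" layout that appears in the list output above.
// Hypothetical helper for illustration; it is not taken from the PR.
func revisionDir(createdAt string, revision int) (string, error) {
	t, err := time.Parse("2006-01-02 15:04", createdAt)
	if err != nil {
		return "", err
	}
	// "2006/01/02/1504" formats as year/month/day/hour+minute.
	return fmt.Sprintf("%s.%d", t.Format("2006/01/02/1504"), revision), nil
}

func main() {
	// The four revisions from the example listing.
	created := []string{
		"2022-10-19 20:09",
		"2022-10-19 21:38",
		"2022-10-20 10:38",
		"2022-11-14 13:44",
	}
	for i, c := range created {
		dir, err := revisionDir(c, i+1)
		if err != nil {
			fmt.Println("bad timestamp:", err)
			continue
		}
		fmt.Println(dir)
	}
}
```

Revisions created on the same day share a day directory, which is why revisions 1 and 2 both end up under 2022/10/19/.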

If you want all revisions to be in the same directory, you can use the -flat parameter. Revision and snapshot directories contain an extra empty directory to prevent Windows Explorer from prematurely triggering the download of data from the storage.

Folders and files will display their saved attributes, with the caveat that everything that's not a folder is shown as a regular file.

File reading is efficient: only the chunks needed for a given OS read request are downloaded, and they are cached afterwards. The cache is a 2Q LRU cache from golang-lru that holds up to 20 chunks, keyed by chunk hash.
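
To make "only the chunks needed for the read request" concrete, here is a minimal sketch of the range calculation under a simplifying assumption: the file is modelled as the plain concatenation of whole chunks with known lengths. The names chunkSpan and spansForRead are hypothetical, and the actual implementation in this PR may track file boundaries inside chunks differently.

```go
package main

import "fmt"

// chunkSpan is the part of one chunk that a read needs.
type chunkSpan struct {
	Index int   // position of the chunk in the file's chunk list
	Start int64 // first byte needed within the chunk
	End   int64 // one past the last byte needed within the chunk
}

// spansForRead returns the chunks that overlap the byte range
// [offset, offset+size) of a file made of chunks with the given lengths.
// Chunks outside that range are never touched, so they need no download.
func spansForRead(chunkLengths []int64, offset, size int64) []chunkSpan {
	if size <= 0 {
		return nil
	}
	var spans []chunkSpan
	end := offset + size
	var pos int64 // file offset at which the current chunk begins
	for i, n := range chunkLengths {
		chunkEnd := pos + n
		if chunkEnd > offset && pos < end {
			s := chunkSpan{Index: i, Start: 0, End: n}
			if offset > pos {
				s.Start = offset - pos
			}
			if end < chunkEnd {
				s.End = end - pos
			}
			spans = append(spans, s)
		}
		pos = chunkEnd
		if pos >= end {
			break // the rest of the file is past the requested range
		}
	}
	return spans
}

func main() {
	// Three 4-byte chunks; read 3 bytes starting at file offset 3.
	for _, s := range spansForRead([]int64{4, 4, 4}, 3, 3) {
		fmt.Printf("chunk %d: bytes [%d, %d)\n", s.Index, s.Start, s.End)
	}
}
```

In this model a read that straddles a chunk boundary touches two chunks, and a read past the end of the file touches none.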

I've tested using a repository with these characteristics:

  • 10.2 GB
  • 70151 Files
  • 10467 Folders
  • 4 revisions
  • sftp storage on a remote server

It takes a couple of seconds to create the base tree and a few seconds for every revision dir that is loaded for the first time, but otherwise everything works as expected. Memory use hovered around 700MB, even while tarring a whole revision dir and piping it to /dev/null.

@CLAassistant

CLAassistant commented Oct 20, 2022

CLA assistant check
All committers have signed the CLA.

@sevimo123
Contributor

I haven't tried it yet, but it sounds great! Couple of questions here:

  • When/how is the cached revision list refreshed (e.g. you mount, then run another backup, and now want to see the new revision in the mount)?
  • Does the cache use memory or disk? Is there any way to limit cache usage (e.g. à la rclone vfs parameters)? Otherwise it would be possible to run out of memory/disk during long-running mounts.

We'll need to stress test it with some storages with thousands of revisions to see how it scales up. Cheers!

@davidrios
Author

davidrios commented Oct 20, 2022

  • The revision list is loaded only once, so if there are new revisions you must stop the mount and start it again to see them.
  • The chunk cache is in memory but only holds 20 items, so with a maximum chunk size of 16MB it would consume at most ~320MB of memory.
  • Each revision's file list is kept in memory for as long as the mount is running, but memory use is low, so it should be fine to browse a few dozen/hundred revisions at a time. Only revisions that were browsed or read from are loaded and kept in memory.

The design choices were made on the assumption that the command will be used when needed and only to browse a few revisions. I guess it could be tweaked to support running permanently as a service, if that's a desired use case.

@sevimo123
Contributor

> The design choices were made on the assumption that the command will be used when needed and only to browse a few revisions. I guess it could be tweaked to support running permanently as a service, if that's a desired use case.

I tend to run my mounts more or less permanently, so that I can have access from anything (scripts etc.) without running anything special first. But for this case I don't think it is really necessary, as the primary use would be to restore, which may never (or very infrequently) happen. I think in most cases it is perfectly fine to re-mount to refresh.

Though one use case might be simply checking that new revisions are being created by looking at the filesystem, without running specific duplicacy commands. Probably not too important ;)

@davidrios
Author

To me that sounds strange, but I can definitely understand the appeal. I guess if the feature is there, there's really no reason to not use it 😂

@gilbertchen
Owner

This pull request has been mentioned on Duplicacy Forum. There might be relevant details there:

https://forum.duplicacy.com/t/mount-duplicacy-snapshot-as-a-virtual-filesystem/1718/18

@gilbertchen
Owner

This pull request has been mentioned on Duplicacy Forum. There might be relevant details there:

https://forum.duplicacy.com/t/restore-multiple-files-web-ui/5320/21

@gilbertchen
Owner

This pull request has been mentioned on Duplicacy Forum. There might be relevant details there:

https://forum.duplicacy.com/t/mount-duplicacy-snapshot-as-a-virtual-filesystem/1718/47

@cyberguy42

@davidrios I can't believe this pull request hasn't received more attention. The absence of this functionality was a deal breaker for me when I first tested Duplicacy several years ago. When I saw this pull request about a week ago, I decided to give Duplicacy another go and have not regretted it. Thank you for sharing your efforts with the community!
