Mount backups as real folders #3081

Open
piegamesde opened this issue Mar 5, 2018 · 11 comments

@piegamesde
Contributor

At the moment, there is no good way to look at the backups that have been made. My idea is to change this by writing a FUSE module for the backup format. Here's what using it could look like:

  • If the backup drive is remote, mount it locally first. (For now; this avoids the additional complexity of remote access.)
  • Mount the folder where the backup is to a new folder using that FUSE module. The module will decrypt the needed data as it is requested.
  • This folder will contain one subfolder per snapshot of the backup
  • Each folder contains the full backup to browse

Obviously, this will be read-only and very (very) slow. But if you want to do a manual partial restore or just run a script or make a diff to the original, this will come in very handy.
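
The layout described above (one subfolder per snapshot, each containing the full tree) means every path the FUSE module receives has to be split into a snapshot name and a path inside the backup. A minimal sketch of that split; the names `MountPath` and `split_mount_path` are made up for illustration and are not part of Duplicati:

```python
from typing import NamedTuple, Optional


class MountPath(NamedTuple):
    snapshot: Optional[str]  # None when the path is the mount root itself
    inner: str               # path inside the snapshot, always starts with "/"


def split_mount_path(path: str) -> MountPath:
    """Split a path below the mount point into (snapshot folder, path in backup)."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        # The mount root: its directory listing would be the list of snapshots.
        return MountPath(None, "/")
    # First component selects the snapshot, the rest is the path inside it.
    return MountPath(parts[0], "/" + "/".join(parts[1:]))
```

For example, `split_mount_path("/2018-03-05/home/user/a.txt")` yields the snapshot `2018-03-05` and the inner path `/home/user/a.txt`.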

@kenkendk
Member

kenkendk commented Mar 5, 2018

Obviously, this will be read-only and very (very) slow

If you use the local database, each directory listing is just a database lookup, so it should be reasonably fast.

Reading file contents is another problem, because you will only get one file request at a time, whereas Duplicati's format expects you to process files in a batch.

@piegamesde
Contributor Author

If it's as slow as the folder listing when restoring a backup, it is going to be slow (but I still don't care). Your idea of using the local database looks promising, at least for browsing the files.

In the worst case, is it possible to do a batch and then discard the information not needed? We could then add caching afterwards.

@kenkendk
Member

kenkendk commented Mar 6, 2018

If it's as slow as the folder listing when restoring a backup, it is going to be slow

There was a big improvement after the beta, so if you have not upgraded to the latest canary build, you can try that.

Otherwise, it should be possible to make a faster folder listing, and I hope to get working on a restructure of the paths in the database, which would make it really fast.

@piegamesde
Contributor Author

piegamesde commented Mar 6, 2018

At the moment, I see the following possibilities:

  • Write it entirely in Python. For listing directories it will query the database directly with SELECT * FROM File WHERE Path LIKE "/path/to/parent/%", and possibly query the file size too (is the file size stored in the local database?). For real read operations the script will need to find the correct blocks from the database and somehow process them. In this case, it would be useful to combine all the scripts into a standalone Duplicati Python API.
  • Write it in Python, but pass the queries on to C#. This will require a standalone Duplicati API or a greatly enhanced command line.
  • Write it in C#. There is one C# FUSE library, but it doesn't look that good to me. This option has the advantage that the module can directly use the original Duplicati code for queries.
  • Write it in C. I'm not gonna do it.
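
The pure-Python option can be sketched with the standard sqlite3 module. One thing to watch: a LIKE on "/path/to/parent/%" returns every descendant, not only the direct children, so the sketch below trims each match to its first path component. It assumes only what the query above assumes (a File table with a Path column); the function name `list_directory` is made up for illustration:

```python
import sqlite3
from typing import List


def list_directory(db: sqlite3.Connection, parent: str) -> List[str]:
    """Return the names of the immediate children of `parent`."""
    prefix = parent.rstrip("/") + "/"
    # Escape LIKE wildcards, since "%" and "_" can occur in real folder names.
    escaped = prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = db.execute(
        "SELECT Path FROM File WHERE Path LIKE ? ESCAPE '\\'", (escaped + "%",)
    )
    children = set()
    for (path,) in rows:
        # SQLite's LIKE ignores ASCII case by default, so re-check the prefix.
        if not path.startswith(prefix):
            continue
        rest = path[len(prefix):]
        if rest:
            # Keep only the first component below the parent.
            children.add(rest.split("/", 1)[0])
    return sorted(children)
```

Opening the database read-only (for example with a `file:...?mode=ro` URI and `uri=True`) would be a sensible precaution, since the module never needs to write.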

I like Python for the simplicity of the fusepy API, but I kind of dislike the code duplication between C# and Python. I didn't have a look at the code yet, but here are a few questions:

  • Is there something like a "Duplicati API" I could use like a library?
  • How tightly is it tied to the actual application (the server)?
  • Where can I find it?
  • How to use it?
  • What is the easiest/fastest way to get the size of a given file?

@kenkendk
Member

kenkendk commented Mar 7, 2018

Fuse on Windows is not working that great, but I tried this a few years ago:
https://github.com/dokan-dev/dokany

I am not sure what can be done in terms of cross-os FUSE if that is your target.

I have used the Python-fuse library (beware there are two incompatible versions of libfuse), and it works great and is really simple to use.

I try to write as much as possible in C#, as I find that it is slightly slower to write (compared to Python) but performs better and is much easier to maintain later on (static types FTW).

One nice thing about Python is that it is simple to integrate with external stuff, so maybe the solution is to write a fuse-layer in Python and then have a C# Duplicati.FuseSupport.exe program?

Mono is generally a bit slow to start, and on Windows process spawning is slow, so you would want something that you can start once and then feed commands (e.g. using stdin/stdout).
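
A rough sketch of the "start once, then feed commands" idea from the Python side, using one line per request and one line per reply. The line protocol is an assumption made up for this example, and since Duplicati.FuseSupport.exe does not exist yet, a small Python child process stands in for it:

```python
import subprocess
import sys


class HelperProcess:
    """Keep one helper process alive and exchange one line per request."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on our side of the pipes
        )

    def request(self, command: str) -> str:
        # Send one command line, then block until the helper answers with one line.
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.flush()
        reply = self.proc.stdout.readline()
        if not reply:
            raise RuntimeError("helper process exited unexpectedly")
        return reply.rstrip("\n")

    def close(self) -> None:
        # Closing stdin signals end of input; the helper is expected to exit.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()


# Stand-in helper for this demo only: echoes each command in upper case.
ECHO_HELPER = [
    sys.executable, "-u", "-c",
    "import sys\nfor line in sys.stdin:\n    print(line.strip().upper(), flush=True)",
]
```

With a real helper the argument list would instead name the runtime and the executable, and the startup cost would be paid once per mount rather than once per file request.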

Is there something like a "Duplicati API" I could use like a library?

There is a server, but that requires a process that is running somewhere. It could be done this way, but I prefer a new client that exposes the functionality needed for fuse.

What is the easiest/fastest way to get the size of a given file?

Either download the dlist file, decrypt and decompress it, parse the JSON, find the path you want, and read the size from there. Or make an SQL query against the local database for the path (provided you already have a database).
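
The dlist route could look roughly like this once the file has been downloaded and decrypted. The sketch assumes the decrypted dlist is a zip archive holding a filelist.json whose entries carry "path" and "size" fields; check those names against a real dlist before relying on them. `size_from_dlist` is a made-up helper name:

```python
import io
import json
import zipfile
from typing import Optional


def size_from_dlist(dlist_zip: bytes, wanted_path: str) -> Optional[int]:
    """Look up the size of one path in an already decrypted dlist archive."""
    with zipfile.ZipFile(io.BytesIO(dlist_zip)) as archive:
        # "filelist.json" is the assumed member name; utf-8-sig tolerates a BOM.
        entries = json.loads(archive.read("filelist.json").decode("utf-8-sig"))
    for entry in entries:
        if entry.get("path") == wanted_path:
            return entry.get("size")
    return None  # path is not part of this snapshot
```

This reads the whole list for every lookup, so for a mounted filesystem the database query is likely the better fit.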

@piegamesde
Contributor Author

piegamesde commented Mar 7, 2018

I'm targeting Linux only for the moment; porting will come later (eventually). I really like the Duplicati.FuseSupport.exe idea as you described it. Can the command line (#2693) do such things too? We only need a "continuous mode" that scans the input in a loop to avoid starting mono for each call. Here's the minimal functionality I would need:

  • Make local database queries
  • Get some decrypted data from the backup

Here's all the functionality the module must implement:

  • Get and list information about different snapshots of a backup (local database)
  • List files from a folder including size and metadata (like permissions, timestamp etc.) (local database?)
    • Note to self: File size: SELECT Length FROM Blockset WHERE ID == (SELECT BlocksetID FROM File WHERE Path == '/full/path/to/file.ext' LIMIT 1)
    • List files in folder as before
    • The metadata only contains the block ID of the specific file and thus needs database access.
  • Read a single file (remote database)
  • Follow symbolic links (local?)

This might be done by the command-line application too, but is optional.

I'd like to know which of those can actually be made through the local database and which SQL queries do this.
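
For reference, the file-size query from the note above can be wrapped like this. It is a sketch that relies only on the columns named in that query (File.Path, File.BlocksetID, Blockset.ID, Blockset.Length), written as a join instead of a subquery; `file_size` is a made-up name:

```python
import sqlite3
from typing import Optional

SIZE_QUERY = """
SELECT Blockset.Length
FROM File
JOIN Blockset ON Blockset.ID = File.BlocksetID
WHERE File.Path = ?
LIMIT 1
"""


def file_size(db: sqlite3.Connection, path: str) -> Optional[int]:
    """Return the stored length of `path`, or None if no blockset matches."""
    row = db.execute(SIZE_QUERY, (path,)).fetchone()
    return row[0] if row else None
```

Passing the path as a bound parameter avoids quoting problems with file names that contain apostrophes.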

@kenkendk
Member

I think this helps in understanding the database layout:
https://github.com/duplicati/duplicati/wiki/Local-database-format

The contents of the "fileset" table are essentially the snapshots.

There is no file-data in the database, only metadata, so you cannot read files directly.
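
A sketch of how the snapshot folders could be derived from that table. It assumes Fileset has ID and Timestamp (Unix seconds) columns and that a FilesetEntry table links FilesetID to FileID; verify these against the wiki page before use. The function names and the folder-name format are made up for illustration:

```python
import sqlite3
from datetime import datetime, timezone
from typing import Dict, List


def snapshot_folders(db: sqlite3.Connection) -> Dict[str, int]:
    """Map a folder name per snapshot to its Fileset ID, oldest first."""
    rows = db.execute("SELECT ID, Timestamp FROM Fileset ORDER BY Timestamp")
    return {
        datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ"): fileset_id
        for fileset_id, ts in rows
    }


def files_in_snapshot(db: sqlite3.Connection, fileset_id: int) -> List[str]:
    """List every path recorded for one snapshot (metadata only, no contents)."""
    rows = db.execute(
        "SELECT File.Path FROM FilesetEntry "
        "JOIN File ON File.ID = FilesetEntry.FileID "
        "WHERE FilesetEntry.FilesetID = ? ORDER BY File.Path",
        (fileset_id,),
    )
    return [path for (path,) in rows]
```

Colons are avoided in the folder names because they are awkward in paths on some systems.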

@git70

git70 commented Jan 25, 2022

Is there any progress on this topic?
BORG seems to have the feature of mounting the backup as a drive or folder. Maybe it's worth seeing how it does it?
It would be great to preview individual remote files!

@ts678
Collaborator

ts678 commented Jan 25, 2022

BORG seems to have the feature of mounting the backup as a drive or folder. Maybe it's worth seeing how it does it?

Borg Documentation provides some high-level description. Below that, internals are probably too different from Duplicati.

borg fs

The command borgfs provides a wrapper for borg mount.

borg mount

This command mounts an archive as a FUSE filesystem.

Filesystem in Userspace (Wikipedia) gives general concept.

Filesystem in USErspace (FUSE) is a software interface for Unix and Unix-like computer operating systems

https://github.com/libfuse/libfuse

The reference implementation of the Linux FUSE (Filesystem in Userspace) interface

So if you start at the original post then read down, the challenge has been supporting all the systems Duplicati supports.
Development resources are always a challenge too, and tend to prevent feature additions unless a new volunteer arrives.

It would be great to preview individual remote files!

Open File from History/See File Content is a feature request similar to this that sounds more portable (uses the browser).
It might be easier, but it is still quite a bit of work, and probably needs a web developer who can write the AngularJS code.

@ltguillaume
Contributor

@ts678 Well, RClone mount is cross-platform. Uses FUSE, too and for Windows it uses WinFsp, works great!

@ts678
Collaborator

ts678 commented Jan 26, 2022

and for Windows it uses WinFsp, works great!

Thanks for the revised suggestion and the personal testimonial about its reliability. There is still a portability issue on the Linux side which a developer would need to look into. Duplicati is almost entirely portable because its C# code is executed by the .NET Framework (Windows) or mono (Linux, FreeBSD, Mac, maybe more). The rclone Downloads come in seven processor architecture build flavors.

Duplicati does not currently build processor specific flavors. It does package that code or a subset of it in different package formats. There was one recent portability exception that is already producing complaints: Tardigrade: Support ARM architectures [$5] #4342.

The future of Duplicati (subject, as always, to developer availability) is with new flavors of .NET which replace existing tools. This may cause Duplicati to similarly ship in multiple architectures, because the idea of self-contained executables has appeal, aside from that.

Possibly when/if Duplicati is a self-contained executable built for different architectures, one can either get to a preinstalled system FUSE shared library or link in a static one, but this is too hazy for me to comment on. Perhaps an appropriate expert will stop by...

See Net 5 migration #3124 which seems to be slowing. Developer time is scarce. Without volunteers, things tend not to happen...
