Support for file encryption (e.g. non-trusted servers) #109

Closed
Natanji opened this issue Apr 4, 2014 · 143 comments
Labels
enhancement New features or improvements of some kind, as opposed to a problem (bug)
Milestone

Comments

@Natanji

Natanji commented Apr 4, 2014

So I have had a look at BitTorrent sync, syncthing and alternatives and what I always wondered about was the possibility to not only sync between resources I own and trust, but also external resources/servers which I do NOT trust with my data, up to a certain extent.

One way to do this is using ecryptfs or encfs, but this has many obvious downsides: it is not an interoperable solution (only works on Linux), the files are actually stored in encrypted form on the disk (even if the resource is trusted and this is not necessary, for instance because of the file system being encrypted already), etc.

What I propose is somehow configuring nodes which are only sent the files in an encrypted format, with all file contents (and potentially file/directory names as well; or even permissions) being encrypted. This way, if I want to store my private files on a fast server in a datacenter to access them from anywhere, I could do this with syncthing without essentially giving up ownership of those files. I could also prevent that particular sync node from being allowed/able to make any changes to the files without me noticing.

I realize that this requires a LOT of additional effort, but it would be a killer feature that seems to not be available in any other "private cloud" solution so far. What are your thoughts on this feature?

EDIT: BitTorrent sync mentions a feature like this in their API docs: "Encryption secret
API users can generate folder secrets with encrypted peer support. Encryption secrets are read-only. They make Sync data encrypted on the receiver’s side. Recipients can sync files, but they can’t see file content, and they can’t modify the files. Encryption secrets come in handy if you need to sync to an untrusted location." (from http://www.bittorrent.com/intl/de/sync/developers/api)

@jewel

jewel commented Apr 4, 2014

This would be amazing. I tried to spec out what this might look like in this clearskies extension, but it adds so much complexity that I've tabled plans for it for now.

Like you say, if only the file contents are synchronized to the "untrusted" peers, that would be a lot simpler to implement (i.e. the metadata never hits the untrusted peer in any form). I hadn't thought of that.

@Natanji
Author

Natanji commented Apr 4, 2014

It seems like you even thought of a zero-knowledge-proof to show that the server is legitimate/actually stores the files (did I understand that correctly?). Not bad.

CTR mode sounds like an extremely bad choice to me, just like other stream-cipher-style modes such as GCM. Yes, it is seekable and that is useful, but XORing two encrypted snapshots of a file with each other will tell an adversary what changed between the two plaintexts. CBC is a much better choice: when seeking, you may need two blocks of ciphertext to decrypt the first block of plaintext, but that is usually negligible because you will read more than one block anyway, and the more you decrypt, the less overhead you get.
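
To illustrate the snapshot concern: the leak appears when two versions of a file are encrypted with the same key and the same nonce, so that both share one keystream. The sketch below uses a toy SHA-256-based keystream as a stand-in for AES-CTR (an assumption made only so the example runs with the standard library; the key, nonce and sample plaintexts are made up for illustration).

```python
import hashlib

def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy stand-in for a CTR keystream: SHA-256(key || nonce || counter). Not for real use."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    # Stream-cipher style encryption: ciphertext = plaintext XOR keystream.
    return xor(plaintext, toy_keystream(key, nonce, len(plaintext)))

key = b"k" * 32
nonce = b"n" * 16  # reused for both snapshots: this reuse is the mistake

v1 = b"balance: 1000 EUR; owner: alice"
v2 = b"balance: 9000 EUR; owner: alice"
c1 = encrypt(key, nonce, v1)
c2 = encrypt(key, nonce, v2)

# The keystream cancels out, leaving exactly the XOR of the two plaintexts.
diff = xor(c1, c2)
changed = [i for i, b in enumerate(diff) if b != 0]
print("changed byte offsets:", changed)

# With a fresh nonce for the second snapshot the keystreams differ.
c3 = encrypt(key, b"m" * 16, v2)
print("leak with fresh nonce:", xor(c1, c3) == xor(v1, v2))
```

With a fresh nonce per snapshot, the XOR of the two ciphertexts no longer equals the XOR of the plaintexts, so the concern applies to nonce reuse rather than to CTR-based modes as such.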

I don't really understand why encrypting everything - including metadata - should somehow be "easier" or simpler to implement. Maybe I'm misunderstanding you? What do you mean?

@jewel

jewel commented Apr 4, 2014

I think you might have misunderstood, I was trying to say that it'd be simpler to implement if the metadata isn't synced.

Thanks for the feedback on CTR mode, I wasn't aware that seeking was possible with CBC mode.

@calmh
Member

calmh commented Apr 5, 2014

I could see how this would be useful. As you say, it would require some work because it's currently not a part of the design at all - an additional layer of encryption would be needed. There would obviously be some trade-offs between privacy and efficiency, i.e. if a block changes in the middle of a large file, do we resynchronize just that block and leak that fact, or re-encrypt the entire file, etc.

@calmh
Member

calmh commented Apr 5, 2014

Also slightly related to #62, which is similar functionality minus the encryption (i.e. for when we trust the node with the data, just not with modifying it).

@NickPyz

NickPyz commented Apr 7, 2014

This idea is particularly useful for people who would use syncthing to set up a network of devices, and require one of them to be available 24 hours a day. Let's say all the devices are trusted except for the always-on device, which is a 3rd party VPS server.

In this case, it would be desirable to have some additional properties built into syncthing so that the VPS node has the following characteristics:

READ ONLY (can't change any data)
ENCRYPTED (so the VPS personnel can't see the data).

No doubt this adds complexity and performance hits to support the encryption, especially if this project eventually extends to devices that don't support hardware-based encryption, such as most current smartphones.

@kylemanna

Tahoe-LAFS has this feature, and it would be awesome to have a more usable implementation (I find the tahoe-lafs WebAPI very painful and difficult to use).

https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/architecture.rst#security

It has the notion of "storage nodes" that hold chunks of distributed encrypted data. The default configuration is that any 3 chunks can restore the file out of a goal of 10 chunks on different storage nodes using erasure coding.

It would be nice if syncthing could support the distributed fractional concept as well, but that sounds like a topic for another issue. It may be out of scope too, hopefully not :)

@Natanji
Author

Natanji commented Apr 16, 2014

Tahoe-LAFS sounds pretty much exactly like what we want - what an incredible find, I hadn't heard of it. Thanks, @kylemanna :)

The way I see it, syncthing already has the functionality to keep directories in sync and perform upload/download operations between the nodes whenever something changes. So the feature we want might not be that far out of reach: whenever a file changes, then we have to call the API of tahoe-lafs and upload/download the file.

I agree that we should start with a configuration where files are simply replicated completely on all foreign nodes. Fractional distribution can be added later if this setup turns out to work well.

The solution would also work on both Windows and Linux, which is a huge plus! And we don't have to do any crypto storage of our own, which would most probably turn out to be a failure anyway, I presume. :)

Sooo... anyone see a problem with this approach yet, from a design perspective? @calmh, do you think syncthing's design is compatible with tahoe-lafs?

@calmh
Member

calmh commented Apr 16, 2014

Sure. However if Tahoe-lafs is a "cloud storage system", then perhaps that is all you need and syncthing doesn't add much to the equation if you already have that up and running?

@kylemanna

I played with Tahoe-lafs for a while and it doesn't really do what I want. The major deal breaker for me was that the storage nodes don't work behind NAT. Everything I could find suggested that I needed to do port forwarding and tunneling of some sort. I'd imagine that a significant portion of the user base for syncthing is behind a NAT.

@mcg

mcg commented May 12, 2014

These days, without some form of encrypted/untrusted node support, Syncthing is probably going to be unusable for some portion of users. One of the reasons I chose BT Sync over other solutions was its support for this.

@elimisteve

This feature would be great because it'd allow me to replace the rather bulky and Mono dependency-laden SparkleShare with syncthing, which is much easier to set up :-).

SparkleShare has been working well for me, though.

@EvdH0

EvdH0 commented May 18, 2014

This would indeed be a great feature!

@menelic

menelic commented May 19, 2014

This would indeed be a great feature, especially if it could be defined on a folder and/or file level as in BTSync. I'd argue adding this feature is part of the "BTSync-replacement" goal. This would add complexity for sure, but it would be great to have one Syncthing interface from which I can manage my synchronised shares with people who are supposed to have access as well as with locations which are not supposed to either have access or be able to see the files. As a VPS user, this would be great for me - and surely for a lot of others as well.

@bigbear2nd
Contributor

For me, this feature is the only thing keeping me from switching from BTSync to Syncthing.

How the encryption works with BTSync is described here in detail:
http://forum.bittorrent.com/topic/25823-generate-encrypted-read-only-secret-without-api-key/

The use for me would be: I can store data at a friend's home and on my family members' PCs, and I don't have to worry that they can access my data. Additionally, I can store data for them and I cannot access it.

The more people who have my data, the faster my download / upload and spreading of data is.
Additionally, the safer my data is.

Syncing data is for me not only having it available, it has become data safekeeping as well.

@jedie
Contributor

jedie commented May 28, 2014

Can closed source projects ever offer security? Keyword: verifiability...

And should i really sync important files to an untrusted location?

Just my 2¢...

@Natanji
Author

Natanji commented May 28, 2014

Syncthing is open source. That's the point. That's why I don't want the
closed-source BTsync but a functionality in an open source project such
as syncthing.

Syncing important files to untrusted locations is usually not a problem
when they are encrypted+signed. Or where do you see the problem?

On Wednesday, 28 May 2014 11:44:09, Jens Diemer wrote:

Can closed source projects ever offer security? Keyword: verifiability...

And should i really sync important files to an untrusted location?

Just my 2¢...



@bigbear2nd
Contributor

Quote: Can closed source projects ever offer security? Keyword: verifiability...

That's why I want to change.

Quote: Syncing important files to untrusted locations is usually not a problem when they are encrypted+signed.

I totally agree on that.
But for me, I would say that my family members' and friends' computers are kind of "half trusted" locations.

@nadalle

nadalle commented Jul 20, 2014

One issue with the clearskies proposal is that it only addresses encryption of file data, not metadata about the file like name length, file size, etc.

If you really don't trust the remote storage, this is not sufficient -- it's often possible to tell what's in a directory tree just by looking at the file sizes, for example. Encrypting file systems try to mitigate this somewhat by padding files up and so forth, but dealing with the remote security issues may be rather hard.

At minimum, you probably want to think about randomizing the file locations in the tree and padding the files. Better would be to break them up into blocks and scatter them around in fixed size storage chunks that the remote end doesn't know anything about.

@nadalle

nadalle commented Jul 21, 2014

To elaborate a little bit, you can't entirely eliminate the data leakage if you're storing in a completely untrusted location. For example, at an absolute minimum, someone who can watch your traffic can tell how much data you change (and thus need to sync) every hour/minute/etc.

But systems that just encrypt the file data (and hopefully the names) leak a lot more. For example, say I just stored the new Weird Al album in my share. Even encrypted, rounded up to 16 byte boundaries, the directory contains files of these sizes:

    By track     By size
 1.  7151712     5497664
 2.  9123472     5822608
 3.  5822608     7032608
 4.  5497664     7151712
 5.  9032544     7159040
 6.  8931184     7856016
 7.  9947920     8931184
 8. 10858000     9032544
 9.  7159040     9123472
10.  7856016     9947920
11.  7032608    10858000
12. 21923472    21923472

Probably no other set of files will show this pattern. So it's pretty easy for an adversary with a database of these things (they exist) to tell that I have a Weird Al album there.
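
As a rough sketch of how such a lookup could work, the snippet below matches the sorted, padded sizes of an observed directory against a database of known fingerprints. The database, its single entry and the function names are hypothetical; the sizes are the ones from the table above. Exact matching is a simplification, and a real adversary would likely tolerate missing or extra files.

```python
import random

def pad16(size: int) -> int:
    """Round a size up to the next 16-byte boundary."""
    return -(-size // 16) * 16

def fingerprint(sizes):
    """Order-independent fingerprint: the sorted tuple of padded sizes."""
    return tuple(sorted(pad16(s) for s in sizes))

# Sizes from the table above, in track order.
ALBUM_SIZES = [
    7151712, 9123472, 5822608, 5497664, 9032544, 8931184,
    9947920, 10858000, 7159040, 7856016, 7032608, 21923472,
]

# Hypothetical adversary database: fingerprint -> label.
KNOWN = {fingerprint(ALBUM_SIZES): "album from the table (hypothetical entry)"}

def identify(observed_sizes):
    """Return the label of a matching known file set, or None."""
    return KNOWN.get(fingerprint(observed_sizes))

# Shuffling the upload order does not help: only the set of sizes matters.
observed = ALBUM_SIZES[:]
random.shuffle(observed)
match = identify(observed)
print(match)
```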

You might assume that the sort order of the files will be scrambled, but of course the tool probably uploaded them in order (so they can get it from the CTIME). Even if it didn't, the file sizes are nearly as good in sorted order. You might try to store the files in random locations in the directory tree (better), but that has the same CTIME problem.

If you really want to have much hope of a secure system here, you really want to avoid storing the data in files entirely. One simple way to think of this is to break all the data you want to sync into 4k blocks, and then have the untrusted side store a database of SHA256 hash -> encrypted 4k block. You do updates by sending new blocks, and then giving the remote store a manifest of which blocks are still needed (the data about file names and the map of blocks to files is itself stored in encrypted 4k blocks hidden in the data). The layout of the database is now mostly irrelevant, since the protocol just talks in terms of hashes and manifests.
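
A minimal sketch of that idea, under several assumptions that are not in the comment above: blocks are keyed by the SHA-256 of their ciphertext, each block carries a 4-byte length prefix so the last block can be padded, the store is an in-memory dict, and a toy SHA-256 keystream stands in for a real authenticated cipher. All names are hypothetical.

```python
import hashlib
import os

BLOCK_SIZE = 4096        # fixed plaintext block size, as suggested above
CHUNK = BLOCK_SIZE - 4   # 4 bytes reserved for a length prefix (assumption)

def toy_keystream(key, nonce, length):
    """Toy SHA-256 counter keystream; a placeholder for a real cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def toy_encrypt(key, block):
    nonce = os.urandom(16)  # fresh nonce for every block
    stream = toy_keystream(key, nonce, len(block))
    return nonce + bytes(a ^ b for a, b in zip(block, stream))

def toy_decrypt(key, ciphertext):
    nonce, body = ciphertext[:16], ciphertext[16:]
    stream = toy_keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

class UntrustedStore:
    """What the remote side sees: opaque IDs mapped to equal-sized blobs."""
    def __init__(self):
        self.blocks = {}

    def put(self, ciphertext):
        block_id = hashlib.sha256(ciphertext).hexdigest()
        self.blocks[block_id] = ciphertext
        return block_id

    def get(self, block_id):
        return self.blocks[block_id]

    def retain(self, needed_ids):
        """Apply a manifest: drop every block that is no longer needed."""
        for block_id in list(self.blocks):
            if block_id not in needed_ids:
                del self.blocks[block_id]

def upload(store, key, data):
    """Split data into padded, encrypted blocks; return the ordered block IDs."""
    ids = []
    for offset in range(0, len(data), CHUNK):
        chunk = data[offset:offset + CHUNK]
        plain = len(chunk).to_bytes(4, "big") + chunk.ljust(CHUNK, b"\0")
        ids.append(store.put(toy_encrypt(key, plain)))
    return ids

def download(store, key, block_ids):
    out = bytearray()
    for block_id in block_ids:
        plain = toy_decrypt(key, store.get(block_id))
        length = int.from_bytes(plain[:4], "big")
        out += plain[4:4 + length]
    return bytes(out)

store = UntrustedStore()
key = os.urandom(32)
ids = upload(store, key, b"hello untrusted world" * 500)
restored = download(store, key, ids)
```

In this sketch the ordered list of block IDs stays on the trusted side; the comment above suggests storing that mapping in encrypted blocks as well, which is left out here for brevity.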

You'll note that this is starting to look a lot like a filesystem in its own right. I think something like this is probably needed to have a reasonable level of security.

@Natanji
Author

Natanji commented Jul 24, 2014

Well, the question certainly is what counts as "reasonable". There are file systems like EncFS and ecryptfs which expose the same problems that you mention here, but are still widely used - especially for cloud storage. If syncthing can do it just as well as these state-of-the-art systems, then that is a big leap forward!

Security is never absolute, but relative to a use case. Leaking the alphabetical order can easily be avoided by shuffling the order in which files are uploaded - that is a good idea. Leaking the file sizes can lead to some exposure, but for most use cases leaking your musical preferences will not be the end of the world. Files with private data in them, however, would still benefit a hundred percent from having just their name and contents encrypted, like in EncFS or ecryptfs.

Don't get me wrong: it is important to think about these issues. But we don't have to come up with a perfect solution that exposes absolutely nothing under any circumstances. If a perfect solution fulfilling 100% of the use cases means so much work, then it should be fine to opt for a much less complicated option that just fits 95% of use cases - at least for now, until a better option is available.

Perfectionism is the greatest enemy of open source progress. ;) As long as you inform your users of the security implications, e.g. what does not get protected and what does, it's completely legitimate.

@Phyks

Phyks commented Aug 3, 2014

Hi,

I'm really interested in Syncthing, but client-side encryption is a major feature for me, as I want to sync my files against my dedicated server (which hosts several other services) and thus I don't want to risk having any sensitive files unencrypted on such a server.

I read this issue and saw that work on this feature is ongoing. But do you know of any working setups usable as of today? For example, using an encfs or ecryptfs container which could be automatically mounted and unmounted before / after each synchronization, or something similar? (Just for basic file content encryption, while waiting for a better solution implemented directly in syncthing.)

Thanks!

@Finkregh

i'd offer 40 EUR when i could replace tahoe-lafs with syncthing... 👍
https://www.bountysource.com/issues/1474343-support-for-file-encryption-e-g-non-trusted-servers

@elimisteve

I'd put money in, too.

What don't you like about Tahoe-LAFS specifically?
On Aug 10, 2014 11:49 AM, "Oluf Lorenzen" notifications@github.com wrote:

i'd offer 40 EUR when i could replace tahoe-lafs with syncthing... 👍



@Finkregh

well, i like tahoe-lafs very much concept-wise, but putting in files and the whole setup is a pita :/

i need something that i can just install on my mothers' PC ;)

@djtm

djtm commented May 18, 2015

@bademux I'm using the ecryptfs that ships with Ubuntu for my home directory. I never had any issues of any kind. The only thing is that it's a bit difficult to mount an ecryptfs directory from a bootable Linux distribution. I'm not worried as much about speed as about security. The whole point of encryption is that it's reliable. encfs currently allows various attacks which are especially problematic within the cloud. However, I believe it might be the better option to fix encfs, which has undergone security reviews by cryptography experts, than to invent a new encryption scheme here which will most likely end up being either a ton of work or mostly snake oil. As much as I'd love to see this feature implemented...

ext4 encryption will certainly not help, as the encrypted files will be invisible to syncthing.

@MyPod-zz

As per the previous comments on this bug/enhancement request, @djtm, ecryptfs/encfs isn't good enough for something that is sent over wires you don't own and can't control, as the changes within the structure of your database/filesystem can reveal information. Keep in mind as well that the solution has to be OS-independent and, in particular, work with Windows (OSX could be easier to work with from what I know and understand), as well as being something (relatively) simple and easy to do that doesn't go beyond the scope of Syncthing.

@calmh
Member

calmh commented May 19, 2015

The requirement is clear; further discussion about the possible workarounds and their limitations fits better in the forum.

@syncthing syncthing locked and limited conversation to collaborators May 19, 2015
@calmh calmh removed this from the v1.0-maybe milestone Nov 17, 2015
@calmh calmh modified the milestone: Unplanned Jan 1, 2016
@lkwg82
Contributor

lkwg82 commented Feb 4, 2018

see PR #4331

@calmh calmh removed this from the Unplanned (Contributions Welcome) milestone Feb 11, 2018
@syncthing syncthing unlocked this conversation Oct 1, 2020
@Max-Schmeling

Max-Schmeling commented Dec 27, 2020

Is end-to-end-encryption still being considered by the devs? I found librevault, but it has not been updated since 2016.

@nunesgh

nunesgh commented Dec 27, 2020

Actually, E2EE is currently being tested by the devs. You can find more information here:
https://github.com/syncthing/syncthing/wiki/Untrusted-devices-(encryption)-status

@onlyjob

onlyjob commented Dec 28, 2020

For untrusted backups you probably want Duplicati, Borg or Rclone.

Or restic.

@imsodin
Member

imsodin commented Dec 28, 2020

Please only comment if you have something to write that is pertinent to the issue at hand. That Syncthing isn't a backup solution on its own and what actual backup programs exist is not relevant here.

As @nunesgh wrote, a solution for untrusted/encrypted devices is being tested and improved at the moment. A wider call for testing will likely happen in one of the next releases (and will also be announced in this issue). Please do not ask questions about the feature here or start any related discussions - join https://forum.syncthing.net/ for that.

@Absolucy

That Syncthing isn't a backup solution on its own and what actual backup programs exist is not relevant here

It is useful information for anyone who searches "syncthing encryption" and finds this issue.

@syncthing syncthing locked and limited conversation to collaborators Dec 28, 2020
@imsodin
Member

imsodin commented Apr 6, 2021

This feature is currently being tested. You can try it out and help ensure it is ready to be used on our precious data. Check this forum thread for information on how to try it out: https://forum.syncthing.net/t/testing-untrusted-encrypted-devices/16470

imsodin added a commit to imsodin/syncthing that referenced this issue Apr 12, 2021
No longer hide the web UI controls for the new untrusted/encrypted
device feature. Testing hasn't been very widespread, but there has been
some and quite a few bugs have been caught and fixed. I believe it's time
to not hide it anymore, and cautiously recommend usage. E.g. mention
that the feature hasn't been widely used yet and anyone using it is an
early adopter, but drop the bit about not using it with production data.
We can maybe stress the need for backups in general and especially
using this.

To remove the feature flag I added a new type of config migration like on dbs:
Changes something, but not the schema.
@calmh calmh closed this as completed in f71fcd4 Apr 13, 2021
@calmh calmh added this to the v1.16.0 milestone Apr 13, 2021
greatroar pushed a commit to greatroar/syncthing that referenced this issue May 8, 2021