Support for file encryption (e.g. non-trusted servers) #109
Comments
This would be amazing. I tried to spec out what this might look like in this clearskies extension, but it adds so much complexity that I've tabled plans for it for now. Like you say, if only the file contents are synchronized to the "untrusted" peers, that would be a lot simpler to implement (i.e. the metadata never hits the untrusted peer in any form). I hadn't thought of that. |
It seems like you even thought of a zero-knowledge proof to show that the server is legitimate/actually stores the files (did I understand that correctly?). Not bad. CTR mode sounds like an extremely bad choice to me, just like other stream-cipher-style modes such as GCM. Yes, it is seekable and that is useful, but XORing two encrypted snapshots of a file that were produced with the same keystream will tell an adversary what changed between the plaintexts of those two snapshots. CBC is a much better choice: when seeking, you may need two blocks of ciphertext to decrypt the first block of plaintext, but that is usually negligible because you will read more than one block anyway, and the more you decrypt, the less overhead you get. I don't really understand why encrypting everything, including metadata, should somehow be "easier" or simpler to implement. Maybe I'm misunderstanding you? What do you mean? |
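The snapshot concern above can be made concrete with a small sketch. It does not use real AES-CTR; `keystream` is a hypothetical stand-in built from SHA-256 in counter mode, and the key, nonce and sample plaintexts are made up for illustration. It assumes the same key and nonce are reused for both versions of the file, which is the situation the comment warns about.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy counter-mode keystream (illustrative only, not AES-CTR)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Stream encryption: plaintext XOR keystream."""
    return xor(plaintext, keystream(key, nonce, len(plaintext)))

key, nonce = b"k" * 32, b"n" * 12          # hypothetical values
old = b"balance: 0100 EUR"
new = b"balance: 9100 EUR"

c_old = encrypt(key, nonce, old)
c_new = encrypt(key, nonce, new)           # same key and nonce reused

# The keystream cancels out, leaving old XOR new.
leak = xor(c_old, c_new)
changed = [i for i, b in enumerate(leak) if b != 0]
print(changed)  # [9]
```

The observer learns which byte positions changed and the XOR of the old and new bytes without ever seeing the key. Using a fresh nonce for every version of a file avoids this particular leak, at the cost of re-encrypting the data that shares the nonce.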
I think you might have misunderstood, I was trying to say that it'd be simpler to implement if the metadata isn't synced. Thanks for the feedback on CTR mode, I wasn't aware that seeking was possible with CBC mode. |
I could see how this would be useful. As you say, it would require some work because it's currently not a part of the design at all; an additional layer of encryption would be needed. There would obviously be some trade-offs between privacy and efficiency, i.e. if a block changes in the middle of a large file, do we resynchronize just that block and leak that fact, or re-encrypt the entire file, etc. |
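A minimal sketch of the trade-off mentioned above, assuming files are split into fixed-size blocks and compared by hash. The block size and helper names are hypothetical and are not taken from Syncthing's code. If only the changed blocks are re-sent, the untrusted peer sees exactly the index list computed here, even when the blocks themselves are encrypted.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow

def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """SHA-256 hash of every fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

def changed_blocks(old: bytes, new: bytes) -> list[int]:
    """Indices of blocks that differ (or exist in only one version)."""
    old_h, new_h = block_hashes(old), block_hashes(new)
    longest = max(len(old_h), len(new_h))
    return [
        i for i in range(longest)
        if i >= len(old_h) or i >= len(new_h) or old_h[i] != new_h[i]
    ]

old = b"AAAABBBBCCCCDDDD"
new = b"AAAABBBBXXXXDDDD"
print(changed_blocks(old, new))  # [2]
```

Re-encrypting and re-sending the whole file hides the position of the change but costs bandwidth proportional to the file size rather than to the size of the change.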
Also slightly related to #62, which is similar functionality minus the encryption (i.e. for when we trust the node with the data, just not with modifying it). |
This idea is particularly useful for people who would use syncthing to set up a network of devices and require one of them to be available 24 hours a day. Let's say all the devices are trusted except for the always-on device, which is a third-party VPS server. In this case, it would be desirable to have some additional properties built into syncthing so that the VPS node has the following characteristics: READ ONLY (can't change any data). No doubt supporting the encryption adds complexity and performance hits, especially if this project eventually extends to devices that don't support hardware-based encryption, such as most current smartphones. |
Tahoe-LAFS has this feature, and it would be awesome if a more usable implementation existed (I find the Tahoe-LAFS WebAPI very painful and difficult to use). https://tahoe-lafs.org/trac/tahoe-lafs/browser/trunk/docs/architecture.rst#security It has the notion of "storage nodes" that hold chunks of distributed encrypted data. The default configuration uses erasure coding to spread 10 chunks across different storage nodes such that any 3 of them can restore the file. It would be nice if syncthing could support the distributed fractional concept as well, but that sounds like a topic for another issue. It may be out of scope too, hopefully not :) |
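To illustrate the "any 3 of 10" idea, here is a toy erasure code based on polynomial interpolation over the prime field GF(257). It is a sketch of the general technique only, not Tahoe-LAFS's actual encoder; the function names and the choice of field are assumptions made for this example. It provides redundancy, not confidentiality, so data would still have to be encrypted before encoding.

```python
P = 257  # smallest prime above 255, so every byte value is a field element

def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data: bytes, k: int = 3, n: int = 10):
    """Return n shares; each share holds one symbol per k-byte group."""
    padded = data + bytes(-len(data) % k)
    shares = [[] for _ in range(n)]
    for g in range(0, len(padded), k):
        # The k data bytes define a polynomial via its values at x = 1..k.
        points = [(x + 1, padded[g + x]) for x in range(k)]
        for s in range(n):
            shares[s].append(_interpolate(points, s + 1))
    return shares

def decode(available: dict, length: int, k: int = 3) -> bytes:
    """Rebuild the data from any k shares, given as {share_index: symbols}."""
    if len(available) < k:
        raise ValueError("need at least k shares")
    chosen = sorted(available.items())[:k]
    out = bytearray()
    for g in range(len(chosen[0][1])):
        points = [(idx + 1, symbols[g]) for idx, symbols in chosen]
        out += bytes(_interpolate(points, x + 1) for x in range(k))
    return bytes(out[:length])

data = b"syncthing"
shares = encode(data)                          # 10 shares
subset = {i: shares[i] for i in (2, 5, 9)}     # any 3 of them
print(decode(subset, len(data)))  # b'syncthing'
```

Each share is one third of the data size, so ten shares cost roughly 10/3 ≈ 3.3 times the original size, compared with 10 times for full replication on ten nodes.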
Tahoe-LAFS sounds pretty much exactly like what we want. What an incredible find; I hadn't heard of it. Thanks, @kylemanna :) The way I see it, syncthing already has the functionality to keep directories in sync and perform upload/download operations between the nodes whenever something changes. So the feature we want might not be that far out of reach: whenever a file changes, we have to call the Tahoe-LAFS API and upload/download the file. I agree that we should start with a configuration where files are simply replicated completely on all foreign nodes. Fractional distribution can be added later if this setup turns out to work well. The solution would also work on both Windows and Linux, which is a huge plus! And we don't have to do any crypto storage of our own, which would most probably turn out to be a failure anyway, I presume. :) Sooo... anyone see a problem with this approach yet, from a design perspective? @calmh, do you think syncthing's design is compatible with Tahoe-LAFS? |
Sure. However if Tahoe-lafs is a "cloud storage system", then perhaps that is all you need and syncthing doesn't add much to the equation if you already have that up and running? |
I played with Tahoe-lafs for a while and it doesn't really do what I want. The major deal breaker for me was that the storage nodes don't work behind NAT. Everything I could find suggested that I needed to do port forwarding and tunneling of some sort. I'd imagine that a significant portion of the user base for syncthing is behind a NAT. |
These days, without some form of encrypted/untrusted node support, Syncthing is probably going to be unusable for some portion of users. One of the reasons I chose BT Sync over other solutions was its support for this. |
This feature would be great because it'd allow me to replace the rather bulky and Mono dependency-laden SparkleShare with syncthing, which is much easier to set up :-). SparkleShare has been working well for me, though. |
This would indeed be a great feature! |
This would indeed be a great feature, especially if it could be defined on a folder and/or file level as in BTSync. I'd argue adding this feature is part of the "BTSync-replacement" goal. This would add complexity for sure, but it would be great to have one Syncthing interface from which I can manage my synchronised shares with people who are supposed to have access as well as with locations which are not supposed to either have access or be able to see the files. As a VPS user, this would be great for me - and surely for a lot of others as well. |
For me, this feature is the only thing keeping me from switching from BTSync to Syncthing. How the encryption works with BTSync is described here in detail: The use for me would be: I can store data at a friend's home or on my family members' PCs, and I don't have to worry that they can access my data. Additionally, I can store data for them and cannot access it. The more people who have my data, the faster my download/upload and the spreading of data is. For me, syncing data is not only about having it available; it has become data safekeeping as well. |
Can closed source projects ever offer security? Keyword: verifiability... And should I really sync important files to an untrusted location? Just my 2¢... |
Syncthing is open source. That's the point. That's why I don't want the Syncing important files to untrusted locations is usually not a problem On Wednesday, 28 May 2014 11:44:09, Jens Diemer wrote:
|
Quote: Can closed source projects ever offer security? Keyword: verifiability... That's why I want to change. Quote: Syncing important files to untrusted locations is usually not a problem when they are encrypted+signed. I totally agree on that. |
One issue with the clearskies proposal is that it only addresses encryption of file data, not metadata about the file like name length, file size, etc. If you really don't trust the remote storage, this is not sufficient -- it's often possible to tell what's in a directory tree just by looking at the file sizes, for example. Encrypting file systems try to mitigate this somewhat by padding files up and so forth, but dealing with the remote security issues may be rather hard. At minimum, you probably want to think about randomizing the file locations in the tree and padding the files. Better would be to break them up into blocks and scatter them around in fixed size storage chunks that the remote end doesn't know anything about. |
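The padding idea mentioned above can be sketched in a few lines. The 1 MiB bucket size and the function name are assumptions chosen for illustration; the point is only that files whose sizes differ by a few kilobytes become indistinguishable by length once each is rounded up to a common bucket boundary.

```python
def padded_size(size: int, bucket: int = 1 << 20) -> int:
    """Round a file size up to the next multiple of `bucket` (at least one bucket)."""
    buckets = max(1, -(-size // bucket))   # ceiling division
    return buckets * bucket

# Three sample file sizes in bytes that differ only slightly.
sizes = [7151712, 7159040, 7032608]
print({padded_size(s) for s in sizes})  # {7340032}
```

Larger buckets hide more but waste more space and bandwidth, so the bucket size is a direct privacy versus efficiency knob. Padding alone does not hide how many files exist or when they were uploaded.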
To elaborate a little bit, you can't entirely eliminate the data leakage if you're storing in a completely untrusted location. For example, at an absolute minimum, someone who can watch your traffic can tell how much data you change (and thus need to sync) every hour/minute/etc. But systems that just encrypt the file data (and hopefully the names) leak a lot more. For example, say I just stored the new Weird Al album in my share. Even encrypted, rounded up to 16 byte boundaries, the directory contains files of these sizes:

Track   By track    By size
1.      7151712     5497664
2.      9123472     5822608
3.      5822608     7032608
4.      5497664     7151712
5.      9032544     7159040
6.      8931184     7856016
7.      9947920     8931184
8.      10858000    9032544
9.      7159040     9123472
10.     7856016     9947920
11.     7032608     10858000
12.     21923472    21923472

Probably no other set of files will show this pattern. So it's pretty easy for an adversary with a database of these things (they exist) to tell that I have a Weird Al album there. You might assume that the sort order of the files will be scrambled, but of course the tool probably uploaded them in order (so they can get it from the CTIME). Even if it didn't, the file sizes are nearly as good in sorted order. You might try to store the files in random locations in the directory tree (better), but that has the same CTIME problem.

If you really want to have much hope of a secure system here, you really want to avoid storing the data in files entirely. One simple way to think of this is to break all the data you want to sync into 4k blocks, and then have the untrusted side store a database of SHA256 hash -> encrypted 4k block. You do updates by sending new blocks, and then giving the remote store a manifest of which blocks are still needed (the data about file names and the map of blocks to files is itself stored in encrypted 4k blocks hidden in the data). The layout of the database is now mostly irrelevant, since the protocol just talks in terms of hashes and manifests.
You'll note that this is starting to look a lot like a filesystem in its own right. I think something like this is probably needed to have a reasonable level of security. |
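The block-store idea described above could look roughly like the following sketch. Everything here is hypothetical: `UntrustedStore`, `upload`, `download` and `toy_cipher` are names invented for this example, and `toy_cipher` is a placeholder XOR keystream that is not secure and only stands in for a real authenticated cipher. The sketch also assumes the database key is the hash of the encrypted block, which the comment does not specify, and it keeps the file-to-block map on the client instead of storing it in encrypted blocks.

```python
import hashlib

BLOCK_SIZE = 4096

def toy_cipher(key: bytes, block: bytes) -> bytes:
    """Placeholder XOR cipher (NOT secure); applying it twice decrypts."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(block):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(block, stream))

class UntrustedStore:
    """What the remote side holds: ciphertext hash -> encrypted block."""

    def __init__(self):
        self.blocks = {}

    def put(self, encrypted: bytes) -> str:
        digest = hashlib.sha256(encrypted).hexdigest()
        self.blocks[digest] = encrypted
        return digest

    def get(self, digest: str) -> bytes:
        return self.blocks[digest]

    def retain(self, manifest: set) -> None:
        """Drop every block that the manifest no longer lists."""
        self.blocks = {h: b for h, b in self.blocks.items() if h in manifest}

def upload(store: UntrustedStore, key: bytes, data: bytes) -> list:
    """Split into zero-padded 4 KiB blocks, encrypt each, return the hashes."""
    hashes = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
        hashes.append(store.put(toy_cipher(key, block)))
    return hashes

def download(store: UntrustedStore, key: bytes, hashes: list, length: int) -> bytes:
    """Fetch and decrypt the listed blocks, then strip the padding."""
    data = b"".join(toy_cipher(key, store.get(h)) for h in hashes)
    return data[:length]

store = UntrustedStore()
key = b"k" * 32                                               # hypothetical key
data = b"".join(i.to_bytes(4, "big") for i in range(2560))    # 10240 bytes
hashes = upload(store, key, data)
restored = download(store, key, hashes, len(data))
```

The remote side only ever sees equally sized opaque blocks and a set of hashes to keep. Because the placeholder cipher is deterministic, identical plaintext blocks map to the same entry, which deduplicates storage but also reveals that two blocks are equal; a real design would have to decide whether that is acceptable.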
Well, the question certainly is what counts as "reasonable". There are file systems like EncFS and ecryptfs which expose the same problems that you mention here, but are still widely used, especially for cloud storage. If syncthing can do it just as well as these state-of-the-art systems, then that is a big leap forward! Security is never absolute, but relative to a use case. Leaking the alphabetical order can easily be circumvented by shuffling the order in which files are uploaded; that is a good idea. Leaking the file sizes can lead to some exposure, but for most use cases leaking your musical preferences will not be the end of the world. Files with private data in them, however, would still benefit a hundred percent from having just their name and contents encrypted, like in EncFS or ecryptfs. Don't get me wrong: it is important to think about these issues. But we don't have to come up with a perfect solution that exposes absolutely nothing under any circumstances. If a perfect solution fulfilling 100% of the use cases means so much work, then it should be fine to opt for a much less complicated option that fits just 95% of use cases, at least for now, until a better option is available. Perfectionism is the greatest enemy of open source progress. ;) As long as you inform your users of the security implications, e.g. what does not get protected and what does, it's completely legitimate. |
Hi, I'm really interested in Syncthing, but client-side encryption is a major feature for me, as I want to sync my files against my dedicated server (which hosts several other services), and thus I don't want to risk having any sensitive files unencrypted on such a server. I read this issue and saw that this feature is ongoing. But do you know of any working setups usable as of today? For example, using an encfs or ecryptfs container which could be automatically mounted and unmounted before/after each synchronization, or something similar? (Just for basic file content encryption, while waiting for a better solution implemented directly in syncthing.) Thanks! |
I'd offer 40 EUR if I could replace Tahoe-LAFS with syncthing... 👍 |
I'd put money in, too. What don't you like about Tahoe-LAFS specifically?
|
Well, I like Tahoe-LAFS very much concept-wise, but putting in files and the whole setup is a PITA :/ I need something that I can just install on my mother's PC ;) |
@bademux I'm using the ecryptfs that ships with Ubuntu for my home directory. I never had any issues of any kind. The only thing is that it's a bit difficult to mount an ecryptfs directory from a bootable Linux distribution. I'm not worried as much about speed as about security. The whole point of encryption is that it's reliable. encfs currently allows various attacks which are especially problematic within the cloud. However, I believe it might be the better option to fix encfs, which has undergone security reviews by cryptography experts, than to invent a new encryption scheme here, which will most likely end up being either a ton of work or mostly snake oil. As much as I'd love to see this feature implemented... ext4 encryption will certainly not help, as the encrypted files will be invisible to syncthing. |
As per the previous comments on this bug/enhancement request, @djtm, ecryptfs/encfs isn't good enough for something that is sent through wires you don't own and can't control, as the changes within the structure of your database/filesystem can reveal information. Also keep in mind that the solution has to be OS-independent and, in particular, work with Windows (OS X could be easier to work with, from what I know and understand), as well as being something (relatively) simple and easy to do that doesn't go beyond the scope of Syncthing. |
The requirement is clear; further discussion about the possible workarounds and their limitations fits better in the forum. |
see PR #4331 |
Is end-to-end-encryption still being considered by the devs? I found librevault, but it has not been updated since 2016. |
Actually E2EE is being currently tested by the devs. You can find more information here: |
Please only comment if you have something to write that is pertinent to the issue at hand. That Syncthing isn't a backup solution on its own and what actual backup programs exist is not relevant here. As @nunesgh wrote, a solution for untrusted/encrypted devices is being tested and improved at the moment. A wider call for testing will likely happen in one of the next releases (and will also be announced in this issue). Please do not ask questions about the feature here or start any related discussions; join https://forum.syncthing.net/ for that. |
It is useful information for anyone who searches "syncthing encryption" and finds this issue. |
This feature is currently being tested. You can try it out and help ensure it is ready to be used on our precious data. Check this forum thread for infos on how to try it out: https://forum.syncthing.net/t/testing-untrusted-encrypted-devices/16470 |
No longer hide the web UI controls for the new untrusted/encrypted device feature. Testing hasn't been very widespread, but there has been some, and quite a few bugs have been caught and fixed. I believe it's time to not hide it anymore and to cautiously recommend usage. E.g. mention that the feature hasn't been widely used yet and anyone using it is an early adopter, but drop the bit about not using it with production data. We can maybe stress the need for backups in general and especially when using this. To remove the feature flag I added a new type of config migration, like on DBs: it changes something, but not the schema.
So I have had a look at BitTorrent sync, syncthing and alternatives and what I always wondered about was the possibility to not only sync between resources I own and trust, but also external resources/servers which I do NOT trust with my data, up to a certain extent.
One way to do this is using ecryptfs or encfs, but this has many obvious downsides: it is not an interoperable solution (only works on Linux), the files are actually stored in encrypted form on the disk (even if the resource is trusted and this is not necessary, for instance because of the file system being encrypted already), etc.
What I propose is somehow configuring nodes which are only sent the files in an encrypted format, with all file contents (and potentially file/directory names as well; or even permissions) being encrypted. This way, if I want to store my private files on a fast server in a datacenter to access them from anywhere, I could do this with syncthing without essentially giving up ownership of those files. I could also prevent that particular sync node from being allowed/able to make any changes to the files without me noticing.
I realize that this requires a LOT of additional effort, but it would be a killer feature that seems to not be available in any other "private cloud" solution so far. What are your thoughts on this feature?
EDIT: BitTorrent sync mentions a feature like this in their API docs: "Encryption secret
API users can generate folder secrets with encrypted peer support. Encryption secrets are read-only. They make Sync data encrypted on the receiver’s side. Recipients can sync files, but they can’t see file content, and they can’t modify the files. Encryption secrets come in handy if you need to sync to an untrusted location." (from http://www.bittorrent.com/intl/de/sync/developers/api)