
Sanity check on server #29

Open
tobiasKaminsky opened this issue Dec 18, 2017 · 28 comments
Labels
1. to develop (Accepted and waiting to be taken care of), enhancement (New feature or request)
Comments

@tobiasKaminsky
Member

To get a more bulletproof upload process, the server should do some checks to ensure that we always have a valid and consistent folder with the correct metadata and encrypted files:

  • file size
    • on lock the clients tell server the file size
    • on unlock the server checks if the correct file size was transferred; otherwise revert and send an error to the client
  • file check
    • although the server cannot read the entire metadata file, it can check & ensure that the count of encrypted files matches the stored files in the folder
    • on unlock: check if correct file count and file names are stored both in metadata and in local folder
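The folder check described above could be sketched like this; a minimal illustration in Python, where the `files` key and its layout are assumptions about the metadata format, not the real schema:

```python
import json

def check_folder_consistency(metadata_json: str, stored_files: set) -> list:
    """Compare the (readable) file list in the metadata with the files
    actually stored in the folder; return a list of problems found.
    The 'files' key is an assumed stand-in for the real metadata layout."""
    metadata = json.loads(metadata_json)
    listed = set(metadata.get("files", {}).keys())  # encrypted file identifiers
    problems = []
    if len(listed) != len(stored_files):
        problems.append(
            f"file count mismatch: metadata={len(listed)} folder={len(stored_files)}")
    for name in listed - stored_files:
        problems.append(f"listed in metadata but missing on disk: {name}")
    for name in stored_files - listed:
        problems.append(f"stored on disk but missing in metadata: {name}")
    return problems
```

On unlock, the server would reject (or revert) whenever this returns a non-empty list.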

While this should always be correct (as the clients should handle it correctly), with these enhancements we can make (more) sure that every upload is consistent.
The only potential problem (I can imagine) then is that a client uploads a "wrong" payload, e.g. one that cannot be decrypted anymore.

--> with this we check for

@MorrisJobke (as we discussed it)

@tobiasKaminsky tobiasKaminsky added the "enhancement" (New feature or request) label Dec 18, 2017
@MorrisJobke
Member

cc @schiessle @rullzer @LukasReschke

@marinofaggiana
Member

marinofaggiana commented Dec 18, 2017

No, for me this is not possible. I lock and unlock the folder only to update the metadata, not to upload the file(s).

If the metadata has a file count different from the real files in the folder, it is not an issue ...

@MorrisJobke
Member

No, for me this is not possible. I lock and unlock the folder only to update the metadata, not to upload the file(s).

If the metadata has a file count different from the real files in the folder, it is not an issue ...

Let me quote the API specification: https://github.com/nextcloud/end_to_end_encryption/blob/master/doc/api.md#unlock-file

Unlock the file again after the file and the meta data was updated correctly.

@marinofaggiana
Member

Sending a file on iOS is asynchronous ... when I send a file, the low level of iOS manages the upload ... for example, if the battery is low the upload is suspended ... and the folder remains locked ...

case 2: I send a file and the phone enters flight mode; the directory stays locked ...

.....

Inconceivable, and I am sure that this is a big, big issue at large scale ... too many issues.

@MorrisJobke
Member

Inconceivable, and I am sure that this is a big, big issue at large scale ... too many issues.

But it is also a problem if the iOS app does not follow the specification. If there is a technical problem with it, then the specification should be changed.

cc @karlitschek @schiessle @rullzer @tcanabrava @tobiasKaminsky

@marinofaggiana
Member

marinofaggiana commented Dec 18, 2017

For me this is a problem for all devices ... it is not possible to forecast when an upload ends, especially on an unreliable "mobile" device (low battery, lost radio, flight mode ... entering and leaving wi-fi ...). Too many issues ... all uploads have to be thought of as asynchronous, or even worse as broadcast or UDP sends: something inherently unreliable.

@MorrisJobke
Member

For me this is a problem for all devices ... it is not possible to forecast when an upload ends, especially on an unreliable "mobile" device (low battery, lost radio, flight mode ... entering and leaving wi-fi ...). Too many issues ... all uploads have to be thought of as asynchronous, or even worse as broadcast or UDP sends: something inherently unreliable.

But then they should result in consistent data. We need to make sure that the server always serves a proper state. There is no problem in an upload taking some time, but then the metadata should also not be made available yet. This is then a problem of the protocol itself. But serving partly correct metadata is the worst thing ever. Either they are in sync or they are broken.

Then please raise your voice that the spec is wrong, and don't implement your own interpretation of it that completely does not follow the specification, because this is just wrong and will explode hard and cause more issues than it solves. Everybody should be on the same page rather than implementing random stuff.

@marinofaggiana
Member

marinofaggiana commented Dec 18, 2017

I have sent an email to Tobi, cc Bjoern, about this issue ... my procedure is:

  1. Prepare encrypted file
  2. Prepare Metadata
  3. LOCK directory -> write token in directory DB
  4. Send Metadata (requires a little time; on error, abort the upload)
  5. UNLOCK directory -> remove token in directory DB
  6. Send encrypted file
  7. Next file in queue ? GoTo 1.

Every upload requires a metadata update ... 100 uploads in the queue means 100 metadata uploads.

If step 6 errors, the upload fails and the metadata contains a record without a real file; for me this is not an issue ..... in my database this is not a primary-index problem ... the only problem is that over time the size of the metadata is not optimized ...

Example: the metadata lists 1000 files but only 900 files exist in the folder ...

@MorrisJobke
Member

for me this is not an issue

But for all the others it is. It is not according to the spec, which clearly says: "unlock after the upload of metadata and files". That means that 5 and 6 need to be switched.

I have sent an email to Tobi, cc Bjoern, about this issue ... my procedure is:

We want to discuss this in the open. Please open a ticket to request a change in the spec, and it should also include how the in-between state should be handled, because the current behaviour would allow the files and metadata to be overwritten while another client is uploading, since the folder is not locked anymore.
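For reference, the ordering the spec asks for (steps 5 and 6 switched, with the lock held across both uploads) can be sketched as follows; the `client` object and its method names are hypothetical stand-ins, not the real API:

```python
# Sketch of the spec-compliant order: the lock is held until BOTH the
# metadata and the encrypted payload are stored. The `client` object and
# its method names are illustrative, not the real endpoints.
def upload_encrypted_file(client, folder_id, metadata, payload):
    token = client.lock(folder_id)                       # 3. lock folder
    try:
        client.put_metadata(folder_id, metadata, token)  # 4. send metadata
        client.put_file(folder_id, payload, token)       # 5. send encrypted file
    except Exception:
        client.revert(folder_id, token)                  # roll back on error
        raise
    finally:
        client.unlock(folder_id, token)                  # 6. unlock at the very end
```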

@schiessle
Member

You need to keep the lock until both the metadata and the file are uploaded completely. The whole purpose of the lock is to keep the two separate requests together and block access to the file until both (payload and metadata) are in sync again.

@marinofaggiana
Member

marinofaggiana commented Dec 18, 2017

You need to keep the lock until both the metadata and the file are uploaded completely. The whole purpose of the lock is to keep the two separate requests together and block access to the file until both (payload and metadata) are in sync again.

sorry, no @schiessle, read up :-)

@tcanabrava

tcanabrava commented Dec 18, 2017 via email

@tobiasKaminsky
Member Author

tobiasKaminsky commented Dec 18, 2017

You need to keep the lock until both the metadata and the file are uploaded completely. The whole purpose of the lock is to keep the two separate requests together and block access to the file until both (payload and metadata) are in sync again.

I second that. We put a lot of thought into it to (try to) guarantee that metadata and payload are always in sync.
If you only lock the folder during the upload of the metadata file and then upload, e.g., 500 MB of payload, very nasty things can happen:

  • the file is deleted or renamed in between
  • given that the file already exists with key A, you change it in the metadata to key B (as you have changed the file); the (old) file is then downloaded by another client and decryption is tried with key B, which results in an error (the user can then get the impression that the file is corrupt (no go!) and delete it)
  • as the upload can take some time, the same file can be overwritten by another client, which can lead to metadata from client A and payload from client B -> another corrupted file

If you can solve all of these (and probably more) issues, then we can discuss how to change the specification. But "just" implementing custom logic on your own, without saying that you do not obey the specification (it only came up because Björn and I asked you about storing the token), is a big no-go for me.
The specification is there for a very, very good reason: to stay in sync across all clients (there may be other 3rd-party clients that want to implement this, and those do not know anything about your "logic").

Regarding iOS: if the upload can be cancelled, stalled, or deleted at any time, I do not understand what the problem is:

  • lock folder

  • upload metadata & payload

  • payload gets killed, cancelled, whatever

  • on next connection resume upload of payload

  • after complete upload unlock folder

  • maybe it can happen that an upload in this scenario lasts 2 or 5 hours or even days, but in this time the folder is still consistent

  • if the user is too impatient, they can use "Option to unlock folder as user as fallback" #20: manually unlock the folder and revert to the last known state
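The "resume upload on next connection" step only works if the lock token survives an app kill, so a client would have to persist it; a minimal sketch, where the state-file layout is made up for illustration:

```python
import json
import pathlib

def save_lock_state(state_file, folder_id, token, bytes_sent):
    # Persist the lock token and upload offset so an interrupted upload
    # can be resumed after the app is killed or the device goes offline.
    pathlib.Path(state_file).write_text(
        json.dumps({"folder": folder_id, "token": token, "offset": bytes_sent}))

def load_lock_state(state_file):
    # Returns the saved state, or None when there is nothing to resume.
    path = pathlib.Path(state_file)
    if not path.exists():
        return None
    return json.loads(path.read_text())
```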

After all: how do you handle regular uploads if uploading is such a broken thing on iOS? Don't you have the same problem there?

@marinofaggiana
Member

marinofaggiana commented Dec 18, 2017

After all: how do you handle regular uploads if uploading is such a broken thing on iOS? Don't you have the same problem there?

On iOS the upload is not broken (though it can be if the user removes the app during an upload), but if my regular upload is paused or very slow, this is not an issue ... no lock exists ...

For me, unlocking only after the file upload will be a big problem; simply put, do not block an entire folder for a single upload, especially on mobile devices.

For your case, no issue exists ... I always verify the metadata before an upload/delete ... the metadata governs, not the physical file.

Addendum: hypothetically, locking the entire folder (metadata + file) is the most conservative approach, but in mobile use it is full of problems ...

@tcanabrava

tcanabrava commented Dec 18, 2017 via email

@marinofaggiana
Member

This makes no sense for the other people that could be uploading the file
based on the metadata.

? please ? I don't understand :-)

@tobiasKaminsky
Member Author

For me, unlocking only after the file upload will be a big problem; simply put, do not block an entire folder for a single upload, especially on mobile devices.

How can you then make sure that this scenario is working without a lock?

  • file 1 exists with key K_A
  • file 1 is changed and key K_B is used
  • changed metadata is uploaded
  • file 1 is uploaded on client A
  • during upload of file 1 the old file is downloaded by client B
  • client B decrypts old file 1 with new key K_B --> cannot work

Unless we find a reliable way to prevent such problems, the one and only solution we (@schiessle, @LukasReschke and me) came up with is the current specification.
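The scenario above can be simulated with a toy cipher: a hash-derived XOR keystream stands in for the real encryption here, purely to illustrate the key mismatch, not the actual crypto used by e2e:

```python
import hashlib

def toy_encrypt(data: bytes, key: bytes) -> bytes:
    # Toy XOR stream cipher derived from the key -- NOT real crypto,
    # just enough to show that the wrong key yields garbage.
    stream = hashlib.sha256(key).digest()
    while len(stream) < len(data):
        stream += hashlib.sha256(stream).digest()
    return bytes(a ^ b for a, b in zip(data, stream))

toy_decrypt = toy_encrypt  # XOR stream ciphers are symmetric

# Client A encrypted the old file 1 with K_A...
old_payload = toy_encrypt(b"hello world", b"K_A")
# ...then uploads new metadata pointing at K_B before the new payload lands.
# Client B downloads the OLD payload but reads the NEW key -> garbage:
assert toy_decrypt(old_payload, b"K_B") != b"hello world"
# With the lock held until both are uploaded, B only sees matching pairs:
assert toy_decrypt(old_payload, b"K_A") == b"hello world"
```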

@marinofaggiana
Member

marinofaggiana commented Dec 18, 2017

@tobiasKaminsky, I agree, locking metadata + files for the entire folder is the best (logical) solution, but the issue remains that when the device goes into the background etc., the folder remains locked ... another user comes along, forces an unlock, then uploads files and ... this is your scenario :-)

@tobiasKaminsky
Member Author

But with a forced unlock the server(!) reverts to a known and consistent state, and the user is presented with a big, giant warning that this might lead to data loss.
So it is very reliable and "easy" to understand/manage for other clients.

@marinofaggiana
Member

If we want, for me it is not a problem to move the unlock 100 lines later (after sending the file) ;-) but for me this will become a big issue, with several users complaining about errors ...

@schiessle
Member

If we want, for me it is not a problem to move the unlock 100 lines later (after sending the file) ;-) but for me this will become a big issue, with several users complaining about errors ...

Yes, please do this. This is the only solution to keep metadata and payload in sync. In 99% of all cases this will work: the client locks the folder, performs the operation (updating payload and metadata), and all other clients wait until that client has finished, then download the updated metadata and payload and do their operation. In the 1% case, e.g. the client locks the file and then your iPhone gets hit by a hammer, is destroyed, and will never be able to finish the operation and unlock the file, we will have an option in the web interface to restore metadata and payload to a previous state.

@marinofaggiana
Member

Yes, no problem @schiessle, but this will generate many problems ...

@MorrisJobke
Member

but this will generate many problems ...

Then those should be addressed as well. The goal is to have a working solution and not a partly working one.

@tobiasKaminsky
Member Author

If you run into such problems, please discuss them with us so we stay in sync on how to handle new things. Whatever we agree on must then be put into the RFC.

@schiessle
Member

About the original request by @tobiasKaminsky:

  • file size
    - on lock the clients tell server the file size
    - on unlock the server checks if the correct file size was transferred; otherwise revert and send an error to the client

While technically we can do it, I wonder why we should do it for encrypted files but not for regular files. If we want such a check, I think we should have it independent of the file content (in the end, e2e files are just files like any others; only the content is useless until you have the right key).

I would assume that the chunked upload has enough logic to detect complete and failed uploads. If we want to extend/improve it, I would add it there, independent of e2e.
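A content-agnostic completeness check of that kind would only compare byte counts, so it works identically for encrypted and plain files; a trivial sketch, where the declared-size mechanism is an assumption (e.g. a total length the client announces when starting the chunked upload):

```python
def upload_is_complete(declared_size: int, received_chunks: list) -> bool:
    # Content-agnostic check: only byte counts are compared, so encrypted
    # and plain files are treated exactly the same.
    return sum(len(chunk) for chunk in received_chunks) == declared_size
```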

While this should always be correct (as the clients should handle it correctly), with these enhancements we can make (more) sure that every upload is consistent.

While I see the rationale behind the check of the metadata, it adds a lot more logic to the server, with all the possible negative impacts (both for performance and as a source of errors). It also means that we always have to keep clients and server in sync if we make changes/enhancements to the metadata file format. New clients which add additional values to the metadata file might no longer work with older servers, although technically there is no reason for them not to. I think for e2e it is better to see the server as dumb data storage: it receives data and sends data back, but all the logic is in the client.

@MorrisJobke
Member

I would assume that the chunked upload has enough logic to detect complete and failed uploads. If we want to extend/improve it, I would add it there, independent of e2e.

Full ack.

While I see the rationale behind the check of the metadata, it adds a lot more logic to the server, with all the possible negative impacts (both for performance and as a source of errors). It also means that we always have to keep clients and server in sync if we make changes/enhancements to the metadata file format. New clients which add additional values to the metadata file might no longer work with older servers, although technically there is no reason for them not to. I think for e2e it is better to see the server as dumb data storage: it receives data and sends data back, but all the logic is in the client.

I see your point, but as has been shown during development: once one client does not follow the rules/specification, it can mess up a lot of things on the server. As the server is also the central pool for all the metadata and keys, we should add some sanity checks, just to guard against faulty clients (for whatever reason they are faulty). It's just another safety net, because the spec seems to be not too easy, and this way it can be enforced (especially the order in which calls have to happen).

@tobiasKaminsky
Member Author

You are right, file size checking should (if ever) be done for all files 👍

Checking the metadata and preventing a client from messing it up is, in my opinion, needed and can only be done on the server, especially for 3rd-party apps (which we cannot all help/debug and which may lead to overall corrupted e2e behaviour).

Regarding old server / new clients: this is why we have "version" in the metadata.

@marinofaggiana
Member

marinofaggiana commented Jan 8, 2018

Yes, true, if the metadata file is corrupt, all files are unreadable and that is a disaster ... more control on the server is required ...
and for me (iOS) the "lock" system is seriously an issue ...

@georgehrke georgehrke added the "1. to develop" (Accepted and waiting to be taken care of) label Mar 13, 2020
Projects: E2Ev2 (Awaiting triage)
Development: no branches or pull requests
6 participants