New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Smashbox test_basicSync fails when encryption is on #17956
Comments
Disabling files_locking did not solve the issue. owncloud.log shows this:
|
I tried logging the cause too:
|
Tried it three times but couldn't reproduce it |
Looks like another weird race condition. For some very strange reason it is trying to update the cache entry of a file, but tries to set the "parent" column to null which is an invalid value. I'll try and debug it. |
Tried again with SQLite and it works fine. The stack trace shows some strange characters "\x9F\xF8\x8B\xD5\xCD\xC9\x9DD\x9A\xC5", possibly randomly generated. As per another discussion today it was said that Mariadb had trouble with some UTF-8 characters. Will try and dig deeper. |
Just realized the strange characters are actually the file key, not the file name. Still, there is something that breaks with Mysql that doesn't in SQLite. |
Can somebody fire up the same on postgres? |
It seems that I have messed up the test cases (I edited test_basicSync to have a single test case only and mistakenly reset my changes). Tried again with SQLite: FAIL Sorry for the confusion. |
Ok, got more info. First, that specific test case with 500k is using chunking. The others don't. The problem is that encryption is trying to store keys here: As you can see this is a ".part" folder. This means that the previous call to Now the mysterious part is why this works on @schiesbn's computer and why we didn't spot this earlier. |
v8.1.0 works fine. I'll bisect. |
THX - you rock! |
Wrong again. Seems I forgot to enable encryption when resetting to v8.1.0 😡 |
Hmmm, I just tried with the sync client and the "parent == null" situation happens, but doesn't cause any errors:
It is still a bogus situation that we need to avoid. |
This is the stack trace leading to
The difference with test_basicSync is that the target folder does not exist at all:
The "data1.dat.ocTransferId4040295251.part" doesn't exist on disk. However with test_basicSync, that folder already existed. Apparently doing an INSERT with a null parent is acceptable, but not an UPDATE. But in general the bigger issue is that the ".part" folder has no entry because part files are never indexed. So basically there's a hole in the tree in oc_filecache if we let it through. So the following points have been identified:
We can fix 4) short-term to make it work. But 3) is still unexplained. @schiesbn @icewind1991 @DeepDiver1975 thoughts ? |
Another alternative is to allow ".part" folders to be indexed, but not files (unless one day we really use .part folders for something else #13756). Might need checking the mime type at insertion time though... |
@schiesbn why aren't we stripping ocTransferId + .part any more ? I thought that when overwriting a file we'd just reuse the same keys ? |
@PVince81 the final keys get renamed with the final file because all this happens in the storage wrapper now which gets called for part files the same way it get called for real files. This makes it much easier with less exceptions and special cases. |
I'm trying to analyze 3). It is very strange that it tries to recreate the OC_DEFAULT_MODULE folder twice for the same transfer id but not when done with the regular sync client. By the way I have sync client owncloud-client-1.8.4-2.5.x86_64 Here is with the regular sync client:
Here we can see that file_assemble is done once, and after that the keys are written. But now look what happens with test_basicSync:
It looks like we get two separate PUT requests with the same transferId ?! |
So far I suspect that test_basicSync is a bit aggressive and tries uploading the same file at the same time from two separate clients. If that happens at exactly the same milliseconds (I have a fast laptop with SSD, and the smashdir is a ramdisk), and if the transferid is the current time, then there's the risk of having transferid collision as observed above. |
I found the matching access log entries:
You can see that the same transferid is reused. @DeepDiver1975 I suggest to defer this ticket to 8.1.2, this shouldn't be a showstopper as this situation is very unlikely to happen. |
TODOs:
|
Update: fixing the collision on the client also make the test pass, see owncloud/client#3522 (comment) |
I raised all issues as separate tickets: #17956 (comment) Actually I don't see anything to be done in 8.1.2 apart from maybe providing the client fix in the 1.8.x series. (CC @guruz) Closing. |
This is awesome - thanks a lot! |
Steps
-a
:Expected result
No errors
Actual result:
@schiesbn please have a look. Thanks.
The text was updated successfully, but these errors were encountered: