Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix delete in the owncloud storage driver #1833

Merged
merged 1 commit into from Jun 28, 2021

Conversation

C0rby
Copy link
Contributor

@C0rby C0rby commented Jun 28, 2021

In the owncloud storage driver when we delete a file we append the deletion time to the file name.
If two fast consecutive deletes happened, the deletion time would be the same and if the two files had the same name we ended up with only one file in the trashbin.

Because of this issue we had inconsistently failing tests in CI.

@phil-davis
Copy link
Contributor

So maybe this sequence is the problem:

    And user "Alice" has deleted file "/textfile0.txt"
    When user "Alice" restores the file with original path "/textfile0.txt" to "/PARENT/textfile0.txt" using the trashbin API

The 1st step puts a textfile0.txt into the trashbin, with some timestamp.

The 2nd step, in the back-end, first deletes /PARENT/textfile0.txt and that goes in the trashbin as textfile0.txt, with some timestamp (*). The timestamp might be the same as the 1st step.

Then the restore happens. textfile0.txt in the trashbin might be the content from PARENT that was just put into the trashbin. That gets restored, and so effective "nothing happened".

I have a question about (*): when a file is deleted from a folder, I would have guessed that the file would be stored in a matching folder PARENT in the trashbin, to avoid this kind of file name clash, and to easily "guess" where the deleted file originally came from.

@C0rby
Copy link
Contributor Author

C0rby commented Jun 28, 2021

I would have guessed that the file would be stored in a matching folder PARENT in the trashbin

AFAIK this only happens if you delete the folder but not if you just delete a file from that folder.

So maybe this sequence is the problem:

And user "Alice" has deleted file "/textfile0.txt"
When user "Alice" restores the file with original path "/textfile0.txt" to "/PARENT/textfile0.txt" using the trashbin API

The 1st step puts a textfile0.txt into the trashbin, with some timestamp.

Yes, it's what triggers the bug but the collision avoidance code in the owncloud driver was wrong anyways. The os.Rename will overwrite an existing file. It won't return the os.ErrExist error.

@C0rby C0rby force-pushed the fix-delete-owncloud-storage-driver branch from e65c2e6 to 4cab3b3 Compare June 28, 2021 11:38
@phil-davis
Copy link
Contributor

https://drone.cernbox.cern.ch/cs3org/reva/1946/17/7

runsh: Total unexpected passed scenarios throughout the test run:
apiTrashbin/trashbinRestore.feature:108
apiTrashbin/trashbinRestore.feature:109
apiTrashbin/trashbinRestore.feature:110
apiTrashbin/trashbinRestore.feature:111

That's great - those scenarios were in expected-failures - I suppose they were failing "a lot of the time" before this fix.

Copy link
Contributor

@phil-davis phil-davis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - I made a comment about doing this "even better" in future.

Maybe the other storage drivers already have a better implementation?

if _, err := os.Stat(tgt); !errors.Is(err, os.ErrNotExist) {
// timestamp collision, try again with higher value:
dtime++
tgt = filepath.Join(rp, fmt.Sprintf("%s.d%d", filepath.Base(ip), dtime))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good improvement, so we should have it.
But IMO I can write a test where there are lots of textfile.txt in different folders. And I delete then all quickly. The first 2 will delete fine. Then, if the deletes happen more than once per second, there will quickly be a case where the file "exists" in the trashbin with the current Unix second, and one exists for the next Unix second, so the 3rd (or 4th and/or...) delete will fail.

We have this sort of stuff in oC10 where trashbin and versions entries have uses of Unix second in the back-end implementation. And that means there are timing issues when multiple deletes, versions,... happen in the same second. And these things can happen, specially when a sync-client "wakes up" and has a backlog of things to do, or if some automated data-producer is doing "lots of stuff fast".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree but I am kind of hesitant to spend too much time on the owncloud storage driver since we don't really use it anywhere except for the tests and I still don't know if we even intend to use it productively. Maybe we need to bring this question up again in the standup or some other discussion...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - decisions are needed about which storage drivers are to be "long-term" and enhanced to pass "all the tests".

@phil-davis
Copy link
Contributor

@ishank011 @labkode this fixes a timing issue that can cause tests results to be less-predictable. Please review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants