Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

minio erroneously deletes symlinks to full directories in the path of a file being deleted #9419

Closed
ghost opened this issue Apr 22, 2020 · 4 comments · Fixed by #9457
Closed
Assignees

Comments

@ghost
Copy link

ghost commented Apr 22, 2020

Under particular circumstances, minio obliterates key components of the directory path - implemented as symlinks to directories - causing large swaths of the objectspace to become unavailable. The exact conditions under which this manifests are somewhat tricky but perfectly valid and are listed under "Context" or "Steps to Reproduce".

We believe the bug lurks somewhere around cmd/posix.go line 1430.

Expected Behavior

(This section should really come at the end of this particular bug report, since the bug is setup-sensitive and this description won't make sense without it, but oh-well.)
We expect minio to not really treat symlinks to directories the exact same way it treats directories because a key assumption does not hold across both of them. In particular, attempting to delete a full directory errors out, but attempting to delete a symlink to a full directory does NOT error out.

Instead, we expect minio to be aware of "symlinks" and treat them separately (or just give up if it detects a symlink and let "the user handle it").

Current Behavior

In true The AT&T Way of the elder days, minio utilizes error facilities of the OS to atomically determine if a "directory" is empty and delete it if it is so. However, this behaviour has different semantics on symlinks because a symlink is never "full" so as to raise an error.

Possible Solution

Ideally, minio would check if the "directory" being deleted is actually a symlink, if it is, it would dereference the symlink (make sure to dereference recursively, since a symlink could point to another symlink, loop checking might also be done but you could also just hardcode a limit to recursion and error if the limit gets hit), check if it is empty, and return if not. Notice that if the directory is empty, the behaviour we expect (and consider correct) is to delete the symlink in the original path of the file being deleted and not the directory the symlink ultimately points to (or any of the intermediate symlinks). Once the symlink to an empty directory is deleted, we expect minio to continue recursing up the original path of the file being deleted (up the URL path/object key) and not up the symlink target dirs .. path since those two could be miles apart (literally).

The alternate solution is to just give up if the delete up-recursion encounters a symlink.

A yet third option is to implement the functionality of not-deleting the directories containing files being deleted. The users should then implement their own garbage collection of empty directories (a simple example: a hourly cronjob doing find ${MINIO_VOLUME} -type d -delete -print ).

This following issues helpfully pointed out by github could be helpful:
#5102
#5664

Steps to Reproduce (for bugs)

  1. mkdir -p BUCKET/target
  2. touch BUCKET/target/dontwant BUCKET/target/want
  3. ln -s target BUCKET/linky
  4. now use minio to delete BUCKET/linky/dontwant
  5. minio is unable to retreve BUCKET/linky/want

Context

Assume there is a file somewhere deep in a directory hierarchy in some bucket on a POSIX backend. Further, assume that one (or more) of the "directories" that are in the parent path of this file is actually a symlink. Further, assume that all the directories under the symlink are empty save for the single file being deleted.

For example:
BUCKET/dir1/dir2/symlink/dir3/file_for_deletion
BUCKET/dir1/dir2/symlink/dir4/whatever
(dir3 in this example contains only "file_for_deletion")

Or, more ominously:
BUCKET/dir5/dir6/symlink/file_for_deletion
BUCKET/dir5/dir6/symlink/other
BUCKET/dir5/dir6/symlink/stuff
BUCKET/dir5/dir6/symlink/we
BUCKET/dir5/dir6/symlink/require

The specific reason there is a symlink there has to do with another wart of minio. When uploading files, minio first stores them in a temporary file in the "volume directory" (the path you give it on the command line) and then moves it to its final location when upload finishes. The way this is implemented, it does not work if the "volume directory" and bucket (or a part of the bucket) are on different filesystems on the underlying server (the server or container that actually runs minio as a process). This other wart is not of concern right now, it just explains why is there an "unusual" setup in the first place. :)

Regression

Not a regression that I can tell.

Your Environment

  • Version used (minio version): minio version RELEASE.2020-04-22T00-11-12Z
  • Environment name and version (e.g. nginx 1.9.1): ran as a system service by systemd
  • Server type and version: Ubuntu-18.04
  • Operating System and version (uname -a): Linux 4.15.0
  • Link to your project: N/A
@harshavardhana
Copy link
Member

We do not preserve symlinks, this is by design, symlinks cannot be preserved without significant complication to our code.

We should perhaps simply skip symlinks, for simplicity sake.

@ghost
Copy link
Author

ghost commented Apr 23, 2020

Skipping symlinks looks like a perfectly workable idea. I mentioned it in the bug report as "give up if you see a symlink and let the user handle it".

@kannappanr kannappanr added this to the Next Release milestone Apr 24, 2020
harshavardhana added a commit to harshavardhana/minio that referenced this issue Apr 26, 2020
harshavardhana added a commit to harshavardhana/minio that referenced this issue Apr 26, 2020
@ghost
Copy link
Author

ghost commented Apr 28, 2020

Actually... this makes matters far worse. Like, symlinks don't work at all this way. Better not to have fixed it and we would have patched together a workaround using bespoke POSIX directory permissions then to have... ruined it like this. Because this is a dealbreaker for us. Minio has to be removed from our cluster (or we have to fork and roll our own patchset) if this change persists.

When I wrote "give up if you see a symlink and let the user handle it" what I meant is to give up in the up-recursion when deleting the filesystem objects. That way minio would delete the file it got ordered to delete and all empty parents of the file up to the first symlink or bucket root. I categorically did not mean "give up if you see a symlink of any kind, in any circumstances". How would I be able to use minio to serve files that way?

Ignoring symlinks altogether may be okay in trivial installations, but for real-world applications, one has to remember that symlinks have many uses and are a legitimate component of the filesystem. Minio has to support being run by ops people who make non-trivial installations not because they have wrong ideas in their heads, but because they know what they are doing.

One of the use cases is when minio runs alongside other software that uses the same backend store. Over the years, you have received a number of requests for functionality that would make this easier to run, which means this is a real real-world use case that minio is actually being used for by real people in the world.

Please, revert this. I'll endeavor to make the patch myself and share it with you. You may need to port it to Windows since I have no experience interfacing to it.

@harshavardhana
Copy link
Member

Symlinks are not supported on object storage @akuktin I don't think it makes sense to revert it. Keeping symlinks without being able to navigate we cannot serve any requests to them.

It requires complicated handing on our end, that is why we used to follow the symlink.

You may fork MinIO and use it the way you feel like, I don't think we need to bring it back.

blaenk pushed a commit to blaenk/minio that referenced this issue Aug 26, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants