New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix `fs::remove_dir_all` #31944

Closed
wants to merge 5 commits into
base: master
from

Conversation

Projects
None yet
6 participants
@pitdicker
Copy link
Contributor

pitdicker commented Feb 27, 2016

This fixes the case where remove_dir_all is unable to remove large directories on Windows due to race conditions.
It is now also able to remove read-only files, and when the paths become longer that MAX_PATH.

Fixes #29497

This is based on #31892, I hope I didn't mess up git to much...

@rust-highfive

This comment has been minimized.

Copy link
Collaborator

rust-highfive commented Feb 27, 2016

r? @aturon

(rust_highfive has picked a reviewer for you, use r? to override)

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Feb 27, 2016

Your PR has three merges. You may want to consider rebasing and force pushing.


#[repr(C)]
pub struct FILE_DISPOSITION_INFO {
pub DeleteFile: BOOL, // for true use -1, not TRUE

This comment has been minimized.

@retep998

retep998 Feb 27, 2016

Member

Why does this need to be -1? I don't see anything on MSDN for why this can't be TRUE.

This comment has been minimized.

@pitdicker

pitdicker Feb 28, 2016

Author Contributor

This struct is the same as FILE_DISPOSITION_INFORMATION which uses a BOOLEAN. But TRUE just doesn't work, it gives an error. The documentation for this fuction is not great, this one is better: https://msdn.microsoft.com/nl-nl/library/windows/hardware/ff567096(v=vs.85).aspx

This comment has been minimized.

@retep998

retep998 Feb 28, 2016

Member

Set this member to TRUE to delete the file when it is closed.

Nowhere do I see any mention of -1. What error do you get when you use TRUE instead of -1?

This comment has been minimized.

@pitdicker

pitdicker Feb 28, 2016

Author Contributor

Something like: the file name, directory name or volume lable syntax is incorrect (I am on mobile, can't test now). But it took me hours to get working because of this bug. Only afterwards I found out this function is just a very thin wrapper around an NT internal function that takes a BOOLEAN instead of a BOOL, explaining the problem.

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Can you move this comment to where it's called down in remove(&self) and put some of these comments there as well? Sounds like it'll be useful info for the next reader!


// FIXME: check for internal NULs in 'new'
let reserved = OsStr::new(&"\0\0\0\0\0\0\0\0\0\0"[..STRUCT_SIZE/2]);
let mut data: Vec<u16> = reserved.encode_wide()

This comment has been minimized.

@retep998

retep998 Feb 27, 2016

Member

Wouldn't it be easier to simply do [0u16; STRUCT_SIZE].iter()?

This comment has been minimized.

@pitdicker

pitdicker Feb 28, 2016

Author Contributor

Yes, much easier :). But I could not get the iterator to chain with the next, which gives some EncodeWide type. This is the best I could come up with, but I am happy to change it.

This comment has been minimized.

@retep998

retep998 Feb 28, 2016

Member

Oh, it's probably because [0u16; STRUCT_SIZE].iter() returns an Iterator<Target=&u16> so you'd need an extra .map(|x| *x).

This comment has been minimized.

@retep998

retep998 Feb 28, 2016

Member

Could also do iter::repeat(0u16).take(STRUCT_SIZE).

This comment has been minimized.

@pitdicker

pitdicker Feb 28, 2016

Author Contributor

Thank you!

@@ -567,23 +647,85 @@ pub fn remove_dir_all(path: &Path) -> io::Result<()> {
// rmdir only deletes dir symlinks and junctions, not file symlinks.
rmdir(path)
} else {
remove_dir_all_recursive(path)
// canonicalize to get a `/??/`-path, which can handle files like `CON`

This comment has been minimized.

@retep998

retep998 Feb 27, 2016

Member

\\?\, not /??/. We're canonicalizing to verbatim paths not NT paths which are internal to the kernel.

This comment has been minimized.

@pitdicker

pitdicker Feb 28, 2016

Author Contributor

Oops, typo

@pitdicker

This comment has been minimized.

Copy link
Contributor Author

pitdicker commented Feb 28, 2016

@retep998 thanks for helping out with FILE_RENAME_INFO on irc!

And time to improve my git skills :(

@pitdicker pitdicker force-pushed the pitdicker:rmdir_all branch from 7f2999b to fafd9dc Feb 28, 2016

pitdicker added some commits Feb 28, 2016

Small fixes to setting Windows file permissions
On Windows files can have a simple read-only permission flag, and the
more complex access rights
and access control lists. The read-only flag is one bit of the file
attributes.

To make sure only the read-only flag is toggled and not the other
attributes, read the files
attributes and write them back with only this bit changed. By reading
and setting the attributes
from the handle the file does not have to be opened twice.

Also directories can not be read-only, here the flag has a different
meaning. So we should no
longer report directories as read-only, and also not allow making them
read-only.
Fix `fs::remove_dir_all`
This fixes the case where `remove_dir_all` is unable to remove large
directories on Windows due to race conditions.
It is now also able to remove read-only files, and when the paths become
longer that MAX_PATH.
@pitdicker

This comment has been minimized.

Copy link
Contributor Author

pitdicker commented Feb 28, 2016

Rebased

opts.access_mode(c::FILE_READ_ATTRIBUTES | c::FILE_WRITE_ATTRIBUTES);
opts.custom_flags(c::FILE_FLAG_BACKUP_SEMANTICS);
let file = try!(File::open(path, &opts));
file.set_perm(perm)

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Could you add some tests for this as well? It's fine for some of them to be windows-specific (e.g. in this module), but specifically:

  • Directories can't be set to readonly
  • Other attributes are preserved
let mut opts = fs::OpenOptions::new();
opts.write(true);
opts.create(true);
opts.attributes(0x1); // FILE_ATTRIBUTE_READONLY

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Could this reach into sys::c to get this constant out?

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Also if this is the only platform-specific part of this test, couldn't this be a platform-agnostic test by using set_permissions?

#[cfg(target_arch = "x86")]
const STRUCT_SIZE: usize = 12;
#[cfg(target_arch = "x86_64")]
const STRUCT_SIZE: usize = 20;

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Could this be done with mem::size_of::<FILE_RENAME_INFO>?

This comment has been minimized.

@pitdicker

pitdicker Feb 29, 2016

Author Contributor

Not really, as this is the size of the struct without padding at the end.
I will add a comment.

This comment has been minimized.

@alexcrichton

alexcrichton Mar 1, 2016

Member

Hm, perhaps this could be something like offset_of!(relevant_field, relevant_struct)?

This comment has been minimized.

@pitdicker

pitdicker Mar 5, 2016

Author Contributor

Is this an existing macro? I can't find it...

This comment has been minimized.

@alexcrichton

alexcrichton Mar 7, 2016

Member

Ah yeah sorry that was intended as basically "we should calculate the offset of a field and use that instead". The macro itself doesn't exist in-tree yet, but here's an implementation of it:

https://github.com/alexcrichton/ctest/blob/50ac771acb7bb45cf0c182a5a9c8188a15c89efc/src/lib.rs#L373-L377

#[cfg(target_arch = "x86_64")]
const STRUCT_SIZE: usize = 20;

// FIXME: check for internal NULs in 'new'

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

This to_u16s helper may be useful here, although it may need some tweaks if you want to avoid the extra allocation.

This comment has been minimized.

@pitdicker

pitdicker Mar 5, 2016

Author Contributor

I have taken a look at to_u16s before, and would like to change it a bit. I am not sure how many allocations it does now when building the Vec, but only one should be enough. It is simple to precompute how many u16's an OsStr converts to. Together with optionally reserving some space for a struct, it should help here and with symlink_junction.
But I would like to leave this for an other pr, this one is getting pretty big for me.

This comment has been minimized.

@alexcrichton

alexcrichton Mar 7, 2016

Member

Ok, sounds good to me. Right now this is only called for existing files so I don't think it's an error case we'll run into anyway.

// Thanks to alignment guarantees on Windows this works
// (8 for 32-bit and 16 for 64-bit)
let mut info = data.as_mut_ptr() as *mut c::FILE_RENAME_INFO;
(*info).ReplaceIfExists = if replace { -1 } else { c::FALSE };

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Could you add some docs here as well for -1 vs TRUE?

}
}

fn remove_dir_all_recursive(path: &Path) -> io::Result<()> {
fn remove_dir_all_recursive(path: &Path, base_dir: &Path, mut counter: u64)
-> io::Result<u64> {

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Stylistically we typically align the -> with the first character after the (


fn move_item(file: &File, base_dir: &Path, mut counter: u64)
-> io::Result<u64> {
let mut tmpname = base_dir.join(format!{"rm-{}", counter});

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Stylistically we typically use parens on format! macro invocations rather than braces

try!(file.set_perm(perm));
};
Ok(counter)
}

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Could you move these helper functions to the end of the function body? I tend to find it helpful to read the code and then go look at the implementation details.

// On Windows it is not enough to just recursively remove the contents of a
// directory and then the directory itself. Deleting does not happen
// instantaneously, but is scheduled. So just before or after flagging a
// file for deletion, we move it to some `base_dir` to avoid races.

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

This implementation is getting pretty complicated, could you expand this comment to explain all the cases we're trying to catch here? It'll be nice to have a big intro to what's going on below.

}
}
rmdir(path)
counter = try!(remove_item(path, base_dir, counter, false));

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Managing this counter variable may be best done with some sort of Context structure or something like that so it doesn't have to constantly be passed in and reset.

fn remove_item(path: &Path,
base_dir: &Path,
mut counter: u64,
readonly: bool) -> io::Result<u64> {

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Perhaps this could take a FilePermissions instead of readonly to be a little easier to read below?

// remove read-only permision
try!(file.set_perm(FilePermissions::new()));
counter = try!(move_item(&file, base_dir, counter));
try!(file.remove());

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Could the file be removed before moving? I'm a little worried about moving a file and then the process crashes or something, so these rm-{tag} files get littered all over the place

This comment has been minimized.

@pitdicker

pitdicker Feb 29, 2016

Author Contributor

According to the documentation there is just about nothing you can do with the handle after remove() except close it. I am a little surprised restoring the read-only flag works. But moving after flagging for deletion does not work.

This comment has been minimized.

@alexcrichton

alexcrichton Mar 1, 2016

Member

In that case maybe it'd be better to open, set the bits, close, reopen with "close on delete", and then move?

This comment has been minimized.

@pitdicker

pitdicker Mar 5, 2016

Author Contributor

I don't like to open the file twice, but you are right.

opts.access_mode(c::DELETE);
opts.custom_flags(c::FILE_FLAG_BACKUP_SEMANTICS | // delete directory
c::FILE_FLAG_OPEN_REPARSE_POINT | // delete symlink
c::FILE_FLAG_DELETE_ON_CLOSE);

This comment has been minimized.

@alexcrichton

alexcrichton Feb 29, 2016

Member

Just to confirm, the mode here is:

  • This will fail if there are any other option handles which do not specific FILE_SHARE_DELETE
  • All other future openings will fail unless they specify FILE_SHARE_DELETE if this succeeds.
  • Once all handles are closed, the file is deleted

So in other words, so long as nothing deadlocks, we're guaranteed to delete the file if the open succeeds, right?

This comment has been minimized.

@pitdicker

pitdicker Feb 29, 2016

Author Contributor

Yes, except all future openings will fail no matter what options or share mode they specify.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Feb 29, 2016

Thanks @pitdicker! This is looking great to me

@alexcrichton alexcrichton assigned alexcrichton and unassigned aturon Feb 29, 2016

@pitdicker

This comment has been minimized.

Copy link
Contributor Author

pitdicker commented Feb 29, 2016

Thanks for the review!.
I will do the other modifications during the week

pitdicker added some commits Mar 5, 2016

@pitdicker

This comment has been minimized.

Copy link
Contributor Author

pitdicker commented Mar 5, 2016

Updated.

I messed up a lot with the permissions stuff of directories. Here is the article that explains how the read-only attribute indicates a directory is special on Windows. As it turns out you can create and delete files in a read-only directory, but nor delete or remove the dir itself. This is exactly the opposite of Unix, where you can't create or delete files in a read-only directory, but can delete the dir itself.

So I reversed the permission stuff. And we now have an other edge case to handle on Unix: readonly dirs. Before every call to remove_dir_all_recursive we now check if the dir is read-only, and adjust the permissions. I choose to do this outside of remove_dir_all_recursive, because we already do a stat in remove_dir_all that can now be reused.

I think this addresses all the comments except the suggestion to use offset_of! (I can't find it) and to tweak to_u16s (for an other pr).

// readonly
mode.set_readonly(false);
try!(set_perm(&child.path(), mode));
}

This comment has been minimized.

@alexcrichton

alexcrichton Mar 7, 2016

Member

Perhaps this function could take &Metadata as an argument and perform this logic at the start? That way it could prevent duplication between here/above.

Err(ref err) => assert!(err.to_string().contains($s),
format!("`{}` did not contain `{}`", err, $s))
}
) }

This comment has been minimized.

@alexcrichton

alexcrichton Mar 7, 2016

Member

This macro doesn't look used any more, so it can probably get deleted

}
) }

pub struct TempDir(PathBuf);

This comment has been minimized.

@alexcrichton

alexcrichton Mar 7, 2016

Member

Perhaps this could move to a sys_common::tempdir module to share between std::fs::test and here?

if err.kind() != io::ErrorKind::AlreadyExists { return Err(err) };
tmpname = ctx.base_dir.join(format!("rm-{}", ctx.counter));
ctx.counter += 1;
}

This comment has been minimized.

@alexcrichton

alexcrichton Mar 7, 2016

Member

Could the join + increment be deduplicated here with something like:

loop {
    let tmpname = foo.join(bar);
    counter++;
    match rename() {
        Ok(()) => return Ok(()),
        Err(ref e) if e.is_already_exists() => {}
        Err(e) => return Err(e),
    }
}

It may also be a good idea to have the basename be something obscure like __rust__rm__ and to also cap the number of iterations done here. It looks like a bunch of concurrent remove_dir_all could cause a lot of contention on the filesystem here perhaps?

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 7, 2016

Hm now I'm getting a little more worried with this as time passes, it seems like we're definitely going out of our way to implement a arguably primitive operation, which may end up being surprising to some. This code was originally modelled after Boost which is significantly simpler than what's going on here (even on Unix). That's not necessarily either good or bad, but it's indicative to me that we should tread lightly here.

Perhaps we can try to find some other implementations of this function in other libraries? It it standard practice to attempt to so aggressively remove files/directories?

@pitdicker

This comment has been minimized.

Copy link
Contributor Author

pitdicker commented Mar 7, 2016

I think the part in the Windows version of moving files is not really over the top. You could say it is just a necessary evil to avoid races. But I share your feeling about the rest.

Personally I am finding more and more functions that could benefit from options. Overwrite or fail if something already exist. Follow symlinks or not. Delete read-only files, fail, or maybe ask for confirmation.... For copy whether we should copy times, attributes etc. Be recursive or not.

What to do about read-only files on Windows, and read-only directories on Unix is the difficult part. I kind of like it to just work, as you are also able to move the dir somewhere else and it looks deleted from that place. But I don't feel very strongly about it. (of course it would be sad, as it took me quite some time to get working...)

For a filesystem library to take inspiration from, I look a little at AFIO. I can't read it very easily, but it has the focus on correctness, avoiding races, and cross-platform consistency. But it is not all good, otherwise it would have ended up in Boost.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 9, 2016

Yeah it was somewhat planned that std::fs would grow various builders over time for options like these. You're right that fs::copy could do so much more (attributes, recursive, etc), but they'd probably want to all be encapsulated in some builder with tweakable options.

In terms of removing readonly directories on Unix, it looks like rm -rf doesn't automatically remove a readonly directory. Maybe we should stay consistent with that? Maybe we should only support removing readonly files on Windows specially?

@alexcrichton alexcrichton added T-libs and removed T-libs labels Mar 14, 2016

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 17, 2016

We discussed this during the libs triage meeting yesterday, and we unfortunately didn't reach too many conclusions about this. Some thoughts brought up were:

  • It would likely be worth looking at what other standard libraries do in this respect. For example this was copied from Boost, and it looks like Python also has a similar implementation (and similar bug reports).
  • We could perhaps add a remove_dir_all_force variant along the lines of rm -f to do the "extra work" to really delete something. If this is the function that basically everyone uses, however, then that seems unfortunate?
  • We may want to think about what would precisely be documented here. For example what are we comfortable documenting that this function does across platforms? Can we strictly document all "weird cases" it handles?
  • There's also the whole story of if you want to whitelist syscalls changing this over time can be hard, but this dies into whether we can document what it does today.

cc @rust-lang/libs

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Mar 17, 2016

The MSVC implementation of std::experimental::filesystem::remove_all is interesting. Instead of iterating over all the entries in the directory, it just gets the first entry, removes it, and repeats until the directory is empty or it encounters an error.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Mar 18, 2016

Interesting! Is the source for that public so we can browse over?

@retep998

This comment has been minimized.

Copy link
Member

retep998 commented Mar 18, 2016

Well you can view the source if you install MSVC and the Windows SDK.

remove_all and _Remove_all in C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\filesystem
Some other helper functions in C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\crt\src\stl\filesys.h

@bors

This comment has been minimized.

Copy link
Contributor

bors commented Mar 23, 2016

☔️ The latest upstream changes (presumably #32390) made this pull request unmergeable. Please resolve the merge conflicts.

@alexcrichton

This comment has been minimized.

Copy link
Member

alexcrichton commented Jun 28, 2016

Closing due to inactivity, but feel free to resubmit with a rebase!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment