Skip to content

Commit

Permalink
Do not interleave readdir() calls with deletion of directory entries.
Browse files Browse the repository at this point in the history
On macOS and some other non-Linux OSes, on some filesystems, readdir(dir) may
return NULL (with zero errno) after an entry in dir has been deleted. We thus
need to readdir() all directory entries before starting to delete them.

A benchmark (deleting 10 copies of the Linux kernel source) seems to show
that the new approach is approximately as fast as the previous one (slightly
faster on Linux, slightly slower on Mac), and in any case, is noticeably
faster than the system's /bin/rm.

Bazel's previous unix_jni DeleteTreesBelow implementation:
6.987 s (Linux/ext4); 89.44 s (Mac/APFS)

New Bazel unix_jni DeleteTreesBelow implementation:
6.971 s (Linux/ext4); 90.46 s (Mac/APFS)

`rm -rf`:
7.323 s (Linux/ext4); 99.09 s (Mac/APFS)

RELNOTES: None.
PiperOrigin-RevId: 346790610
  • Loading branch information
tetromino authored and Copybara-Service committed Dec 10, 2020
1 parent bf32cb8 commit 230be16
Showing 1 changed file with 28 additions and 5 deletions.
33 changes: 28 additions & 5 deletions src/main/native/unix_jni.cc
Expand Up @@ -907,6 +907,13 @@ static int DeleteTreesBelow(JNIEnv* env, std::vector<std::string>* dir_path,
}

dir_path->push_back(entry);
// On macOS and some other non-Linux OSes, on some filesystems, readdir(dir)
// may return NULL after an entry in dir is deleted even if not all files have
// been read yet - see https://support.apple.com/kb/TA21420; we thus read all
// the names of dir's entries before deleting. We don't want to simply use
// fts(3) because we want to be able to chmod at any point in the directory
// hierarchy to retry a filesystem operation after hitting an EACCES.
std::vector<std::string> dir_files, dir_subdirs;
for (;;) {
errno = 0;
struct dirent* de = readdir(dir);
Expand All @@ -927,15 +934,31 @@ static int DeleteTreesBelow(JNIEnv* env, std::vector<std::string>* dir_path,
break;
}
if (is_dir) {
if (DeleteTreesBelow(env, dir_path, dirfd(dir), de->d_name) == -1) {
dir_subdirs.push_back(de->d_name);
} else {
dir_files.push_back(de->d_name);
}
}
if (env->ExceptionOccurred() == NULL) {
for (const auto &file : dir_files) {
if (ForceDelete(env, *dir_path, dirfd(dir), file.c_str(), false) == -1) {
CHECK(env->ExceptionOccurred() != NULL);
break;
}
}

if (ForceDelete(env, *dir_path, dirfd(dir), de->d_name, is_dir) == -1) {
CHECK(env->ExceptionOccurred() != NULL);
break;
// DeleteTreesBelow is recursive; don't hold on to file names unnecessarily.
dir_files.clear();
}
if (env->ExceptionOccurred() == NULL) {
for (const auto &subdir : dir_subdirs) {
if (DeleteTreesBelow(env, dir_path, dirfd(dir), subdir.c_str()) == -1) {
CHECK(env->ExceptionOccurred() != NULL);
break;
}
if (ForceDelete(env, *dir_path, dirfd(dir), subdir.c_str(), true) == -1) {
CHECK(env->ExceptionOccurred() != NULL);
break;
}
}
}
if (closedir(dir) == -1) {
Expand Down

0 comments on commit 230be16

Please sign in to comment.