Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return file creation times on Linux (using statx) #248

Merged
merged 9 commits into from
Jun 4, 2022
22 changes: 0 additions & 22 deletions cap-primitives/src/fs/metadata.rs
Original file line number Diff line number Diff line change
Expand Up @@ -48,35 +48,13 @@ impl Metadata {

#[inline]
fn from_parts(std: fs::Metadata, ext: MetadataExt, file_type: FileType) -> Self {
// TODO: Initialize `created` on Linux with `std.created().ok()` once we
// make use of `statx`.
Self {
file_type,
len: std.len(),
permissions: Permissions::from_std(std.permissions()),
modified: std.modified().ok().map(SystemTime::from_std),
accessed: std.accessed().ok().map(SystemTime::from_std),

#[cfg(any(
target_os = "freebsd",
target_os = "openbsd",
target_os = "macos",
target_os = "ios",
target_os = "netbsd",
windows,
))]
created: std.created().ok().map(SystemTime::from_std),

#[cfg(not(any(
target_os = "freebsd",
target_os = "openbsd",
target_os = "macos",
target_os = "ios",
target_os = "netbsd",
windows,
)))]
created: None,

ext,
}
}
Expand Down
4 changes: 3 additions & 1 deletion cap-primitives/src/rustix/fs/metadata_ext.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,9 @@
use crate::fs::PermissionsExt;
use crate::fs::{FileTypeExt, Metadata};
use crate::time::{Duration, SystemClock, SystemTime};
// TODO: update all these to
// #[cfg(any(target_os = "android", target_os = "linux"))]
// once we're on restix >= v0.34.3.
#[cfg(all(target_os = "linux", target_env = "gnu"))]
use rustix::fs::{makedev, Statx};
use rustix::fs::{RawMode, Stat};
Expand Down Expand Up @@ -222,7 +225,6 @@ impl MetadataExt {
/// Constructs a new instance of `Metadata` from the given `Statx`.
#[cfg(all(target_os = "linux", target_env = "gnu"))]
#[inline]
#[allow(dead_code)] // TODO: use `statx` when possible.
pub(crate) fn from_rustix_statx(statx: Statx) -> Metadata {
Metadata {
file_type: FileTypeExt::from_raw_mode(RawMode::from(statx.stx_mode)),
Expand Down
29 changes: 29 additions & 0 deletions cap-primitives/src/rustix/fs/stat_unchecked.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,14 @@ use rustix::fs::{statat, AtFlags};
use std::path::Path;
use std::{fs, io};

// TODO: update all these to
// #[cfg(any(target_os = "android", target_os = "linux"))]
// once we're on restix >= v0.34.3.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea when the rustix version will be updated? I don't have a sense of how big that is, so I didn't want to add it to this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The biggest thing for cap-std is that there are some semver incompatibilities between versions of io-lifetimes, which is tracking changes from I/O safety stabilization. Fortunately, the FCP looks to be almost done. Once it's done, I plan to do an io-lifetimes release and then a rustix 0.35 release, and then update cap-std and various other things.

#[cfg(all(target_os = "linux", target_env = "gnu"))]
use rustix::fs::{statx, StatxFlags};
#[cfg(all(target_os = "linux", target_env = "gnu"))]
use std::sync::atomic::{AtomicBool, Ordering};

/// *Unsandboxed* function similar to `stat`, but which does not perform
/// sandboxing.
pub(crate) fn stat_unchecked(
Expand All @@ -15,5 +23,26 @@ pub(crate) fn stat_unchecked(
FollowSymlinks::No => AtFlags::SYMLINK_NOFOLLOW,
};

// `statx` is preferred on Linux because it can return creation times.
// Linux kernels prior to 4.11 don't have `statx` and return `ENOSYS`. We
// store the availability in a global to avoid unnecessary syscalls.
#[cfg(all(target_os = "linux", target_env = "gnu"))]
static HAS_STATX: AtomicBool = AtomicBool::new(true);

#[cfg(all(target_os = "linux", target_env = "gnu"))]
if HAS_STATX.load(Ordering::Relaxed) {
let statx_result = statx(
start,
path,
atflags,
StatxFlags::BASIC_STATS | StatxFlags::BTIME,
);
match statx_result {
Ok(statx) => return Ok(MetadataExt::from_rustix_statx(statx)),
Err(rustix::io::Error::NOSYS) => HAS_STATX.store(false, Ordering::Relaxed),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw a comment in open_impl.rs about seccomp on Android making ENOSYS a bad idea. I don't think that applies here yet, but it might later. Is there a better approach? Should I place a comment on this landmine?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rust's own standard library calls statx without doing an explicit version check, so we're ok on Androidd here.

Looking into it though, I notice that there is a potential problem with statx returning EPERM on old versions of Docker; std does a special check to avoid this problem. If you want to implement that same trick here, that'd be great, though since rustix also needs to do this, it's also fine to leave this code as-is for now, and I'll handle this when I update rustix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch. It seems like if we merge this PR now, it'd break metadata for not-very old versions of Docker, so that's probably bad.

Why/where does rustix need this logic? I thought it just wrapped statx and doesn't really care whether or not statx succeeds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rustix uses statx in its implementation of rustix::fs::stat. Rustix usually doesn't do that kind of emulation, but in this case, it's motivated by a desire to be y2038-ready on platforms which support that. On Linux, that means using statx instead of the native stat family of syscalls whenever possible. So if you want to fix this now, that'd be great, but I am also happy to fix this myself at the same time that I fix rustix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't see that earlier because I was searching through an older version, oops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just simulated the seccomp statx -> EPERM behavior on my system using this Python script (as root):

import errno
import os
import seccomp

f = seccomp.SyscallFilter(seccomp.ALLOW)
f.add_rule(seccomp.ERRNO(errno.EPERM), "statx")
f.load()

os.execl(
    "target/debug/deps/fs_additional-d7e3e72f4c824747",
    "fs_additional",
    "--nocapture",
    "--test-threads=1",
    "metadata_vs_std_fs",
)

It's not quite as accurate as getting an old Docker version, but I'd rather not do that :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated stat_unchecked to handle EPERM. I couldn't do the same EFAULT trick as Rust std uses (I think it'd require unsafe), but I pushed the test off the critical path. Let me know what you think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm considering adding something in Rustix to allow or the null-pointer EFAULT trick, but for now the code you have here looks good.

Err(e) => return Err(e.into()),
}
ongardie marked this conversation as resolved.
Show resolved Hide resolved
}

Ok(statat(start, path, atflags).map(MetadataExt::from_rustix)?)
}
111 changes: 111 additions & 0 deletions tests/fs_additional.rs
Original file line number Diff line number Diff line change
Expand Up @@ -872,3 +872,114 @@ fn reopen_fd() {
let tmpdir2 = check!(cap_std::fs::Dir::reopen_dir(&tmpdir.as_filelike()));
assert!(tmpdir2.exists("subdir"));
}

#[test]
fn metadata_vs_std_fs() {
let tmpdir = tmpdir();
check!(tmpdir.create_dir("dir"));
let dir = check!(tmpdir.open_dir("dir"));
let file = check!(dir.create("file"));

let cap_std_dir = check!(dir.dir_metadata());
let cap_std_file = check!(file.metadata());
let cap_std_dir_entry = {
let mut entries = check!(dir.entries());
let entry = check!(entries.next().unwrap());
assert_eq!(entry.file_name(), "file");
assert!(entries.next().is_none(), "unexpected dir entry");
check!(entry.metadata())
};

let std_dir = check!(dir.into_std_file().metadata());
let std_file = check!(file.into_std().metadata());

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is much nicer now, I think. However, it's a little funny as is. It does all this setup to extract metadata in various ways from cap_std and std, but then it only compares the created times of the metadata and nothing else. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On one hand, creation times have an interesting enough implementation landscape that I feel it's justified to have a fair amount of test setup just for that one purpose, so this seems fine. On the other, since we have all this setup here, it'd be nice to expand this to test all of Metadata's accessors to make sure they all match. accessed() might need to be a >= check instead of equality because files can get spurious access-time updates.

I don't want to bog you down if you're working on a project which just needs the creation times fix. So I'll leave it up to you for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I expanded this out for all of the accessible Metadata (with only Unix extensions for now).

match std_dir.created() {
Ok(_) => println!("std::fs supports file created times"),
Err(e) => println!("std::fs doesn't support file created times: {}", e),
}

check_metadata(&std_dir, &cap_std_dir);
check_metadata(&std_file, &cap_std_file);
check_metadata(&std_file, &cap_std_dir_entry);
}

fn check_metadata(std: &std::fs::Metadata, cap: &cap_std::fs::Metadata) {
assert_eq!(std.is_dir(), cap.is_dir());
assert_eq!(std.is_file(), cap.is_file());
assert_eq!(std.is_symlink(), cap.is_symlink());
assert_eq!(std.file_type().is_dir(), cap.file_type().is_dir());
assert_eq!(std.file_type().is_file(), cap.file_type().is_file());
assert_eq!(std.file_type().is_symlink(), cap.file_type().is_symlink());
if cfg!(unix) {
use std::os::unix::fs::FileTypeExt;
assert_eq!(
std.file_type().is_block_device(),
cap.file_type().is_block_device()
);
assert_eq!(
std.file_type().is_char_device(),
cap.file_type().is_char_device()
);
assert_eq!(std.file_type().is_fifo(), cap.file_type().is_fifo());
assert_eq!(std.file_type().is_socket(), cap.file_type().is_socket());
}

assert_eq!(std.len(), cap.len());

assert_eq!(std.permissions().readonly(), cap.permissions().readonly());
if cfg!(unix) {
use std::os::unix::fs::PermissionsExt;
// The standard library returns file format bits with `mode()`, whereas
// cap-std currently doesn't.
assert_eq!(std.permissions().mode() & 0o7777, cap.permissions().mode());
Comment on lines +934 to +936
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unexpected and a bit surprising to me. Is it intentional? If so, is that documented somewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's cap-std's intention to match std's behavior at least, so it looks like a bug in cap-std. I filed #249 to track this.

}

// If the standard library supports file modified/accessed/created times,
// then cap-std should too.
if let Ok(expected) = std.modified() {
assert_eq!(expected, check!(cap.modified()).into_std());
}
// The access times might be a little different due to either our own
// or concurrent accesses.
const ACCESS_TOLERANCE_SEC: u32 = 60;
Comment on lines +944 to +946
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't love this but didn't want to assume we made the cap or std access first, and I wanted to check the access times weren't wildly far into the future either. I guess it's pretty easy to find and update later if it causes trouble.

if let Ok(expected) = std.accessed() {
let access_tolerance = std::time::Duration::from_secs(ACCESS_TOLERANCE_SEC.into());
assert!(
((expected - access_tolerance)..(expected + access_tolerance))
.contains(&check!(cap.accessed()).into_std()),
"std accessed {:#?}, cap accessed {:#?}",
expected,
cap.accessed()
);
}
if let Ok(expected) = std.created() {
assert_eq!(expected, check!(cap.created()).into_std());
}

if cfg!(unix) {
use std::os::unix::fs::MetadataExt;
assert_eq!(std.dev(), cap.dev());
assert_eq!(std.ino(), cap.ino());
assert_eq!(std.mode(), cap.mode());
assert_eq!(std.nlink(), cap.nlink());
assert_eq!(std.uid(), cap.uid());
assert_eq!(std.gid(), cap.gid());
assert_eq!(std.rdev(), cap.rdev());
assert_eq!(std.size(), cap.size());
assert!(
((std.atime() - i64::from(ACCESS_TOLERANCE_SEC))
..(std.atime() + i64::from(ACCESS_TOLERANCE_SEC)))
.contains(&cap.atime()),
"std atime {}, cap atime {}",
std.atime(),
cap.atime()
);
assert!((0..1_000_000_000).contains(&cap.atime_nsec()));
assert_eq!(std.mtime(), cap.mtime());
assert_eq!(std.mtime_nsec(), cap.mtime_nsec());
assert_eq!(std.ctime(), cap.ctime());
assert_eq!(std.ctime_nsec(), cap.ctime_nsec());
assert_eq!(std.blksize(), cap.blksize());
assert_eq!(std.blocks(), cap.blocks());
}
}