Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std: Implement cross-platform metadata API #10486

Merged
merged 5 commits into from Feb 14, 2022
Merged

Conversation

ominitay
Copy link
Contributor

@ominitay ominitay commented Jan 2, 2022

This PR implements a cross-platform metadata API, aiming to reduce unnecessary Unix-dependence of the Zig std.fs api. Presently, all OSes beside Windows are treated as Unix; I think that this is the best way to treat things by default, instead of explicitly listing each Posix-compliant OS.

Adds:

  • File.setPermissions() : Sets permission of a file according to a Permissions struct (not available on WASI)

  • File.Metadata : A cross-platform representation of file metadata

    • Metadata.size() : Returns the size of a file
    • Metadata.permissions() : Returns a Permissions struct, representing permissions on the file
    • Metadata.kind() : Returns the Kind of the file
    • Metadata.time() : Returns the time the file was accessed, modified, or created
  • File.Permissions : A cross-platform representation of file permissions

    • Permissions.readOnly() : Returns whether the file is read-only
    • Permissions.setReadOnly() : Sets whether the file is read-only
    • Permissions.unixSet() : Sets permissions for a class (UNIX-only)
    • Permissions.unixGet() : Checks a permission for a class (UNIX-only)
    • Permissions.unixNew() : Returns a new Permissions struct representing the passed mode (UNIX-only)

I don't quite think this is complete yet, and would like some feedback on this. Additionally, I haven't been able to test whether this works on Windows (it compiles, but not sure if tests pass), as Wine doesn't seem to want to work. If anyone could run tests on a Windows machine for me, it would be much appreciated.

closes #10441
cc #6600

@ominitay
Copy link
Contributor Author

ominitay commented Jan 2, 2022

This is NOT ready to be merged!!

@ominitay
Copy link
Contributor Author

ominitay commented Jan 2, 2022

WASI isn't supported because of a lack of chmod. Is this something that could be implemented on WASI?

@ominitay ominitay force-pushed the metadata branch 2 times, most recently from ac58785 to 721b938 Compare January 2, 2022 14:35
@andreafeletto
Copy link

Regarding the {a,m,c}time functions, I would suggest merging them into one function (maybe called time) which would accept an enum of value .accessed, .modified or .created.

@ominitay
Copy link
Contributor Author

ominitay commented Jan 2, 2022

Regarding the {a,m,c}time functions, I would suggest merging them into one function (maybe called time) which would accept an enum of value .accessed, .modified or .created.

I like that idea! I'll get that done in a bit.

@ominitay ominitay force-pushed the metadata branch 2 times, most recently from fb66278 to 892466b Compare January 2, 2022 22:40
lib/std/fs/file.zig Outdated Show resolved Hide resolved
lib/std/fs/test.zig Outdated Show resolved Hide resolved
@daurnimator daurnimator added the standard library This issue involves writing Zig code for the standard library. label Jan 3, 2022
@ominitay
Copy link
Contributor Author

ominitay commented Jan 3, 2022

Passing tests on Windows now!! 🎉

lib/std/fs/file.zig Outdated Show resolved Hide resolved
@ominitay
Copy link
Contributor Author

ominitay commented Jan 4, 2022

Soooo ctime on Unix and ctime on Windows mean different things. Time to redo a lot of things :)

@ominitay
Copy link
Contributor Author

ominitay commented Jan 4, 2022

@AndreaOrru

Regarding the {a,m,c}time functions, I would suggest merging them into one function (maybe called time) which would accept an enum of value .accessed, .modified or .created.

This won't be happening now, sorry :(

pub const MetadataError = StatError;

/// Returns a `Metadata` struct, representing the permissions on the file
pub fn metadata(self: File) MetadataError!Metadata {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about getMetadata? I think Zig's std convention is usually to use verbs for getting/setting things?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When discussing this in the Discord server, @ifreund said to me the opposite -- that I should drop the 'get' prefix which I was using previously. That matches with .allocator() too.

/// Can be obtained with `File.metadata()`
pub const Metadata = switch (builtin.os.tag) {
.windows => struct {
attributes: windows.DWORD,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should prefix all platform specific fields with a platform identifier? So in this case windowsAttributes? The reason I suggest this is to make it obvious what parts of the API are platform agnostic and which aren't. For example,

const m = try file.metadata();
std.log.info("file attributes={d}", .{m.attributes});

This code snippet looks like it's platform agnostic but it's not. If the developer who wrote this only compiles it for Windows then they won't know the mistake. This would become much more obvious if it looked like this:

const m = try file.metadata();
std.log.info("file attributes={d}", .{m.windowsAttributes});

Now it's alot more obvious that this field is specific to Windows and if they want their code to work elsewhere to surround it with a builtin.os.tag check. We also use this strategy for platform specific functions like you did with unixHas etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to rethink a lot of this really, and one idea I keep coming back do is having File.Metadata, which consistently has the same functions, and has a field inner, which contains the actual platform-specific implementation (MetadataLinux etc), and has methods which aren't shared across platforms. This would help discourage direct use of member fields as well as prefixing them with the platform name, and additionally keep documentation for File.Metadata exclusively to cross-platform methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's an improvement IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should now be resolved by the pushed changes, though I haven't yet added any platform-specific operations :)

@ominitay
Copy link
Contributor Author

ominitay commented Jan 7, 2022

Have been doing some research in the Discord server regarding how OSes expose creation time to us. So far, I have found:

When there is no creation time (i.e. the filesystem does not support it):

  • Linux zeroes a mask bit with statx
    • This is nice -- we can easily detect this, and don't lose any otherwise valid times.
  • FreeBSD sets birthtime seconds to -1, and nsecs to 0
  • NetBSD sets birthtime seconds to 0, and nsecs to 0
  • OpenBSD sets birthtime seconds to 0, and nsecs to 0
    • The *BSDs are less nice -- we can easily detect this, but we lose otherwise valid times.
  • MacOS sets birthtime to ctime
    • This is awful -- we can't detect this.

Obtaining the creation time:

  • Linux goes its own way with the added statx syscall, but is fairly nice to use, so I'm not upset.
  • Other Unices either have a field in their stat structs, or don't provide it at all. This is not so good because of the above point. Additionally, they may be named differently, so I intend to change the declarations to fix this.
  • Windows is Windows, and has its own way, putting CreationTime in BasicFileInfo.

@ominitay
Copy link
Contributor Author

ominitay commented Jan 7, 2022

This means we have two options for Metadata.created():

Firstly, as it has to return an optional anyway (we can't know whether the filesystem supports it or not until runtime), we could return null if the target doesn't have Stat.btim. This would have the disadvantage of leaving this to be dealt with at compile-time to be dealt with at runtime.

On the other hand, as we know for certain whether the target has Stat.btim or not at compile-time, we could instead throw a compile error. This would have the disadvantage of meaning that we can't do Metadata.created(), only Metadata.inner.created().

My personal preference is for the first, as I think it makes the API more flexible, but would like to see what you think :)

You can react 👍 for the first or 👎 for the second if you want

// On MacOS, it is set to ctime -- we cannot detect this!!
// Additionally, we don't know how filesystems themselves may deal with this -- they may
// in-fact return a different value, so this check is fairly permissive.
if (birthtime.tv_sec == -1 or birthtime.tv_nsec <= -1 or (birthtime.tv_sec == 0 and birthtime.tv_nsec == 0)) // wtf i hate this
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I anticipate that people will want to examine this line in particular.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer this check to broken up per OS as in the comment.

@ominitay ominitay force-pushed the metadata branch 2 times, most recently from c6dcaa9 to 32d94aa Compare January 9, 2022 01:12
@ominitay
Copy link
Contributor Author

ominitay commented Jan 9, 2022

If someone is still running it, could someone test this on a machine with a kernel earlier than 4.11?

@squeek502
Copy link
Collaborator

squeek502 commented Jan 9, 2022

If someone is still running it, could someone test this on a machine with a kernel earlier than 4.11?

I downloaded the latest master binary tarball of zig, replaced lib/std/fs.zig/lib/std/fs/file.zig/lib/std/fs/test.zig with the versions from this branch, added a debug.print call to the metadata test to print out the accessed/modified/created times, which gave me:

$ uname -r
4.4.0-210-generic
$ ./zig test lib/std/fs/test.zig
Test [36/39] test "File.metadata"... accessed: 1641697197363780756 modified: 1641697197363780756 created: null
All 39 tests passed.

Let me know if there's something more specific that you're looking to check.

@ominitay
Copy link
Contributor Author

ominitay commented Jan 9, 2022

That looks good! Thank you :)

lib/std/fs/file.zig Outdated Show resolved Hide resolved
// On MacOS, it is set to ctime -- we cannot detect this!!
// Additionally, we don't know how filesystems themselves may deal with this -- they may
// in-fact return a different value, so this check is fairly permissive.
if (birthtime.tv_sec == -1 or birthtime.tv_nsec <= -1 or (birthtime.tv_sec == 0 and birthtime.tv_nsec == 0)) // wtf i hate this
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer this check to broken up per OS as in the comment.

@ominitay
Copy link
Contributor Author

I'd prefer this check to broken up per OS as in the comment.

Not sure why I can't reply to this in a conversation. My thought process was to make this check fairly loose, so that if the filesystem is being weird and setting this value to a different (also invalid) value, we'd still deal with it. Though I think you are right about this the further I've thought about it.

@ominitay
Copy link
Contributor Author

I think it's best to compile error if a switch branch isn't specified though, in order to force future contributors to update this properly.

@ominitay
Copy link
Contributor Author

So I can't find what Haiku sets it to by default. This PR would still be mergeable without a branch for it, but means that Haiku users would encounter a compile error when trying to use Metadata.created().

Uses timespec for times in `Stat` instead of two `isize` fields per time. This matches the <sys/stat.h> header file.
Adds a birthtime function to the `Stat` structs of Unices which support this. This is done to match the `atime`, `mtime`, and `ctime` functions.
Implements a cross-platform metadata API, aiming to reduce unnecessary Unix-dependence of the `std.fs` api. Presently, all OSes beside Windows are treated as Unix; this is likely the best way to treat things by default, instead of explicitly listing each Unix-like OS.

Platform-specific operations are not provided by `File.Metadata`, and instead are to be accessed from `File.Metadata.inner`.

Adds:

- File.setPermissions() : Sets permission of a file according to a `Permissions` struct (not available on WASI)

- File.Permissions : A cross-platform representation of file permissions
  - Permissions.readOnly() : Returns whether the file is read-only
  - Permissions.setReadOnly() : Sets whether the file is read-only
  - Permissions.unixSet() : Sets permissions for a class (UNIX-only)
  - Permissions.unixGet() : Checks a permission for a class (UNIX-only)
  - Permissions.unixNew() : Returns a new Permissions struct to represent the passed mode (UNIX-only)

- File.Metadata : A cross-platform representation of file metadata
  - Metadata.size() : Returns the size of a file
  - Metadata.permissions() : Returns a `Permissions` struct, representing permissions on the file
  - Metadata.kind() : Returns the `Kind` of the file
  - Metadata.accessed() : Returns the time the file was last accessed
  - Metadata.modified() : Returns the time the file was last modified
  - Metadata.created() : Returns the time the file was created (this is an optional, as the underlying filesystem, or OS may not support this)

Methods of `File.Metadata` are also available for the below, so I won't repeat myself
The below may be used for platform-specific functionality

- File.MetadataUnix : The internal implementation of `File.Metadata` on Unices

- File.MetadataLinux : The internal implementation of `File.Metadata` on Linux

- File.MetadataWindows : The implementation of `File.Metadata` on Windows
@ominitay
Copy link
Contributor Author

Solved all that, notwithstanding the Haiku question. Should be ready to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement a better file metadata API
6 participants