Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs.Path: add a parent function that returns null when the root is reached #6746

Closed
wants to merge 1 commit into from

Conversation

FireFox317
Copy link
Contributor

When calling dirname in a loop, it's pretty easy to forget that you have
to check for a root directory. So this adds an API that forces the check.

Closes #6727, #6584, #6592, #6602.

/// Asserts that the path has a directory component.
pub fn parent(path: []const u8) ?[]const u8 {
const new_path = dirname(path) orelse unreachable;
if (new_path.len == path.len) return null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few edge cases that should be handled, such as dirname('//') = '/' that the length check doesn't detect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The length check does exactly check for this case to happen:
When passing in / or C:\, it has no parent component, so the path returned has the exact same length as the one passed in, in all other cases, the length will shorten and thus, a parent directory is found

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"//".len != "/".len, the check fails yet the two strings point to the same path and they're both root dirs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I also think we shouldn't handle this edge case in here. Lets say someone is calling parent with //// then the first time parent returns / and then the next time it will find the root (because of dirname returning / for /). Which seems fine in my opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you're telling me that the parent of the root is the root itself?

then the next time it will find the root

You're assuming this is repeatedly applied, nobody is stopping me from calling it just once and get the wrong result.

Rust shares my view and returns None when applied to "//".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I see. Okay, let me see how to implement this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// is not equivalent to /, infact in posix, // is allowed to point to an entirely different root/tree.
However, three or more slashes (///) can indeed be simplified to /

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I stand corrected. Posix is so full of edge cases... AFAICS the double slash has no meaning on Linux/BSD but under Cygwin it's used to refer to network shares.
If we go full-posix then dirname // should return // no?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we go full-posix then dirname // should return // no?

correct. and dirname /// could return /

/// Returns the parant path of a directory path, or null when the root is reached.
/// Asserts that the path has a directory component.
pub fn parent(path: []const u8) ?[]const u8 {
const new_path = dirname(path) orelse unreachable;
Copy link
Contributor

@LemonBoy LemonBoy Oct 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking this should be orelse return ".". When dirname returns null you either have a file name or an empty string so the parent is indeed . (or "").

Implementing this opens the door to another class of infinite loops, returning null is fine.

@LemonBoy
Copy link
Contributor

Tests please. A simple (terminating!) loop that chops a Windows/unix path is enough.

lib/std/fs/path.zig Outdated Show resolved Hide resolved
@kubkon kubkon added the standard library This issue involves writing Zig code for the standard library. label Oct 20, 2020
@FireFox317
Copy link
Contributor Author

FireFox317 commented Oct 22, 2020

Thanks for all the review comments! However I was just interested in fixing the infinite loop issues in the compiler and coding a perfect parent function is totally out of scope for me. I could not add the parent function and just fix the functionally inside the compiler code or add a simplified parent function. Which one would you prefer @andrewrk ?

@andrewrk
Copy link
Member

I think dirname should be fixed so that it returns null when there is no parent, and then we don't need this parent function at all. I'll leave this PR open until that is done.

…ched

When calling dirname in a loop, it's pretty easy to forget that you have
to check for a root directory. So this adds an API that forces the check.
@FireFox317
Copy link
Contributor Author

FireFox317 commented Oct 27, 2020

PR is updated regarding review. I decided to follow @LemonBoy in regards to this comment:

The name dirname has a specific meaning and behaviour. Changing how it works requires a new name , people will surely come and ask why dirname isn't reallydirname if you keep it.
And IMO parent is more discoverable than dirname, at least for people who are not practical with the Unix core tools.

I also decided that for me it is out of scope to handle all the edge cases, so i added a TODO comment linking to this PR.
Also a test has been added.

@andrewrk andrewrk added this to the 0.7.1 milestone Nov 9, 2020
@FireFox317 FireFox317 closed this Nov 9, 2020
@FireFox317 FireFox317 reopened this Nov 9, 2020
@andrewrk andrewrk closed this in bf03ae7 Nov 14, 2020
andrewrk added a commit that referenced this pull request Nov 15, 2020
This intentionally diverges from the unix dirname command, as well as
Python and Node.js standard libraries, which all have this edge case
return the input path, unmodified. This is a footgun, and nobody should
have ever done it this way.

Even the man page contradicts the behavior. It says:
"strip last component from file name". Now consider, if you
remove the last item from an array of length 1, then you
have now an array of length 0. After you strip the last component, there
should be no components remaining. Clearly, returning the input parameter
unmodified in this case does not match the documented behavior. This is
my justification for taking a stand on this API design.

closes #6746
closes #6727
closes #6584
closes #6592
closes #6602
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
standard library This issue involves writing Zig code for the standard library.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

zig build hangs on faccessat sycall
7 participants