Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use std::filesystem::path for Path #9205

Open
Ericson2314 opened this issue Oct 22, 2023 · 10 comments
Open

Use std::filesystem::path for Path #9205

Ericson2314 opened this issue Oct 22, 2023 · 10 comments
Labels

Comments

@Ericson2314
Copy link
Member

Ericson2314 commented Oct 22, 2023

We currently have Path defined as std::string. This is kinda sloppy in general, but especially bad if we ever want to work on Windows.

We should instead use std::filesystem::path, which is the newish C++ standard Path type, and which does all the right things on all platforms.

Unfortunately, there isn't yet a std::filesystem::path_view. cplusplus/papers#406 will eventually fix this, but it is not yet stable and won't be for a while. This gives us two choices:

  1. Use https://github.com/ned14/llfio, which is pioneering what will hopefully be standardized as std::filesystem::path_view.
  2. Hand-roll our own PathView, which can be a small convenience around std::basic_string_view<Path::value_type> that has the missing implicit coercions.

I think (2) is the better first step; we can easily got from (2) to (1) later if we want.

Also per what @edolstra says before CanonPath should not change and we should also use it more. It would be for "internal" paths / paths in our SourcePath VFS we want almost all file access to go though. That means std::filesystem::path is just used for the lower level glue code with the OS.

@edolstra
Copy link
Member

We already have CanonPath, we should probably use that more widely.

@Ericson2314
Copy link
Member Author

@edolstra I would say they are both useful but for different purposes:

  • CanonPath is our normalized, portable notion of a path, just like file system objects as opposed to actual per-OS filesystem interfaces.

  • std::filesystem::path is native paths for acting with the physical filesystem, with all their craziness (case insensitivity, crazy Windowsism, etc.)

I started slogging away at a branch for this issue, and I made a NarPath that indeed should just be CanonPath so thanks for bringing it up! :)

@ned14
Copy link

ned14 commented Nov 2, 2023

Re: (2) the LLFIO path_view implementation isn't especially dependent on the rest of LLFIO, and it could be hoisted out without much work.

Re: stability, I don't think LEWG made a single change last meeting which affected the code, not even renaming.

@Ericson2314
Copy link
Member Author

@ned14 Thanks for chiming in here. While yes path_view part is a relatively small part of your library, I think we would end up using more of it because it seems to be the only game in town for abstracting over file descriptors and windows handles. We have a fair amount of new-school file-descriptor-based system calls, and it would be a shame to give that up for some lowest-common-denominator portability.

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-11-17-nix-team-meeting-minutes-104/35753/1

Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Feb 6, 2024
As discussed in the last Nix team meeting (2024-02-95), this method
doesn't belong because `CanonPath` is a virtual/ideal absolute path
format, not used in file systems beyond the native OS format for which a
"current working directory" is defined.

Progress towards NixOS#9205
Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Feb 8, 2024
As discussed in the last Nix team meeting (2024-02-95), this method
doesn't belong because `CanonPath` is a virtual/ideal absolute path
format, not used in file systems beyond the native OS format for which a
"current working directory" is defined.

Progress towards NixOS#9205
Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Feb 8, 2024
As discussed in the last Nix team meeting (2024-02-95), this method
doesn't belong because `CanonPath` is a virtual/ideal absolute path
format, not used in file systems beyond the native OS format for which a
"current working directory" is defined.

Progress towards NixOS#9205
Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Feb 8, 2024
As discussed in the last Nix team meeting (2024-02-95), this method
doesn't belong because `CanonPath` is a virtual/ideal absolute path
format, not used in file systems beyond the native OS format for which a
"current working directory" is defined.

Progress towards NixOS#9205
Ericson2314 added a commit to obsidiansystems/nix that referenced this issue Feb 8, 2024
As discussed in the last Nix team meeting (2024-02-95), this method
doesn't belong because `CanonPath` is a virtual/ideal absolute path
format, not used in file systems beyond the native OS format for which a
"current working directory" is defined.

Progress towards NixOS#9205
@qknight
Copy link
Member

qknight commented Apr 30, 2024

@Ericson2314 I tried to create symlinks on Windows using MinGW but it fails for files and also directories so instead this worked:

 if (CreateSymbolicLinkA(dstd, srcd, SYMBOLIC_LINK_FLAG_DIRECTORY) == FALSE) {
        std::cout << "Error creating directory symlink\n" << std::endl;
        DWORD error = ::GetLastError();
        std::string message = std::system_category().message(error);
        std::cout << message << std::endl;
    }

And playing with canonPath and absPath I've discovered that it converts Windows paths neatly but some of the functions only work if the files are actually there and can't be used to 'normalize' a path soly based on a string representation, i.e. remove the trailing slash for instance.

Path canonPath(PathView path, bool resolveSymlinks) {
    assert(path != "");
    fs::path p = path;

    if (!p.is_absolute())
        throw "not an absolute path";

    fs::path canonical_path;
    try {
        canonical_path = fs::weakly_canonical(p);
    }
    catch (std::filesystem::filesystem_error const &ex) {
        throw "weakly_canonical path problem";
    }
    return Path(canonical_path.make_preferred().lexically_normal().string());

}

Path absPath(PathView path, std::optional <PathView> dir = {}, bool resolveSymlinks = false) {
    Path pathAbsolute;
    try {
        pathAbsolute = fs::absolute(path).make_preferred().lexically_normal().string();
    }
    catch (std::filesystem::filesystem_error const &ex) {
        throw "abs path problem";
    }
    return Path(canonPath(pathAbsolute, resolveSymlinks));
}

I'll provide a branch with unit tests for inspiration soon.

@Ericson2314
Copy link
Member Author

@qknight Great! Since I am hopefully a GSOC project will be picking up this issue (freeing us to work on other things), starting with a bunch of unit tests is an excellent first step.

@qknight
Copy link
Member

qknight commented May 3, 2024

In 508e762 i implemented:

  • absPath_
  • canonPath_

which both use std::filesystem and are wrapped in absPath(..) and canonPath(..).

i only tested them using ./libnixutil-tests.exe on windows 10 natively in powershell, here are the results:

.\libnixutil-tests.exe --gtest_color=yes --gtest_filter=*
...
[ PASSED ] 278 tests.
[ FAILED ] 5 tests, listed below:
[ FAILED ] GitTest.blob_read
[ FAILED ] GitTest.blob_write
[ FAILED ] GitTest.tree_read
[ FAILED ] GitTest.tree_write
[ FAILED ] GitTest.both_roundrip

future tests

i would wish for

  • create tests with actual filesystem usage, by making use of std::tmpfile
  • create tests for symlinks on linux and windows
  • extend the windows nix docoumentation running tests requires either 'windows developer' opt-in to run or run as administrator

@qknight
Copy link
Member

qknight commented May 5, 2024

We should also have tests covering these requirements:

The problem is the filesystem library must support long file names and symlinks (including dangling and recursive).
Remember, there is no reliable cp -r on Windows. All software I tested: from built-in copy and xcopy through MinGW bash and coreutils, to 3-rd party robocopy do fail on doing cp -r with real-world directory structure which has both long filenames, recursive symlinks and non-printable characters (such as checked out LLVM and Chromium monorepos), so cp -r, ln -s, … have been written in Perl with some external modules (and even this choice required patches in C code of those modules).

Ericson2314 pushed a commit that referenced this issue May 7, 2024
Progress on #9205

Co-Authored-By: John Ericson <John.Ericson@Obsidian.Systems>

* Get rid of `PathNG`, just use `std::filesystem::path`
@ned14
Copy link

ned14 commented May 8, 2024

All software I tested: from built-in copy and xcopy through MinGW bash and coreutils, to 3-rd party robocopy do fail on doing cp -r with real-world directory structure which has both long filenames, recursive symlinks and non-printable characters (such as checked out LLVM and Chromium monorepos)

I would agree about copy and xcopy.

I don't agree about robocopy, which was written by an experienced NT kernel developer precisely because all the other tools had issues. I've never found anything possible on a NT filesystem robocopy can't be persuaded to handle, though it may take non-obvious options. It doesn't use the same copying semantics as POSIX cp, which can produce unpleasant surprise.

Where robocopy does fall down somewhat is large directories - it isn't quick at deleting a 100 million item directory. A LLFIO based tool can clear that in under ten seconds, robocopy can take lots of minutes.

I also really don't like std::tmpfile at all or indeed anything in the standard library to do with temporary files. They were all very poorly designed. POSIX mkotemp is much more like it.

pull bot pushed a commit to auxolotl/nix that referenced this issue May 9, 2024
Progress on NixOS#9205

Co-Authored-By: John Ericson <John.Ericson@Obsidian.Systems>

* Get rid of `PathNG`, just use `std::filesystem::path`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants