Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Path.Exists -- to replace inefficient FileExists || DirectoryExists #21678

Closed
JeremyKuhne opened this issue May 12, 2017 · 40 comments
Closed
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.IO help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Milestone

Comments

@JeremyKuhne
Copy link
Member

JeremyKuhne commented May 12, 2017

We should implement the code for this internally and look at adding a public API for it. Probably Path.Exists(string path).

The internal helper should avoid any sort of path validation/normalization if possible. A bad path, by definition, doesn't exist. At least on the Windows side pre normalizing isn't needed (and wasteful). We should also consider exposing overloads such as Path.Exists(string path, bool normalize) to allow users to skip this as well (or perhaps a config switch).

[added by @danmoseley --]

Background and motivation

It's fairly common to want to know whether a file path is available to write to, without writing the file. For example, in order to prompt the user for a default save file name. In order to do this using our standard API, one must separately check whether the path matches an existing file, or an existing directory:

if (File.Exists(path) || Directory.Exists(path))
{
     // choose another name, bail out, etc
}

A new API can unify both these checks into one call, which is less verbose but more importantly potentially more efficient -- the underlying implementation (eg lstat, GetFileAttributesEx) generally returns all this info with a single call. While it's likely caching at some layer below .NET means that the above will not cause two disk reads, there's certainly inefficiency in performing two syscalls to get the same information.

API Proposal

    public static partial class Path
    {
        // Implementation is logically equivalent to File.Exists(path) || Directory.Exists(path)
        public static bool Exists([System.Diagnostics.CodeAnalysis.NotNullWhenAttribute(true)] string? path) { .... };
    }

API Usage

if (Path.Exists(path))
{
     // choose another name, bail out, etc
}

Alternative Designs

There is already an abstract method FileSystemInfo.Exists which but to make that useable, it would have to change from abstract to virtual, as would the FileSystemInfo class, which would also need to expose a public constructor. This would also change its nature from representing either a directory or file (chosen at construction time) to representing something that is possibly either. As an instance method it would require newing up an object, which is wasteful. Finally, FileSystemInfo.Exists is likely less discoverable and readable than Path.Exists and discoverability is important since this is a fairly basic operation.

Risks

"Paths" in the broadest sense can represent more than traditional files and directories, and Path.Exists might imply to some that it could for example on Windows be used to test for the existence of a named pipe or other OS object. I suspect that is unlikely though, as the overwhelming use of the Path class is not with such objects.

@JeremyKuhne
Copy link
Member Author

See dotnet/corefx#19716 (comment)

@ghost
Copy link

ghost commented May 12, 2017

I would just call it Path.Exists(). It would instantly replace most of my File.Exists() calls, so it'd be nice not to have a unwieldly name like DirectoryOrFileExists().

@kellypleahy
Copy link
Contributor

I started working on this, but I'm wondering. I can test on Windows and Linux, but how best to get tests for WinRT to run? Can I do that on a linux / windows box?

@kellypleahy
Copy link
Contributor

@karelz so.... who should I talk to about how to test the PR I want to do for this one? Specifically, I'm wondering about windows/winRT since I'm doing the changes on linux. I can test on windows as well as I have a machine with windows, but I don't have anything that does winRT.

@kellypleahy
Copy link
Contributor

I would just call it Path.Exists(). It would instantly replace most of my File.Exists() calls, so it'd be nice not to have a unwieldly name like DirectoryOrFileExists().

@atsushikan It seems like Path is probably in mscorlib and not CoreFX? I don't see an easy path to add Path.Exists given that the implementation of it would presumably live on FileSystem? Am I missing something? cc @karelz

@kellypleahy
Copy link
Contributor

@stephentoub mentioned wanting to move Path.cs to corefx at some point here: https://github.com/dotnet/coreclr/issues/1104#issuecomment-111193983

I wonder if that has to happen before this can really be done properly as a public API, or is there some magic type-forwarding foo or something else I need to know about to do Path.Exists?

@stephentoub
Copy link
Member

@stephentoub mentioned wanting to move Path.cs to corefx at some point here

The Path implementation did move to corefx, and then with all of the .NET Standard 2.0 compatibility work, it moved back.

I wonder if that has to happen before this can really be done properly as a public API

It doesn't. The API can be added in coreclr, and then exposed from the reference assembly and tested in corefx:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime.Extensions/ref/System.Runtime.Extensions.cs#L1155

But before a PR is submitted for any new APIs, we need to run it through the API review process.
https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/api-review-process.md

@kellypleahy
Copy link
Contributor

It doesn't. The API can be added in coreclr, and then exposed from the reference assembly and tested in corefx:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime.Extensions/ref/System.Runtime.Extensions.cs#L1155

OK... I get that an API review would be needed. Now, I just want to understand how this works mechanically. Are you saying that the API would be added and implemented in coreclr and we would only have the tests in corefx, or are you saying the implementation would somehow live in corefx?

I started down the path of implementing the support in the different FileSystem internal classes. However, then I realized I probably can't expose Path.Exists even if I wanted to using those as the implementation, and instead the implementation would need to live in coreclr.

@stephentoub
Copy link
Member

@kellypleahy, it can be confusing, I know.

When adding an API, there are potentially three pieces: the implementation (src), the reference assembly (ref), and the tests (tests) (and I say potentially three because for some libraries there is no ref). The latter two live in corefx; the former may also live in corefx, or for very core types may live in coreclr. With Path, the implementation is here in coreclr:
https://github.com/dotnet/coreclr/blob/master/src/mscorlib/shared/System/IO/Path.cs
The reference assembly is here in corefx:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime.Extensions/ref/System.Runtime.Extensions.cs#L1155
and the tests are here in corefx:
https://github.com/dotnet/corefx/blob/master/src/System.Runtime.Extensions/tests/System/IO/PathTests.cs

@karelz
Copy link
Member

karelz commented Jun 9, 2017

Here's rough info/notes about adding API into CoreCLR and exposing it in CoreFX: https://github.com/dotnet/corefx/wiki/Contributions#expose-api-across-coreclrcorefx

We will need API review first -- that requires formal API proposal.

@JeremyKuhne please add the "next-steps" info into up-for-grabs bugs in future (see triage rules). Also please use just 1 issue type.

who should I talk to about how to test the PR I want to do for this one? Specifically, I'm wondering about windows/winRT since I'm doing the changes on linux. I can test on windows as well as I have a machine with windows, but I don't have anything that does winRT.

We leave it on developers to decide if they want to test on 1 platform or more platforms locally. It is fine to test on just 1 platform (e.g. Linux) and look at results in CI for the other platforms (Windows, Mac, other Linux distros). I clarified it in new contributor notes "It is ok to test on just 1 platform locally and use CI to test the other platforms. Whatever is more convenient for contributor."
WinRT/UWP is platform which cannot be tested locally today. We are working on making it possible. I hope it will happen in next couple of weeks.

@kellypleahy
Copy link
Contributor

@karelz thanks for the clarification. Looking forward to working more on this stuff in the future and I'm just trying to get my bearings at this point. All this is very helpful. I'll see if there are tips we can distill into your "notes" on the wiki to help answer the same questions I'm having.

@JeremyKuhne
Copy link
Member Author

@kellypleahy Did you just want to work on the new API? Or did you want to fix the existing ones as well? I'm going to create a new issue for the existing improvements, that might be the right way to get started / familiar with the code.

@kellypleahy
Copy link
Contributor

@karelz Thanks for the comments about testing. Another possible recommendation (tell me if I should file an issue). You said...

WinRT/UWP is platform which cannot be tested locally today. We are working on making it possible. I hope it will happen in next couple of weeks.

It would be really nice if there were a way to run targeted tests in your CI pool for when we are working on a PR. For instance, being able to just push to our local branch before we create the PR and ask something to run tests for specific assembly on that branch (so we can iterate quickly if possible). Think of it sorta like the code/build/test cycle that we would use locally, except it can run tests for us with multiple target platforms.

@karelz
Copy link
Member

karelz commented Jun 10, 2017

@kellypleahy did you check the dotnet-bot? I think it has capabilities like that today. (use "@dotnet-bot help" to get full list of commands, options, etc.)

@karelz
Copy link
Member

karelz commented Jun 10, 2017

BTW: I plan to have link to the static dotnet-bot docs also in our docs -- I just didn't find it yet 😮
Here's the intent: https://github.com/dotnet/corefx/wiki/Contributions#small-change

@kellypleahy
Copy link
Contributor

use "@dotnet-bot help" to get full list of commands, options, etc.

@karelz where should I do that? I assume doing it in the issue isn't recommended, or will it just PM me the info? I just want to make sure before I do that I don't spam anyone :D

@karelz
Copy link
Member

karelz commented Jun 10, 2017

People do it regularly in PRs - e.g. dotnet/corefx#20819 (comment).
I don't like the spam either, but until we have static link to the dotnet-bot help source code linked from decent documentation for contributors, that's the best we have and you should feel encouraged to use it.

@karelz
Copy link
Member

karelz commented Jun 10, 2017

BTW: I updated the docs to: "@dotnet-bot docs - use command "@dotnet-bot help" in PRs, or search some older one"

@danmoseley
Copy link
Member

Hi @kellypleahy are you planning to take this one? If so I'll assign it. If not that's fine we might have someone else who can.

@kellypleahy
Copy link
Contributor

I think so... but other priorities have been in the way recently. If you have someone already ready to jump on this it might make sense to hand it off. I'm hoping to get back to this in a couple weeks or three.

@danmoseley
Copy link
Member

@kellypleahy OK. @JeremyKuhne this might be a candidate for one of your "team".

@JeremyKuhne JeremyKuhne self-assigned this Jul 26, 2017
@JeremyKuhne
Copy link
Member Author

Taking it for now as it should fit well with my team's current schedule.

@nchikanov nchikanov self-assigned this Aug 23, 2017
@karelz
Copy link
Member

karelz commented Aug 29, 2017

@JeremyKuhne @danmosemsft which API shape is the final for API review? The top post is a bit vague ...

@karelz
Copy link
Member

karelz commented Sep 5, 2017

Questions from API review:

  1. Is FileSystemInfo.Exists better place for the API?
  2. Would it return true for named pipes and similar things?
  3. Can it be implemented consistently and efficiently on both Windows and Linux? (esp. non-file resources)

@danmoseley
Copy link
Member

@JeremyKuhne note questions above.

@kellypleahy
Copy link
Contributor

My personal opinion is that FileSystemInfo.Exists is the best place for this.

Re: (3) I believe that, yes, it can be implemented in a performant way on all systems, except maybe for the windows RT/UWP platform (not sure about this one).

Re: (2) - I started down the path of making a more general API that would allow you to retrieve the type of a file system entry. I think this is a good thing to have, but I don't think it was ready to be published. That said, I think it is the right "future" for the API, since I often find myself having to resort to using clib / windows API functions to do this stuff, but it might be hard to spec that out at this point. So, I was assuming I'd start by implementing that underlying API (that would be internal/private) and then using that to support FileSystemInfo.Exists properly.

I'm sorry I've gotten too busy to help with this, but I would love to still provide feedback on API design if you'll have me ;)

@danmoseley
Copy link
Member

Past API complete for this one. It is worth pursuing I think. @JeremyKuhne would drive it along.

@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
@carlossanlop carlossanlop added this to To do in System.IO - File system via automation Mar 6, 2020
@JeremyKuhne JeremyKuhne removed their assignment Mar 25, 2020
@carlossanlop carlossanlop modified the milestones: 5.0.0, Future Jun 18, 2020
@danmoseley
Copy link
Member

@JeremyKuhne do you have opinions on API review feedback above?

@JeremyKuhne
Copy link
Member Author

Is FileSystemInfo.Exists better place for the API?

There is already a FileSystemInfo.Exists. Having to create multiple FileSystemInfo derived classes to check if a path exists is even more wasteful. Putting a new static there doesn't follow the existing usage. Path, Directory and File are the statics classes.

Would it return true for named pipes and similar things?

Whatever the OS says for the path that would be the logical amalgam of File.Exists and Directory.Exists.

Can it be implemented consistently and efficiently on both Windows and Linux? (esp. non-file resources)

Yes, if there is a question it should match File.Exists(path) || Directory.Exists(path). You can see that https://source.dot.net/#System.Private.CoreLib/FileSystem.Exists.Unix.cs,35 is effectively a "filter out directories" set of logic. You'd just trim out the special directory logic (the trim and the check for the directory flag). This would be an even more substantial perf benefit on Unix than Windows (removes a potential extra string allocation as well).

@danmoseley danmoseley added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-needs-work API needs work before it is approved, it is NOT ready for implementation labels Jan 19, 2022
@danmoseley
Copy link
Member

Thanks @JeremyKuhne. I've edited your proposal to match the standard format . @dotnet/area-system-io I've marked api-ready-for-review. Feel free to remove if you have concerns/alternative preferences.

@GSPP
Copy link

GSPP commented Jan 20, 2022

Can this method (or a variant of it) return whether an existing path is a file or a directory? That seems useful.

@adamsitnik adamsitnik self-assigned this Jan 20, 2022
@adamsitnik
Copy link
Member

@danmoseley thank you for improving the proposal! I've assigned myself and I'll get it moving from here.

@terrajobst
Copy link
Member

terrajobst commented Jan 25, 2022

Video

  • We considered returning whether it was a file or directory but
    • It seems rarely useful
    • It's more expensive in case of symlinks
namespace System.IO
{
    public static partial class Path
    {
        // Implementation is logically equivalent to File.Exists(path) || Directory.Exists(path)
        public static bool Exists([NotNullWhenAttribute(true)] string? path);
    }
}

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jan 25, 2022
@danmoseley danmoseley added the help wanted [up-for-grabs] Good issue for external contributors label Jan 25, 2022
@danmoseley
Copy link
Member

Up for grabs...

@adamsitnik adamsitnik removed their assignment Jan 25, 2022
@Tarun047
Copy link
Contributor

Hey Guys, I'd like to take up this issue, I've created a simple implementation here: Tarun047@81349ba

But I am not sure, where or if I should add tests.

@danmoseley
Copy link
Member

Hi @Tarun047 tests could go with existing Path tests? Thanks!

@adamsitnik
Copy link
Member

@Tarun047 To be able to consume the new API from the tests you need to add it to a reference assembly here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Runtime/ref/System.Runtime.cs

The Path tests can be found here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Runtime.Extensions/tests/System/IO/PathTests.cs

You can reuse some of the tests that were written for File.Exists and Directory.Exists:
https://github.com/dotnet/runtime/blob/main/src/libraries/System.IO.FileSystem/tests/File/Exists.cs
https://github.com/dotnet/runtime/blob/main/src/libraries/System.IO.FileSystem/tests/Directory/Exists.cs

I am OK with adding Path.Exists tests to src/libraries/System.IO.FileSystem/tests/

Here you can find how to run the tests: https://github.com/dotnet/runtime/blob/main/docs/workflow/testing/libraries/testing.md

Usually what I do is building the entire product and running selected tests from the repo root:

.\build.cmd -c Release -subset clr+libs+libs.tests
dotnet build .\src\libraries\System.IO.FileSystem\tests\System.IO.FileSystem.Tests.csproj /p:Configuration=Release /t:Test

System.IO - File system automation moved this from To do to Done Jan 31, 2022
@adamsitnik adamsitnik modified the milestones: Future, 7.0.0 Jan 31, 2022
@ghost ghost locked as resolved and limited conversation to collaborators Mar 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.IO help wanted [up-for-grabs] Good issue for external contributors tenet-performance Performance related issue
Projects
Development

No branches or pull requests