Add API to get actual file casing in path #14321
Comments
NTFS is case-sensitive; it's Windows which makes it look bad. :) Thank you @ellismg. But honestly, I opened this issue in the coreclr repo because I think the solution should come from the lowest possible API, instead of C# manipulation of canonical paths or making repeated expensive IO operations from a distance. I still stand by and vote for the coreclr option. No solution is better than a pure managed solution (like the present one), so people don't start relying on a badly performing algorithm out of ignorance. CoreCLR solution = C/C++ = fast performance
@jasonwilliams200OK For .NET Core, the source for the lowest-level path APIs is in the CoreFX repository. The implementation in mscorlib.dll in CoreCLR is not what we expose as part of .NET Core and will be removed if possible. I don't fully understand your comment about a CoreFX solution having worse performance than one in CoreCLR. In either case, we would likely implement the feature the same way, by PInvoking to the relevant OS APIs. The native portions of the runtime are not used for any of the File APIs that are exposed to managed code.
@ellismg, thanks for expanding on the general workflow here. But unfortunately, this particular issue focuses on the limitation of Windows API, where it does not emit the actual casing to the application tier. To overcome this problem, we have two options in .NET:
This is my understanding. I hope there exists a better way (O(1)) to query the file-system for the exact path for .NET Core that I am unaware of, otherwise recursive walking will be a perf. nightmare. |
One crazy way to solve it quickly is to shell out to cmd commands:
e:\> c:
c:\> cd c:\sharepoint\scriPts
C:\SharePoint\scripts> echo %cd%\MasterDeployment.ps1
:: returns C:\SharePoint\scripts\MasterDeployment.ps1
or as a one-liner:
c: && cd c:\sharepoint\scriPts && echo %cd%\MasterDeployment.ps1
Update: The above solution was not resolving the filename. So here is a working solution:
e:\> c:
c:\> cd c:\sharepoint\scriPts
C:\SharePoint\scripts> set var1=%cd%
C:\SharePoint\scripts> for /f "delims=" %A in ('dir maStErdeployment.pS1 /B') do set "var2=%A"
dir maStErdeployment.pS1 /B
C:\SharePoint\scripts> echo %var1%\%var2%
:: returns C:\SharePoint\scripts\MasterDeployment.ps1
Note it will not work, had you issued But issuing shell commands is a bad workaround.
With dotnet/corefx#2219, I added As you can see in dotnet/corefx@36b4113, the bodies of the methods vary because the filename does not get normalized by the underlying native method, so we have to do a little extra work to get the "filename part" casing correct in case of @ellismg, I didn't get any feedback on this for months, so I decided to take a stab. Is there another, smarter way to skip this part, as this is not a full-fledged API but a simple (but tricky) convenience method?
@JeremyKuhne, since I have noticed you had been involved in some path-related features, can you please review this one? Do you have any objections or suggestions on implementing this feature like 36b4113? Please feel free to criticize, I would love to have this functionality in CoreFX someday. |
@JeremyKuhne, it seems Roslyn had a similar request for an API that gets the canonical path, i.e. resolves sym links, gets the correct casing etc. What's your take?
I'm not clear on what such an API would mean in a typical Unix context, where file AbC is unrelated to file ABc. It'd just return the original supplied string without modification? public static string GetActualCasing(string path) { return path; } ? |
@jasonwilliams200OK, @terrajobst: We do need stuff along these lines, what I was going to suggest was Path.GetFinalPath to match the semantics we're already using (e.g. aligning with the File Management APIs in Windows). Essentially it would just be a call to GetFinalPathNameByHandle() on Windows and realpath() on Linux. Symbolic links will be resolved and the file will have to exist. @stephentoub I think this api/behavior would address what you're bringing up. I don't think there is a way to specify files with non canonical casing even when using case-insensitive file systems (NTFS) on Unix- but I'm certainly not positive. Anyone know? As far as not resolving sym links I'm not sure the best available APIs to make this happen and what we would call it. I suppose if we had a Path.GetCanonicalPath instead of GetFinalPath (matching Linux/Java semantics) we could add an overload to not resolve sym links? Note that on Windows you can find the right casing by walking DirectoryInfo/FileInfo objects. The file name that comes back from FindFirst/NextFile is always in the right case. |
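The directory-walking approach mentioned at the end of the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the CoreFX implementation: `PathCasing` and `GetActualCasing` are hypothetical names, symbolic links are not resolved, the root (drive letter or `/`) is returned as supplied, and a segment containing wildcard characters (possible on Unix file systems) is not handled.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not an existing .NET API.
public static class PathCasing
{
    // Rebuilds the path one segment at a time, taking each name from the
    // directory enumeration, which reports names as stored on disk.
    // Throws FileNotFoundException if any segment does not exist.
    public static string GetActualCasing(string path)
    {
        string full = Path.GetFullPath(path);
        string root = Path.GetPathRoot(full) ?? string.Empty;

        string[] segments = full.Substring(root.Length).Split(
            new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar },
            StringSplitOptions.RemoveEmptyEntries);

        // Assumption: the root is kept exactly as supplied.
        string result = root;
        foreach (string segment in segments)
        {
            // Using the segment as the search pattern asks the file system
            // for the matching entry in the parent directory.
            FileSystemInfo[] matches = new DirectoryInfo(result).GetFileSystemInfos(segment);
            if (matches.Length == 0)
                throw new FileNotFoundException("Path does not exist.", full);

            result = Path.Combine(result, matches[0].Name);
        }
        return result;
    }
}
```

Each segment costs one directory query, so the work grows with the depth of the path, which is the performance concern raised earlier in the thread.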
@JeremyKuhne, during this exercise: dotnet/corefx@36b4113 one thing I figured out was that we can get correct casing with |
Hmm- I hadn't seen that behavior when I was playing with GetLongPathName- I'll take a closer look. The Windows file system folks mentioned that there was a way to get the correct cased name without following sym links- I'll ping them again to see if I can drag out details. |
@jasonwilliams200OK - The thing I was missing was FILE_FLAG_OPEN_REPARSE_POINT. GetFinalPathNameByHandle will give you the real canonical name on whatever file you open. If you pass the flag above when calling CreateFile it won't follow the reparse points and will, instead, open the actual link. |
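A rough, Windows-only sketch of the sequence described above (CreateFile with FILE_FLAG_OPEN_REPARSE_POINT, then GetFinalPathNameByHandle, then stripping the `\\?\` prefix) might look like this. The class and method names (`FinalPath`, `GetFinalPath`) and the `followLinks` parameter are assumptions made for illustration, and error handling is kept minimal.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using System.Text;
using Microsoft.Win32.SafeHandles;

// Windows-only illustration; FinalPath and GetFinalPath are hypothetical names.
public static class FinalPath
{
    private const uint FILE_FLAG_BACKUP_SEMANTICS = 0x02000000;   // needed to open directories
    private const uint FILE_FLAG_OPEN_REPARSE_POINT = 0x00200000; // open the link itself, not its target

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern SafeFileHandle CreateFileW(
        string lpFileName, uint dwDesiredAccess, FileShare dwShareMode,
        IntPtr lpSecurityAttributes, FileMode dwCreationDisposition,
        uint dwFlagsAndAttributes, IntPtr hTemplateFile);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern uint GetFinalPathNameByHandleW(
        SafeFileHandle hFile, StringBuilder lpszFilePath, uint cchFilePath, uint dwFlags);

    public static string GetFinalPath(string path, bool followLinks)
    {
        uint flags = FILE_FLAG_BACKUP_SEMANTICS | (followLinks ? 0u : FILE_FLAG_OPEN_REPARSE_POINT);

        // Desired access 0: the handle is only used to query the name.
        using (SafeFileHandle handle = CreateFileW(path, 0, FileShare.ReadWrite | FileShare.Delete,
            IntPtr.Zero, FileMode.Open, flags, IntPtr.Zero))
        {
            if (handle.IsInvalid)
                throw new Win32Exception(Marshal.GetLastWin32Error());

            var buffer = new StringBuilder(260);
            uint length = GetFinalPathNameByHandleW(handle, buffer, (uint)buffer.Capacity, 0);
            if (length >= buffer.Capacity)
            {
                // Buffer too small: the return value is the required size.
                buffer.EnsureCapacity((int)length + 1);
                length = GetFinalPathNameByHandleW(handle, buffer, (uint)buffer.Capacity, 0);
            }
            if (length == 0)
                throw new Win32Exception(Marshal.GetLastWin32Error());

            // Clean up the extended-length prefix on the returned name.
            string result = buffer.ToString();
            if (result.StartsWith(@"\\?\UNC\", StringComparison.Ordinal))
                return @"\\" + result.Substring(8);
            if (result.StartsWith(@"\\?\", StringComparison.Ordinal))
                return result.Substring(4);
            return result;
        }
    }
}
```

Passing `followLinks: false` corresponds to the FILE_FLAG_OPEN_REPARSE_POINT case described in the comment; passing `true` lets CreateFile follow reparse points before the name is queried.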
@JeremyKuhne, thanks for the info! And good to know that Win32 API provides the functionality. :) Would it make sense to change the behavior of existing This way, we wouldn't be needing an additional method / property for this; only fixing the behavior of |
@jasonwilliams200OK The Name property should already be in the proper case. My only concern would be adding hidden perf impact as we'd have to make a call to CreateFile, then GetFinalPathNameByHandle, then clean up the prefixing. I suppose we could optimize that away when we're starting from an Info class so probably not too bad... I'll dig in a bit more as I get a chance. |
We need a formal API proposal. Does anyone want to pick it up? |
The Java equivalent of this API request is Path.toRealPath(...). It is important that it be possible to get the properly-cased path without resolving symbolic links (an option which Java provides). Although, browsing the Java source code, it's not clear to me that it looks up properly-cased filenames on a case-insensitive filesystem (ex. vfat) in Linux... and I'm not certain that it's actually possible. Even |
@carlreinke if you would like to make an API proposal (just picking on you since perhaps you have an interest) the process is here: https://github.com/dotnet/corefx/blob/master/Documentation/project-docs/api-review-process.md It would probably be clearest to close this and make a new issue for the proposal, since then it can be the top post. |
Original proposal is too specific to If we think of what else is missing then combining it with features of UNIX
Proposal:
public static class Path
{
    public static string GetRealPath(string path);
    public static string GetRealPath(ReadOnlySpan<char> path);
    public static string GetRealPath(ReadOnlyMemory<char> path);
}
On case-sensitive file systems, Either this is a superset of one of https://github.com/dotnet/corefx/issues/25569 or https://github.com/dotnet/corefx/issues/24685, or vice versa and one of them folds in this functionality. |
Triage: We want to do this, but there are questions around performance (as we have to walk the full path) and what to do for paths that don't exist or are not accessible. We also need to understand what to do with drive letters and validate casing requirements for device paths (e.g. |
PowerShell needs this and currently uses a workaround GetCorrectCasedPath() |
If an OS allows using case-insensitive paths, why should it be considered a problem if the specified path string does not have the exact same casing as the actual path? It is allowed behavior. What would be the benefit of fixing this? This comment in the linked PowerShell issue offers a related explanation:
Also, suggesting a cross-platform API to get the real path would not make sense in Unix, as it was pointed out here. |
Beyond the extent of any APIs it can provide to help this issue, the OS should really not be a consideration here. Case sensitivity is an attribute of the file system, not the operating system. Apple's file systems have always supported both due to legacy Mac OS's case sensitivity. EXT4 supports toggling case insensitivity at the directory level. Linux (and notably
The scenario that prompted me to follow this issue in the first place is largely the third problem, but I've run into all three. A project I'm currently working on involves parsing C++ code using the Clang compiler. From a high level view, one aspect of that process looks like this:
The "if they correspond" bit is the source of some whacky edge-case bugs right now. The main issue is that the paths I get back from Clang might not be the same paths I originally put in because files can be included more than once in C++. Currently we assume everything is case-insensitive, and this will do the right thing in 99.9% of cases. If you want to read about those 0.1% of cases I did a more detailed quick-and-dirty writeup of the issues we face here: MochiLibraries/Biohazrd#1 (comment) I definitely see this API as something for solving weird edge cases. You should not normally need it, but when you do it's annoying to get right. The processing of paths has always been a huge source of edge case bugs in software (see countless security issues caused by directory traversal attacks, the need for |
In the PowerShell repository, the main motivation was "users want nice output" like Windows Explorer gives. It was approved and implemented by the PowerShell MSFT team.
As mentioned, it is file system behavior. And we should consider both the local scenario (mounting NTFS on Unix and ext4 on Windows) and the remote scenario (I mean PowerShell can connect to a remote computer; what is the expected behavior in that case?). /cc @mklement0 |
A good use case for this API (maybe even the most important one) is interaction with tools like |
I was in a situation where I needed to perform some operations on a bunch of files except those specified by the user. On case-insensitive file systems, exceptions were to be matched case-insensitively, i.e. "file" would match "file", "File", and "FILE". On case-sensitive file systems, exceptions needed to match exactly, i.e. "file" would only match "file" and not "File" or "FILE". I want to point out that the For example, macOS is case-insensitive by default, and Windows NTFS also supports this, enabled per-directory with Most common Linux filesystems are case-sensitive by default. But there's a diverse ecosystem of file systems out there for Linux, I wouldn't be surprised if one of them is case-insensitive (take NTFS on Linux). Therefore I would say assuming case sensitivity of the OS is probably not a robust design. Depending on your use case, you might be able to get away with an explicitly case-insensitive |
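Since case sensitivity can differ per file system and even per directory, one way to avoid assuming it from the OS is to probe the directory in question. The following is a minimal sketch of that idea, assuming the caller has write access to the directory; `CaseSensitivityProbe` and `IsCaseSensitive` are hypothetical names, not an existing .NET API.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not an existing .NET API.
public static class CaseSensitivityProbe
{
    // Creates a temporary lower-case file and checks whether the same name
    // in upper case resolves to it. Requires write access to the directory.
    public static bool IsCaseSensitive(string directory)
    {
        string name = "probe-" + Guid.NewGuid().ToString("N");
        string lower = Path.Combine(directory, name.ToLowerInvariant());
        string upper = Path.Combine(directory, name.ToUpperInvariant());

        File.Create(lower).Dispose();
        try
        {
            // If the upper-case spelling finds the file, lookups in this
            // directory ignore case.
            return !File.Exists(upper);
        }
        finally
        {
            File.Delete(lower);
        }
    }
}
```

A caller matching user-supplied exceptions could then pick `StringComparison.Ordinal` or `StringComparison.OrdinalIgnoreCase` based on the result for the directory being processed, rather than on the operating system.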
Per @jasonwilliams200OK in dotnet/coreclr#390
Based on this answer: http://stackoverflow.com/a/81493/1712065 (further redirected from: http://stackoverflow.com/a/326153).
Please add the ability to retrieve the path with its actual casing via the FileInfo and DirectoryInfo classes, the candidate members being FullPath and Name. Perhaps there is some sophisticated way of getting it from the Win32 file system API, but that seems to be a working solution.
Expected:
Actual result: