On Windows and macOS, Get-Item and Get-ChildItem report file names as specified, not their actual case #13190

Closed
mklement0 opened this issue Jul 16, 2020 · 43 comments
Labels
Area-FileSystem-Provider specific to the FileSystem provider Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a Resolution-No Activity Issue has had no activity for 6 months or more Waiting - DotNetCore waiting on a fix/change in .NET WG-Cmdlets-Management cmdlets in the Microsoft.PowerShell.Management module WG-Engine-Providers built-in PowerShell providers such as FileSystem, Certificates, Registry, etc.

Comments

@mklement0
Contributor

Note: The problem:

  • affects only files, not directories

  • for file paths, affects all path components.

Simple demonstration (macOS and Windows):

PS> (Get-Item $PSHOME/POWERSHELL.config.json).Name
POWERSHELL.config.json

Note how POWERSHELL.config.json was reported via .Name - exactly as specified - even though the actual casing of the filename is powershell.config.json

Steps to reproduce

Windows and macOS only (platforms with case-insensitive (but case-preserving) file-systems).

# Note: The problem occurs only with *files*, not directories.
#       With files, the problem occurs in all path components.
Describe "Get-Item / Get-ChildItem: non-wildcard file-name case fidelity test" {
  BeforeAll {
    Push-Location (Get-Item testdrive:/).FullName
    $nameActual = 'AB'
    $nameCaseVariant = 'aB'
    New-Item $nameActual # create as all-uppercase

    $testCases = 
      @{ Cmdlet = 'Get-Item'; Parameter = 'Path'; Property = 'Name' },
      @{ Cmdlet = 'Get-Item'; Parameter = 'Path'; Property = 'FullName' },
      @{ Cmdlet = 'Get-Item'; Parameter = 'LiteralPath'; Property = 'Name' },
      @{ Cmdlet = 'Get-Item'; Parameter = 'LiteralPath'; Property = 'FullName' },
      @{ Cmdlet = 'Get-ChildItem'; Parameter = 'Path'; Property = 'Name' },
      @{ Cmdlet = 'Get-ChildItem'; Parameter = 'Path'; Property = 'FullName' },
      @{ Cmdlet = 'Get-ChildItem'; Parameter = 'LiteralPath'; Property = 'Name' },
      @{ Cmdlet = 'Get-ChildItem'; Parameter = 'LiteralPath'; Property = 'FullName' }
  }
    
  It "<Cmdlet>: .<Property> should report '$nameActual' as the file name when literal case variant '$nameCaseVariant' is passed to <Parameter>" -Skip:$IsLinux -TestCase $testCases {
    param($Cmdlet, $Parameter, $Property) 
    $htArgs = @{ $Parameter = $nameCaseVariant }
    Split-Path -Leaf (& $Cmdlet @htArgs).$Property | Should -BeExactly $nameActual
  }
  AfterAll {
    Pop-Location
  }
}

Expected behavior

The tests should succeed.

Actual behavior

All tests fail.

Environment data

PowerShell Core 7.1.0-preview.5
@mklement0 mklement0 added the Issue-Question ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a label Jul 16, 2020
@vexx32
Collaborator

vexx32 commented Jul 16, 2020

@mklement0 does directly calling a .NET API to retrieve the name report it with the correct casing?

@mklement0
Contributor Author

mklement0 commented Jul 16, 2020

Good point, @vexx32 - indeed, the underlying .NET Core API - as well as .NET Framework - does the same:

PS> [System.IO.FileInfo]::new("$pshome/POWERSHELL.config.json").Name
POWERSHELL.config.json  # !! name as specified, not the actual casing

However, it does so for directories too - whereas PowerShell (Core only!) exhibits the desired behavior:

# .NET
PSonWin> [System.IO.DirectoryInfo]::new('c:\wINdows').Name
wINdows  # !! name as specified, not the actual casing

# PowerShell Core only (WinPS exhibits the behavior above).
PSonWin> (Get-Item C:\wINdows).Name
Windows  # OK, true casing

So the question is:

  • Is reporting the true names for directories just a happy implementation accident, or is it by design?

  • If the latter, do we want to provide the same behavior for files too?

@vexx32
Collaborator

vexx32 commented Jul 16, 2020

I'd imagine since we're interested in supporting Unix that we'd want casing to be accurate, aye.

I'm surprised .NET Core hasn't got that ironed out yet, it'll make things very complicated for Unix developers wanting to use those APIs 😬

@mklement0
Contributor Author

Well, it's not strictly a functional problem, because it only applies to platforms with case-insensitive file systems, notably macOS and Windows - on Linux, with its case-sensitive file system you have to supply the case-accurate representation to begin with, otherwise you won't find the file / directory.

Still, I imagine that users consistently expect to get the true casing of file-system items when they call Get-Item or Get-ChildItem - after all, they get objects back that describe the item's own properties, which should not depend on the particular path form you chose to locate that item.

I'm surprised that no one (to my knowledge, based on searching through the issues) has complained in the .NET Core repo about this.

@iSazonov
Collaborator

iSazonov commented Jul 20, 2020

I'd speculate that it is for performance reasons: re-combining the whole path is very expensive. See #9250 for an example.

@iSazonov iSazonov added the Resolution-By Design The reported behavior is by design. label Jul 20, 2020
@mklement0
Contributor Author

So there's no single system call on all platforms with case-insensitive file-systems that would give you the case-exact form of a path (e.g., returning C:\Windows\System32\APHostRes.dll for c:\windows\system32\aphostres.dll)?

Given that we currently do it - but for directories only - how do we do it?

@iSazonov
Collaborator

there's no single system call

Yes, the only way to get the original value is to read it from the system.

how do we do it?

See #9250. GetCorrectCasedPath() explicitly does this for directories.

@mklement0
Contributor Author

mklement0 commented Jul 20, 2020

Thanks for the link, @iSazonov.

I think it's more important for us to make Get-Item and Get-ChildItem exhibit consistent behavior in this respect than to worry about performance.

Performance is more likely to be a concern when enumerating file-system items, but here we're talking about targeting a given file (pattern), and wanting to know its true name.

Note that there are more inconsistencies:

# Get-ChildItem with -Filter: Exact case of the *full path*
PS> gci c:\windows\system32 -filter aphost*.dll | % FullName
C:\Windows\System32\APHostClient.dll
C:\Windows\System32\APHostRes.dll
C:\Windows\System32\APHostService.dll

# Get-ChildItem with -Path: Exact case of the *file name only*
PS> gci c:\windows\system32\aphost*.dll | % FullName
C:\windows\system32\APHostClient.dll
C:\windows\system32\APHostRes.dll
C:\windows\system32\APHostService.dll

Again: If I use Get-Item and Get-ChildItem, I expect to get information objects that reflect the item's true name and path, not whatever case variation I happen to have used to identify the item.

@iSazonov
Collaborator

Personally, I strongly dislike #9250 and GetCorrectCasedPath(). I think if a system is case-insensitive we should follow that: accept a path as the user typed it, and expose an enumerated path in the case in which it is stored in the system. It is less expensive and more predictable. cmd.exe does so:

dir c:\windows
 Volume in drive C has no label.
 Volume Serial Number is 8861-77AF

 Directory of c:\windows

but PowerShell does extra work:

dir c:\windows

    Directory: C:\Windows

@mklement0
Contributor Author

I think if a system is case-insensitive we should follow

We already do honor this for referring to paths and would continue to do so, but since these file systems are also case-preserving, we should honor that too, by reporting the actual name/path when explicitly requesting information about an item.

Yes, cmd.exe and .NET act differently - but we can do better:

The extra work that PowerShell already does for directories is helpful - let's do the work for files too.

Again: The incidental form of a path I use to refer to an item of interest (case variations, relative vs. absolute path) should not change how its innate properties are being reported.

@iSazonov
Collaborator

The extra work that PowerShell already does for directories is helpful

Helpful?
If I type "c:\WinDowS" it is my strong intention to work exactly with "c:\WinDowS". And it works! Why should the system change this to "C:\Windows" for me?
It seems we even have an issue (or several) about this. Like why this gets expanded:

Get-Item temp:\

    Directory: C:\Users\1\AppData\Local

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
d----          20.07.2020    22:45                Temp

@iSazonov iSazonov removed the Resolution-By Design The reported behavior is by design. label Jul 20, 2020
@mklement0

This comment has been minimized.

@mklement0
Contributor Author

Joking aside, @iSazonov, my hope was that what I said above,

The incidental form of a path I use to refer to an item of interest (case variations, relative vs. absolute path) should not change how its innate properties are being reported.

makes it generally clear why the behavior is helpful, but let me address your specific example:

If I type "c:\WinDowS" it is my strong intention to work exactly with "c:\WinDowS".

Being able to use a case variation is a convenience that has two advantages:

  • you don't have to know what the target item's true name or path is in terms of case - even if you accidentally get the case wrong, the item will still be found.

  • for typing convenience you can therefore use an all-lowercase version of the name or path, c:\windows in this case.

So, no, you most likely wouldn't use the cumbersome-to-type form c:\WinDowS - and if you did, you certainly shouldn't expect the information about the item you have merely referenced (identified) by this case variation to reflect this - incidental - variation: there is no item named WinDowS in C:\, only one named Windows.

Why should the system change this to "C:\Windows" for me?

Conceptually, there is no change here: just a truthful reporting of the name as it is actually stored in the filesystem.

Also note that the pre-#9250 behavior actually caused bugs (e.g. vuejs/vue-cli#648 (comment)), albeit in the context of Set-Location.

Even cmd.exe does the right thing with cd: cd /d c:\windows changes to C:\Windows (true case).


Also, I don't understand your Get-Item temp:\ example: the name of the PS drive, temp:, has no relationship with the name of the directory that is its root directory.

@iSazonov
Collaborator

@mklement0 My point is that on Windows we never pay attention to path case. It has been this way for many years in cmd.exe and in PowerShell too. I never understand when someone (it is not about you) tries to make Windows from Unix and vice versa - each system is good at its own area and if we follow the nature of a system, then we can get more benefits than from limited and expensive imitation.

@SeeminglyScience
Collaborator

tries to make Windows from Unix and vice versa - each system is good at its own area and if we follow the nature of a system

I don't think he's asking for resolution to be case sensitive, just for the object to report its real name. Aside from just being kind of annoying, it's also problematic when creating files based on another file (e.g. gi somethingpascalcase.txt|% { cpi $_.FullName "$($_.FullName).bk" }).

I don't have an opinion on if it's ultimately a good idea, but it's not just @mklement0 . It's very aesthetically annoying.

@mklement0
Contributor Author

Indeed, @SeeminglyScience, thanks for clarifying.

@iSazonov, to be clear, this has absolutely nothing to do with Unix, and everything to do with case-insensitive file-systems, which happen to be the default kind of file-system on Windows and macOS.

If by Unix you mean case-sensitive file-systems, such as on Linux: there, the problem at hand by definition never arises, because the only way to refer to a file or directory is by its case-exact name / path.

@iSazonov
Collaborator

I only mentioned Unix to say that it is pointless to drag something to Windows that works great on Unix because in most cases it will not work well. Each system is good in its area.

Again, it is my strong belief that the input paths (as any input data) should remain unchanged. This is exactly what users generally expect. We shouldn't expand .\, ~\, and maybe temp:\, and we should preserve this in History. We have some issues for this.

As for:

It's very aesthetically annoying.

I ask why do you allow unaesthetic typing for yourself? On the console, you will see everything that you typed in an unaesthetic form. Want aesthetics - print it "right"! If a user types c:\winDows then this is his preference and probably aesthetic for him.

PowerShell is already extremely slow due to the fact that it does a lot of extra work that is often unnecessary (globbing, slash normalization and more).

@SeeminglyScience
Collaborator

SeeminglyScience commented Jul 23, 2020

I ask why do you allow unaesthetic typing for yourself?

Easier sometimes.

On the console, you will see everything that you typed in an unaesthetic form. Want aesthetics - print it "right"! If a user types c:\winDows then this is his preference and probably aesthetic for him.

If you were going to argue that most people don't care, I don't really have a response for that, and it might be true. But what possible reason would someone have to want to see it that way? Also, winDows is a weird example. Most of the time we're talking about all lowercase because it's easier.

PowerShell is already extremely slow due to the fact that it does a lot of extra work that is often unnecessary (globbing, slash normalization and more).

This would only apply to non-wildcard paths for a single item. An extra half millisecond isn't likely to make a big difference.

@iSazonov
Collaborator

I assume we could have an option for the file provider to enable this feature, but it should be disabled by default. But really, if we are talking about aesthetics, this must be moved to the Formatting System. It would be astonishing to sacrifice performance for aesthetics, which are pointless for a script.

The main PowerShell principle is to not limit users in their capabilities. Here we impose the transformation on the users.

An extra half millisecond isn't likely to make a big difference.

We do not know how users could use Get-Item. Really, the normalization code is on a hot path (it normalizes every part of the path - for c:\a\b\c\d\file we will have at least 5 extra disk operations) and this always slows down scripts.
I am opposed to a frequently used operation running slower. Again, it would be astonishing to sacrifice performance for aesthetics, which are pointless for a script.

@SeeminglyScience
Collaborator

I assume we could have an option for the file provider to enable this feature, but it should be disabled by default. But really, if we are talking about aesthetics, this must be moved to the Formatting System. It would be astonishing to sacrifice performance for aesthetics, which are pointless for a script.

Eh, it's less of a problem for me when it's being displayed and more when the FileInfo object is being used to create something else. I still don't like it displayed wrong, but I don't think that's the worst part.


Honestly all I personally want is some command to get the correct path. afaik the only way to do it is dipping into p/invoke which never feels worth it interactively (or even as a profile function).

I'll let @mklement0 debate the rest, I just wanted to mention it's not just him annoyed by this.

@SeeminglyScience
Collaborator

Actually I think a CodeProperty is all I want. Leave it out of the default formatting so it's only calculated when asked for. I can change my formatting if I want, and I can use that property for creating new things based on that thing.

It'd be nice to have it "fixed" but 🤷 the above is fine imo. @mklement0 feel free to continue on if you disagree, I'm gonna dip from this thread.
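
For illustration only, an on-demand property along these lines can be approximated today at script level with Update-TypeData. This is a sketch, not a proposed implementation: it uses a ScriptProperty as a stand-in for a CodeProperty, the property name CaseExactName is hypothetical, and it corrects only the leaf name by asking the parent directory for its entries.

```powershell
# Sketch: add an on-demand property to FileInfo objects.
# 'CaseExactName' is a hypothetical name, not part of PowerShell.
Update-TypeData -TypeName System.IO.FileInfo -MemberType ScriptProperty -MemberName CaseExactName -Force -Value {
  # The name as the caller typed it (possibly a case variation).
  $typedName = $this.Name

  # Enumerate the parent directory; enumeration yields names as stored on disk.
  $stored = $this.Directory.EnumerateFiles() |
    Where-Object { $_.Name -ieq $typedName } |
    Select-Object -First 1

  # Fall back to the typed name if nothing matches.
  if ($stored) { $stored.Name } else { $typedName }
}
```

With this loaded, something like (Get-Item $PSHOME/POWERSHELL.config.json).CaseExactName performs the directory lookup only when the property is accessed, so default formatting is unaffected.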

@iSazonov
Collaborator

Honestly all I personally want is some command to get the correct path. afaik the only way to do it is dipping into p/invoke which never feels worth it interactively (or even as a profile function).

Sorry I tired you :-)

  • "the correct path" - the only way to detect whether a path is correct is to call the file system. And a path that differs only in case is correct on Windows. Really, you want the path as it was stored.
  • "the only way to do it" - you can use explicit enumeration. The following works: Get-ChildItem c:\ -Filter "windows"

@SeeminglyScience
Collaborator

Sorry I tired you :-)

You didn't, I never felt strongly about it in the first place.

  • "the correct path" - the only way to detect whether a path is correct is to call the file system. And a path that differs only in case is correct on Windows. Really, you want the path as it was stored.

Call it what you like I guess 🤷 I'll keep calling it correct.

  • "the only way to do it" - you can use explicit enumeration. The following works: Get-ChildItem c:\ -Filter "windows"

Yeah same problem, really heavy to do it all the way through. It'd be nice as a code prop.
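
To make "all the way through" concrete, here is a rough sketch of the per-component enumeration as a profile function. The function name Get-CaseExactPath is hypothetical, not an existing cmdlet. It assumes a FileSystem path without . or .. components, leaves the root (e.g. the drive letter) as typed, and costs one directory enumeration per path component, which is exactly the overhead being debated above.

```powershell
# Sketch only: resolve the stored casing of each path component by enumeration.
function Get-CaseExactPath {
  param([Parameter(Mandatory)][string] $LiteralPath)

  # Convert to a full native path without requiring that it exists.
  $fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath($LiteralPath)

  # Split into the root and the remaining components.
  $root = [System.IO.Path]::GetPathRoot($fullPath)
  $separators = [char[]] @([System.IO.Path]::DirectorySeparatorChar, [System.IO.Path]::AltDirectorySeparatorChar)
  $segments = $fullPath.Substring($root.Length).Split($separators, [System.StringSplitOptions]::RemoveEmptyEntries)

  $result = $root   # assumption: the root is kept as typed
  foreach ($segment in $segments) {
    # One enumeration per component; entries carry their names as stored.
    $candidates = @([System.IO.DirectoryInfo]::new($result).EnumerateFileSystemInfos() |
      Where-Object { $_.Name -ieq $segment })
    if ($candidates.Count -eq 0) { throw "Path not found: $LiteralPath" }

    # Prefer an exact-case match (relevant on case-sensitive file systems).
    $match = $candidates | Where-Object { $_.Name -ceq $segment } | Select-Object -First 1
    if (-not $match) { $match = $candidates[0] }

    $result = [System.IO.Path]::Combine($result, $match.Name)
  }
  $result
}
```

A call such as Get-CaseExactPath c:\windows\system32\aphostres.dll walks three directories to correct three components, which is why this approach gets heavier as paths get deeper.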

@iSazonov
Collaborator

iSazonov commented Jul 23, 2020

It'd be nice as a code prop.

I will open a PR. I hope you vote for it and the PR gets approved. :-)
What should the property name be? PSProviderPath?

@iSazonov iSazonov self-assigned this Jul 23, 2020
@SeeminglyScience
Collaborator

What should the property be? PSProviderPath?

Hmm that's a good question... I do like PSProviderPath. Though I guess the question is will it have resolved PSDrives to real provider paths? If so definitely PSProviderPath ❤️

@iSazonov
Collaborator

will it have resolved PSDrives to real provider paths? If so definitely PSProviderPath

Since we read from disk, we have to resolve to an absolute path, like (Get-Item .).PSPath but without the prefix.

@SeeminglyScience
Collaborator

Perfect 🙂

@mklement0
Contributor Author

I appreciate the willingness to tackle this, but I don't think we need a separate property:

Unless I'm missing something, the performance impact of doing the right thing automatically should be negligible, because the overhead of determining the case-exact path only ever needs to be incurred when creating a [System.IO.FileInfo] instance from a path supplied by the user. Therefore, this applies only to file paths (because we already do the right thing for directories), as follows:

  • for literal paths in full (e.g., c:\windows\odbc.ini should return a [System.IO.FileInfo] instance with .FullName C:\Windows\ODBC.INI); that is, all path components need to be corrected.

  • for wildcard-based paths, only if the wildcard pattern is limited to the file-name (leaf) component; e.g., Get-Item c:\windows\odbc.in* reports c:\windows\ODBC.INI, i.e. only the file name is reported case-exactly. (By contrast, if the wildcard is (also) in a parent path component, things already work as expected; e.g., Get-Item c:\window*\odbc.ini -> C:\Windows\ODBC.INI).

(As stated, we already do this for [System.IO.DirectoryInfo], and as such using something like Get-ChildItem <dir> - even with -Recurse - automatically gives us case-exact [System.IO.FileInfo] instances already.)

Notably, determining the case-exact name is not necessary as part of the path normalizations we seemingly always perform.
That is, something like Get-Content c:\windows\odbc.ini does not need to care that the case-exact form of this file's path is C:\Windows\ODBC.INI - all that matters here is that the file can be located for reading its contents.

@vexx32
Collaborator

vexx32 commented Jul 25, 2020

I'm kind of lost at this point as to what you think is actually needed here, @mklement0.

@mklement0
Contributor Author

It's spelled out in detail in the 2nd paragraph and the two associated bullet points, @vexx32.

To summarize: action is only ever needed if (a) [System.IO.FileInfo] instances - i.e. file-info objects - are constructed and then only if (b) they are constructed from user-specified paths. The bullet points then detail which path components need case corrections, depending on whether the user-specified path is a literal or a wildcard-based one.

Please tell me which aspect lacks clarity.

@vexx32
Collaborator

vexx32 commented Jul 25, 2020

That makes sense! Sorry, got a bit lost amongst all your emphasis, I couldn't tell what was actually the main point there. Appreciate the clarification! 🙂

@mklement0
Contributor Author

😁 Granted, sometimes I can slip into putting too much emphasis on the emphasis (if you will), but the rationale behind doing so in my penultimate comment was to highlight that I don't think there's a performance concern here, and that it therefore shouldn't drive the implementation - a direct solution that automatically does the right thing is much more convenient than having to be aware of the problem to begin with and then knowing what alternate property to consult (and having the burden of needing to do so).

@iSazonov
Collaborator

Also note that the pre-#9250 behavior actually caused bugs (e.g. vuejs/vue-cli#648 (comment)), albeit in the context of Set-Location.

It is a bug in node.js/vue-cli. The issue was closed without investigations by the app owners.
This is another argument for reverting #9250.

Unless I'm missing something, the performance impact of doing the right thing automatically should be negligible

After #9250 the FileSystem provider performs the normalization for every path again and again. It is an extra operation for scripts. Why should my script now run slower just because somebody wants "nice output"?
If a user wants "right" output, it should be in the Formatting System.

that is, all path components need to be corrected

This makes no sense for scripts. If we do it like #9250, then our file operations will be extremely slow. And this is only for the sake of a "beautiful" output in rare cases?
If a user types in the console, they will most likely prefer to use tab-completion and get the "right" case.

@vexx32
Collaborator

vexx32 commented Jul 27, 2020

As @mklement0 said, this current issue would be a small change. No additional action would be needed for the vast majority of cases. It would only be where we're finding files based on direct user input that is affecting case of returned results.

@mklement0
Contributor Author

mklement0 commented Jul 27, 2020

And this is only for the sake of a "beautiful" output in rare cases?

Let me put it as simply as possible: PowerShell is lying to you if you ask for an object describing the Windows directory and that object reports the directory's name as winDOws, just because you happened to refer to the directory using path c:\winDOws.

The directory's true name is Windows - no matter how you're also allowed to refer to it, and that's what you want to know when you explicitly ask for an object that describes the item's properties.

Yes, other shells and even .NET are lying to you too, but that's no excuse for us not to do better, especially given that we've half done so already (even if the motivation may have been different originally).

Again:

  • This only matters if you ask for an object describing the item, and it is not just a display problem, and @vexx32 has just reiterated that this would not require any change in the vast majority of cases - certainly not as part of every path normalization.

    • (I don't think we necessarily need to worry about edge cases such as the PSPath provider property values that Get-Content decorates lines read from a file with - but even there it would be one case-correction operation per input file)
  • While you could argue that vuejs/vue-cli#648 ("Warning for brand new project instance: There are multiple modules with names that only differ in casing") was a bug, it is no accident that cmd.exe - which otherwise also lies to you - also case-corrects when you use the cd command; e.g., cd c:\windows makes c:\Windows the current directory.

@iSazonov
Collaborator

certainly not as part of every path normalization.

So you agree that #9250 should be reverted, and that the fix you are asking for should live in another place and normalize a path "only if you ask for an object describing the item"?

@mklement0
Contributor Author

mklement0 commented Jul 28, 2020

@iSazonov, I hadn't looked closely at #9250 before, but I agree that it's unnecessary to call GetCorrectCasedPath for every directory-path normalization.

To summarize, this means that case correction is only required in the following scenarios:

  • Whenever a [System.IO.DirectoryInfo] or [System.IO.FileInfo] instance is constructed directly from a user-specified path (Get-Item, Get-ChildItem).

  • Whenever a file-system container (directory) is made the current location (Set-Location)

    • Incidentally: just like cmd.exe also case-corrects in this scenario (cd), so does [environment]::CurrentDirectory - but only on macOS, not on Windows.
  • Possibly, we could additionally do users the courtesy of case-correcting paths in ETS properties used to decorate output objects, such as in the PSPath and related properties added by Get-Content - I think that would make sense for consistency, but it's less important, I think.

@iSazonov
Collaborator

iSazonov commented Aug 19, 2020

Proposed API in the .NET runtime: dotnet/runtime#14321

For Windows we could use a faster workaround

/cc @SteveL-MSFT for tracking the .NET issue

Retrieves the final path for the specified file.

@iSazonov iSazonov added WG-Engine-Providers built-in PowerShell providers such as FileSystem, Certificates, Registry, etc. Waiting - DotNetCore waiting on a fix/change in .NET labels Aug 19, 2020
@iSazonov iSazonov removed their assignment Aug 19, 2020
@mklement0
Contributor Author

@iSazonov, thanks for the link.

As for the workaround: GetFinalPathNameByHandle() resolves symlinks to their ultimate targets, which we don't want to do by default; asking for the properties of a symlink with Get-Item should return the symlink's name in .Name; ditto with parent paths that have symlink components (even if the target item itself is not a reparse point).

Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

@microsoft-github-policy-service microsoft-github-policy-service bot added the Resolution-No Activity Issue has had no activity for 6 months or more label Nov 16, 2023
Contributor

This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes.

4 participants