Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Property-based pipeline-parameter binding should be applied to argument-based parameter binding as well #6057

Closed
mklement0 opened this issue Jan 28, 2018 · 22 comments
Labels
Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif Resolution-No Activity Issue has had no activity for 6 months or more WG-Cmdlets-Core cmdlets in the Microsoft.PowerShell.Core module

Comments

@mklement0
Copy link
Contributor

mklement0 commented Jan 28, 2018

Note: Fixing this would be a breaking change, but arguably one that falls into Bucket 3: Unlikely Grey Area.

Currently, when you use a ValueFromPipelineByPropertyName parameter attribute, the property named there is only bound when using pipeline input and not also when passing the same object as a direct argument.

To illustrate the discrepancy with a simple example:

# Declare a function with a parameter that binds pipeline input by the value of 
# property .LiteralPath or its alias .PSPath
function foo { 
  param(
   [Parameter(ValueFromPipelineByPropertyName)]
   [Alias('PSPath')]
   [string[]] ${LiteralPath}
  ) 
  "`$LiteralPath: [$LiteralPath]" 
}

# Create object with a .PSPath property.
$obj = [pscustomobject] @{ PSPath = 'pspath' }

# Pass the object to the function, first via the pipeline, then as a direct argument
$obj | foo
foo -LiteralPath $obj

The output shows that the .PSPath property is bound es expected via the pipeline, but not as a direct argument:

$LiteralPath: [pspath]                       # OK (pipeline): .PsPath value was bound.
$LiteralPath: [@{PSPath=pspath}]  # !! (argument): $obj was simply stringified (.ToString())

The binding of the direct argument pays attention only to the parameter type, causing it to simply stringify $obj with .ToString() (not with default output formatting).

This discrepancy is not only surprising, it can have grave consequences when dealing with [System.IO.FileInfo] instances output by Get-ChildItem, because the latter often stringify to a mere filename, potentially causing unexpected and destructive behavior with -LiteralPath cmdlets such as Remove-Item, Move-Item and Copy-Item:

# Create sample file 'tmpFile' in subfolder 'tmpDir'; i.e.: tmpDir/tmpFile
$null > (Join-Path (New-Item -force -Type Directory tmpDir) tmpFile)

# Obtain a [System.IO.FileInfo] instance representing the new file.
# Note that `Get-ChildItem tmpDir` rather than `Get-Item tmpDir/tmpFile` (or
# `Get-ChildItem tmpDir/tmpFile` is purposely used, because it is only
# `Get-ChildItem` without a filename component in the path that stringifies 
#  the resulting instances to their filename only.
$f = Get-ChildItem tmpDir

# Demonstrate that $f stringies to just 'tmpFile' - the mere filename.
"`$f stringifies to: [$f]"

# Now bind $f to the -LiteralPath parameter as a *direct argument* of 
# several cmdlets, which *fails*, because trying to locate the input object
# as a *string* that is the *filename only* fails, given that file is in
# a *subdirectory*.
Remove-Item -WhatIf -LiteralPath $f
Move-Item -WhatIf -LiteralPath $f foo
Copy-Item -WhatIf -LiteralPath $f foo

The last 3 calls fail unexpectedly; while this happens not to be destructive / make unexpected changes in this example, it would be the case if namesake files happened to exist in the current location.

The fact that whether this problem arises depends on the specific method through which the [System.IO.FileInfo], [System.IO.DirectoryInfo] instances were obtained makes this behavior particularly insidious.

Environment data

PowerShell Core v6.0.0 on macOS 10.13.2
PowerShell Core v6.0.0 on Ubuntu 16.04.3 LTS
PowerShell Core v6.0.0 on Microsoft Windows 10 Pro (64-bit; v10.0.15063)
Windows PowerShell v5.1.15063.674 on Microsoft Windows 10 Pro (64-bit; v10.0.15063)
@iSazonov
Copy link
Collaborator

We could get significant performance improvements if our cmdlets could accept not only string paths but also [FileInfo] objects - currently we do a lot of unnecessary work on the path objects obtained from the file system (globbing, normalization ...).

@iSazonov iSazonov added Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif WG-Cmdlets-Core cmdlets in the Microsoft.PowerShell.Core module labels Jan 29, 2018
@mklement0
Copy link
Contributor Author

@iSazonov:

That's a great idea - is there an issue for that yet?

What you propose would fix the problems for file-related cmdlets, but I feel that the general discrepancy between via-pipeline and by-argument parameter binding is worth addressing as such.

@iSazonov
Copy link
Collaborator

No Issue. The nearest is #5789
I don't know the code yet, and I'm moving forward very slowly. 😕

If we implement using [FileInfo] internally then maybe your offer will look a little different (simpler).

@powercode
Copy link
Collaborator

The big issue here is that we cannot assume anything about the providers. FileSystem is only one provider, and the common cmdlets are generic in how they work with respect to the providers.

However, it's maybe not unreasonable to say that "All Providers are equal, but some Providers (cough* FileSystem*cough) are more equal than others".

Maybe we should have special treatment for the FileSystem provider since is is verly likely not only the most used provider, but also a provider with large data sets.

@iSazonov
Copy link
Collaborator

The API is not public so we are free to enhance it or re-design to make it more flexible.

@powercode
Copy link
Collaborator

@iSazonov What API specifically are you talking about here?

@iSazonov
Copy link
Collaborator

Mostly I meant our globbing and that #5789 API don't exist still.

@mklement0
Copy link
Contributor Author

mklement0 commented Jan 31, 2018

@powercode:

Good point about different providers: given that the items returned across providers don't implement a shared interface, we can't anticipate all item types using dedicated parameter sets.

While giving preferential treatment to the FileSystem provider sounds sensible, wouldn't that require us to declare the -LiteralPath parameters as [object] and then reflect on the type (given that you can't have the same parameter name with different types in different parameter sets)?
For all other types, .PSPath-based could (continue to) be used, albeit with the fix proposed here for argument passing.

@iSazonov:

If we implement using [FileInfo] internally then maybe your offer will look a little different (simpler).

Again, I feel the discrepancy discussed should be fixed in general, irrespective of the performance issues around file-related cmdlets.

There are two additional aspects:

  • For providers whose items are PowerShell-extended versions of general-use .NET types - [System.IO.FileInfo] again being a prime example - you don't get the .PSPath property if you instantiate such types directly (as opposed to obtaining instances via provider cmdlets) - see Directly constructed .NET type instances lack properties that ones output by cmdlets have #4347.

    • If a reflection-based [object] parameter is introduced to detect [System.IO.FileInfo], [System.IO.DirectoryInfo] instances, that problem would go away for the FileSystem provider, but it could affect other providers too.
  • Arguably, for symmetry with ValueFromPipelineByPropertyName, a ValueByPropertyName parameter attribute should be introduced that allows the same type of property-name-based binding for non-pipeline parameters.

@iSazonov
Copy link
Collaborator

iSazonov commented Feb 5, 2018

Note: in .Net Core 2.1 File enumeration extensibility. is expected

@BrucePay
Copy link
Collaborator

Some points:

  • The default position 0 parameter on commands that take paths is Path not LiteralPath so even if we enabled property binding from the command line it would have no effect.
  • It's Path not LiteralPath because people type paths and want to be able to use globbing.
  • It's of type string because that's what people type and that's what the providers consume universally.

So I don't see doing property binding from the command line as a win. On the other hand, this

ls -rec | foreach { test-path $_ }

is a recurring problem. Perhaps adding an argument transformation attribute to Path that converts FileInfo to a string by getting .FullName is the way to go.

@iSazonov
Copy link
Collaborator

iSazonov commented Jun 20, 2018

Perhaps adding an argument transformation attribute to Path that converts FileInfo to a string by getting .FullName is the way to go.

We could enhance -Path and -LiteralPath to accept FileInfo/DirectoryInfo/PathInfo/etc - it allow us to skip tons of re-parsings and tons of path normalizations.

@mklement0
Copy link
Contributor Author

@BrucePay:

Good point about the 1st positional argument binding to -Path, so adding an argument transformation for FileInfo / DirectoryInfo to all cmdlets that accept filesystem items via -Path sounds like a great idea (it should probably bind via .PSPath for symmetry with how -LiteralPath pipeline input currently binds).

Longer-term, @iSazonov's idea of supporting these types directly is desirable (though as previously discussed, that may require [object] parameters combined with reflection, so as to remain provider-neutral - or is there a better alternative?).

That said, the general symmetry that this issue asks for - bind by property whether the object came from the pipeline or by argument - is still worth implementing.

In the case at hand, if someone used -LiteralPath / -lp with an argument, they'd expect reliable full-path binding too:

ls -rec | foreach { test-path -LiteralPath $_ }  # !! Currently also unreliable

On a side note:

The stringification behavior of FileSystem actually changed since v6.0.2 and it seems that FileInfo instances now consistently stringify to their full path.
While this would help, I suspect it was an unintentional breaking change - see #7132

@BrucePay
Copy link
Collaborator

Rather than making -Path and -LiteralPath take object, we could add another parameter set that took a FileSystemInfo object. That violates the egalitarian notion of all providers are equal but it solves the file system name problem. However, even if we get the object into the cmdlet, using it more efficiently might be problematic as the entire provider infrastructure is based on the abstract notion of a path.

@mklement0
Copy link
Contributor Author

@BrucePay:

Irrespective of efficiency issues, wouldn't the parameter in that new parameter set have to have a different name?

I guess we could bind to that parameter positionally, but introducing yet another parameter name may be confusing (-FileSystemInfo / -FileInfo / -DirectoryInfo?).

Also, many cmdlets only accept file(s), so for them the type would have to be FileInfo only.

@iSazonov
Copy link
Collaborator

We could have unified wrapper for all providers - PathInfo.

@mklement0
Copy link
Contributor Author

Here's a proof-of-concept (in PowerShell code) that makes do with the existing parameters and only requires attaching a new argument-transformation attribute to -LiteralPath.

It should work with all providers, given that they all decorate their items with a .PSPath property.
That said, I know little about providers, so there may be something I'm missing.

The approach relies on non-string arguments seemingly getting bound to the parameter with the argument-transformation argument (-LiteralPath) even with positional binding, which with string input defaults to a different parameter, -Path, as per the default parameter set.

using namespace System.Management.Automation

# Transforms objects that happen to have a  .PSPath property - assumed to be
# PS provider items as returned by Get-Item / Get-ChildItem - to that property's
# value.
# Instances of all other types are passed through as-is.
class ProviderItemPathTransformationAttribute : ArgumentTransformationAttribute  {
  [object] Transform([EngineIntrinsics] $engineIntrinsics, [object] $inputData) {
    return $(foreach ($o in $inputData) { 
      if ($psPathProp = $o.psobject.Properties['PSPath']) {
        $psPathProp.Value
      } else {
        $o
      }
    })
  }
}

function pathdemo {

  [CmdletBinding(DefaultParameterSetName='Path', PositionalBinding=$false)]
  param (
    [Parameter(Mandatory, Position=0, ParameterSetName='Path', ValueFromPipeline, ValueFromPipelineByPropertyName)]
    [string[]] $Path
    ,
    [Parameter(Mandatory, Position=0, ParameterSetName='LiteralPath', ValueFromPipelineByPropertyName)]
    [ProviderItemPathTransformation()]
    [Alias('LP', 'PSPath')]
    [string[]] $LiteralPath
  )

  process {
    @"
    Path:        $Path
    LiteralPath: $LiteralPath
"@

  }
}

Sample calls:

# String input by argument: bind to -Path, as before:
PS> pathdemo 'foo'
    Path:        foo  
    LiteralPath: 

# String input from the pipeline: bind to -Path, as before:
PS> 'foo' | pathdemo
    Path:        foo  
    LiteralPath: 


# NEW: Provider-item input by argument: bind to -LiteralPath, via .PSPath, 
# using the argument-transformation attribute.
PS> pathdemo (Get-Item /)
    Path:        
    LiteralPath: Microsoft.PowerShell.Core\FileSystem::/

# Provider-item input from the pipeline: bind to -LiteralPath, via .PSPath, 
# using the by-property binding via the 'PSPath' parameter alias, as before.
PS> Get-Item / | pathdemo
    Path:        
    LiteralPath: Microsoft.PowerShell.Core\FileSystem::/

Again, note that while this approach happens to restore symmetry in recognizing input objects with a .PSPath property from both the pipeline and an argument in this specific case, that symmetry should be implemented generically.

@iSazonov
Copy link
Collaborator

iSazonov commented Jun 26, 2018

@mklement0 Thanks for the great sample!

If we extract a string from PSPath argument and assign it to LiteralPath, we have to parse this string again internally. Seems we need something like PSPathObject - may be with PathInfo type. Although in this case the new parameter set will be better to exclude overheads.

@mklement0
Copy link
Contributor Author

@iSazonov:

I fully agree that in the long run we should be passing objects (as opposed to strings) around so as to also improve performance.

My approach was meant as an intermediate, easy-to-implement step in the right direction that at least addresses the functional problem.

If someone is willing to take on the proper solution right away, that's great.

As for terminology: If we pass the output from, say, Get-ChildItem around, we're passing items themselves, not path-information objects.

Therefore, my suggestion is to name the parameter -Item, and to implement a provider-independent item type named, e.g., PSProviderItem.

Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

1 similar comment
Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

@microsoft-github-policy-service microsoft-github-policy-service bot added the Resolution-No Activity Issue has had no activity for 6 months or more label Nov 16, 2023
@microsoft-github-policy-service microsoft-github-policy-service bot added the Resolution-No Activity Issue has had no activity for 6 months or more label Nov 16, 2023
Copy link
Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

Copy link
Contributor

This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue-Discussion the issue may not have a clear classification yet. The issue may generate an RFC or may be reclassif Resolution-No Activity Issue has had no activity for 6 months or more WG-Cmdlets-Core cmdlets in the Microsoft.PowerShell.Core module
Projects
None yet
Development

No branches or pull requests

4 participants