Introduce a common parameter pattern for requesting undecorated / not-wrapped-in-helper-types output objects #7855

Closed
mklement0 opened this issue Sep 24, 2018 · 20 comments
Labels
Issue-Question: ideally support can be provided via other mechanisms, but sometimes folks do open an issue to get a
Resolution-No Activity: Issue has had no activity for 6 months or more
WG-Cmdlets: general cmdlet issues

Comments

@mklement0
Contributor

Follow-up from #7715; related to #7713, #7537, and #5797.

A pattern is emerging for asking cmdlets to output "bare" objects, which means output objects that:

  • are not decorated with NoteProperty members, the way lines read from a file with Get-Content are, for instance.

  • are not wrapped in instances of a helper type, the way that Select-String or Compare-Object output is, for instance.
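
To make the two forms of decoration concrete, here is a minimal sketch. The temp file name `bare-demo.txt` and the variable names are made up for illustration; the behavior shown is what I'd expect from current versions rather than a guarantee.

```powershell
# Create a throwaway file so the example is self-contained (file name is hypothetical).
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'bare-demo.txt'
Set-Content -LiteralPath $path -Value 'first line', 'second line'

# Decoration: each line is a [string], but Get-Content attaches
# provider metadata to it as NoteProperty members (e.g. PSPath, ReadCount).
$line = Get-Content -LiteralPath $path | Select-Object -First 1
$decorations = $line.psobject.Properties |
    Where-Object MemberType -eq 'NoteProperty' |
    ForEach-Object Name
$decorations

# Wrapping: Select-String does not emit the matching string itself,
# but a helper object that carries the line plus metadata.
$match = Select-String -LiteralPath $path -Pattern 'second'
$match.GetType().Name   # MatchInfo
$match.Line             # the "bare" line has to be extracted explicitly

Remove-Item -LiteralPath $path
```

A "bare" switch, whatever its eventual name, would make the first command emit plain strings and the second emit `$match.Line` directly.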

There are three, not mutually exclusive motivations for requesting such "bare" output:

There are probably more existing cmdlets that could benefit from the pattern, as would future ones.

In line with PowerShell's commitment to consistency, a common (shared) parameter name should be used in all these cases.

-Bare makes the most sense to me.

To avoid confusion with -Raw as implemented in Get-Content - which simply reads the whole file while still decorating the resulting string - #7715 proposed deprecating -Raw in favor of a more descriptive name such as -Whole. (Note that by deprecating I don't mean to imply removing support for -Raw, just documenting -Whole first and mentioning -Raw as a legacy name).

Environment data

Written as of:

PowerShell Core 6.1.0
@iSazonov
Collaborator

@mklement0 Could you please list all affected cmdlets (from the repo) in the description?

@mklement0
Contributor Author

@iSazonov: Do you mean additional cmdlets that could benefit from this pattern?

@iSazonov
Collaborator

Yes, it will help PowerShell Committee to review and approve if they see the full list.

@iSazonov added the WG-Cmdlets label Sep 25, 2018
@BrucePay
Collaborator

@mklement0 @iSazonov We've already reviewed this issue as part of #7715. The committee unanimously agreed that we are going to continue to use the existing pattern -Raw as the parameter to indicate that an object should be returned unadorned. (What exactly that means is up to the cmdlet author.) Adding a new, largely indistinguishable parameter is undesirable as it will add confusion while providing no tangible benefit.

@BrucePay added the Issue-Question label Sep 26, 2018
@mklement0
Contributor Author

@BrucePay: #7715 was focused on paving the way for the pattern proposed here - and the feedback suggested that the larger pattern and benefit perhaps wasn't fully considered in the decision.

Hence, this issue was opened, which focuses on the bigger picture.

Adding a new, largely indistinguishable parameter

The pattern proposed here is clearly distinguishable in intent from what the - unfortunately named - -Raw currently does in the context of Get-Content, as described.

With the proposed deprecation (without removal) of -Raw, the distinction problem goes away (not for legacy code, but going forward).

while providing no tangible benefit.

The benefit is the pattern described in the initial post, for which we have concrete uses already.
I'm sure there are more, and, once the pattern is established, future cmdlets can take advantage too.

@mklement0
Contributor Author

And, just to give a dying horse another whack:

the existing pattern -Raw as the parameter to indicate that an object should be returned unadorned.

Get-Content's -Raw doesn't return anything unadorned (undecorated), it just changes the output partitioning and still adorns.

In a very loose sense you can consider that reading the input "raw", but this loose sense:

  • gets in the way of a meaningful, reusable pattern, given that we now want Get-Content to output truly undecorated lines as well (#7537, "Get-Content is slow on large text files. Could it have a parameter to speed it up by not adding NoteProperties?"; i.e., without changing the output partitioning). I think retaining the existing -Raw while introducing something like -RawLines isn't a great solution in and of itself, and it also impedes the establishment of a well-defined general pattern.

  • is in itself poorly descriptive of the very specific action performed by Get-Content -Raw (hence the suggested alias -Whole).

@mklement0
Contributor Author

OK, one more for the road:

Note: The premise is that there is value in establishing a general pattern using a shared parameter name and in applying it to #7713 and perhaps #5797, among others.

What exactly that [-Raw] means is up to the cmdlet author.

It is an option to live with the loose definition of -Raw, but:

  • we'd then have to live with the -Raw / -RawLines confusion within Get-Content, and the general confusion over the distinctly different Get-Content -Raw behavior.

  • -Raw, especially in the context of file I/O, has a distinct connotation of raw bytes, which is inapplicable.

By contrast:

  • -Bare better connotes "lack of decoration".

  • With Get-Content's -Raw deprecated, the specific function it performs can be given a more descriptive name, such as -Whole (and -Whole in itself could become a standardized name for read-everything-at-once, but at this time I'm unaware of other cmdlets that could use it).

@BrucePay
Collaborator

@mklement0

and the feedback suggested that the larger pattern and benefit perhaps wasn't fully considered in the decision.

It was. Sorry if that was unclear.

@mklement0
Contributor Author

Thanks, @BrucePay, but further clarification is needed:

To quote @SteveL-MSFT's summary:

The current use of -Raw is acceptable and therefore no reason to make the proposed change. We would support a proposal to add a type parameter for streaming line-by-line w/o annotations although we did not come to agreement on the naming. -Bare is not different enough from -Raw to communicate functional differences.

This tells me that the decision was entirely focused on rejecting the proposal to deprecate -Raw as currently used with Get-Content, and what to name the new parameter that fits the pattern described in this issue in the context of Get-Content.

Aside from my obvious preference for this deprecation (without removal - I'll stop saying that now, consider it implied; "deemphasizing" is just too clunky), this proposal's gist is to introduce a general parameter pattern with a shared name - while my naming preference is clearly -Bare, that is just one suggestion.

Committing to this pattern means the newly agreed-upon name should be used in #7713, #7537, and perhaps #5797, as well as going forward.

By your own reasoning,

"Raw" in this context means, "undecorated", "not cooked", etc.
In PowerShell we try to choose a single term and apply it consistently so that, even if it does not seem intuitive to a person, they only have to learn it once. Sometimes we choose a sub-optimal term but we live with it because you should only have to learn something once.
[...] use the existing pattern -Raw as the parameter to indicate that an object should be returned unadorned.

In other words, this issue:

  • just asks more formally for defining and establishing the use of a -Raw-like parameter pattern.

  • and to you -Raw is an acceptable name for this general pattern, because what Get-Content -Raw currently does fits in as well, correct?

If so, and everyone's happy with this -Raw deal (if you will), then that leaves just one, incidental question:

What to name the new streaming line-by-line w/o annotations parameter for Get-Content, given that -Raw is already taken there.
Given the -Raw deal, the name -RawLines, which you yourself have pondered, sounds just fine to me.

@vexx32
Collaborator

vexx32 commented Sep 27, 2018

Y'know, thinking on that a bit, @mklement0, although it would be a more significant change... I would consider simply replacing the -Raw parameter with your proposed -RawLines parameter.

There's no need for an extra parameter; as has been noted in the associated issues at least once, -ReadCount can already be used to read the file in one go. Additionally, use of that is backwards compatible.

So while old code would need to be updated to work properly with the new version (thus a breaking change, I suppose), new code would be relatively backwards compatible, if the read-all-at-once was all that it was being used for. And if the undecorated output was required, it would be restricted to the newer versions.

@mklement0
Contributor Author

mklement0 commented Sep 27, 2018

@vexx32:
I think removing -Raw would be too drastic a change (and with my proposal it wouldn't go away, we'd just tell people to use its new alias, -Whole, from now on).

Actually, what -Raw does (read the entire file as a single string) cannot be done with -ReadCount: the latter is a chunking mechanism that is still array-based; -ReadCount 0 reads all lines at once, but puts them into an array.
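
A quick way to see the difference, as a sketch only (the temp file name `readcount-demo.txt` and all variable names are invented for the example):

```powershell
# Throwaway three-line file (hypothetical name) so the example is self-contained.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.txt'
Set-Content -LiteralPath $path -Value 'one', 'two', 'three'

# -Raw: the whole file as ONE [string], newlines included.
$whole = Get-Content -LiteralPath $path -Raw
$whole -is [string]        # True
[bool] $whole.PSPath       # True: the single string is still decorated

# -ReadCount 0: still line-based; all lines arrive together as ONE array.
$chunk = Get-Content -LiteralPath $path -ReadCount 0
$chunk -is [array]         # True
$chunk.Count               # 3

# Count how many objects actually travel through the pipeline.
$sentWithReadCount0 = 0
Get-Content -LiteralPath $path -ReadCount 0 | ForEach-Object { $sentWithReadCount0++ }
$sentByDefault = 0
Get-Content -LiteralPath $path | ForEach-Object { $sentByDefault++ }
$sentWithReadCount0        # 1 (one array holding all lines)
$sentByDefault             # 3 (one string per line)

Remove-Item -LiteralPath $path
```

So -ReadCount only changes how lines are batched, whereas -Raw changes what the output object is.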

There's no functional problem with naming the new parameter for reading lines undecorated (but still line-by-line) -RawLines, but it will cause confusion, because what "raw" means in -Raw vs. -RawLines is then quite distinct.


As an aside:

Somewhat ironically, the only parameter that ever deserved to be called -Raw - the undocumented Format-Hex -Raw, which asked for a raw byte representation in certain contexts - is now obsolete.

With my proposal, there would be no more (non-obsolete, non-deprecated) -Raw parameter - for now.

That said, there's no reason not to revive it where appropriate, given that, despite -Raw and -Bare having substantial semantic overlap, the following distinction can be useful:

  • -Raw ... ask for uninterpreted data (typically, raw bytes)

  • -Bare ... ask for undecorated data (without NoteProperties and, in a wider sense, not wrapped in a helper type that provides metadata)

There, I feel better now, although you could ask: hasn't that poor equine suffered enough?

@dgc

dgc commented Dec 1, 2018

If this proposal or some variant of it were to go ahead then I would suggest this as a general pattern:

[-OutputMode {Raw | RawLines | ...}]

It would make it clear at a glance to the casual user that there are options to affect the output of the cmdlet and that there is a choice between a set of distinct modes.

If I came across a future Get-Content cmdlet that had -Bare or -RawLines in the parameters then I would have to read further on into the documentation before I realised they merely configured the output. Also, I don't think it would be obvious that they were mutually exclusive until stated.

EDIT: OK, I see now that -Raw and -Bare can be combined in your last post.

@mklement0
Contributor Author

@dgc:

While -Raw and -Bare could be separate switches with distinct meanings, they'd be mutually exclusive.

Note that there's no need to distinguish between Raw and RawLines, because the aspect of partitioning the output has nothing to do with the semantics of the -Bare switch proposed here (undecorated/non-wrapped objects) - and is worth keeping separate.

The partitioning aspect, which is not common, is covered in Get-Content as follows:

@mklement0
Contributor Author

@dgc:

To reframe my previous comment in light of the decision that -Raw will be retained as the general switch name for requesting non-decorated/non-wrapped output:

The need for the -Raw / -RawLines distinction should not arise outside of Get-Content, where it only arises in order to preserve backward compatibility.

@HumanEquivalentUnit
Contributor

HumanEquivalentUnit commented Mar 27, 2019

I went exploring and, maybe irrelevant but for the record, found a few more things which, if you squint a bit, fit this issue's described pattern of asking a cmdlet to return a more basic output, or switch to another commonly desired output, or do less work for improved performance. Not all related to "undecorated" output, exactly:

  • Get-Date returns a [DateTime], and -Format 'yyyy-MM-dd' returns [string].
  • Test-Connection returns [TestConnectionCommand+PingReport], and -Quiet returns [bool].
  • ConvertTo-Xml returns [XmlDocument], and -As String makes it return a [string].
  • Get-Variable returns [PSVariable] but with -ValueOnly it returns just the variable value.
  • Get-CimInstance has -KeyOnly and -Shallow to ask it to get less information (KeyOnly is documented as "[returns key parameters only ..] reduces the amount of data transferred over the network." - presumably that's a performance reason).
  • Get-ComputerInfo -Property BiosCaption asks it to return less information, just like feeding it through | select BiosCaption would .. but apparently still makes you wait as long as it takes to get all the information, so it doesn't seem to be a performance reason. Help says it "Specifies, as a string array, the computer properties in which this cmdlet displays." as if it's intended to be an output display option for a user.
  • Measure-Object returns [GenericMeasureInfo] but with -Line it switches to [TextMeasureInfo] (to try and behave like the wc utility?).

and existing in Windows PowerShell, but not PS 6.1:

  • Get-Clipboard has a -Raw parameter which ignores newlines.
  • Get-EventLog has -AsBaseObject which "Indicates that this cmdlet returns a standard System.Diagnostics.EventLogEntry object for each event. Without this parameter, Get-EventLog returns an extended PSObject object with additional EventLogName, Source, and InstanceId properties."
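
For the record, two of the type switches above are easy to check interactively. This is just a sketch; the variable name `demo` is made up:

```powershell
$demo = 42   # hypothetical variable, only here to have something to look up

# Get-Variable: wrapper object by default, just the value with -ValueOnly.
$wrappedType = (Get-Variable -Name demo).GetType().Name             # PSVariable
$valueType   = (Get-Variable -Name demo -ValueOnly).GetType().Name  # Int32

# Get-Date: [DateTime] by default, [string] once -Format is used.
$dateType      = (Get-Date).GetType().Name                          # DateTime
$formattedType = (Get-Date -Format 'yyyy-MM-dd').GetType().Name     # String

$wrappedType, $valueType, $dateType, $formattedType
```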

@iSazonov
Collaborator

Test-Connection returns Win32_PingStatus, and -Quiet returns [bool].

Currently we don't use Win32_PingStatus - please update your message.

@HumanEquivalentUnit
Contributor

@iSazonov updated.

@mklement0
Contributor Author

mklement0 commented Mar 27, 2019

Thanks, @HumanEquivalentUnit.

The pattern I see in your examples is to ask for an alternative output data type or only part of the usual output objects.

With respect to an alternative data type, it is what the -As<type> switches in some of the examples do.

You could argue that -ValueOnly should therefore be -AsValue, and -Quiet (which isn't really quiet, only quieter) should be -AsBoolean, and, though a less clear-cut case, perhaps -KeyOnly should be -AsKey.

Similarly, Get-ChildItem's -Name could be -AsName, and it is an example of how the reasonable expectation that asking for something simpler or for less also results in better performance doesn't always hold: see #9119 - just like you found with Get-ComputerInfo -Property in #9234.

Get-Clipboard's use of -Raw is just as unfortunate as Get-Content's - I've previously proposed -Whole in #7715, but, based on the above, -AsString could work too (though, while more consistent, it is a tad more obscure in this specific case).

Get-EventLog's -AsBaseObject is a good example of what this issue calls for: requesting the usual output object, but undecorated (without tacked-on ETS properties); thus, it should be -Raw (though I wonder if I've expressed my preference for -Bare before).

Measure-Object's -Line, -Word and -Character switches are interesting, because they not only result in a different type of output object, but also change the input processing to count inside of the input objects, i.e., the lines, words, and characters inside multi-line strings. TextMeasureInfo is derived from the abstract MeasureInfo class, just like the default GenericMeasureInfo output type.

Contributor

This issue has not had any activity in 6 months, if this is a bug please try to reproduce on the latest version of PowerShell and reopen a new issue and reference this issue if this is still a blocker for you.

@microsoft-github-policy-service bot added the Resolution-No Activity label Nov 16, 2023
Contributor

This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes.


6 participants