"line" should be "item" #7133

Closed
iRon7 opened this issue Jan 15, 2021 · 8 comments · Fixed by #7150
Assignees
Labels
area-utility Area - Microsoft.PowerShell.Utility module

Comments

@iRon7
Contributor

iRon7 commented Jan 15, 2021

The word "line" is very confusing in this context:

-Stream

Indicates that the cmdlet sends a separate string for each line of an
input object. By default, the strings for each object are accumulated
and sent as a single string.

It should probably be "item": -Stream doesn't split raw text (as from Get-Content -Raw) into lines; instead, it should mean that each item in an input stream (as from the default Get-Content) is kept separate, whereas Out-String normally joins all the items into a single text string.

See also StackOverflow question/answer: Select-String not working on piped object using Out-String
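
To make the distinction above concrete, here is a minimal sketch, assuming the input consists only of single-line strings (roughly what the default Get-Content emits). The variable names are illustrative and not taken from the documentation.

```powershell
# Three separate string objects in the pipeline.
$items = 'one', 'two', 'three'

# Default: Out-String accumulates everything into a single string.
$joined = $items | Out-String
@($joined).Count      # 1

# With -Stream: the strings stay separate.
$streamed = $items | Out-String -Stream
@($streamed).Count    # 3
```

This only covers single-line string input; the later comments in this thread explain why other input types behave differently.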



@sdwheeler sdwheeler added the area-utility Area - Microsoft.PowerShell.Utility module label Jan 15, 2021
@chasewilson chasewilson self-assigned this Jan 15, 2021
@chasewilson
Contributor

Hey @iRon7, thanks for the feedback. I see your point here.

Do you think that, in the context of the document as a whole, it makes more sense to keep it as "line" here? "Line" is common terminology throughout, and in my view it makes it easier to keep things straight in my head when thinking about blocks of text.

What's your take on it?

@iRon7
Contributor Author

iRon7 commented Jan 17, 2021

I think it makes sense to change it in the Description as well, but I am not sure whether this will also affect other documents.
This is how I would define it (note that I am not a writer and not a native English speaker, so I would check with some other engineers as well. @mklement0, are you able to comment on this?)

The Out-String cmdlet converts input objects into strings. By default, Out-String accumulates the strings and returns them as a single string, but you can use the Stream parameter to direct Out-String to return one line at a time or create an array of strings. This cmdlet lets you search and manipulate string output as you would in traditional shells when object manipulation is less convenient.

In the above context, "the strings" and "one line" both refer to the current item in the pipeline (the same definition as for $_/$PSItem: the current object in the pipeline).

Meaning, I would define this as:

The Out-String cmdlet converts input objects into strings. By default, Out-String accumulates the display output of each item and returns them as a single string, but you can use the Stream parameter to direct the display output of each item immediately to the pipeline or create an array of strings. This cmdlet lets you search and manipulate string output as you would in traditional shells when object manipulation is less convenient.

As far as I can determine, the word "line" in the description of the -Width parameter really does refer to a line (demarcated by newline characters) within a text string.

@mklement0
Contributor

mklement0 commented Jan 17, 2021

Thanks for pinging me, @iRon7. The per-item logic doesn't quite apply as such, and here's the summary from the answer that I posted on Stack Overflow:


  • With -Stream, line-by-line output behavior typically occurs - except for input objects that happen to be multiline strings, which are output as-is.

For so-called in-band data types, -Stream works as follows, which truly results in line-by-line output:

  • Input objects are formatted by PowerShell's rich formatting system, and the lines that make up the resulting representation are then output one by one.

Out-of-band data types are individually formatted outside of the formatting system, by simply calling their .NET .ToString() method.

In short: data types that represent a single value are out-of-band, and in addition to [string] out-of-band data types also comprise [char] and the various (standard) numeric types, such as [int], [long], [double], ...

[string] is the only out-of-band type that itself can result in a multiline representation, because calling .ToString() on a string is effectively a no-op that returns the string itself - whether it is single- or multiline.

Therefore:

  • Any string - notably also a multiline string - is output as-is, as a whole, and splitting it into individual lines requires an explicit operation; e.g. (note that regex \r?\n matches both Windows-style CRLF and Unix-style LF-only newlines):

    "line 1`nline 2`nline 3" -split '\r?\n' # -> 'line 1', 'line 2', 'line 3'
    
  • If your input objects are a mix of in-band objects and (invariably out-of-band) multiline strings, you can combine Out-String -Stream with -split; e.g.:

    ((Get-Date), "line 1`nline 2`nline 3" | Out-String -Stream) -split '\r?\n' 
    
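
The in-band versus out-of-band distinction described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive reference: the [pscustomobject] and the variable names are made up for the example, and the multiline-string case is shown without an expected count because the thread itself notes that this behavior might change.

```powershell
# Out-of-band: a single-value type such as [int] is stringified via .ToString().
$fromInt = 42 | Out-String -Stream
$fromInt -eq (42).ToString()    # True

# In-band: the object passes through the formatting system, and -Stream
# emits the lines of the resulting representation one by one.
$obj = [pscustomobject]@{ Name = 'alpha'; Value = 1 }
$lines = @($obj | Out-String -Stream)
$lines.Count -gt 1              # True: header, separator, data row, blank lines

# A multiline string is also out-of-band. The comment above reports that it is
# passed through as-is, so inspect the count rather than assuming per-line output.
$multi = @("line 1`nline 2" | Out-String -Stream)
$multi.Count
```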

@iRon7
Contributor Author

iRon7 commented Jan 19, 2021

@mklement0, thank you for the explanation and the extra details, but I am not sure whether this means that the current document is sufficient for a developer to understand the situation behind this (and whether this documentation issue should be closed or not).

@chasewilson
Contributor

@iRon7 I see your point here. I have a PR in to try and remedy the article. If you want to take a look at it, I'd appreciate your feedback.

@mklement0
Contributor

mklement0 commented Jan 19, 2021

@iRon7: I do agree that the topic should be improved, and I was hoping my comment could serve as the basis for that.

@chasewilson, thanks for tackling this improvement; as for the specifics:

for each item of an input object.

I would find that confusing - what is an item in this context?

The gist of it is:

  • Out-String, like all Out-* cmdlets, uses PowerShell's for-display output-formatting system; in short: Out-String uses the same string representations you'd see in the console (terminal).

  • -Stream causes the lines of the resulting string representations to be emitted one by one - except for input objects that happen to be multi-line strings, which are output as-is.
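
As a rough illustration of why this matters in practice (the scenario behind the Stack Overflow question linked in the first comment), the sketch below pipes formatted output to Select-String with and without -Stream. The sample objects and variable names are hypothetical.

```powershell
$rows = [pscustomobject]@{ Name = 'alpha'; Value = 1 },
        [pscustomobject]@{ Name = 'beta';  Value = 2 }

# Without -Stream, Select-String receives one multi-line string,
# so a match returns the entire formatted table.
$whole = $rows | Out-String | Select-String 'beta'
$whole.Line -match 'alpha'     # True

# With -Stream, Select-String receives the lines one by one,
# so only the matching line comes back.
$single = $rows | Out-String -Stream | Select-String 'beta'
$single.Line -match 'alpha'    # False
```

Because the matching happens on for-display text, this approach is best suited to quick interactive searches; filtering on object properties is usually more robust.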

@mklement0
Contributor

Also, given that the current behavior with respect to multiline input strings is both obscure and unhelpful, I've created a proposal to split them into individual lines too - see PowerShell/PowerShell#14638

sdwheeler added a commit that referenced this issue Jan 20, 2021
* Updates information about Stream parameter

* Apply suggestions from code review

Co-authored-by: Sean Wheeler <sean.wheeler@microsoft.com>
@mklement0
Contributor

Thanks for the quick turnaround, but the merged PR added incorrect information - please see #7153

4 participants