
Add parameter -Delimiter to Set-Content, Out-File, Out-String to complement Get-Content -Delimiter #3855

Closed
mklement0 opened this issue May 24, 2017 · 15 comments
Labels
Issue-Enhancement the issue is more of a feature request than a bug Resolution-Won't Fix The issue won't be fixed, possibly due to compatibility reason. WG-Cmdlets-Management cmdlets in the Microsoft.PowerShell.Management module

Comments

@mklement0
Contributor

mklement0 commented May 24, 2017

Get-Content -Delimiter can be used to read text records with an arbitrary (literally matching) terminator into an array of strings, as an alternative to reading line by line.

To complement this functionality, the general-purpose text-outputting cmdlets should support the same parameter for creating such files:

  • Set-Content / Add-Content
  • Out-File
  • Out-String

(ConvertTo-Csv and Export-Csv already have a -Delimiter parameter, but there it is a field separator (line-internal), and a newline is implied as the record separator.)

Note:

  • -Delimiter '' would effectively behave the same way as -NoNewline (which is currently missing from Out-String - see Out-String cmdlet should support -NoNewline too #3684).

  • -Delimiter "`n" on Windows and -Delimiter "`r`n" on Unix would then allow creating text files with the respective other platform's newline sequence on demand - which would also cover Let us specify EOL when using out-file #2145.

  • The term delimiter is problematic and already ambiguous in terms of what is currently implemented:

    • In the context of *-Csv* cmdlets, -Delimiter refers to a separator - something placed between elements (fields), but not after the last.

    • In the context of Get-Content, -Delimiter refers to an (optional) terminator: something expected to be placed after every element (record), including the last.
      Get-Content -Delimiter treats inputs that differ only in the presence of a trailing delimiter identically; to signal an empty trailing element, two trailing delimiters are needed.

    • This Get-Content -Delimiter behavior implies that Set-Content, Out-File, ... should also treat -Delimiter to mean terminator, and add a trailing delimiter instance to the last record.

Desired behavior

'one', 'two' | Set-Content t.txt -Delimiter '@'
get-content t.txt
'---'
get-content t.txt -Delimiter '@' | % { "[$_]" }
one@two@
---
[one]
[two]
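
Until such a parameter exists, the terminator semantics shown above can be approximated with existing cmdlets. The following is only a minimal sketch, assuming PowerShell 5.0 or later (for -NoNewline) and input small enough to build the whole text in memory; the variable names are illustrative.

```powershell
# Records to write and the terminator to place after each one (including the last).
$records    = 'one', 'two'
$terminator = '@'

# Append the terminator to every record, then concatenate with unary -join.
$text = -join ($records | ForEach-Object { "$_$terminator" })

# -NoNewline keeps Set-Content from appending the platform newline.
Set-Content -Path t.txt -Value $text -NoNewline
```

This writes one@two@ to t.txt, which is the file content the proposed -Delimiter '@' is meant to produce directly.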

Environment Data

PowerShell Core v6.0.0-beta (v6.0.0-beta.1)
@tats-u

tats-u commented May 26, 2020

@iSazonov
The -Delimiter option is compatible with an option for newlines.
In Bash, redirecting a line to a file appends one LF at the end of the file.

$ echo a b c > test.txt
$ hexyl ./test.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 61 20 62 20 63 0a       ┊                         │a b c_  ┊        │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘

Reproducing this in PowerShell on Windows requires options that control:

  • Whether a set of EOL character(s) (LF or CRLF) is inserted just before EOF (echo -n in Linux)
  • What the set is (LF, CRLF, or native one)

like:

"a","b","c" | Out-File text.txt -Encoding utf8NoBOM -Delimiter " " -FinalNewLine "`n"

# equivalent to
"a", "b", "c" -join " " | Out-File text.txt -Encoding utf8NoBOM -Delimiter "`n"

@vexx32
Collaborator

vexx32 commented May 26, 2020

Setting a newline option is only really applicable for the writing cmdlets rather than the reading ones, but yeah.

I'm wondering if we should have a base class implement -Delimiter and then have a derived class inherit that and add -NoNewline to omit the trailing newline (a trailing newline would be the default -- I think it is currently the default, actually?)

So it would look something like this:

  1. Base class DelimitedContentBase inherits PSCmdlet and implements a [Parameter] for Delimiter (IMO this should also be aliased to something like -NewlineStyle or something)
  2. Get-Content inherits DelimitedContentBase and does everything else it needs to.
  3. DelimitedContentWriterBase inherits DelimitedContentBase and extends it with a [Parameter] for -NoNewline (a switch parameter) (IMO aliased to -SkipTrailingNewline, or vice versa)
  4. Set-Content, Out-File, Out-String, and Write-Host all inherit DelimitedContentWriterBase

This will involve a bit of refactoring, but it will avoid quite a bit of duplication of code. Some of the base classes may also end up adding methods for stitching the text together, if needed.
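
The hierarchy in steps 1 and 3 could look roughly like the sketch below. It is written with PowerShell classes purely for illustration; the real cmdlets are implemented in C#, and nothing here reflects existing PowerShell source. The class, parameter, and alias names come from the list above, while the JoinRecords method is a hypothetical stand-in for the "stitching" helper mentioned.

```powershell
# Hypothetical base class: owns the shared -Delimiter parameter (step 1).
class DelimitedContentBase : System.Management.Automation.PSCmdlet {
    [Parameter()]
    [Alias('NewlineStyle')]
    [string] $Delimiter = [Environment]::NewLine
}

# Hypothetical writer base: adds -NoNewline on top of -Delimiter (step 3).
class DelimitedContentWriterBase : DelimitedContentBase {
    [Parameter()]
    [Alias('SkipTrailingNewline')]
    [switch] $NoNewline

    # Illustrative helper: joins records and appends the trailing
    # delimiter unless -NoNewline was requested.
    [string] JoinRecords([string[]] $records) {
        $text = $records -join $this.Delimiter
        if (-not $this.NoNewline) { $text += $this.Delimiter }
        return $text
    }
}
```

Writer cmdlets such as Set-Content or Out-File would then derive from DelimitedContentWriterBase, while Get-Content would derive from DelimitedContentBase only.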

@SteveL-MSFT
Member

The -join operator already covers the case of creating strings with arbitrary delimiters so it doesn't seem necessary to add to the other cmdlets just for symmetry.

@SteveL-MSFT SteveL-MSFT added Resolution-Won't Fix The issue won't be fixed, possibly due to compatibility reason. and removed Up-for-Grabs Up-for-grabs issues are not high priorities, and may be opportunities for external contributors labels Jun 8, 2021
@ghost

ghost commented Jul 8, 2021

This issue has been marked as won't fix and has not had any activity for 1 day. It has been closed for housekeeping purposes.

@ghost ghost closed this as completed Jul 8, 2021
@tats-u

tats-u commented Jul 8, 2021

@SteveL-MSFT
Let me make sure I understand this. Are you saying that we must take such an ugly approach to force PowerShell to use LF? I think there are too many parentheses:

((echo a b c) -join "`n") + "`n" | Out-File -NoNewline -Encoding utf8NoBOM test.txt
hexyl test.txt
┌────────┬─────────────────────────┬─────────────────────────┬────────┬────────┐
│00000000│ 61 0a 62 0a 63 0a       ┊                         │a_b_c_  ┊        │
└────────┴─────────────────────────┴─────────────────────────┴────────┴────────┘

Could you tell me how you feel about it?

@ShivnarenSrinivasan

I found a good solution to this; hopefully it helps you and others @tats-u

Set PSDefaultParameterValues in your profile, and it will automatically modify the > operator. (as it merely aliases Out-File)

$Global:PSDefaultParameterValues["Out-File:Encoding"] = "utf8"
$Global:PSDefaultParameterValues['Out-File:NoNewLine'] = $true

-NoNewLine fixes both the extra newline character and the unwanted carriage return.

@tats-u

tats-u commented Oct 3, 2021

@ShivnarenSrinivasan This fixes only half of the issue. It just allows us to use the > operator instead of piping to the Out-File cmdlet.
We still have to use the ugly ((Some-Cmdlet) -join "`n") + "`n" construct.

@dkaszews
Contributor

@SteveL-MSFT Please reconsider; -join is a really ugly hack just to make something as basic as file redirection work. I daily drive both Linux and Windows, and the same simple command creating different files really bothers me.

To keep this feature from duplicating -join and to make it a more precise solution, I suggest making it not a generic string but an enum: -NewlineSequence Native | CRLF | LF.

@dkaszews
Contributor

Three additional arguments why the ugly -join hack is not an acceptable solution, beyond just syntax:

  1. It breaks streaming, requiring potentially a very big string to be created in memory. Go ahead, try (Get-Content input.txt | %{ $_ -replace $from, $to }) -join "`n" | Out-File output.txt on a couple-gig file
  2. Line endings are arguably part of encoding, so -NewlineSequence fits together with -Encoding
  3. Making it a parameter of Out-File allows use of $PSDefaultParameterValues so that you can set it in your $PROFILE and never have to worry about it across all your systems
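
To illustrate the streaming point, here is a minimal sketch of a user-defined function that writes each pipeline record as it arrives instead of joining everything first. The function name Out-FileWithNewline is hypothetical, the -NewlineSequence parameter mirrors the enum suggested earlier in this thread, and the output is assumed to be UTF-8 without a BOM. It is a workaround sketch, not a proposal for how Out-File itself would implement this.

```powershell
# Hypothetical helper; not part of PowerShell.
function Out-FileWithNewline {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, Position = 0)]
        [string] $Path,

        [ValidateSet('Native', 'CRLF', 'LF')]
        [string] $NewlineSequence = 'Native',

        [Parameter(ValueFromPipeline)]
        [string] $InputObject
    )
    begin {
        $newline = switch ($NewlineSequence) {
            'CRLF'  { "`r`n" }
            'LF'    { "`n" }
            default { [Environment]::NewLine }
        }
        # Resolve relative paths against PowerShell's current location.
        $fullPath = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($Path)
        # Overwrite the file; write UTF-8 without a BOM.
        $encoding = [System.Text.UTF8Encoding]::new($false)
        $writer = [System.IO.StreamWriter]::new($fullPath, $false, $encoding)
        $writer.NewLine = $newline
    }
    process {
        # One record at a time: memory use does not grow with file size.
        $writer.WriteLine($InputObject)
    }
    end {
        # Note: not reached if the pipeline is terminated by an error.
        $writer.Dispose()
    }
}
```

With this in place, the example from point 1 becomes Get-Content input.txt | %{ $_ -replace $from, $to } | Out-FileWithNewline output.txt -NewlineSequence LF, which never holds more than one line in memory.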

@CEbbinghaus

No way this is still a discussion. Ran into this when pwsh creates CRLF files when creating git patches with git diff > out.patch. Now I have to manually open & change line endings or edit the diff generation to output a single LF-delimited string: ((git diff) -join "`n") + "`n".

Ideally there would also be a PSDefaultParameterValues option to set this across the board. $PSDefaultParameterValues['*:Delimiter'] = 'lf'

@dkaszews
Contributor

dkaszews commented Jun 5, 2023

Is there any process to appeal a decision of some guy at MSFT closing an issue just because they don't see it as a problem themselves?

I would like to create a PR myself, but don't want to waste time if it is to be immediately rejected due to the issue being closed.

@brian6932

brian6932 commented Apr 20, 2024

@dkaszews Did you ever open a PR for this? I've run into this issue quite a lot, and would really like some way to change it globally.

@dkaszews
Contributor

@brian6932 Nope, I lost all motivation for working on PowerShell due to the way mods handle things around here. I have a PR fixing white-on-white text in -Confirm that has been waiting to be merged for almost 2 years; mods told me to "just use dark terminal".

@tats-u

tats-u commented Apr 20, 2024

@brian6932
@SteveL-MSFT has not said anything since then, so we cannot take any action (including sending a PR).
He has been silent for too long (more than 2 years as of today).

@CEbbinghaus

By whom? Why? Has this been sufficiently addressed?

This issue was closed.
9 participants