ConvertTo-Csv / Export-Csv do not create valid CSV with new-lines and double-quotes #9284
Comments
Hi there, I would like to work on this issue.
It's |
I've been looking at this. The |
I was preparing to document this and ran into the same issue as @rspears74. The error is not very descriptive at all and really doesn't tell you what you need to do to resolve it.
Given that newlines are typically record separators in a CSV file, it almost seems like we should be encoding newlines somehow in the file. Is there an established standard for doing so?
@vexx32 No, there isn't a standard for escaping control characters. That would be more like CTX: http://www.creativyst.com/Doc/Std/ctx/ctx.htm. There is a secondary standard which uses an escape character instead, e.g. a backslash before the newline, a bit like Unix pipe-delimited files. However, this is not supposed to be mixed with quoted fields, i.e. you either quote or escape. A lot of CSV parsers don't handle the quoted version though (they only handle RFC 4180: https://tools.ietf.org/html/rfc4180). If you are asking about converting the newlines to something else, e.g. \r\n, then no, there is nothing standard about this. That is outside the scope of CSV, since it really depends on the reader and writer of these files. Given that PowerShell is such a good toolbelt, we could be looking to read and write CSV written by many other applications, as long as it conforms to RFC 4180. Handling escapes would be nice but extra.
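For reference, RFC 4180 deals with embedded line breaks and double quotes purely by quoting: the field is wrapped in double quotes and each inner double quote is doubled. A minimal sketch using Python's standard `csv` module, picked here only as a neutral RFC 4180-style writer and reader; the sample row is made up and is not taken from this issue:

```python
import csv
import io

# Made-up sample row: the second field holds a line break and double quotes.
row = ["1", 'line one\nsays "hi"', "end"]

# Write the row with CRLF as the record terminator, as RFC 4180 describes.
buf = io.StringIO()
csv.writer(buf, lineterminator="\r\n").writerow(row)
encoded = buf.getvalue()

# Read it back: the quoted line break stays inside one field of one record.
decoded = list(csv.reader(io.StringIO(encoded, newline="")))

print(repr(encoded))
print(decoded)
```

The point of the round trip is that a conforming writer never needs a separate escape character: the line break is preserved verbatim inside the quoted field.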
We could look at how Excel outputs multi-line cells to CSV.
Is this too stale to re-comment on? This is still an issue as of PowerShell v7.2-preview6. If you use
The Relevant code: To be RFC-4180 compliant you'd quote if you detect any newline characters ( |
Fix for issue PowerShell#9284 - Escape fields that contain quotes and newlines, not just those that contain the delimiter
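The rule named in that fix title can be sketched in a few lines. This is an illustration of the RFC 4180 quoting decision only, written in Python, and not the actual PowerShell change; `quote_field` and `to_csv_line` are hypothetical names introduced for this example:

```python
def quote_field(value: str, delimiter: str = ",") -> str:
    """Quote a field RFC 4180-style if it contains a special character.

    Hypothetical helper: a field needs quoting when it contains the
    delimiter, a double quote, a carriage return, or a line feed.
    """
    specials = (delimiter, '"', "\r", "\n")
    if any(ch in value for ch in specials):
        # Double every embedded quote, then wrap the whole field in quotes.
        return '"' + value.replace('"', '""') + '"'
    return value


def to_csv_line(fields, delimiter: str = ",") -> str:
    """Join already-stringified fields into one CSV record (no terminator)."""
    return delimiter.join(quote_field(f, delimiter) for f in fields)


print(to_csv_line(["1", "a\nb", 'say "x"', "plain"]))
```

Checking only for the delimiter, as the title implies the old behaviour did, would leave the second and third fields unquoted and produce a record that a reader splits in the wrong place.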
I have a file that is tab delimited and includes quoted strings. Python's csv library has an option to not interpret quotes (quoting=csv.QUOTE_NONE). It would be nice if PowerShell had an option for this, otherwise I don't have a straightforward way to properly read these files. e.g., the following loses quotes even though I am using simple tab delimited format:
|
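For comparison, a minimal sketch of the Python behaviour this comment refers to, using made-up tab-delimited data rather than the commenter's file. With `quoting=csv.QUOTE_NONE` the reader treats double quotes as ordinary characters:

```python
import csv
import io

# Made-up tab-delimited input in which the double quotes are literal data.
data = 'id\tnote\n1\t"quoted" text\n'

# QUOTE_NONE tells the reader not to give the quote character any special meaning.
reader = csv.reader(io.StringIO(data), delimiter="\t", quoting=csv.QUOTE_NONE)
rows = list(reader)

print(rows)
```

Here the second field of the data row keeps its quotes exactly as they appear in the input.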
(Please excuse formatting - on mobile)
Is this reading a file? This issue is about converting TO CSVs rather than from. Have you tried replacing the quotes before converting?
# try using double doublequotes, that's standard CSV quote escaping. If that doesn't work try another like null char `0 or vertical tab `v
$replaceChar = '""'
(Get-Content myfile.tsv -Raw) -replace '"',$replaceChar | ConvertFrom-Csv -Delimiter "`t"
If it's converting TO a CSV then I'd be surprised if it doesn't maintain quotes - please provide sample code. Try with -UseQuotes Never
|
This issue still exists on PowerShell 7.3:
PS> $PSVersionTable
Name                           Value
----                           -----
PSVersion                      7.3.9
PSEdition                      Core
GitCommitId                    7.3.9
OS                             Microsoft Windows 10.0.19045
Platform                       Win32NT
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0…}
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
WSManStackVersion              3.0
The output is still incorrect and unchanged from above:
When using
|
Steps to reproduce
Using the following as a test (this is valid CSV).
Expected behavior
The output file should look the same as the input file
Actual behavior
It's OK that there are quotes around everything now, but the formatter has closed the double quote before the line break and, as a result, introduced new records into the file.
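The effect described here can be shown with an RFC 4180-style reader. The sketch below uses Python's `csv` module and two made-up strings: `valid` keeps the line break inside the quotes, while `broken` is a hypothetical example of a quote being closed before the line break. Neither string is the actual output from this report:

```python
import csv
import io

# Made-up data: the line break sits inside the quoted field.
valid = 'a,"line1\nline2",c\r\n'

# Hypothetical broken shape: the quote is closed before the line break.
broken = 'a,"line1"\nline2,c\r\n'

valid_rows = list(csv.reader(io.StringIO(valid, newline="")))
broken_rows = list(csv.reader(io.StringIO(broken, newline="")))

print(len(valid_rows), valid_rows)
print(len(broken_rows), broken_rows)
```

The first string parses as one record of three fields; the second parses as two records of two fields each, which is the "new records" symptom.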
Environment data