This repository has been archived by the owner on May 24, 2022. It is now read-only.

Remove unnecessary escapings #22

Open
ngirard opened this issue Mar 21, 2021 · 1 comment

Comments

@ngirard

ngirard commented Mar 21, 2021

Some tools produce unnecessary escapings, making it difficult to distinguish between genuine differences resulting from data transformation and spurious differences resulting from such escapings.

One example that comes to mind: gocsv transforms 2021-03-21 into 2021\-03\-21.

Unless I'm mistaken, scrubcsv doesn't offer this kind of normalization.
It would be nice if it did.

@emk
Contributor

emk commented Mar 22, 2021

Do you have a specification describing any "standard" normalizations that should be performed? I suspect there's a list of these normalizations floating around somewhere, with a specific focus on ensuring that Excel doesn't treat certain character sequences as "magic." But I don't have a trustworthy and complete version of that list.

But in any case, it might be best to handle this in a separate tool. scrubcsv is not intended to replace CSV toolkits like xsv or gocsv. Its job is to get data into a clean, normalized format so that it can be piped into other tools.
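
For illustration only, here is a minimal sketch of what such a separate tool could look like, written in Python. It is not part of scrubcsv, xsv, or gocsv. The function names are hypothetical, and the rule it applies (drop a backslash that directly precedes a hyphen) is an assumption taken from the single 2021\-03\-21 example above, since no agreed list of normalizations exists.

```python
import csv
import io
import re

# Assumption: only a backslash directly before a hyphen is "unnecessary".
# Widen this character class only with characters known to be safe to unescape.
UNNECESSARY_ESCAPE = re.compile(r"\\(-)")


def unescape_field(field: str) -> str:
    # Replace each backslash-hyphen pair with a plain hyphen.
    return UNNECESSARY_ESCAPE.sub(r"\1", field)


def unescape_csv(text: str) -> str:
    # Parse the CSV, clean every field, and write it back out.
    # The default csv dialect treats backslashes as ordinary characters.
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        writer.writerow([unescape_field(f) for f in row])
    return out.getvalue()
```

With this sketch, unescape_field(r"2021\-03\-21") returns "2021-03-21", while backslashes in front of any other character are left alone. Running it on output that has already been normalized keeps the field-level cleanup separate from the structural cleanup.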
