Leading numeric zeros in JSON format #1293

Open
OXYAMINE opened this issue May 16, 2023 · 5 comments

Comments

@OXYAMINE

I'm converting CSV to JSON, and values like "0012AS4" are presented like {"Key": "0012AS4"}.
But if the value is like "0123456789", it is presented in the JSON output like {"Key": 0123456789 }, which makes the output invalid JSON.

@OXYAMINE
Author

The same issue applies to values like "+12123": they are treated as numbers even if explicitly quoted. The result is invalid JSON output.

@OXYAMINE
Author

Actually, the problem is wider. How is a value classified as a number or a string during JSON conversion? If a value has been explicitly quoted, why is it still converted to a number?

For example, the value 1867e593836000726799386923505081003007900978 is considered a number, and the JSON syntax is OK.
But obviously this isn't a number, and an attempt to store it will fail.
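
The three cases reported above can be checked against a strict JSON parser. The following is a minimal sketch using Python's standard json module, not Miller itself; the helper name is_valid_json is hypothetical and only there for illustration. It shows that a leading zero and a leading plus sign are rejected by the JSON number grammar, while the long exponent form is accepted syntactically but overflows a 64-bit float.

```python
import json
import math


def is_valid_json(text: str) -> bool:
    """Return True if text parses under Python's json module (hypothetical helper)."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# A leading zero and a leading plus sign are not part of the JSON number grammar.
leading_zero_ok = is_valid_json('{"Key": 0123456789}')
leading_plus_ok = is_valid_json('{"Key": +12123}')

# The same digits as a quoted string are fine.
quoted_ok = is_valid_json('{"Key": "0123456789"}')

# Syntactically a valid JSON number, but the exponent is far beyond what a
# 64-bit float can hold, so Python's parser yields infinity for it.
huge = json.loads('{"Key": 1867e593836000726799386923505081003007900978}')["Key"]

print(leading_zero_ok, leading_plus_ok, quoted_ok, math.isinf(huge))
```

Other JSON consumers may handle the overflowing exponent differently (for example by raising an error), so the last case is best treated as "valid syntax, unusable value".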

@aborruso
Contributor

> But if the value is like "0123456789", it is presented in the JSON output like {"Key": 0123456789 }, which makes the output invalid JSON.

Hi, I have used this sample input

fielda,fieldb
a,0123456789

and if I run mlr --c2j cat t.csv, I do not get an unquoted value like {"Key": 0123456789 }; the value is output as a quoted string:

[
{
  "fielda": "a",
  "fieldb": "0123456789"
}
]

I'm using mlr 6.7.0

@johnkerl
Owner

johnkerl commented Jun 6, 2023

> even if explicitly quoted

There are two different things.

For JSON -- "123" means string and 123 means int. Double quotes serve as type-indicators.

For CSV -- double quotes serve as delimiters -- they allow people to put embedded commas and/or newlines into cells.

The Go CSV-parser library I'm using doesn't return to the caller any information about whether a field was quoted. However, I've already forked and hacked on it a bit; I can look into getting a was-quoted flag back from the parser ...
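
The loss of quoting information is not specific to one library. As an analogy only (this is Python's standard csv module, not the Go parser Miller uses), the sketch below parses a quoted and an unquoted version of the same cell and shows that the caller receives identical strings, so any later type inference cannot tell them apart.

```python
import csv
import io

# The same record, once with the second field quoted and once without.
quoted_csv = 'fielda,fieldb\na,"0123456789"\n'
unquoted_csv = 'fielda,fieldb\na,0123456789\n'

rows_quoted = list(csv.reader(io.StringIO(quoted_csv)))
rows_unquoted = list(csv.reader(io.StringIO(unquoted_csv)))

# With the default dialect, the quotes are consumed as delimiters and the
# parsed rows carry no trace of them.
same = rows_quoted == rows_unquoted

print(rows_quoted)
print(rows_unquoted)
print(same)
```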

@johnkerl
Owner

johnkerl commented Jun 6, 2023

Meanwhile, please also check out mlr -S (maybe overkill, but it does avoid type inference) ...
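
What skipping type inference means for the output can be sketched outside Miller. The Python below is not Miller's implementation; the function name csv_to_json_all_strings and the second sample row are assumptions made for illustration. Every cell is kept as a string, so values such as 0123456789 and +12123 always come out quoted and the JSON stays valid.

```python
import csv
import io
import json


def csv_to_json_all_strings(csv_text: str) -> str:
    """Convert CSV text to a JSON array of records, keeping every value a string."""
    records = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(records, indent=2)


# Sample input based on the thread; the second row is added for illustration.
sample = "fielda,fieldb\na,0123456789\nb,+12123\n"
output = csv_to_json_all_strings(sample)

print(output)
```

The trade-off is the one hinted at by "maybe overkill": genuine numbers are emitted as strings too, so downstream code has to convert them explicitly.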
