JSON parsing failure for big number with no exponent part #45782

Open
sgsfak opened this issue Jan 30, 2023 · 2 comments
Comments

@sgsfak

sgsfak commented Jan 30, 2023

The following code:

SELECT isValidJSON('{"value": 7230000000000000000000000000000000000000000000000000000000000000000000000000}') AS is_it

returns 0, i.e. ClickHouse reports the document as invalid JSON.

It seems that ClickHouse (or simdjson?) tries to parse the number as an integer and fails because it is outside the 64-bit range. The number's value is actually 7.23e75, i.e. it can be represented as a Float64 (double precision), so the JSON is in fact standards-compliant.

P.S. This is not a contrived example, we actually got JSON documents with values like this from medical (DICOM) images...
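To illustrate the claim above, here is a minimal JavaScript sketch (Node.js or a browser console) showing that the value fits in a double but not in a 64-bit integer. It assumes the literal in the report has 76 digits, as the stated value 7.23e75 implies, and rebuilds it programmatically instead of copying it. It says nothing about how ClickHouse or simdjson parse the number internally.

```javascript
// Rebuild the literal programmatically to avoid miscounting zeros:
// "723" followed by 73 zeros is 7.23e75 (76 digits). This digit count is
// an assumption based on the value stated in the report.
const literal = "723" + "0".repeat(73);
const doc = `{"value": ${literal}}`;

// A parser that maps JSON numbers to doubles accepts the document.
const value = JSON.parse(doc).value;

// Largest value an unsigned 64-bit integer can hold.
const UINT64_MAX = 2n ** 64n - 1n;

const summary = {
  isFiniteDouble: Number.isFinite(value),        // representable as Float64
  equalsExponentForm: value === 7.23e75,         // same double as the exponent form
  exceedsUint64: BigInt(literal) > UINT64_MAX,   // too large for any 64-bit integer
};
console.log(summary);
```

All three flags should come out true, which matches the report: the number is fine as a Float64 and only fails when it is treated as an integer.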

@qoega
Member

qoega commented Jan 30, 2023

The JSON definition does not specify number precision, so it is always implementation-defined, and ClickHouse will not convert a number to floating point just because it does not fit a specific integer type.

I suggest you try the input_format_json_read_numbers_as_strings setting for your use case and run the string-to-number conversion explicitly.
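If you control the code that produces the JSON, a similar "strings first, convert explicitly" approach can be applied on the client. The sketch below is only an illustration of that idea and is not the ClickHouse setting itself. The replacer name bigNumbersAsStrings is hypothetical, and using the safe-integer range as the cutoff is an assumption.

```javascript
// Hypothetical JSON.stringify replacer: emits integers outside the
// safe-integer range as JSON strings so the consumer converts them explicitly.
function bigNumbersAsStrings(key, value) {
  const isUnsafeInteger =
    typeof value === "number" &&
    Number.isInteger(value) &&
    !Number.isSafeInteger(value);
  return isUnsafeInteger ? String(value) : value;
}

const payload = JSON.stringify(
  { x: 18446744073709552000, y: 5 },
  bigNumbersAsStrings
);
console.log(payload);
```

The large value is sent as a quoted string and the small one stays a number, so the string-to-number conversion has to be done explicitly on the receiving side.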

@wizzard0

wizzard0 commented Oct 24, 2023

Same here. Per the JSON spec, numbers can be arbitrarily long; an implementation just isn't expected to preserve precision for whatever doesn't fit in a double (for integers, anything beyond 53 bits).

select isValidJSON('1844674400000000000') as ok    -- returns 1 (OK)
select isValidJSON('18446744073709552000') as bug  -- returns 0 (BUG) (CH v.23.4.1.1912)

Try JSON.parse("1844674400000000000000") in the browser console: it parses fine.
And it's not a BigInt thing: a regular JSON.stringify({x:18446744073709552000}) outputs '{"x":18446744073709552000}', i.e. the number is written as a plain integer rather than in exponential form, and ClickHouse doesn't consider that valid either.
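A small sketch of what JavaScript emits for the values mentioned above, and whether each plain-integer literal still fits in an unsigned 64-bit integer. The reading that ClickHouse rejects exactly the literals beyond that range is an assumption drawn from the two queries in this comment, not something confirmed by the maintainers.

```javascript
// Largest value an unsigned 64-bit integer can hold.
const UINT64_MAX = 2n ** 64n - 1n;

const samples = [1844674400000000000, 18446744073709552000, 1e21];

const rows = samples.map((n) => {
  const text = JSON.stringify(n);               // what JS actually sends
  const plainInteger = /^-?\d+$/.test(text);    // no fraction, no exponent
  return {
    text,
    plainInteger,
    fitsUint64: plainInteger && BigInt(text) <= UINT64_MAX,
  };
});
console.log(rows);
```

JavaScript only switches to exponent notation at 1e21 and above, so doubles between 2^64 and 1e21 are serialized as integer literals that no 64-bit type can hold.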

Adding ".0" helps, but I can't figure out an easy way to do that in JS.
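One possible way to apply the ".0" workaround in JS is to post-process the output of JSON.stringify. This is a rough sketch under some assumptions: the helper name stringifyWithFloatMarkers is made up, it only handles compact output (no indent argument), and it marks every integer literal outside the safe-integer range, which is a wider cutoff than strictly needed.

```javascript
// Hypothetical helper: appends ".0" to large plain-integer literals in
// compact JSON.stringify output, leaving string contents untouched.
function stringifyWithFloatMarkers(value) {
  const json = JSON.stringify(value);
  let out = "";
  let i = 0;
  while (i < json.length) {
    const ch = json[i];
    if (ch === '"') {
      // Copy a string literal (key or value) verbatim, honoring escapes.
      let j = i + 1;
      while (json[j] !== '"') {
        j += json[j] === "\\" ? 2 : 1;
      }
      out += json.slice(i, j + 1);
      i = j + 1;
    } else if (ch === "-" || (ch >= "0" && ch <= "9")) {
      // Consume one number literal.
      let j = i + 1;
      while (j < json.length && /[0-9.eE+\-]/.test(json[j])) j++;
      const literal = json.slice(i, j);
      const isPlainInteger = /^-?\d+$/.test(literal);
      const needsMarker = isPlainInteger && !Number.isSafeInteger(Number(literal));
      out += needsMarker ? literal + ".0" : literal;
      i = j;
    } else {
      // Structural characters and true/false/null pass through unchanged.
      out += ch;
      i++;
    }
  }
  return out;
}

const marked = stringifyWithFloatMarkers({ x: 18446744073709552000, s: "99999999999999999999" });
console.log(marked);
```

Since these values are already doubles in JavaScript, the marker does not lose any precision; it only changes how the literal is spelled. Digits inside strings are left alone.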

Though I agree that folding to doubles is a tradeoff as well; maybe this behavior could be made a column attribute? Having functions like JSONExtractString unexpectedly break for some rows because a single value doesn't fit isn't fun either.
