Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

toJson() produces invalid JSON for columns with "time" type #7988

Closed
nyh opened this issue Jan 28, 2021 · 4 comments
Closed

toJson() produces invalid JSON for columns with "time" type #7988

nyh opened this issue Jan 28, 2021 · 4 comments
Assignees
Milestone

Comments

@nyh
Copy link
Contributor

nyh commented Jan 28, 2021

The toJson() function prints columns of various types in JSON format. Because JSON has no time-specific type, Cassandra represents times in JSON as as string looking like "hh:mm:ss.fffffffff"

Scylla has a bug where it prints this string without the quotes... This is of course not legal JSON, and JSON parsers reading this output will not like it.

This bug is reproduced by the following two xfailing tests:

  1. test_json.py::test_tojson_time
  2. cassandra_tests/validation/entities/json_test.py::testToJsonFct, a translated Cassandra test which among many other things tests toJson() of a time column.
nyh added a commit to nyh/scylla that referenced this issue Feb 4, 2021
Add two xfailing tests reproducing two issues - scylladb#7988 and scylladb#8002.
The first is that toJson() incorrectly formats values of the "time"
type - it should be a string but Scylla forgets the quotes.
The second is that toJson() format "decimal" values as JSON numbers
without using an exponent, resulting in memory allocation failure
for numbers with high exponents, like 1e1000000000.

Both tests pass on Cassandra, and fail on Scylla.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
@slivne slivne added this to the 4.x milestone Feb 7, 2021
avikivity pushed a commit that referenced this issue Feb 14, 2021
Add two xfailing tests reproducing two issues - #7988 and #8002.
The first is that toJson() incorrectly formats values of the "time"
type - it should be a string but Scylla forgets the quotes.
The second is that toJson() format "decimal" values as JSON numbers
without using an exponent, resulting in memory allocation failure
for numbers with high exponents, like 1e1000000000.

Both tests pass on Cassandra, and fail on Scylla.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210201064842.440602-1-nyh@scylladb.com>
avikivity pushed a commit that referenced this issue Feb 15, 2021
This patch adds several more tests reproducing bugs in toJson() and
SELECT JSON.

First add two xfailing tests reproducing two toJson() issues - #7988
and #8002. The first is that toJson() incorrectly formats values of the
"time" type - it should be a string but Scylla forgets the quotes.
The second is that toJson() format "decimal" values as JSON numbers
without using an exponent, resulting in memory allocation failure
for numbers with high exponents, like 1e1000000000.

The actual test for 1e1000000000 has to be skipped because in
debug build mode we get a crash trying this huge allocation.
So instead, we check 1e1000 - this generates a string of 1000
characters, which is much too much (should just be "1e1000")
but doesn't crash.

Then we add a reproducing test for issue #8077: When using SELECT JSON
on a function, such as count(*), ttl(v) or intAsBlob(v), Cassandra has
a specific way how it formats the result in JSON, and Scylla should do
it the same way unless we have a good reason not to.

As usual, the new tests passes on Cassandra, fails on Scylla, so is marked
xfail.

Refs #7988
Refs #8002
Refs #8077.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210214210727.1098388-1-nyh@scylladb.com>
avikivity pushed a commit that referenced this issue Feb 16, 2021
In this patch, we port validation/entities/json_test.java, containing
21 tests for various JSON-related operations - SELECT JSON, INSERT JSON,
and the fromJson() and toJson() functions.

In porting these tests, I uncovered 19 (!!) previously unknown bugs in
Scylla:

Refs #7911: Failed fromJson() should result in FunctionFailure error, not
            an internal error.
Refs #7912: fromJson() should allow null parameter.
Refs #7914: fromJson() integer overflow should cause an error, not silent
            wrap-around.
Refs #7915: fromJson() should accept "true" and "false" also as strings.
Refs #7944: fromJson() should not accept the empty string "" as a number.
Refs #7949: fromJson() fails to set a map<ascii, int>.
Refs #7954: fromJson() fails to set null tuple elements.
Refs #7972: toJson() truncates some doubles to integers.
Refs #7988: toJson() produces invalid JSON for columns with "time" type.
Refs #7997: toJson() is missing a timezone on timestamp.
Refs #8001: Documented unit "µs" not supported for assigning a "duration"
            type.
Refs #8002: toJson() of decimal type doesn't use exponents so can produce
            huge output.
Refs #8077: SELECT JSON output for function invocations should be
            compatible with Cassandra.
Refs #8078: SELECT JSON ignores the "AS" specification.
Refs #8085: INSERT JSON with bad arguments should yield InvalidRequest
            error, not internal error.
Refs #8086: INSERT JSON cannot handle user-defined types with case-
            sensitive component names.
Refs #8087: SELECT JSON incorrectly quotes strings inside map keys.
Refs #8092: SELECT JSON missing null component after adding field to
            UDT definition.
Refs #8100: SELECT JSON with IN and ORDER BY does not obey the ORDER BY.

Due to these bugs, 9 out of the 21 tests here currently xfail.

As usual in these sort of tests, all 21 tests pass when running against
Cassandra.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210216154858.1172313-1-nyh@scylladb.com>
avikivity pushed a commit that referenced this issue Feb 18, 2021
In this patch, we port validation/entities/json_test.java, containing
21 tests for various JSON-related operations - SELECT JSON, INSERT JSON,
and the fromJson() and toJson() functions.

In porting these tests, I uncovered 19 (!!) previously unknown bugs in
Scylla:

Refs #7911: Failed fromJson() should result in FunctionFailure error, not
            an internal error.
Refs #7912: fromJson() should allow null parameter.
Refs #7914: fromJson() integer overflow should cause an error, not silent
            wrap-around.
Refs #7915: fromJson() should accept "true" and "false" also as strings.
Refs #7944: fromJson() should not accept the empty string "" as a number.
Refs #7949: fromJson() fails to set a map<ascii, int>.
Refs #7954: fromJson() fails to set null tuple elements.
Refs #7972: toJson() truncates some doubles to integers.
Refs #7988: toJson() produces invalid JSON for columns with "time" type.
Refs #7997: toJson() is missing a timezone on timestamp.
Refs #8001: Documented unit "µs" not supported for assigning a "duration"
            type.
Refs #8002: toJson() of decimal type doesn't use exponents so can produce
            huge output.
Refs #8077: SELECT JSON output for function invocations should be
            compatible with Cassandra.
Refs #8078: SELECT JSON ignores the "AS" specification.
Refs #8085: INSERT JSON with bad arguments should yield InvalidRequest
            error, not internal error.
Refs #8086: INSERT JSON cannot handle user-defined types with case-
            sensitive component names.
Refs #8087: SELECT JSON incorrectly quotes strings inside map keys.
Refs #8092: SELECT JSON missing null component after adding field to
            UDT definition.
Refs #8100: SELECT JSON with IN and ORDER BY does not obey the ORDER BY.

Due to these bugs, 8 out of the 21 tests here currently xfail and one
has to be skipped (issue #8100 causes the sanitizer to detect a use
after free, and crash Scylla).

As usual in these sort of tests, all 21 tests pass when running against
Cassandra.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20210217130732.1202811-1-nyh@scylladb.com>
@DoronArazii DoronArazii added P4 Low Priority and removed low labels Dec 28, 2022
@nyh
Copy link
Contributor Author

nyh commented Nov 21, 2023

This issue was encountered by a user, who was using "SELECT JSON" on a "time" column, and got incorrectly-formatted JSON without quotes. I think it's time to fix it, and I'll take a stab at it.

@nyh nyh self-assigned this Nov 21, 2023
nyh added a commit to nyh/scylla that referenced this issue Nov 21, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

Fixes scylladb#7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
nyh added a commit to nyh/scylla that referenced this issue Nov 21, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes scylladb#7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
@mykaul mykaul added backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed backport/5.4 Issues that should be backported to 5.4 branch once they'll be fixed labels Nov 27, 2023
@denesb
Copy link
Contributor

denesb commented Dec 14, 2023

Needs a backport in dtest, before we can backport.

denesb pushed a commit that referenced this issue Dec 15, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16121

(cherry picked from commit 8d04032)
@denesb
Copy link
Contributor

denesb commented Dec 15, 2023

Backported to 5.4. Need to wait a bit more with 5.2 as the dtest backport didn't promote yet.

@denesb denesb removed the backport/5.4 Issues that should be backported to 5.4 branch once they'll be fixed label Dec 15, 2023
denesb pushed a commit that referenced this issue Dec 18, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16121

(cherry picked from commit 8d04032)
@denesb
Copy link
Contributor

denesb commented Dec 18, 2023

Backported to 5.2.

@denesb denesb removed the backport/5.2 Issues that should be backported to 5.2 branch once they'll be fixed label Dec 18, 2023
denesb pushed a commit that referenced this issue Dec 18, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16121

(cherry picked from commit 8d04032)
denesb pushed a commit that referenced this issue Dec 18, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16121

(cherry picked from commit 8d04032)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants