Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cql: fix SELECT toJson() or SELECT JSON of time column #16121

Closed
wants to merge 1 commit into from

Conversation

nyh
Copy link
Contributor

@nyh nyh commented Nov 21, 2023

The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column of type "time" forgot to put the time string in quotes. The result was invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test for this issue which now starts to pass. We also add a second test for this issue - the existing test was for "SELECT TOJSON(t)", and the second test shows that "SELECT JSON t" had exactly the same bug - and both are fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug, but that test continues to fail because of other bugs, so we just need to update the xfail string.

Fixes #7988

@nyh nyh requested a review from ptrsmrn November 21, 2023 16:53
@nyh nyh requested a review from tgrabiec as a code owner November 21, 2023 16:53
@ptrsmrn
Copy link
Contributor

ptrsmrn commented Nov 21, 2023

It looks like 1 dtest (https://jenkins.scylladb.com/job/scylla-master/job/gating-dtest-release/6123/testReport/junit/json_test/TestsJsonFullRowInsertSelect/FullDtest___full_split000___test_simple_schema/) relies on this piece of functionality and that it's not only quotes that are different but also datetime format:
image

@scylladb-promoter
Copy link
Contributor

🔴 CI State: FAILURE

✅ - Build
❌ - Sanity Tests
❌ - Unit Tests

Failed Tests (2/21868):

Build Details:

@nyh
Copy link
Contributor Author

nyh commented Nov 21, 2023

It looks like 1 dtest (https://jenkins.scylladb.com/job/scylla-master/job/gating-dtest-release/6123/testReport/junit/json_test/TestsJsonFullRowInsertSelect/FullDtest___full_split000___test_simple_schema/) relies on this piece of functionality and that it's not only quotes that are different but also datetime format: image

Really weird, because didn't change the time format in this patch, I only changed the quotes around the existing format of the timestamp! I'll have to try and debug this locally.

But it made me realize I should probably expand my own test to have a date, not just time, to see if I have timestamp format bugs as well. Regadless of dtest.

@nyh
Copy link
Contributor Author

nyh commented Nov 21, 2023

I see there's also a failure in test/boost/json_cql_query_test.cc. I'll start working on a v2 of my patch.

The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes scylladb#7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
@nyh
Copy link
Contributor Author

nyh commented Nov 21, 2023

Pushed a new version which fixes the failing Boost test - which blatantly enshrined the wrong output :-(

(unlike the Python test, the C++ test doesn't run a JSON parser on the string so it didn't detect it was invalid JSON, and also doesn't run against Cassandra. By the way, the reason why the Python test does run a JSON parser on the string is that it wants to compare the expected and actual results are the same JSON, not the same strings - e.g., it's fine to have different whitespace in the JSON between Scylla and Cassandra.

The dtest will probably still fail, I'll have to look into that.

@nyh
Copy link
Contributor Author

nyh commented Nov 21, 2023

@fruch I'm still looking into the dtest failure of json_test.py::TestsJsonFullRowInsertSelect::test_simple_schema. It's enshrining wrong behavior and I'll need to fix it (the locksteping to do it is a disaster...), but I'm puzzled why was this test run in CI - I see it has the @pytest.mark.dtest_full mark, not gating. What am I missing?

Another strange thing is that this test also has the wrong format for "timestamp" columns. Clearly the test somehow ignores this failure. But how and where? :-( I'll need to figure out and probably do the same ignoring for "time". Perhaps we should just delete this test :-(

@fruch
Copy link
Contributor

fruch commented Nov 21, 2023

@fruch I'm still looking into the dtest failure of json_test.py::TestsJsonFullRowInsertSelect::test_simple_schema. It's enshrining wrong behavior and I'll need to fix it (the locksteping to do it is a disaster...), but I'm puzzled why was this test run in CI - I see it has the @pytest.mark.dtest_full mark, not gating. What am I missing?

There a next_gating applied to the whole file, see at the top of the file

Another strange thing is that this test also has the wrong format for "timestamp" columns. Clearly the test somehow ignores this failure. But how and where? :-( I'll need to figure out and probably do the same ignoring for "time". Perhaps we should just delete this test :-(

If you look closely at the test there's an OR there, and the test supports 2 different options for date

Since those keep changing, and the compares the output of the cqlsh

@nyh
Copy link
Contributor Author

nyh commented Nov 21, 2023

There a next_gating applied to the whole file, see at the top of the file

How not confusing :-(

Another strange thing is that this test also has the wrong format for "timestamp" columns. Clearly the test somehow ignores this failure. But how and where? :-( I'll need to figure out and probably do the same ignoring for "time". Perhaps we should just delete this test :-(

If you look closely at the test there's an OR there, and the test supports 2 different options for date

Yes, I finally figured this out.
So now I need to add a third option... I'll send a dtest PR in a few minutes.

@nyh
Copy link
Contributor Author

nyh commented Nov 21, 2023

Sent a dtest patch https://github.com/scylladb/scylla-dtest/pull/3742.
When the dtest patch goes in, I can rerun the CI here and expect it to finally pass.

@scylladb-promoter
Copy link
Contributor

🔴 CI State: FAILURE

✅ - Build
❌ - Sanity Tests
✅ - Unit Tests

Failed Tests (1/21868):

Build Details:

@mykaul
Copy link
Contributor

mykaul commented Nov 22, 2023

I suspect we need to re-run CI now with the fixed dtest?

@nyh
Copy link
Contributor Author

nyh commented Nov 23, 2023

I suspect we need to re-run CI now with the fixed dtest?

Yes, I see now this dtest did reach master - github's UI in https://github.com/scylladb/scylla-dtest/commit/8ebe00c86e00be9f6bc1eb6b26bb931ee9dab5b0 confused me to think it was merged into dtest next but did not reach master. But apparently it did. So I'll retry the CI.

@scylladb-promoter
Copy link
Contributor

🟢 CI State: SUCCESS

✅ - Build
✅ - Sanity Tests
✅ - Unit Tests

Build Details:

@nyh
Copy link
Contributor Author

nyh commented Nov 26, 2023

ping @scylladb/scylla-maint - the fixed passed CI (after I fixed dtest) and has one positive review.

denesb pushed a commit that referenced this pull request Dec 15, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16121

(cherry picked from commit 8d04032)
denesb pushed a commit that referenced this pull request Dec 18, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16121

(cherry picked from commit 8d04032)
denesb pushed a commit that referenced this pull request Dec 18, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16121

(cherry picked from commit 8d04032)
denesb pushed a commit that referenced this pull request Dec 18, 2023
The implementation of "SELECT TOJSON(t)" or "SELECT JSON t" for a column
of type "time" forgot to put the time string in quotes. The result was
invalid JSON. This is patch is a one-liner fixing this bug.

This patch also removes the "xfail" marker from one xfailing test
for this issue which now starts to pass. We also add a second test for
this issue - the existing test was for "SELECT TOJSON(t)", and the second
test shows that "SELECT JSON t" had exactly the same bug - and both are
fixed by the same patch.

We also had a test translated from Cassandra which exposed this bug,
but that test continues to fail because of other bugs, so we just
need to update the xfail string.

The patch also fixes one C++ test, test/boost/json_cql_query_test.cc,
which enshrined the *wrong* behavior - JSON output that isn't even
valid JSON - and had to be fixed. Unlike the Python tests, the C++ test
can't be run against Cassandra, and doesn't even run a JSON parser
on the output, which explains how it came to enshrine wrong output
instead of helping to discover the bug.

Fixes #7988

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #16121

(cherry picked from commit 8d04032)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

toJson() produces invalid JSON for columns with "time" type
6 participants