Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fromJson() or INSERT JSON fails to set a map<timeuuid, int> #18477

Closed
nyh opened this issue Apr 30, 2024 · 1 comment
Closed

fromJson() or INSERT JSON fails to set a map<timeuuid, int> #18477

nyh opened this issue Apr 30, 2024 · 1 comment

Comments

@nyh
Copy link
Contributor

nyh commented Apr 30, 2024

More than 3 years ago, in #7949, we noticed that trying to insert a JSON (using either fromJson() or INSERT JSON) into a map<ascii, int> doesn't work. We only fixed that bug recently, but a user just discovered that the same bug exists even today for timeuuid keys, i.e., map<timeuuid, int>.

The following test with tuuidmap being a map<timeuuid, int> works on Cassandra and fails on Scylla:

def test_fromjson_map_timeuuid(cql, table1):
    p = unique_key_int()
    cql.execute("INSERT INTO " + table1 + " (p, tuuidmap) VALUES (" + str(p) + ", fromJson('{\"a4a70900-24e1-11df-8924-001ff3591711\": 3}'))")
    assert list(cql.execute(f"SELECT p, tuuidmap from {table1} where p = {p}")) == [(p, {UUID('a4a70900-24e1-11df-8924-001ff3591711'): 3})]

The failure on Scylla is:

execution of fromjson failed: Failed execution of function system.fromjson: marshal_exception (marshaling error: Failed parsing map_key "a4a70900-24e1-11df-8924-001ff3591711": Parsing JSON failed: Invalid value. at 0)

Evidently, the parsing code doesn't understand that what is inside the quotes as the key is a string that doesn't need to be parsed further - and tries to parse it as JSON again, which is a mistake. The fix will probably be very similar to the one for ascii key (#7949, fixed by 75109e9)

We should probably check the same for other types as well before discovering the same bug for yet another type in three years time.

@nyh nyh self-assigned this Apr 30, 2024
@nyh nyh changed the title fromJson() or INSERT JSON fails to set a map<ascii, int> fromJson() or INSERT JSON fails to set a map<timeuuid, int> Apr 30, 2024
@nyh
Copy link
Contributor Author

nyh commented Apr 30, 2024

I'm writing more tests and sadly we also have bugs in INSERT JSON of maps with the following key types:

  • timeuuid
  • blob
  • date
  • inet
  • time
  • timestamp
  • uuid

My tests shows that for all other types (various number types, string, ascii (after the previous fix) and boolean) it works correctly.

nyh added a commit to nyh/scylla that referenced this issue Apr 30, 2024
More than three years ago, in issue scylladb#7949, we noticed that trying to
set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the
fromJson() function) fails - the ascii key is incorrectly parsed.
We fixed that issue in commit 75109e9
but unfortunately, did not do our due diligence: We did not write enough
tests inspired by this bug, and failed to discover that actually we have
the same bug for many other key types, not just for "ascii". Specifically,
the following key types have exactly the same bug:

  * blob
  * date
  * inet
  * time
  * timestamp
  * timeuuid
  * uuid

Other types, like numbers or boolean worked "by accident" - instead of
parsing them as a normal string, we asked the JSON parser to parse them
again after removing the quotes, and because unquoted numbers and
unquoted true/false happwn to work in JSON, this didn't fail.

The fix here is very simple - for all *native* types (i.e., not
collections or tuples), the encoding of the key in JSON is simply a
quoted string - and removing the quotes is all we need to do and there's
no need to run the JSON parser a second time. Only for more elaborate
types - collections and tuples - we need to run the JSON parser a
second time on the key string to build the more elaborate object.

This patch also includes tests for fromJson() reading a map with all
native key types, confirming that all the aforementioned key types
were broken before this patch, and all key types (including the numbers
and booleans which worked even befoe this patch) work with this patch.

Fixes scylladb#18477.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
mergify bot pushed a commit that referenced this issue May 5, 2024
More than three years ago, in issue #7949, we noticed that trying to
set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the
fromJson() function) fails - the ascii key is incorrectly parsed.
We fixed that issue in commit 75109e9
but unfortunately, did not do our due diligence: We did not write enough
tests inspired by this bug, and failed to discover that actually we have
the same bug for many other key types, not just for "ascii". Specifically,
the following key types have exactly the same bug:

  * blob
  * date
  * inet
  * time
  * timestamp
  * timeuuid
  * uuid

Other types, like numbers or boolean worked "by accident" - instead of
parsing them as a normal string, we asked the JSON parser to parse them
again after removing the quotes, and because unquoted numbers and
unquoted true/false happwn to work in JSON, this didn't fail.

The fix here is very simple - for all *native* types (i.e., not
collections or tuples), the encoding of the key in JSON is simply a
quoted string - and removing the quotes is all we need to do and there's
no need to run the JSON parser a second time. Only for more elaborate
types - collections and tuples - we need to run the JSON parser a
second time on the key string to build the more elaborate object.

This patch also includes tests for fromJson() reading a map with all
native key types, confirming that all the aforementioned key types
were broken before this patch, and all key types (including the numbers
and booleans which worked even befoe this patch) work with this patch.

Fixes #18477.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
(cherry picked from commit 7821bba)
mergify bot pushed a commit that referenced this issue May 5, 2024
More than three years ago, in issue #7949, we noticed that trying to
set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the
fromJson() function) fails - the ascii key is incorrectly parsed.
We fixed that issue in commit 75109e9
but unfortunately, did not do our due diligence: We did not write enough
tests inspired by this bug, and failed to discover that actually we have
the same bug for many other key types, not just for "ascii". Specifically,
the following key types have exactly the same bug:

  * blob
  * date
  * inet
  * time
  * timestamp
  * timeuuid
  * uuid

Other types, like numbers or boolean worked "by accident" - instead of
parsing them as a normal string, we asked the JSON parser to parse them
again after removing the quotes, and because unquoted numbers and
unquoted true/false happwn to work in JSON, this didn't fail.

The fix here is very simple - for all *native* types (i.e., not
collections or tuples), the encoding of the key in JSON is simply a
quoted string - and removing the quotes is all we need to do and there's
no need to run the JSON parser a second time. Only for more elaborate
types - collections and tuples - we need to run the JSON parser a
second time on the key string to build the more elaborate object.

This patch also includes tests for fromJson() reading a map with all
native key types, confirming that all the aforementioned key types
were broken before this patch, and all key types (including the numbers
and booleans which worked even befoe this patch) work with this patch.

Fixes #18477.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
(cherry picked from commit 7821bba)

# Conflicts:
#	test/cql-pytest/test_json.py
nyh added a commit that referenced this issue May 5, 2024
More than three years ago, in issue #7949, we noticed that trying to
set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the
fromJson() function) fails - the ascii key is incorrectly parsed.
We fixed that issue in commit 75109e9
but unfortunately, did not do our due diligence: We did not write enough
tests inspired by this bug, and failed to discover that actually we have
the same bug for many other key types, not just for "ascii". Specifically,
the following key types have exactly the same bug:

  * blob
  * date
  * inet
  * time
  * timestamp
  * timeuuid
  * uuid

Other types, like numbers or boolean worked "by accident" - instead of
parsing them as a normal string, we asked the JSON parser to parse them
again after removing the quotes, and because unquoted numbers and
unquoted true/false happwn to work in JSON, this didn't fail.

The fix here is very simple - for all *native* types (i.e., not
collections or tuples), the encoding of the key in JSON is simply a
quoted string - and removing the quotes is all we need to do and there's
no need to run the JSON parser a second time. Only for more elaborate
types - collections and tuples - we need to run the JSON parser a
second time on the key string to build the more elaborate object.

This patch also includes tests for fromJson() reading a map with all
native key types, confirming that all the aforementioned key types
were broken before this patch, and all key types (including the numbers
and booleans which worked even befoe this patch) work with this patch.

Fixes #18477.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
(cherry picked from commit 21557cf)

Closes #18522
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants