New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fromJson() fails to set a map<ascii, int> #7949
Comments
psarna
pushed a commit
that referenced
this issue
Jan 22, 2021
The fromJson() function can take a map JSON and use it to set a map column. However, the specific example of a map<ascii, int> doesn't work in Scylla (it does work in Cassandra). The xfailing tests in this patch demonstrate this. Although the tests use perfectly legal ASCII, scylla fails the fromJson() function, with a misleading error. Refs #7949. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210121233855.100640-1-nyh@scylladb.com>
syuu1228
pushed a commit
to syuu1228/scylla
that referenced
this issue
Jan 25, 2021
The fromJson() function can take a map JSON and use it to set a map column. However, the specific example of a map<ascii, int> doesn't work in Scylla (it does work in Cassandra). The xfailing tests in this patch demonstrate this. Although the tests use perfectly legal ASCII, scylla fails the fromJson() function, with a misleading error. Refs scylladb#7949. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210121233855.100640-1-nyh@scylladb.com>
avikivity
pushed a commit
that referenced
this issue
Feb 16, 2021
In this patch, we port validation/entities/json_test.java, containing 21 tests for various JSON-related operations - SELECT JSON, INSERT JSON, and the fromJson() and toJson() functions. In porting these tests, I uncovered 19 (!!) previously unknown bugs in Scylla: Refs #7911: Failed fromJson() should result in FunctionFailure error, not an internal error. Refs #7912: fromJson() should allow null parameter. Refs #7914: fromJson() integer overflow should cause an error, not silent wrap-around. Refs #7915: fromJson() should accept "true" and "false" also as strings. Refs #7944: fromJson() should not accept the empty string "" as a number. Refs #7949: fromJson() fails to set a map<ascii, int>. Refs #7954: fromJson() fails to set null tuple elements. Refs #7972: toJson() truncates some doubles to integers. Refs #7988: toJson() produces invalid JSON for columns with "time" type. Refs #7997: toJson() is missing a timezone on timestamp. Refs #8001: Documented unit "µs" not supported for assigning a "duration" type. Refs #8002: toJson() of decimal type doesn't use exponents so can produce huge output. Refs #8077: SELECT JSON output for function invocations should be compatible with Cassandra. Refs #8078: SELECT JSON ignores the "AS" specification. Refs #8085: INSERT JSON with bad arguments should yield InvalidRequest error, not internal error. Refs #8086: INSERT JSON cannot handle user-defined types with case- sensitive component names. Refs #8087: SELECT JSON incorrectly quotes strings inside map keys. Refs #8092: SELECT JSON missing null component after adding field to UDT definition. Refs #8100: SELECT JSON with IN and ORDER BY does not obey the ORDER BY. Due to these bugs, 9 out of the 21 tests here currently xfail. As usual in these sort of tests, all 21 tests pass when running against Cassandra. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210216154858.1172313-1-nyh@scylladb.com>
avikivity
pushed a commit
that referenced
this issue
Feb 18, 2021
In this patch, we port validation/entities/json_test.java, containing 21 tests for various JSON-related operations - SELECT JSON, INSERT JSON, and the fromJson() and toJson() functions. In porting these tests, I uncovered 19 (!!) previously unknown bugs in Scylla: Refs #7911: Failed fromJson() should result in FunctionFailure error, not an internal error. Refs #7912: fromJson() should allow null parameter. Refs #7914: fromJson() integer overflow should cause an error, not silent wrap-around. Refs #7915: fromJson() should accept "true" and "false" also as strings. Refs #7944: fromJson() should not accept the empty string "" as a number. Refs #7949: fromJson() fails to set a map<ascii, int>. Refs #7954: fromJson() fails to set null tuple elements. Refs #7972: toJson() truncates some doubles to integers. Refs #7988: toJson() produces invalid JSON for columns with "time" type. Refs #7997: toJson() is missing a timezone on timestamp. Refs #8001: Documented unit "µs" not supported for assigning a "duration" type. Refs #8002: toJson() of decimal type doesn't use exponents so can produce huge output. Refs #8077: SELECT JSON output for function invocations should be compatible with Cassandra. Refs #8078: SELECT JSON ignores the "AS" specification. Refs #8085: INSERT JSON with bad arguments should yield InvalidRequest error, not internal error. Refs #8086: INSERT JSON cannot handle user-defined types with case- sensitive component names. Refs #8087: SELECT JSON incorrectly quotes strings inside map keys. Refs #8092: SELECT JSON missing null component after adding field to UDT definition. Refs #8100: SELECT JSON with IN and ORDER BY does not obey the ORDER BY. Due to these bugs, 8 out of the 21 tests here currently xfail and one has to be skipped (issue #8100 causes the sanitizer to detect a use after free, and crash Scylla). As usual in these sort of tests, all 21 tests pass when running against Cassandra. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210217130732.1202811-1-nyh@scylladb.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 20, 2023
For column type of map<ascii, int>, we don't want to parse the map keys as JSON. Because ascii map keys are not valid JSON. Fixes: scylladb#7949 Signed-off-by: michaelhly <michaelhly@gmail.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 20, 2023
For column type of map<ascii, int>, we don't want to parse the map keys as JSON. Because ascii map keys are not valid JSON. Fixes: scylladb#7949 Signed-off-by: michaelhly <michaelhly@gmail.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 20, 2023
For column type of map<ascii, int>, we don't want to parse the map keys as JSON. Because ascii map keys are not valid JSON. Fixes: scylladb#7949 Signed-off-by: michaelhly <michaelhly@gmail.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 20, 2023
For column type of map<ascii, int>, we don't want to parse the map keys as JSON. Because ascii characters are not valid JSON. Fixes: scylladb#7949 Signed-off-by: michaelhly <michaelhly@gmail.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 20, 2023
For column type of map<ascii, int>, we don't want to parse the map keys as JSON. Because ascii characters are not valid JSON. Fixes: scylladb#7949 Signed-off-by: michaelhly <michaelhly@gmail.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 21, 2023
For JSON objects represented as map<ascii, int>, don't treat ASCII keys as a nested JSON string. We were doing that prior to the patch, which led to parsing errors. Fixes: scylladb#7949 Signed-off-by: michaelhly <michaelhly@gmail.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 21, 2023
For JSON objects represented as map<ascii, int>, don't treat ASCII keys as a nested JSON string. We were doing that prior to the patch, which led to parsing errors. Fixes: scylladb#7949 Signed-off-by: michaelhly <michaelhly@gmail.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 21, 2023
For JSON objects represented as map<ascii, int>, don't treat ASCII keys as a nested JSON string. We were doing that prior to the patch, which led to parsing errors. Included the error offset where JSON parsing failed for rjson::parse related functions to help identify parsing errors better. Fixes: scylladb#7949 Signed-off-by: Michael Huang <michaelhly@gmail.com>
michaelhly
added a commit
to michaelhly/scylladb
that referenced
this issue
Sep 22, 2023
For JSON objects represented as map<ascii, int>, don't treat ASCII keys as a nested JSON string. We were doing that prior to the patch, which led to parsing errors. Included the error offset where JSON parsing failed for rjson::parse related functions to help identify parsing errors better. Fixes: scylladb#7949 Signed-off-by: Michael Huang <michaelhly@gmail.com>
denesb
pushed a commit
that referenced
this issue
Dec 18, 2023
For JSON objects represented as map<ascii, int>, don't treat ASCII keys as a nested JSON string. We were doing that prior to the patch, which led to parsing errors. Included the error offset where JSON parsing failed for rjson::parse related functions to help identify parsing errors better. Fixes: #7949 Signed-off-by: Michael Huang <michaelhly@gmail.com> Closes #15499 (cherry picked from commit 75109e9)
Backported to 5.2. |
nyh
added a commit
to nyh/scylla
that referenced
this issue
Apr 30, 2024
More than three years ago, in issue scylladb#7949, we noticed that trying to set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the fromJson() function) fails - the ascii key is incorrectly parsed. We fixed that issue in commit 75109e9 but unfortunately, did not do our due diligence: We did not write enough tests inspired by this bug, and failed to discover that actually we have the same bug for many other key types, not just for "ascii". Specifically, the following key types have exactly the same bug: * blob * date * inet * time * timestamp * timeuuid * uuid Other types, like numbers or boolean worked "by accident" - instead of parsing them as a normal string, we asked the JSON parser to parse them again after removing the quotes, and because unquoted numbers and unquoted true/false happwn to work in JSON, this didn't fail. The fix here is very simple - for all *native* types (i.e., not collections or tuples), the encoding of the key in JSON is simply a quoted string - and removing the quotes is all we need to do and there's no need to run the JSON parser a second time. Only for more elaborate types - collections and tuples - we need to run the JSON parser a second time on the key string to build the more elaborate object. This patch also includes tests for fromJson() reading a map with all native key types, confirming that all the aforementioned key types were broken before this patch, and all key types (including the numbers and booleans which worked even befoe this patch) work with this patch. Fixes scylladb#18477. Signed-off-by: Nadav Har'El <nyh@scylladb.com>
avikivity
pushed a commit
that referenced
this issue
May 5, 2024
More than three years ago, in issue #7949, we noticed that trying to set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the fromJson() function) fails - the ascii key is incorrectly parsed. We fixed that issue in commit 75109e9 but unfortunately, did not do our due diligence: We did not write enough tests inspired by this bug, and failed to discover that actually we have the same bug for many other key types, not just for "ascii". Specifically, the following key types have exactly the same bug: * blob * date * inet * time * timestamp * timeuuid * uuid Other types, like numbers or boolean worked "by accident" - instead of parsing them as a normal string, we asked the JSON parser to parse them again after removing the quotes, and because unquoted numbers and unquoted true/false happwn to work in JSON, this didn't fail. The fix here is very simple - for all *native* types (i.e., not collections or tuples), the encoding of the key in JSON is simply a quoted string - and removing the quotes is all we need to do and there's no need to run the JSON parser a second time. Only for more elaborate types - collections and tuples - we need to run the JSON parser a second time on the key string to build the more elaborate object. This patch also includes tests for fromJson() reading a map with all native key types, confirming that all the aforementioned key types were broken before this patch, and all key types (including the numbers and booleans which worked even befoe this patch) work with this patch. Fixes #18477. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #18482
mergify bot
pushed a commit
that referenced
this issue
May 5, 2024
More than three years ago, in issue #7949, we noticed that trying to set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the fromJson() function) fails - the ascii key is incorrectly parsed. We fixed that issue in commit 75109e9 but unfortunately, did not do our due diligence: We did not write enough tests inspired by this bug, and failed to discover that actually we have the same bug for many other key types, not just for "ascii". Specifically, the following key types have exactly the same bug: * blob * date * inet * time * timestamp * timeuuid * uuid Other types, like numbers or boolean worked "by accident" - instead of parsing them as a normal string, we asked the JSON parser to parse them again after removing the quotes, and because unquoted numbers and unquoted true/false happwn to work in JSON, this didn't fail. The fix here is very simple - for all *native* types (i.e., not collections or tuples), the encoding of the key in JSON is simply a quoted string - and removing the quotes is all we need to do and there's no need to run the JSON parser a second time. Only for more elaborate types - collections and tuples - we need to run the JSON parser a second time on the key string to build the more elaborate object. This patch also includes tests for fromJson() reading a map with all native key types, confirming that all the aforementioned key types were broken before this patch, and all key types (including the numbers and booleans which worked even befoe this patch) work with this patch. Fixes #18477. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit 7821bba)
mergify bot
pushed a commit
that referenced
this issue
May 5, 2024
More than three years ago, in issue #7949, we noticed that trying to set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the fromJson() function) fails - the ascii key is incorrectly parsed. We fixed that issue in commit 75109e9 but unfortunately, did not do our due diligence: We did not write enough tests inspired by this bug, and failed to discover that actually we have the same bug for many other key types, not just for "ascii". Specifically, the following key types have exactly the same bug: * blob * date * inet * time * timestamp * timeuuid * uuid Other types, like numbers or boolean worked "by accident" - instead of parsing them as a normal string, we asked the JSON parser to parse them again after removing the quotes, and because unquoted numbers and unquoted true/false happwn to work in JSON, this didn't fail. The fix here is very simple - for all *native* types (i.e., not collections or tuples), the encoding of the key in JSON is simply a quoted string - and removing the quotes is all we need to do and there's no need to run the JSON parser a second time. Only for more elaborate types - collections and tuples - we need to run the JSON parser a second time on the key string to build the more elaborate object. This patch also includes tests for fromJson() reading a map with all native key types, confirming that all the aforementioned key types were broken before this patch, and all key types (including the numbers and booleans which worked even befoe this patch) work with this patch. Fixes #18477. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit 7821bba) # Conflicts: # test/cql-pytest/test_json.py
nyh
added a commit
that referenced
this issue
May 5, 2024
More than three years ago, in issue #7949, we noticed that trying to set a `map<ascii, int>` from JSON input (i.e., using INSERT JSON or the fromJson() function) fails - the ascii key is incorrectly parsed. We fixed that issue in commit 75109e9 but unfortunately, did not do our due diligence: We did not write enough tests inspired by this bug, and failed to discover that actually we have the same bug for many other key types, not just for "ascii". Specifically, the following key types have exactly the same bug: * blob * date * inet * time * timestamp * timeuuid * uuid Other types, like numbers or boolean worked "by accident" - instead of parsing them as a normal string, we asked the JSON parser to parse them again after removing the quotes, and because unquoted numbers and unquoted true/false happwn to work in JSON, this didn't fail. The fix here is very simple - for all *native* types (i.e., not collections or tuples), the encoding of the key in JSON is simply a quoted string - and removing the quotes is all we need to do and there's no need to run the JSON parser a second time. Only for more elaborate types - collections and tuples - we need to run the JSON parser a second time on the key string to build the more elaborate object. This patch also includes tests for fromJson() reading a map with all native key types, confirming that all the aforementioned key types were broken before this patch, and all key types (including the numbers and booleans which worked even befoe this patch) work with this patch. Fixes #18477. Signed-off-by: Nadav Har'El <nyh@scylladb.com> (cherry picked from commit 21557cf) Closes #18522
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
The
fromJson()
function can take a map JSON and use it to set a map column. For example, if we have amap<text, int>
columnm
, the following test works well in coverting the JSON{"a": 1, "b": 2}
to a value for this column:However, we have a bug if the column type is a
map<ascii, int>
- note the ascii instead of text. Although the strings "a" and "b" above are perfectly legal ASCII, Scylla fails the statement execution, with the error:The same thing works correctly in Cassandra, and is in fact checked by the Cassandra
test cassandra_tests/validation/entities/json_test.py
::testFromJsonFct
- which fails on Scylla on this issue. I also created separate tests for this issue (based on the example code above) intest_json.py
::test_fromjson_map_ascii_unprepared
andtest_fromjson_map_ascii_prepared
.The "Parsing JSON failed" error is a red herring: This is a perfectly valid JSON, and moreover, exactly the same statement works if the type of the column is
map<text,int>
instead ofmap<ascii,int>
- and it parses exactly the same JSON without any complains. I'm guessing that the ASCII validation is incorrectly applied to the correctly-parsed JSON, and when this validation (wrongly) fails, we produce a misleading error message.As part of fixing this issue we should also improve this error message - which may (?) be visible in other cases as well and mislead us again.
When we discover the cause for this bug, we should consider whether it is specific to ascii (why?) and map - or maybe a similar bug applies to other key types? To value types? To other collection types?
The text was updated successfully, but these errors were encountered: