Skip to content
/ server Public

Comments

MDEV-24517: JSON_EXTRACT as select conditions syntax error on Spider#1839

Merged
nayuta-yanagisawa merged 2 commits intoMariaDB:10.3from
yongxin-xu:10.6-MDEV-24517
Jul 23, 2021
Merged

MDEV-24517: JSON_EXTRACT as select conditions syntax error on Spider#1839
nayuta-yanagisawa merged 2 commits intoMariaDB:10.3from
yongxin-xu:10.6-MDEV-24517

Conversation

@yongxin-xu
Copy link
Contributor

how to repeat:
CREATE TABLE t1(i INT PRIMARY KEY, j JSON)ENGINE=spider...
INSERT INTO t1 VALUES(1, '{"Name":"Tom", "Age":18}'), (2, '{"Name":"Jerry", "Age":20}');
SELECT * FROM t1 WHERE json_extract(jdoc, '$.Age')=20; -- ERROR 1064 (42000): You have an error in your SQL syntax;

how to solve:
Let Spider process Item_func::JSON_EXTRACT_FUNC as Item_func::UNKNOWN_FUNC does

@nayuta-yanagisawa
Copy link
Contributor

@yongxin-xu Hi! Thank you for your patch, and sorry for our silence. @vuvova and I will review your patch.

First of all, could you please change the base branch to 10.3? This is a bug fix, thus the patch should be merged into the oldest supported version that has the bug (10.3 in this case).

@yongxin-xu
Copy link
Contributor Author

@nayuta-yanagisawa Hi! Thanks for the reply.

I would like to know why I need to change the base branch. This bug fix also works for branch 10.6.

@nayuta-yanagisawa
Copy link
Contributor

Yes, we could review even on 10.6. However, we need to merge the fix to all the supported versions. We frequently merge an older branch to a newer one (e.g., e7f4daf). So, once your patch is merged into 10.3, it will be merged into newer branches too.

@yongxin-xu yongxin-xu changed the base branch from 10.6 to 10.3 July 21, 2021 09:47
@yongxin-xu
Copy link
Contributor Author

@nayuta-yanagisawa Thanks for explaining! I've changed the base branch to 10.3 and updated my commit. Please take a look.

Copy link
Contributor

@nayuta-yanagisawa nayuta-yanagisawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch looks mostly good to me, but I have a comment on the handling of JSON_EXTRACT.

Comment on lines 4152 to 4154
case Item_func::JSON_EXTRACT_FUNC:
/* JSON_EXTRACT_FUNC should be processed the same way as UNKNOWN_FUNC */
/* fall through */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already know the Functype of JSON_EXTRACT, and thus there seems to be no need to handle it as if it is UNKNOWN_FUNC. In fact, JSON_EXTRACT doesn't match any of if ... else if conditions under CASE Item_func::UNKNOWN_FUNC, and is processed by spd_db_mysql.cc:4787-4800.

It seems to be better, for me, to treat JSON_EXTRACT as other functions that have known types. In other words, I prefer to treat JSON_EXTRACT as the following:

      ...
    case Item_func::JSON_EXTRACT_FUNC:
      /* code only for JSON_EXTRACT */
      break;
    case Item_func::FooBar:
      ...

@yongxin-xu yongxin-xu force-pushed the 10.6-MDEV-24517 branch 2 times, most recently from 3e53993 to a122a18 Compare July 22, 2021 03:12
@yongxin-xu
Copy link
Contributor Author

@nayuta-yanagisawa You are correct. I have moved the Item_func::JSON_EXTRACT_FUNC case just before the default case and write the corresponding code.

Copy link
Contributor

@nayuta-yanagisawa nayuta-yanagisawa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Let' wait for @vuvova 's review.

@nayuta-yanagisawa
Copy link
Contributor

@yongxin-xu Just one thing. Please add a descriptive commit message.

The `item_func::JSON_EXTRACT_FUNC` was not handled correctly in the previous
versions on the Spider storage engine, which makes queries like
`SELECT * FROM t1 WHERE json_extract(jdoc, '$.Age')=20`
failed with syntax error.

This patch writes specific code to handle JSON_EXTRACT in the Spider Storage
Engine and fix that bug.
@yongxin-xu
Copy link
Contributor Author

@nayuta-yanagisawa Sorry, I forgot to write a specific commit message. I have rebased on the latest 10.3 code and use git commit --amend to write a new commit message.

Copy link
Member

@vuvova vuvova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally isn't a big fan of copy-pasting the same code all around. In that sense I liked more your first approach, of treating JSON_EXTRACT as an Item_func::UNKNOWN_FUNC.

But @nayuta-yanagisawa is right, there is no need to go through all these strcmp's for a function that we already know. So what about this:

--- a/storage/spider/spd_db_mysql.cc
+++ b/storage/spider/spd_db_mysql.cc
@@ -4711,6 +4711,9 @@ int spider_db_mbase_util::open_item_func(
           DBUG_RETURN(0);
         }
       }
+      /* normal function, no special treatment */
+      /* fall through */
+    case Item_func::JSON_EXTRACT:
       if (str)
       {
         if (str->reserve(func_name_length + SPIDER_SQL_OPEN_PAREN_LEN))

Perhaps in the future more functions can be added here. I would expect that many functions can be treated just as "function name, open paren, comma separated arguments, closing paren".

@nayuta-yanagisawa
Copy link
Contributor

Thank you for your look. I agree. That looks also better to me.

@yongxin-xu
Copy link
Contributor Author

--- a/storage/spider/spd_db_mysql.cc
+++ b/storage/spider/spd_db_mysql.cc
@@ -4711,6 +4711,9 @@ int spider_db_mbase_util::open_item_func(
           DBUG_RETURN(0);
         }
       }
-      /* normal function, no special treatment */
+      /* above are codes for Item_func::UNKNOWN_FUNC */
+      /* fall through */
+    case Item_func::JSON_EXTRACT:
       if (str)
       {
         if (str->reserve(func_name_length + SPIDER_SQL_OPEN_PAREN_LEN))

@vuvova Thanks for your look. I'm afraid this would bring some problems since we must assign func_name and func_name_length before the if (str) line.

In your suggestion, the code above case Item_func::JSON_EXTRACT is for Item_func::UNKNOWN_FUNC, which have assigned func_name and func_name_lengthin the very beginning of its case. The code for UNKNOWN_FUNC is very long, making it hard to maintain in the future (since some code might do a second assignment for func_name and func_name_length before reaching codecase Item_func::JSON_EXTRACT).

My suggestion is to use my way to process item_func::JSON_EXTRACT_FUNC

@@ -5084,6 +5084,23 @@ int spider_db_mbase_util::open_item_func(
 #else
       DBUG_RETURN(ER_SPIDER_COND_SKIP_NUM);
 #endif
+    case Item_func::JSON_EXTRACT_FUNC:
+      func_name = (char*) item_func->func_name();
+      func_name_length = strlen(func_name);
+      if (str)
+      {
+        if (str->reserve(func_name_length + SPIDER_SQL_OPEN_PAREN_LEN))
+          DBUG_RETURN(HA_ERR_OUT_OF_MEM);
+        str->q_append(func_name, func_name_length);
+        str->q_append(SPIDER_SQL_OPEN_PAREN_STR, SPIDER_SQL_OPEN_PAREN_LEN);
+      }
+      func_name = SPIDER_SQL_COMMA_STR;
+      func_name_length = SPIDER_SQL_COMMA_LEN;
+      separator_str = SPIDER_SQL_COMMA_STR;
+      separator_str_length = SPIDER_SQL_COMMA_LEN;
+      last_str = SPIDER_SQL_CLOSE_PAREN_STR;
+      last_str_length = SPIDER_SQL_CLOSE_PAREN_LEN;
+      break;
     default:

In this way, we can also add some functions in the future if they can be processed as Item_func::JSON_EXTRACT_FUNC

     case Item_func::JSON_EXTRACT_FUNC:
+    case Item_func::SOME_NEW_FUNC:
+    func_name = (char*) item_func->func_name();
+    ...

Copy link
Member

@vuvova vuvova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. So, my suggestion is still to avoid code duplication, when possible.

But I feel like the decision should be made by people who actually write Spider code. Than is @nayuta-yanagisawa and (judging by the number of pull requests) you.

@nayuta-yanagisawa
Copy link
Contributor

nayuta-yanagisawa commented Jul 23, 2021

@yongxin-xu How about the following fix? We could avoid code duplication by this. Yes, assignments to func_name and func_name_length are redundant for UNKNOWN_FUNC, but I think this level of redundancy is acceptable.

diff --git a/storage/spider/spd_db_mysql.cc b/storage/spider/spd_db_mysql.cc
index 85c765a1a27..31825cb9d79 100644
--- a/storage/spider/spd_db_mysql.cc
+++ b/storage/spider/spd_db_mysql.cc
@@ -4711,6 +4711,11 @@ int spider_db_mbase_util::open_item_func(
           DBUG_RETURN(0);
         }
       }
+      /* normal function, no special treatment */
+      /* fall through */
+    case Item_func::JSON_EXTRACT_FUNC:
+      func_name = (char*) item_func->func_name();
+      func_name_length = strlen(func_name);
       if (str)
       {
         if (str->reserve(func_name_length + SPIDER_SQL_OPEN_PAREN_LEN))

@yongxin-xu
Copy link
Contributor Author

@nayuta-yanagisawa
Consider this situation

  case Item_func::UNKNOWN_FUNC:
    func_name = (char*) item_func->func_name(); /* "func_a" */
    ...
    if (some condition) {
        ...
        func_name = some_other_function; /* "func_b" */
        ...
    }
  /* fall through*/
  case Item_fuc::JSON_EXTRACT_FUNC:
    func_name =  (char*) item_func->func_name(); /* be reset to "func_a", wrong, should be "func_b" */

Although this example is not currently happening because all updates to the func_name between two cases break in advance without reaching the JSON_EXTRACT case. But I think such a solution has potential dangers.

I'm okay with both solutions, you can make the final decision, and I will update the commit accordingly.

@nayuta-yanagisawa
Copy link
Contributor

@yongxin-xu You have a point. At least the current fix seem to be slightly more resistant to changes.

OK, I pushed your patch to https://github.com/MariaDB/server/tree/bb-10.3-mdev-24517. Once the CIs pass, I will merge the fix

@nayuta-yanagisawa nayuta-yanagisawa merged commit 73d32cc into MariaDB:10.3 Jul 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants