Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISTINCT(X) is considered a function by vtgate #5767

Closed
aquarapid opened this issue Jan 26, 2020 · 3 comments
Closed

DISTINCT(X) is considered a function by vtgate #5767

aquarapid opened this issue Jan 26, 2020 · 3 comments

Comments

@aquarapid
Copy link
Contributor

aquarapid commented Jan 26, 2020

Schema and vSchema:

$ cat schema.sql 
create table foo( c1 bigint not null, PRIMARY KEY (c1)) engine=innodb;

$ cat vschema.json 
{
    "ks1": {
        "sharded": true,
        "vindexes": {
            "hash": {
                "type": "hash"
            }
        },
        "tables": {
            "foo": {
                "columnVindexes": [
                    {
                        "column": "c1",
                        "name": "hash"
                    }
                ]
            }
        }
    }
}

Now:

$ vtexplain -schema-file schema.sql -vschema-file vschema.json -shards 2 -sql 'select distinct(c1) from foo;'
ERROR: vtexplain execute error in 'select distinct(c1) from foo': generating order by clause: cannot reference a complex expression

Note the distinct with parentheses; which is valid synonym for distinct without parentheses. Without them, the query works as expected:

$ vtexplain -schema-file schema.sql -vschema-file vschema.json -shards 2 -sql 'select distinct c1 from foo;'
----------------------------------------------------------------------
select distinct c1 from foo

1 ks1/-80: select distinct c1 from foo limit 10001
1 ks1/80-: select distinct c1 from foo limit 10001

----------------------------------------------------------------------

The same happens when using DISTINCTROW instead of DISTINCT too, of course.

The parser doesn't recognize that they are equivalent, and thinks DISTINCT with parentheses is a SQL function to be pushed down, bypassing the special planbuilder handling for DISTINCT.

@sougou
Copy link
Contributor

sougou commented Jan 31, 2020

This is unfortunate:

mysql> select * from a;
+----+------+
| id | val  |
+----+------+
|  1 |    1 |
|  2 |    1 |
+----+------+
2 rows in set (0.00 sec)

mysql> select distinct(val) from a;
+------+
| val  |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

mysql> select distinct(val), id from a;
+------+----+
| val  | id |
+------+----+
|    1 |  1 |
|    1 |  2 |
+------+----+
2 rows in set (0.00 sec)

mysql> select distinct val, id from a;
+------+----+
| val  | id |
+------+----+
|    1 |  1 |
|    1 |  2 |
+------+----+
2 rows in set (0.00 sec)

mysql> select distinct (val, id) from a;
ERROR 1241 (21000): Operand should contain 1 column(s)

My interpretation for distinct as a function is:

  • It takes only one argument.
  • If there are other expressions in the select, then it's a no-op.
  • If it's the only argument in the select, and it's the distinct function, then it's equivalent to a regular distinct.

@systay: we should amend the parser to convert these to the normalized form.

@systay
Copy link
Collaborator

systay commented Jan 31, 2020

My understanding is that something different is happening.

SELECT (c1) from foo

is a valid query, right? It's just the value from column c1 with parens around it - a perfectly normal expression (albeit complex). Adding DISTINCT before makes it look like a function call, but it's not. You can double check this by trying something like:

SELECT c2, distinct(c1) from foo

This is not accepted - DISTINCT must come before the first expression in the SELECT clause.

We could (and maybe will) do more expression rewriting and remove unnecessary parens like this one, but we don't at the moment, and that why this doesn't work.

@morgo morgo added this to To do in MySQL Compatibility via automation Feb 19, 2020
@systay
Copy link
Collaborator

systay commented Mar 9, 2020

This should be solved by #5906

MySQL Compatibility automation moved this from To do to Done Mar 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

No branches or pull requests

4 participants