Support Grouping for Varchar columns for sharded keyspaces #7707

Closed
GuptaManan100 opened this issue Mar 17, 2021 · 1 comment

Feature Description

While ordering on all column types is supported after #7678, GROUP BY still does not work for varchar columns, because orderedAggregate currently merges rows using the original column rather than its weight_string. For example:

# scatter group by a text column, reuse existing weight_string
"select count(*) k, a, textcol1, b from user group by a, textcol1, b order by k, textcol1"
{
  "QueryType": "SELECT",
  "Original": "select count(*) k, a, textcol1, b from user group by a, textcol1, b order by k, textcol1",
  "Instructions": {
    "OperatorType": "Sort",
    "Variant": "Memory",
    "OrderBy": "0 ASC, 2 ASC",
    "Inputs": [
      {
        "OperatorType": "Aggregate",
        "Variant": "Ordered",
        "Aggregates": "count(0)",
        "Distinct": "false",
        "GroupBy": "1, 4, 3",
        "Inputs": [
          {
            "OperatorType": "Route",
            "Variant": "SelectScatter",
            "Keyspace": {
              "Name": "user",
              "Sharded": true
            },
            "FieldQuery": "select count(*) as k, a, textcol1, b, weight_string(textcol1), weight_string(a), weight_string(b) from `user` where 1 != 1 group by a, textcol1, b",
            "OrderBy": "2 ASC, 1 ASC, 3 ASC",
            "Query": "select count(*) as k, a, textcol1, b, weight_string(textcol1), weight_string(a), weight_string(b) from `user` group by a, textcol1, b order by textcol1 asc, a asc, b asc",
            "Table": "`user`"
          }
        ]
      }
    ]
  }
}

Notice that in the plan output the ordered aggregate groups on columns 1, 4, and 3 instead of 5, 4, and 6 (the weight_string columns). To support this, orderedAggregate will have to store both the original column and its weight_string column.
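
As a rough illustration of the idea, here is a minimal Go sketch of a group-by key that carries both offsets. It is not the actual Vitess implementation: `groupByParam`, `row`, and `sameGroup` are hypothetical names, and the weight bytes are made up to mimic a case-insensitive collation in which "Foo" and "foo" share a weight_string.

```go
package main

import (
	"bytes"
	"fmt"
)

// groupByParam is a hypothetical grouping key: it stores the offset of the
// original column and the offset of its weight_string column (-1 if none).
type groupByParam struct {
	Col   int
	WSCol int
}

// row is a simplified result row; each cell is a raw byte slice.
type row [][]byte

// sameGroup reports whether two rows fall into the same group. It compares
// the weight_string column when one is available and otherwise falls back
// to the original column.
func sameGroup(a, b row, keys []groupByParam) bool {
	for _, k := range keys {
		idx := k.Col
		if k.WSCol >= 0 {
			idx = k.WSCol
		}
		if !bytes.Equal(a[idx], b[idx]) {
			return false
		}
	}
	return true
}

func main() {
	// Column layout follows the plan above:
	// 0 k, 1 a, 2 textcol1, 3 b, 4 ws(textcol1), 5 ws(a), 6 ws(b).
	// The weight bytes are invented for this example.
	r1 := row{[]byte("1"), []byte("x"), []byte("Foo"), []byte("y"), {0x01}, {0x02}, {0x03}}
	r2 := row{[]byte("1"), []byte("x"), []byte("foo"), []byte("y"), {0x01}, {0x02}, {0x03}}

	byWeight := []groupByParam{{Col: 1, WSCol: 5}, {Col: 2, WSCol: 4}, {Col: 3, WSCol: 6}}
	byRaw := []groupByParam{{Col: 1, WSCol: -1}, {Col: 2, WSCol: -1}, {Col: 3, WSCol: -1}}

	fmt.Println(sameGroup(r1, r2, byWeight)) // true: weight strings match
	fmt.Println(sameGroup(r1, r2, byRaw))    // false: "Foo" != "foo" byte-wise
}
```

Keeping the original offset next to the weight_string offset lets the aggregate compare on the weight while still knowing which column holds the value to return for the group.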

Use Case(s)

GROUP BY on varchar columns in sharded keyspaces, for tables that are not authoritative.
