Primitive keys: Support INT, BIGINT, DOUBLE and STRING in PARTITION BY #4092

Closed
big-andy-coates opened this issue Dec 10, 2019 · 0 comments · Fixed by #4098

No description provided.

big-andy-coates added a commit to big-andy-coates/ksql that referenced this issue Dec 10, 2019
Fixes: confluentinc#4092

WIP: This commit gets `PARTITION BY` clauses working with primitive key types. However, it does disable a couple of joins until confluentinc#4094 has been completed.

BREAKING CHANGE: A `PARTITION BY` now changes the SQL type of `ROWKEY` in the output schema of a query.

For example, consider:

```sql
CREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...);
CREATE STREAM OUTPUT AS SELECT ROWKEY AS NAME FROM INPUT PARTITION BY ID;
```

Previously, the above would have resulted in an output schema of `ROWKEY STRING KEY, NAME STRING`, where `ROWKEY` would have stored the string representation of the integer from the `ID` column.  With this commit the output schema will be `ROWKEY INT KEY, NAME STRING`.
@big-andy-coates big-andy-coates added this to the 0.7.0 milestone Dec 10, 2019
big-andy-coates added a commit that referenced this issue Dec 10, 2019
* chore: partition-by primitive key support

Fixes: #4092

WIP: This commit gets `PARTITION BY` clauses working with primitive key types. However, it does disable a couple of joins until #4094 has been completed.

BREAKING CHANGE: A `PARTITION BY` now changes the SQL type of `ROWKEY` in the output schema of a query.

For example, consider:

```sql
CREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...);
CREATE STREAM OUTPUT AS SELECT ROWKEY AS NAME FROM INPUT PARTITION BY ID;
```

Previously, the above would have resulted in an output schema of `ROWKEY STRING KEY, NAME STRING`, where `ROWKEY` would have stored the string representation of the integer from the `ID` column.  With this commit the output schema will be `ROWKEY INT KEY, NAME STRING`.
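
For anyone who relied on the old `STRING` key, the following is a minimal sketch of one way to keep it, not something taken from the commit itself. It assumes `CAST(ID AS STRING)` produces the same text the old behaviour stored in `ROWKEY`, and it uses two hypothetical names, `INPUT_WITH_STR_ID` and `ID_STR`. The `WITH (...)` clause is left elided exactly as in the example above.

```sql
-- Source stream, as in the commit message (WITH clause elided in the original).
CREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...);

-- Step 1 (hypothetical intermediate stream): materialise the string form of ID
-- as an ordinary value column.
CREATE STREAM INPUT_WITH_STR_ID AS
  SELECT ROWKEY AS NAME, CAST(ID AS STRING) AS ID_STR
  FROM INPUT;

-- Step 2: partition by the STRING column, so the new rule (key type follows the
-- PARTITION BY column type) should yield ROWKEY STRING KEY, NAME STRING.
CREATE STREAM OUTPUT AS
  SELECT NAME
  FROM INPUT_WITH_STR_ID
  PARTITION BY ID_STR;
```

The intermediate stream is used so the sketch only ever partitions by a plain column; whether `PARTITION BY` accepts an expression directly may depend on the version in use.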
big-andy-coates added a commit to big-andy-coates/ksql that referenced this issue Dec 10, 2019
Fixes: confluentinc#4092

This commit gets `GROUP BY` clauses working with primitive key types.

BREAKING CHANGE: A `GROUP BY` on single expressions now changes the SQL type of `ROWKEY` in the output schema of the query to match the SQL type of the expression.

 For example, consider:

 ```sql
 CREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...);
 CREATE TABLE OUTPUT AS SELECT COUNT(*) AS COUNT FROM INPUT GROUP BY ID;
 ```

 Previously, the above would have resulted in an output schema of `ROWKEY STRING KEY, COUNT BIGINT`, where `ROWKEY` would have stored the string representation of the integer from the `ID` column.

 With this commit the output schema will be `ROWKEY INT KEY, COUNT BIGINT`.

 BREAKING CHANGE: Any `GROUP BY` expression that resolves to `NULL`, including because a UDF throws an exception, now results in the row being excluded from the result. Previously, as the key was a `STRING`, a value of `"null"` could be used. With other primitive types this is not possible. As key columns must be non-null, any exception is logged and the row is excluded.
big-andy-coates added a commit that referenced this issue Dec 12, 2019
* chore: group-by primitive key support

Fixes: #4092

This commit gets `GROUP BY` clauses working with primitive key types.

BREAKING CHANGE: A `GROUP BY` on single expressions now changes the SQL type of `ROWKEY` in the output schema of the query to match the SQL type of the expression.

 For example, consider:

 ```sql
 CREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...);
 CREATE TABLE OUTPUT AS SELECT COUNT(*) AS COUNT FROM INPUT GROUP BY ID;
 ```

 Previously, the above would have resulted in an output schema of `ROWKEY STRING KEY, COUNT BIGINT`, where `ROWKEY` would have stored the string representation of the integer from the `ID` column.

 With this commit the output schema will be `ROWKEY INT KEY, COUNT BIGINT`.

BREAKING CHANGE: Any `GROUP BY` expression that resolves to `NULL`, including because a UDF throws an exception, now results in the row being excluded from the result. Previously, as the key was a `STRING`, a value of `"null"` could be used. With other primitive types this is not possible. As key columns must be non-null, any exception is logged and the row is excluded.
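
To illustrate the two `GROUP BY` breaking changes, here is a hedged sketch of how a query might be adapted. It is not part of the commit. The table names `OUTPUT_STR_KEY` and `OUTPUT_WITH_NULLS` are hypothetical, and the sketch assumes that grouping by a `STRING`-typed expression yields a `STRING` key, as the rule above (key type matches the expression type) suggests. The `WITH (...)` clause stays elided as in the original example.

```sql
-- Source stream, as in the commit message (WITH clause elided in the original).
CREATE STREAM INPUT (ROWKEY STRING KEY, ID INT) WITH (...);

-- Option A (hypothetical table name): keep a STRING key by grouping on a
-- STRING-typed expression instead of the INT column.
-- Rows where ID is NULL are still excluded, since the cast of NULL is NULL.
CREATE TABLE OUTPUT_STR_KEY AS
  SELECT COUNT(*) AS COUNT
  FROM INPUT
  GROUP BY CAST(ID AS STRING);

-- Option B (hypothetical table name): keep a bucket for NULL IDs by mapping
-- them to an explicit, non-null placeholder key before grouping.
CREATE TABLE OUTPUT_WITH_NULLS AS
  SELECT COUNT(*) AS COUNT
  FROM INPUT
  GROUP BY CASE WHEN ID IS NULL THEN 'null' ELSE CAST(ID AS STRING) END;
```

Option B only approximates the previous behaviour: the placeholder `'null'` is chosen here to mirror the old `"null"` key mentioned above, and it does not cover keys that become `NULL` because a UDF threw an exception.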
@big-andy-coates big-andy-coates self-assigned this Jan 6, 2020