Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

INSERT VALUES fails if key field is not STRING #3347

Closed
big-andy-coates opened this issue Sep 13, 2019 · 4 comments · Fixed by #3381
Closed

INSERT VALUES fails if key field is not STRING #3347

big-andy-coates opened this issue Sep 13, 2019 · 4 comments · Fixed by #3381
Assignees

Comments

@big-andy-coates
Copy link
Contributor

big-andy-coates commented Sep 13, 2019

CREATE TABLE movies (id INT, title VARCHAR, release_year INT) WITH (kafka_topic='movies', key='id', partitions=1, value_format='avro');

CREATE STREAM ratings (id INT, rating DOUBLE) WITH (kafka_topic='ratings', partitions=1, value_format='avro');

INSERT INTO movies (id, title, release_year) VALUES (294, 'Die Hard', 1998);
-- fails with: 
-- Failed to insert values into stream/table: MOVIES
-- Caused by: Invalid Java object for schema type STRING: class java.lang.Integer
--	for field: "ROWKEY"

The issue here is that the id key field specified in the movies WITH clause is an INT.
KSQL sees that the id field is the same as the key and hence tries to set ROWKEY to 294, but ROWKEY is a STRING field, and KSQL blows up...

Until we have primitive keys other than String I think KSQL needs to support converting the value to a STRING to store in ROWKEY.

We need to make sure the conversion is compatible with what KSQL would do if storing, for example, a STRUCT field in the key as part of a GROUP BY clause.

A secondary issue is that even if the user attempts to explicitly supply ROWKEY, or ROWKEY and not Id, it still fails:

INSERT INTO ratings (rowkey, id, title, release_year) VALUES ('294', 294, 'Die Hard', 1998);
-- fails with:
-- Failed to insert values into stream/table: RATINGS
-- Caused by: java.lang.IllegalStateException

NSERT INTO ratings (rowkey, title, release_year) VALUES ('294', 'Die Hard', 1998);
-- fails with:
--  Failed to insert values into stream/table: RATINGS
--  Caused by: java.lang.IllegalStateException
@big-andy-coates
Copy link
Contributor Author

Related to #3282, and maybe #3021

@big-andy-coates
Copy link
Contributor Author

The only available work around at the moment is to drop the key field from the WITH clause and then explicitly supply both Id and ROWKEY:

CREATE TABLE movies (id INT, title VARCHAR, release_year INT) WITH (kafka_topic='movies', partitions=1, value_format='avro');

INSERT INTO movies (rowkey, id, title, release_year) VALUES ('294', 294, 'Die Hard', 1998);

Note, if you don't supply rowkey you get another error (another bug):

INSERT INTO movies (id, title, release_year) VALUES (294, 'Die Hard', 1998);
-- fails with:
-- Failed to insert values into stream/table: MOVIES
-- Caused by: Producer is closed forcefully.

@purplefox
Copy link
Contributor

According to https://docs.confluent.io/current/ksql/docs/developer-guide/syntax-reference.html#ksql-key-requirements

"KEY must be set to a column of type VARCHAR aka STRING."

So the CREATE TABLE movies statement above should be failing as it's not valid to have an id field of type INT?

@big-andy-coates
Copy link
Contributor Author

I believe key fields also work with no-STRING fields. But would be good to check this is the case, especially check joins work as expected.

If they do work, then fix this bug and update the docs. If the don't work, then we should fail the CREATE statement if the keyField is not a STRING, and also change a lot lot stuff internally in too. Let's hope its the former.

purplefox pushed a commit to purplefox/ksql that referenced this issue Sep 18, 2019
…been provided in an insert then make sure rowkey gets the string form of it
purplefox added a commit that referenced this issue Sep 18, 2019
…g into a table with a key (#3381)

* fix:#3347 If the table has an key and a value for it has been provided in an insert then make sure rowkey gets the string form of it

* fix: When ROWKEY and ID are both specified in insert into, then check rowkey is string form of id

* fix: Add a bunch of RQTT tests to test combinations of inserts with id and rowkey when stream has key

* fix:nit

* fix:nit

* fix:nits
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants