UDFs that return non-optional data types silently fail in persistent queries #5364
Comments
Hey @MichaelDrogalis! I looked into this with current master.
and then created stream
and inserted a row into it
Then, I created stream
And printed the output topic of the CSAS
which returned nothing. There was also no error or exception in the logs. But, then I inserted another row into stream
and that caused the exception in the logs
However, no error is returned to the CLI. Moreover, if you try
again, there is no error in the CLI. |
I am facing a similar issue while trying to write data to a stream. code.txt |
The strategic fix here is to move away from using the Connect types. This is certainly something we need to do before going v1.0. There may be a quicker fix/hack we can put in to work around this issue for now. Or maybe we should just bite the bullet and fix the source of the issue. Somewhat related to #3624 |
Strategic fix to be covered under #3413. This ticket will be used for the short-term (i.e. 3 years) fix. |
Had a look at this today. Current state of affairs (on 0.11.0) is as follows:
In terms of what we can do to make this better:
@MichaelDrogalis do you have thoughts on what sort of fix we'd like to pursue? I don't have a sense of how prevalent this particular issue is and what the priority is as a result. Also note that there is a workaround for this in ksqlDB today: specify optional return types in the Struct, rather than non-optional return types (e.g., |
Thanks for the breakdown @vcrfxia. It seems like if we want to fix this, we just need to bite the bullet and do it properly. We can kick this out a little further, but we should probably queue it up soon. The issue with the workaround is that it's almost impossible to discover. All you get is a hanging query with no feedback, so there's no way for you to know what the problem is, much less what to do about it. |
Needing to check the processing log to discover processing errors is common across many different types of errors in ksqlDB: inability to deserialize source records, various types of production errors (e.g., record size too large), etc. In my mind, if a query isn't producing the expected results, the processing log is a natural first place to look, so I'm surprised by the statement that this is "almost impossible to discover." Are you suggesting we should think about replacing the processing log as a first step in debugging? I agree, though, that figuring out what to do about the error (post-discovery) is pretty rough, unless we add the additional hack of checking for non-optional schema types upon hitting schema incompatibility, in order to improve the error message. |
My comment extends from the fact that this error wasn't showing up in the processing log. :) If that's fixed, then we're in a better place. |
Opened a PR to add serialization exceptions to the processing log, and throw a custom error message calling out the possibility of serialization exceptions (for struct fields) being caused by non-optional schemas from UDFs: #6084 The long-term fix of switching away from Connect types to remove the possibility of encountering this error will be tracked in #4961 instead. |
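For readers following the thread, the idea of "checking for non-optional schema types" to improve the error message can be sketched roughly as below. This is only an illustrative, self-contained model and not the code from the PR: `SimpleSchema`, `nonOptionalPaths`, and `describe` are hypothetical names, and `SimpleSchema` is a simplified stand-in for a Connect schema (a type name, an optional flag, and named fields for structs). The sketch walks a struct schema and reports every nested field whose schema is non-optional, so the error can point at the likely cause.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NonOptionalFieldCheck {

    /** Hypothetical, simplified stand-in for a Connect schema (not the real API). */
    public static final class SimpleSchema {
        public final String type;
        public final boolean optional;
        // Insertion-ordered so reported paths follow field declaration order.
        public final Map<String, SimpleSchema> fields = new LinkedHashMap<>();

        public SimpleSchema(String type, boolean optional) {
            this.type = type;
            this.optional = optional;
        }

        /** Adds a named field and returns this schema so calls can be chained. */
        public SimpleSchema field(String name, SimpleSchema schema) {
            fields.put(name, schema);
            return this;
        }
    }

    /** Returns "path:TYPE" for every non-optional field nested anywhere under the schema. */
    public static List<String> nonOptionalPaths(SimpleSchema schema) {
        List<String> out = new ArrayList<>();
        collect("", schema, out);
        return out;
    }

    private static void collect(String prefix, SimpleSchema schema, List<String> out) {
        for (Map.Entry<String, SimpleSchema> entry : schema.fields.entrySet()) {
            String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            SimpleSchema child = entry.getValue();
            if (!child.optional) {
                out.add(path + ":" + child.type);
            }
            // Recurse so that fields of nested structs are checked as well.
            collect(path, child, out);
        }
    }

    /** Builds an error message that calls out non-optional fields when any are found. */
    public static String describe(SimpleSchema schema) {
        List<String> paths = nonOptionalPaths(schema);
        if (paths.isEmpty()) {
            return "Serialization failed.";
        }
        return "Serialization failed. Non-optional fields (possibly from a UDF return schema): "
            + String.join(", ", paths);
    }

    public static void main(String[] args) {
        // Mirrors the shape of the bug report: one non-optional INT32 field in a struct.
        SimpleSchema struct = new SimpleSchema("STRUCT", true)
            .field("A", new SimpleSchema("INT32", false))
            .field("B", new SimpleSchema("INT32", true));
        System.out.println(describe(struct));
    }
}
```

A real implementation would inspect the actual Connect schema objects instead of this model. The point of the sketch is only that the check can be a cheap recursive walk run after a serialization failure, not on every record.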
Describe the bug
When a custom UDTF returns an Avro schema whose data types are non-optional, push queries are able to work with the results, but persistent queries silently swallow the output.
To Reproduce
Confluent Platform 5.5.0, ksqlDB 0.9.0.
docker-compose.yml for the components:
pom.xml for the UDTF:
Create the file src/main/java/io/confluent/developer/TestUdf.java. Note the Schema.INT32_SCHEMA, which should be Schema.OPTIONAL_INT32_SCHEMA to work with ksqlDB:
Build the jar:
Start Docker Compose:
Start the CLI:
And run:
Now, use the UDTF in a push query:
Which yields:
Now turn that statement into a persistent query:
And query:
This one hangs and returns nothing.
Expected behavior
The last select should either return the same data as the push query did, or both should emit some kind of error about bad data types. Nothing is visible in the logs or processing log.