[sink jdbc-clickhouse] avro (and possibly others) sink is broken #11457
Comments
I think the problem is that to insert data in JSON format, you have to tell ClickHouse to use the JSONEachRow input format, and the JDBC connector doesn't seem to offer that option. For now I've tried Kafka-on-Pulsar instead, configured the table as described at https://clickhouse.tech/docs/en/engines/table-engines/integrations/kafka/, and consumed that way. But I'd rather not have to use Kafka-on-Pulsar.
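For context, the Kafka-engine approach from those docs lets the table declare its input format explicitly, which is the knob the JDBC connector lacks. A minimal sketch along those lines; the broker address, topic, consumer group, and columns are placeholders, not taken from this thread:

```sql
-- Sketch of the Kafka-engine table from the linked ClickHouse docs.
-- Broker/topic/group names and columns are placeholders; the point is
-- that kafka_format lets you pick JSONEachRow explicitly.
CREATE TABLE iot.pulsar_queue
(
    device_name   String,
    datetime_info String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'localhost:9092',
         kafka_topic_list  = 'pulsar_data',
         kafka_group_name  = 'clickhouse-consumer',
         kafka_format      = 'JSONEachRow';
```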
I'm assuming it has to do with the way schemas are handled, regardless of destination. Here's another issue that makes it seem specific to Golang & SQL.
The issue had no activity for 30 days; marking it with the Stale label.
The PR mentioned in the linked issue deals with this downstream, but I don't know if the underlying issue has been fixed.
I'm seeing a similar issue, but without a schema, on Pulsar 3.1 and the JDBC ClickHouse sink. Here's my config:

```yaml
configs:
  userName: "default"
  password: "default"
  jdbcUrl: "jdbc:clickhouse://localhost:8123/iot"
  tableName: "pulsar_data"
  useTransactions: "false"
  nonKey: "device_name, data_content, device_properties, datetime_info, topicName"
```
I managed to set up the following Avro schema, and it's still not able to render data; everything is null:

```json
{
  "name": "clickhouse-schema",
  "type": "AVRO",
  "schema": "{\"type\": \"record\", \"namespace\": \"iot\",\"name\":\"exampleAvro\",\"fields\":[{\"name\": \"device_name\",\"type\": \"string\", \"default\": \"\"},{\"name\": \"data_content\", \"type\":{\"type\":\"map\", \"values\": \"string\", \"default\": {}}},{\"name\": \"device_properties\", \"type\":{ \"type\": \"map\", \"values\": \"string\", \"default\": {}}},{\"name\":\"datetime_info\", \"type\": \"string\", \"default\": \"\"}, {\"name\": \"topicName\", \"type\": \"string\", \"default\": \"\"}]}",
  "properties": {
    "__jsr310ConversionEnabled": "false"
  }
}
```

Running the sink produces this error:

```
2023-11-10T10:56:24,865-0500 [pool-3-thread-1] ERROR org.apache.pulsar.io.jdbc.JdbcAbstractSink - Got exception Cannot set null to non-nullable column #1 [datetime_info String] after 0 ms, failing 200 messages
java.sql.SQLException: Cannot set null to non-nullable column #1 [datetime_info String]
java.sql.SQLException: Cannot set null to non-nullable column #1 [datetime_info String]
at com.clickhouse.jdbc.SqlExceptionUtils.clientError(SqlExceptionUtils.java:73) ~[clickhouse-jdbc-0.4.6-all.jar:clickhouse-jdbc 0.4.6 (revision: dd91e17)]
at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.addBatch(InputBasedPreparedStatement.java:340) ~[clickhouse-jdbc-0.4.6-all.jar:clickhouse-jdbc 0.4.6 (revision: dd91e17)]
at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.executeAny(InputBasedPreparedStatement.java:113) ~[clickhouse-jdbc-0.4.6-all.jar:clickhouse-jdbc 0.4.6 (revision: dd91e17)]
at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.execute(InputBasedPreparedStatement.java:312) ~[clickhouse-jdbc-0.4.6-all.jar:clickhouse-jdbc 0.4.6 (revision: dd91e17)]
at org.apache.pulsar.io.jdbc.JdbcAbstractSink.flush(JdbcAbstractSink.java:289) ~[pulsar-io-jdbc-core-3.1.1.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
at java.lang.Thread.run(Unknown Source) ~[?:?]
```

Here's my ClickHouse table definition:

```sql
SET allow_experimental_object_type = 1;

CREATE TABLE mqtt.pulsar_data
(
    datetime_info     String,
    topicName         String,
    data_content      JSON,
    device_properties JSON,
    device_name       String
)
ENGINE = MergeTree
ORDER BY (topicName, datetime_info);
```
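One way to at least stop the batch failures above, sketched here as a hypothetical workaround rather than a fix for the sink: make the String columns nullable so the driver can bind NULLs. The table name is made up for illustration, and the ORDER BY is relaxed because nullable columns aren't allowed in the sorting key by default; this masks the all-null symptom instead of solving it.

```sql
-- Hypothetical workaround (not from the thread): accept NULLs so the
-- batch isn't rejected with "Cannot set null to non-nullable column".
SET allow_experimental_object_type = 1;

CREATE TABLE mqtt.pulsar_data_nullable
(
    datetime_info     Nullable(String),
    topicName         Nullable(String),
    data_content      JSON,
    device_properties JSON,
    device_name       Nullable(String)
)
ENGINE = MergeTree
-- Nullable columns can't appear in the sorting key by default,
-- so sort by nothing here.
ORDER BY tuple();
```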
Describe the bug
When publishing & consuming messages, Avro & JSON schema encoding work as expected. After creating a simple ClickHouse sink (using the config from the docs), the messages get written to the table, but every value is null, making it useless. Downgrading to 2.7.2 and re-running the test, it all works as expected. No Avro, DB schema, or sink config changes are needed when downgrading.

To Reproduce
Steps to reproduce the behavior:
Expected behavior
I expect the sink to correctly populate my fields.
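A minimal way to verify that behavior against the table from the report above (just a sanity query, nothing Pulsar-specific):

```sql
-- If the sink works, this should show real message values, not NULLs.
SELECT device_name, datetime_info, topicName
FROM mqtt.pulsar_data
LIMIT 10;
```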