From 1f414f03677c45e684ba376dedd726acacac5964 Mon Sep 17 00:00:00 2001
From: Saketh Kurnool
Date: Fri, 17 Apr 2026 17:46:58 -0700
Subject: [PATCH 1/3] add avro -> KC type mapping and unsupported avro schemas

---
 .../kafka/kafka-clickhouse-connect-sink.md | 108 ++++++++++++++----
 1 file changed, 87 insertions(+), 21 deletions(-)

diff --git a/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md b/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
index fd5c9dfc013..ced5cf6243a 100644
--- a/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
+++ b/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
@@ -138,27 +138,27 @@ Sink, use [Kafka Connect Transformations](https://docs.confluent.io/platform/cur
 
 **With a schema declared:**
 
-| Kafka Connect Type | ClickHouse Type | Supported | Primitive |
-| --------------------------------------- |-----------------------| --------- | --------- |
-| STRING | String | ✅ | Yes |
-| STRING | JSON. See below (1) | ✅ | Yes |
-| INT8 | Int8 | ✅ | Yes |
-| INT16 | Int16 | ✅ | Yes |
-| INT32 | Int32 | ✅ | Yes |
-| INT64 | Int64 | ✅ | Yes |
-| FLOAT32 | Float32 | ✅ | Yes |
-| FLOAT64 | Float64 | ✅ | Yes |
-| BOOLEAN | Boolean | ✅ | Yes |
-| ARRAY | Array(T) | ✅ | No |
-| MAP | Map(Primitive, T) | ✅ | No |
-| STRUCT | Variant(T1, T2, ...) | ✅ | No |
-| STRUCT | Tuple(a T1, b T2, ...) | ✅ | No |
-| STRUCT | Nested(a T1, b T2, ...) | ✅ | No |
-| STRUCT | JSON. See below (1), (2) | ✅ | No |
-| BYTES | String | ✅ | No |
-| org.apache.kafka.connect.data.Time | Int64 / DateTime64 | ✅ | No |
-| org.apache.kafka.connect.data.Timestamp | Int32 / Date32 | ✅ | No |
-| org.apache.kafka.connect.data.Decimal | Decimal | ✅ | No |
+| Kafka Connect Type | ClickHouse Type | Supported | Primitive |
+|-----------------------------------------|--------------------------|-----------|-----------|
+| STRING | String | ✅ | Yes |
+| STRING | JSON. See below (1) | ✅ | Yes |
+| INT8 | Int8 | ✅ | Yes |
+| INT16 | Int16 | ✅ | Yes |
+| INT32 | Int32 | ✅ | Yes |
+| INT64 | Int64 | ✅ | Yes |
+| FLOAT32 | Float32 | ✅ | Yes |
+| FLOAT64 | Float64 | ✅ | Yes |
+| BOOLEAN | Boolean | ✅ | Yes |
+| ARRAY | Array(T) | ✅ | No |
+| MAP | Map(Primitive, T) | ✅ | No |
+| STRUCT | Variant(T1, T2, ...) | ✅ | No |
+| STRUCT | Tuple(a T1, b T2, ...) | ✅ | No |
+| STRUCT | Nested(a T1, b T2, ...) | ✅ | No |
+| STRUCT | JSON. See below (1), (2) | ✅ | No |
+| BYTES | String | ✅ | No |
+| org.apache.kafka.connect.data.Time | Int64 / DateTime64 | ✅ | No |
+| org.apache.kafka.connect.data.Timestamp | Int32 / Date32 | ✅ | No |
+| org.apache.kafka.connect.data.Decimal | Decimal | ✅ | No |
 
 - (1) - JSON is supported only when the ClickHouse setting `input_format_binary_read_json_as_string=1` is enabled. This works only for the RowBinary format family, and because the setting affects all columns in the insert request, they should all be strings. The connector will convert STRUCT to a JSON string in this case.
 
@@ -249,6 +249,72 @@ The connector can consume data from multiple topics
 }
 ```
 
+###### Type mapping {#avro-type-mapping}
+✅: Supported
+
+❌: Not supported
+
+⚠️: Partially supported
+
+| Avro Type | Kafka Connect Type | Supported | Notes |
+|-----------|--------------------|-----------|----------------------------------------|
+| null | _N/A_ | ❌ | Not supported as a standalone type, but can be used in unions |
+| boolean | BOOLEAN | ✅ | |
+| int | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has the property `connect.type=int8` (analogously, INT16 if `connect.type=int16`) |
+| long | INT64 | ✅ | |
+| float | FLOAT32 | ✅ | |
+| double | FLOAT64 | ✅ | |
+| bytes | BYTES | ✅ | |
+| string | STRING | ✅ | |
+| record | STRUCT | ✅ | |
+| enum | STRING | ✅ | |
+| array | ARRAY/MAP | ✅ | Defaults to ARRAY. Resolves to MAP if the field was originally constructed via `AvroData.fromConnectSchema` ([source](https://github.com/confluentinc/schema-registry/blob/174907bfc0d9424e8d02e788f450f4afcdda1750/avro-data/src/main/java/io/confluent/connect/avro/AvroData.java#L943)) |
+| map | MAP | ✅ | |
+| union | STRUCT/`T` | ⚠️ | Defaults to STRUCT. Resolves to the singleton type `T` in the union definition if the AvroDataConfig property `flatten.singleton.unions=true` is set |
+| fixed | BYTES | ⚠️ | Fixed `decimal` logical type is not supported (see below) |
+
+Refer to [Supported data types](#supported-data-types) for the mapping between Kafka Connect types and ClickHouse types.
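+
+For illustration, an Avro `int` field annotated with the `connect.type` property resolves to the narrower Kafka Connect integer type described in the table above (the field name here is hypothetical, not taken from the connector docs):
+
+```json
+{"name": "small_value", "type": {"type": "int", "connect.type": "int8"}}
+```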
+
+###### Unsupported data types {#unsupported-avro-types}
+
+Currently, the following Avro data types are unsupported by the connector - support for them is forthcoming:
+- fixed `decimal` logical type
+```json
+{"name": "decimal_18_4", "type": "fixed", "size": 8, "logicalType": "decimal", "precision": 18, "scale": 4}
+```
+- nullable unions
+```json
+{"name": "mixed_union", "type": ["null", "string", "int"], "default": null}
+```
+- record unions
+```json
+{
+  "name": "record_union",
+  "type": [
+    {
+      "type": "record",
+      "name": "TypeA",
+      "fields": [
+        {
+          "name": "label",
+          "type": "string"
+        }
+      ]
+    },
+    {
+      "type": "record",
+      "name": "TypeB",
+      "fields": [
+        {
+          "name": "count",
+          "type": "int"
+        }
+      ]
+    }
+  ]
+}
+```
+
 ##### Protobuf schema support {#protobuf-schema-support}
 
 ```json

From 57885629b9717b62dc4b14b0d949ecf1949d39ba Mon Sep 17 00:00:00 2001
From: Saketh Kurnool
Date: Fri, 24 Apr 2026 16:01:59 +0900
Subject: [PATCH 2/3] change unsupported data types to "unsupported schemas"

---
 .../data-ingestion/kafka/kafka-clickhouse-connect-sink.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md b/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
index ced5cf6243a..53fb62d4ac4 100644
--- a/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
+++ b/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
@@ -275,9 +275,9 @@ The connector can consume data from multiple topics
 
 Refer to [Supported data types](#supported-data-types) for the mapping between Kafka Connect types and ClickHouse types.
 
-###### Unsupported data types {#unsupported-avro-types}
+###### Unsupported schemas {#unsupported-avro-schemas}
 
-Currently, the following Avro data types are unsupported by the connector - support for them is forthcoming:
+The following Avro schemas are unsupported by the connector:
 - fixed `decimal` logical type
 ```json
 {"name": "decimal_18_4", "type": "fixed", "size": 8, "logicalType": "decimal", "precision": 18, "scale": 4}
 ```

From f7c786a2f382bfa9f95be65d4c8dc222e73e788a Mon Sep 17 00:00:00 2001
From: Saketh Kurnool
Date: Tue, 5 May 2026 16:10:26 -0700
Subject: [PATCH 3/3] sergey review comments

---
 .../data-ingestion/kafka/kafka-clickhouse-connect-sink.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md b/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
index 53fb62d4ac4..0174160c664 100644
--- a/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
+++ b/docs/integrations/data-ingestion/kafka/kafka-clickhouse-connect-sink.md
@@ -249,7 +249,9 @@ The connector can consume data from multiple topics
 }
 ```
 
-###### Type mapping {#avro-type-mapping}
+###### Avro type mapping {#avro-type-mapping}
+The type mapping below is defined by `io.confluent.connect.avro.AvroConverter`, the official Avro serializer/deserializer implementation in Kafka Connect. See the Kafka Connect [docs](https://docs.confluent.io/platform/current/connect/userguide.html#avro) for details on the conversion logic.
+
 ✅: Supported
 
 ❌: Not supported
 
 | enum | STRING | ✅ | |
 | array | ARRAY/MAP | ✅ | Defaults to ARRAY. Resolves to MAP if the field was originally constructed via `AvroData.fromConnectSchema` ([source](https://github.com/confluentinc/schema-registry/blob/174907bfc0d9424e8d02e788f450f4afcdda1750/avro-data/src/main/java/io/confluent/connect/avro/AvroData.java#L943)) |
 | map | MAP | ✅ | |
-| union | STRUCT/`T` | ⚠️ | Defaults to STRUCT. Resolves to the singleton type `T` in the union definition if the AvroDataConfig property `flatten.singleton.unions=true` is set |
+| union | STRUCT/`T` | ⚠️ | Defaults to STRUCT. Resolves to the singleton type `T` in the union definition if `flatten.singleton.unions=true` (see [docs](https://docs.confluent.io/cloud/current/connectors/reference/connector-configuration.html#value-converter-flatten-singleton-unions)) |
 | fixed | BYTES | ⚠️ | Fixed `decimal` logical type is not supported (see below) |
 
 Refer to [Supported data types](#supported-data-types) for the mapping between Kafka Connect types and ClickHouse types.
 
-###### Unsupported schemas {#unsupported-avro-schemas}
+###### Unsupported Avro schemas {#unsupported-avro-schemas}
 
 The following Avro schemas are unsupported by the connector:
 - fixed `decimal` logical type