
Add Protobuf -> Kafka Connect type mapping and unsupported proto schemas #6115

Merged
kurnoolsaketh merged 7 commits into main from issue-634
May 11, 2026

Contributor

@kurnoolsaketh kurnoolsaketh commented Apr 24, 2026

Summary

Add type mappings between Protobuf and Kafka Connect to improve users' visibility into how proto types eventually resolve to ClickHouse data types. Also, add the current (evolving) list of unsupported proto schemas based on the findings in ClickHouse/clickhouse-kafka-connect#735 (issue: ClickHouse/clickhouse-kafka-connect#634)

closes: ClickHouse/clickhouse-kafka-connect#634

Collaborator

@dhtclk dhtclk left a comment

Looks good, just a misspelling of "analogously" that needs to be corrected before merging.

|-----------------------------------------|-----------------------------------------|-----------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| double | FLOAT64 | ✅ | |
| float | FLOAT32 | ✅ | |
| int32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
Collaborator

Suggested change
| int32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
| int32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analogously for INT16 if `connect.type=int16`) |

| double | FLOAT64 | ✅ | |
| float | FLOAT32 | ✅ | |
| int32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
| sint32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
Collaborator

Suggested change
| sint32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
| sint32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analogously for INT16 if `connect.type=int16`) |

| float | FLOAT32 | ✅ | |
| int32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
| sint32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
| sfixed32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
Collaborator

Suggested change
| sfixed32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analagously for INT16 if `connect.type=int16`) |
| sfixed32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analogously for INT16 if `connect.type=int16`) |
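
For readers following the thread, the rule stated in the `int32`, `sint32`, and `sfixed32` rows can be modeled in a few lines. This is only an illustrative sketch of the table's wording, not the converter's actual code; the function name `resolve_connect_type` and the set `INT32_FAMILY` are made up for this example.

```python
# Illustrative model of the table rows under review; not the converter's code.
# INT32_FAMILY and resolve_connect_type are hypothetical names.
INT32_FAMILY = {"int32", "sint32", "sfixed32"}


def resolve_connect_type(proto_type, connect_type_option=None):
    """Return the Kafka Connect type name for a 32-bit signed proto type."""
    if proto_type not in INT32_FAMILY:
        raise ValueError(f"not covered by this sketch: {proto_type}")
    # The schema option connect.type narrows the default INT32.
    narrowed = {"int8": "INT8", "int16": "INT16"}
    return narrowed.get(connect_type_option, "INT32")


print(resolve_connect_type("int32"))              # INT32
print(resolve_connect_type("sint32", "int8"))     # INT8
print(resolve_connect_type("sfixed32", "int16"))  # INT16
```

Any other value of the option falls back to INT32 in this sketch, which is an assumption; the table only describes the `int8` and `int16` cases.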

BentsiLeviav added the "Don't Merge" (Don't merge yet) label on May 5, 2026

Please note: if you encounter issues with missing classes, not every environment comes with the protobuf converter and you may need an alternate release of the jar bundled with dependencies.

###### Type mapping {#proto-type-mapping}
Contributor

This heading should say what the type mapping is from.

I suggest a Header 2 titled "Type Mapping",
then a Header 3 titled "Protobuf Type Mapping".

Then we will add a similar table for Avro.

Contributor Author

@kurnoolsaketh kurnoolsaketh May 5, 2026

imo, it's clear this refers to protobuf since this is a subheading of Protobuf schema support. But I can make this Protobuf type mapping for more clarity 👍 .

> Then we will add a similar table for Avro.

we should have the same heading/sub-heading pattern for both avro and protobuf (see https://github.com/ClickHouse/clickhouse-docs/pull/6081/changes#diff-cfd6efb64f516235ee2ecb43e9da90a4a4f49b69cd47dbfe06c9e1586fb606bdR252). Since Avro and Protobuf are separate sections already, imo we don't need to make a new header 2 called Type Mapping. Instead, we can just make the sub-heading names more explicit (Protobuf type mapping or Avro type mapping). WDYT?

❌: Not supported

⚠️: Partially supported

Contributor

Here we need to highlight that this mapping is valid when io.confluent.connect.protobuf.ProtobufConverter is used. This is important because other converters may be in use, and they map types differently.
Additionally, we need a link to https://docs.confluent.io/platform/current/connect/userguide.html#json-schema-and-protobuf because it explains some of the conversion.

Contributor Author

thanks, good callout. I have added this
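
To make the converter dependency in this thread concrete, here is a minimal sketch of a sink connector configuration that selects `io.confluent.connect.protobuf.ProtobufConverter` as the value converter. The connector name, topic, and Schema Registry URL are placeholders, the helper `protobuf_sink_config` is hypothetical, and the ClickHouse connection settings a real deployment needs are left out.

```python
import json


def protobuf_sink_config(name, topics, schema_registry_url):
    """Build a Kafka Connect payload that uses the Confluent Protobuf converter.

    Sketch only: ClickHouse connection settings are intentionally omitted.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "com.clickhouse.kafka.connect.ClickHouseSinkConnector",
            "topics": topics,
            # The type mapping discussed in this PR applies to this converter.
            "value.converter": "io.confluent.connect.protobuf.ProtobufConverter",
            "value.converter.schema.registry.url": schema_registry_url,
        },
    }


# Placeholder values, assumed for illustration.
payload = protobuf_sink_config("clickhouse-proto-sink", "events", "http://localhost:8081")
print(json.dumps(payload, indent=2))
```

With a different `value.converter`, the Protobuf -> Kafka Connect mapping in the table may not hold, which is the reviewer's point.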

| bool | BOOLEAN | ✅ | |
| string | STRING | ✅ | |
| bytes | BYTES | ✅ | |
| enum | INT32/STRING | ✅ | Defaults to STRING. Resolves to INT32 if `int.for.enums=true` (see [ProtobufDataConfig.java](https://github.com/confluentinc/schema-registry/blob/22ced2df8e61586f89a1c88034e18fdada3cea9c/protobuf-converter/src/main/java/io/confluent/connect/protobuf/ProtobufDataConfig.java#L38)) |
Contributor

is there some better documentation?

Contributor Author

yes, added it 👍
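
As a rough illustration of the `enum` row, the sketch below models how `int.for.enums` switches the result between STRING and INT32. It assumes the option is set under the usual `value.converter.` prefix in the connector configuration; the helper names `converter_settings` and `resolve_enum_type` are hypothetical and this is not the converter's implementation.

```python
PREFIX = "value.converter."


def converter_settings(connector_config, prefix=PREFIX):
    """Keep only the keys under the converter prefix, with the prefix stripped."""
    return {
        key[len(prefix):]: value
        for key, value in connector_config.items()
        if key.startswith(prefix)
    }


def resolve_enum_type(settings):
    """Model of the enum row: STRING by default, INT32 if int.for.enums=true."""
    flag = str(settings.get("int.for.enums", "false")).strip().lower()
    return "INT32" if flag == "true" else "STRING"


config = {
    "value.converter": "io.confluent.connect.protobuf.ProtobufConverter",
    "value.converter.int.for.enums": "true",
}
print(resolve_enum_type(converter_settings(config)))  # INT32
print(resolve_enum_type({}))                          # STRING
```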

Refer to [Supported data types](#supported-data-types) for the mapping between Kafka Connect types and ClickHouse types.

###### Note on translating `oneof` fields to ClickHouse columns {#oneof-translation}
The connector does not support translating Protobuf unions (`oneof`) to the ClickHouse Variant type. Instead, list the `oneof` fields as individual nullable fields in your ClickHouse table schema.
Contributor

For Avro, we should support a union of string and bytes. Is that also true for Protobuf?

You mentioned that it works with nullable fields. Are there any other requirements to make it work?

Contributor Author

> For Avro, we should support a union of string and bytes. Is that also true for Protobuf?

yes, it's true. I will link to an example here soon.

Contributor Author

@kurnoolsaketh kurnoolsaketh May 5, 2026

> Are there any other requirements to make it work?

Besides making the fields separate and nullable, there's no other requirement.
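
The "separate and nullable" requirement can be pictured with a small sketch. It assumes a hypothetical `oneof contact { string email; string phone; }` and shows the row shape and column definitions that approach implies; the names `flatten_oneof` and `nullable_columns_ddl` are invented for this example and nothing here is taken from the connector's code.

```python
# Members of a hypothetical oneof named "contact"; assumed for illustration.
ONEOF_MEMBERS = ("email", "phone")


def flatten_oneof(record, members=ONEOF_MEMBERS):
    """Give every oneof member its own key; members that are not set become None."""
    row = {key: value for key, value in record.items() if key not in members}
    for member in members:
        row[member] = record.get(member)
    return row


def nullable_columns_ddl(members, ch_type="String"):
    """Render one Nullable column definition per oneof member."""
    return ",\n".join(f"    `{member}` Nullable({ch_type})" for member in members)


print(flatten_oneof({"id": 1, "email": "a@example.com"}))
print(nullable_columns_ddl(ONEOF_MEMBERS))
```

Only one member of a `oneof` is set per message, so at most one of these columns is non-null in any given row.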


###### Unsupported schemas {#unsupported-proto-schemas}
The following Protobuf schemas are unsupported by the connector:
- multi-message unions
Contributor

do we support nested messages? in what cases?

Contributor Author

yes we do, i will link to examples shortly

kurnoolsaketh removed the "Don't Merge" (Don't merge yet) label on May 11, 2026
kurnoolsaketh merged commit 1abdcb6 into main on May 11, 2026
16 checks passed
Development

Successfully merging this pull request may close these issues.

[test+docs] Extend Protobuf testing

5 participants