Add Protobuf -> Kafka Connect type mapping and unsupported proto schemas#6115
Conversation
dhtclk left a comment:
Looks good, just a misspelling on analogously that needs to be corrected before merging.
Suggested changes (correcting the misspelling `analagously` → `analogously` in the notes for the `int32`, `sint32`, and `sfixed32` rows):

| double | FLOAT64 | ✅ | |
| float | FLOAT32 | ✅ | |
| int32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analogously for INT16 if `connect.type=int16`) |
| sint32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analogously for INT16 if `connect.type=int16`) |
| sfixed32 | INT8/INT16/INT32 | ✅ | Defaults to INT32. Resolves to INT8 if the schema has option `connect.type=int8` (analogously for INT16 if `connect.type=int16`) |
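To show where a `connect.type` hint like the one these rows reference can live: the Confluent converter stores Connect metadata in Protobuf field options. The `confluent.field_meta` option syntax below is an assumption about how that metadata appears in a registered schema, and the message/field names are invented for illustration:

```proto
syntax = "proto3";

// Assumption: this import provides the (confluent.field_meta) option.
import "confluent/meta.proto";

message Sensor {
  // Hypothetical field intended to resolve to Connect INT8 rather than
  // the INT32 default, via the connect.type metadata parameter.
  int32 level = 1 [(confluent.field_meta) = {
    params: [ { key: "connect.type", value: "int8" } ]
  }];
}
```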
Please note: not every environment ships with the protobuf converter, so if you encounter issues with missing classes you may need an alternate release of the jar bundled with dependencies.
###### Type mapping {#proto-type-mapping}
This should say what type the mapping is from.
I suggest having a Header 2, "Type Mapping",
then a Header 3, "Protobuf Type Mapping";
then we will add a similar table for Avro.
imo, it's clear this refers to Protobuf since this is a subheading of Protobuf schema support. But I can make this "Protobuf type mapping" for more clarity 👍.

> then we will add a similar table for Avro.

We should have the same heading/sub-heading pattern for both Avro and Protobuf (see https://github.com/ClickHouse/clickhouse-docs/pull/6081/changes#diff-cfd6efb64f516235ee2ecb43e9da90a4a4f49b69cd47dbfe06c9e1586fb606bdR252). Since Avro and Protobuf are already separate sections, imo we don't need a new Header 2 called Type Mapping. Instead, we can just make the sub-heading names more explicit ("Protobuf type mapping" or "Avro type mapping"). WDYT?
❌: Not supported

⚠️: Partially supported
Here we need to highlight that this mapping is valid when `io.confluent.connect.protobuf.ProtobufConverter` is used. This is important because there may be other converters, and they do a different mapping.
Additionally, we need a link to https://docs.confluent.io/platform/current/connect/userguide.html#json-schema-and-protobuf because it explains some of the conversion.
thanks, good callout. I have added this
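For context on the converter the comment refers to, a minimal sink-configuration sketch; the property names follow the standard Kafka Connect converter configuration, and the Schema Registry URL is a placeholder:

```properties
# The Protobuf -> Kafka Connect mapping in this table applies when this
# converter deserializes record values.
value.converter=io.confluent.connect.protobuf.ProtobufConverter
# Hypothetical local Schema Registry endpoint, for illustration only.
value.converter.schema.registry.url=http://localhost:8081
```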
| bool | BOOLEAN | ✅ | |
| string | STRING | ✅ | |
| bytes | BYTES | ✅ | |
| enum | INT32/STRING | ✅ | Defaults to STRING. Resolves to INT32 if `int.for.enums=true` (see [ProtobufDataConfig.java](https://github.com/confluentinc/schema-registry/blob/22ced2df8e61586f89a1c88034e18fdada3cea9c/protobuf-converter/src/main/java/io/confluent/connect/protobuf/ProtobufDataConfig.java#L38)) |
is there some better documentation?
yes, added it 👍
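As a sketch of how the `int.for.enums` flag from the enum row might be passed through, assuming the standard `value.converter.` prefix convention for converter-level settings:

```properties
value.converter=io.confluent.connect.protobuf.ProtobufConverter
# With this set, enum fields resolve to INT32 instead of the STRING default.
value.converter.int.for.enums=true
```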
Refer to [Supported data types](#supported-data-types) for the mapping between Kafka Connect types and ClickHouse types.
###### Note on translating `oneof` fields to ClickHouse columns {#oneof-translation}
The connector does not support translating Protobuf unions (`oneof`) to the ClickHouse Variant type. Instead, list the `oneof` fields as individual nullable fields in your ClickHouse table schema.
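To illustrate the guidance above, a hypothetical `oneof` schema and a matching table shape; all message, field, and table names are invented for illustration:

```proto
syntax = "proto3";

message Event {
  // A union whose members should become separate nullable columns
  // on the ClickHouse side rather than a single Variant column.
  oneof payload {
    string text = 1;
    int64 count = 2;
  }
}
```

```sql
-- Each oneof member listed as its own Nullable column; at most one
-- of them is populated per row.
CREATE TABLE events (
    text  Nullable(String),
    count Nullable(Int64)
) ENGINE = MergeTree ORDER BY tuple();
```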
For Avro we should support a union of string and bytes. Is it true for Protobuf?
You have mentioned that it works with nullable fields. Are there any other requirements to make it work?
> For Avro we should support a union of string and bytes. Is it true for Protobuf?
yes, it's true. I will link to an example here soon.
> Are there any other requirements to make it work?
Besides making the fields separate and nullable, there's no other requirement.
###### Unsupported schemas {#unsupported-proto-schemas}
The following Protobuf schemas are unsupported by the connector:
- multi-message unions
do we support nested messages? in what cases?
There was a problem hiding this comment.
yes we do, i will link to examples shortly
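As a placeholder until those examples are linked, a hypothetical schema of the nested-message kind being discussed; the names are invented and this is only a sketch of the shape the linked examples would cover:

```proto
syntax = "proto3";

message Order {
  // Nested message type; the discussion above indicates these are supported.
  message Item {
    string sku = 1;
    int32 qty = 2;
  }
  repeated Item items = 1;
}
```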
Summary
Add type mappings between Protobuf and Kafka Connect to improve users' visibility into how proto types eventually resolve to ClickHouse data types. Also, add the current (evolving) list of unsupported proto schemas based on the findings in ClickHouse/clickhouse-kafka-connect#735 (issue: ClickHouse/clickhouse-kafka-connect#634)
closes: ClickHouse/clickhouse-kafka-connect#634