Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KafkaConnectSink effectively ignores topic name from sink configuration file #12880

Closed
kaja78 opened this issue Nov 18, 2021 · 4 comments
Closed
Assignees
Labels
lifecycle/stale type/bug The PR fixed a bug or issue reported a bug

Comments

@kaja78
Copy link

kaja78 commented Nov 18, 2021

Describe the bug
The topic configuration property of PulsarKafkaConnectSinkConfig is not passed to Kafka sink. Instead of it, the original Pulsar topic name is passed.

In the KafkaConnectSink.toSinkRecord() method, the topic name is derived using:

final String topic = sourceRecord.getTopicName().orElse(topicName);

The topicName instance variable holds topic name from PulsarKafkaConnectSinkConfig. However I guess, there is always topicName present on sourceRecord, so topic property from configuration is never passed to Kafka sink.

Expected behaviour
Simply the topic from PulsarKafkaConnectSinkConfig should be always used in KafkaConnectSink.toSinkRecord().
final String topic = topicName;

Actually it would make sense to reverse the original logic => if the topic is not set in sink configuration file, then use pulsar topic name. However, the topic configuration property is currently required, so it just should be always used.

Additional context
I identified this issue while testing https://github.com/datastax/snowflake-connector. It uses KafkaConnectSink as parent class.
The target table name is derived from Kafka topic name in com.snowflake.kafka.connector.SnowflakeSinkConnector. Since I am not able to configure the topic name in sink configuration file, pulsar messages are stored into tables with ugly names following pattern: PERSISTENT___[tenant][namespace][topic]_[hash] .

@dlg99
Copy link
Contributor

dlg99 commented Nov 18, 2021

@kaja78 I don't think using final String topic = topicName; is a good idea.
IIRC sink can use multiple input topics (via --inputs) and we'd want to route them to appropriate destinations.
"_" appear as result of replacement of ":" and "/" in the full topic name (URI as in "persistent://tenant/ns/topic") that snowflake connector does internally to generate table name compatible with the snowflake.
Theoretically, topic2table mapping of "persistent___tenant_ns_topic:cooltablename" might work though it did not for me. I am not sure why - either the character replacement happens after the table mapping in the snowflake's kafka connector or I screwed something up with upper/lower cases.

@kaja78
Copy link
Author

kaja78 commented Nov 19, 2021

The character replacement happens only in case there is no topic2table mapping defined and topic name is not valid snowflake identifier: https://github.com/snowflakedb/snowflake-kafka-connector/blob/v1.5.5/src/main/java/com/snowflake/kafka/connector/Utils.java#L536

Since topic name contains colon character, there is no way to specify table mapping for pulsar persistent topics.

I think the Kafka topic name derivation logic should be at least refactored into protected method, so one can easily override it in child Sink implementation.

Anyway, I think current implementation is not consistent with @FieldDoc. So may be making topic optional and using it to override original pulsar topic name makes more sense.

@github-actions
Copy link

The issue had no activity for 30 days, mark with Stale label.

@tisonkun
Copy link
Member

Closed as stale and no consensus. Please create a new issue if it's still relevant to the maintained versions.

See #12893 (review).

@tisonkun tisonkun closed this as not planned Won't fix, can't repro, duplicate, stale Dec 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/stale type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
4 participants