Skip to content

Fix multi-topic routing to prevent writing all records to all tables#15639

Open
aryash45 wants to merge 1 commit intoapache:mainfrom
aryash45:fix/multi-topic-routing-#15584
Open

Fix multi-topic routing to prevent writing all records to all tables#15639
aryash45 wants to merge 1 commit intoapache:mainfrom
aryash45:fix/multi-topic-routing-#15584

Conversation

@aryash45
Copy link

When iceberg.tables.route-field is not set, records from all topics were being written to all tables if they shared the same id-columns name and similar schemas. This was due to the routing logic broadcasting to all tables instead of matching topics to table names.

This change implements topic-based routing by:

  • Extracting the last dot-delimited segment from both topic and table names
  • Matching them case-insensitively
  • Only writing to matching tables
  • Falling back to broadcast only if no topic-to-table match is found

This preserves backward compatibility while fixing the multi-topic routing issue reported in #15584.

Changes:

  • SinkWriter.java: Added topic-based routing in routeRecordStatically()
  • Added lastSegment() helper method
  • TestSinkWriter.java: Added testTopicRoute(), testTopicRouteSecondTable(), and testTopicRouteFallbackBroadcast() test cases
    Questions For Maintainers
    -Should unmatched topic fallback log a warning to alert users of potential misconfiguration?
  • Is topic-based routing the officially recommended approach for multi-topic setups?
  • Should this routing behavior be documented in kafka-connect.md

Closes #15584

When iceberg.tables.route-field is not set, records from all topics were
being written to all tables if they shared the same id-columns name and
similar schemas. This was due to the routing logic broadcasting to all
tables instead of matching topics to table names.

This change implements topic-based routing by:
- Extracting the last dot-delimited segment from both topic and table names
- Matching them case-insensitively
- Only writing to matching tables
- Falling back to broadcast only if no topic-to-table match is found

This preserves backward compatibility while fixing the multi-topic routing
issue reported in apache#15584.

Changes:
- SinkWriter.java: Added topic-based routing in routeRecordStatically()
- Added lastSegment() helper method
- TestSinkWriter.java: Added testTopicRoute(), testTopicRouteSecondTable(),
  and testTopicRouteFallbackBroadcast() test cases

Closes apache#15584
@aryash45
Copy link
Author

hey @nastra @huaxingao please review this pr and tell me if anything needs changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Multi-topic → multi-table routing writes all records to all tables when schemas are identical

1 participant