KafkaIO bounded source #18353

@kennknowles

KafkaIO could be a useful source for batch applications as well. It could implement a bounded source. The primary question is how the bounds are specified.

One option: the source specifies a time period (say 9am-10am), and KafkaIO fetches the appropriate start and end offsets based on the time index in Kafka. This would suit many batch applications that are launched on a schedule.
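
To make the time-period option concrete, here is a minimal sketch of how a time window could be turned into an offset range. It does not talk to Kafka: the partition is modelled as a hypothetical in-memory list whose index is the offset and whose value is the record timestamp, and the timestamps are assumed to be non-decreasing. The lookup mirrors the semantics of the Java consumer's `offsetsForTimes` (earliest offset whose timestamp is greater than or equal to the target); the names `offset_for_time`, `bounded_range`, and `ms` are made up for illustration.

```python
from bisect import bisect_left


def offset_for_time(timestamps_ms, target_ms):
    """Earliest offset whose timestamp is >= target_ms.

    Returns len(timestamps_ms) when no such record exists, i.e. the
    position one past the last record in this toy partition.
    """
    return bisect_left(timestamps_ms, target_ms)


def bounded_range(timestamps_ms, start_ms, end_ms):
    """Half-open offset range [start, end) for records in [start_ms, end_ms)."""
    return (offset_for_time(timestamps_ms, start_ms),
            offset_for_time(timestamps_ms, end_ms))


def ms(hour, minute=0):
    """Hypothetical helper: milliseconds since midnight for a wall-clock time."""
    return (hour * 60 + minute) * 60 * 1000


# Toy partition: list index is the offset, value is the record timestamp.
partition = [ms(8, 50), ms(9, 0), ms(9, 30), ms(9, 59), ms(10, 0), ms(10, 5)]

start, end = bounded_range(partition, ms(9), ms(10))
print(start, end)  # 1 4 -> the 9am-10am window covers offsets 1, 2 and 3
```

Because the range is half-open, consecutive scheduled runs (9am-10am, then 10am-11am) would neither overlap nor skip records under the monotonic-timestamp assumption. Real topics may carry out-of-order timestamps, which this sketch does not handle.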

Another option is to always read up to the current end of each partition and commit the offsets to Kafka. Handling failures and multiple runs of a task might be complicated.
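
The read-to-end option, and the kind of complication it raises, can be sketched as follows. This is a toy model, not KafkaIO code: the topic is a plain Python list, "committing" is just returning the new offset, and `run_batch` and `flaky` are hypothetical names. The end offset is snapshotted once at the start of the run so that the read is bounded, and the offset is only advanced after every record has been processed.

```python
def run_batch(log, committed, process):
    """Process log[committed:end], where end is fixed when the run starts.

    Returns the new offset to commit. If process() raises, the exception
    propagates and the caller's committed offset is left unchanged.
    """
    end = len(log)  # snapshot the upper bound once, at the start of the run
    for record in log[committed:end]:
        process(record)
    return end  # "commit" only after the whole range succeeded


seen = []


def flaky(record):
    """Sink that fails once on record 'e' to simulate a mid-run failure."""
    if record == "e" and not flaky.failed_once:
        flaky.failed_once = True
        raise RuntimeError("simulated worker failure")
    seen.append(record)


flaky.failed_once = False

log = ["a", "b", "c"]
committed = run_batch(log, 0, seen.append)  # first run reads offsets 0..2

log += ["d", "e"]  # new records arrive between runs
try:
    committed = run_batch(log, committed, flaky)  # fails after handling 'd'
except RuntimeError:
    pass  # committed is still 3, so the retry starts from offset 3 again

committed = run_batch(log, committed, flaky)  # retry re-reads 'd', then 'e'
print(committed, seen)  # 5 ['a', 'b', 'c', 'd', 'd', 'e']
```

In this model a failed run followed by a retry processes `d` twice, so the scheme is at-least-once unless the sink deduplicates or the commit is made atomic with the output.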

Imported from Jira BEAM-2185. Original Jira may contain additional context.
Reported by: rangadi.
