Sources #7

Closed
purplefox opened this issue Jul 1, 2021 · 0 comments
Implement sources for ingesting data from Kafka.

Much of the plumbing is already in place for this.

One instance of a particular source will be deployed on each node of the cluster. The source instance will maintain a set of Kafka consumers (this can be just one for phase 1). As a batch of records is fetched from the consumer, it should be passed to the source.ingestRows method for processing. If the batch is processed without error, the consumer offsets for the batch should be committed.
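A minimal sketch of that fetch/ingest/commit loop in Go, assuming the segmentio/kafka-go client purely for illustration (the project may use a different Kafka client), and assuming hypothetical Row and ingestRows shapes, since the issue only names the ingestRows method:

```go
package source

import (
	"context"
	"time"

	"github.com/segmentio/kafka-go"
)

// Row and ingestRows stand in for the real source-table plumbing; their exact
// shape in the project is an assumption here.
type Row struct {
	Key   []byte
	Value []byte
}

type Source struct {
	reader *kafka.Reader // a single consumer per source instance is enough for phase 1
}

// ingestRows is assumed to process a batch atomically, returning an error if
// the batch could not be applied to the source table.
func (s *Source) ingestRows(rows []Row) error {
	// ... write the rows into the source table ...
	return nil
}

// consumeLoop fetches a batch of records, hands it to ingestRows, and only
// commits the consumer offsets once the batch was processed without error.
func (s *Source) consumeLoop(ctx context.Context, maxBatch int) error {
	for {
		msgs := make([]kafka.Message, 0, maxBatch)
		rows := make([]Row, 0, maxBatch)

		// Block for the first message, then drain up to maxBatch more
		// under a short deadline so small batches are not delayed.
		first, err := s.reader.FetchMessage(ctx)
		if err != nil {
			return err
		}
		msgs = append(msgs, first)
		rows = append(rows, Row{Key: first.Key, Value: first.Value})

		batchCtx, cancel := context.WithTimeout(ctx, 50*time.Millisecond)
		for len(msgs) < maxBatch {
			m, err := s.reader.FetchMessage(batchCtx)
			if err != nil {
				break // deadline reached; process what we have
			}
			msgs = append(msgs, m)
			rows = append(rows, Row{Key: m.Key, Value: m.Value})
		}
		cancel()

		if err := s.ingestRows(rows); err != nil {
			// Do not commit: the uncommitted offsets will be redelivered.
			return err
		}
		if err := s.reader.CommitMessages(ctx, msgs...); err != nil {
			return err
		}
	}
}
```

Note that with kafka-go, committing offsets this way requires the kafka.Reader to be created with a GroupID (consumer-group mode); other clients expose the same fetch-then-commit pattern under different names.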

If a source already has a key, that should be used as the key in the source table. If not, the user can optionally specify a key from an existing field (e.g. a timestamp). A monotonically increasing value can also be generated for the key if needed.
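A sketch of those three key strategies, again in Go; the KeySelector type, field names, and key encoding are illustrative assumptions rather than anything the issue specifies:

```go
package source

import "encoding/binary"

// KeySelector chooses the source-table key for an incoming record using the
// three strategies described above. It is called only from the single
// consumer loop of its source instance, so the counter needs no locking here.
type KeySelector struct {
	KeyField string // optional: name of an existing field to use as the key (e.g. "timestamp")
	seq      uint64 // monotonically increasing fallback, per source instance
}

// SelectKey returns the key for one record; fields is the decoded message body.
func (k *KeySelector) SelectKey(msgKey []byte, fields map[string][]byte) []byte {
	// 1. The record already carries a key: use it directly.
	if len(msgKey) > 0 {
		return msgKey
	}
	// 2. The user nominated an existing field as the key.
	if k.KeyField != "" {
		if v, ok := fields[k.KeyField]; ok {
			return v
		}
	}
	// 3. Otherwise generate a monotonically increasing key.
	k.seq++
	buf := make([]byte, 8)
	binary.BigEndian.PutUint64(buf, k.seq)
	return buf
}
```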

@purplefox purplefox added this to the Phase 1 milestone Jul 1, 2021
@purplefox purplefox self-assigned this Jul 28, 2021