Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BEAM-11325] Kafka Dynamic Read #13750

Merged
merged 1 commit into from Feb 4, 2021
Merged

Conversation

boyuanzz
Copy link
Contributor

@boyuanzz boyuanzz commented Jan 14, 2021

design doc: https://docs.google.com/document/d/1FU3GxVRetHPLVizP3Mdv6mP5tpjZ3fd99qNjUI5DT5k/edit#


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Choose reviewer(s) and mention them in a comment (R: @username).
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

Post-Commit Tests Status (on master branch)

Lang SDK Dataflow Flink Samza Spark Twister2
Go Build Status --- Build Status --- Build Status ---
Java Build Status Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status Build Status
Build Status
Build Status
Build Status
Python Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
Build Status
--- Build Status ---
XLang Build Status Build Status Build Status --- Build Status ---

Pre-Commit Tests Status (on master branch)

--- Java Python Go Website Whitespace Typescript
Non-portable Build Status Build Status
Build Status
Build Status
Build Status
Build Status Build Status Build Status Build Status
Portable --- Build Status --- --- --- ---

See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests

See CI.md for more information about GitHub Actions CI.

@boyuanzz
Copy link
Contributor Author

cc: @aromanenko-dev It is a really early stage PR but just in case you are interested in : )

@boyuanzz boyuanzz changed the title [WIP] Kafka Dynamic Read [11325] Kafka Dynamic Read Jan 20, 2021
@boyuanzz boyuanzz changed the title [11325] Kafka Dynamic Read [BEAM-11325] Kafka Dynamic Read Jan 20, 2021
@boyuanzz
Copy link
Contributor Author

Hi Alexey, this PR is ready to review. I mark WatchKafkaTopicPartitionDoFn as Experimental for now and we can remove it after I have an IT for it.

@aromanenko-dev
Copy link
Contributor

@boyuanzz Thanks! I'll take a look on this in the next week.

@boyuanzz
Copy link
Contributor Author

boyuanzz commented Feb 1, 2021

Kindly pinging : )
I want to make it into 2.29.0 release if that's feasible.

@aromanenko-dev
Copy link
Contributor

Run Java PreCommit

Copy link
Contributor

@aromanenko-dev aromanenko-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, sorry for delay. I took a quick first look and it's fine to me, just a minor notes.

Did you test it against real Kafka cluster with adding/removing topics and/or partitions?

CC: @chamikaramj Could you take a look as well if you have a time?

*
* For a given kafka bootstrap_server, KafkaIO is also able to detect and read from available {@link
* TopicPartition} dynamically. For more design details, please refer to
* https://docs.google.com/document/d/1FU3GxVRetHPLVizP3Mdv6mP5tpjZ3fd99qNjUI5DT5k/. To enable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, put the details here (at least in a short form) instead of the link since it can be expired or disappeared along the time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. Thanks for pointing this out.

@@ -553,6 +587,9 @@
abstract Builder<K, V> setValueDeserializerProvider(
DeserializerProvider deserializerProvider);

abstract Builder<K, V> setCheckStopReadingFn(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't this functionality added in #13710 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I opened this PR prior to mergeing #13710. I'll rebase them. Thanks for catching it : )


/**
* A stateful {@linkl DoFn} that emits new available {@link TopicPartition} regularly. Please refer
* to https://docs.google.com/document/d/1FU3GxVRetHPLVizP3Mdv6mP5tpjZ3fd99qNjUI5DT5k/edit# for more
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same note about the link to details.

@aromanenko-dev
Copy link
Contributor

Also, ptal on "Java PreCommit" test

16:40:12 Execution failed for task ':sdks:java:io:kafka:checkstyleMain'.

@boyuanzz
Copy link
Contributor Author

boyuanzz commented Feb 2, 2021

I added most javadoc to KafkaIO to explain how it works and potential race condition. I haven't test this functionality with real Kafka cluster so I marked WatchKafkaTopicPartitionDoFn as @Experimental. I'm exploring ways to have ITs for these functionalities. @aromanenko-dev

Copy link
Contributor

@aromanenko-dev aromanenko-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM
I think we can merge it if there are no other objections/comments

@aromanenko-dev
Copy link
Contributor

Run Java PostCommit

@boyuanzz
Copy link
Contributor Author

boyuanzz commented Feb 4, 2021

Thanks for your quick review! I'm going to merge this PR.

Cham, feel free to drop any comments if you have concerns around this. I'll follow up if any.

@boyuanzz boyuanzz merged commit 47d3326 into apache:master Feb 4, 2021
@aromanenko-dev
Copy link
Contributor

@boyuanzz I think we forgot one small but important thing - update CHANGES.md about this feature.

@boyuanzz
Copy link
Contributor Author

boyuanzz commented Feb 9, 2021

@boyuanzz I think we forgot one small but important thing - update CHANGES.md about this feature.

Thanks for the remainder! I'll update CHANGES.md later today.

@boyuanzz
Copy link
Contributor Author

@boyuanzz I think we forgot one small but important thing - update CHANGES.md about this feature.

Thanks for the remainder! I'll update CHANGES.md later today.

I just checked CHANGES.md and it seems like we want to add this feature into 2.29.0 release. I'll follow up with this when 2.29.0 release branch is cut.

@boyuanzz boyuanzz deleted the dynamic_kafka branch March 11, 2021 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants