[BEAM-744] A runner should be able to override KafkaIO max wait prope… #1125

amitsela · 2016-10-18T11:51:23Z

…rties.

Add KafkaOptions for the UnboundedKafkaReader.

amitsela · 2016-10-18T11:53:48Z

R: @rangadi and @dhalperi (committer).
CC: @aljoscha, @tgroh (Flink/Direct runners).

amitsela · 2016-10-18T12:32:07Z

Jenkins failures seem to be related to the Dataflow IT tests (NoSuchMethodError). Could it be a dirty classpath in the container running the job?

dhalperi · 2016-10-18T16:20:57Z

I think that this is generally on the wrong path. Runners should not need to override temporal constants in specific transforms to get sane behavior. I believe the simple rule of thumb should be "readers should return as soon as they are able" + "runners may poll advance() in a loop for a certain period of time if it returned too fast" + "runners must tolerate sources that take a long time to start or advance, because real systems operate that way".

I think we're violating all of these in various places, but combined, these principles add up to a good solution. Thoughts?

(Also, if we reach agreement we should probably summarize to dev@ list?)
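
To make the second rule concrete, here is a minimal sketch of a runner-side loop that keeps polling a reader's advance() for a bounded period when it returns false quickly. It is not taken from any Beam runner: the class `AdvancePoller`, the method `pollUntilAvailable`, and both the budget and pause parameters are hypothetical names chosen for illustration, and the reader is modeled as a plain `BooleanSupplier`.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

/** Hypothetical runner-side helper; an illustration only, not Beam API. */
final class AdvancePoller {
  private AdvancePoller() {}

  /**
   * Calls {@code advance} repeatedly until it reports an element or the
   * time budget elapses. The reader itself is expected to return quickly;
   * the waiting policy lives here, in the runner.
   *
   * @return true if an element became available within the budget
   */
  static boolean pollUntilAvailable(
      BooleanSupplier advance, long budgetMillis, long pauseMillis)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(budgetMillis);
    do {
      if (advance.getAsBoolean()) {
        return true; // the reader produced an element
      }
      if (pauseMillis > 0) {
        Thread.sleep(pauseMillis); // brief pause to avoid a hot spin
      }
    } while (System.nanoTime() - deadline < 0);
    return false; // budget exhausted; the runner can reschedule the reader
  }
}
```

With this split, a source only needs a short internal timeout, and each runner decides for itself how long it is willing to keep polling.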

amitsela · 2016-10-18T17:14:36Z

I'd be happy to summarize this once we have something, and I agree with what you wrote, Dan, but it seems that this was an issue for the DirectRunner, as explained in this conversation.
According to your suggestion, we should set the two "wait" properties to something like 10 msec, correct?
If the DirectRunner can handle it, I'd be happy to change the ticket, PR, and commit.

dhalperi · 2016-10-18T17:23:26Z

Per conversation with @tgroh, I believe we can and should do this now.

tgroh · 2016-10-18T18:18:06Z

The DirectRunner should reuse readers for some amount of time, and forever until they produce elements. #1128 fixes the current state to actually continue to read from a shard that ever returned false in an initial evaluation, but is not directly related to this change.

rangadi · 2016-10-18T18:51:07Z

sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java

@@ -757,7 +757,7 @@ public void validate() {

    private static final Duration KAFKA_POLL_TIMEOUT = Duration.millis(1000);
    // how long to wait for new records from kafka consumer inside start()
-    private static final Duration START_NEW_RECORDS_POLL_TIMEOUT = Duration.standardSeconds(5);
+    private static final Duration START_NEW_RECORDS_POLL_TIMEOUT = Duration.millis(10);


Better to remove this then.

Recently when I modified KafkaIOTest, I removed a bit of extra code that handled 'false' from start(). I need to put that back. I can send a separate PR for that.

amitsela · 2016-10-18

So we'll only have NEW_RECORDS_POLL_TIMEOUT. Sure, why not.

rangadi · 2016-10-18T18:56:47Z

I believe the simple rule of thumb should be "readers should return as soon as they are able" + "runners may poll advance() in a loop for a certain period of time if it returned too fast" + "runners must tolerate sources that take a long time to start or advance, because real systems operate that way".

I like these. 👍

rangadi

LGTM.
Suggested a minor improvement to comment.
I will send another CL with a fix to KafkaIOTest (otherwise it would occasionally flake).

rangadi · 2016-10-18T20:40:52Z

sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java

-    // how long to wait for new records from kafka consumer inside start()
-    private static final Duration START_NEW_RECORDS_POLL_TIMEOUT = Duration.standardSeconds(5);
-    // how long to wait for new records from kafka consumer inside advance()
+    // how long to wait for new records from kafka consumer.


Can you add 'inside advance()' or 'inside advance()/start()' to this comment? It would make it clearer where the timeout applies.

rangadi · 2016-10-19T04:58:26Z

Fixed KafkaIOTest in #1133

rangadi · 2016-10-19T05:02:08Z

sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java

@@ -968,7 +966,7 @@ public void run() {

      // Wait for longer than normal when fetching a batch to improve chances a record is available
      // when start() returns.
-      nextBatch(START_NEW_RECORDS_POLL_TIMEOUT);


Actually, can you remove this arg for nextBatch() and use NEW_RECORDS_POLL_TIMEOUT directly inside nextBatch()?
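
A rough sketch of the shape this suggestion points to, with the timeout read from a single constant inside nextBatch() rather than passed in by the caller. This is a simplified stand-in, not the KafkaIO code: the class `BatchReaderSketch`, the `offer` and `currentBatch` helpers, the `String` record type, the boolean return value, and the 10 ms figure are all assumptions made for illustration, and a plain `long` replaces the Joda `Duration` to keep the example dependency-free.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Simplified, hypothetical stand-in for the reader discussed in this PR. */
final class BatchReaderSketch {
  // How long to wait for new records inside start()/advance().
  // The 10 ms value is an assumption for this sketch.
  private static final long NEW_RECORDS_POLL_TIMEOUT_MILLIS = 10;

  private final BlockingQueue<List<String>> availableRecordsQueue =
      new LinkedBlockingQueue<>();
  private List<String> currentBatch = Collections.emptyList();

  /** Stands in for the consumer thread handing over a polled batch. */
  void offer(List<String> batch) {
    availableRecordsQueue.add(batch);
  }

  /** Fetches the next batch, waiting at most the single shared timeout. */
  boolean nextBatch() throws InterruptedException {
    // BlockingQueue.poll returns null if nothing arrives before the timeout.
    List<String> records =
        availableRecordsQueue.poll(NEW_RECORDS_POLL_TIMEOUT_MILLIS, TimeUnit.MILLISECONDS);
    if (records == null) {
      currentBatch = Collections.emptyList();
      return false; // return promptly; the runner decides whether to retry
    }
    currentBatch = records;
    return true;
  }

  List<String> currentBatch() {
    return currentBatch;
  }
}
```

Both start() and advance() would then call nextBatch() with no argument, so there is only one place where the wait is defined.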

rangadi · 2016-10-19T05:05:40Z

sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java

@@ -968,7 +966,7 @@ public void run() {

      // Wait for longer than normal when fetching a batch to improve chances a record is available


Remove this comment.

amitsela · 2016-10-19T07:02:06Z

@rangadi I've addressed your comments, PTAL.

rangadi · 2016-10-19T18:25:03Z

sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java

-                                             TimeUnit.MILLISECONDS);
+        // poll available records, wait (if necessary) up to the specified timeout.
+        records = availableRecordsQueue.poll(NEW_RECORDS_POLL_TIMEOUT.getMillis(),
+            TimeUnit.MILLISECONDS);


Optional/minor: align the args?

amitsela · 2016-10-19

I really don't mind, but it seems we don't have a consensus on arg alignment.
I'll align and commit, thanks!

rangadi

👍 LGTM.
Thanks for the updates.

amitsela force-pushed the BEAM-744 branch from 627f50c to b86b108 Compare October 18, 2016 18:10

rangadi reviewed Oct 18, 2016

[BEAM-744] UnboundedKafkaReader should return as soon as it can.

f5410c2

amitsela force-pushed the BEAM-744 branch from b86b108 to f5410c2 Compare October 18, 2016 19:04

rangadi approved these changes Oct 18, 2016

rangadi reviewed Oct 19, 2016

Use timeout directly in nextBatch()

db195c8

rangadi reviewed Oct 19, 2016

fixup! align args.

387d61b

asfgit closed this in b0cb2e8 Oct 19, 2016

amitsela deleted the BEAM-744 branch October 19, 2016 18:45
