Skip to content

[BEAM-3405] Make maxNumRecords a long instead of an int for KinesisIO#4339

Merged
iemejia merged 1 commit intoapache:masterfrom
iemejia:BEAM-3405-maxnumrecords
Jan 12, 2018
Merged

[BEAM-3405] Make maxNumRecords a long instead of an int for KinesisIO#4339
iemejia merged 1 commit intoapache:masterfrom
iemejia:BEAM-3405-maxnumrecords

Conversation

@iemejia
Copy link
Member

@iemejia iemejia commented Jan 3, 2018

Follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Run mvn clean verify to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

@iemejia iemejia requested review from jkff and lukecwik January 3, 2018 15:07
PTransform<PBegin, PCollection<KinesisRecord>> transform = unbounded;

if (getMaxNumRecords() < Long.MAX_VALUE) {
transform = unbounded.withMaxNumRecords(getMaxNumRecords());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous code supported a combination of both options, but the current code does not. Is this intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to make it consistent with all the other uses on unbounded sources e.g. KafkaIO, MqttIO, etc.

Notice that the with methods on Read.Unbounded seem to be exclusive. See org.apache.beam.sdk.io.Read.Unbounded#withMaxNumRecords (and the method below), but they can be of course overwritten because BoundedReadFromUnboundedSource supports both parameters.

I suppose that a user that wants the combined options can just use use the time based one followed by Sample.any. What do you prefer? Should I let it like it was or make it consistent with the other IOs?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable. Then can you just add code to verify that the user didn't specify both options at the same time, i.e. that we're not silently ignoring one of them?

@iemejia iemejia force-pushed the BEAM-3405-maxnumrecords branch 5 times, most recently from ba0d99b to 99b1647 Compare January 10, 2018 21:42
@iemejia
Copy link
Member Author

iemejia commented Jan 10, 2018

I decided to respect the semantics that it has before and add these semantics to the other Unbounded IOs on Beam in a future PR. I added an extra validation to see if a stream exists before continuing with the execution. PTAL @jkff

Copy link
Contributor

@jkff jkff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

try {
DescribeStreamResult describeStreamResult = client.describeStream(streamName);
StreamDescription streamDescription = describeStreamResult.getStreamDescription();
return streamName.equals(streamDescription.getStreamName());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way this can return false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's right that validation was quite silly. From looking at the official doc on the response of the Aws Kinesis API it actually is status code 200 or exception, so I updated to do it like that, for future ref.
https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DescribeStream.html

I hope everything is ok now, thanks again and sorry for the silly validation.

@iemejia iemejia force-pushed the BEAM-3405-maxnumrecords branch from 99b1647 to 912fa56 Compare January 11, 2018 14:14
@iemejia iemejia merged commit a5b5779 into apache:master Jan 12, 2018
@iemejia iemejia deleted the BEAM-3405-maxnumrecords branch January 12, 2018 08:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants