[BEAM-961] Add starting number to CountingInput #1505

Closed
wants to merge 2 commits into from

vladisav
Contributor

@vladisav vladisav commented Dec 4, 2016

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

@asfbot
asfbot commented Dec 4, 2016

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/5512/
--none--

@jbonofre
Member

jbonofre commented Dec 4, 2016

R: @jbonofre
CC: @davorbonaci
CC: @dhalperi

Contributor

@jkff jkff left a comment

Thank you for the contribution! I left a bunch of nitpicks.

* <p>To produce a bounded {@code PCollection<Long>} starting from {@code startOffset},
* use {@link CountingInput#forSubrange(long, long)} :
*
* <pre>{@code
Contributor

This code example is probably unnecessary; you can just say at line 56, "use {@link ...} instead".

* Creates a {@link BoundedCountingInput} that will produce the specified number of elements,
* starting from {@code startOffset} to (excluding) {@code endOffset}.
*/
public static BoundedCountingInput forSubrange(long startOffset, long endOffset) {
Contributor

I think "index" might be a better name than "offset" here - offsets are usually measured relative to something important (e.g. the beginning of a file).

@@ -76,6 +86,16 @@ public static BoundedCountingInput upTo(long numElements) {
}

/**
* Creates a {@link BoundedCountingInput} that will produce the specified number of elements,
Contributor

This overload does not let you specify the number of elements.

*
* @deprecated use {@link CountingInput#forSubrange(long, long)} instead
*/
@Deprecated
Contributor

Why add a method and immediately deprecate it? If you want users to not use it, package-local visibility is sufficient for that.
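
A minimal sketch of the visibility point, assuming a public entry point that delegates to a helper in the same package. `RangeInput` and `RangeSource` are hypothetical stand-ins, not Beam's `CountingInput` and `CountingSource`, and `LongStream` replaces the real source type so the example runs on its own.

```java
import java.util.stream.LongStream;

// Hypothetical helper class for illustration only; not Beam code.
final class RangeSource {
  private RangeSource() {}

  // No access modifier means package-private: code outside this package
  // cannot call the method, so @Deprecated is not needed to discourage use.
  static LongStream createForSubrange(long start, long end) {
    return LongStream.range(start, end);
  }
}

// Hypothetical public-facing class for illustration only; not Beam code.
final class RangeInput {
  private RangeInput() {}

  // The supported entry point delegates to the package-private helper.
  public static LongStream forSubrange(long start, long end) {
    return RangeSource.createForSubrange(start, end);
  }

  public static void main(String[] args) {
    // Prints each value in [2, 5) on its own line.
    forSubrange(2, 5).forEach(System.out::println);
  }
}
```

With this layout the helper never appears in the public API, so there is nothing to deprecate later.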

@@ -83,6 +83,20 @@
}

/**
* Creates a {@link BoundedSource} that will produce the specified number of elements,
Contributor

Same nitpicks about this comment.

*/
@Deprecated
static BoundedSource<Long> createSourceForSubrange(long startOffset, long endOffset) {
checkArgument(endOffset > startOffset, "numElements (%s) must be greater than 0",
Contributor

It's better to mention in an error message the things that the user directly specifies. Referring to numElements implies that the user specified a parameter called numElements, which is not the case. Rephrase as "endOffset must be greater than startOffset"?
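
A minimal sketch of the suggested rephrasing, assuming a standalone check rather than the Guava-style `checkArgument` call in the diff. `SubrangeCheck` and `checkSubrange` are hypothetical names, and the exact message wording is only one possibility.

```java
// Hypothetical stand-in for the range validation discussed above; not Beam code.
final class SubrangeCheck {
  private SubrangeCheck() {}

  /** Throws if [startOffset, endOffset) is empty or inverted. */
  static void checkSubrange(long startOffset, long endOffset) {
    if (endOffset <= startOffset) {
      // Name only the parameters the caller actually passed, with their values.
      throw new IllegalArgumentException(
          String.format(
              "endOffset (%d) must be greater than startOffset (%d)",
              endOffset, startOffset));
    }
  }

  public static void main(String[] args) {
    checkSubrange(5, 10); // valid range, no exception
    try {
      checkSubrange(10, 5); // inverted range
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Including both values in the message lets the caller see at a glance which argument was wrong.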

@@ -79,6 +79,26 @@ public static void addCountingAsserts(PCollection<Long> input, long numElements)
.isEqualTo(numElements - 1);
}

public static void addCountingAsserts(PCollection<Long> input, long start, long end) {
Contributor

Why duplicate these tests? What's the difference between CountingSourceTest and CountingInputTest?

Contributor Author

You're right, it's the same. I also refactored and removed some duplicate code from CountingInputTest.

@asfbot
asfbot commented Dec 5, 2016

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/beam_PreCommit_Java_MavenInstall/5519/
--none--

@jkff
Contributor

jkff commented Dec 5, 2016

Looks good as far as I'm concerned; I'll let JB do the rest of the review, thanks!

Member

@jbonofre jbonofre left a comment

LGTM
I'm going to merge.

@asfgit asfgit closed this in 493c04f Dec 6, 2016
@dhalperi
Contributor

dhalperi commented Dec 6, 2016

Minor drive-by comments:

  • Why only add this support to the bounded source? Presumably the unbounded source would benefit from the same change. In general, we strive for unified APIs where we can.
  • I probably would have added a test where start index (and/or end index) was negative, to sanity check those cases.
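
The negative-index sanity check in the second point could look roughly like the sketch below. It is plain Java, not a Beam test: `NegativeRangeSketch` and `summarize` are hypothetical names, and `LongStream.range` stands in for the bounded source. It assumes the properties worth checking are the element count, the minimum, and the maximum of a half-open range.

```java
import java.util.LongSummaryStatistics;
import java.util.stream.LongStream;

// Hypothetical sanity check for ranges with negative bounds; not a Beam test.
final class NegativeRangeSketch {
  private NegativeRangeSketch() {}

  /** Summarizes the half-open range [start, end). */
  static LongSummaryStatistics summarize(long start, long end) {
    return LongStream.range(start, end).summaryStatistics();
  }

  /** Verifies count == end - start, min == start, and max == end - 1. */
  static void checkRange(long start, long end) {
    LongSummaryStatistics stats = summarize(start, end);
    if (stats.getCount() != end - start
        || stats.getMin() != start
        || stats.getMax() != end - 1) {
      throw new AssertionError("unexpected contents for [" + start + ", " + end + ")");
    }
  }

  public static void main(String[] args) {
    checkRange(-3, 2);  // negative start, positive end
    checkRange(-5, -2); // both bounds negative
    System.out.println("negative ranges behave as expected");
  }
}
```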

5 participants