[GOBBLIN-492] Refactor LoopingDatasetFinderSource to make it extensible #2363

Closed
wants to merge 2 commits into from

autumnust
Contributor

Dear Gobblin maintainers,

This PR makes DeepIterator extensible so that alternative iterator implementations can easily override it. The changes include:

  • Several signature changes.
  • A new getWorkUnitIterator method that subclasses can override.
  • A new doComputeNext method that separates the logic of getting the next available dataset object (which can be a dataset or a partition) from DeepIterator itself. An iterator that extends DeepIterator inherits the existing functionality (looping over datasets, watermarks, etc.) but supplies its own logic for fetching the next available dataset.
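
To illustrate the intent of the doComputeNext split, here is a minimal, self-contained sketch of the pattern. It is not the Gobblin code: LoopingIterator, DatasetIterator, PartitionIterator, and collect are hypothetical names, plain strings stand in for datasets and work units, and watermark handling is left out. The base class owns the shared bookkeeping (here only a maximum item count), and each subclass decides what the "next available dataset object" is.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical sketch; none of these names come from Gobblin.
class DeepIteratorSketch {

  /** Base iterator: owns the limit bookkeeping and delegates "what comes next" to a hook. */
  abstract static class LoopingIterator<T> implements Iterator<T> {
    private final int maximumItems;
    private int emitted = 0;
    private T next;
    private boolean computed = false;

    LoopingIterator(int maximumItems) {
      this.maximumItems = maximumItems;
    }

    /** Hook for subclasses: return the next item, or null when nothing is left. */
    protected abstract T doComputeNext();

    @Override
    public final boolean hasNext() {
      if (!computed) {
        // Stop asking the subclass once the shared limit has been reached.
        next = emitted < maximumItems ? doComputeNext() : null;
        computed = true;
      }
      return next != null;
    }

    @Override
    public final T next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      computed = false;
      emitted++;
      T result = next;
      next = null;
      return result;
    }
  }

  /** Emits one item per dataset. */
  static class DatasetIterator extends LoopingIterator<String> {
    private final Iterator<String> datasets;

    DatasetIterator(List<String> datasets, int maximumItems) {
      super(maximumItems);
      this.datasets = datasets.iterator();
    }

    @Override
    protected String doComputeNext() {
      return datasets.hasNext() ? datasets.next() : null;
    }
  }

  /** Emits one item per partition, skipping datasets that have no partitions. */
  static class PartitionIterator extends LoopingIterator<String> {
    private final Iterator<Map.Entry<String, List<String>>> datasets;
    private String currentDataset;
    private Iterator<String> currentPartitions = Collections.emptyIterator();

    PartitionIterator(Map<String, List<String>> partitionsByDataset, int maximumItems) {
      super(maximumItems);
      this.datasets = partitionsByDataset.entrySet().iterator();
    }

    @Override
    protected String doComputeNext() {
      // Advance to the next dataset that still has partitions to emit.
      while (!currentPartitions.hasNext()) {
        if (!datasets.hasNext()) {
          return null;
        }
        Map.Entry<String, List<String>> entry = datasets.next();
        currentDataset = entry.getKey();
        currentPartitions = entry.getValue().iterator();
      }
      return currentDataset + "/" + currentPartitions.next();
    }
  }

  /** Drains an iterator into a list. */
  static <T> List<T> collect(Iterator<T> iterator) {
    List<T> items = new ArrayList<>();
    while (iterator.hasNext()) {
      items.add(iterator.next());
    }
    return items;
  }

  public static void main(String[] args) {
    System.out.println(collect(new DatasetIterator(Arrays.asList("a", "b", "c"), 2)));
    Map<String, List<String>> partitions = new LinkedHashMap<>();
    partitions.put("ds1", Arrays.asList("p1", "p2"));
    partitions.put("ds2", Collections.<String>emptyList());
    System.out.println(collect(new PartitionIterator(partitions, 10)));
  }
}
```

Because hasNext and next are final in the sketch, a subclass cannot accidentally bypass the shared limit; it only answers the question "what is the next item?".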

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Contributor

@ibuenros ibuenros left a comment

+1 Looks good other than one comment.

*/
protected Iterator<WorkUnit> getWorkUnitIterator(Iterator<Dataset> datasetIterator,
    Optional<WorkUnitState> maxWorkUnit, int maximumWorkUnits) throws IOException {
  String previousDatasetUrnWatermark = null;

Why does this need to move inside the method?

@asfgit asfgit closed this in 1c2d30b May 15, 2018
2 participants