[GOBBLIN-284] Add retry in SalesforceExtractor to handle transient network errors #2137
Conversation
@zxcware please review.
```java
this.bulkRecordCount++;

// Insert records in record set until it reaches the batch size
if (recordCount >= batchSize) {
```
The `while` loop already has a check; is it necessary to have this extra check?
This is the existing logic moved into a new method. The check here is required to chunk the stream into batches since the while loop is only checking for end of stream. This check could be moved into the while, but I think it is here for the extra logging when the condition is true.
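The pattern described above can be sketched as follows. This is an illustrative toy, not the actual `SalesforceExtractor` code: the outer `while` only detects end of stream, so the inner size check is what cuts the stream into fixed-size batches (and gives a natural place for the extra logging).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Minimal sketch of chunking a record stream into batches. Names
// (readBatches, BatchReader) are illustrative, not the real API.
public class BatchReader {
  public static List<List<String>> readBatches(Iterator<String> stream, int batchSize) {
    List<List<String>> batches = new ArrayList<>();
    List<String> current = new ArrayList<>();
    while (stream.hasNext()) {           // only checks for end of stream
      current.add(stream.next());
      if (current.size() >= batchSize) { // extra check cuts a batch here
        batches.add(current);            // a good spot for the extra logging
        current = new ArrayList<>();
      }
    }
    if (!current.isEmpty()) {
      batches.add(current);              // final partial batch
    }
    return batches;
  }

  public static void main(String[] args) {
    System.out.println(readBatches(List.of("a", "b", "c", "d", "e").iterator(), 2));
    // → [[a, b], [c, d], [e]]
  }
}
```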
```java
// skip header
reader.nextRecord();

int recordsToSkip = this.bulkRecordCount - this.prevBulkRecordCount;
```
If the batches are processed one after the other, is it necessary to skip, or is clearing the result set enough?
No, there are two reasons for this.
- Clearing and recreating records would be more expensive than skipping.
- The reset moves back to the beginning of the result set. A result set can contain many 2K-record batches, and all but the last have already been processed and handed out of the extractor, so we need to skip over at least every batch prior to the last one.
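A small sketch of that resume-by-skipping idea (names are illustrative, not the actual `SalesforceExtractor` fields): after a transient failure the result set is re-opened from the beginning, so the extractor fast-forwards past the records it already returned instead of clearing and rebuilding a record set.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model of resuming mid-result-set after a retry.
public class ResumeSkip {
  static List<String> resumeAfterRetry(List<String> reopenedResultSet, int alreadyDelivered) {
    Iterator<String> reader = reopenedResultSet.iterator(); // restarts at the beginning
    for (int i = 0; i < alreadyDelivered; i++) {
      reader.next(); // cheap skip: no parsing or copying into a new record set
    }
    List<String> remaining = new ArrayList<>();
    reader.forEachRemaining(remaining::add);
    return remaining;
  }

  public static void main(String[] args) {
    // alreadyDelivered plays the role of bulkRecordCount - prevBulkRecordCount
    System.out.println(resumeAfterRetry(List.of("r1", "r2", "r3", "r4", "r5"), 3));
    // → [r4, r5]
  }
}
```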
```java
if (!this.isBulkJobFinished()) {
  rs = getBulkData();
}
// Skip empty result sets since they will cause the extractor to terminate early
```
Is it better to do the skip loop in `getBulkData`?
`getBulkData` has a couple of places where it returns, so I thought that this was cleaner and less error prone.
```java
  if (!this.isBulkJobFinished()) {
    rs = getBulkData();
  }
} while (rs != null && rs.isEmpty() && !this.isBulkJobFinished());
```
```java
while ((rs == null || rs.isEmpty()) && !this.isBulkJobFinished()) {
  rs = getBulkData();
}
```
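To check that the suggested `while` form behaves like the original `do`/`while`, here is a toy model (with `getBulkData` and `isBulkJobFinished` as hypothetical stand-ins): keep fetching while the current result set is missing or empty and the bulk job still has data, so empty result sets are skipped rather than terminating the extractor early.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Toy model: two empty result sets precede a non-empty one.
public class SkipEmpty {
  static Deque<List<String>> pending = new ArrayDeque<>(List.of(
      List.of(), List.of(), List.of("rec1", "rec2")));

  static boolean isBulkJobFinished() { return pending.isEmpty(); }
  static List<String> getBulkData() { return pending.poll(); }

  static List<String> firstNonEmptyResultSet() {
    List<String> rs = null;
    while ((rs == null || rs.isEmpty()) && !isBulkJobFinished()) {
      rs = getBulkData(); // loops past the empty result sets
    }
    return rs;
  }

  public static void main(String[] args) {
    System.out.println(firstNonEmptyResultSet()); // → [rec1, rec2]
  }
}
```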
```java
}

/**
 * Fetch a batch of records
```
It's better to make it clearer that this batch isn't the chunk batch; the extractor itself is grouping the records.
LGTM
+1
Closes apache#2137 from htran1/salesforce_fetch_fixes
Dear Gobblin maintainers,
Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
JIRA
Description
Tests
Tested with a read pull of the Contact table.
Commits