-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: handle sync abort, reduce duplicate STATE
messages, rename _MAX_RECORD_LIMIT
as ABORT_AT_RECORD_COUNT
#1436
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
added: RequestedAbortException, MaxRecordsLimitException, AbortedSyncExceptionBase, AbortedSyncFailedException, AbortedSyncPausedException
aaronsteers
changed the title
PR Patch: handle sync abort
fix: handle sync abort, reduce duplicate Feb 21, 2023
STATE
messages, rename _MAX_RECORD_LIMIT
as ABORT_AT_RECORD_COUNT
Codecov Report
@@ Coverage Diff @@
## main #1436 +/- ##
==========================================
- Coverage 85.73% 85.62% -0.11%
==========================================
Files 57 57
Lines 4689 4723 +34
Branches 801 806 +5
==========================================
+ Hits 4020 4044 +24
- Misses 481 487 +6
- Partials 188 192 +4
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
kgpayne
changed the base branch from
kgpayne/apply-max-records-limit-during-testing
to
main
March 30, 2023 14:24
kgpayne
changed the base branch from
main
to
kgpayne/apply-max-records-limit-during-testing
March 30, 2023 14:31
for more information, see https://pre-commit.ci
…during-testing' into feat-handle-sync-abort
for more information, see https://pre-commit.ci
…feat-handle-sync-abort
kgpayne
changed the base branch from
kgpayne/apply-max-records-limit-during-testing
to
main
March 30, 2023 15:03
@aaronsteers this is epic 🙌 I have:
|
kgpayne
approved these changes
Mar 30, 2023
edgarrmondragon
approved these changes
Mar 30, 2023
1 task
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is proposed to target PR #1399, either to merge into that PR or replace #1399 and merge to
main
.Importantly, this adds handling for how "abort" exceptions should be handled. There are two scenarios: either the stream successfully pauses the sync operation or else it raises a failure exception if pausing is not possible.
Notes:
Stream._MAX_RECORD_LIMIT
is renamed toStream.ABORT_AT_RECORD_COUNT
. This rename properly documents the original and valid purpose for this limit - specifically this is a record count after which to trigger abort of the stream sync.sync_all()
behavior can callsync_all()
directly, as this change reverts the behavior so thatsync_all()
will once again sync all records. This is the only way to ensure tests will get valid and finalizedstate
messages. Otherwise, the aborted streams will have bookmarks left in an unfinalized, non-resumeable state.run_connection_test()
has been refactored to call a new and more genericrun_sync_dry_run()
, which can accept an arbitrary max record abort threshold. Bothrun_connection_test()
andrun_sync_dry_run()
suppress the abort exceptions regardless of resumeability. In contrast,sync_all()
will trigger the same exceptions ifABORT_AT_RECORD_COUNT
is set and breached, but it will not suppress the exceptions as the two test-focused methods would doAbortedSyncPausedException
specifies that we are mostly meeting Singer Spec, with the one exception that we left records on the source server.AbortSyncFailedException
will be raised.Other fixes:
SQLStream
implementation with a SQLlimit()
constraint. This has been updated to bemax limit + 1
in order for the SDK to know whether or not there are more records not synced, so we can properly inform the caller whether the sync operation was prematurely canceled or not. (IfABORT_AT_RECORD_COUNT=100
and we have exactly 100 records in the source, then the sync operation can wrap with success. But if there are >=101 records in the source, we will report that the operation was aborted prematurely before syncing all records.TapTestRunner.tap
andTargetTestRunner.target
were creating a new tap or target upon each call to the property, which violates the expected behavior of a property returning a pointer to the same object on each subsequent call. I've renamed these to be regular methods,new_tap()
andnew_target()
, since that reflects the behavior as implemented..tap
and.target
properties, but this seemed likely to leak state across multiple tests that otherwise would be atomic. Hence, I thinknew_tap()
andnew_target()
as methods probably best describe the desired behavior here.STATE
message after every firstRECORD
message. This PR fixes the modulo math, which breaks pre-recorded tests that expected aSTATE
message after every firstRECORD
message.STATE
message. This PR adds an internal private memberStream._needs_state_flush
which is used to track whether one or moreRECORD
message has been written since the lastSTATE
message being sent. The final duplicateSTATE
message would no longer to be sent after this PR.Clarifications:
_MAX_RECORD_LIMIT
was applied to certain subclasses ofStream
. Given that the base class implementation ofStream._sync_records()
was applying the limit (and still does here in this PR), this should fully coverRESTStream
,SQLStream
, and all other subclasses ofStream
.Future work: