-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[airbyte-cdk] Updates Low Code CDK ErrorHandlers and BackoffStrategies to align with Python CDK Interfaces #38743
[airbyte-cdk] Updates Low Code CDK ErrorHandlers and BackoffStrategies to align with Python CDK Interfaces #38743
Conversation
…w error handling flow
…and requester to align with python cdk error handler and backoff strategy. Removes `ResponseStatus`.
The latest updates on your projects. Learn more about Vercel for Git ↗︎ 1 Ignored Deployment
|
…Removes response status unit tests.
139d532
to
fec77c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added some comments around some nits in code style and we also already talked about adjusting the relevant docstrings so won't add comments to those. Overall though I like this incremental approach to releasing this. This definitely feels like the trickiest part and hopefully actually using the http client to make the send ends up being easier.
I think the main thing to talk about is the impact of these changes on existing connectors that either implement the BackoffStrategy
interface or override existing backoff strategy components. With some of the naming changes and parameters change, those will no longer work. I think it would be good to get a sense for the impact of how many are affected.
Have you tested these against some existing connectors? And in addition to documenting the path to migrating a component to the new interface, we should at least test that process against at least one connector that implements a custom backoff strategy/
...yte-cdk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/backoff_strategy.py
Outdated
Show resolved
Hide resolved
...cdk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/http_response_filter.py
Show resolved
Hide resolved
...cdk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/http_response_filter.py
Outdated
Show resolved
Hide resolved
...cdk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/http_response_filter.py
Outdated
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
Outdated
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
Outdated
Show resolved
Hide resolved
...dk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/default_error_handler.py
Show resolved
Hide resolved
…empt_count to HttpClient
…o retain current behavior
if not isinstance(matched_error_resolution, ErrorResolution): | ||
continue | ||
|
||
if matched_error_resolution.response_action == ResponseAction.SUCCESS: | ||
return matched_error_resolution | ||
|
||
if ( | ||
matched_error_resolution.response_action == ResponseAction.RETRY | ||
or matched_error_resolution.response_action == ResponseAction.IGNORE | ||
): | ||
return matched_error_resolution | ||
if matched_error_resolution: | ||
return matched_error_resolution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not isinstance(matched_error_resolution, ErrorResolution): | |
continue | |
if matched_error_resolution.response_action == ResponseAction.SUCCESS: | |
return matched_error_resolution | |
if ( | |
matched_error_resolution.response_action == ResponseAction.RETRY | |
or matched_error_resolution.response_action == ResponseAction.IGNORE | |
): | |
return matched_error_resolution | |
if matched_error_resolution: | |
return matched_error_resolution | |
if not isinstance(matched_error_resolution, ErrorResolution): | |
continue | |
else: | |
return matched_error_resolution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thought behind the current indentation levels/response hierarchies is:
- Iterate over each error handler to determine:
a. If response is SUCCESS, return success resolution
b. If response is RETRY or IGNORE, return resolution
c. If response is FAIL, iterate to next error handler to see if there is a "better" specified action for the given response - After iterating over all error handlers, if there is a matched error resolution, (it would presumably be a FAIL response action, otherwise it would have already been returned), then return the resolution
- Otherwise, return a default error resolution (RETRY w/ system_error)
The thought here is we don't want to prematurely fail if there is an error handler with a response filter that has a "better" outcome. IMO this makes a lot of assumptions about the connector dev's intentions, but while the CompositeErrorHandler still exists, I'm trying to maintain its behavior as much as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then should it be like this?
def interpret_response(self, response_or_exception: Optional[Union[requests.Response, Exception]]) -> ErrorResolution:
best_resolution = None
for error_handler in self.error_handlers:
error_resolution = error_handler.interpret_response(response_or_exception)
# Check if error_resolution is valid and prioritize based on action
if isinstance(error_resolution, ErrorResolution):
if error_resolution.response_action == ResponseAction.SUCCESS:
return error_resolution # Immediate return on success
elif error_resolution.response_action in (ResponseAction.RETRY, ResponseAction.IGNORE):
best_resolution = error_resolution # Store RETRY or IGNORE as potential best outcomes
elif error_resolution.response_action == ResponseAction.FAIL and not best_resolution:
best_resolution = error_resolution # Store FAIL only if no better alternative has been found
# Return the best resolution found, default to RETRY if none provided
return best_resolution if best_resolution else ErrorResolution(ResponseAction.RETRY)
Why do we not have the same implementation for DefaultErrorHandler?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few small comments to address, but nothing that needs to block approval. nice work!
A few follow up questions:
Were you able to trigger any live test runs using local CDK to verify the changes on real connector that you can include in a comment?
One question about the release, we had originally discussed either committing to a temp branch that the last PR would go into or alternatively releasing this w/o a publish. Which did you decide on?
And did you create follow up tickets for the other work that we didn't scope in. I know the http client work is tracked, but are we tracking the work to deprecate CompositeErrorHandler?
...cdk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/http_response_filter.py
Outdated
Show resolved
Hide resolved
...yte-cdk/python/airbyte_cdk/sources/declarative/requesters/error_handlers/backoff_strategy.py
Outdated
Show resolved
Hide resolved
if should_retry == response_status.IGNORE or should_retry.action == ResponseAction.RETRY: | ||
return should_retry | ||
return should_retry | ||
return max([error_handler.max_time or 0 for error_handler in self.error_handlers]) # type: ignore # property not defined in ErrorHandler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is interesting. What is the exact error that mypy raises when we don't have this ignore annotation. The ErrorHandler
interface does have a max_time
property so I'm surprised this gets raised. Not a blocker, but it is curious to know if we can avoid the annotation because the code itself looks fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
airbyte_cdk/sources/declarative/requesters/error_handlers/composite_error_handler.py:51: error: "ErrorHandler" has no attribute "max_time" [attr-defined]
This is because the new parent class is the Python CDK ErrorHandler which does not have the max_time/max_retries, etc.
...ative/requesters/error_handlers/backoff_strategies/wait_time_from_header_backoff_strategy.py
Outdated
Show resolved
Hide resolved
…eTracedException is raised
What
ErrorHandler
and implementations to align with Python CDK interfaceBackoffStrategy
and implementations to align with Python CDK interfaceResponseStatus
class in favor ofErrorResolution
dataclass, which encapsulates the response action, the failure type, and an associated error message.HttpResponseFilter
to act as the error mapping / matching mechanism for the low code CDK -> continues to map http status codes, predicates to declared action, but will default to theHttpStatusErrorHandler
mapping from the Python CDK, when no mapping is provided. Returns anErrorResolution
. Default failure type issystem_failure
unless otherwise mapped.DefaultHttpResponseFilter
to default to after iterating through the underlyingHttpResponseFilters
of a givenDefaultErrorHandler
.HttpRequester
to follow new request management flow and response action execution.AbstractSource
to emit control messages when an AirbyteTracedException is raised during a read operation.Why
Update is to align the low-code CDK interfaces with common Python CDK interfaces and to provide sensible error defaults for response actions and failure types for all low-code connectors.
Recommended Reading Order
ErrorHandler
classesBackoffStrategy
classesHttpResponseFilter
DefaultHttpResponseFilter
HttpRequester
(note:ErrorStatus
class has been removed)abstract_source.py
model_to_component_factory.py
Other Notes
CompositeErrorHandler
class, that will be in a subsequent PR.ErrorHandler
orBackoffStrategy
interfaces include themax_retries
andmax_time
properties by default, I have maintained inclusion of these within the low-code CDK implementations of theErrorHandler
interfaces.