Source Zendesk Support: handle 400 error for tickets stream #23246
/test connector=connectors/source-zendesk-support
Build Passed
Removing my review in favor of @erohmensing
@roman-yermilov-gl Am I missing something - I thought this was already added in version 2.22.2 here. Did that code get removed from master somehow? 🤔
try:
    yield from super(Tickets, self).read_records(sync_mode, cursor_field, stream_slice, stream_state)
except HTTPError as e:
    if e.response.status_code == 400 and e.response.json().get('error', '') == 'StartTimeTooRecent':
Assuming `super(Tickets, self).read_records(sync_mode, cursor_field, stream_slice, stream_state)` would yield `[1, 2, <raise HTTPError>, 4, 5]`, we would not get entries `4` and `5`. Is this a case that can happen here? If so, is this the expected behaviour?
Else, is this a case of ErrorHandler that would return `ResponseAction.IGNORE` when this specific condition is met?
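The concern in this comment can be reproduced with plain Python generators. The sketch below is only an illustration: `FakeHTTPError`, `parent_read_records`, and `read_records` are hypothetical stand-ins, not the connector's or the CDK's code. It shows that once the delegated generator raises, catching the exception in the wrapper does not resume it, so anything it would have produced afterwards is lost.

```python
from typing import Iterator, List


class FakeHTTPError(Exception):
    """Hypothetical stand-in for requests.exceptions.HTTPError."""


def parent_read_records() -> Iterator[int]:
    # Stand-in for the parent stream's read_records(): fails after two records.
    yield 1
    yield 2
    raise FakeHTTPError("StartTimeTooRecent")
    yield 4  # never reached: the generator is finished once it raises
    yield 5  # never reached


def read_records() -> Iterator[int]:
    # Same shape as the try/except in the diff under review.
    try:
        yield from parent_read_records()
    except FakeHTTPError:
        # Swallowing the error ends this generator; the parent is not resumed.
        return


records: List[int] = list(read_records())
print(records)
```

Under these assumptions only the records emitted before the failure reach the caller, which is the behaviour the question is asking about.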
Would `yield from []` instead of `return []` help here at all, or are we still out of the generator original `yield from` loop?
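As a rough check of this question, here is a small self-contained comparison. The function names are hypothetical and `ValueError` stands in for the HTTP error. In a generator, `return []` does not yield the list (it only becomes the value attached to `StopIteration`), and `yield from []` yields nothing, so both variants stop at the same point.

```python
from typing import Iterator


def failing_source() -> Iterator[int]:
    # Hypothetical source: one record, then a failure.
    yield 1
    raise ValueError("boom")


def with_return() -> Iterator[int]:
    try:
        yield from failing_source()
    except ValueError:
        return []  # stored on StopIteration.value; nothing is yielded


def with_yield_from_empty() -> Iterator[int]:
    try:
        yield from failing_source()
    except ValueError:
        yield from []  # yields nothing, then the generator ends


result_return = list(with_return())
result_yield = list(with_yield_from_empty())
print(result_return, result_yield)
```

In this sketch neither form re-enters the failed `yield from`, so the choice between them does not recover later records.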
When I last looked at this, I definitely assumed that the third record would return [] and that 4 and 5 would still be returned, so thanks for raising this.
@maxi297 @erohmensing
Why should this scenario even happen? As I see from the code, `.read_records()` is called per slice, not per page. That means that if a slice has wrong dates, it fails entirely on the first HTTP query with those dates and no pagination happens. Could you please describe how the error can occur in the middle of per-page iteration, or am I missing something?
I'm just wondering what the behaviour should be there. It feels wrong to me to just return [] for a slice, since we don't know which records would have been provided for this slice. Will the AbstractSource consider this slice as "done" and update the stream state? If so, we will miss records once the startTime is no longer too recent. For me, this HTTPError is an error and we should handle it as such.
This case should also never happen. Normally we should not have slices where start_date > now. But when the state is abnormal, we must return an empty slice (at least the acceptance tests expect us to do so).
When I run the tests, the stream fails with this error:
ERROR root:connector_runner.py:172 Docker container failed, code 1, error:
{"type": "TRACE", "trace": {"type": "ERROR", ..., "error": {"message": "StartTimeTooRecent", ...}
Results (431.74s):
43 passed
1 failed
- test_incremental.py:262 TestIncremental.test_state_with_abnormally_large_values[inputs0]
3 skipped
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 120, in read
    yield from self._read_stream(
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 189, in _read_stream
    for record in record_iterator:
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/abstract_source.py", line 256, in _read_incremental
    for message_counter, record_data_or_message in enumerate(records, start=1):
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 419, in read_records
    yield from self._read_pages(
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 435, in _read_pages
    request, response = self._fetch_next_page(stream_slice, stream_state, next_page_token)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 458, in _fetch_next_page
    response = self._send_request(request, request_kwargs)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 360, in _send_request
    return backoff_handler(user_backoff_handler)(request, request_kwargs)
  File "/usr/local/lib/python3.9/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/backoff/_sync.py", line 105, in retry
    ret = target(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 327, in _send
    raise exc
  File "/usr/local/lib/python3.9/site-packages/airbyte_cdk/sources/streams/http/http.py", line 324, in _send
    response.raise_for_status()
  File "/usr/local/lib/python3.9/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://d3v-airbyte.zendesk.com/api/v2/incremental/tickets.json?start_time=1677178666
So the only reason I made the change is to fix the error above.
@maxi297
I made the fix in a different way: I increased the recent-time gap for the Tickets stream.
@erohmensing It looks like Charles reverted it - #22483 (comment)
Yes, it was rolled back because of the code freeze.
1ad5d2c to 563de78
/test connector=connectors/source-zendesk-support
Build Passed
LGTM! Just a non-blocking comment
@@ -544,6 +544,10 @@ class Tickets(SourceZendeskIncrementalExportStream):
    response_list_name: str = "tickets"
    transformer: TypeTransformer = TypeTransformer(TransformConfig.DefaultSchemaNormalization)

    @staticmethod
    def check_start_time_param(requested_start_time: int, value: int = 1):
        return SourceZendeskIncrementalExportStream.check_start_time_param(requested_start_time, value=3)
Can we add a description of why the value is a 3-minute lookback instead of 1? This will help the devs maintain this class in case a change is made to check_start_time_param.
Sure. Added description
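For readers following the thread, the idea behind the lookback can be sketched as below. This is not the connector's implementation: `clamp_start_time`, its `lookback_minutes` parameter, and the injectable `now` argument are hypothetical names, and the assumption (taken from the review comment about a "3 minutes lookback instead of 1") is that the gap is expressed in minutes and that a start time newer than "now minus the gap" gets pulled back to that limit.

```python
import time
from typing import Optional


def clamp_start_time(
    requested_start_time: int,
    lookback_minutes: int = 1,
    now: Optional[int] = None,
) -> int:
    """Return a UNIX start time no more recent than now minus the lookback.

    Hypothetical helper for illustration only; `now` is injectable so the
    behaviour can be checked deterministically.
    """
    current = int(time.time()) if now is None else now
    latest_allowed = current - lookback_minutes * 60
    # Keep older start times untouched; pull too-recent ones back to the limit.
    return min(requested_start_time, latest_allowed)


fixed_now = 1_000_000
too_recent = clamp_start_time(fixed_now, lookback_minutes=3, now=fixed_now)
old_enough = clamp_start_time(500, lookback_minutes=3, now=fixed_now)
print(too_recent, old_enough)
```

A wider gap trades a slightly older starting point for fewer requests rejected as too recent, which appears to be the intent of raising the value for the Tickets stream.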
/publish connector=connectors/source-zendesk-support
If you have connectors that successfully published but failed definition generation, follow step 4 here
…q#23246)
* Source Zendesk Support: increase recent start time for ticket stream
* Source Zendesk Support: added more comments
* auto-bump connector version
---------
Co-authored-by: Octavia Squidington III <octavia-squidington-iii@users.noreply.github.com>
What
Handle 400 error for Tickets stream