-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source Instatus: built in yaml #21008
Conversation
/test connector=connectors/source-instatus
Build FailedTest summary info:
|
/test connector=connectors/source-instatus |
/test connector=connectors/source-instatus
Build PassedTest summary info:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes are required. Please check the comments bellow.
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please add new line
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please add new line
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
} | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
type: string | ||
title: Rest API Key | ||
airbyte_secret: true | ||
description: INSTATUS REST API key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why upper-case letters are used?
401_ignore_requester: | ||
$ref: "*ref(definitions.requester)" | ||
error_handler: | ||
type: CompositeErrorHandler | ||
error_handlers: | ||
- type: DefaultErrorHandler | ||
response_filters: | ||
# 401 This user is not an owner of the status page | ||
- http_codes: [ 401 ] | ||
action: IGNORE | ||
- response_filters: | ||
- http_codes: [ 429 ] | ||
action: RETRY | ||
backoff_strategies: | ||
- type: "ConstantBackoffStrategy" | ||
backoff_time_in_seconds: 310 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to handle 422 and 403 or are these the exhausted list of exceptions for this connector?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation doesn't have detailed information about the errors, so all I could do was handle the errors I encountered. 401 replaces 403 and 422 errors.
action: RETRY | ||
backoff_strategies: | ||
- type: "ConstantBackoffStrategy" | ||
backoff_time_in_seconds: 310 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does API return the RetryAfter
header for 429?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thank you, I replaced ConstantBackoffStrategy
with WaitTimeFromHeader
.
- type: "ConstantBackoffStrategy" | ||
backoff_time_in_seconds: 310 | ||
|
||
401_ignore_requester: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use a user-friendly name for this one? something like: api_error_handler_requester
?
@@ -0,0 +1,114 @@ | |||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module should be covered with unit/integration tests (up to you), as it overrides/extends the default components.
/test connector=connectors/source-instatus
Build PassedTest summary info:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, once custom components are covered with unit/integration tests. Nice!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good but I added a couple of suggestions on how we might streamline some of the components which had some repitition. Along with some formatting nits.
Also echoing @bazarnov to please add unit tests for the custom components and explain in the PR description why they were needed over the existing ones in the language. This feedback helps us figure out how we can enhance the language. Once we have those I will approve
from setuptools import find_packages, setup | ||
|
||
MAIN_REQUIREMENTS = [ | ||
"airbyte-cdk", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case to protect from potential breaking major changes in the future, can you pin this to "airbyte-cdk~=0.1"
so we will only get the latest minor and patch versions like it is in the setup.py template from the generator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
# 401 This user is not an owner of the status page | ||
- http_codes: [ 401 ] | ||
action: IGNORE | ||
- response_filters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, can you also include type: DefaultErrorHandler
on this second element of the error_handlers
array?
- type: DefaultErrorHandler
response_filters:
- http_codes: [ 429 ]
action: RETRY
backoff_strategies:
- type: WaitTimeFromHeader
header: "Retry-After"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
# 401 This user is not an owner of the status page | ||
- http_codes: [ 401 ] | ||
action: IGNORE | ||
- response_filters: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
type: BearerAuthenticator | ||
api_token: "{{ config['api_key'] }}" | ||
|
||
public_data_requester: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the public_data_requester
is duplicating almost everything from the requester
/api_error_handler_requester
. The main thing that appears to change is that the url_base is set to https://{{ stream_slice.url }}.instatus.com/summary.json
instead. Can we instead refactor this a bit so that everything can be defined on requester
and reduce the duplicated fields..
The public_data_requester
can just be written to override the one url_base
instead of the whole thing:
public_data_requester:
$ref: "*ref(definitions.requester)"
url_base: "https://{{ stream_slice.url }}.instatus.com/summary.json"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
- path: [ "updates_ids" ] | ||
value: "{{ record['updates'] }}" | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit extra space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
stream_slicer: | ||
$ref: "*ref(definitions.page_url_stream_slicer)" | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit extra space
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
$ref: "*ref(definitions.retriever)" | ||
requester: | ||
$ref: "*ref(definitions.public_data_requester)" | ||
path: "https://{{ stream_slice.url }}.instatus.com/summary.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To stick with convention we should separate the base_url
and the path
:
url_base: "https://{{ stream_slice.url }}.instatus.com/"
path: "summary.json"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I returned the previous version because it doesn't have stream_slice
at the url_base step, so I received an error while SAT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah i see. sorry for the mistake
@dataclass | ||
class ListAddFields(AddFields): | ||
""" | ||
ListAddFields uses to transform record by adding an ids from list object field to one list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For both of these custom components can you please include a quick summary in the PR description and in these comments why the existing AddFields
component and the UpdatesSubstreamSlicer
were not suitable for reuse by the Instatus API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added two custom components ListAddFields and UpdatesSubstreamSlicer.
This was done for the maintenance_updates_stream and incident_updates_stream to work correctly.
The work algorithm is as follows, in the parent streams maintenance and incident is a list of all update objects. The ListAddFields component selects their IDs from these objects and adds them to the new field "updates_ids", which is a list.
Next, UpdatesSubstreamSlicer creates a stream of slices, additionally iterating over the list of updates_ids and adding them to the slice with the correct parent id.
That is, ListAddFields solves the problem that we do not have a separate list of all updates_ids at the beginning and cannot extract them from the list of update objects. And UpdatesSubstreamSlicer has an additional iteration over the list of updates and creates a correct stream slice.
/test connector=connectors/source-instatus
Build FailedTest summary info:
|
Added unit tests and explanations about components. |
/test connector=connectors/source-instatus |
/test connector=connectors/source-instatus
Build FailedTest summary info:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding comments for the custom components and for updating the connector manifest. One more suggestion to reuse the schema_loader
component, and once the TestBasicRead
and TestFullRefresh
are passing then this is ready to be released.
stream_slice_field: "page_id" | ||
|
||
page_components_stream: | ||
schema_loader: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of defining this in every stream, if you were to reference the schema loader in the base stream, you should be able to avoid defining it in every stream.
base_stream:
retriever:
$ref: "*ref(definitions.retriever)"
schema_loader:
$ref: "*ref(definitions.schema_loader)"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, done
/test connector=connectors/source-instatus
Build PassedTest summary info:
|
/publish connector=connectors/source-instatus
if you have connectors that successfully published but failed definition generation, follow step 4 here |
/test connector=connectors/source-instatus
Build PassedTest summary info:
|
Added source-instatus
Added two custom components ListAddFields and UpdatesSubstreamSlicer.
This was done for the maintenance_updates_stream and incident_updates_stream to work correctly.
The work algorithm is as follows, in the parent streams maintenance and incident is a list of all update objects. The ListAddFields component selects their IDs from these objects and adds them to the new list field "updates_ids".
Next, UpdatesSubstreamSlicer creates a stream of slices, iterating over the list of updates_ids and adding them to the slice with the correct parent id.
That is, ListAddFields solves the problem of not having a separate list of all updates_ids at the beginning and cannot extract them from the list of update objects. And UpdatesSubstreamSlicer has an additional iteration over the list of updates and creates a correct stream slice.