New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream #28657
Conversation
Sources to test with:
|
/legacy-test connector=connectors/source-chartmogul local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-close-com local_cdk=1
Build FailedTest summary info:
|
/legacy-test connector=connectors/source-gainsight-px local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-google-pagespeed-insights local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-greenhouse local_cdk=1
Build FailedTest summary info:
|
/legacy-test connector=connectors/source-twilio local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-square local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-zenloop local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-pocket local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-monday local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-tempo local_cdk=1
Build PassedTest summary info:
|
close.com fails with the same error on master with local cdk |
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/requester.py
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/sources/declarative/declarative_stream.py
Outdated
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/connector_builder/connector_builder_handler.py
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/sources/declarative/declarative_stream.py
Outdated
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py
Outdated
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py
Show resolved
Hide resolved
paginator_method, stream_state=stream_state, stream_slice=stream_slice, next_page_token=next_page_token | ||
) | ||
stream_slicer_mapping, stream_slicer_keys = self._get_mapping( | ||
stream_slicer_method, stream_state=stream_state, stream_slice=stream_slice, next_page_token=next_page_token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think stream_state
is guaranteed to be equivalent to the self.state
we used to pass.
If the cursor is a PerPartitionCursor
, self.state is the full state while stream_state will be scoped to a single partition.
This might be fine in practice since I wouldn't expect many connectors to reference the stream state directly, but I think this would be a breaking change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a valid point, however I'm a little confused why we even need
if self.cursor and hasattr(self.cursor, "select_state"): # type: ignore
slice_state = self.cursor.select_state(stream_slice) # type: ignore
elif self.cursor:
slice_state = self.cursor.get_stream_state()
else:
slice_state = {}
then in the first place on master - from what I can tell all the things self._read_pages(self.parse_records, stream_slice, slice_state)
is calling are within the simple retriever and will ignore slice_state
eventually and use self.state
:
- _read_pages calls _fetch_next_page and the record generator function which is parse_records
- _fetch_next_page calls request_headers, path, request_params, request_body_json, request_body_data, but all of them are implemented on the simple retriever to ignore the passed in stream_state and use self.state
- parse_records calls parse_response
- parse_response calls select_records, but it's using self.state instead of the passed in state
In sum, the calculated slice_state is never accessed from what I can tell. Can we remove this whole thing? Or did I miss something here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I honestly don’t have a strong opinion on this… For me, the question is: what is the state interpolation interface we want to offer to the user? My guess is that we don’t want to user to have to find the state in an array of states when the state is per partition so I wouldn’t pass self.state
and keep the hack. However in practice, I don’t think we have a stream with state per partition that use state interpolation.
Do we still intend to remove the state from the interpolation altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @maxi297 . As it seems like it's actually not used right now and the current behavior is to pass down the full state, let's just keep that for now and split this decision out.
I removed the slice state calculation for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still intend to remove the state from the interpolation altogether?
Yes
airbyte-cdk/python/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py
Outdated
Show resolved
Hide resolved
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/http_requester.py
Outdated
Show resolved
Hide resolved
…simple-retriever-2
…simple-retriever-2
…rbytehq/airbyte into flash1293/decouple-simple-retriever-2
source-mssql test report (commit
|
Step | Result |
---|---|
Validate airbyte-integrations/connectors/source-mssql/metadata.yaml | ✅ |
Connector version semver check | ✅ |
QA checks | ✅ |
Build connector tar | ✅ |
Build source-mssql docker image for platform linux/x86_64 | ✅ |
./gradlew :airbyte-integrations:connectors:source-mssql:integrationTest | ❌ |
Acceptance tests | ✅ |
Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command
airbyte-ci connectors --name=source-mssql test
I noticed a small issue - as we used to log requests on the retriever / token provider level, some responses are not logged because the requester is just throwing an exception instead of returning the response. I fixed this by moving the response logging logic into the requester where it belongs - this means we now capture all requests and responses, even if they end up erroring out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few nit comments. Amazing work @flash1293 !
airbyte-cdk/python/airbyte_cdk/connector_builder/connector_builder_handler.py
Outdated
Show resolved
Hide resolved
@@ -472,6 +466,14 @@ def _send(self, request: requests.PreparedRequest) -> requests.Response: | |||
) | |||
response: requests.Response = self._session.send(request) | |||
self.logger.debug("Receiving response", extra={"headers": response.headers, "status": response.status_code, "body": response.text}) | |||
if log_request: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be reasonable to remove this boolean flag and always log the message if log_formatter
is not not None
? That would make the long list of parameters a little shorter
@@ -27,7 +27,7 @@ def get_request_params( | |||
stream_state: Optional[StreamState] = None, | |||
stream_slice: Optional[StreamSlice] = None, | |||
next_page_token: Optional[Mapping[str, Any]] = None, | |||
) -> Mapping[str, Any]: | |||
) -> MutableMapping[str, Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should be safe to change the RequestOptionsProvider's return type to Mapping
, but fine to do this separately
@@ -387,37 +276,59 @@ def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, | |||
""" | |||
return self._paginator.next_page_token(response, self._records_from_last_response) | |||
|
|||
def _fetch_next_page( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for adding a comment here
airbyte-cdk/python/unit_tests/connector_builder/test_connector_builder_handler.py
Outdated
Show resolved
Hide resolved
@@ -704,10 +706,10 @@ def _create_response(body, request): | |||
|
|||
def _create_page(response_body): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think this method should be renamed since it returns a Response, not a pair of request,response
/legacy-test connector=connectors/source-chartmogul local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-gainsight-px local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-google-pagespeed-insights local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-twilio local_cdk=1
Build FailedTest summary info:
|
/legacy-test connector=connectors/source-square local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-zenloop local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-pocket local_cdk=1
Build PassedTest summary info:
|
/legacy-test connector=connectors/source-tempo local_cdk=1
Build PassedTest summary info:
|
* fix tests * format * review comments * Automated Commit - Formatting Changes * review comments * review comments * review comments * log all messages * log all message * review comments * review comments * Automated Commit - Formatting Changes * add comment --------- Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
* Add everything for BQ but migrate, refactor interface after practical work * Make new default methods, refactor to single implemented method * MigrationInterface and BQ impl created * Trying to integrate with standard inserts * remove unnecessary NameAndNamespacePair class * Shimmed in * Java Docs * Initial Testing Setup * Tests! * Move Migrator into TyperDeduper * Functional Migration * Add Integration Test * Pr updates * bump version * bump version * version bump * Update to airbyte-ci-internal (#29026) * 🐛 Source Github, Instagram, Zendesk-support, Zendesk-talk: fix CAT tests fail on `spec` (#28910) * connectors-ci: better modified connectors detection logic (#28855) * connectors-ci: report path should always start with `airbyte-ci/` (#29030) * make report path always start with airbyte-ci * revert report path in orchestrator * add more test cases * bump version * Updated docs (#29019) * CDK: Embedded reader utils (#28873) * relax pydantic dep * Automated Commit - Format and Process Resources Changes * wip * wrap up base integration * add init file * introduce CDK runner and improve error message * make state param optional * update protocol models * review comments * always run incremental if possible * fix --------- Co-authored-by: flash1293 <flash1293@users.noreply.github.com> * 🤖 Bump minor version of Airbyte CDK * 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (#28657) * fix tests * format * review comments * Automated Commit - Formatting Changes * review comments * review comments * review comments * log all messages * log all message * review comments * review comments * Automated Commit - Formatting Changes * add comment --------- Co-authored-by: flash1293 <flash1293@users.noreply.github.com> * 🤖 Bump minor version of Airbyte CDK * 🐛 Source Github, Instagram, Zendesk Support / Talk - revert `spec` changes and improve (#29031) * Source oauth0: new streams and fix incremental (#29001) * Add new streams Organizations,OrganizationMembers,OrganizationMemberRoles * relax schema definition to allow additional fields * Bump image tag version * revert some changes to the old schemas * Format python so gradle can pass * update incremental * remove unused print * fix unit test --------- Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com> * 🐛 Source Mongo: Fix failing acceptance tests (#28816) * Fix failing acceptance tests * Fix failing strict acceptance tests * Source-Greenhouse: Fix unit tests for new CDK version (#28969) Fix unit tests * Add CSV options to the CSV parser (#28491) * remove invalid legacy option * remove unused option * the tests pass but this is quite messy * very slight clean up * Add skip options to csv format * fix some of the typing issues * fixme comment * remove extra log message * fix typing issues * skip before header * skip after header * format * add another test * Automated Commit - Formatting Changes * auto generate column names * delete dead code * update title and description * true and false values * Update the tests * Add comment * missing test * rename * update expected spec * move to method * Update comment * fix typo * remove unused import * Add a comment * None records do not pass the WaitForDiscoverPolicy * format * remove second branch to ensure we always go through the same processing * Raise an exception if the record is None * reset * Update tests * handle unquoted newlines * Automated Commit - Formatting Changes * Update test case so the quoting is explicit * Update comment * Automated Commit - Formatting Changes * Fail validation if skipping rows before header and header is autogenerated * always fail if a record cannot be parsed * format * set write line_no in error message * remove none check * Automated Commit - Formatting Changes * enable autogenerate test * remove duplicate test * missing unit tests * Update * remove branching * remove unused none check * Update tests * remove branching * format * extract to function * comment * missing type * type annotation * use set * Document that the strings are case-sensitive * public -> private * add unit test * newline --------- Co-authored-by: girarda <girarda@users.noreply.github.com> * Dagster: Add sentry logging (#28822) * Add sentry * add sentry decorator * Add traces * Use sentry trace * Improve duplicate logging * Add comments * DNC * Fix up issues * Move to scopes * Remove breadcrumb * Update lock * ✨Source Shortio: Migrate Python CDK to Low-code CDK (#28950) * Migrate Shortio to Low-Code * Update abnormal state * Format * Update Docs * Fix metadata.yaml * Add pagination * Add incremental sync * add incremental parameters * update metadata * rollback update version * release date --------- Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> * Update to new verbiage (#29051) * [skip ci] Metadata: Remove leading underscore (#29024) * DNC * Add test models * Add model test * Remove underscore from metadata files * Regenerate models * Add test to check for key transformation * Allow additional fields on metadata * Delete transform * Proof of concept parallel source stream reading implementation for MySQL (#26580) * Proof of concept parallel source stream reading implementation for MySQL * Automated Change * Add read method that supports concurrent execution to Source interface * Remove parallel iterator * Ensure that executor service is stopped * Automated Commit - Format and Process Resources Changes * Expose method to fix compilation issue * Use concurrent map to avoid access issues * Automated Commit - Format and Process Resources Changes * Ensure concurrent streams finish before closing source * Fix compile issue * Formatting * Exclude concurrent stream threads from orphan thread watcher * Automated Commit - Format and Process Resources Changes * Refactor orphaned thread logic to account for concurrent execution * PR feedback * Implement readStreams in wrapper source * Automated Commit - Format and Process Resources Changes * Add readStream override * Automated Commit - Format and Process Resources Changes * 🤖 Auto format source-mysql code [skip ci] * 🤖 Auto format source-mysql code [skip ci] * 🤖 Auto format source-mysql code [skip ci] * 🤖 Auto format source-mysql code [skip ci] * 🤖 Auto format source-mysql code [skip ci] * Debug logging * Reduce logging level * Replace synchronized calls to System.out.println when concurrent * Close consumer * Flush before close * Automated Commit - Format and Process Resources Changes * Remove charset * Use ASCII and flush periodically for parallel streams * Test performance harness patch * Automated Commit - Format and Process Resources Changes * Cleanup * Logging to identify concurrent read enabled * Mark parameter as final --------- Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com> Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com> Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com> Co-authored-by: rodireich <rodireich@users.noreply.github.com> * connectors-ci: disable dependency scanning (#29033) * updates (#29059) * Metadata: skip breaking change validation on prerelease (#29017) * skip breaking change validation * Move ValidatorOpts higher in call * Add prerelease test * Fix test * ✨ Source MongoDB Internal POC: Generate Test Data (#29049) * Add script to generate test data * Fix prose * Update credentials example * PR feedback * Bump Airbyte version from 0.50.12 to 0.50.13 * Bump versions for mssql strict-encrypt (#28964) * Bump versions for mssql strict-encrypt * Fix failing test * Fix failing test * 🎨 Improve replication method selection UX (#28882) * update replication method in MySQL source * bump version * update expected specs * update registries * bump strict encrypt version * make password always_show * change url * update registries * 🐛 Avoid writing records to log (#29047) * Avoid writing records to log * Update version * Rollout ctid cdc (#28708) * source-postgres: enable ctid+cdc implementation * 100% ctid rollout for cdc * remove CtidFeatureFlags * fix CdcPostgresSourceAcceptanceTest * Bump versions and release notes * Fix compilation error due to previous merge --------- Co-authored-by: subodh <subodh1810@gmail.com> * connectors-ci: fix `unhashable type 'set'` (#29064) * Add Slack Alert lifecycle to Dagster for Metadata publish (#28759) * DNC * Add slack lifecycle logging * Update to use slack * Update slack to use resource and bot * Improve markdown * Improve log * Add sensor logging * Extend sensor time * merge conflict * PR Refactoring * Make the tests work * remove unnecessary classes, pr feedback * more merging * Update airbyte-integrations/bases/base-typing-deduping-test/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/BaseSqlGeneratorIntegrationTest.java Co-authored-by: Edward Gao <edward.gao@airbyte.io> * snowflake updates --------- Co-authored-by: Ben Church <ben@airbyte.io> Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com> Co-authored-by: Augustin <augustin@airbyte.io> Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com> Co-authored-by: Joe Reuter <joe@airbyte.io> Co-authored-by: flash1293 <flash1293@users.noreply.github.com> Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com> Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com> Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io> Co-authored-by: Alexandre Girard <alexandre@airbyte.io> Co-authored-by: girarda <girarda@users.noreply.github.com> Co-authored-by: btkcodedev <btk.codedev@gmail.com> Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com> Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com> Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com> Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com> Co-authored-by: rodireich <rodireich@users.noreply.github.com> Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com> Co-authored-by: terencecho <terencecho@users.noreply.github.com> Co-authored-by: Lake Mossman <lake@airbyte.io> Co-authored-by: Benoit Moriceau <benoit@airbyte.io> Co-authored-by: subodh <subodh1810@gmail.com> Co-authored-by: Edward Gao <edward.gao@airbyte.io>
* Add everything for BQ but migrate, refactor interface after practical work * Make new default methods, refactor to single implemented method * MigrationInterface and BQ impl created * Trying to integrate with standard inserts * remove unnecessary NameAndNamespacePair class * Shimmed in * Java Docs * Initial Testing Setup * Tests! * Move Migrator into TyperDeduper * Functional Migration * Add Integration Test * Pr updates * bump version * bump version * version bump * Update to airbyte-ci-internal (airbytehq#29026) * 🐛 Source Github, Instagram, Zendesk-support, Zendesk-talk: fix CAT tests fail on `spec` (airbytehq#28910) * connectors-ci: better modified connectors detection logic (airbytehq#28855) * connectors-ci: report path should always start with `airbyte-ci/` (airbytehq#29030) * make report path always start with airbyte-ci * revert report path in orchestrator * add more test cases * bump version * Updated docs (airbytehq#29019) * CDK: Embedded reader utils (airbytehq#28873) * relax pydantic dep * Automated Commit - Format and Process Resources Changes * wip * wrap up base integration * add init file * introduce CDK runner and improve error message * make state param optional * update protocol models * review comments * always run incremental if possible * fix --------- Co-authored-by: flash1293 <flash1293@users.noreply.github.com> * 🤖 Bump minor version of Airbyte CDK * 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (airbytehq#28657) * fix tests * format * review comments * Automated Commit - Formatting Changes * review comments * review comments * review comments * log all messages * log all message * review comments * review comments * Automated Commit - Formatting Changes * add comment --------- Co-authored-by: flash1293 <flash1293@users.noreply.github.com> * 🤖 Bump minor version of Airbyte CDK * 🐛 Source Github, Instagram, Zendesk Support / Talk - revert `spec` changes and improve (airbytehq#29031) * Source oauth0: new streams and fix incremental (airbytehq#29001) * Add new streams Organizations,OrganizationMembers,OrganizationMemberRoles * relax schema definition to allow additional fields * Bump image tag version * revert some changes to the old schemas * Format python so gradle can pass * update incremental * remove unused print * fix unit test --------- Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com> * 🐛 Source Mongo: Fix failing acceptance tests (airbytehq#28816) * Fix failing acceptance tests * Fix failing strict acceptance tests * Source-Greenhouse: Fix unit tests for new CDK version (airbytehq#28969) Fix unit tests * Add CSV options to the CSV parser (airbytehq#28491) * remove invalid legacy option * remove unused option * the tests pass but this is quite messy * very slight clean up * Add skip options to csv format * fix some of the typing issues * fixme comment * remove extra log message * fix typing issues * skip before header * skip after header * format * add another test * Automated Commit - Formatting Changes * auto generate column names * delete dead code * update title and description * true and false values * Update the tests * Add comment * missing test * rename * update expected spec * move to method * Update comment * fix typo * remove unused import * Add a comment * None records do not pass the WaitForDiscoverPolicy * format * remove second branch to ensure we always go through the same processing * Raise an exception if the record is None * reset * Update tests * handle unquoted newlines * Automated Commit - Formatting Changes * Update test case so the quoting is explicit * Update comment * Automated Commit - Formatting Changes * Fail validation if skipping rows before header and header is autogenerated * always fail if a record cannot be parsed * format * set write line_no in error message * remove none check * Automated Commit - Formatting Changes * enable autogenerate test * remove duplicate test * missing unit tests * Update * remove branching * remove unused none check * Update tests * remove branching * format * extract to function * comment * missing type * type annotation * use set * Document that the strings are case-sensitive * public -> private * add unit test * newline --------- Co-authored-by: girarda <girarda@users.noreply.github.com> * Dagster: Add sentry logging (airbytehq#28822) * Add sentry * add sentry decorator * Add traces * Use sentry trace * Improve duplicate logging * Add comments * DNC * Fix up issues * Move to scopes * Remove breadcrumb * Update lock * ✨Source Shortio: Migrate Python CDK to Low-code CDK (airbytehq#28950) * Migrate Shortio to Low-Code * Update abnormal state * Format * Update Docs * Fix metadata.yaml * Add pagination * Add incremental sync * add incremental parameters * update metadata * rollback update version * release date --------- Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> * Update to new verbiage (airbytehq#29051) * [skip ci] Metadata: Remove leading underscore (airbytehq#29024) * DNC * Add test models * Add model test * Remove underscore from metadata files * Regenerate models * Add test to check for key transformation * Allow additional fields on metadata * Delete transform * Proof of concept parallel source stream reading implementation for MySQL (airbytehq#26580) * Proof of concept parallel source stream reading implementation for MySQL * Automated Change * Add read method that supports concurrent execution to Source interface * Remove parallel iterator * Ensure that executor service is stopped * Automated Commit - Format and Process Resources Changes * Expose method to fix compilation issue * Use concurrent map to avoid access issues * Automated Commit - Format and Process Resources Changes * Ensure concurrent streams finish before closing source * Fix compile issue * Formatting * Exclude concurrent stream threads from orphan thread watcher * Automated Commit - Format and Process Resources Changes * Refactor orphaned thread logic to account for concurrent execution * PR feedback * Implement readStreams in wrapper source * Automated Commit - Format and Process Resources Changes * Add readStream override * Automated Commit - Format and Process Resources Changes * 🤖 Auto format source-mysql code [skip ci] * 🤖 Auto format source-mysql code [skip ci] * 🤖 Auto format source-mysql code [skip ci] * 🤖 Auto format source-mysql code [skip ci] * 🤖 Auto format source-mysql code [skip ci] * Debug logging * Reduce logging level * Replace synchronized calls to System.out.println when concurrent * Close consumer * Flush before close * Automated Commit - Format and Process Resources Changes * Remove charset * Use ASCII and flush periodically for parallel streams * Test performance harness patch * Automated Commit - Format and Process Resources Changes * Cleanup * Logging to identify concurrent read enabled * Mark parameter as final --------- Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com> Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com> Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com> Co-authored-by: rodireich <rodireich@users.noreply.github.com> * connectors-ci: disable dependency scanning (airbytehq#29033) * updates (airbytehq#29059) * Metadata: skip breaking change validation on prerelease (airbytehq#29017) * skip breaking change validation * Move ValidatorOpts higher in call * Add prerelease test * Fix test * ✨ Source MongoDB Internal POC: Generate Test Data (airbytehq#29049) * Add script to generate test data * Fix prose * Update credentials example * PR feedback * Bump Airbyte version from 0.50.12 to 0.50.13 * Bump versions for mssql strict-encrypt (airbytehq#28964) * Bump versions for mssql strict-encrypt * Fix failing test * Fix failing test * 🎨 Improve replication method selection UX (airbytehq#28882) * update replication method in MySQL source * bump version * update expected specs * update registries * bump strict encrypt version * make password always_show * change url * update registries * 🐛 Avoid writing records to log (airbytehq#29047) * Avoid writing records to log * Update version * Rollout ctid cdc (airbytehq#28708) * source-postgres: enable ctid+cdc implementation * 100% ctid rollout for cdc * remove CtidFeatureFlags * fix CdcPostgresSourceAcceptanceTest * Bump versions and release notes * Fix compilation error due to previous merge --------- Co-authored-by: subodh <subodh1810@gmail.com> * connectors-ci: fix `unhashable type 'set'` (airbytehq#29064) * Add Slack Alert lifecycle to Dagster for Metadata publish (airbytehq#28759) * DNC * Add slack lifecycle logging * Update to use slack * Update slack to use resource and bot * Improve markdown * Improve log * Add sensor logging * Extend sensor time * merge conflict * PR Refactoring * Make the tests work * remove unnecessary classes, pr feedback * more merging * Update airbyte-integrations/bases/base-typing-deduping-test/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/BaseSqlGeneratorIntegrationTest.java Co-authored-by: Edward Gao <edward.gao@airbyte.io> * snowflake updates --------- Co-authored-by: Ben Church <ben@airbyte.io> Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com> Co-authored-by: Augustin <augustin@airbyte.io> Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com> Co-authored-by: Joe Reuter <joe@airbyte.io> Co-authored-by: flash1293 <flash1293@users.noreply.github.com> Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com> Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com> Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io> Co-authored-by: Alexandre Girard <alexandre@airbyte.io> Co-authored-by: girarda <girarda@users.noreply.github.com> Co-authored-by: btkcodedev <btk.codedev@gmail.com> Co-authored-by: marcosmarxm <marcosmarxm@gmail.com> Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com> Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com> Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com> Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com> Co-authored-by: rodireich <rodireich@users.noreply.github.com> Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com> Co-authored-by: terencecho <terencecho@users.noreply.github.com> Co-authored-by: Lake Mossman <lake@airbyte.io> Co-authored-by: Benoit Moriceau <benoit@airbyte.io> Co-authored-by: subodh <subodh1810@gmail.com> Co-authored-by: Edward Gao <edward.gao@airbyte.io>
What
Closes #28530
This PR decouples the SimpleRetriever class from the HttpStream class and uses the request-making functionality of the HttpRequester instead.
There are two interface changes in the public interfaces of
Retriever
andRequester
:Requester
breaking changeThe
request_kwargs
abstract method was removed from the interface (they were previously unused).Retriever
breaking changeread_records
method does not takesync_mode
,cursor_field
andstream_state
as parameters anymorestream_slices
method does not takesync_mode
andstream_state
as parameters anymoreDownstream connector changes:
How
The main change is in
airbyte-cdk/python/airbyte_cdk/sources/declarative/retrievers/simple_retriever.py
SimpleRetriever
HttpStream
(things around http method, url base and retries)_fetch_next_page
and_read_pages
locally which uses requester.send_request to send out the actual requestChanges in
airbyte-cdk/python/airbyte_cdk/sources/declarative/requesters/http_requester.py
:request_kwargs
as it is not usedChanges in
airbyte-cdk/python/airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
:Changes in
airbyte-cdk/python/airbyte_cdk/sources/declarative/declarative_stream.py
:Changes in
airbyte-cdk/python/airbyte_cdk/connector_builder/connector_builder_handler.py
: