Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connectors-ci: better modified connectors detection logic #28855

Conversation

alafanechere
Copy link
Contributor

@alafanechere alafanechere commented Jul 31, 2023

What

Closes #28856
Closes #28419

Bug: If a merged PR changes one metadata.yaml but the PR potentially touches multiple connector due to dependency update or other connector change without version bump the publish workflow will publish all the connectors, leading to long, unexpected publish workflow.

Solution: Only publish connectors for which a metadata.yaml file was updated.

How

When --modified is used in the publish command, we only keep connectors that have a modified metadata.yaml file.
This required changing the logic how connectors are selected by command (test / publish / build).

All the connectors command share the same way of selecting connectors by name, language or release stage via the connectors command group.
How connectors get added to the selected_connectors_and_file list is a per command logic (test / publish / build)

Recommended reading order

https://github.com/airbytehq/airbyte/blob/0238aca3bfb006db73e58eaa1932914914388a8e/airbyte-ci/connectors/pipelines/pipelines/commands/groups/connectors.py#L308

def test_test_command_select_modified_connectors(mocker, runner: CliRunner, click_context_obj: dict, new_connector: Connector):

@alafanechere alafanechere force-pushed the augustin/connectors-ci/better-modified-connectors-detection branch from a6f3247 to 0238aca Compare July 31, 2023 11:46
connectors_and_files_to_format = [
(connector, modified_files) for connector, modified_files in ctx.obj["selected_connectors_and_files"].items()
]
def format_code(ctx: click.Context) -> bool:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small renaming, format is a reserved function name in python.

@alafanechere alafanechere force-pushed the augustin/connectors-ci/better-modified-connectors-detection branch from 0238aca to aabbf8c Compare July 31, 2023 12:04
@alafanechere alafanechere marked this pull request as ready for review July 31, 2023 12:06
Copy link
Contributor

@pedroslopez pedroslopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the general approach of only considering metadata changes when publishing, however, I think there's a general change in behavior that's going on here.

Previously, the --modified, --name, --language, etc flags were working as an and. So if you have --modified and --language, it would only affect modified connectors whose language is as specified. Now, it's working as an or, so the same combination of args would affect modified connectors, as well as any connector with the specified language.

IMO I think the and is more useful and is how we implemented the stopgap for java connector publishing in #28344. Please let me know if there's something I'm missing!

Comment on lines 97 to 98
ctx.obj["selected_connectors_by_names"] = {Connector(technical_name=name) for name in names}
ctx.obj["selected_connectors_by_languages"] = {connector for connector in ctx.obj["all_connectors"] if connector.language in languages}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should selected_connectors_by_names also be using ctx.obj["all_connectors"] like *_by_languages and *_by_release_stages is doing, rather than building a new Connector with only the technical name?

**ctx.obj["selected_connectors_and_files"],
**get_modified_connectors_and_files(ctx.obj["modified_files"]),
}
main_logger.info(ctx.obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is logging the whole ctx.obj intentional? Asking since it can include secrets

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this. A debugging leftover...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we still need to remove this guy! 😅


ctx.obj["selected_connectors_and_files"] = selected_connectors_and_files
ctx.obj["selected_connectors_names"] = [c.technical_name for c in selected_connectors_and_files.keys()]
ctx.obj["all_connectors"] = get_all_released_connectors()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the get_all_released_connectors() name is a bit confusing. Reading the name and seeing released I would expect this to only have things that are published, but I think it returns all connectors.

What does released refer to here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Let me find a better name.

Comment on lines 127 to 130
ctx.obj["selected_connectors_and_files"] = {
**ctx.obj["selected_connectors_and_files"],
**get_modified_connectors_and_files(ctx.obj["modified_files"]),
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand what's going on here with the **?

I think this means "merge ctx.obj["selected_connectors_and_files"] with get_modified_connectors_and_files(ctx.obj["modified_files"]), but shouldn't select_modified_connectors only have the result of get_modified_connectors_and_files?

It seems like this is adding what you've selected with --name or other args with the modified connectors. But I would expect the --name, --language, etc to filter the modified connectors. At least that was the way it was working before and is how I stopped java connectors from publishing here #28344

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep ** expands a dictionnary, in this context it indeeds merge two dictonaries into one.
And yes you're right it slightly changes the logic. I'll try to revert to the original.
The expected logic is:

  • Combination of same options like --name, --language etc. act as OR
  • Combination of different options are AND --language=python --language=java --release-stage=generally_available should only test Python and java GA connectors.

I'll make sure to add a test for this.

Copy link
Contributor

@pedroslopez pedroslopez Aug 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alafanechere I think we're still or-ing the modified filter with the other filters. Why does --modified not work like the others?

For example --modified --language=java would select only the modified java connectors. Maybe I'm not reading this right...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pedroslopez You're right again 😮‍💨 I originally thought it was OK to or the modified connectors but for the publish use case is was cool to just bypass specific connector language.
I changed the code structure again to centralize the logic even more and added thorough unit tests.

@alafanechere
Copy link
Contributor Author

@pedroslopez would you mind giving a second review? I've changed the selection logic:
https://github.com/airbytehq/airbyte/pull/28855/files#diff-2bb66e753e4d46b035f7fa29fbe9028bc48cdb76d843fa7c0bb691398cb646b4R58

I'm currently unit testing it.

Copy link
Contributor

@bnchrch bnchrch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

**ctx.obj["selected_connectors_and_files"],
**get_modified_connectors_and_files(ctx.obj["modified_files"]),
}
main_logger.info(ctx.obj)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we still need to remove this guy! 😅

@octavia-squidington-iii
Copy link
Collaborator

source-openweather test report (commit 1da9809dc9) - ❌

⏲️ Total pipeline duration: 16.79s

Step Result

🔗 View the logs here

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-openweather test

@octavia-squidington-iii
Copy link
Collaborator

source-openweather test report (commit 351725fd0c) - ❌

⏲️ Total pipeline duration: 02mn42s

Step Result
Validate airbyte-integrations/connectors/source-openweather/metadata.yaml
Connector version semver check
QA checks
Code format checks
Connector package install
Build source-openweather docker image for platform linux/x86_64
Unit tests
Acceptance tests

🔗 View the logs here

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-openweather test

@octavia-squidington-iii
Copy link
Collaborator

source-openweather test report (commit a2fe34bbdc) - ✅

⏲️ Total pipeline duration: 02mn23s

Step Result
Validate airbyte-integrations/connectors/source-openweather/metadata.yaml
Connector version semver check
QA checks
Code format checks
Connector package install
Build source-openweather docker image for platform linux/x86_64
Unit tests
Acceptance tests

🔗 View the logs here

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-openweather test

@octavia-squidington-iii
Copy link
Collaborator

source-faker test report (commit a2fe34bbdc) - ✅

⏲️ Total pipeline duration: 02mn28s

Step Result
Validate airbyte-integrations/connectors/source-faker/metadata.yaml
Connector version semver check
QA checks
Code format checks
Connector package install
Build source-faker docker image for platform linux/x86_64
Unit tests
Acceptance tests

🔗 View the logs here

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool with the following command

airbyte-ci connectors --name=source-faker test

Copy link
Contributor

@pedroslopez pedroslopez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

giphy

Some comments, but looks good! Thank you 😄

Comment on lines 86 to 88
selected_connectors_by_name = validate_selected_connectors_exist(
{Connector(technical_name=name) for name in selected_names}, all_connectors
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not filter on all_connectors like the others?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it for consistency. Connector name validation is now happening at click option parsing 26a68e5

Comment on lines -90 to -93
ctx.obj["connector_names"] = names
ctx.obj["connector_languages"] = languages
ctx.obj["release_states"] = release_stages
ctx.obj["modified"] = modified
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after this change we won't be setting this fields - just confirming, do we need or reference this context anywhere else?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think they're used elsewhere. Will confirm by running a pre-release. I'm not against re-adding these if needed but ATM I believe keeping ctx.obj with used fields is better for readability / debugging.

@alafanechere alafanechere enabled auto-merge (squash) August 3, 2023 08:08
@alafanechere alafanechere merged commit b189548 into master Aug 3, 2023
19 checks passed
@alafanechere alafanechere deleted the augustin/connectors-ci/better-modified-connectors-detection branch August 3, 2023 08:27
jbfbell added a commit that referenced this pull request Aug 9, 2023
* Add everything for BQ but migrate, refactor interface after practical work

* Make new default methods, refactor to single implemented method

* MigrationInterface and BQ impl created

* Trying to integrate with standard inserts

* remove unnecessary NameAndNamespacePair class

* Shimmed in

* Java Docs

* Initial Testing Setup

* Tests!

* Move Migrator into TyperDeduper

* Functional Migration

* Add Integration Test

* Pr updates

* bump version

* bump version

* version bump

* Update to airbyte-ci-internal (#29026)

* 🐛 Source Github, Instagram, Zendesk-support, Zendesk-talk: fix CAT tests fail on `spec` (#28910)

* connectors-ci: better modified connectors detection logic (#28855)

* connectors-ci: report path should always start with `airbyte-ci/` (#29030)

* make report path always start with airbyte-ci

* revert report path in orchestrator

* add more test cases

* bump version

* Updated docs (#29019)

* CDK: Embedded reader utils (#28873)

* relax pydantic dep

* Automated Commit - Format and Process Resources Changes

* wip

* wrap up base integration

* add init file

* introduce CDK runner and improve error message

* make state param optional

* update protocol models

* review comments

* always run incremental if possible

* fix

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (#28657)

* fix tests

* format

* review comments

* Automated Commit - Formatting Changes

* review comments

* review comments

* review comments

* log all messages

* log all message

* review comments

* review comments

* Automated Commit - Formatting Changes

* add comment

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🐛 Source Github, Instagram, Zendesk Support / Talk - revert `spec` changes and improve (#29031)

* Source oauth0: new streams and fix incremental (#29001)

* Add new streams Organizations,OrganizationMembers,OrganizationMemberRoles

* relax schema definition to allow additional fields

* Bump image tag version

* revert some changes to the old schemas

* Format python so gradle can pass

* update incremental

* remove unused print

* fix unit test

---------

Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>

* 🐛 Source Mongo: Fix failing acceptance tests (#28816)

* Fix failing acceptance tests

* Fix failing strict acceptance tests

* Source-Greenhouse: Fix unit tests for new CDK version (#28969)

Fix unit tests

* Add CSV options to the CSV parser (#28491)

* remove invalid legacy option

* remove unused option

* the tests pass but this is quite messy

* very slight clean up

* Add skip options to csv format

* fix some of the typing issues

* fixme comment

* remove extra log message

* fix typing issues

* skip before header

* skip after header

* format

* add another test

* Automated Commit - Formatting Changes

* auto generate column names

* delete dead code

* update title and description

* true and false values

* Update the tests

* Add comment

* missing test

* rename

* update expected spec

* move to method

* Update comment

* fix typo

* remove unused import

* Add a comment

* None records do not pass the WaitForDiscoverPolicy

* format

* remove second branch to ensure we always go through the same processing

* Raise an exception if the record is None

* reset

* Update tests

* handle unquoted newlines

* Automated Commit - Formatting Changes

* Update test case so the quoting is explicit

* Update comment

* Automated Commit - Formatting Changes

* Fail validation if skipping rows before header and header is autogenerated

* always fail if a record cannot be parsed

* format

* set write line_no in error message

* remove none check

* Automated Commit - Formatting Changes

* enable autogenerate test

* remove duplicate test

* missing unit tests

* Update

* remove branching

* remove unused none check

* Update tests

* remove branching

* format

* extract to function

* comment

* missing type

* type annotation

* use set

* Document that the strings are case-sensitive

* public -> private

* add unit test

* newline

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>

* Dagster: Add sentry logging (#28822)

* Add sentry

* add sentry decorator

* Add traces

* Use sentry trace

* Improve duplicate logging

* Add comments

* DNC

* Fix up issues

* Move to scopes

* Remove breadcrumb

* Update lock

* ✨Source Shortio: Migrate Python CDK to Low-code CDK (#28950)

* Migrate Shortio to Low-Code

* Update abnormal state

* Format

* Update Docs

* Fix metadata.yaml

* Add pagination

* Add incremental sync

* add incremental parameters

* update metadata

* rollback update version

* release date

---------

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Update to new verbiage (#29051)

* [skip ci] Metadata: Remove leading underscore (#29024)

* DNC

* Add test models

* Add model test

* Remove underscore from metadata files

* Regenerate models

* Add test to check for key transformation

* Allow additional fields on metadata

* Delete transform

* Proof of concept parallel source stream reading implementation for MySQL (#26580)

* Proof of concept parallel source stream reading implementation for MySQL

* Automated Change

* Add read method that supports concurrent execution to Source interface

* Remove parallel iterator

* Ensure that executor service is stopped

* Automated Commit - Format and Process Resources Changes

* Expose method to fix compilation issue

* Use concurrent map to avoid access issues

* Automated Commit - Format and Process Resources Changes

* Ensure concurrent streams finish before closing source

* Fix compile issue

* Formatting

* Exclude concurrent stream threads from orphan thread watcher

* Automated Commit - Format and Process Resources Changes

* Refactor orphaned thread logic to account for concurrent execution

* PR feedback

* Implement readStreams in wrapper source

* Automated Commit - Format and Process Resources Changes

* Add readStream override

* Automated Commit - Format and Process Resources Changes

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* Debug logging

* Reduce logging level

* Replace synchronized calls to System.out.println when concurrent

* Close consumer

* Flush before close

* Automated Commit - Format and Process Resources Changes

* Remove charset

* Use ASCII and flush periodically for parallel streams

* Test performance harness patch

* Automated Commit - Format and Process Resources Changes

* Cleanup

* Logging to identify concurrent read enabled

* Mark parameter as final

---------

Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>

* connectors-ci: disable dependency scanning (#29033)

* updates (#29059)

* Metadata: skip breaking change validation on prerelease (#29017)

* skip breaking change validation

* Move ValidatorOpts higher in call

* Add prerelease test

* Fix test

* ✨ Source MongoDB Internal POC: Generate Test Data (#29049)

* Add script to generate test data

* Fix prose

* Update credentials example

* PR feedback

* Bump Airbyte version from 0.50.12 to 0.50.13

* Bump versions for mssql strict-encrypt (#28964)

* Bump versions for mssql strict-encrypt

* Fix failing test

* Fix failing test

* 🎨 Improve replication method selection UX (#28882)

* update replication method in MySQL source

* bump version

* update expected specs

* update registries

* bump strict encrypt version

* make password always_show

* change url

* update registries

* 🐛 Avoid writing records to log (#29047)

* Avoid writing records to log

* Update version

* Rollout ctid cdc (#28708)

* source-postgres: enable ctid+cdc implementation

* 100% ctid rollout for cdc

* remove CtidFeatureFlags

* fix CdcPostgresSourceAcceptanceTest

* Bump versions and release notes

* Fix compilation error due to previous merge

---------

Co-authored-by: subodh <subodh1810@gmail.com>

* connectors-ci: fix `unhashable type 'set'` (#29064)

* Add Slack Alert lifecycle to Dagster for Metadata publish (#28759)

* DNC

* Add slack lifecycle logging

* Update to use slack

* Update slack to use resource and bot

* Improve markdown

* Improve log

* Add sensor logging

* Extend sensor time

* merge conflict

* PR Refactoring

* Make the tests work

* remove unnecessary classes, pr feedback

* more merging

* Update airbyte-integrations/bases/base-typing-deduping-test/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/BaseSqlGeneratorIntegrationTest.java

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* snowflake updates

---------

Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>
Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com>
Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>
Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com>
Co-authored-by: terencecho <terencecho@users.noreply.github.com>
Co-authored-by: Lake Mossman <lake@airbyte.io>
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
Co-authored-by: subodh <subodh1810@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
harrytou pushed a commit to KYVENetwork/airbyte that referenced this pull request Sep 1, 2023
* Add everything for BQ but migrate, refactor interface after practical work

* Make new default methods, refactor to single implemented method

* MigrationInterface and BQ impl created

* Trying to integrate with standard inserts

* remove unnecessary NameAndNamespacePair class

* Shimmed in

* Java Docs

* Initial Testing Setup

* Tests!

* Move Migrator into TyperDeduper

* Functional Migration

* Add Integration Test

* Pr updates

* bump version

* bump version

* version bump

* Update to airbyte-ci-internal (airbytehq#29026)

* 🐛 Source Github, Instagram, Zendesk-support, Zendesk-talk: fix CAT tests fail on `spec` (airbytehq#28910)

* connectors-ci: better modified connectors detection logic (airbytehq#28855)

* connectors-ci: report path should always start with `airbyte-ci/` (airbytehq#29030)

* make report path always start with airbyte-ci

* revert report path in orchestrator

* add more test cases

* bump version

* Updated docs (airbytehq#29019)

* CDK: Embedded reader utils (airbytehq#28873)

* relax pydantic dep

* Automated Commit - Format and Process Resources Changes

* wip

* wrap up base integration

* add init file

* introduce CDK runner and improve error message

* make state param optional

* update protocol models

* review comments

* always run incremental if possible

* fix

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🚨🚨 Low code CDK: Decouple SimpleRetriever and HttpStream (airbytehq#28657)

* fix tests

* format

* review comments

* Automated Commit - Formatting Changes

* review comments

* review comments

* review comments

* log all messages

* log all message

* review comments

* review comments

* Automated Commit - Formatting Changes

* add comment

---------

Co-authored-by: flash1293 <flash1293@users.noreply.github.com>

* 🤖 Bump minor version of Airbyte CDK

* 🐛 Source Github, Instagram, Zendesk Support / Talk - revert `spec` changes and improve (airbytehq#29031)

* Source oauth0: new streams and fix incremental (airbytehq#29001)

* Add new streams Organizations,OrganizationMembers,OrganizationMemberRoles

* relax schema definition to allow additional fields

* Bump image tag version

* revert some changes to the old schemas

* Format python so gradle can pass

* update incremental

* remove unused print

* fix unit test

---------

Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>

* 🐛 Source Mongo: Fix failing acceptance tests (airbytehq#28816)

* Fix failing acceptance tests

* Fix failing strict acceptance tests

* Source-Greenhouse: Fix unit tests for new CDK version (airbytehq#28969)

Fix unit tests

* Add CSV options to the CSV parser (airbytehq#28491)

* remove invalid legacy option

* remove unused option

* the tests pass but this is quite messy

* very slight clean up

* Add skip options to csv format

* fix some of the typing issues

* fixme comment

* remove extra log message

* fix typing issues

* skip before header

* skip after header

* format

* add another test

* Automated Commit - Formatting Changes

* auto generate column names

* delete dead code

* update title and description

* true and false values

* Update the tests

* Add comment

* missing test

* rename

* update expected spec

* move to method

* Update comment

* fix typo

* remove unused import

* Add a comment

* None records do not pass the WaitForDiscoverPolicy

* format

* remove second branch to ensure we always go through the same processing

* Raise an exception if the record is None

* reset

* Update tests

* handle unquoted newlines

* Automated Commit - Formatting Changes

* Update test case so the quoting is explicit

* Update comment

* Automated Commit - Formatting Changes

* Fail validation if skipping rows before header and header is autogenerated

* always fail if a record cannot be parsed

* format

* set write line_no in error message

* remove none check

* Automated Commit - Formatting Changes

* enable autogenerate test

* remove duplicate test

* missing unit tests

* Update

* remove branching

* remove unused none check

* Update tests

* remove branching

* format

* extract to function

* comment

* missing type

* type annotation

* use set

* Document that the strings are case-sensitive

* public -> private

* add unit test

* newline

---------

Co-authored-by: girarda <girarda@users.noreply.github.com>

* Dagster: Add sentry logging (airbytehq#28822)

* Add sentry

* add sentry decorator

* Add traces

* Use sentry trace

* Improve duplicate logging

* Add comments

* DNC

* Fix up issues

* Move to scopes

* Remove breadcrumb

* Update lock

* ✨Source Shortio: Migrate Python CDK to Low-code CDK (airbytehq#28950)

* Migrate Shortio to Low-Code

* Update abnormal state

* Format

* Update Docs

* Fix metadata.yaml

* Add pagination

* Add incremental sync

* add incremental parameters

* update metadata

* rollback update version

* release date

---------

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Update to new verbiage (airbytehq#29051)

* [skip ci] Metadata: Remove leading underscore (airbytehq#29024)

* DNC

* Add test models

* Add model test

* Remove underscore from metadata files

* Regenerate models

* Add test to check for key transformation

* Allow additional fields on metadata

* Delete transform

* Proof of concept parallel source stream reading implementation for MySQL (airbytehq#26580)

* Proof of concept parallel source stream reading implementation for MySQL

* Automated Change

* Add read method that supports concurrent execution to Source interface

* Remove parallel iterator

* Ensure that executor service is stopped

* Automated Commit - Format and Process Resources Changes

* Expose method to fix compilation issue

* Use concurrent map to avoid access issues

* Automated Commit - Format and Process Resources Changes

* Ensure concurrent streams finish before closing source

* Fix compile issue

* Formatting

* Exclude concurrent stream threads from orphan thread watcher

* Automated Commit - Format and Process Resources Changes

* Refactor orphaned thread logic to account for concurrent execution

* PR feedback

* Implement readStreams in wrapper source

* Automated Commit - Format and Process Resources Changes

* Add readStream override

* Automated Commit - Format and Process Resources Changes

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* 🤖 Auto format source-mysql code [skip ci]

* Debug logging

* Reduce logging level

* Replace synchronized calls to System.out.println when concurrent

* Close consumer

* Flush before close

* Automated Commit - Format and Process Resources Changes

* Remove charset

* Use ASCII and flush periodically for parallel streams

* Test performance harness patch

* Automated Commit - Format and Process Resources Changes

* Cleanup

* Logging to identify concurrent read enabled

* Mark parameter as final

---------

Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>

* connectors-ci: disable dependency scanning (airbytehq#29033)

* updates (airbytehq#29059)

* Metadata: skip breaking change validation on prerelease (airbytehq#29017)

* skip breaking change validation

* Move ValidatorOpts higher in call

* Add prerelease test

* Fix test

* ✨ Source MongoDB Internal POC: Generate Test Data (airbytehq#29049)

* Add script to generate test data

* Fix prose

* Update credentials example

* PR feedback

* Bump Airbyte version from 0.50.12 to 0.50.13

* Bump versions for mssql strict-encrypt (airbytehq#28964)

* Bump versions for mssql strict-encrypt

* Fix failing test

* Fix failing test

* 🎨 Improve replication method selection UX (airbytehq#28882)

* update replication method in MySQL source

* bump version

* update expected specs

* update registries

* bump strict encrypt version

* make password always_show

* change url

* update registries

* 🐛 Avoid writing records to log (airbytehq#29047)

* Avoid writing records to log

* Update version

* Rollout ctid cdc (airbytehq#28708)

* source-postgres: enable ctid+cdc implementation

* 100% ctid rollout for cdc

* remove CtidFeatureFlags

* fix CdcPostgresSourceAcceptanceTest

* Bump versions and release notes

* Fix compilation error due to previous merge

---------

Co-authored-by: subodh <subodh1810@gmail.com>

* connectors-ci: fix `unhashable type 'set'` (airbytehq#29064)

* Add Slack Alert lifecycle to Dagster for Metadata publish (airbytehq#28759)

* DNC

* Add slack lifecycle logging

* Update to use slack

* Update slack to use resource and bot

* Improve markdown

* Improve log

* Add sensor logging

* Extend sensor time

* merge conflict

* PR Refactoring

* Make the tests work

* remove unnecessary classes, pr feedback

* more merging

* Update airbyte-integrations/bases/base-typing-deduping-test/src/main/java/io/airbyte/integrations/base/destination/typing_deduping/BaseSqlGeneratorIntegrationTest.java

Co-authored-by: Edward Gao <edward.gao@airbyte.io>

* snowflake updates

---------

Co-authored-by: Ben Church <ben@airbyte.io>
Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: flash1293 <flash1293@users.noreply.github.com>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Vasilis Gavriilidis <vasilis.gavriilidis@orfium.com>
Co-authored-by: Jonathan Pearlin <jonathan@airbyte.io>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: btkcodedev <btk.codedev@gmail.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Natalie Kwong <38087517+nataliekwong@users.noreply.github.com>
Co-authored-by: jdpgrailsdev <jdpgrailsdev@users.noreply.github.com>
Co-authored-by: octavia-squidington-iii <octavia-squidington-iii@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: rodireich <rodireich@users.noreply.github.com>
Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com>
Co-authored-by: terencecho <terencecho@users.noreply.github.com>
Co-authored-by: Lake Mossman <lake@airbyte.io>
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
Co-authored-by: subodh <subodh1810@gmail.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants