Revamp QA checks into a battery included package #35322

Merged
alafanechere merged 1 commit into master from augustin/02-10-qa_checks_v2
Feb 19, 2024

alafanechere
Contributor

@alafanechere alafanechere commented Feb 15, 2024

What

Relates to:

This PR introduces a new connectors-qa 🐍 package which can run static-analysis checks on our connectors and generate documentation.

It will help address the following problems we have:

🤔 Philosophy

This effort is driven by the following principles:

  • Our connector standards apply to all connectors:
    • Airbyte should commit that certified connectors will always meet all the standards.
    • Any new contribution to a community connector must meet all standards to be merged.
  • We should limit connector-specific logic in airbyte-ci as much as possible. We should consider it an orchestrator calling external tools and packages. airbyte-ci will call a containerized connectors-qa via its CLI.

🎉 Net new features

Documentation generation

connectors-qa generate-documentation connectors_qa_documentation.md

This will generate a markdown file documenting all the enabled checks.
The content is taken from the check classes' names and descriptions.
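
To illustrate the idea, here is a minimal sketch of how documentation could be rendered from check names and descriptions. The `Check` dataclass, its `category` field, `render_documentation` and the example checks are all hypothetical names made up for this sketch; they are not the actual connectors-qa classes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """Hypothetical stand-in for a check class; not the real connectors-qa API."""

    name: str
    category: str
    description: str


def render_documentation(checks: list[Check]) -> str:
    """Render a markdown page: one section per category, one entry per check."""
    lines = ["# Connectors QA checks", ""]
    for category in sorted({check.category for check in checks}):
        lines += [f"## {category}", ""]
        for check in checks:
            if check.category == category:
                lines += [f"### {check.name}", "", check.description, ""]
    return "\n".join(lines)


# Placeholder checks, invented for this example.
EXAMPLE_CHECKS = [
    Check("Example check A", "Documentation", "Describes what check A verifies."),
    Check("Example check B", "Metadata", "Describes what check B verifies."),
]

markdown = render_documentation(EXAMPLE_CHECKS)
```

Because the text comes from the check definitions themselves, the generated page cannot drift from the checks that are actually enabled.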

Report generation

connectors-qa run --connector-directory=airbyte-integrations/connectors --report-path=qa_report.json

This will generate a JSON report of all the QA checks on all our connectors.
We could automate the generation of this report in our CD pipeline and use it to feed dashboards or other assets like connectors' READMEs.
We could also create cool badges like: Dynamic JSON Badge
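
As a rough sketch of how a dashboard or badge could consume such a report, the snippet below counts failed checks per connector. The report shape used here (a list of objects with `connector`, `check` and `passed` keys) is an assumption made for illustration only; the real structure of `qa_report.json` is not shown in this PR description.

```python
from __future__ import annotations

import json
from collections import Counter

# Assumed report shape, for illustration only: a flat list of per-check results.
SAMPLE_REPORT = json.dumps(
    [
        {"connector": "source-faker", "check": "example-check-a", "passed": True},
        {"connector": "source-faker", "check": "example-check-b", "passed": False},
        {"connector": "source-file", "check": "example-check-a", "passed": True},
    ]
)


def failures_per_connector(report_json: str) -> dict[str, int]:
    """Count failed checks per connector, keeping connectors with zero failures."""
    counts: Counter = Counter()
    for result in json.loads(report_json):
        counts[result["connector"]] += 0 if result["passed"] else 1
    return dict(counts)


summary = failures_per_connector(SAMPLE_REPORT)
```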

Recommended reading order

  1. README.md to install and try out the tool locally.
  2. checks/*.py to understand which checks are running
  3. cli.py to grasp how the entrypoint and asyncio logic is implemented
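
For readers new to asyncio, here is a minimal sketch of running many checks concurrently with a concurrency cap. It only illustrates the general pattern; `run_check`, `run_all` and `max_concurrency` are hypothetical names, and the actual logic in cli.py may be organized differently.

```python
import asyncio


async def run_check(connector: str, check_name: str) -> dict:
    """Placeholder for a real check; it only yields control to the event loop."""
    await asyncio.sleep(0)
    return {"connector": connector, "check": check_name, "passed": True}


async def run_all(connectors, check_names, max_concurrency=10):
    """Run every (connector, check) pair, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(connector, check_name):
        async with semaphore:
            return await run_check(connector, check_name)

    tasks = [guarded(c, n) for c in connectors for n in check_names]
    # gather returns results in the same order as the awaitables it was given.
    return await asyncio.gather(*tasks)


results = asyncio.run(
    run_all(["source-faker", "source-file"], ["example-check-a", "example-check-b"])
)
```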

🚨 User Impact 🚨

None, as this PR just introduces a new package which is not yet used by airbyte-ci.

Follow up steps

  1. Commit the generated documentation doc: Document our connectors QA checks #35324
  2. Call this package in the QAChecks step of airbyte-ci connectors test
  3. Remove connector_ops/qa_checks and airbyte-ci steps that are now running inside this package (semver version check, pypi publishing etc.)
  4. Automate the report generation and host it on GCS to power dashboards - connector readme badges?
  5. Replace the "Connector checklist" on PRs by a comment linking to the generated doc?

@alafanechere alafanechere marked this pull request as ready for review February 15, 2024 14:43
@alafanechere alafanechere requested a review from a team as a code owner February 15, 2024 14:43
@alafanechere alafanechere changed the title Revamp QA check into a battery included package Revamp QA checks into a battery included package Feb 15, 2024
Collaborator

@bazarnov bazarnov left a comment

Looks very promising to me, thanks @alafanechere! Not approving now, because I'll take a closer look tomorrow to see the big picture of the checks themselves.

@@ -17,7 +17,7 @@ GitPython = "^3.1.29"
pydantic = "^1.9"
Collaborator

What holds us back from switching to pydantic 2.0.0? Just curious.

Contributor Author

I don't know, nothing I'm aware of, but it's a different package / project than the current one :)

Contributor

I don't yet know the full surface area of us using Pydantic. If we're using 1.* everywhere, should we treat 2.0+ upgrade as a separate lane of work, or do you mostly expect things to work smoothly with such an upgrade?

Is it drop-in compatible syntax-wise?

If we're using it in the CDK itself and in airbyte-ci — I think airbyte-ci could be our guinea pig for such a migration? /cc @bazarnov

Collaborator

I think the only package that still requires 1.* is CAT. Even so, let airbyte-ci be the guide indeed.

)

expected_title = f"# {connector.name_from_metadata} Migration Guide"
expected_version_header_start = "## Upgrading to "
Collaborator

Do we have all the critical rules covered?

Contributor Author

I ported over what currently exists in qa_checks.py. Let me know if you think additional checks should be implemented. I'm open to it, but in different PRs and with related new GH issues 😄
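
For context on the rule under discussion, here is a simplified sketch of a migration guide check. Only the two expected strings (the title and the `## Upgrading to ` header prefix) come from the diff above; the function name, its signature and the rest of the logic are assumptions for illustration, not the ported implementation.

```python
from __future__ import annotations


def check_migration_guide(content: str, connector_name: str) -> list[str]:
    """Return a list of error messages; an empty list means the guide passes."""
    expected_title = f"# {connector_name} Migration Guide"
    expected_version_header_start = "## Upgrading to "

    errors = []
    lines = [line.strip() for line in content.splitlines() if line.strip()]

    # The first non-empty line must be the expected title.
    if not lines or lines[0] != expected_title:
        errors.append(f"The first line must be '{expected_title}'.")

    # Every second-level header must announce a version upgrade.
    version_headers = [line for line in lines if line.startswith("## ")]
    if not version_headers:
        errors.append("The guide must contain at least one version header.")
    for header in version_headers:
        if not header.startswith(expected_version_header_start):
            errors.append(f"Header '{header}' must start with '{expected_version_header_start}'.")
    return errors


valid_guide = "# Faker Migration Guide\n\n## Upgrading to 2.0.0\n\nSome text.\n"
invalid_guide = "# Wrong Title\n\n## Version 2.0.0\n"
```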

Contributor

@natikgadzhi natikgadzhi left a comment

This does look great to me — since the potential blast radius of just having the tool is close to zero, I'm comfy approving this.

Caveats: I want follow-up work of:

  • Publishing the actual generated documentation
  • Moving our doc guide to our doc pages instead of hackmd
  • Actually switching to use connectors-qa.

Please wait for @bazarnov's review and work with him to get a go ahead from API Sources, but I'm personally happy.

@@ -17,7 +17,7 @@ GitPython = "^3.1.29"
pydantic = "^1.9"
PyGithub = "^1.58.0"
rich = "^13.0.0"
-pydash = "^7.0.4"
+pydash = "^6.0.2"
Contributor

Why downgrade?

Contributor Author

@alafanechere alafanechere Feb 16, 2024

connectors_qa depends on two other internal packages which both use pydash, but on different versions:

  • metadata_service/lib uses v6
  • connector_ops uses v7

I'm aligning to the metadata_service/lib version because this package is also a dependency of metadata_service/orchestrator, which also declares a dependency on pydash.

Aligning to v6 means I modify a single package (connector_ops) instead of two (metadata_service/lib and metadata_service/orchestrator). Apart from that, I don't think there's a good reason to stay on v6.

@@ -0,0 +1,87 @@
# Connectors QA
Contributor

Two questions here:

  1. Does connectors_qa have to live in this particular directory, or is it just a logical spot for it in the monorepo? I.e. could we just put it in the root of the repository instead?
  2. If it's fully independent of the rest of connector_ops stuff (don't think so), is there a good overall readme and list of directories and what project is where? We have many things — registry service, airbyte-ci, etc.

Contributor Author

@natikgadzhi I put it there because it's a package used in the CI context. But we could put it at the root of the repo, I don't mind 😄.
It depends on a couple of other internal packages. I will explicitly list the dependencies and why they exist in this README.

from .documentation import ENABLED_CHECKS as DOCUMENTATION_CHECKS

ENABLED_CHECKS = (
DOCUMENTATION_CHECKS
Contributor

I was about to suggest that you could have a list of lists and then flatten, but then I remembered how flattening lists in Python is expressed kek.
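
For reference, a small sketch of the two usual ways to flatten a list of lists in Python. The check group names and their contents are placeholders invented for this example, not the real check modules.

```python
from itertools import chain

# Placeholder check groups; the real modules export lists of check instances.
DOCUMENTATION_CHECKS = ["doc-check-a", "doc-check-b"]
METADATA_CHECKS = ["metadata-check-a"]  # hypothetical second group

CHECK_GROUPS = [DOCUMENTATION_CHECKS, METADATA_CHECKS]

# Option 1: itertools.chain.from_iterable walks each inner list in order.
flattened_with_chain = list(chain.from_iterable(CHECK_GROUPS))

# Option 2: a nested comprehension; the outer loop is written first.
flattened_with_comprehension = [check for group in CHECK_GROUPS for check in group]
```

Plain concatenation with `+` stays readable while the number of groups is small, which is probably why it reads fine here.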

DOCKER_INDEX = "docker.io"
DOCKERFILE_NAME = "Dockerfile"
DOCUMENTATION_STANDARDS_URL = "https://hackmd.io/Bz75cgATSbm7DjrAqgl4rw"
GRADLE_FILE_NAME = "build.gradle"
Contributor

How is Gradle used internally, and for what?

Contributor Author

@natikgadzhi these checks also run on our Java connectors, which use Gradle.
This file is used in this program to determine whether a connector is a Java one.
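
A minimal sketch of that kind of detection, assuming it boils down to checking whether the file exists in the connector directory. `GRADLE_FILE_NAME` is the constant from the diff above; `is_java_connector`, `demo` and the directory names are hypothetical and only serve the example.

```python
import tempfile
from pathlib import Path

GRADLE_FILE_NAME = "build.gradle"


def is_java_connector(connector_directory: Path) -> bool:
    """Treat a connector as a Java one when it ships a Gradle build file."""
    return (connector_directory / GRADLE_FILE_NAME).is_file()


def demo() -> tuple:
    """Build two throwaway connector directories and classify them."""
    with tempfile.TemporaryDirectory() as tmp:
        java_dir = Path(tmp) / "source-example-java"
        python_dir = Path(tmp) / "source-example-python"
        java_dir.mkdir()
        python_dir.mkdir()
        (java_dir / GRADLE_FILE_NAME).touch()
        return is_java_connector(java_dir), is_java_connector(python_dir)
```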

DOCKER_HUB_USERNAME_ENV_VAR_NAME = "DOCKER_HUB_USERNAME"
DOCKER_INDEX = "docker.io"
DOCKERFILE_NAME = "Dockerfile"
DOCUMENTATION_STANDARDS_URL = "https://hackmd.io/Bz75cgATSbm7DjrAqgl4rw"
Contributor

Wow wow wow hold up, why are our standards in a 3rd party service and not a page on our very own docs site? Any specific reason? /cc @girarda @alafanechere

Contributor

No idea. This is the first time I see this page. I think this was hacked together by dev-rel two years ago, but that's only based off this slack thread.

Contributor Author

@natikgadzhi @girarda this doc was written by our previous documentation team. I believe it was drafted when I was working on the original qa_checks.py and never got checked into our main repo. I believe the doc is good enough to join our Resources section of Contributing to Airbyte. I can do it in a follow-up PR.

Screenshot 2024-02-16 at 08 09 30

@alafanechere alafanechere force-pushed the augustin/02-10-qa_checks_v2 branch 3 times, most recently from 2370f63 to 2183818 Compare February 16, 2024 08:15
@alafanechere
Contributor Author

This does look great to me — since the potential blast radius of just having the tool is close to zero, I'm comfy approving this.

Caveats: I want follow-up work of:

  • Publishing the actual generated documentation
  • Moving our doc guide to our doc pages instead of hackmd
  • Actually switching to use connectors-qa.

Please wait for @bazarnov's review and work with him to get a go ahead from API Sources, but I'm personally happy.

@natikgadzhi I generated the documentation and added an integration test to make sure it's kept up to date in #35324

Collaborator

@bazarnov bazarnov left a comment

LGTM. Thanks @alafanechere for this improvement.

@alafanechere alafanechere merged commit 553c9b0 into master Feb 19, 2024
27 checks passed
@alafanechere alafanechere deleted the augustin/02-10-qa_checks_v2 branch February 19, 2024 11:43
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 21, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
jatinyadav-cc added a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
* ✨ source-surveymonkey: migrate to poetry (airbytehq#35168)

* ✨ source-monday: migrate to poetry (airbytehq#35146)

* ✨ source-salesforce: migrate to poetry (airbytehq#35147)

* ✨ source-intercom: migrate to poetry (airbytehq#35148)

* ✨ source-iterable: migrate to poetry (airbytehq#35150)

* ✨ source-mixpanel: migrate to poetry (airbytehq#35151)

* ✨ source-typeform: migrate to poetry (airbytehq#35152)

* ✨ source-twilio: migrate to poetry (airbytehq#35153)

* ✨ source-notion: migrate to poetry (airbytehq#35155)

* ✨ source-zendesk-talk: migrate to poetry (airbytehq#35156)

* ✨ source-amplitude: migrate to poetry (airbytehq#35162)

* ✨ source-jira: migrate to poetry (airbytehq#35160)

* ✨ source-google-ads: migrate to poetry (airbytehq#35158)

* 🐛 Source Slack: Join to the channels while `read` instead of `discovery` (airbytehq#35131)

* ✨ source-hubspot: migrate to poetry (airbytehq#35165)

* ✨ source-pinterest: migrate to poetry (airbytehq#35159)

* ✨ source-sentry: migrate to poetry (airbytehq#35145)

* ✨ source-chargebee: migrate to poetry (airbytehq#35169)

* source-snapchat-marketing: adopt our base image (airbytehq#35170)

* ✨ source-snapchat-marketing: migrate to poetry (airbytehq#35171)

* source-faker: adopt our base image (airbytehq#35172)

* ✨ source-faker: migrate to poetry (airbytehq#35174)

* ✨ source-amazon-ads: migrate to poetry (airbytehq#35180)

* Source Github: add integration tests  (airbytehq#34933)

* ✨ source-bing-ads: migrate to poetry (airbytehq#35179)

* ✨ source-instagram: migrate to poetry (airbytehq#35177)

* ✨ source-facebook-marketing: migrate to poetry (airbytehq#35178)

* destination-async-framework: make emission of state from FlushWorkers synchronized (airbytehq#35144)

* ✨ source-freshdesk: migrate to poetry (airbytehq#35187)

* 🐛 source-mysql Support special chars in dbname (airbytehq#34580)

* AirbyteLib: Release 0.1.0 (airbytehq#35184)

* 📚 Adjust documentation for corepack (airbytehq#35192)

* ✨ source-recharge: migrate to poetry (airbytehq#35182)

* ✨ source-tiktok-marketing: migrate to poetry (airbytehq#35161)

* Bump Airbyte version from 0.50.48 to 0.50.49

* ✨ Destination Postgres: DV2 GA (airbytehq#35042)

Co-authored-by: Marius Posta <marius@airbyte.io>
Co-authored-by: Evan Tahler <evan@airbyte.io>

* Destination snowflake: reorder auth spec options (airbytehq#35194)

* ✨ source-zendesk-chat: migrate to poetry (airbytehq#35185)

* ✨ source-sendgrid: migrate to poetry (airbytehq#35181)

* ✨ source-gitlab: migrate to poetry (airbytehq#35167)

* ✨ source-airtable: migrate to poetry (airbytehq#35149)

* ✨ source-google-search-console: migrate to poetry (airbytehq#35163)

* 🐛Source Amazon Seller Partner: add integration tests (airbytehq#33996)

* ✨ source-s3: migrate to poetry (airbytehq#35164)

* ✨ source-shopify: migrate to poetry (airbytehq#35166)

* ✨ source-file: migrate to poetry (airbytehq#35186)

* ✨ source-slack: migrate to poetry (airbytehq#35157)

* ✨ source-harvest: migrate to poetry (airbytehq#35154)

* Source Chargebee: Updates schemas for validation and missing fields errors, updates test bypass, adds expected records, adds custom error handling, adds incremental support for three streams (airbytehq#34053)

* Don't emit final state if there is an underlying stream failure (airbytehq#34869)

Co-authored-by: Xiaohan Song <xiaohan@airbyte.io>

* Remove IAM Role Setup instructions from s3.md (airbytehq#35190)

* Bump Airbyte version from 0.50.49 to 0.50.50

* airbyte-ci: run `poetry check` before `poetry install` on poetry package install (airbytehq#35212)

* ✨ Source File: add fixed width file format support (airbytehq#34678)

Co-authored-by: mgreene <michael.greene@gravie.com>
Co-authored-by: Serhii Lazebnyi <serhii.lazebnyi@globallogic.com>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>

* source-postgres: adopt CDK 0.20.4 (airbytehq#35224)

* 🐛  Set cdc record subsequent record wait time to initial wait time as a workaround (airbytehq#35114)

* AirbyteLib: docs: add Colab quicklink (airbytehq#35215)

* AirbyteLib: support secrets in dotenv files (airbytehq#35244)

* Add airbyte trace utility to emit analytics messages & emit messages for MongoDB, Postgres & MySQL (airbytehq#35036)

* AirbyteLib: Docs: fix colab badge (airbytehq#35248)

* AirbyteLib: improve json schema type detection (airbytehq#35263)

* 🏥 Source Mixpanel: update stream Funnels with custom_event_id and custom_event fields fields (airbytehq#35203)

* write logs to file in addition to stdout when running java connector tests (airbytehq#35236)

* destination-duckdb: remove superfluous build.gradle file (airbytehq#35277)

* fix `:airbyte-integrations:connectors:destination-duckdb' could not be found in project` (airbytehq#35279)

* destination-e2e-test,dev-null: use CDK 0.20.6 (airbytehq#35278)

* AirbyteLib: Add support for JSON and VARIANT types (airbytehq#35117)

Co-authored-by: Joe Reuter <joe@airbyte.io>

* Docs: add deprecation note for normalization and custom transformation (airbytehq#35275)

* 🎉 Source Intercom: Update the API Version to `2.10` (airbytehq#35176)

* 🐛 Source Harvest: Revert  poetry update (airbytehq#35296)

* AirbyteLib: Mark and deprioritize slow tests (airbytehq#35298)

* source-clickhouse: adopt CDK 0.20.4 (airbytehq#35235)

* source-cockroachdb: adopt CDK 0.20.4 (airbytehq#35234)

* source-db2: adopt CDK 0.20.4 (airbytehq#35233)

* source-dynamodb: adopt CDK 0.20.4 (airbytehq#35232)

* source-e2e-test: adopt CDK 0.20.4 (airbytehq#35231)

* source-elasticsearch: adopt CDK 0.20.4 (airbytehq#35230)

* source-kafka: adopt CDK 0.20.4 (airbytehq#35229)

* source-oracle: adopt CDK 0.20.4 (airbytehq#35225)

* source-redshift: adopt CDK 0.20.4 (airbytehq#35223)

* source-scaffold-java-jdbc: adopt CDK 0.20.4 (airbytehq#35222)

* source-sftp: adopt CDK 0.20.4 (airbytehq#35221)

* source-snowflake: adopt CDK 0.20.4 (airbytehq#35220)

* source-teradata: adopt CDK 0.20.4 (airbytehq#35219)

* source-tidb: adopt CDK 0.20.4 (airbytehq#35218)

* Throw cdc cursor error

* Revert bad commit

* AirbyteLib: suppress duckdb reflection warnings (airbytehq#35300)

* Source Google Ads: temporary patch to avoid 500 Internal server error (airbytehq#35280)

* 🐛 python cdk: mask oauth access key (airbytehq#34931)

* 🤖 Bump patch version of Python CDK

* Emit multiple error trace messages and continue syncs by default (airbytehq#35129)

* 🤖 Bump minor version of Python CDK

* ✨Source Amazon Seller Partner: add `VendorOrders` stream (airbytehq#35273)

* File-based CDK: enqueue AirbyteMessage of type record instead of sending to the message repository (airbytehq#35318)

* 🤖 Bump patch version of Python CDK

* 🚨🚨🐛 Source Gitlab fix merge_request_commits stream (airbytehq#34548)

* java CDK: improve blobstore module structure (airbytehq#35285)

* source-mysql: add and adopt TestDatabaseWithInvalidDatabaseName (airbytehq#35210)

* ✨ Source File: support ZIP file (airbytehq#32354)

Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
Co-authored-by: Serhii Lazebnyi <serhii.lazebnyi@globallogic.com>

* destination-async-framework: move the state emission logic into GlobalAsyncStateManager (airbytehq#35240)

* 🐛 Source Harvest: Fix pendulum parsing error (airbytehq#35305)

Co-authored-by: Christo Grabowski <108154848+ChristoGrab@users.noreply.github.com>

* ✨ Source GitHub: updating branches schema and unpin on cloud (airbytehq#35271)

Co-authored-by: maxi297 <maxime@airbyte.io>
Co-authored-by: Maxime Carbonneau-Leclerc <3360483+maxi297@users.noreply.github.com>

* AirbyteLib: Fix no-such-table-error (airbytehq#35311)

Co-authored-by: Bindi Pankhudi <bindi@airbyte.com>
Co-authored-by: Aaron Steers <aj@airbyte.io>

* 📝 add instructions for soft reset (airbytehq#35335)

* [source-postgres] Add test for legacy version of postgres (airbytehq#35329)

* Source Klaviyo: added transform config for profile stream (airbytehq#35336)

* 🏥 Source Hubspot: updated marketing emails schema and expected records (airbytehq#35328)

* gradle: split off python cdk (airbytehq#35306)

* gradle: overall simplification (airbytehq#35307)

* docs: typos (airbytehq#35302)

* Docs: Update stripe.md (airbytehq#35142)

* Test PR to check Slack notifications (airbytehq#35363)

* airbyte-ci: remove reference to buildConnectorImage (airbytehq#35364)

* Source S3: revert rollback to 4.4.1 (airbytehq#35055)

Co-authored-by: Augustin <augustin@airbyte.io>

* 🐛 Source OpsGenie: fix parsing of updated_at timestamps from OpsGenie (airbytehq#35269)

Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* Archive `destination-kvdb` (airbytehq#35370)

* Add `archived` as connector support level (airbytehq#35355)

* Remove `octavia-cli` (airbytehq#33950)

* Docs: update k8s instructions for upgrade (airbytehq#35108)

* Destination redshift: delete some unused files (airbytehq#35314)

* re-add destination-kvdb as archived connector (airbytehq#35377)

* destination-kvdb - publish for real (airbytehq#35379)

* Support user-specified test read limits in `connector_builder` code (airbytehq#35312)

* 🤖 Bump patch version of Python CDK

* destination-kvdb bump to publish (airbytehq#35381)

* ✨ Source Paypal Transactions: Siver Certification  (airbytehq#34510)

Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
Co-authored-by: Augustin <augustin@airbyte.io>

* Revamp QA checks into a battery included package (airbytehq#35322)

* 🏥 Source Pinterest: updated expected records (airbytehq#35353)

* .github: fix python CDK publish (airbytehq#35391)

* 🐛 Source Amazon Seller Partner: Fix check for Vendor accounts (airbytehq#35331)

* doc: Document our connectors QA checks (airbytehq#35324)

* airbyte-ci: use connectors-qa instead of connector_ops.qa_check (airbytehq#35325)

* Update `metadata-service` to latest version + docs (airbytehq#35419)

* Bump destination-kvdb again to test metadata for archival (airbytehq#35422)

* connectors_qa: make `CheckPublishToPyPiIsEnabled` only run on source connectors (airbytehq#35426)

* gradle: remove archived connectors (airbytehq#35423)

* ✨Source Facebook Marketing: add integration tests (airbytehq#35061)

* Delete `requirements.txt` on poetry managed connectors (airbytehq#35406)

* update doc to reference poetry (airbytehq#35414)

* 🧹 remove qa_checks.py (airbytehq#35434)

* connectors-qa: fix connector type attribute access (airbytehq#35435)

* java-connectors: add thread name as part of the log message (airbytehq#35199)

* doc: remove Node requirements on config based getting started tutorial (airbytehq#35436)

* airbyte-ci: disable telemetry with env var (airbytehq#35438)

* airbyte-ci: disable a flaky test (airbytehq#35418)

* ci: check for required reviewers on destinations (airbytehq#35428)

* destination-kvdb QA checks (airbytehq#35424)

Co-authored-by: Augustin <augustin@airbyte.io>

* Add destination-kvdb to OSS registry (airbytehq#35444)

* Normalization logs: remove json parse warnings (airbytehq#34978)

* Support archived connectors in Docs (airbytehq#35374)

* remove destination-kvdb one more time (airbytehq#35382)

* [Source-Postgres] : Add config to throw an error on invalid CDC position (airbytehq#35304)

* java-cdk:remove unused class (airbytehq#35408)

* Source S3: add filter by start date (airbytehq#35392)

* Revert "Add destination-kvdb to OSS registry" (airbytehq#35453)

* airbyte-ci: do no run QA checks on publish - only MetadataValidation (airbytehq#35437)

Co-authored-by: Ella Rohm-Ensing <erohmensing@gmail.com>

* restore kvdb to state from airbytehq#35424 (airbytehq#35454)

* 🚨🚨 Source Facebook Marketing: Add statuses filters (airbytehq#32449)

Co-authored-by: Anatolii Yatsuk <tolikyatsuk@gmail.com>

* add proper logging to junit runs (airbytehq#35394)

Basically, Junit is not logging anything about its progress outside of the console. This is aimed at fixing that by outputting progress logs along with the standard logs. So there's going to be a line before each step of a test run, and a line after with the elapsed time. Also, exceptions are now part of the logs instead of being only part of the junit report.
In the process of doing that, I decided to clean up and simplify the log4j2.xml file.
I also noted a few issues with ANSI coloring, so there's a fix for that.
Finally, I'm removing empty lines from container logs (MSSQL is full of them).

The junit printing is done through an interceptor. That interceptor uses introspection. I wanted to use a factory method, but java's ServiceLoader only allows classes that extend the service interface, hence the need to override every method in the interceptor class, and to plop a proxy on top of that.

* Re-ignore documentation structure check for the time being (airbytehq#35458)

* [Source-mysql] : Add config to throw an error on invalid CDC position (airbytehq#35338)

* [Source-Mongodb] : Add config to throw an error on invalid CDC position (airbytehq#35375)

* pin to older version (airbytehq#35469)

* Update on-kubernetes-via-helm.md - Add GCS Logging steps (airbytehq#35455)

Co-authored-by: Sajarin <sajarindider@gmail.com>

* Airbyte CDK: add filter to RemoveFields (airbytehq#35326)

Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>

* 🤖 Bump minor version of Python CDK

* 🐛 Source Facebook Marketing: Fix error during transforming state (airbytehq#35467)

* .github: remove connector checklist (airbytehq#35484)

* connectors_qa: bump to 1.0.3 (airbytehq#35475)

* .github: tighter filtering for gradle workflow (airbytehq#35492)

* Airbyte docs: Fixed JSON schema rendering issues for dark mode (airbytehq#35489)

Co-authored-by: bindipankhudi <bindi@airbyte.com>

* Source Quickbooks: fix spec (airbytehq#35457)

* 🐛 Change null cursor value query to not use IIF sql function (airbytehq#35405)

* Source Google Ads: rollback patch 500 Internal Server Error (airbytehq#35493)

* Fix syntax error in `tools/bin/manage.sh`, used to publish airbyte cdk (airbytehq#35466)

* [DB sources] : Reduce CDC state compression limit to 1MB (airbytehq#35511)

* 🤖 Bump patch version of Python CDK

* Add ignore_stream_slicer_parameters_on_paginated_requests flag (airbytehq#35462)

* 🤖 Bump minor version of Python CDK

* Mangle unhandled MongoCommandException to prevent creating grouping o… (airbytehq#35526)

* .github: fix java cdk publish workflow (airbytehq#35533)

* [Source-mysql] : Adopt 0.21.4 and reduce cdc state compression threshold to 1MB (airbytehq#35525)

* 🏥 Source Notion: update stream schema (airbytehq#35409)

* airbyte-ci: make QA check work on strict-encrypt connectors (airbytehq#35536)

* Update docs to show archived information if connector is not in registries (airbytehq#35468)

* 🐛 Source Facebook Marketing: Add missing config migration (airbytehq#35539)

* docs: update ALB configuration docs for exposing API (airbytehq#35520)

* chore: remove upgrading-airbyte.md (airbytehq#35545)

* 📚 Add documentation for Entra ID (airbytehq#34569)

* Bump Airbyte version from 0.50.50 to 0.50.51

* gradle.yml: use a smaller runner (airbytehq#35547)

* airbyte-ci: augment the report for java connectors (airbytehq#35317)

Today we're missing the logs (both JVM and container logs) in java connector reports.
This is creating a link to test artifacts. In the CI, the link will point to a zip file, while on a local run, it will point to a directory.

In addition, we recently added the junit XML inlined with the test standard output and error, but that didn't really work as well as we'd hoped: The reports were slow to load, they were not ordered by time, the corresponding logs were lacking. There's still a possibility they'll be useful, so rather than removing them altogether, they will be bundled in the log zip (or directory).

I'm also adding a button to copy the standard output or the standard error from a step into the clipboard.
Finally, I'm reducing the max vertical size of an expanded step, so it doesn't go over 70%, which seems much cleaner to me.

Here's an example of the result (from the child PR): https://storage.cloud.google.com/airbyte-ci-reports-multi/airbyte-ci/connectors/test/pull_request/stephane_02-09-add_background_thread_to_track_mssql_container_status/1708056420/d4683bfb7f90675c6b9e7c6d4bbad3f98c7a7550/source-mssql/3.7.0/output.html

* Source SalesForce: Add Stream Slice Step option to specification (airbytehq#35421)

Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>

* Destination Clickhouse - 1.0, remove normalization (airbytehq#34637)

Co-authored-by: Aaron ("AJ") Steers <aj@airbyte.io>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: Obioma Anomnachi <onanomnachi@gmail.com>
Co-authored-by: Anatolii Yatsuk <35109939+tolik0@users.noreply.github.com>
Co-authored-by: Maxime Carbonneau-Leclerc <3360483+maxi297@users.noreply.github.com>
Co-authored-by: maxi297 <maxi297@users.noreply.github.com>
Co-authored-by: Ryan Waskewich <156025126+rwask@users.noreply.github.com>
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
Co-authored-by: Marius Posta <marius@airbyte.io>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: SatishChGit <satishchinthanippu@gmail.com>
Co-authored-by: evantahler <evan@airbyte.io>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: Anton Karpets <anton.karpets@globallogic.com>
Co-authored-by: Christo Grabowski <108154848+ChristoGrab@users.noreply.github.com>
Co-authored-by: Akash Kulkarni <akash@airbyte.io>
Co-authored-by: Akash Kulkarni <113392464+akashkulk@users.noreply.github.com>
Co-authored-by: Gireesh Sreepathi <gisripa@gmail.com>
Co-authored-by: Artem Inzhyyants <36314070+artem1205@users.noreply.github.com>

* Airbyte CDK: add interpolation for request options (airbytehq#35485)

Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>

* 🤖 Bump minor version of Python CDK

* Handle seeing uncompressed sendgrid contact data (airbytehq#35343)

* gradle.yml: use XXL runners but only if gradle related files are changed (airbytehq#35548)

* ✨ [greenhouse] [iterable] [linkedin-ads] [paypal-transactions] [pinterest] Bump cdk versions for to use continue on stream per-error reporting (airbytehq#35465)

* Airbyte CDK: add CustomRecordFilter (airbytehq#35283)

Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>

* 🤖 Bump minor version of Python CDK

* Do not add connector header to source and destination index pages (airbytehq#35553)

* gradle.yml: fix path filters (airbytehq#35554)

* Source Monday: fix gql query to support inline fragment value for the Items stream (airbytehq#35506)

* gradle.yml: checkout the repo when not PR trigger (airbytehq#35558)

* airbyte-cdk [python]: re-enable tests in CI (airbytehq#35560)

Co-authored-by: Marius Posta <marius@airbyte.io>

* ✨ [source-mssql] skip sql server agent check if EngineEdition == 8 (airbytehq#35368)

* push new source-mssql version (airbytehq#35564)

* Destinations CDK: Refactor T+D to gather required world state upfront (airbytehq#35342)

Signed-off-by: Gireesh Sreepathi <gisripa@gmail.com>

* .github: fix python_cdk_tests.yml (airbytehq#35567)

* Bump Airbyte version from 0.50.51 to 0.50.52

* add entry into JAVA_OPTS to always select log4j2.xml as our logger configuration (airbytehq#35569)

* destination-s3: bump patch version following airbytehq#35569 (airbytehq#35576)

Co-authored-by: Stephane Geneix <stephane@airbyte.io>

* destination-snowflake: bump patch version following airbytehq#35569 (airbytehq#35575)

Co-authored-by: Stephane Geneix <stephane@airbyte.io>

* destination-bigquery: bump patch version following airbytehq#35569 (airbytehq#35574)

Co-authored-by: Stephane Geneix <stephane@airbyte.io>

* source-mysql: bump patch version following airbytehq#35569 (airbytehq#35573)

Co-authored-by: Stephane Geneix <stephane@airbyte.io>

* source-postgres: bump patch version following airbytehq#35569 (airbytehq#35572)

Co-authored-by: Stephane Geneix <stephane@airbyte.io>

* source-mongodb-v2: bump patch version following airbytehq#35569 (airbytehq#35571)

Co-authored-by: Stephane Geneix <stephane@airbyte.io>

* airbyte-ci-test.yml: only run if modified internal poetry packages (airbytehq#35551)

* airbyte-ci-test.yml: checkout repo for path filters when not on PR (airbytehq#35577)

* connectors-ci: early exit when no connector changes (airbytehq#35578)

* Microsoft Entra ID for Self-Managed Enterprise (airbytehq#35585)

* Improve documentation on check command (airbytehq#35542)

Co-authored-by: Ella Rohm-Ensing <erohmensing@gmail.com>

* 🐛 Source S3: fix exception when setting CSV stream delimiter to `\t`. (airbytehq#35246)

Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* 🐛 Source BigQuery: fix error with RECORD REPEATED fields  (airbytehq#35503)

Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>

* re-release source mssql with logger fixes (airbytehq#35596)

* Source File: change header=0 to header=null in docs (airbytehq#35595)

CI tests failed because the version was not incremented, despite only a single line being altered in the documentation. This change is minor and can be safely merged.

* Changed tag to low code (airbytehq#35594)

CI tests failed because the version was not incremented. This change is minor and can be safely merged.

* Bump Airbyte version from 0.50.52 to 0.50.53

* Destination Postgres: CDK T+D initial state gathering (airbytehq#35385)

Signed-off-by: Gireesh Sreepathi <gisripa@gmail.com>

* Destination Snowflake: CDK T+D initial state refactor (airbytehq#35456)

Signed-off-by: Gireesh Sreepathi <gisripa@gmail.com>

* Destination Redshift: CDK T+D initial state refactor (airbytehq#35354)

Signed-off-by: Gireesh Sreepathi <gisripa@gmail.com>

* delete metadata checks workflow (airbytehq#35580)

* Source Recurly: Enable in registries with updated CDK (airbytehq#34622)

* reduce interrupt and shutdown delays to 1 minute and 2 minutes when stopping a connector (initially set at 60 minutes and 70 minutes) (airbytehq#35527)

Fixes airbytehq#32348
Discussed here: https://airbytehq-team.slack.com/archives/C02U2SSHP9S/p1708552465201999

* Docs: Add deprecation notices to sunsetting connectors (airbytehq#35446)

* Cleaned up PyAirbyte docs (airbytehq#35603)

Co-authored-by: bindipankhudi <bindi@airbyte.com>

* Source S3: run incremental syncs with concurrency (airbytehq#34895)

* old commits added

* add file location in output stream

* file docker file

* docker file version change

* pgp docker file

* fix

* Bump gnupg version and pgp decryption changes

* fix bug

* fix: discover dtype issue and test cases added

* added files

---------

Signed-off-by: Artem Inzhyyants <artem.inzhyyants@gmail.com>
Signed-off-by: Gireesh Sreepathi <gisripa@gmail.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: Baz <oleksandr.bazarnov@globallogic.com>
Co-authored-by: Artem Inzhyyants <36314070+artem1205@users.noreply.github.com>
Co-authored-by: Subodh Kant Chaturvedi <subodh1810@gmail.com>
Co-authored-by: Xiaohan Song <xiaohan@airbyte.io>
Co-authored-by: Aaron ("AJ") Steers <aj@airbyte.io>
Co-authored-by: Tim Roes <tim@airbyte.io>
Co-authored-by: benmoriceau <benmoriceau@users.noreply.github.com>
Co-authored-by: Gireesh Sreepathi <gisripa@gmail.com>
Co-authored-by: Marius Posta <marius@airbyte.io>
Co-authored-by: Evan Tahler <evan@airbyte.io>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>
Co-authored-by: Anton Karpets <anton.karpets@globallogic.com>
Co-authored-by: Patrick Nilan <nilan.patrick@gmail.com>
Co-authored-by: Akash Kulkarni <113392464+akashkulk@users.noreply.github.com>
Co-authored-by: Tyler B <104733644+tybernstein@users.noreply.github.com>
Co-authored-by: bgroff <bgroff@users.noreply.github.com>
Co-authored-by: mjgatz <86885812+mjgatz@users.noreply.github.com>
Co-authored-by: mgreene <michael.greene@gravie.com>
Co-authored-by: Serhii Lazebnyi <serhii.lazebnyi@globallogic.com>
Co-authored-by: Serhii Lazebnyi <53845333+lazebnyi@users.noreply.github.com>
Co-authored-by: Rodi Reich Zilberman <867491+rodireich@users.noreply.github.com>
Co-authored-by: Daryna Ishchenko <80129833+darynaishchenko@users.noreply.github.com>
Co-authored-by: Stephane Geneix <147216312+stephane-airbyte@users.noreply.github.com>
Co-authored-by: Joe Reuter <joe@airbyte.io>
Co-authored-by: Marcos Marx <marcosmarxm@users.noreply.github.com>
Co-authored-by: Maxime Carbonneau-Leclerc <3360483+maxi297@users.noreply.github.com>
Co-authored-by: Akash Kulkarni <akash@airbyte.io>
Co-authored-by: Roman Yermilov [GL] <86300758+roman-yermilov-gl@users.noreply.github.com>
Co-authored-by: Alexandre Girard <alexandre@airbyte.io>
Co-authored-by: girarda <girarda@users.noreply.github.com>
Co-authored-by: Brian Lai <51336873+brianjlai@users.noreply.github.com>
Co-authored-by: brianjlai <brianjlai@users.noreply.github.com>
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
Co-authored-by: midavadim <midavadim@yahoo.com>
Co-authored-by: Julien COUTAND <julien.coutand@gmail.com>
Co-authored-by: Christo Grabowski <108154848+ChristoGrab@users.noreply.github.com>
Co-authored-by: maxi297 <maxime@airbyte.io>
Co-authored-by: Bindi Pankhudi <bindi@airbyte.io>
Co-authored-by: Bindi Pankhudi <bindi@airbyte.com>
Co-authored-by: Ben Drucker <bvdrucker@gmail.com>
Co-authored-by: TornadoContre <37258495+TornadoContre@users.noreply.github.com>
Co-authored-by: Natik Gadzhi <natik@respawn.io>
Co-authored-by: Thomas Dippel <dipth@users.noreply.github.com>
Co-authored-by: marcosmarxm <marcosmarxm@gmail.com>
Co-authored-by: Alex Birdsall <ambirdsall@gmail.com>
Co-authored-by: ambirdsall <ambirdsall@users.noreply.github.com>
Co-authored-by: Jose Gerardo Pineda <jose.pineda@airbyte.io>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
Co-authored-by: Anatolii Yatsuk <35109939+tolik0@users.noreply.github.com>
Co-authored-by: Pedro S. Lopez <pedroslopez@me.com>
Co-authored-by: Ella Rohm-Ensing <erohmensing@gmail.com>
Co-authored-by: Siarhei Ivanou <sinusu@gmail.com>
Co-authored-by: Anatolii Yatsuk <tolikyatsuk@gmail.com>
Co-authored-by: Ryan Waskewich <156025126+rwask@users.noreply.github.com>
Co-authored-by: Sajarin <sajarindider@gmail.com>
Co-authored-by: artem1205 <artem1205@users.noreply.github.com>
Co-authored-by: perangel <perangel@gmail.com>
Co-authored-by: Joe Bell <joseph.bell@airbyte.io>
Co-authored-by: Obioma Anomnachi <onanomnachi@gmail.com>
Co-authored-by: maxi297 <maxi297@users.noreply.github.com>
Co-authored-by: SatishChGit <satishchinthanippu@gmail.com>
Co-authored-by: Brian Leonard <brian@bleonard.com>
Co-authored-by: David Wallace <dwallace0723@gmail.com>
Co-authored-by: pmossman <pmossman@users.noreply.github.com>
Co-authored-by: Stephane Geneix <stephane@airbyte.io>
Co-authored-by: Alexandre Cuoci <Hesperide@users.noreply.github.com>
Co-authored-by: Danny Tiesling <tiesling@gmail.com>
Co-authored-by: Marco Fontana <MaxwellJK@users.noreply.github.com>
Co-authored-by: rishabh-cldcvr <rishabh@cldcvr.com>
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
FVidalCarneiro pushed a commit to AgiData/airbyte that referenced this pull request Feb 27, 2024