Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

airbyte-ci: run incremental acceptance tests step for community connectors #39363

Conversation

alafanechere
Copy link
Contributor

@alafanechere alafanechere commented Jun 10, 2024

What

When a connector under test is a community connector we want to fail acceptance tests only if the current branch is adding new test failures compared to CAT results in the released image.

How

  • Introduce a IncrementalAcceptanceTests step which runs after AcceptanceTests
  • It runs AcceptanceTests on the released image of the connector.
  • It compares the test reports from the released version vs. the current branch version and fails if new failing tests are detected.

Some words about the "cost" of running acceptance tests twice

Acceptance tests will run on the connector container of the branch and on a container build from the latest released image.
You can consider acceptance tests is run twice and this is costly, but:

  • Subsequent runs of acceptance tests on the released image will be cached by Dagger, it should lead to 1 run of CAT on released image per PRs.
  • It's a good way to make sure we're comparing apple to apples: instead of pulling a previously generated test report on a potentially different version of CAT we use the exact same test suite on both versions.
  • This IncrementalAcceptanceTests only runs for community connectors, it is skipped for certified ones, so it's not adding an overhead for Airbyters maintaining certified connecors.
  • It's a good way to run "just in time" tests instead of scheduling costly nightly / weekly builds on all community connectors on master.

Copy link

vercel bot commented Jun 10, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Jun 11, 2024 2:05pm

@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from 1db035d to bc78187 Compare June 10, 2024 13:05
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch from a0d75dc to 51837c8 Compare June 10, 2024 13:05
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from bc78187 to 6a006a5 Compare June 10, 2024 13:21
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch from 51837c8 to 3ab9e17 Compare June 10, 2024 13:21
@alafanechere alafanechere marked this pull request as ready for review June 10, 2024 13:22
@alafanechere alafanechere requested a review from a team as a code owner June 10, 2024 13:22
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from 6a006a5 to 51799c8 Compare June 10, 2024 13:52
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch from 3ab9e17 to 3a6dd9c Compare June 10, 2024 13:52
@alafanechere alafanechere mentioned this pull request Jun 10, 2024
2 tasks
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch from 3a6dd9c to 7b1c6be Compare June 10, 2024 14:57
@erohmensing
Copy link
Contributor

Subsequent runs of acceptance tests on the released image will be cached by Dagger, it should lead to 1 run of CAT on released image per PRs.

@alafanechere what about the caching implementation makes it per PR? is this something might could actually be cached if multiple devs are developing against the same connector? What about a stack of PRs affecting the same connector? Does it get screwed up with force pushes at all?

Just trying to understand the inner workings of it. If it's just "somewhere in the cache we have results for this version of this connector", that would be awesome. But I know you also mentioned it's running the same version of CAT as on the branch (also awesome, but obviously a different cache implementation). Is it CAT version + released connector version?

async def get_result_log_on_master(self) -> Artifact:
"""Runs acceptance test on the released image of the connector and returns the report log.
The released image version is fetched from the master metadata file of the connector.
We're not using the online connector registry here as some connectors might not be released to OSS nor Airbyte Cloud.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a now problem: we've run into this before. This tells me we need a registry of all of our connectors (including archived, unreleased etc) as well as the ones we use to power our product. This seems important for docs too, as I think for that we currently join the OSS and cloud ones into a master list

artifacts=[report_log_artifact],
consider_in_overall_status=consider_result_in_overall_status,
)


class IncrementalAcceptanceTests(Step):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The naming off this is great! I like that it conveys that "The acceptance tests failed" (Ignored - Failed) and "the incremental acceptance tests succeeded", and the latter is what we're making decisions based on.

Comment on lines 396 to 408
if self.context.connector.metadata.get("supportLevel") == "certified":
return StepResult(
step=self, status=StepStatus.SKIPPED, stdout="The connector is certified, skipping incremental acceptance tests."
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like logic that depends on the logic here but is encoded in different places. Can we find a shared mechanism for that logic (maybe we check if the acceptance tests are a soft fail before running the incremental acceptance tests) or at least try to move them next to one another so we know they should be considered in tandem?


current_failing_nodes = await self.get_failed_pytest_node_ids(current_acceptance_tests_report_log)
if not current_failing_nodes:
return StepResult(step=self, status=StepStatus.SUCCESS, stdout="No failing acceptance tests were detected.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this maybe be SKIPPED? We didn't actually run the tests/comparison, we've skipped it because we deemed it unnecessary since the previous step succeeded

f"""
No new failing acceptance tests were detected.
Acceptance tests are still failing with {len(current_failing_nodes)} failing tests.
This connector is not certified so we are not enforcing a clean slate, but please checkout the original acceptance tests failures and assess how critical they are.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same note about coding the logic of "this connector is not certified" in many different places

Comment on lines 423 to 439
stdout=f"{len(new_failing_nodes)} new failing acceptance tests detected:\n-"
+ "\n-".join(current_failing_nodes)
+ "\nPlease fix the new failing tests before merging this PR.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would add the same note above about the nodes which are still failing (but don't need to be fixed)

@octavia-squidington-iv octavia-squidington-iv requested a review from a team June 10, 2024 20:43
@alafanechere
Copy link
Contributor Author

@erohmensing

If it's just "somewhere in the cache we have results for this version of this connector"

Yes it's basically it. Sorry for the "per PR confusion". What is cached is the CAT inputs we provide here. If the released version is the same we will get the same value for connector_container_id.
We implemented a daily cache buster strategy for CAT so the results will not be cached forever, we did it to make nightly builds catch API changes.

@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from 51799c8 to 46db2da Compare June 11, 2024 10:49
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch 2 times, most recently from 369c6cf to e31fa6a Compare June 11, 2024 11:09
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from 46db2da to ff30368 Compare June 11, 2024 12:06
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch from e31fa6a to bb03937 Compare June 11, 2024 12:06
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors branch from ff30368 to 8d9929b Compare June 11, 2024 12:37
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch from bb03937 to 4641a89 Compare June 11, 2024 12:39
Copy link
Contributor Author

alafanechere commented Jun 11, 2024

Merge activity

  • Jun 11, 9:00 AM EDT: @alafanechere started a stack merge that includes this pull request via Graphite.
  • Jun 11, 10:05 AM EDT: Graphite rebased this pull request as part of a merge.
  • Jun 11, 10:27 AM EDT: @alafanechere merged this pull request with Graphite.

@alafanechere alafanechere changed the base branch from augustin/06-10-airbyte-ci_ignore_CAT_results_when_running_on_community_connectors to graphite-base/39363 June 11, 2024 13:40
alafanechere added a commit that referenced this pull request Jun 11, 2024
…39361)

## What
We decided that it was fine to make CI pass on community connectors if their acceptance tests results are not made worse than what they are on `master`.

**Detection of "not made worse" is implemented in #39363


To do so we have to change the CI logic slightly to allow CAT to fail on community connector while still emitting a 🟢 CI status.

It'll allow keeping the original CAT results for debugging but not fail CI if they are not passing.

## How
1. Introduce a `consider_result_in_overall_status` flag at the `StepResult` level.
It allows `Step` to emit `StepResult` which won't be considered in the overall computation of success / failure for a pipeline.

2. Update the HTML report to show when a step was ignored from overall results.
3. Change the definition of `success` at the `Report` level to integrate this new `consider_result_in_overall_status` flag.
@alafanechere alafanechere changed the base branch from graphite-base/39363 to master June 11, 2024 14:02
@alafanechere alafanechere force-pushed the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch from 4641a89 to ad9072b Compare June 11, 2024 14:04
@alafanechere alafanechere merged commit 8fd3a3c into master Jun 11, 2024
29 checks passed
@alafanechere alafanechere deleted the augustin/06-10-airbyte-ci_run_incremental_acceptance_tests_step branch June 11, 2024 14:27
Copy link

sentry-io bot commented Jun 11, 2024

Suspect Issues

This pull request was deployed and Sentry observed the following issues:

  • ‼️ QueryError: file size 150516934 exceeds limit 134217728 dagger.client._core in execute View Issue
  • ‼️ TypeError: 'NoneType' object is not subscriptable pipelines.airbyte_ci.connectors.test.steps.comm... View Issue
  • ‼️ KeyError: 'data' pipelines.airbyte_ci.connectors.test.steps.comm... View Issue

Did you find this useful? React with a 👍 or 👎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants