airbyte-ci: run incremental acceptance tests step for community connectors #39363
Conversation
@alafanechere what about the caching implementation makes it per-PR? Is this something that could actually be cached if multiple devs are developing against the same connector? What about a stack of PRs affecting the same connector? Does it get screwed up by force pushes at all? Just trying to understand the inner workings of it. If it's just "somewhere in the cache we have results for this version of this connector", that would be awesome. But I know you also mentioned it's running the same version of CAT as on the branch (also awesome, but obviously a different cache implementation). Is it CAT version + released connector version?
```python
async def get_result_log_on_master(self) -> Artifact:
    """Runs acceptance tests on the released image of the connector and returns the report log.

    The released image version is fetched from the master metadata file of the connector.
    We're not using the online connector registry here as some connectors might not be released to OSS nor Airbyte Cloud.
    """
```
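To make the "fetch from the master metadata file" idea concrete, here is an illustrative sketch of pulling the released image tag out of a connector's `metadata.yaml`. The `dockerImageTag` and `dockerRepository` field names are real Airbyte metadata fields, but the helper itself (`released_image_tag`) is hypothetical and not the PR's actual implementation.

```python
import re

# Hypothetical helper: extract the released dockerImageTag from a connector's
# metadata.yaml document as fetched from the master branch. A real
# implementation would use a YAML parser; a regex keeps this sketch stdlib-only.
def released_image_tag(metadata_yaml: str) -> str:
    match = re.search(
        r'^\s*dockerImageTag:\s*"?([0-9][\w.\-]*)"?\s*$',
        metadata_yaml,
        re.MULTILINE,
    )
    if not match:
        raise ValueError("dockerImageTag not found in metadata")
    return match.group(1)

METADATA = """\
data:
  dockerRepository: airbyte/source-faker
  dockerImageTag: 6.0.2
"""
print(released_image_tag(METADATA))  # -> 6.0.2
```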
Not a now problem: we've run into this before. This tells me we need a registry of all of our connectors (including archived, unreleased, etc.) as well as the ones we use to power our product. This seems important for docs too, as I think for that we currently join the OSS and Cloud registries into a master list.
```python
    artifacts=[report_log_artifact],
    consider_in_overall_status=consider_result_in_overall_status,
)
```
```python
class IncrementalAcceptanceTests(Step):
```
The naming of this is great! I like that it conveys that "the acceptance tests failed" (`Ignored - Failed`) and "the incremental acceptance tests succeeded", and the latter is what we're making decisions based on.
```python
if self.context.connector.metadata.get("supportLevel") == "certified":
    return StepResult(
        step=self, status=StepStatus.SKIPPED, stdout="The connector is certified, skipping incremental acceptance tests."
    )
```
This feels like logic that depends on the logic here but is encoded in different places. Can we find a shared mechanism for that logic (maybe we check if the acceptance tests are a soft fail before running the incremental acceptance tests) or at least try to move them next to one another so we know they should be considered in tandem?
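One shape such a shared mechanism could take is a single predicate that both the soft-fail decision and the incremental-test gate call. This is purely illustrative; `is_soft_fail_connector` is a hypothetical name, not code from the PR.

```python
# Hypothetical shared predicate: one place that decides whether a connector's
# acceptance test failures are "soft" (ignorable). Both the CAT soft-fail logic
# and the IncrementalAcceptanceTests skip check could call this.
def is_soft_fail_connector(metadata: dict) -> bool:
    # Community connectors soft-fail; certified ones must pass outright.
    return metadata.get("supportLevel") != "certified"

print(is_soft_fail_connector({"supportLevel": "community"}))  # True
print(is_soft_fail_connector({"supportLevel": "certified"}))  # False
```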
```python
current_failing_nodes = await self.get_failed_pytest_node_ids(current_acceptance_tests_report_log)
if not current_failing_nodes:
    return StepResult(step=self, status=StepStatus.SUCCESS, stdout="No failing acceptance tests were detected.")
```
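For context on what `get_failed_pytest_node_ids` would have to do: a pytest report log (as produced by the `pytest-reportlog` plugin) is a JSONL file whose `TestReport` entries carry a `nodeid` and an `outcome`. The `$report_type`/`nodeid`/`outcome` fields are the real reportlog schema; the helper below is an illustrative sketch, not the PR's code.

```python
import json

# Sketch: collect the node ids of failed tests from a pytest-reportlog JSONL
# document. Each line is a JSON object; TestReport entries with outcome
# "failed" identify failing tests.
def failed_node_ids(report_log: str) -> set:
    failed = set()
    for line in report_log.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry.get("$report_type") == "TestReport" and entry.get("outcome") == "failed":
            failed.add(entry["nodeid"])
    return failed

LOG = "\n".join([
    '{"$report_type": "TestReport", "nodeid": "test_core.py::test_check", "when": "call", "outcome": "failed"}',
    '{"$report_type": "TestReport", "nodeid": "test_core.py::test_spec", "when": "call", "outcome": "passed"}',
])
print(sorted(failed_node_ids(LOG)))  # ['test_core.py::test_check']
```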
Should this maybe be `SKIPPED`? We didn't actually run the tests/comparison; we skipped it because we deemed it unnecessary since the previous step succeeded.
```python
f"""
No new failing acceptance tests were detected.
Acceptance tests are still failing with {len(current_failing_nodes)} failing tests.
This connector is not certified so we are not enforcing a clean slate, but please check out the original acceptance tests failures and assess how critical they are.
"""
```
Same note about coding the logic of "this connector is not certified" in many different places
```python
stdout=f"{len(new_failing_nodes)} new failing acceptance tests detected:\n-"
+ "\n-".join(current_failing_nodes)
+ "\nPlease fix the new failing tests before merging this PR.",
)
```
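The comparison this step performs boils down to a set difference: failures on the branch minus failures on master are the "new" failures that should block the PR; the intersection is the pre-existing failures the reviewer suggests also surfacing. A minimal sketch (`new_failures` is a hypothetical name):

```python
# Sketch of the incremental comparison: only failures introduced by the branch
# count against the PR; pre-existing master failures are reported but tolerated.
def new_failures(branch_failing: set, master_failing: set) -> set:
    return branch_failing - master_failing

master_failing = {"test_core.py::test_check", "test_core.py::test_discover"}
branch_failing = {"test_core.py::test_check", "test_core.py::test_read"}
print(sorted(new_failures(branch_failing, master_failing)))   # ['test_core.py::test_read']
print(sorted(branch_failing & master_failing))                # ['test_core.py::test_check'] (still failing, tolerated)
```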
Would add the same note above about the nodes which are still failing (but don't need to be fixed)
Yes, that's basically it. Sorry for the "per PR" confusion. What is cached is the CAT inputs we provide here. If the released version is the same we will get the same value for
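The consequence of caching on inputs, as described above, is that the cache key is effectively "CAT version + released connector image": any PR or dev branch with the same pair hits the same cached results, regardless of force pushes. The sketch below is a hypothetical illustration of that property, not the actual (Dagger-layer-based) cache.

```python
import hashlib

# Hypothetical cache-key sketch: the key depends only on the CAT version and
# the released connector image, so it is stable across PRs, devs, and force
# pushes as long as those two inputs are unchanged.
def cat_cache_key(cat_version: str, released_image: str) -> str:
    digest = hashlib.sha256(f"{cat_version}:{released_image}".encode()).hexdigest()
    return digest[:16]

key_a = cat_cache_key("3.9.0", "airbyte/source-faker:6.0.2")
key_b = cat_cache_key("3.9.0", "airbyte/source-faker:6.0.2")
key_c = cat_cache_key("3.9.0", "airbyte/source-faker:6.0.3")
print(key_a == key_b)  # True: same inputs, same cache entry
print(key_a == key_c)  # False: new release invalidates the cache
```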
…39361)

## What

We decided that it was fine to make CI pass on community connectors if their acceptance test results are not made worse than what they are on `master`. Detection of "not made worse" is implemented in #39363.

To do so we have to change the CI logic slightly to allow CAT to fail on community connectors while still emitting a 🟢 CI status. It'll allow keeping the original CAT results for debugging without failing CI if they are not passing.

## How

1. Introduce a `consider_result_in_overall_status` flag at the `StepResult` level. It allows a `Step` to emit a `StepResult` which won't be considered in the overall computation of success / failure for a pipeline.
2. Update the HTML report to show when a step was ignored from overall results.
3. Change the definition of `success` at the `Report` level to integrate this new `consider_result_in_overall_status` flag.
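The three steps above can be sketched together as follows. The dataclass shape and field names are illustrative (modeled on the flag the commit message describes), not the actual airbyte-ci classes.

```python
from dataclasses import dataclass
from enum import Enum

class StepStatus(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    SKIPPED = "skipped"

# Illustrative StepResult carrying the new flag: ignored results keep their
# status for reporting/debugging but are excluded from the success computation.
@dataclass
class StepResult:
    name: str
    status: StepStatus
    consider_in_overall_status: bool = True

# Illustrative Report-level success: only considered results can fail the pipeline.
def report_success(results) -> bool:
    considered = [r for r in results if r.consider_in_overall_status]
    return all(r.status is not StepStatus.FAILURE for r in considered)

results = [
    StepResult("AcceptanceTests", StepStatus.FAILURE, consider_in_overall_status=False),
    StepResult("IncrementalAcceptanceTests", StepStatus.SUCCESS),
]
print(report_success(results))  # True: the ignored CAT failure doesn't fail CI
```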
Suspect Issues: this pull request was deployed and Sentry observed the following issues:
## What

When a connector under test is a community connector, we want to fail acceptance tests only if the current branch is adding new test failures compared to the CAT results on the released image.

## How

Introduce an `IncrementalAcceptanceTests` step which runs after `AcceptanceTests`. It runs `AcceptanceTests` on the released image of the connector.

### Some words about the "cost" of running acceptance tests twice

Acceptance tests will run on the connector container of the branch and on a container built from the latest released image. You could consider that running acceptance tests twice is costly, but:

- `IncrementalAcceptanceTests` only runs for community connectors; it is skipped for certified ones, so it's not adding overhead for Airbyters maintaining certified connectors.
- `master`.