Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Airbyte-ci: Add skippable connector test steps #32188

Merged
merged 39 commits into from
Dec 15, 2023
Merged
Show file tree
Hide file tree
Changes from 22 commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
f1732af
Remove fail fast
bnchrch Oct 20, 2023
866bc56
Remove fast test only
bnchrch Oct 20, 2023
5a1c1cc
Add new pipeline demo file
bnchrch Oct 20, 2023
27267a6
Remove code format checks
bnchrch Oct 20, 2023
3c0b584
Wrapped step
bnchrch Oct 20, 2023
bab78ab
Merge remote-tracking branch 'origin/master' into bnchrch/airbyte-ci/…
bnchrch Nov 1, 2023
2de8d43
Add get test step skeleton
bnchrch Nov 2, 2023
f90f002
Get runnable working
bnchrch Nov 3, 2023
dd4d275
Add skip support
bnchrch Nov 3, 2023
ab950d3
Add get_test_steps to java connector
bnchrch Nov 3, 2023
258b66d
Remove old code
bnchrch Nov 3, 2023
3a518ca
Remove outdatted methods
bnchrch Nov 3, 2023
be10290
Add tests
bnchrch Nov 3, 2023
3505170
Rename to run_step
bnchrch Nov 3, 2023
e59dc3c
Add paralell tests
bnchrch Nov 6, 2023
3e54bd4
Add concurrency tests
bnchrch Nov 6, 2023
ef6f26a
Remove commented out code
bnchrch Nov 6, 2023
5a61d50
Fix secrets
bnchrch Nov 6, 2023
bba7057
Get async args working
bnchrch Nov 6, 2023
0abd1eb
Add new type for results
bnchrch Nov 6, 2023
45a7e23
Add new tests
bnchrch Nov 6, 2023
fbcf237
Add step tree logging
bnchrch Nov 6, 2023
1821d91
Apply suggestions from code review
bnchrch Nov 7, 2023
5361936
Merge remote-tracking branch 'origin/master' into bnchrch/airbyte-ci/…
bnchrch Dec 8, 2023
7b61277
Code review comments
bnchrch Dec 9, 2023
1a6e3d5
Add Awaitable type
bnchrch Dec 9, 2023
7f293f3
Automated Commit - Formatting Changes
bnchrch Dec 9, 2023
31d98c9
Ensure --help works
bnchrch Dec 11, 2023
8a5bbb9
Add tests for skip steps
bnchrch Dec 11, 2023
5ac61e3
Parameterize invalid args
bnchrch Dec 11, 2023
f6ed2a5
Add concurrency support
bnchrch Dec 11, 2023
8f57adb
Fix mocker tests
bnchrch Dec 11, 2023
fa233f9
Merge remote-tracking branch 'origin/master' into bnchrch/airbyte-ci/…
bnchrch Dec 14, 2023
cddd975
Address PR comments
bnchrch Dec 14, 2023
4ffa61d
Fix secrets
bnchrch Dec 14, 2023
13349dd
improve Choice and skip step validation for better UX
alafanechere Dec 15, 2023
b99e39c
fix tests
alafanechere Dec 15, 2023
ffae7e2
Merge branch 'master' into bnchrch/airbyte-ci/options-for-steps
alafanechere Dec 15, 2023
485d0c4
Fix test
bnchrch Dec 15, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 5 additions & 4 deletions airbyte-ci/connectors/pipelines/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -52,8 +52,8 @@ If you face any installation problems feel free to reach out the Airbyte Connect
### Setting up connector secrets access

If you plan to use Airbyte CI to run CAT (Connector Acceptance Tests), we recommend setting up GSM
access so that Airbyte CI can pull remote secrets from GSM. For setup instructions, see the
CI Credentials package (which Airbyte CI uses under the hood) README's
access so that Airbyte CI can pull remote secrets from GSM. For setup instructions, see the
CI Credentials package (which Airbyte CI uses under the hood) README's
[Get GSM Access](https://github.com/airbytehq/airbyte/blob/master/airbyte-ci/connectors/ci_credentials/README.md#get-gsm-access)
instructions.

Expand Down Expand Up @@ -155,7 +155,7 @@ Available commands:
| `--enable-dependency-scanning / --disable-dependency-scanning` | False | ` --disable-dependency-scanning` | | When enabled the dependency scanning will be performed to detect the connectors to select according to a dependency change. |
| `--docker-hub-username` | | | DOCKER_HUB_USERNAME | Your username to connect to DockerHub. Required for the publish subcommand. |
| `--docker-hub-password` | | | DOCKER_HUB_PASSWORD | Your password to connect to DockerHub. Required for the publish subcommand. |


### <a id="connectors-list-command"></a>`connectors list` command
Retrieve the list of connectors satisfying the provided filters.
Expand Down Expand Up @@ -236,8 +236,8 @@ flowchart TD

| Option | Multiple | Default value | Description |
| ------------------- | -------- | ------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `--skip-step/-x` | True | | Skip steps by id e.g. `-x unit -x acceptance` |
| `--fail-fast` | False | False | Abort after any tests fail, rather than continuing to run additional tests. Use this setting to confirm a known bug is fixed (or not), or when you only require a pass/fail result. |
| `--fast-tests-only` | True | False | Run unit tests only, skipping integration tests or any tests explicitly tagged as slow. Use this for more frequent checks, when it is not feasible to run the entire test suite. |
| `--code-tests-only` | True | False | Skip any tests not directly related to code updates. For instance, metadata checks, version bump checks, changelog verification, etc. Use this setting to help focus on code quality during development. |
| `--concurrent-cat` | False | False | Make CAT tests run concurrently using pytest-xdist. Be careful about source or destination API rate limits. |

Expand Down Expand Up @@ -408,6 +408,7 @@ This command runs the Python tests for a airbyte-ci poetry package.
## Changelog
| Version | PR | Description |
| ------- | ---------------------------------------------------------- | --------------------------------------------------------------------------------------------------------- |
| 2.6.0 | [#31974](https://github.com/airbytehq/airbyte/pull/?????) | Add -x option to connector test to allow for skipping steps |
| 2.5.3 | [#31974](https://github.com/airbytehq/airbyte/pull/31974) | Fix latest CDK install and pip cache mount on connector install. |
| 2.5.2 | [#31871](https://github.com/airbytehq/airbyte/pull/31871) | Deactivate PR comments, add HTML report links to the PR status when its ready. |
| 2.5.1 | [#31774](https://github.com/airbytehq/airbyte/pull/31774) | Add a docker configuration check on `airbyte-ci` startup. |
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,11 @@

from datetime import datetime
from types import TracebackType
from typing import Optional
from typing import List, Optional

import yaml
import functools

from anyio import Path
from asyncer import asyncify
from dagger import Directory, Secret
Expand All @@ -18,6 +20,7 @@
from pipelines.helpers.connectors.modifed import ConnectorWithModifiedFiles
from pipelines.helpers.github import update_commit_status_check
from pipelines.helpers.slack import send_message_to_webhook
from pipelines.helpers.run_steps import RunStepOptions
from pipelines.helpers.utils import METADATA_FILE_NAME
from pipelines.models.contexts import PipelineContext

Expand Down Expand Up @@ -49,8 +52,6 @@ def __init__(
reporting_slack_channel: Optional[str] = None,
pull_request: PullRequest = None,
should_save_report: bool = True,
fail_fast: bool = False,
fast_tests_only: bool = False,
code_tests_only: bool = False,
use_local_cdk: bool = False,
use_host_gradle_dist_tar: bool = False,
Expand All @@ -60,6 +61,7 @@ def __init__(
s3_build_cache_access_key_id: Optional[str] = None,
s3_build_cache_secret_key: Optional[str] = None,
concurrent_cat: Optional[bool] = False,
run_step_options: RunStepOptions = RunStepOptions(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make sense to expose this parameter at the PipelineContext level for reusability. Wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do! Im just passing it through to the pipeline context here

):
"""Initialize a connector context.

Expand All @@ -78,8 +80,6 @@ def __init__(
slack_webhook (Optional[str], optional): The slack webhook to send messages to. Defaults to None.
reporting_slack_channel (Optional[str], optional): The slack channel to send messages to. Defaults to None.
pull_request (PullRequest, optional): The pull request object if the pipeline was triggered by a pull request. Defaults to None.
fail_fast (bool, optional): Whether to fail fast. Defaults to False.
fast_tests_only (bool, optional): Whether to run only fast tests. Defaults to False.
code_tests_only (bool, optional): Whether to ignore non-code tests like QA and metadata checks. Defaults to False.
use_host_gradle_dist_tar (bool, optional): Used when developing java connectors with gradle. Defaults to False.
open_report_in_browser (bool, optional): Open HTML report in browser window. Defaults to True.
Expand All @@ -99,8 +99,6 @@ def __init__(
self._updated_secrets_dir = None
self.cdk_version = None
self.should_save_report = should_save_report
self.fail_fast = fail_fast
self.fast_tests_only = fast_tests_only
self.code_tests_only = code_tests_only
self.use_local_cdk = use_local_cdk
self.use_host_gradle_dist_tar = use_host_gradle_dist_tar
Expand All @@ -110,6 +108,7 @@ def __init__(
self.s3_build_cache_access_key_id = s3_build_cache_access_key_id
self.s3_build_cache_secret_key = s3_build_cache_secret_key
self.concurrent_cat = concurrent_cat
self._connector_secrets = None

super().__init__(
pipeline_name=pipeline_name,
Expand All @@ -128,6 +127,7 @@ def __init__(
ci_git_user=ci_git_user,
ci_github_access_token=ci_github_access_token,
open_report_in_browser=open_report_in_browser,
run_step_options=run_step_options,
)

@property
Expand Down Expand Up @@ -206,6 +206,11 @@ def docker_hub_password_secret(self) -> Optional[Secret]:
return None
return self.dagger_client.set_secret("docker_hub_password", self.docker_hub_password)

async def connector_secrets(self):
if self._connector_secrets is None:
self._connector_secrets = await secrets.get_connector_secrets(self)
return self._connector_secrets

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to your changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is! We were mounting connector secrets in between steps. So I hoisted that logic into the context so when you request connector_secrets it mounts and gets them if it hasnt already

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename the function to get_connector_secrets then? Or would it make sense to make secrets.get_connector_secrets a method of the context class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im torn on this

On one hand it belongs in context given how coupled it is to a ConnectorContext

On the other hand connector secrets is a foundational concept that deserves a separate file to hold onto the logic.

I think we gain more from it being separate.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest renaming it to get_connector_secrets as it:

  • Downloads the connector secrets
  • Assign the secrets to self._connector_secrets.

And we rename self._connector_secrets to self.connector_secrets to make it public.

I'd be confused to have a method named connector_secrets acting as a getter on self._connector_secrets

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! Though, lets leave the internal field with the underscore. We want devs to always reference the getter and feel bad otherwise

async def get_connector_dir(self, exclude=None, include=None) -> Directory:
"""Get the connector under test source code directory.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
from pipelines.cli.dagger_pipeline_command import DaggerPipelineCommand
from pipelines.consts import ContextState
from pipelines.helpers.github import update_global_commit_status_check_for_tests
from pipelines.helpers.run_steps import RunStepOptions
from pipelines.helpers.utils import fail_if_missing_docker_hub_creds


Expand All @@ -30,27 +31,27 @@
type=bool,
is_flag=True,
)
@click.option(
"--fast-tests-only",
help="When enabled, slow tests are skipped.",
default=False,
type=bool,
is_flag=True,
)
@click.option(
"--concurrent-cat",
help="When enabled, the CAT tests will run concurrently. Be careful about rate limits",
default=False,
type=bool,
is_flag=True,
)
@click.option(
'--skip-step',
'-x',
multiple=True,
type=str,
bnchrch marked this conversation as resolved.
Show resolved Hide resolved
help="Skip a step by name. Can be used multiple times to skip multiple steps.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How big of a lift would it be to have an --only option to select a specific test suite?

Copy link
Contributor Author

@bnchrch bnchrch Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not much at all! I can do that in another PR

Issue here: https://github.com/airbytehq/airbyte/issues/33309

@click.pass_context
async def test(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we add a discover subcommand or --dry-run flag to help users discover the available step names?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once I make this change you suggested
#32188 (comment)

Then --help should allow them to discover the step names

ctx: click.Context,
code_tests_only: bool,
fail_fast: bool,
fast_tests_only: bool,
concurrent_cat: bool,
skip_step: str,
bnchrch marked this conversation as resolved.
Show resolved Hide resolved
) -> bool:
"""Runs a test pipeline for the selected connectors.

Expand All @@ -70,6 +71,11 @@ async def test(
update_global_commit_status_check_for_tests(ctx.obj, "success")
return True

run_step_options = RunStepOptions(
fail_fast=fail_fast,
skip_steps=skip_step,
)

connectors_tests_contexts = [
ConnectorContext(
pipeline_name=f"Testing connector {connector.technical_name}",
Expand All @@ -86,18 +92,18 @@ async def test(
ci_context=ctx.obj.get("ci_context"),
pull_request=ctx.obj.get("pull_request"),
ci_gcs_credentials=ctx.obj["ci_gcs_credentials"],
fail_fast=fail_fast,
fast_tests_only=fast_tests_only,
code_tests_only=code_tests_only,
use_local_cdk=ctx.obj.get("use_local_cdk"),
s3_build_cache_access_key_id=ctx.obj.get("s3_build_cache_access_key_id"),
s3_build_cache_secret_key=ctx.obj.get("s3_build_cache_secret_key"),
docker_hub_username=ctx.obj.get("docker_hub_username"),
docker_hub_password=ctx.obj.get("docker_hub_password"),
concurrent_cat=concurrent_cat,
run_step_options=run_step_options,
)
for connector in ctx.obj["selected_connectors_with_modified_files"]
]

try:
await run_connectors_pipelines(
[connector_context for connector_context in connectors_tests_contexts],
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,125 +3,67 @@
#
"""This module groups factory like functions to dispatch tests steps according to the connector under test language."""

import itertools
from typing import List

import anyio
import asyncer
from connector_ops.utils import METADATA_FILE_NAME, ConnectorLanguage
from pipelines.airbyte_ci.connectors.context import ConnectorContext
from pipelines.airbyte_ci.connectors.reports import ConnectorReport
from pipelines.airbyte_ci.connectors.test.steps import java_connectors, python_connectors
from pipelines.airbyte_ci.connectors.test.steps.common import QaChecks, VersionFollowsSemverCheck, VersionIncrementCheck
from pipelines.airbyte_ci.metadata.pipeline import MetadataValidation
from pipelines.models.steps import StepResult
from pipelines.helpers.run_steps import StepToRun, run_steps

LANGUAGE_MAPPING = {
"run_all_tests": {
ConnectorLanguage.PYTHON: python_connectors.run_all_tests,
ConnectorLanguage.LOW_CODE: python_connectors.run_all_tests,
ConnectorLanguage.JAVA: java_connectors.run_all_tests,
},
"run_code_format_checks": {
ConnectorLanguage.PYTHON: python_connectors.run_code_format_checks,
ConnectorLanguage.LOW_CODE: python_connectors.run_code_format_checks,
# ConnectorLanguage.JAVA: java_connectors.run_code_format_checks
"get_test_steps": {
ConnectorLanguage.PYTHON: python_connectors.get_test_steps,
ConnectorLanguage.LOW_CODE: python_connectors.get_test_steps,
ConnectorLanguage.JAVA: java_connectors.get_test_steps,
},
}


async def run_metadata_validation(context: ConnectorContext) -> List[StepResult]:
"""Run the metadata validation on a connector.
Args:
context (ConnectorContext): The current connector context.

Returns:
List[StepResult]: The results of the metadata validation steps.
"""
return [await MetadataValidation(context).run()]


async def run_version_checks(context: ConnectorContext) -> List[StepResult]:
"""Run the version checks on a connector.

Args:
context (ConnectorContext): The current connector context.

Returns:
List[StepResult]: The results of the version checks steps.
"""
return [await VersionFollowsSemverCheck(context).run(), await VersionIncrementCheck(context).run()]


async def run_qa_checks(context: ConnectorContext) -> List[StepResult]:
"""Run the QA checks on a connector.

Args:
context (ConnectorContext): The current connector context.

Returns:
List[StepResult]: The results of the QA checks steps.
"""
return [await QaChecks(context).run()]


async def run_code_format_checks(context: ConnectorContext) -> List[StepResult]:
"""Run the code format checks according to the connector language.

Args:
context (ConnectorContext): The current connector context.

Returns:
List[StepResult]: The results of the code format checks steps.
"""
if _run_code_format_checks := LANGUAGE_MAPPING["run_code_format_checks"].get(context.connector.language):
return await _run_code_format_checks(context)
else:
context.logger.warning(f"No code format checks defined for connector language {context.connector.language}!")
return []


async def run_all_tests(context: ConnectorContext) -> List[StepResult]:
"""Run all the tests steps according to the connector language.
def get_test_steps(context: ConnectorContext) -> List[StepToRun]:
"""Get all the tests steps according to the connector language.

Args:
context (ConnectorContext): The current connector context.

Returns:
List[StepResult]: The results of the tests steps.
List[StepResult]: The list of tests steps.
"""
if _run_all_tests := LANGUAGE_MAPPING["run_all_tests"].get(context.connector.language):
return await _run_all_tests(context)
if _get_test_steps := LANGUAGE_MAPPING["get_test_steps"].get(context.connector.language):
return _get_test_steps(context)
else:
context.logger.warning(f"No tests defined for connector language {context.connector.language}!")
return []


async def run_connector_test_pipeline(context: ConnectorContext, semaphore: anyio.Semaphore) -> ConnectorReport:
"""Run a test pipeline for a single connector.
async def run_connector_test_pipeline(context: ConnectorContext, semaphore: anyio.Semaphore):
"""
Compute the steps to run for a connector test pipeline.
"""

A visual DAG can be found on the README.md file of the pipelines modules.
steps_to_run = get_test_steps(context)

Args:
context (ConnectorContext): The initialized connector context.
if not context.code_tests_only:
alafanechere marked this conversation as resolved.
Show resolved Hide resolved
steps_to_run += [
[
StepToRun(id="metadata_validation", step=MetadataValidation(context)),
StepToRun(id="version_follow_check", step=VersionFollowsSemverCheck(context)),
StepToRun(id="version_inc_check", step=VersionIncrementCheck(context)),
StepToRun(id="qa_checks", step=QaChecks(context)),
]
]

Returns:
ConnectorReport: The test reports holding tests results.
"""
async with semaphore:
async with context:
async with asyncer.create_task_group() as task_group:
tasks = [
task_group.soonify(run_all_tests)(context),
task_group.soonify(run_code_format_checks)(context),
]
if not context.code_tests_only:
tasks += [
task_group.soonify(run_metadata_validation)(context),
task_group.soonify(run_version_checks)(context),
task_group.soonify(run_qa_checks)(context),
]
results = list(itertools.chain(*(task.value for task in tasks)))
result_dict = await run_steps(
runnables=steps_to_run,
options=context.run_step_options,
)
Comment on lines +62 to +65
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💎


results = list(result_dict.values())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm wondering if casting to list is mandatory here, does directly passing result_dict.values() to the ConnectorReport work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call out, I think your right, let me check

context.report = ConnectorReport(context, steps_results=results, name="TEST RESULTS")

return context.report
Loading
Loading