Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Airbyte-ci: Add skippable connector test steps #32188

Merged
merged 39 commits into from
Dec 15, 2023

Conversation

bnchrch
Copy link
Contributor

@bnchrch bnchrch commented Nov 3, 2023

Problem

As a connector developer I would like to be able to skip individual steps in the test pipeline

Proposed Solution

airbyte-ci connectors --name=source-s3 test -x unit -x acceptance

Implementation

Defining the Steps to be run as a list

def get_test_steps(context: ConnectorContext) -> List[StepToRun]:
    """
    Get all the tests steps for a Python connector.
    """
    return [
        [StepToRun(id="build", step=BuildConnectorImages(context, LOCAL_BUILD_PLATFORM))],
        [StepToRun(
            id="unit",
            step=UnitTests(context),
            args=lambda results: {"connector_under_test": results["build"].output_artifact[LOCAL_BUILD_PLATFORM]},
            depends_on=["build"],
        )],
        [
            StepToRun(
                id="integration",
                step=IntegrationTests(context),
                args=lambda results: {"connector_under_test": results["build"].output_artifact[LOCAL_BUILD_PLATFORM]},
                depends_on=["build"],
            ),
            StepToRun(
                id="acceptance",
                step=AcceptanceTests(context, context.concurrent_cat),
                args=lambda results: {"connector_under_test_container": results["build"].output_artifact[LOCAL_BUILD_PLATFORM]},
                depends_on=["build"],
            ),
            StepToRun(id="check_base_image", step=CheckBaseImageIsUsed(context), depends_on=["build"]),
        ],
    ]

Other minor changes

  1. connector_secrets is now explicitly part of the ConnectorContext

Why a list?

This will give us a type and structure we can manupilate prior to running, enabling features like

  1. Skipping steps
  2. Passing extra args
  3. Eagerly failing tests on failing dependencies
  4. Generating a test report prior to the tests completing

Why a new class?

The new class StepToRun was added for two reasons

  1. Its difficult to add this functionality to all steps without a much larger rewrite
  2. The need for dynamic args when calling run still would require an addtional class / datamodel
  3. This lets us keep all coordination logic inside the run_steps.py file

Copy link

vercel bot commented Nov 3, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Dec 15, 2023 8:16pm

@bnchrch bnchrch force-pushed the bnchrch/airbyte-ci/options-for-steps branch from 2cb24a3 to 258b66d Compare November 3, 2023 22:50
@bnchrch bnchrch force-pushed the bnchrch/airbyte-ci/options-for-steps branch from aa29862 to 3505170 Compare November 3, 2023 23:32
@bnchrch bnchrch marked this pull request as ready for review November 6, 2023 22:17
@bnchrch bnchrch requested a review from a team November 6, 2023 22:17
Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I LOVE the direction where this is going.

Comment on lines 41 to 47
@click.option(
'--skip-step',
'-x',
multiple=True,
type=str,
help="Skip a step by name. Can be used multiple times to skip multiple steps.",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How big of a lift would it be to have an --only option to select a specific test suite?

Copy link
Contributor Author

@bnchrch bnchrch Nov 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not much at all! I can do that in another PR

Issue here: https://github.com/airbytehq/airbyte/issues/33309

Comment on lines 209 to 213
async def connector_secrets(self):
if self._connector_secrets is None:
self._connector_secrets = await secrets.get_connector_secrets(self)
return self._connector_secrets

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to your changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is! We were mounting connector secrets in between steps. So I hoisted that logic into the context so when you request connector_secrets it mounts and gets them if it hasnt already

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rename the function to get_connector_secrets then? Or would it make sense to make secrets.get_connector_secrets a method of the context class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im torn on this

On one hand it belongs in context given how coupled it is to a ConnectorContext

On the other hand connector secrets is a foundational concept that deserves a separate file to hold onto the logic.

I think we gain more from it being separate.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest renaming it to get_connector_secrets as it:

  • Downloads the connector secrets
  • Assign the secrets to self._connector_secrets.

And we rename self._connector_secrets to self.connector_secrets to make it public.

I'd be confused to have a method named connector_secrets acting as a getter on self._connector_secrets

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! Though, lets leave the internal field with the underscore. We want devs to always reference the getter and feel bad otherwise

Comment on lines 175 to 176
args=lambda results: {"connector_under_test": results["build"].output_artifact[LOCAL_BUILD_PLATFORM]},
depends_on=["build"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken the dependency is always implicitly defined by the fact that a Step args are from a dependency output_artifact. Could we infer the dependency tree dynamically from the args then?
I know explicit is better than implicit 😄 , so feel free to reject this suggestion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats fair! but there sometimes output that may not be an artifact, thus theres no implicit relationship either.

e.g. we have a step called PushToDockerHub and a step called VerifyOnDockerHub

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add tests that check that argument passing between steps works?

id: str
step: Step
args: ARGS_TYPE = field(default_factory=dict)
depends_on: List[str] = field(default_factory=list)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would find it more correct and robust if the depends_on type was a list of StepToRun instances

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not full against that but why is that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A child could know its parents if depends_on was a StepToRun

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it limits id typos as we'd pass around a variable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So with the need of having --help list the name of available steps to skip We've added the pattern where steps id's are defined as constants and then reused.

That should fix the typo problem you pointed out here.

Which I think is a good approach since we dont want to pass around fully instantiated StepToRun objects as that would lead us into a couple of painful areas

  1. Passing the BuildStepInstance everywhere
  2. Preventing us from having a polymorphic build step. e.g. BuildStepInstance or BuildWithNormalizationStepInstance

WDYT?



@dataclass(frozen=True)
class StepToRun:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just opening the discussion here.
Could we rename StepToRun to Step
And Step to Operation?

A Operation.__init__ would take a Step instance and would access parameters from Step.args.

Step instances would be "singleton"
Operation instance would not be singleton: an operation can be re-used in multiple step.
And I think Operation might be worth being functions as they'll be stateless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im all for finding a way to cut down the awkwardness (and layers) of these abstraction.

Hearing works like "singleton", "would be" and "I think" is telling me that this might be better tackled on its own so we can just wrestle with restructuring the taxonomy and not restructuring the taxonomy AND adding new functionality.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

@bnchrch
Copy link
Contributor Author

bnchrch commented Dec 11, 2023

@alafanechere This is ready for review again!

Since the last review

  1. Updated to latest master
  2. Added support for --help listing skippable steps for -x
  3. Made StepToRun ids less vulnerable to typos
  4. Added checks for args being one of callable, awaitable or dict
  5. Updated tests to have descriptions
  6. Added test cases for skip steps, fast failures, concurrency, invalid args and more
  7. Added configurable concurrency support

I also left a few discussions open around depends_on and get_connector_secrets

What im worried about is that I may have clobbered a change we've made in the last month. I havent found any regression but would love your eagle eyes on that!

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a bug in the input validation:
airbyte-ci connectors --name=source-faker test -x=acceptance:

┃ Error: Invalid value for '--skip-step' / '-x': '=acceptance' is not one of <CONNECTOR_TEST_STEP_ID.ACCEPTANCE: 'acceptance'>, <CONNECTOR_TEST_STEP_ID.BUILD_NORMALIZATION: 'build_normalization'>, <CONNECTOR_TEST_STEP_ID.BUILD_TAR: 'build_tar'>, <CONNECTOR_TEST_STEP_ID.BUILD: 'build'>, <CONNECTOR_TEST_S
┃ TEP_ID.CHECK_BASE_IMAGE: 'check_base_image'>, <CONNECTOR_TEST_STEP_ID.INTEGRATION: 'integration'>, <CONNECTOR_TEST_STEP_ID.METADATA_VALIDATION: 'metadata_validation'>, <CONNECTOR_TEST_STEP_ID.QA_CHECKS: 'qa_checks'>, <CONNECTOR_TEST_STEP_ID.UNIT: 'unit'>, <CONNECTOR_TEST_STEP_ID.VERSION_FOLLOW_CHECK:
┃ 'version_follow_check'>, <CONNECTOR_TEST_STEP_ID.VERSION_INC_CHECK: 'version_inc_check'>, <CONNECTOR_TEST_STEP_ID.TEST_ORCHESTRATOR: 'test_orchestrator'>, <CONNECTOR_TEST_STEP_ID.DEPLOY_ORCHESTRATOR: 'deploy_orchestrator'>.

Copy link
Contributor Author

bnchrch commented Dec 14, 2023

I think thats just a quirk of click.choice

We have the same issue with the -a flag in build

image.png

Copy link
Contributor Author

bnchrch commented Dec 15, 2023

@alafanechere Ready for your review again :)

@alafanechere
Copy link
Contributor

@bnchrch I figured that = can be used with long options -- but with short one a space is expected... Might be a convention I was not aware of.
I changed the Choice a bit to have a cleaner output in case the passed -x value is invalid

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it 🎉
I think I fixed the test suite + conflicts.
I manually tested connectors test , everything is working as expected so far!

@bnchrch bnchrch enabled auto-merge (squash) December 15, 2023 20:16
@bnchrch bnchrch merged commit e444d75 into master Dec 15, 2023
21 checks passed
@bnchrch bnchrch deleted the bnchrch/airbyte-ci/options-for-steps branch December 15, 2023 20:28
timroes pushed a commit that referenced this pull request Dec 19, 2023
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
Co-authored-by: Augustin <augustin@airbyte.io>
Co-authored-by: bnchrch <bnchrch@users.noreply.github.com>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants