Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the -k ability to the airbyte-ci test command #24061

Closed
bnchrch opened this issue Mar 14, 2023 · 6 comments · Fixed by #34050
Closed

Add the -k ability to the airbyte-ci test command #24061

bnchrch opened this issue Mar 14, 2023 · 6 comments · Fixed by #34050
Assignees

Comments

@bnchrch
Copy link
Contributor

bnchrch commented Mar 14, 2023

Problem

Test commands are slow, particularly when you want to show just one new/updated tests is changing

Solution

Update the airbyte-ci connector test command to expose the -k command from Pytest

Implementation Recommendation

Once #32188 is merged we will have the StepToRun class which has

  1. string id's per step
  2. dynamic args passed to a step that can change at run time

This should support a pattern like so

airbyte-ci connector --name=source-faker test --arg-build-k test_specific

Where

  1. click allows for extra arguments

  2. our test command parses the extra click arguments to a dictionary

extra_args = {
    "build": ["-k",  "test_specific"]
}
  1. Our run_steps function passes these extra args to the step at run time
  2. Our steps are equipped to use extra_args as an optional value of run
    async def _run(self, command=None, extra_args=None) -> StepResult:
        if additional_args:
            command += extra_args

        container_to_run = await self.init_container()
        return await self.get_step_result(container_to_run.with_exec(command))
@bnchrch
Copy link
Contributor Author

bnchrch commented Mar 14, 2023

Grooming

  • pass the flag to where ever we call intergration_tests.acceptance

@bnchrch bnchrch changed the title Add the -k ability to the /test command Add the -k ability to the airbyte-ci test command Aug 25, 2023
@bnchrch bnchrch self-assigned this Oct 11, 2023
@bnchrch bnchrch removed their assignment Dec 13, 2023
@alafanechere alafanechere self-assigned this Jan 3, 2024
@alafanechere
Copy link
Contributor

@bnchrch I think the implementation recommendation you declared above has the following limitations:

  • When we run connectors test it can run on a batch of connectors using different languages (e.g airbyte-ci connectors --name=source-postgres --name=source-faker test`. The extra args parsing and validation is harder
  • Extra args being dynamic we can't document them in with --help, impeding feature discoverability of airbyte-ci

I tried to think about it thoroughly today and come with the following suggestion:

Ideal interface

airbyte-ci connector source-faker test <test-suite-name> -<test-suite-specific-option> <test-suite-specific-option-value>
Eg:
airbyte-ci connector source-faker test acceptance -k <specific-cat-tests-to-run>

Ideal implementation

  • We can declare test suite specific commands (unit / cat / integration)
  • The test suite available options / params are declared at the Step level and back propagated as click options (we declare at a single place what options are available for a step)
  • We have a concept of test suites: a group of test. E.G:
    • Unit tests (just runs unit tests)
    • Integration tests (connector build + integration test)
    • Code test (Unit + Integration test suite merged)
    • Metadata validation
    • QA checks
    • Version bump check
    • Static analysis (qa check, metadata valdiation, qa checks, version bump check)
    • acceptance
    • full test suite (what we current have when running tests)
  • Test suite test items can be skipped with the existing --skip-step option.

Problem

  • connectors test command runs test for multiple connectors, of different languages. It's not possible to have a unified interface between Java and python connectors test suites: pytest options and gradle options are different
  • Instantiation of a STEP TREE (array of step to run) depends on the creation instantiation of a ConnectorContext, which is built at command invoke time. We need to be able to declare the STEP TREE before running the command if we want to expose the extra args

Suggest approach

Expose a connector commands (instead of connectors) to test a single connector to expose connector/language specific extra args. We keep the connectors one for batch testing of modified connectors in CI.

The connector name becomes an argument, test is command group, unit-tests is a command test suite :

airbyte-ci connector source-faker test unit-tests -k <pytest selection>
airbyte-ci connector source-postgres test integration-test -x <skip specific gradle tests

Benefits:

  • It's easier to get a unified static interface for extra args
  • We can document test suites and extra args in the CLI with --help
  • 1 repo 1 connector 1 command is easier to reason about when we'll want to run CI on any repo.

@bnchrch
Copy link
Contributor Author

bnchrch commented Jan 8, 2024

Ok so I have some issues with this approach.

It might be useful to come up a level to talk more about problems to avoid, rather than implementation

1. We don't want to repeat ourself or others
So my worry here is the tension between

  1. Reusable Steps
  2. Additional Arguments
  3. Declarative CLI Interface

This tension can lead to some issues with our workflow that we should avoid

  1. Having to redefine/redocument every option in a third party tool (e.g. pytest, gradle, black, ruff, etc...)
  2. Having to define/document an argument N times as the step is used in N pipelines.

And I think these issues are more important to avoid than

  1. "Extra args being dynamic we can't document them in with --help, impeding feature discoverability of airbyte-ci"

Which means we should have a solution where

  1. The pipeline has to do no more than use a step, for the step to be passed additional arguments
  2. In otherwords the CLI side of the implementation should be generic for all pipelines, and the implementation of how additional args are used is up to the individual step.

1. Your connectors abstraction is good! Lets not move away from it
This is a footnote but I believe strongly that

  1. Reducing the number of interfaces to your logic has an outsized effect on bug reduction and complexity.
  2. The most elegant abstractions acknowledge that everything is a list. There is no single item but instead an array of that item, and that array can be one.

We should resist having the same pipeline implemented twice for connectors or connector

@alafanechere
Copy link
Contributor

Thank you for the writeup 🙏
I'm inline with the fact that connectorS is 👍:skin-tone-2: but I can't figure a clean way to pass the extra args:
let's say we test both a java and python connectors, how would we expose an option which would apply just for python unit tests ?

If we keep connectors what's your opinion on the "test suite subcommands": e.g:
airbyte-ci connectors --name=source-faker --name=source-postgres test unit-test -k
I don't think it clashes with:

Which means we should have a solution where
The pipeline has to do no more than use a step, for the step to be passed additional arguments
In otherwords the CLI side of the implementation should be generic for all pipelines, and the implementation of how additional args are used is up to the individual step.

Please share if you have any CLI example in mind 🙏 I think I'm confused by the build term in your original example.

@bnchrch
Copy link
Contributor Author

bnchrch commented Jan 9, 2024

how would we expose an option which would apply just for python unit tests ?

I was thinking we use a format like {step-id:arg value}

We would have to enable extra args on our click commands for connectors and the interface would look like

airbyte-ci connectors --name=source-faker --name=source-postgres test --unit-test-k some_test_name

or

airbyte-ci connectors --name=source-faker --name=source-postgres test --arg-unit-test-k some_test_name

and have the run_steps function take an new parameter called additional_args that looks something like

{
  "unit-test": {
    "-k": "some_test_name"
  }
}

Which it will use to pass extra args to the specified step.

e.g. the unit-test step's run function will receive the parameter additional_args with this value

{
    "-k": "some_test_name"
}

If we keep connectors what's your opinion on the "test suite subcommands": e.g: airbyte-ci connectors --name=source-faker --name=source-postgres test unit-test -k

I think that would be fine? It might not be optimal since it applies to both but the dev is still free to run two tests in a row.

and its useful if you want to only run one cat test for N connectors

@alafanechere
Copy link
Contributor

@bnchrch I implemented your suggested approach in this PR stack (ready for review).

I just came up with a slightly different interface in terms of additional options:
airbyte-ci connectors --name=source-pokeapi test --acceptance.-k=test_read

Do you have a suggestion on how we could alleviate the following limitation:

As some step id are mapped to different Step class we can't guarantee that an extra parameter will be correctly handled during step execution. Example: I run the test command on a python and a java connectors with --unit.cov-fail-under=100 option. The cov-fail-under option only exists in the python connector land (it's a pytest option) , so the gradle test for the java connector will likely fail as the command with this specific option is invalid.

I was thinking that we could log a warning if the test command is executed with extra parameters on a set of connectors with mixed languages. Wdyt?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants