live-tests: debug mode and initial regression tests framework #35624

clnoll · 2024-02-26T06:44:23Z

Initial implementation of "live tests" - tests that are intended to be run against actual APIs instead of static data. This currently implements a debug command which forms the foundation for regression tests, by providing an interface for running an Airbyte command against multiple versions of a connector and storing output to disk.

To run a regression-style test with the current version of live-tests, run, for example:

live-tests debug read \
--connector-image=airbyte/source-<SOURCE>:dev \
--connector-image=airbyte/source-<SOURCE>:latest \
--output-directory=</path/to/output> \
--config-path=</path/to/config.json> \
--catalog-path=</path/to/configured_catalog.json>

Results are stored in the output directory and can be compared with:

diff -u </path/to/output>/read/latest/airbyte_messages/<source>_records.jsonl </path/to/output>/read/dev/airbyte_messages/<source>_records.jsonl

Additional details are in the README.

alafanechere · 2024-02-26T15:54:46Z

airbyte-ci/connectors/regression-testing/regression_testing/main.py

+logger = logging.getLogger(__name__)
+
+
+async def _main(


I'd love to separate connector image retrieval (test bootstrapping) from actual command execution.
I suggest creating a ConnectorUnderTest dataclass which could have the following attributes:

@dataclass
class ConnectorUnderTest:
    technical_name: str
    version: semver.Version  # or str
    container: dagger.Container

We could add other attributes for reporting if needed, but the key here is the container attribute, whose retrieval happens before instantiation.

We can then call dispatch on a ConnectorUnderTest instance and keep the metadata attributes at hand during the whole execution flow.

For connector container retrieval, we can implement multiple upstream functions that produce a ConnectorUnderTest according to the execution context:

You want to run regression tests on two already-released versions: we create the connector container with dagger_client.container().from_(image_address) in a specific function.

You want to run regression tests on a target which is not released: we build it by re-using the airbyte-ci build step, again in a specific function.

👍 I've added the ConnectorUnderTest dataclass and added a TODO for handling building of images based on context.
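
For illustration, the ConnectorUnderTest idea discussed in this thread could be sketched roughly as below. This is a minimal sketch, not the code that was merged: the container is typed as Any so the example runs without dagger, the version is kept as a plain string rather than semver.Version, and the from_image_address factory and its "latest" fallback are hypothetical.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ConnectorUnderTest:
    technical_name: str  # e.g. "source-pokeapi"
    version: str         # semver.Version in the review suggestion
    container: Any       # dagger.Container in the review suggestion


def from_image_address(image_address: str, container: Any) -> ConnectorUnderTest:
    """Hypothetical factory: derive metadata from an address like 'airbyte/source-x:dev'."""
    image_name, sep, version = image_address.rpartition(":")
    if not sep:
        # No tag in the address; assume "latest" (an assumption of this sketch).
        image_name, version = image_address, "latest"
    technical_name = image_name.split("/")[-1]
    return ConnectorUnderTest(technical_name, version, container)


# The container is retrieved before instantiation; a placeholder object stands in here.
connector = from_image_address("airbyte/source-pokeapi:dev", container=object())
print(connector.technical_name, connector.version)
```

A second factory that builds an unreleased target would return the same dataclass, so downstream code would not need to know how the container was obtained.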

airbyte-ci/connectors/regression-testing/regression_testing/backends/base_backend.py

airbyte-ci/connectors/regression-testing/regression_testing/__init__.py

airbyte-ci/connectors/regression-testing/regression_testing/main.py

alafanechere · 2024-02-28T00:18:42Z

airbyte-ci/connectors/regression-testing/src/regression_testing/main.py

+        raise NotImplementedError(f"{command} is not recognized. Must be one of {', '.join(COMMANDS)}")
+
+
+@click.command()


can we declare the command(s) in a separate cli.py module?

alafanechere · 2024-02-28T00:20:16Z

airbyte-ci/connectors/regression-testing/src/regression_testing/main.py

+    default=None,
+    type=str,
+)
+def main(


I think we should use asyncclick instead of click so that we can declare async def main and just await _main.

Example of the same pattern here on the QA checks side.

alafanechere · 2024-02-28T00:21:34Z

airbyte-ci/connectors/regression-testing/src/regression_testing/main.py

+):
+    runner = ConnectorRunner(container, backend, f"{output_directory}/{command}")
+
+    if command == "check":


can we introduce a Command(Enum) ?

Definitely. Done.
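
As a rough sketch of what a Command(Enum) could look like, assuming the members mirror the COMMANDS list shown in the diff context elsewhere in this review (the parse_command helper is hypothetical and only illustrates how the enum can replace the membership check):

```python
from enum import Enum


class Command(Enum):
    CHECK = "check"
    DISCOVER = "discover"
    READ = "read"
    READ_WITH_STATE = "read-with-state"
    SPEC = "spec"


def parse_command(raw: str) -> Command:
    """Convert a CLI string to a Command, listing the valid values on failure."""
    try:
        return Command(raw)
    except ValueError:
        valid = ", ".join(c.value for c in Command)
        raise NotImplementedError(f"{raw} is not recognized. Must be one of {valid}") from None


print(parse_command("read"))
```

Comparing against Command.READ rather than the string "read" lets a type checker or a typo-induced AttributeError catch mistakes that a string comparison would silently miss.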

alafanechere · 2024-02-28T00:22:54Z

airbyte-ci/connectors/regression-testing/src/regression_testing/main.py

+        await runner.call_discover(config)
+
+    elif command == "read":
+        runner = ConnectorRunner(container, backend, f"{output_directory}/read")


These command-specific runners can be removed, right? The runner defined on line 92 can be reused.

Yep! Removed.

alafanechere · 2024-02-28T00:29:09Z

airbyte-ci/connectors/regression-testing/src/regression_testing/connector_runner.py

+        catalog: dict = None,
+        state: Union[dict, list] = None,
+        enable_caching=True,
+    ) -> List[AirbyteMessage]:


It's not returning a list of AirbyteMessage if I'm not mistaken 😄

What do you think about making _run return a tuple with the executed command and the awaited container?

This tuple can then be passed to a "backend" which would produce the artifacts we need downstream for debugging or testing.

This would help make this class self-contained: it would only run airbyte commands on connector containers.

alafanechere · 2024-02-28T00:38:03Z

airbyte-ci/connectors/regression-testing/src/regression_testing/connector_runner.py

+        entrypoint = await container.entrypoint()
+        airbyte_command = entrypoint + airbyte_command
+        container = container.with_exec(
+            ["sh", "-c", " ".join(airbyte_command) + f" > {self.IN_CONTAINER_OUTPUT_PATH} 2>&1 | tee -a {self.IN_CONTAINER_OUTPUT_PATH}"],


This assumes sh and tee are available in the container.
I think we should rather await a with_exec call that does not alter the original entrypoint, and get stdout + stderr from it.
And as I suggested above, I think this can happen in a different class, maybe at the backend class level.
E.g., _run "just" returns:

return command, await container.with_exec(airbyte_command)

Then in a different code path:

command, executed_container = await connector_run.call_read(***)
stdout, stderr = await executed_container.stdout(), await executed_container.stderr()
# These stdout and stderr can be written to files, a db, or whatever the "backend" implements
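
To make the proposed split concrete, here is a minimal sketch in which the runner only returns the command and the executed container, and a separate step reads stdout and stderr. FakeExecutedContainer is a stand-in that mimics only the stdout()/stderr() coroutines so the example runs without dagger; collect_raw_output and the canned SPEC output are hypothetical.

```python
import asyncio
from typing import List, Tuple


class FakeExecutedContainer:
    """Stand-in for an executed dagger.Container; only stdout()/stderr() are mimicked."""

    def __init__(self, stdout: str, stderr: str) -> None:
        self._stdout, self._stderr = stdout, stderr

    async def stdout(self) -> str:
        return self._stdout

    async def stderr(self) -> str:
        return self._stderr


class ConnectorRunner:
    """Only runs commands; it neither parses nor stores their output."""

    async def call_spec(self) -> Tuple[List[str], FakeExecutedContainer]:
        return await self._run(["spec"])

    async def _run(self, airbyte_command: List[str]) -> Tuple[List[str], FakeExecutedContainer]:
        # A real runner would do: return airbyte_command, await container.with_exec(airbyte_command)
        executed = FakeExecutedContainer('{"type": "SPEC"}\n', "")
        return airbyte_command, executed


async def collect_raw_output(runner: ConnectorRunner) -> dict:
    """Hypothetical backend-side step: pull stdout/stderr from the executed container."""
    command, executed = await runner.call_spec()
    return {"command": command, "stdout": await executed.stdout(), "stderr": await executed.stderr()}


result = asyncio.run(collect_raw_output(ConnectorRunner()))
print(result)
```

With this shape the runner has no dependency on sh, tee, or any output path inside the container.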

alafanechere · 2024-02-28T00:40:44Z

airbyte-ci/connectors/regression-testing/src/regression_testing/connector_runner.py

+        raw_output = await AnyioPath(filepath).read_text()
+        await self._backend.write(self._raw_output_iter(raw_output))
+
+    def _raw_output_iter(self, raw_output):


I think the runner should not perform any deserialization. I'd suggest that deserialization happen post dispatch, when we actually need deserialized data in tests or diffs.
My point is that error detection should happen at a more downstream level, where actual good or bad connector behavior will be evaluated.

I like the flow of:

run commands on containers and gather raw outputs (stdout, stderr)

build objects from these output with backends.

Use a backend useful for test, another one for debugging, etc.
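
A small sketch of the second step of that flow, assuming raw stdout is newline-delimited JSON mixed with occasional plain log lines. The function names and the choice to skip non-JSON lines are assumptions of this example; a stricter backend could collect those lines and report them instead.

```python
import json
from typing import Dict, Iterator, List


def iter_messages(raw_stdout: str) -> Iterator[dict]:
    """Lazily parse each non-empty stdout line as JSON; skip lines that are not JSON objects."""
    for line in raw_stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a plain log line; this sketch chooses to ignore it
        if isinstance(message, dict):
            yield message


def group_by_type(raw_stdout: str) -> Dict[str, List[dict]]:
    """Hypothetical test-oriented backend: bucket messages by their 'type' field."""
    grouped: Dict[str, List[dict]] = {}
    for message in iter_messages(raw_stdout):
        grouped.setdefault(message.get("type", "UNKNOWN"), []).append(message)
    return grouped


raw = '{"type": "RECORD", "record": {"id": 1}}\nnot json\n{"type": "LOG"}\n'
grouped = group_by_type(raw)
print(grouped)
```

Because parsing lives outside the runner, a debugging backend could store the same raw string untouched while a test backend builds objects like these from it.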

alafanechere · 2024-02-28T00:42:32Z

airbyte-ci/connectors/regression-testing/src/regression_testing/main.py

+
+logger = logging.getLogger(__name__)
+
+COMMANDS = ["check", "discover", "read", "read-with-state", "spec"]


Would love a Commands enum 😄

alafanechere · 2024-02-28T00:44:44Z

airbyte-ci/connectors/regression-testing/src/regression_testing/main.py

+    default=None,
+    type=str,
+)
+def main(


Can we call this command run and create it under a regression-test command group?

alafanechere · 2024-02-28T00:45:57Z

airbyte-ci/connectors/regression-testing/src/regression_testing/main.py

+
+        # TODO: maybe use proxy to cache the response from the first round and use the cache for the second round
+        #   (this may only make sense for syncs with an input state)
+        if command == "all":


can we instead change the main signature to take a commands list and iterate on commands? It will help you avoid the if/else
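
One way to picture the "list of commands" signature is a lookup table from command name to runner method, iterated in order. This is only a sketch: StubRunner and its return values are hypothetical placeholders, and only three commands are wired up.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class StubRunner:
    """Hypothetical runner; each method returns a label instead of running a container."""

    async def call_check(self) -> str:
        return "ran check"

    async def call_discover(self) -> str:
        return "ran discover"

    async def call_spec(self) -> str:
        return "ran spec"


async def run_commands(runner: StubRunner, commands: List[str]) -> List[str]:
    """Run each requested command in order, without an if/else chain."""
    handlers: Dict[str, Callable[[], Awaitable[str]]] = {
        "check": runner.call_check,
        "discover": runner.call_discover,
        "spec": runner.call_spec,
    }
    unknown = [c for c in commands if c not in handlers]
    if unknown:
        raise NotImplementedError(f"Unrecognized commands: {', '.join(unknown)}")
    return [await handlers[command]() for command in commands]


results = asyncio.run(run_commands(StubRunner(), ["spec", "check"]))
print(results)
```

Running "all" commands then becomes passing the full list, so no special-case branch is needed.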

alafanechere · 2024-02-28T00:59:59Z

airbyte-ci/connectors/regression-testing/src/regression_testing/comparators/diff_comparator.py

+                            dagger_client.host().directory(self._test_output_directory)))
+        os.makedirs(self._in_container_results_directory, exist_ok=True)
+
+    async def _diff(self, container: Container, control_connector: ConnectorUnderTest, target_connector: ConnectorUnderTest) -> Container:


If we say we are comparing raw outputs, can we diff the stderr and stdout of the control and the target?
Let's say you can access the executed container with control_connector.executed_container.
Then you can mount the stdout/stderr as files in your container with diff:

(
    self._container
    .with_new_file(f"{control_directory}/stdout.txt", await control_connector.executed_container.stdout())
    .with_new_file(f"{target_directory}/stdout.txt", await target_connector.executed_container.stdout())
    .with_new_file(f"{control_directory}/stderr.txt", await control_connector.executed_container.stderr())
    .with_new_file(f"{target_directory}/stderr.txt", await target_connector.executed_container.stderr())
)

It's interesting to realize that with this approach the connector output never transits through the host filesystem, which can be a useful security property.

As discussed I think it would make sense to put this behind a debug flag. Let me know when your changes to the connector runner are stable and I'll add it.
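
For comparison, the same control-versus-target diff can be sketched entirely in memory with Python's standard difflib, which also keeps the raw output off the host filesystem. This swaps the containerized diff suggested above for difflib.unified_diff; the function name and sample outputs are made up for the example.

```python
import difflib


def unified_diff_text(control: str, target: str, name: str) -> str:
    """Return a unified diff of two raw outputs, labelled control/<name> and target/<name>."""
    return "".join(
        difflib.unified_diff(
            control.splitlines(keepends=True),
            target.splitlines(keepends=True),
            fromfile=f"control/{name}",
            tofile=f"target/{name}",
        )
    )


# Sample outputs standing in for `await executed_container.stdout()` of each connector.
control_stdout = '{"id": 1}\n{"id": 2}\n'
target_stdout = '{"id": 1}\n{"id": 3}\n'
diff = unified_diff_text(control_stdout, target_stdout, "stdout.txt")
print(diff)
```

An empty string means the two outputs are identical, which makes the result easy to assert on in a regression test.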

girarda · 2024-02-29T20:51:32Z

airbyte-ci/connectors/live-tests/pyproject.toml

+    { include = "live_tests", from = "src" },
+]
+
+[tool.poetry.dependencies]


I had to add click as a dependency:
click = "^8.1.3"

After an update, I also needed to add:
asyncclick = "^8.1.3.2"

clnoll

@alafanechere I've addressed all the comments except the ones touching the ConnectorRunner, which you're taking, and the --debug flag that we discussed for determining whether to store the raw output. I'll add it in when your ConnectorRunner changes are stable. In the meantime I'll add some more unit tests.

alafanechere · 2024-03-04T13:47:32Z

live-tests: add regression tests suite #35837
live-tests: implement debug mode #35786
live-tests: debug mode and initial regression tests framework #35624 👈
master

clnoll requested a review from a team as a code owner February 26, 2024 06:44

clnoll marked this pull request as draft February 26, 2024 06:44

clnoll mentioned this pull request Feb 26, 2024

Regression testing version comparison #35534

Closed

clnoll force-pushed the regression-tests-package branch 3 times, most recently from 0e93700 to 9d1462e Compare February 26, 2024 07:38

alafanechere reviewed Feb 26, 2024

airbyte-ci/connectors/regression-testing/regression_testing/backends/base_backend.py (outdated, resolved)

airbyte-ci/connectors/regression-testing/regression_testing/__init__.py (outdated, resolved)

airbyte-ci/connectors/regression-testing/regression_testing/main.py (outdated, resolved)

clnoll force-pushed the regression-tests-package branch 4 times, most recently from 91d5ee8 to 1a5fd4d Compare February 27, 2024 19:18

alafanechere reviewed Feb 28, 2024

clnoll force-pushed the regression-tests-package branch from 1a5fd4d to b70b66a Compare February 29, 2024 16:57

girarda reviewed Feb 29, 2024

clnoll commented Mar 1, 2024

alafanechere force-pushed the regression-tests-package branch from dc189c3 to 648bfc4 Compare March 4, 2024 13:47

alafanechere mentioned this pull request Mar 4, 2024

live-tests: implement debug mode #35786

Merged

alafanechere force-pushed the regression-tests-package branch 3 times, most recently from 54eb1ce to 3c1e213 Compare March 5, 2024 21:10

alafanechere mentioned this pull request Mar 5, 2024

live-tests: add regression tests suite #35837

Merged

clnoll marked this pull request as ready for review March 6, 2024 00:24

clnoll changed the title ~~WIP: regression tests separate package~~ live-tests: debug mode and initial regression tests framework Mar 6, 2024

WIP: regression tests separate package

8e1ddc1

clnoll and others added 19 commits March 6, 2024 08:27

Handle other commands

232c109

Organize files for all commands

44879c4

Run all commands

77173ac

run all commands

b722295

cleanup

86a9520

DiffComparator that outputs unified diff between output of the two connector versions

1d3b7cb

WIP tests

fd343f3

test_main

a5b2a28

file_backend tests

35e08df

regression-testing -> live-tests rename

6ca00c8

Simplify regression_tests to only regression test-specific code

fbbf734

Reorg into cli.py + run_tests.py

0385587

COMMANDS -> Command(Enum)

1ea43f9

Don't change poetry.lock for airbyte-ci/connectors/pipelines

8575823

__all__ for backends & comparators

4c47d43

List of commands instead of single command

a7e4a7c

Use regression_tests command group

f1bc311

remove extra __init__ files

60204aa

live-tests: implement debug mode (#35786)

8c0062e

Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>

alafanechere force-pushed the regression-tests-package branch from 4f54bc5 to f74d3cc Compare March 6, 2024 07:28

octavia-squidington-iii added area/connectors Connector related issues connectors/source/pokeapi labels Mar 6, 2024

alafanechere force-pushed the regression-tests-package branch from f74d3cc to 0a4f755 Compare March 6, 2024 07:31

octavia-squidington-iii removed the area/connectors Connector related issues label Mar 6, 2024

Clean up dependencies

d9d08c7

alafanechere force-pushed the regression-tests-package branch from 0a4f755 to d9d08c7 Compare March 6, 2024 07:41

alafanechere approved these changes Mar 6, 2024

clnoll merged commit 1571dbd into master Mar 6, 2024
35 checks passed

clnoll deleted the regression-tests-package branch March 6, 2024 13:18

xiaohansong pushed a commit that referenced this pull request Mar 7, 2024

live-tests: debug mode and initial regression tests framework (#35624)

5477231

Co-authored-by: alafanechere <augustin.lafanechere@gmail.com> Co-authored-by: Augustin <augustin@airbyte.io>
