live-tests: debug mode and initial regression tests framework #35624
logger = logging.getLogger(__name__)


async def _main(
I'd love to separate connector image retrieval (test bootstrapping) from actual command execution.
I suggest creating a ConnectorUnderTest dataclass which could have the following attributes:

    @dataclass
    class ConnectorUnderTest:
        technical_name: str
        version: semver.Version  # or str
        container: dagger.Container

We could add other attributes for reporting if needed, but the key here is the container attribute, whose retrieval happens before instantiation.
We can then call dispatch on a ConnectorUnderTest instance and keep the metadata attributes at hand during the whole execution flow.
For connector container retrieval, we can implement multiple upstream functions producing ConnectorUnderTest according to the execution context:
- You want to run regression tests on two already released versions: we create the connector container with dagger_client.container().from_(image_address) in a specific function.
- You want to run regression tests on a target which is not released: we build it, re-using the airbyte-ci build step, in a specific function again.
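The split between bootstrapping and execution suggested above could be sketched roughly as follows. This is only an illustration, not the PR's implementation: `ClientLike`, `from_released_image`, and the `airbyte/<name>:<version>` image address format are assumptions, and the version is kept as a plain string so the sketch does not depend on dagger or semver. Only the `container().from_(image_address)` call shape comes from the comment.

```python
from dataclasses import dataclass
from typing import Any, Protocol


class ClientLike(Protocol):
    """Hypothetical stand-in for the part of the dagger client used here."""

    def container(self) -> Any: ...


@dataclass(frozen=True)
class ConnectorUnderTest:
    technical_name: str
    version: str  # could be semver.Version, as the review suggests
    container: Any  # a dagger.Container in the real code


def from_released_image(client: ClientLike, technical_name: str, version: str) -> ConnectorUnderTest:
    """Build a ConnectorUnderTest for an already released version.

    The image address format below is an assumption made for this sketch.
    """
    image_address = f"airbyte/{technical_name}:{version}"
    # The container is retrieved before the dataclass is instantiated.
    container = client.container().from_(image_address)
    return ConnectorUnderTest(technical_name, version, container)
```

A second factory for unreleased targets (building through the airbyte-ci build step) would return the same dataclass, so the code that dispatches commands never needs to know where the container came from.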
👍 I've added the ConnectorUnderTest dataclass and added a TODO for handling building of images based on context.
raise NotImplementedError(f"{command} is not recognized. Must be one of {', '.join(COMMANDS)}")


@click.command()
Can we declare the command(s) in a separate cli.py module?
Done.
    default=None,
    type=str,
)
def main(
I think we should use asyncclick instead of click so that we can declare async def main and just await on _main.
Example of the same pattern here on the QA checks side.
👍 Done.
):
    runner = ConnectorRunner(container, backend, f"{output_directory}/{command}")

    if command == "check":
Can we introduce a Command(Enum)?
Definitely. Done.
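For readers following along, a Command enum of the kind requested here might look like the sketch below. The member values are taken from the COMMANDS list shown elsewhere in this review; the `parse_command` helper is a hypothetical name added for illustration and is not claimed to exist in the PR.

```python
from enum import Enum


class Command(Enum):
    CHECK = "check"
    DISCOVER = "discover"
    READ = "read"
    READ_WITH_STATE = "read-with-state"
    SPEC = "spec"


def parse_command(raw: str) -> Command:
    """Convert a CLI string into a Command, with a readable error on bad input."""
    try:
        return Command(raw)
    except ValueError:
        valid = ", ".join(c.value for c in Command)
        raise ValueError(f"{raw} is not recognized. Must be one of {valid}") from None
```

Comparing against `Command.READ` instead of the string "read" lets a typo fail at import or lookup time rather than silently falling through an if/elif chain.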
        await runner.call_discover(config)

    elif command == "read":
        runner = ConnectorRunner(container, backend, f"{output_directory}/read")
These command-specific runners can be removed, right? The runner defined on line 92 can be reused.
Yep! Removed.
    catalog: dict = None,
    state: Union[dict, list] = None,
    enable_caching=True,
) -> List[AirbyteMessage]:
It's not returning a list of AirbyteMessage if I'm not mistaken 😄
What do you think about making _run return a tuple with the executed command and the awaited container?
Then this tuple can be passed to a "backend" which would produce the artifacts we need downstream for debugging or testing.
This would help make this class self-contained: it would only run Airbyte commands on connector containers.
entrypoint = await container.entrypoint()
airbyte_command = entrypoint + airbyte_command
container = container.with_exec(
    ["sh", "-c", " ".join(airbyte_command) + f" > {self.IN_CONTAINER_OUTPUT_PATH} 2>&1 | tee -a {self.IN_CONTAINER_OUTPUT_PATH}"],
This assumes sh and tee are available in the container.
I think we should rather await on a with_exec which does not alter the original entrypoint, and get stdout + stderr.
And as I suggested above, I think this can happen in a different class, maybe at the backend class level.
E.g., _run "just" returns:
    return command, await container.with_exec(airbyte_command)
Then in a different code path:
    command, executed_container = connector_run.call_read(***)
    stdout, stderr = await executed_container.stdout(), await executed_container.stderr()
    # stdout and stderr can then be written to files, a DB, or whatever the "backend" implements
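To make the proposed hand-off concrete, here is a minimal sketch of a backend that receives the (command, executed container) tuple and persists the raw output. `ExecutedContainer`, `FileBackend`, and `run_and_store` are hypothetical names for this illustration; the only thing assumed about the executed container is that it exposes awaitable `stdout()` and `stderr()` methods, as in the snippet above.

```python
from pathlib import Path
from typing import Awaitable, Callable, Protocol, Tuple


class ExecutedContainer(Protocol):
    """Hypothetical view of an executed container: only stdout/stderr are needed."""

    async def stdout(self) -> str: ...

    async def stderr(self) -> str: ...


class FileBackend:
    """Hypothetical backend that stores raw output under one directory per command."""

    def __init__(self, output_directory: Path) -> None:
        self._output_directory = output_directory

    async def write(self, command: str, executed: ExecutedContainer) -> Path:
        target = self._output_directory / command
        target.mkdir(parents=True, exist_ok=True)
        # Blocking file writes keep the sketch short; real code may prefer async I/O.
        (target / "stdout.txt").write_text(await executed.stdout())
        (target / "stderr.txt").write_text(await executed.stderr())
        return target


async def run_and_store(
    run: Callable[[], Awaitable[Tuple[str, ExecutedContainer]]],
    backend: FileBackend,
) -> Path:
    """Run a command (stand-in for the runner's _run) and hand the result to the backend."""
    command, executed = await run()
    return await backend.write(command, executed)
```

With this shape the runner needs neither sh nor tee inside the connector image, and swapping the file backend for a database or in-memory one does not touch the runner.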
    raw_output = await AnyioPath(filepath).read_text()
    await self._backend.write(self._raw_output_iter(raw_output))

def _raw_output_iter(self, raw_output):
I think the runner should not perform any deserialization. I'd suggest deserialization happen post dispatch, when we actually need deserialized stuff in tests or diffs.
My point is that error detection should happen at a more downstream level, where actual good or bad connector behavior will be evaluated.
I like the flow of:
- Run commands on containers and gather raw outputs (stdout, stderr).
- Build objects from these outputs with backends.
- Use a backend useful for tests, another one for debugging, etc.
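The flow above could be sketched with a downstream parsing step like the one below. It is a simplified illustration: `parse_raw_output` is a hypothetical helper, and it produces plain dicts rather than AirbyteMessage objects so that it has no dependency on the Airbyte protocol models.

```python
import json
from typing import List, Tuple


def parse_raw_output(raw_output: str) -> Tuple[List[dict], List[str]]:
    """Split raw stdout into parsed JSON objects and lines that are not JSON objects."""
    messages: List[dict] = []
    unparsed: List[str] = []
    for line in raw_output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            # Keep the line instead of raising: judging it is the caller's job.
            unparsed.append(line)
            continue
        if isinstance(parsed, dict):
            messages.append(parsed)
        else:
            unparsed.append(line)
    return messages, unparsed
```

Because unparseable lines are returned rather than raised on, a test backend can decide to fail on them while a debugging backend just stores them.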
logger = logging.getLogger(__name__)

COMMANDS = ["check", "discover", "read", "read-with-state", "spec"]
Would love a Commands enum 😄
Done.
    default=None,
    type=str,
)
def main(
Can we call this command run and create it under a regression-test command group?
👍 done.
# TODO: maybe use proxy to cache the response from the first round and use the cache for the second round
# (this may only make sense for syncs with an input state)
if command == "all":
Can we instead change the main signature to take a commands list and iterate on commands? It will help you avoid the if/else.
Yep! Done.
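As an illustration of the "take a list and iterate" suggestion, the special-cased "all" branch can be replaced by a loop over whatever commands were requested. This is a sketch under assumptions: `run_commands` and the `handlers` mapping are hypothetical, with each handler standing in for a runner method such as call_check or call_discover.

```python
from typing import Awaitable, Callable, Dict, List


async def run_commands(
    commands: List[str],
    handlers: Dict[str, Callable[[], Awaitable[str]]],
) -> Dict[str, str]:
    """Run each requested command in order and collect its result by name."""
    unknown = [c for c in commands if c not in handlers]
    if unknown:
        raise ValueError(f"Unrecognized commands: {', '.join(unknown)}")
    results: Dict[str, str] = {}
    for command in commands:
        # One code path for every command: no if/elif chain, no "all" special case.
        results[command] = await handlers[command]()
    return results
```

Running "all" then simply means passing every known command name in the list.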
dagger_client.host().directory(self._test_output_directory)))
os.makedirs(self._in_container_results_directory, exist_ok=True)

async def _diff(self, container: Container, control_connector: ConnectorUnderTest, target_connector: ConnectorUnderTest) -> Container:
If we say we are comparing raw outputs, can we diff the stderr and stdout of the control and the target?
Let's say you can access the executed container with control_connector.executed_container.
Then you can mount the stdout/stderr as files in your container with diff:
    self._container
    .with_new_file(f"{control_directory}/stdout.txt", await control_connector.executed_container.stdout())
    .with_new_file(f"{target_directory}/stdout.txt", await target_connector.executed_container.stdout())
    .with_new_file(f"{control_directory}/stderr.txt", await control_connector.executed_container.stderr())
    .with_new_file(f"{target_directory}/stderr.txt", await target_connector.executed_container.stderr())
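For comparison, the same control-versus-target check can be illustrated on plain strings with Python's difflib. This is a swapped-in technique for illustration only, not what the comment proposes: the comment mounts the files into a container that has diff, whereas this sketch diffs the captured strings in process. `diff_raw_output` is a hypothetical helper name.

```python
import difflib
from typing import List


def diff_raw_output(control: str, target: str, name: str = "stdout.txt") -> List[str]:
    """Return unified-diff lines between control and target output (empty if identical)."""
    return list(
        difflib.unified_diff(
            control.splitlines(),
            target.splitlines(),
            fromfile=f"control/{name}",
            tofile=f"target/{name}",
            lineterm="",
        )
    )
```

An empty result means the two versions produced identical raw output for that stream; any returned lines can be logged or stored by whichever backend is in use.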
It's interesting to note that with this approach the connector output never transits through the host filesystem, which can be a useful security property.
As discussed, I think it would make sense to put this behind a debug flag. Let me know when your changes to the connector runner are stable and I'll add it.
    { include = "live_tests", from = "src" },
]

[tool.poetry.dependencies]
I had to add click as a dependency:
click = "^8.1.3"
After an update, I now needed to add:
asyncclick = "^8.1.3.2"
@alafanechere I've addressed all the comments except the ones touching the ConnectorRunner, which you're taking, and the --debug flag that we discussed for determining whether to store the raw output. I'll add it in when your ConnectorRunner changes are stable. In the meantime I'll add some more unit tests.
Co-authored-by: Catherine Noll <clnoll@users.noreply.github.com>
Co-authored-by: alafanechere <augustin.lafanechere@gmail.com>
Co-authored-by: Augustin <augustin@airbyte.io>
Initial implementation of "live tests": tests that are intended to be run against actual APIs instead of static data. This currently implements a debug command, which forms the foundation for regression tests by providing an interface for running an Airbyte command against multiple versions of a connector and storing the output to disk.
To run regression-style tests with the current version of live-tests, we run, e.g.
Results will be stored in the output-directory and can be compared by
Additional details are in the README.