Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

airbyte-lib: Add testing to connectors #34044

Merged
merged 21 commits into from
Feb 6, 2024

Conversation

flash1293
Copy link
Contributor

@flash1293 flash1293 commented Jan 9, 2024

Building on the airbyte-lib-validate-source command to validate whether a python connector works with airbyte-lib, this PR introduces a new optional test step to run this command with the current connector under test.

To test:

  • Install dev airbyte-ci tools via make tools.airbyte-ci-dev.install
  • Run tests for connector where they will work: airbyte-ci connectors --name=source-apify-dataset test

Copy link

vercel bot commented Jan 9, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Feb 5, 2024 4:09pm

@octavia-squidington-iii octavia-squidington-iii added area/connectors Connector related issues area/documentation Improvements or additions to documentation connectors/source/pokeapi labels Jan 31, 2024
@flash1293 flash1293 marked this pull request as ready for review January 31, 2024 16:43
@flash1293 flash1293 requested a review from a team January 31, 2024 16:43
Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks good to me and I like the fact that the validation happens in the airbyte-lib package.
I'm a bit more concern by the footprint of this new check on the connector tests in CI.
Could we make the following part of the validation optional to not overlap the test with CAT parts?

    * Answers according to the Airbyte protocol when called with spec, check, discover and read.

@flash1293
Copy link
Contributor Author

flash1293 commented Feb 2, 2024

@alafanechere Maybe this is worded badly - what it means to say is that airbyte-lib can use the connector as expected, which includes getting the spec to statically validate the config the user has provided, running check to fully validate it, and running discover and read to actually fetch data and give it to the user.

The check is not doing any protocol-specific checks, it's just trying to use airbyte-lib with the connector:

source = ab.get_connector(

When you say footprint, are you talking about test runtimes? Because this is relatively lightweight in this sense, it calls spec, check, discover and read only once and stops after the very first record.

The reason why I would like to actually call read in this test is that some connectors do things only if a read is actually triggered (e.g. lazily loading some modules or static files) and I want to validate that this works correctly. If airbyte-lib managed to get a record from a single stream out of the connector, there is a very good chance the other ones will work too

@flash1293
Copy link
Contributor Author

For example for pokeapi the validation took only 13s (and other connectors shouldn't be much slower as it will stop as soon as it has its record):

Screenshot 2024-02-02 at 17 09 30

@alafanechere
Copy link
Contributor

alafanechere commented Feb 5, 2024

The reason why I would like to actually call read in this test is that some connectors do things only if a read is actually triggered (e.g. lazily loading some modules or static files) and I want to validate that this works correctly. If airbyte-lib managed to get a record from a single stream out of the connector, there is a very good chance the other ones will work too

This makes sense to me, and I'm 🆗 with it if it's not significantly slowing down the full test suite. One concern remains: calling read or discover on a connector with short lived access token means these operation can lead to an update of the access token that has to be reflected on the GSM side. Is your validation code performing this update?

@flash1293
Copy link
Contributor Author

@alafanechere good point, I changed this PR to only implement the basic check of "install works and spec can be fetched" for now. I will create an issue to track the more complete setup

@flash1293
Copy link
Contributor Author

The failing test looks unrelated to the changes to me (and also passes locally) - maybe it's flaky?

Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change.
Would you mind manually testing it on a couple of connectors by dispatching this workflow?

Joe Reuter added 3 commits February 5, 2024 17:06
@flash1293
Copy link
Contributor Author

@flash1293
Copy link
Contributor Author

Ran tests for a bunch of connectors (see above) and the airbyte-lib validation passed for all of them, merging and crafting a message to connector devs so they are aware

@flash1293 flash1293 merged commit 0b8496c into master Feb 6, 2024
22 checks passed
@flash1293 flash1293 deleted the flash1293/airbyte-lib-testing-proper branch February 6, 2024 09:25
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 21, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
jatinyadav-cc pushed a commit to ollionorg/datapipes-airbyte that referenced this pull request Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants