Regression tests: run in GitHub Actions #37659

Merged: 48 commits, Apr 30, 2024

@clnoll clnoll commented Apr 29, 2024

What

Adds the ability to run regression tests manually in GitHub Actions.

How

  1. Adds a new workflow for regression tests that takes the connector and connection ID as input.
  2. Updates the regression tests to remove auto_select_connection as an input option. Instead, we auto-select if the connection ID is "auto". This removes one degree of freedom which was complicating the connection retrieval logic.
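
A minimal sketch of the single-input design described in item 2. The helper name `resolve_connection_id` and the `auto_select` callable (a stand-in for whatever lookup picks a connection) are hypothetical and not taken from this PR:

```python
from typing import Callable

AUTO = "auto"  # sentinel value named in this PR's description


def resolve_connection_id(connection_id: str, auto_select: Callable[[], str]) -> str:
    """Return an explicit connection ID, or auto-select when the input is "auto".

    `auto_select` is a hypothetical stand-in for the real connection lookup.
    """
    value = connection_id.strip()
    if not value:
        raise ValueError('connection_id must be a connection ID or "auto"')
    if value == AUTO:
        return auto_select()
    return value
```

With a single input there is no combination of flags that can contradict each other: the value is either the sentinel or an ID.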

Note:
Using the connection-retriever required some setup that isn't entirely obvious from this PR, including:

  • Accessing the airbyte-platform-internal repo while building the regression test container. Locally, authorized users can download it over SSH, but CI has to use HTTPS, so we modified the live tests' pyproject.toml and configure it with the appropriate credentials at build time.
  • Authenticating with gcloud, which involved creating a new service account and key, GCP_INTEGRATION_TESTER_CREDENTIALS, stored in GitHub secrets.
  • Granting this service account several permissions (in prod-ab-cloud-proj and ab-analytics).
  • Connecting to Postgres, which requires authenticating by running Google's Cloud SQL Auth Proxy. This lets us authenticate without manually managing IP whitelists.
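
The SSH-to-HTTPS switch in the first bullet could look roughly like the sketch below. This is an illustration under assumptions, not the change made in this PR: the helper name `ssh_to_https` is hypothetical, and the `x-access-token` username is one common convention for token-authenticated GitHub URLs; the credential format actually used here is not shown.

```python
import re

# Matches GitHub SSH remotes such as git@github.com:org/repo.git
SSH_PATTERN = re.compile(r"git@github\.com:(?P<path>[\w.\-]+/[\w.\-]+?)(?:\.git)?$")


def ssh_to_https(ssh_url: str, token: str) -> str:
    """Rewrite git@github.com:org/repo(.git) into a token-authenticated HTTPS URL."""
    match = SSH_PATTERN.match(ssh_url.strip())
    if match is None:
        raise ValueError(f"not a GitHub SSH URL: {ssh_url!r}")
    # Assumption: the token is supplied at build time (for example from a CI secret).
    return f"https://x-access-token:{token}@github.com/{match.group('path')}.git"
```

A build step could apply a rewrite like this to the dependency URL in pyproject.toml before installing, so the token never has to be committed.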

Depends on https://github.com/airbytehq/airbyte-platform-internal/pull/12286.

@clnoll clnoll requested a review from a team as a code owner April 29, 2024 15:00

@alafanechere alafanechere left a comment

I don't have anything blocking to raise.
I'd be curious to have more details about the need for a Google Cloud SQL proxy.
I'd also like to understand why you removed the --auto-select-connection flag. (does it require a README.md update?)

And as usual: please bump packages version and update changelogs :D

.github/workflows/regression_tests.yml
id: fetch_last_commit_id_wd
run: echo "commit_id=$(git rev-parse origin/${{ steps.extract_branch.outputs.branch }})" >> $GITHUB_OUTPUT

- name: Run Regression Tests [WORKFLOW DISPATCH]
Contributor

You might get a very cheap "run on multiple connections" feature if you'd use a matrix strategy here:

  • Make the connection_id inputs a comma separated list
  • Parse this list into an array
  • Run this job for each connection id
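
The parsing step in that list could be sketched as follows. The function name `connection_ids_to_matrix` and the `connection_id` matrix key are hypothetical; the JSON string it returns is the kind of value a workflow could pass through `fromJSON` to build a job matrix.

```python
import json


def connection_ids_to_matrix(raw: str) -> str:
    """Turn "id1, id2" into a JSON matrix string, dropping blanks and duplicates."""
    seen: list[str] = []
    for item in raw.split(","):
        item = item.strip()
        if item and item not in seen:
            seen.append(item)
    if not seen:
        raise ValueError("at least one connection ID is required")
    # Hypothetical matrix key; a workflow would read it as matrix.connection_id.
    return json.dumps({"connection_id": seen})
```

Order is preserved so the jobs appear in the same sequence the user typed the IDs.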

Contributor Author

Nice, I'm going to add this as a TODO in a comment since it requires a little bit of extra testing.

Contributor

Did you make sure this refactor works locally too? 🙏

Contributor Author

Yep I did, and am going to do a bit more manual testing before merge. Might be time to start thinking about adding more unit tests 😄 .

@@ -118,7 +116,8 @@ def pytest_configure(config: Config) -> None:
dagger_log_path.touch()
config.stash[stash_keys.DAGGER_LOG_PATH] = dagger_log_path
config.stash[stash_keys.PR_URL] = get_option_or_fail(config, "--pr-url")
config.stash[stash_keys.AUTO_SELECT_CONNECTION] = config.getoption("--auto-select-connection")
Contributor

Why did you get rid of this explicit flag and decide to go implicit with connection_id == 'auto'?

Contributor Author

Since we should only ever have one of auto-select or a connection ID, we don't really need both flags, and having both creates more surface area for bugs or unexpected behavior. For example, when I passed in a connection ID via GitHub Actions I couldn't find a way to turn off auto-select, since pytest doesn't parse falsy values. I could have changed auto-select to default to false, but then users could still provide inputs that don't make sense when combined. So I decided to remove the degree of freedom.
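
One common way a boolean command-line option ends up impossible to switch off is sketched below with plain `argparse`, whose argument style pytest options also follow. The option declaration is hypothetical; how `--auto-select-connection` was originally declared is not shown in this thread.

```python
import argparse

parser = argparse.ArgumentParser()
# Hypothetical declaration, for illustration only.
parser.add_argument("--auto-select-connection", type=bool, default=True)

args = parser.parse_args(["--auto-select-connection", "false"])
# bool("false") is True because any non-empty string is truthy,
# so passing "false" does not disable the option.
auto_select = args.auto_select_connection
```

Collapsing the two inputs into one value sidesteps this class of problem entirely.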


We need to explicitly kill the proxy in order to allow the GitHub Action to exit.

An alternative that we can consider is to run the proxy as a separate service.
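
A minimal sketch of the "explicitly kill the proxy" idea, assuming the proxy is launched as a child process. The `background_process` helper is hypothetical, and the command in the usage example is a placeholder rather than the real proxy invocation:

```python
import subprocess
import sys
from contextlib import contextmanager
from typing import Iterator, Sequence


@contextmanager
def background_process(command: Sequence[str], timeout: float = 10.0) -> Iterator[subprocess.Popen]:
    """Run `command` in the background and always stop it on exit."""
    process = subprocess.Popen(command)
    try:
        yield process
    finally:
        process.terminate()  # polite stop first
        try:
            process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            process.kill()  # force stop if terminate is ignored
            process.wait()


if __name__ == "__main__":
    # Placeholder command; a real run would start the proxy binary instead.
    with background_process([sys.executable, "-c", "import time; time.sleep(60)"]) as proc:
        print("running:", proc.poll() is None)
```

Because the cleanup sits in a `finally` block, the child is stopped even when the tests inside the `with` body raise, which is what lets the surrounding job exit.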
Contributor

Running it with Dagger would spare you from managing the service lifecycle: it would shut down when it's no longer used.

Contributor

I find it the perfect use case for Dagger services :D Google distributes a Docker image for it:
https://cloud.google.com/sql/docs/postgres/sql-proxy#cloud-sql-auth-proxy-docker-image

Contributor Author

Nice, I've added that link. Since the existing code is working I'm going to leave it as-is for now but will circle back to try this out if time permits.

@clnoll clnoll merged commit 7874e32 into master Apr 30, 2024
35 checks passed
@clnoll clnoll deleted the catherine/live-tests-gha branch April 30, 2024 04:55
2 participants