publish spec as part of connector publish script #5994

Merged: cgardens merged 10 commits into master from cgardens/publish_specs_to_cache on Sep 13, 2021

Contributor

@cgardens cgardens commented Sep 11, 2021

Closes #5266

What

  • This PR is the second half of the project that was started in this PR: Add scheduler client that pulls from bucket cache #5605
  • Our goal is to publish to a cache the spec for a connector when we release a new version of it. This cache can then be used by the app to pull specs without having to run a docker container. The previous PR handled setting up reading from the cache (and falling back on running the spec job if there was a cache miss). This PR handles writing to the cache when we publish a connector.

How

  • In the manage.sh script add logic to push to our GCS bucket cache.
  • Note: I have sanity checked that this actually works locally.
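
The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from manage.sh: the bucket name, the object layout, and the helper names `spec_cache_uri` and `publish_spec` are hypothetical. Only the `docker run ... spec | jq .spec` step is taken from the diff in this PR.

```shell
#!/usr/bin/env bash
# Sketch only: bucket name and object layout are hypothetical placeholders.
SPEC_CACHE_BUCKET="${SPEC_CACHE_BUCKET:-example-spec-cache-bucket}"

# Build the (assumed) cache location for one connector image version.
spec_cache_uri() {
  local image_name="$1"
  local image_version="$2"
  echo "gs://${SPEC_CACHE_BUCKET}/specs/${image_name}/${image_version}/spec.json"
}

# Run the connector's spec command, keep the spec payload, and copy it to the bucket.
publish_spec() {
  local image_name="$1"
  local image_version="$2"
  local tmp_spec_file; tmp_spec_file=$(mktemp)
  local rc
  docker run --rm "${image_name}:${image_version}" spec | jq .spec > "$tmp_spec_file"
  # Refuse to upload an empty file, e.g. when the container produced no output.
  if [ ! -s "$tmp_spec_file" ]; then
    echo "no spec produced for ${image_name}:${image_version}" >&2
    rm -f "$tmp_spec_file"
    return 1
  fi
  gsutil cp "$tmp_spec_file" "$(spec_cache_uri "$image_name" "$image_version")"
  rc=$?
  rm -f "$tmp_spec_file"
  return "$rc"
}
```

Keeping the URI construction in its own function means the reader side and the writer side of the cache could share one definition of the object layout, whatever layout is actually chosen.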

Recommended reading order

  1. tools/integrations/manage.sh

Pre-merge Checklist

  • resolve open question below

Open Question

  • I'm not entirely sure of the most ergonomic way to handle the service account key that is needed to make this script work. See in-line comment below.

@github-actions github-actions bot added the area/platform issues related to the platform label Sep 11, 2021
@@ -90,15 +90,19 @@ public BucketSpecCacheSchedulerClient(final SynchronousSchedulerClient client, f

Contributor Author

all changes in this file are just adding debug log statements.

# publish spec to cache. do so, by running get spec locally and then pushing it to gcs.
local tmp_spec_file; tmp_spec_file=$(mktemp)
docker run --rm "$versioned_image" spec | jq .spec > "$tmp_spec_file"
gcloud auth activate-service-account --key-file /Users/charles/Downloads/prod-ab-cloud-proj-bdb658ebbe5a.json
Contributor Author

so this line is obviously wrong (since it mentions the FS on my local machine). As far as I can tell this script gets called from 2 places: 1. from people's local machines 2. from GH actions. I think my inclination is to add another argument for cmd_publish where the user has to pass the path to the service account key. @sherifnada does this feel like a reasonable way of going about it? It's a credential so it has to add some developer friction, but I'm trying to keep it as light as possible.

I guess it should be designed so that if someone doesn't pass the cred but their gsutil is already authed into a user that can access the bucket then it should still work as well.
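
One way to read the proposal above is sketched below: an optional key path that, when omitted, leaves the existing gcloud/gsutil login in place. The function name `activate_spec_cache_auth` and its messages are hypothetical; the only real command used is `gcloud auth activate-service-account --key-file`, which already appears in the diff.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and messages are hypothetical.
# $1 is an optional path to a service account key file.
activate_spec_cache_auth() {
  local key_file="${1:-}"
  if [ -z "$key_file" ]; then
    # No key passed: rely on whatever account gcloud/gsutil is already authenticated as.
    echo "no key file given; using existing gcloud credentials" >&2
    return 0
  fi
  if [ ! -f "$key_file" ]; then
    echo "service account key not found: $key_file" >&2
    return 1
  fi
  gcloud auth activate-service-account --key-file "$key_file"
}
```

With this shape, a local user who is already logged in pays no extra cost, and a missing key file fails early instead of surfacing later as a permissions error during upload.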

Contributor

seems reasonable to me to limit publishing new connector versions to only GitHub

Contributor Author

kk. well it's possible in both places now :D

# publish spec to cache. do so, by running get spec locally and then pushing it to gcs.
local tmp_spec_file; tmp_spec_file=$(mktemp)
docker run --rm "$versioned_image" spec | jq .spec > "$tmp_spec_file"
gcloud auth activate-service-account --key-file /Users/charles/Downloads/prod-ab-cloud-proj-bdb658ebbe5a.json
Contributor

we should inject this key in CI and pull from an env variable
There generally shouldn't be a strong need to publish from a local machine as you can already skip tests from CI if needed
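
A minimal sketch of the env-variable approach suggested here, assuming CI injects the JSON key contents into a variable. The variable name `SPEC_CACHE_SERVICE_ACCOUNT_KEY` and the helper `write_key_from_env` are hypothetical and are not taken from the PR.

```shell
#!/usr/bin/env bash
# Sketch only: SPEC_CACHE_SERVICE_ACCOUNT_KEY is a hypothetical variable name,
# assumed to hold the JSON key contents injected by CI.
write_key_from_env() {
  if [ -z "${SPEC_CACHE_SERVICE_ACCOUNT_KEY:-}" ]; then
    echo "SPEC_CACHE_SERVICE_ACCOUNT_KEY is empty or unset" >&2
    return 1
  fi
  local key_file; key_file=$(mktemp)
  # Write the key contents verbatim, without adding a trailing newline.
  printf '%s' "$SPEC_CACHE_SERVICE_ACCOUNT_KEY" > "$key_file"
  # Print the path so the caller can pass it to
  # `gcloud auth activate-service-account --key-file`.
  echo "$key_file"
}
```

The caller is responsible for deleting the temporary key file once authentication has been activated.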

Contributor Author

kk. made it so it was possible to do so locally if needed. happens by default in the publish command on GH


# publish spec to cache. do so, by running get spec locally and then pushing it to gcs.
local tmp_spec_file; tmp_spec_file=$(mktemp)
docker run --rm "$versioned_image" spec | jq .spec > "$tmp_spec_file"
Contributor Author

@sherifnada this will break if spec returns anything but a single airbyte message of type spec. do you have any sense of how bad of an assumption that is? i guess we can protect against this by getting a little fancier with jq.

Contributor

I recommend filtering to the right spec message. It shouldn't be that much harder to filter to only spec messages right? (and there should definitely only be one of those)

Contributor Author

cool. added new jq logic. it handles:

  1. filtering out invalid JSON
  2. picking the first spec if multiple are returned
  3. failing if no specs are returned
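
The three behaviours listed above could be implemented along these lines. This is a sketch, not necessarily the jq expression that was merged: the helper name `extract_spec` is hypothetical, and it assumes a spec message is any JSON object carrying a non-null `spec` field, which matches the original `jq .spec` step.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads connector `spec` output on stdin and prints
# one spec as compact JSON, or fails if no spec is present.
extract_spec() {
  # -R reads each line as a raw string; `fromjson?` silently drops
  # lines that are not valid JSON (e.g. plain log lines).
  jq -R 'fromjson?' |
    # -s slurps all parsed values into one array. Keep only objects with a
    # non-null `spec` field, take the first one, and raise an error if none exist.
    jq -cs 'map(select(type == "object") | select(.spec != null) | .spec)
            | if length > 0 then .[0] else error("no spec found") end'
}
```

In the publish step this would be used as: docker run --rm "$versioned_image" spec | extract_spec > "$tmp_spec_file". When no spec is found, jq exits with a non-zero status, so the caller can stop before uploading anything.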

Contributor

@davinchia davinchia left a comment

Generally looks good.

1 small comment.

Contributor Author

cgardens commented Sep 13, 2021

/publish connector=connectors/source-pokeapi

🕑 connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231427196
❌ connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231427196

@jrhizor jrhizor temporarily deployed to more-secrets September 13, 2021 22:20 Inactive
Contributor Author

cgardens commented Sep 13, 2021

/publish connector=connectors/source-pokeapi

🕑 connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231438100
❌ connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231438100

@jrhizor jrhizor temporarily deployed to more-secrets September 13, 2021 22:25 Inactive
Contributor Author

cgardens commented Sep 13, 2021

/publish connector=connectors/source-pokeapi

🕑 connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231552190
❌ connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231552190

@jrhizor jrhizor temporarily deployed to more-secrets September 13, 2021 23:14 Inactive
Contributor Author

cgardens commented Sep 13, 2021

/publish connector=connectors/source-pokeapi

🕑 connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231567662
❌ connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231567662

@jrhizor jrhizor temporarily deployed to more-secrets September 13, 2021 23:21 Inactive
Contributor Author

cgardens commented Sep 13, 2021

/publish connector=connectors/source-pokeapi

🕑 connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231618936
✅ connectors/source-pokeapi https://github.com/airbytehq/airbyte/actions/runs/1231618936

@jrhizor jrhizor temporarily deployed to more-secrets September 13, 2021 23:43 Inactive
@cgardens cgardens merged commit 74c9986 into master Sep 13, 2021
@cgardens cgardens deleted the cgardens/publish_specs_to_cache branch September 13, 2021 23:49
@@ -69,6 +89,34 @@ cmd_publish() {
echo "Publishing new version ($versioned_image)"
docker push "$versioned_image"
docker push "$latest_image"

if [[ "true" == "${publish_spec_to_cache}" ]]; then
Contributor

Should there be extra conditions here to exclude normalization docker images when publishing? see #6052 (comment)

Successfully merging this pull request may close these issues.

Airbyte Cloud: "Cache" connector containers
5 participants