Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metadata: set cache control headers #26091

Merged
merged 4 commits into from
May 16, 2023
Merged

Conversation

bnchrch
Copy link
Contributor

@bnchrch bnchrch commented May 15, 2023

What

Ensure we are not caching the metadata files, spec or registry at the CDN level.

While we had caching disabled, google caches public buckets by default.

More context here: https://airbytehq-team.slack.com/archives/C02U46QK5C3/p1683940962000049?thread_ts=1683925529.873469&cid=C02U46QK5C3

Notes for reviewers

Google does not have a bucket level setting for this 馃槩

So we will want to run this command after deploying

gsutil setmeta -r -h "Cache-Control: no-cache" gs://io-airbyte-cloud-spec-cache
gsutil setmeta -r -h "Cache-Control: no-cache" gs://prod-airbyte-cloud-connector-metadata-service

@bnchrch bnchrch requested a review from alafanechere May 15, 2023 20:43
@bnchrch bnchrch changed the title Bnchrch/set cache control headers Metadata: set cache control headers May 16, 2023
Copy link
Contributor

@alafanechere alafanechere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, small change for greater peace of mind 馃憤

Comment on lines +49 to +51
blob_to_save.cache_control = "no-cache"

blob_to_save.upload_from_filename(file_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shall we have a try/catch that make the function return false on error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want the loud error in this case.

But we could change this to

return blob_to_save.exists()

However that doesnt help much in an overwrite scenario

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what uploads the metadata file right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what upload the registry right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct!

@alafanechere
Copy link
Contributor

So we will want to run this command after deploying
gsutil setmeta -r -h "Cache-Control: no-cache" gs://io-airbyte-cloud-spec-cache
gsutil setmeta -r -h "Cache-Control: no-cache" gs://prod-airbyte-cloud-connector-metadata-service

Can this be terraformed or re-executed after each metadata / registry upload?

@bnchrch
Copy link
Contributor Author

bnchrch commented May 16, 2023

@alafanechere thanks for the review.

So we will want to run this command after deploying
gsutil setmeta -r -h "Cache-Control: no-cache" gs://io-airbyte-cloud-spec-cache
gsutil setmeta -r -h "Cache-Control: no-cache" gs://prod-airbyte-cloud-connector-metadata-service

Can this be terraformed or re-executed after each metadata / registry upload?

Im talking with ops after the hackdays about this. Looking at some google help forms our best option is to add a serverless function to gcloud that reruns this on every upload.

if thats the option I would say lets avoid it and start ensuring the metadata_service library is the entry point to these buckets.

@bnchrch bnchrch merged commit 36b3358 into master May 16, 2023
28 checks passed
@bnchrch bnchrch deleted the bnchrch/set-cache-control-headers branch May 16, 2023 16:10
marcosmarxm pushed a commit to natalia-miinto/airbyte that referenced this pull request Jun 8, 2023
* set cache control header to no-cache for metadata upload

* set cache control header to no-cache for spec cache upload

* set cache control header to no-cache for registry upload

* Run formatter
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants