feat: Support GCS remotecache #5910

Open: wants to merge 14 commits into base: master

@anurag-harness anurag-harness commented Apr 11, 2025

This PR adds support for using GCS as a remote cache. You can authenticate to your GCS bucket either with a base64-encoded GCP JSON key or with OIDC.

We have been using this in Harness CI for a few months.

Here is an execution using a base64-encoded GCP JSON key

Screenshot 2025-04-15 at 3 12 45 PM
Screenshot 2025-04-15 at 3 12 55 PM

Here is an execution using OIDC auth

Screenshot 2025-04-15 at 3 11 42 PM
Screenshot 2025-04-15 at 3 11 50 PM

@github-actions github-actions bot added area/dependencies Pull requests that update a dependency file area/buildkitd area/remotecache labels Apr 11, 2025
@anurag-harness anurag-harness changed the title feat: Support exporting and importing cache from GCS feat: Support GCS remotecache Apr 11, 2025
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Member

@crazy-max crazy-max left a comment

Thanks

We would need integration tests similar to #3398 and #3477.

Also vendoring is quite huge, what is the binary size diff?

Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
@anurag-harness
Author

Thank you @crazy-max for reviewing the PR!
I will work on the integration tests and update the PR. I will also post the binary size diff.

@anurag-harness
Author

anurag-harness commented Apr 14, 2025

#3477

@crazy-max Do you think https://github.com/fsouza/fake-gcs-server is the best way to simulate a GCS server?
I wasn't able to find anything official like Azurite.
fake-gcs-server doesn't have auth support, so I probably need to find something else.

Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
@anurag-harness
Author

@crazy-max
Binary size before: 61 MB
Binary size after: 65.4 MB

Screenshot 2025-04-15 at 2 28 10 PM Screenshot 2025-04-15 at 2 30 37 PM

@anurag-harness
Author

@crazy-max
I have added integration tests using fake-gcs-server. Can you please review? Thank you!

Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
@anurag-harness
Author

Updated the README and tried to fix the linting issues.

Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
@anurag-harness
Author

I ran docker buildx bake validate locally to make sure there are no more linting or doc errors.

@anurag-harness
Author

@crazy-max Can I please get a review on this again when possible?

Member

@crazy-max crazy-max left a comment

Looking at the changes I have multiple concerns:

Binary size before: 61 MB
Binary size after: 65.4 MB

We are currently being very intentional about keeping the binary size of our application lean, and this change increases it from ~61MB to ~65MB.

The additional size primarily comes from introducing a large set of dependencies that are only used for this single backend, and I'm not comfortable with the tradeoff at this time. Maintaining a small binary is important to us for distribution.

Looking at the changes, this seems very similar to the S3 cache backend. I would prefer if we could make GCS and S3 interoperable. Looking at #3749, it seems possible.

A couple of additional questions before we can consider moving forward:

  • Would you be open to maintaining this backend long-term? Since it's adding a new surface area to the project, we'd want to make sure there's someone committed to keeping it working as GCS APIs evolve.
  • Could you share a bit more about your use case and the context in which you're using GCS? That will help us better understand how generalizable this backend is and whether it makes sense as part of the core project. At the moment, no user seems to be asking for this backend specifically.
  • Also, what's your level of experience with GCS and the Go SDK? Just trying to get a sense of how deeply you've worked with this ecosystem, especially if issues or edge cases come up later.

Depending on the answers, we might consider alternatives like making this backend optional via build tags or external integration.
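
A minimal sketch of the build-tag idea, assuming a simple registry pattern: each backend registers itself from an `init` function, and the optional backend's registration sits in a file guarded by a build constraint. All names here (`cacheBackend`, `registerBackend`, the `gcs` tag) are hypothetical and do not reflect buildkit's actual wiring. The example is collapsed into one file so it runs as-is.

```go
package main

import (
	"fmt"
	"sort"
)

// cacheBackend is a hypothetical stand-in for a remote cache backend.
type cacheBackend interface {
	Name() string
}

// backends maps a cache type name to its constructor.
var backends = map[string]func() cacheBackend{}

// registerBackend adds a backend constructor to the registry.
func registerBackend(name string, ctor func() cacheBackend) {
	backends[name] = ctor
}

// availableBackends returns the registered backend names in sorted order.
func availableBackends() []string {
	names := make([]string, 0, len(backends))
	for name := range backends {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

type s3Backend struct{}

func (s3Backend) Name() string { return "s3" }

type gcsBackend struct{}

func (gcsBackend) Name() string { return "gcs" }

// Always-on backend: registered in every build.
func init() {
	registerBackend("s3", func() cacheBackend { return s3Backend{} })
}

// Optional backend. In a real layout (an assumption of this sketch), the
// gcsBackend type and this init would live in their own file whose first
// line is the build constraint
//
//	//go:build gcs
//
// so the backend and its dependencies are compiled in only when the tag is
// set. Here it is registered unconditionally to keep the example in one file.
func init() {
	registerBackend("gcs", func() cacheBackend { return gcsBackend{} })
}

func main() {
	fmt.Println("available cache backends:", availableBackends())
}
```

With that split, `go build -tags gcs` would include the backend, and a default build would leave it and its dependencies out of the binary.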

@crazy-max crazy-max self-assigned this Apr 17, 2025
@anurag-harness
Author

anurag-harness commented Apr 17, 2025

Yes, we were initially using an HMAC access key and secret key (AK and SK) for GCS buckets (as mentioned in #3749) and going through the S3 flow, but that wasn't enough for us.

Most of our users are not comfortable using an AK and SK due to security concerns, so we had to add support for the GCS SDK, which allows inheriting auth from the environment and supports OIDC.

Answers to your questions:

  1. If we are able to open source our changes to buildkit, we can eventually shift to the open-source version rather than our internal fork with these changes. Since we have users on this flow, we would do our best to maintain this long term.
  2. As mentioned above, we have users who use GCS and are not comfortable with AK- and SK-based auth. That is why we had to add a new backend that supports all auth types, so they can take advantage of remote caching.
  3. I'm not an expert with the GCS Go SDK. I have used it to support different auth flows, and in multiple projects to pull from and push to buckets.

Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
Signed-off-by: Anurag Madnawat <anurag.madnawat@harness.io>
@thompson-shaun thompson-shaun modified the milestone: v0.23.0 May 22, 2025