Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add repository provider #244

Conversation

realtimetodie
Copy link

Adds a new provider to get information about the used repository.

Required for a replacement of bazebuild/rules_k8s.

See https://github.com/realtimetodie/rules_k8s, a (possible) candidate for bazel-contrib/rules_k8s 🎉

Related

@thesayyn
Copy link
Collaborator

is there a way to do this without using providers? I don't see us adding providers as we don't depend on them anywhere. It also expands the surface we must worry about when making changes.

@thesayyn
Copy link
Collaborator

One way I see this could be placed is a common target that stamps out the repository information which in turn can be passed to oci_push and some other rule that depends on this information.

@realtimetodie
Copy link
Author

realtimetodie commented May 14, 2023

Everything that enables access to the repository information is fine 👍

@thesayyn
Copy link
Collaborator

Everything that enables access to the repository information is fine 👍

you can do this today by simply doing;

stamping_rule(
  name = "repository",
  repository = "..."
)

oci_push(
  repository = ":repository"
)

some_other_rule(
  repository = ":repository"
)

@realtimetodie
Copy link
Author

I understand, but a Kubernetes manifest can contain multiple images that can point to different registries.

Example

load("@rules_k8s//k8s:defs.bzl", "kubectl_apply")
load("@rules_oci//oci:defs.bzl", "oci_image", "oci_push")

oci_image(
    name = "image",
    architecture = "amd64",
    entrypoint = ["/example"],
    os = "linux",
)

oci_push(
    name = "push_image1",
    image = ":image",
    remote_tags = ["v1"],
    repository = "index.docker.io:9899/janedoe/awesomeapp",
)

kubectl_apply(
    name = "pod",
    images = [":push_image"],
    manifest = "pod.yaml",
)

@thesayyn
Copy link
Collaborator

I see. oci_push prints the reference of the pushed image to stdout. you can get the information that way too.

@realtimetodie
Copy link
Author

This will not work because the oci_push rule is never run. The Bazel run command is used to build and run a single target. Unlike previously, the new rules_k8s does not run multiple actions sequentially. It only executes the kubectl tool for communicating with a Kubernetes cluster's control plane.

@thesayyn
Copy link
Collaborator

thesayyn commented May 15, 2023

This will not work because the oci_push rule is never run. The Bazel run command is used to build and run a single target. Unlike previously, the new rules_k8s does not run multiple actions sequentially. It only executes the kubectl tool for communicating with a Kubernetes cluster's control plane.

then why do you need oci_push targets at all? if they are never run. How do you make sure that the image is pushed upstream and safe to pull?

also, what would be the purpose of this provider if all you need is a repository string which can be shared between targets?

@realtimetodie
Copy link
Author

then why do you need oci_push targets at all? if they are never run. How do you make sure that the image is pushed upstream and safe to pull?

also, what would be the purpose of this provider if all you need is a repository string which can be shared between targets?

I understand and this is a valid point. The weird thing about bazebuild/rules_k8s was, that it pushed images independently from rules_docker. It was a seperate implementation. This is bad design in my opinion.

https://github.com/bazelbuild/rules_k8s/blob/fee80eb69e1921c076167ebebcf5eea3d2e9c707/k8s/go/pkg/resolver/resolver.go#L286

The new rules_k8s should not maintain a separate implementation of the rules_oci push rule. What if we make the implementation of the rules_oci push rule public and convert it so that it uses ctx.action.run instead of returning DefaultInfo? Let me know what you think (dummy code)

load("@rules_oci//oci:defs.bzl", "push_image_impl")

def _kubectl_apply_impl(ctx):
    ...
    for image in images:
        files = push_image_impl(ctx)
        runfiles = runfiles.merge(files)
    ...

@realtimetodie
Copy link
Author

However, I want to point out that making the rules_oci push rule implementation public will not solve the problem of the missing repository information. The repository information is required to construct the registry URL in the manifest. This is how it worked before and would like to keep this API as it is very convenient

Source manifest

image: janedoe/awesomeapp

Resolved manifest

image: index.docker.io:9899/janedoe/awesomeapp

@thesayyn
Copy link
Collaborator

The new rules_k8s should not maintain a separate implementation of the rules_oci push rule. What if we make the implementation of the rules_oci push rule public and convert it so that it uses ctx.action.run instead of returning DefaultInfo

there's no difference between getting files from DefaultInfo of oci_push versus calling the implementation function, which would require you to declare the attributes in your rule.

I still don't get why you need to run oci_push yourself as an action as part of the _kubectl_apply rule? what you need is a full reference to the image repo@digest which is already provided when you run oci_push which _kubectl_apply rule can do there.

@thesayyn
Copy link
Collaborator

image: index.docker.io:9899/janedoe/awesomeapp

don't you need at least :tag or @digest here?

@realtimetodie
Copy link
Author

don't you need at least :tag or @digest here?

I thought about using crane's digest command, but this would require information about the image's repository too

https://github.com/google/go-containerregistry/blob/main/cmd/crane/doc/crane_digest.md

What do you think? I'm not sure how to run the rules_oci push rule from within the kubectl_apply rule

@thesayyn
Copy link
Collaborator

as far as I can tell what you need is;

# object.yaml
- kind: somek8s/kind
   template:
      image: image/repository@to_be_replaced
kubectl_apply(
   template = "object.yaml",
   substitutions = {
      ":label_to_oci_push": "image/repository@to_be_replaced"
   }
)

at analysis time

  • construct runfiles by including every oci_push target in the substitutions attribute

at runtime

  • run every oci_push target and collect the digest printed to stdout but keep a map of which digest belongs to which substitution
  • apply the substitution to the object.yaml file
  • kubectl apply the resulting yaml file.

@thesayyn
Copy link
Collaborator

I'm not sure how to run the rules_oci push rule

here's one of our tests which runs oci_push target in another bash script. https://github.com/bazel-contrib/rules_oci/blob/main/examples/push/test.bash

@thesayyn
Copy link
Collaborator

@realtimetodie does that answer your question?

@thesayyn thesayyn added the wontfix This will not be worked on label May 16, 2023
@realtimetodie
Copy link
Author

@realtimetodie does that answer your question?

The substitution idea is ugly and I don't like it. What I want to achieve is a lightweight re-implementation of bazelbuild/rules_k8s with the same convenient API, but without the fat gopher tools. Are there any stronger arguments against the provider other than that it expands the surface? Providers are a common way to mutually exchange data between Bazel rulesets.

Alternatively, we could also expand the provider and make it more universal. My suggested implementation only covers the requirements of the new rules_k8s. To make it clear, the ultimate goal is to make rules_oci and the new rules_k8s work with Tilt again

https://docs.tilt.dev/integrating_bazel_with_tilt.html

@realtimetodie
Copy link
Author

@alexeagle

@thesayyn
Copy link
Collaborator

Are there any stronger arguments against the provider other than that it expands the surface? Providers are a common way to mutually exchange data between Bazel rulesets.

Not expanding the API surface is a good enough argument for not introducing a new provider as a "convenience". It's also worth mentioning that the repository attribute is string now but it will become an attr.label once we introduce stamping for it too. If I add this provider now, then I will have to worry about too many things once we do it.

you can already do this and you do not need a new provider to do it. I am not trying to give you a hard time here but I don't think I'll agree to adding this.

@thesayyn
Copy link
Collaborator

thesayyn commented May 18, 2023

Also, if you just want the repository information excluding ":tag" "@digest", you can simply parse the reference printed to stdout and replace that information in the object.yaml with the one with digest.

@thesayyn
Copy link
Collaborator

going to close as not planned and there are feasible ways to do this already.

@thesayyn thesayyn closed this May 18, 2023
@realtimetodie
Copy link
Author

@thesayyn I still don't understand why this was rejected. There is no logic behind the reasoning. Now I have to maintain multiple forks for clients with a patched version of rules_oci. It's so tiresome. Sad.

@alexeagle
Copy link
Collaborator

The logic was pretty clear from what I can tell: we want to have a bare minimum in this repo, so if something is possible already, the preference is not to add anything.

There is a case to be made for convenience features over time, especially as we get more mature so there is less chance of churn. I'll reopen this so we can keep discussing.

@alexeagle alexeagle added need: discussion Needs a proper discussion around the problem. and removed wontfix This will not be worked on labels Jun 4, 2023
@alexeagle alexeagle reopened this Jun 4, 2023
@alexeagle
Copy link
Collaborator

That said, it's better to start with an issue than a PR, to focus discussion on the problem rather than one proposed solution, and come to agreement that the problem even exists.

@thesayyn
Copy link
Collaborator

thesayyn commented Jun 4, 2023

I'll explain why this is not needed and not the right thing to do again;

1- repository information is provided when you run the oci_push as part of another rule; eg run oci_push wrapper and capture the stdout which only contains full reference to the image.

2- repository attribute is a string for now but soon we'll make it stampable too, when that happens it'll become a label and we'll have to worry about how to make that file available and wait for downstream dependents to change their implementation or simply forced to make a breaking change warn users about that. hence my concern about expanding the public API. this is one of the goals of rules_oci here; https://github.com/bazel-contrib/rules_oci#bazel-rules-for-oci-containers

3- you mentioned that you don't want to run oci_push targets, but how do you propose making sure that the repository string that you are depending on is already pushed upstream? original rules_k8s has done it by having a go binary that pushes the image. However, you don't need to use the same approach here; it can just be a simple bash script invoking oci_push target and doing some replacements over the k8s manifest.

all that said; i am willing to contribute to your effort directly, to demonstrate how this can be done without relying on complex go pushers, simply with a bash script using yq.

@thesayyn thesayyn marked this pull request as draft June 22, 2023 17:20
@alexeagle
Copy link
Collaborator

@realtimetodie I'm going to close this as I don't think we'll merge as is - but I don't mean to discourage you at all! We'd love to have a path for rules_k8s users that doesn't rely on rules_docker and happy to have your contributions here.
Let's start with an issue thread about the design problem next time.

@opicaud
Copy link

opicaud commented Aug 4, 2023

Hey all,
Sorry @alexeagle to reopen the thread, I did not want to create an issue specific for my comment..

I am migrating rules_helm to use rules_oci. So far, i have successfully used the oci_push run target as mentioned by @thesayyn in helm_install rule. So far so good :)
But for helm_package.. it's about "building" a archive containing manifests with images descriptors not yet pushed

So how it is possible to make a bazel build //my-awesome-helm-chart that will contains the potentially not yet pushed images descriptors ? Images that might be pushed later via oci_push (or via the rule helm_install as explained before)

Thanks

Olivier

@aw185176
Copy link
Contributor

aw185176 commented Aug 8, 2023

going to close as not planned and there are feasible ways to do this already.

One missing piece is that it should be possible to render that reference without actually performing the push, which implementations on top of oci_push might want to leverage in order to avoid pushing more than necessary, or depending on specific contexts.

@thesayyn
Copy link
Collaborator

thesayyn commented Aug 8, 2023

reference without actually performing the push

Nothing will be sent over the wire if the image is already pushed. That said, you still need to make sure the image is available on the remote in order to produce a correct manifest.

@opicaud
Copy link

opicaud commented Aug 10, 2023

That said, you still need to make sure the image is available on the remote in order to produce a correct manifest.

No I don't need : let me take an example

I can deploy a new manifest with a not yet available image. It will ends up with an imagePullBackOff and that's OK : because during this time the old pod can still running, in order to not stop the business behind. Then when the new image is available, the new pod will start smoothly, and we can stop the old one.

So, it's perfectly fine to build a manifest, without having having the image pushed yet.

@thesayyn
Copy link
Collaborator

Could you please explain how ‘imagePullBackOff’ is okay?

@opicaud
Copy link

opicaud commented Aug 10, 2023

Because what does matter is that at least one service is running at any time. In my exemple, I can deploy v1.1 manifest with an image not yet available, as soon as the v1.0 is still running, it does not impact my customer, so it's okay to deal with this little delay.

@thesayyn
Copy link
Collaborator

thesayyn commented Aug 10, 2023

Now you see, we are talking about an edge case rather than a common case. If you don't want the images to be pushed in order to get a reproducible reference with a digest then you can always pass the repository information down to the other rule directly instead of making oci_image relay/reexport this information.

A few ways to do this since you are not concerned about the existence of the image;

1- use a starlark variable

REPOSITORY="myrepo"
oci_push(repository = REPOSITORY)
whatever_rule(repository=REPOSITORY)

2- use a file and pass the label to both rules.

write_file(name = "repository", content  = "myrepo")
oci_push(repository = ":repository")
whatever_rule(repository=":repository")

3- add a repository-aware push rule to the downstream ruleset and replace oci_push with that.

# exports ImageRepositoryInfo alongside DefaultInfo
 oci_noop_push()

I'll reiterate what I said earlier and some more.

1- our approach with rules_oci is to keep it stupid simple and I can't do that if I am adding "convenience" features all over the place. my goal is to avoid that.
2- I believe I have given enough examples that cover both common and edge cases but I'll I hear people insisting that we add another way to get a piece of information that is already accessible in multiple ways.
3- "rules_oci" is a barebones ruleset meaning you can always extend on top of it. you can always a wrapper rule in whatever ruleset to add this provider.

As far as rules_oci is concerned, this is an implementation detail of a downstream ruleset, and should not be carried responsibility for rules_oci.

NOTE: my main priority is to not add something new here. I'll lock this conversation for good once i hear opposition arguments once more.

@opicaud
Copy link

opicaud commented Aug 11, 2023

I don't want to convince you to accept this pull request, i would not accept it like you do.. because as you explained, it's a behaviour we can implement by extending the rule, and I have migrated helm_install from the rules_helm project thanks to your advice already.

My question was not about if it is possible or not, but more when in the bazel process we can achieve this : basically to get the digest during the build phase in order to keep the same behaviour as it is currently in rules_helm with the helm_chart rule.

I see is that the oci_image rule is building a index.json containing the digest to grab, so i will invest in this direction.

Thanks for your support, you challenged and that helped me.

@thesayyn
Copy link
Collaborator

Great. I am on Bazel Slack if you'd like to chat about it more. For now, I am going to lock this discussion to prevent further conversation here. Please file an issue if you'd still like to discuss further.

@thesayyn thesayyn removed the need: discussion Needs a proper discussion around the problem. label Aug 11, 2023
@bazel-contrib bazel-contrib locked as resolved and limited conversation to collaborators Aug 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants