feat(functions): support environment if present #4632
Conversation
not sure whether we want to also apply patches at
The other option we could have considered was injecting a fake EnvironmentConfig into the observed state, but I think such an implementation would have the following issues:
Thanks @phisco! I agree this is probably good enough for now.
internal/controller/apiextensions/composite/composition_functions.go
@@ -90,6 +90,9 @@ message State {

  // The state of any composed resources.
  map<string, Resource> resources = 2;

  // The environment passed to the Function pipeline.
Should we add some kind of THIS IS AN ALPHA FIELD warning, like we do on our CRDs? Something that communicates that this field is more at risk of changing than the rest of the API.
As to how it might change: if we did make environment an array or map of resources, I could imagine reflecting that here. So, something more like:
map<string, google.protobuf.Struct> environment = 3;
Yes, you are right. I will update it 👍
Added the alpha warning as suggested 👍
About the type of the environment field, wouldn't a google.protobuf.Struct already accommodate such a change? Currently the environment is fully unstructured:
// Environment defines unstructured data.
type Environment struct {
unstructured.Unstructured
}
But if we decided to have two separate sections, "data" and "resources", as I'm suggesting, we would just have to populate the Environment differently; we shouldn't need to change the code here.
I agree we wouldn't have to, in that a google.protobuf.Struct can model pretty much anything. It does however seem to me that if we know the environment will always have a certain shape (i.e. data and an array of resources), it would be a better experience for folks building functions if we reflected that shape in our 'schema' as much as possible.
@phisco, elsewhere observed and desired work roughly like:
Would the "desired environment" work the same way? i.e. Functions just set the keys they want to set? What would we do with the desired environment at the end of the pipeline run? Discard it?
Yes, for consistency I would expect it to work exactly like the observed/desired states. But then a downstream function would have to "merge" the observed and desired environments to access any info. The only other option would be to have it at the same level as the states in both the request and response, with its own semantics, more similar to a context than to observed/desired state. Currently the state is discarded at the end of the reconciliation loop for P&T, so yes, I'd say for the moment we should stick to that.
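The "merge" a downstream function would have to perform could look roughly like this minimal Go sketch. Plain maps stand in for google.protobuf.Struct, and mergeEnv is an illustrative helper, not Crossplane code:

```go
package main

import "fmt"

// mergeEnv overlays the desired environment on top of the observed one,
// the way a downstream function would have to combine the two to read
// any value. Desired keys win on conflict.
func mergeEnv(observed, desired map[string]interface{}) map[string]interface{} {
	out := make(map[string]interface{}, len(observed)+len(desired))
	for k, v := range observed {
		out[k] = v
	}
	for k, v := range desired {
		out[k] = v // a later function's write shadows the observed value
	}
	return out
}

func main() {
	observed := map[string]interface{}{"region": "eu-west-1", "tier": "dev"}
	desired := map[string]interface{}{"tier": "prod"}
	fmt.Println(mergeEnv(observed, desired)["tier"]) // prints "prod"
}
```

This is a shallow merge; whether nested structs should merge deeply is exactly the kind of semantic question the thread is debating.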
I'm maybe leaning toward this option, but only very slightly. Mostly because it's a bit odd that it's not really desired environment state, but kinda something else. 🤔
Yeah, I agree it feels odd. I will update the PR tomorrow with a dedicated structure for both request and response, using a map as you suggested. I'd arbitrarily put it under a "data" top-level key already, just to accommodate future changes such as the ones we are discussing ("resources"), wdyt?
@negz, pushed some changes, let me know wdyt 🙏
if req.Environment != nil {
	d, err := AsStruct(req.Environment)
	if err != nil {
		return CompositionResult{}, errors.Wrap(err, errEnvAsStruct)
	}
	env = &v1beta1.Environment{
		Data: d,
	}
}
@negz how would you feel if we did not pass the computed environment, and therefore effectively ignored the whole spec.environment stanza for Function-based compositions, but still added the in-memory environment here, and worked to implement the current EnvironmentConfig retrieval and merging logic as a function, similarly to what the patch-and-transform function is doing?
I feel pretty genuinely ambivalent about that, haha.
I do really like the idea of moving the "bootstrapping" of the environment into a Function. This gives us more room to experiment there, and allows folks to opt out of the complexity if they don't need it.
What I'm not so sure about is the general pattern of Functions reading stuff from the API server and passing it to the Function pipeline. (I presume a Function would select and read EnvConfigs.) Is that something we should allow? Or do we want Crossplane to be the sole broker of everything read from and written to the API server? I'm not concretely sure of what risk Functions reading/writing to the API server poses, more of an intuition at this point.
If we did allow this we'd probably need to update the spec: https://github.com/crossplane/crossplane/blob/5797021/contributing/specifications/functions.md#runtime-environment
Yes, adding side effects could make everything more complex, e.g. how do we handle that from xrender? However, I feel it's going to be hard to prevent: people can always add a service account and use it to do whatever they want, so I guess we should just embrace it and make it a thing we take into account. For example, we could add an annotation to let xrender know a function should be skipped, or add conditionals as we were discussing in some other thread, so that one can render compositions by skipping the retrieval steps and passing stuff into the environment instead of letting it be retrieved from the outside world.
Yes, adding side effects could make everything more complex, e.g. how do we handle that from xrender? However, I feel it's going to be hard to prevent: people can always add a service account and use it to do whatever they want, so I guess we should just embrace it and make it a thing we take into account.
I am pretty skeptical of the idea of natively supporting direct access to the Kube API from a function. This is mostly because it would open a door that invalidates most of the opinionated framework we're trying to provide with RunFunctionRequest/RunFunctionResponse. If a function could access the Kube API natively, why should it bother getting resources from observations or specifying the desired state, instead of just getting/patching the resources directly? I imagine there would be a tendency to open that door at the first limitation in the framework somebody hits, and never get back on track even if we resolve that limitation afterwards. There is also a whole new complexity that would be introduced due to RBAC. How, and by whom, would the RBAC for such a service account be defined? If it is users, then this still requires manual work, and passing a service account wouldn't solve the problem. I don't want to think about Crossplane also taking care of RBAC for such a case 😅
However, considering the Kube API is just yet another API and we are planning to enable passing credentials to functions, power users could still communicate with the Kube API if/once they really have to.
This feels quite similar to what one can do with provider-kubernetes and provider-helm. They can be used to create resources (other than the provider's own CRDs and a limited set of other types) on the same Kubernetes cluster if a provider config is provided accordingly. However, we didn't enable this as native functionality for all Crossplane Providers, and it has worked fine so far.
Let me invalidate a bit what I wrote above 😄
There is also a whole new complexity that would be introduced due to RBAC. How, and by whom, would the RBAC for such a service account be defined? If it is users, then this still requires manual work, and passing a service account wouldn't solve the problem. I don't want to think about Crossplane also taking care of RBAC for such a case 😅
I just remembered there is a functionality in the Provider meta definition that no provider currently uses (to the best of my knowledge): PermissionRequests. I am not sure if it got broken at some point, but if we reflected that functionality in the Function meta package, an environment-merger function would be able to declare a list of RBAC policy rules in its package definition to access EnvironmentConfig resources. And this doesn't feel that bad, as it would be defined at the package level and have a very limited scope.
Yes, having it already implemented makes it feel less bad, doesn't it? 😂
Not really 🙂
For me, the key here is RBAC and how/where it is defined.
Initially, I was thinking of it as if it would enable/promote doing anything on the Kube API, with functions evolving into k8s operators.
However, a permission model where only an explicit/limited set of resources defined by the function author is accessible makes it more controlled, as you are stating, and not harmful to our opinionated framework.
I'm feeling aligned with what @turkenh wrote in #4632 (comment).
If we embrace letting Functions write to and from the API server we break the nascent contract we have where Crossplane and controllers are responsible for all API server interactions. Per #4723 I think this contract is an invaluable property of Functions.
As a concrete example, a Function that read EnvConfigs from the API server would not work with xrender. Or rather, you'd need to somehow plumb it up to an API server to give it access to the types it needs.
I don't think we have to submit to inevitability and try to support these Functions just because it's hard to stop folks from writing a Function that talks to the API server if they want to. For example, just saying "your Function isn't compliant with the spec and thus may not work in various contexts" would probably be enough.
If I understand this correctly, it means that while Functions can be used to modify the current environment/context, and could populate the environment from non-K8s API sources, they should not populate it from K8s API resources; that functionality needs to stay where it is (for now) in the Composition, and the discussion in #4583 should stay out of the realm of Functions.
As a concrete example, a Function that read EnvConfigs from the API server would not work with xrender. Or rather, you'd need to somehow plumb it up to an API server to allow it to get access to the types it needs.
Same issue would apply to object retrieval defined at composition level, even more with what we are discussing in #4583.
@phisco I don't think it would be quite the same. xrender would need to be aware that a Composition could reference EnvConfigs, and have a flag like --env-configs or something to pass in the EnvConfigs, similar to the --observed-resources flag.
That all said, maybe there's a middle ground here where a Function could ask Crossplane to provide the pipeline with "extra" observed resources. I wrote up a sketch of how that might work in #4739.
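That middle ground could work roughly like the following Go sketch, where a Function declares which resources it needs and Crossplane, not the Function, does the fetching and re-invokes it. All type and field names here are invented for illustration and are not the actual #4739 proposal:

```go
package main

import "fmt"

// ResourceRequest names a resource the Function wants Crossplane to fetch.
type ResourceRequest struct {
	APIVersion, Kind, Name string
}

// Response is what the Function returns: either outstanding requirements,
// or a result computed from the extra resources it was given.
type Response struct {
	Requirements []ResourceRequest      // what the Function still needs
	Extra        map[string]interface{} // what Crossplane already fetched
}

// run is a toy Function: on the first call it has no extra resources, so
// it asks for an EnvironmentConfig; once Crossplane supplies it, it proceeds.
func run(extra map[string]interface{}) Response {
	if _, ok := extra["env-config"]; !ok {
		return Response{Requirements: []ResourceRequest{{
			APIVersion: "apiextensions.crossplane.io/v1alpha1",
			Kind:       "EnvironmentConfig",
			Name:       "example",
		}}}
	}
	return Response{Extra: extra}
}

func main() {
	r := run(nil) // first pass: the Function states its requirement
	fmt.Println(len(r.Requirements))
	r = run(map[string]interface{}{"env-config": map[string]interface{}{}})
	fmt.Println(len(r.Requirements))
}
```

The point of the pattern is that the API server access stays in Crossplane, so tools like xrender can satisfy the same requirements from files instead.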
// The environment to be passed down the Function pipeline.
message Environment {
  // The data section of the environment.
  google.protobuf.Struct data = 1;
Will Functions use keys in this data struct to pass information to each other?
My understanding from #4731 (comment) is "no". If that's correct, I think it would be useful to bring that use case into scope for this PR. Would we add a new field to the Environment message for that?
I could imagine foregoing the Environment message and just having something like:

message RunFunctionRequest {
  // Fields omitted for brevity
  google.protobuf.Struct environment = 5;
}

You could then put the merged EnvironmentConfigs under a well-known field within the environment struct, like apiextensions.crossplane.io/config, and let Functions use other fields to pass information without worrying about collisions with what was loaded from the EnvironmentConfigs.
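The well-known-field idea can be sketched like this in Go. Plain maps stand in for google.protobuf.Struct, and the key name simply follows the suggestion above; it is not a real Crossplane constant:

```go
package main

import "fmt"

// envConfigKey is the reserved key under which Crossplane would load the
// merged EnvironmentConfigs; Functions may claim any other key.
const envConfigKey = "apiextensions.crossplane.io/config"

// configOf extracts what Crossplane loaded from EnvironmentConfigs,
// ignoring keys written by Functions.
func configOf(env map[string]interface{}) map[string]interface{} {
	cfg, _ := env[envConfigKey].(map[string]interface{})
	return cfg
}

func main() {
	env := map[string]interface{}{
		// Written by Crossplane from the merged EnvironmentConfigs.
		envConfigKey: map[string]interface{}{"region": "eu-west-1"},
		// Written by a Function earlier in the pipeline; no collision
		// risk with what Crossplane loaded.
		"example.org/my-function": map[string]interface{}{"ready": true},
	}
	fmt.Println(configOf(env)["region"]) // prints "eu-west-1"
}
```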
Will Functions use keys in this data struct to pass information to each other?
What do you mean? "data" is what we get today by merging EnvironmentConfigs and patching to and from the environment; if we were to add other fields to the environment, these would be at the same level as "data". environment.data is exactly what you were putting at environment[apiextensions.crossplane.io/environment]. I would let people use FromEnvironment patches by adding the "data" prefix to fields, so that if we ever added something in the future it could still be reachable by FromEnvironmentPatches in the patch-and-transform function.
I would let people use FromEnvironment patches adding the "data" prefix to fields
We're on the same page here pattern-wise, i.e. nesting what we would call "the environment" today under a key in... the environment. I think we need more descriptive, specific names than "environment" and "data", though. If my EnvironmentConfigs and defaults become the "data" key in the environment, what else would there be? Would the other "keys" be something other than data?
Perhaps if we really want to stick with "environment" as the top-level name, environment.data could become environment.config? Or environment[apiextensions.crossplane.io/config]?
If we ever added something in the future it could still be reachable by FromEnvironmentPatches in the patch and transform function.
I feel we should have a story for Functions to write to the "environment" for v1.14. If we merge this PR as-is, it seems like their only option will be to write to environment.data, which means they could potentially collide with the object that Crossplane loads into that struct from EnvConfigs.
I lean toward the top level of the "environment" being a struct or map, because then a Function can "claim" an arbitrary field/key within that top-level struct/map without needing an update to the protobuf schema to add more fields.
I still don't feel great about having a well-known key we use "officially" to inject the current environment into an unstructured object.
I'd be more comfortable saying that the environment (or context) is just an unstructured object and letting functions define their own conventions, e.g. an EnvironmentConfig-aggregating function defines the key it will put the aggregated result in, which would be the same key expected by the patch-and-transform function, ignoring the whole spec.environment section in Pipeline mode. This was what I was hinting at here: #4632 (comment).
My point is: if stuff is official it should be explicitly represented in the schema; otherwise let's fully go the other way and let functions define their own conventions, just providing them a way to pass data through the pipeline outside the desired state.
I think we need to find a more descriptive name than "data", though.
If we do go with a map / arbitrary keys, I think it makes sense to reserve and use certain key namespaces for Crossplane (core, not Functions) to write to. Similar to how kubernetes.io/ and k8s.io/ annotation key prefixes are reserved for use by Kubernetes components, and GITHUB_ env vars are reserved for use by GitHub in GitHub Workflows.
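A reserved-keyspace check, in the spirit of the kubernetes.io/ annotation convention, could be as simple as the sketch below. The prefix list is hypothetical, not a real Crossplane convention:

```go
package main

import (
	"fmt"
	"strings"
)

// reservedPrefixes lists key namespaces claimed by Crossplane core.
// Functions would be free to use any key outside these prefixes.
var reservedPrefixes = []string{
	"crossplane.io/",
	"apiextensions.crossplane.io/",
}

// reserved reports whether a Function may not write to this key.
func reserved(key string) bool {
	for _, p := range reservedPrefixes {
		if strings.HasPrefix(key, p) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(reserved("apiextensions.crossplane.io/config")) // prints "true"
	fmt.Println(reserved("example.org/my-function"))            // prints "false"
}
```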
I still don't feel great about having a well-known key we use "officially" to inject the current environment into an unstructured object.
Sorry, GitHub didn't show me this comment until I refreshed the page.
I think I get your reservations: explicit schemas are usually best. That said, your analogy of the environment being like environment variables, and looking at how other systems use environment variables, is making me feel better about the pattern.
If we say that Functions can effectively communicate by reading from and writing to shared "environment variables", i.e. top-level keys in the environment, is it so bad if Crossplane "bootstraps" some of that environment, like GitHub workflows does with their env vars?
Yeah, I don't feel super strongly against that, and I can see your point, so I would be fine with annotation-style reserved keyspaces. I'll revert the whole environment to be just a struct tomorrow. Now the only point is whether we want to wire up the current EnvironmentConfig retrieval from configurations, or just accept that people are going to access the API server (or any other external system) from functions and deal with it, which is the topic of the other thread above, so let's continue there if you agree.
done: 2b28d9a
Called it context, as you were doing in #4731, as it didn't make much sense to put the current environment in the environment, but under a key .../environment...
Signed-off-by: Philippe Scorsolini <p.scorsolini@gmail.com>
…known fields Signed-off-by: Philippe Scorsolini <p.scorsolini@gmail.com>
I think this can be merged as-is, because if we ever decided to embrace functions accessing the API server, as we are discussing in the thread above, we would just have to remove the environment-fetching wiring here and move the logic to a dedicated function, without any major disruption for users.
@phisco Thanks, I agree we can merge this. I do like your idea of letting
This is the Go SDK end of crossplane/crossplane#4632 Signed-off-by: Nic Cope <nicc@rk0n.org>
Description of your changes
Fixes #4600
Even with the whole conversation going on about environment and EnvironmentConfigs, I think this simple change will allow us to accommodate whatever we decide to do with that.
Adding Environment to both the observed and desired state allows us to first pass whatever environment a Composition built before calling the function pipeline, but also to use it as a variable space where functions can write stuff to be used by subsequent functions.
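The "variable space" idea can be sketched as a toy pipeline where each function may read keys written by earlier ones. Function signatures here are invented for illustration; the real API uses RunFunctionRequest/RunFunctionResponse:

```go
package main

import "fmt"

// Function is a toy stand-in for a pipeline step: it receives the
// environment and returns an updated copy.
type Function func(env map[string]interface{}) map[string]interface{}

// runPipeline threads the environment through each function in order,
// so later functions see what earlier ones wrote.
func runPipeline(env map[string]interface{}, fns ...Function) map[string]interface{} {
	for _, fn := range fns {
		env = fn(env)
	}
	return env
}

func main() {
	first := func(env map[string]interface{}) map[string]interface{} {
		env["vpc-id"] = "vpc-123" // value a later function will consume
		return env
	}
	second := func(env map[string]interface{}) map[string]interface{} {
		env["subnet"] = fmt.Sprintf("%v/subnet-a", env["vpc-id"])
		return env
	}
	env := runPipeline(map[string]interface{}{}, first, second)
	fmt.Println(env["subnet"]) // prints "vpc-123/subnet-a"
}
```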
I have:
- Run make reviewable to ensure this PR is ready for review.
- Added backport release-x.y labels to auto-backport this PR, if necessary.
- Opened a PR updating the docs, if necessary.