
feat(functions): support environment if present #4632

Merged
2 commits merged into crossplane:master on Oct 6, 2023

Conversation

@phisco (Contributor) commented Sep 15, 2023

Description of your changes

Fixes #4600

Even with the whole conversation going on about environment and EnvironmentConfigs, I think this simple change will allow us to accommodate whatever we decide to do with that.

Adding Environment to both the observed and desired state allows us first to pass whatever environment a Composition built before calling the function pipeline, and also to use it as a variable space where functions can write data to be used by subsequent functions.

I have:

  • Read and followed Crossplane's contribution process.
  • Added or updated unit and E2E tests for my change.
  • Run make reviewable to ensure this PR is ready for review.
  • Added backport release-x.y labels to auto-backport this PR, if necessary.
  • Opened a PR updating the docs, if necessary.

@phisco phisco requested a review from negz September 15, 2023 12:17
@phisco phisco requested review from turkenh and a team as code owners September 15, 2023 12:17
@phisco phisco marked this pull request as draft September 15, 2023 12:20
@phisco (Contributor Author) commented Sep 15, 2023

Not sure whether we also want to apply the patches defined at spec.environment.patches here. I guess that would be a P&T leakage into function-based compositions; the same thing can be done by a function using the provided environment and composite resource.

@phisco (Contributor Author) commented Sep 15, 2023

The other option we could have considered was injecting a fake EnvironmentConfig into the observed state, but I think such an implementation would have the following issues:

  • we would bind functions to EnvironmentConfigs, while I think there is value in the separation between EnvironmentConfig and the in-memory environment.
  • we would miss out on the (in my opinion useful) "variable-like" capabilities I mentioned above, as we don't allow functions to write to the observed state and we discard any changes across different function calls.

@negz (Member) left a comment

Thanks @phisco! I agree this is probably good enough for now.

@@ -90,6 +90,9 @@ message State {

// The state of any composed resources.
map<string, Resource> resources = 2;

// The environment passed to the Function pipeline.
Member

Should we add some kind of THIS IS AN ALPHA FIELD warning like we do on our CRDs? Something that communicates that this field is more at-risk of changing than the rest of the API.

Member

As to how it might change - if we did make environment an array or map of resources I could imagine reflecting that here. So something more like:

map<string, google.protobuf.Struct> environment = 3;

Contributor Author

Yes, you are right. I will update it 👍

Contributor Author

Added the alpha warning as suggested 👍

About the type of the environment field, wouldn't a google.protobuf.Struct already accommodate such a change? Currently the environment is fully unstructured:

// Environment defines unstructured data.
type Environment struct {
	unstructured.Unstructured
}

But if we decided to have two separate sections, "data" and "resources", as I'm suggesting, we would just have to populate the Environment in a different way; we shouldn't need to change the code here.
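
To illustrate (a minimal sketch, assuming structpb from google.golang.org/protobuf; the "data"/"resources" split here is hypothetical, not an existing API): a single google.protobuf.Struct can already carry both sections without any schema change.

package main

import (
	"fmt"

	"google.golang.org/protobuf/types/known/structpb"
)

func main() {
	// Hypothetical environment carrying both a "data" section and a
	// "resources" section in a single unstructured Struct.
	env, err := structpb.NewStruct(map[string]any{
		"data": map[string]any{"region": "eu-west-1"},
		"resources": []any{
			map[string]any{"kind": "EnvironmentConfig", "name": "example"},
		},
	})
	if err != nil {
		panic(err)
	}

	// Reading a value back out of the "data" section.
	fmt.Println(env.GetFields()["data"].GetStructValue().GetFields()["region"].GetStringValue())
}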

Member

I agree we wouldn't have to, in that a google.protobuf.Struct can pretty much model anything. It does however seem to me that if we know the environment will always have a certain shape (i.e. data and an array of resources), it would be a better experience for folks building functions if we reflected that shape in our 'schema' as much as possible.

@negz (Member) commented Sep 18, 2023

Adding Environment to both the observed and desired state allows us first to pass whatever environment a Composition built before calling the function pipeline, and also to use it as a variable space where functions can write data to be used by subsequent functions.

@phisco, elsewhere observed and desired work roughly like:

  • Observed is read-only and identical for every Function in the pipeline. Crossplane populates it as fully as possible.
  • Desired is read-write. It starts out completely empty, and the pipeline produces it as essentially an 'overlay' on observed state to be applied.

Would the "desired environment" work the same way? i.e. Functions just set the keys they want to set? What would we do with the desired environment at the end of the pipeline run? Discard it?

@phisco (Contributor Author) commented Sep 18, 2023

Would the "desired environment" work the same way? i.e. Functions just set the keys they want to set? What would we do with the desired environment at the end of the pipeline run? Discard it?

Yes, for consistency I would expect it to work exactly like the observed/desired states. But then a downstream function would have to "merge" the observed and desired environments to access any information. The only other option would be to have it at the same level as the states in both the request and response, with its own semantics, more similar to a context than to observed/desired state.
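
For example, that merge could look roughly like this (a minimal sketch, assuming both environments are exposed as google.protobuf.Struct values; mergeEnvironments is an illustrative helper, not actual API):

package env

import "google.golang.org/protobuf/types/known/structpb"

// mergeEnvironments overlays the desired environment on top of the observed
// one; desired keys win on conflict (a shallow merge, for brevity).
func mergeEnvironments(observed, desired *structpb.Struct) *structpb.Struct {
	merged := &structpb.Struct{Fields: map[string]*structpb.Value{}}
	for k, v := range observed.GetFields() {
		merged.Fields[k] = v
	}
	for k, v := range desired.GetFields() {
		merged.Fields[k] = v
	}
	return merged
}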

Currently the state is discarded at the end of the reconciliation loop for P&T, so yes, I'd say for the moment we should stick to that.

@negz (Member) commented Sep 18, 2023

The only other option would be to have it at the same level as the states in both the request and response, with its own semantic, more similar to a context, rather than the observed/desired state.

I'm maybe leaning toward this option, but only very slightly. Mostly because it's a bit odd that it's not really desired environment state, but kinda something else. 🤔

@phisco (Contributor Author) commented Sep 18, 2023

Yeah, I agree it feels odd. I will update the PR tomorrow with a dedicated structure for both request and response, using a map as you suggested. I'd arbitrarily already put it under a "data" top level key, just to accommodate future changes such as the ones we are discussing ("resources"), wdyt?

@phisco (Contributor Author) commented Sep 18, 2023

@negz, pushed some changes, let me know wdyt 🙏

Comment on lines 220 to 242
if req.Environment != nil {
d, err := AsStruct(req.Environment)
if err != nil {
return CompositionResult{}, errors.Wrap(err, errEnvAsStruct)
}
env = &v1beta1.Environment{
Data: d,
}
}
Contributor Author

@negz how would you feel if we did not pass the computed environment, and therefore effectively ignored the whole spec.environment stanza for Function-based compositions, but still added the in-memory environment here and worked to implement the current EnvironmentConfig retrieval and merging logic as a function, similarly to what the patch-and-transform function is doing?

Member

I feel pretty genuinely ambivalent about that, haha.

I do really like the idea of moving the "bootstrapping" of the environment into a Function. This gives us more room to experiment there, and allows folks to opt out of the complexity if they don't need it.

What I'm not so sure about is the general pattern of Functions reading things from the API server and passing them to the Function pipeline. (I presume a Function would select and read EnvConfigs.) Is that something we should allow? Or do we want Crossplane to be the sole broker of everything read from and written to the API server? I'm not concretely sure what risk Functions reading from/writing to the API server poses; it's more of an intuition at this point.

If we did allow this we'd probably need to update the spec: https://github.com/crossplane/crossplane/blob/5797021/contributing/specifications/functions.md#runtime-environment

Contributor Author

Yes, adding side effects could make everything more complex, e.g. how would we handle that from xrender? However, I feel it's going to be hard to prevent: people can always add a service account and use it to do whatever they want, so I guess we should just embrace it and make it something we take into account. For example, we could add an annotation to let xrender know a function should be skipped, or add conditionals as we were discussing in some other thread, so that one can render compositions skipping the retrieval steps and passing data into the environment instead of letting it be retrieved from the outside world.

@turkenh (Member) commented Oct 5, 2023

Yes, adding side effects could make everything more complex, e.g. how would we handle that from xrender? However, I feel it's going to be hard to prevent: people can always add a service account and use it to do whatever they want, so I guess we should just embrace it and make it something we take into account.

I am pretty skeptical of the idea of natively supporting direct access to the Kube API from a function. This is mostly because it would open a door that invalidates most of the opinionated framework we're trying to provide with RunFunctionRequest/RunFunctionResponse. If a function could access the Kube API natively, why should it bother getting resources from the observed state or specifying the desired state, instead of just getting/patching the resources directly? I imagine there would be a tendency to open that door at the first limitation in the framework somebody hits, and never get back on track even if we resolve that limitation afterwards. There is also a whole new complexity that would be introduced by RBAC: how, and by whom, would the RBAC for such a service account be defined? If it is by users, then this still requires manual work, and passing a service account wouldn't solve the problem. I don't want to think about Crossplane also taking care of RBAC for such a case 😅

However, considering the Kube API is just yet another API and we are planning to enable passing credentials to functions, power users could still communicate with the Kube API if/once they really have to.

This feels quite similar to what one can do with provider-kubernetes and provider-helm. They can be used to create resources (other than the provider's own CRDs and a limited set of other types) on the same Kubernetes cluster, if a provider config is set up accordingly. However, we didn't enable this as native functionality for all Crossplane Providers, and it has worked fine so far.

@turkenh (Member) commented Oct 5, 2023

Let me invalidate a bit what I wrote above 😄

There is also a whole new complexity that would be introduced by RBAC: how, and by whom, would the RBAC for such a service account be defined? If it is by users, then this still requires manual work, and passing a service account wouldn't solve the problem. I don't want to think about Crossplane also taking care of RBAC for such a case 😅

I just remembered there is functionality in the Provider meta definition that no provider currently uses (to the best of my knowledge): PermissionRequests. I am not sure if it got broken at some point, but if we reflected that functionality into the Function meta package, an environment-merger function would be able to declare a list of RBAC policy rules in its package definition to access EnvironmentConfig resources. And this doesn't feel that bad, as it would be defined at the package level and have a very limited scope.
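
For illustration (a hypothetical sketch; PermissionRequests are not reflected into Function packages today, so the names here are just assumptions), such a request could be as narrow as a single read-only RBAC policy rule for EnvironmentConfigs:

package main

import (
	"fmt"

	rbacv1 "k8s.io/api/rbac/v1"
)

func main() {
	// A narrowly scoped, read-only permission an environment-merger function
	// package might request: access to EnvironmentConfig resources only.
	rule := rbacv1.PolicyRule{
		APIGroups: []string{"apiextensions.crossplane.io"},
		Resources: []string{"environmentconfigs"},
		Verbs:     []string{"get", "list", "watch"},
	}
	fmt.Printf("%+v\n", rule)
}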

Member

Yes, having it already implemented makes it feel less bad, doesn't it? 😂

Not really 🙂
For me, the key here is RBAC and how/where it is defined.
Initially, I was thinking of it as something that would enable/promote doing anything on the Kube API, with functions evolving into k8s operators.
However, a permission model where only an explicit/limited set of resources defined by the function author is accessible makes it more controlled, as you are stating, and not harmful to our opinionated framework.

@negz (Member) commented Oct 5, 2023

I'm feeling aligned with what @turkenh wrote in #4632 (comment).

If we embrace letting Functions read from and write to the API server, we break the nascent contract we have where Crossplane and controllers are responsible for all API server interactions. Per #4723 I think this contract is an invaluable property of Functions.

As a concrete example, a Function that read EnvConfigs from the API server would not work with xrender. Or rather, you'd need to somehow plumb it up to an API server to allow it to get access to the types it needs.

I don't think we have to submit to inevitability and try to support these Functions just because it's hard to stop folks from writing a Function that talks to the API server if they want to. For example, probably just saying "your Function isn't compliant with the spec and thus may not work in various contexts" might be enough.

Contributor

If I understand this correctly, it means that while Functions can be used to modify the current environment/context, and could populate the environment from non-K8s API sources, they should not populate it from K8s API resources; that functionality needs to stay where it is (for now), in the Composition, and the discussion in #4583 should stay out of the realm of Functions.

Contributor Author

@negz

As a concrete example, a Function that read EnvConfigs from the API server would not work with xrender. Or rather, you'd need to somehow plumb it up to an API server to allow it to get access to the types it needs.

The same issue would apply to object retrieval defined at the Composition level, even more so with what we are discussing in #4583.

Member

@phisco I don't think it would be quite the same. xrender would need to be aware that a Composition could reference EnvConfigs and have a flag like --env-configs or something to pass in the EnvConfigs, similar to the --observed-resources flag.

That all said, maybe there's a middle ground here where a Function could ask Crossplane to provide the pipeline with "extra" observed resources. I wrote up a sketch of how that might work in #4739.

apis/apiextensions/fn/proto/v1beta1/run_function.proto (outdated; resolved)

// The environment to be passed down the Function pipeline.
message Environment {
// The data section of the environment.
google.protobuf.Struct data = 1;
Member

Will Functions use keys in this data struct to pass information to each other?

My understanding from #4731 (comment) is "no". If that's correct, I think it would be useful to bring that use case into scope for this PR. Would we add a new field to the Environment message for that?

@negz (Member) commented Oct 5, 2023

I could imagine foregoing the Environment message and just having something like:

message RunFunctionRequest {

  // Fields omitted for brevity

  google.protobuf.Struct environment = 5;
}

You could then put the merged EnvironmentConfigs under a well-known field within the environment struct like apiextensions.crossplane.io/config and let Functions use other fields to pass information without worrying about collisions with what was loaded from the EnvironmentConfigs.
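
A rough sketch of what that could look like from a function's point of view (assuming the environment is a plain google.protobuf.Struct; apiextensions.crossplane.io/config is the well-known key suggested above, and example-fn.crossplane.io/output is an arbitrary key a hypothetical function claims):

package main

import (
	"fmt"

	"google.golang.org/protobuf/types/known/structpb"
)

func main() {
	// Environment as bootstrapped by Crossplane: merged EnvironmentConfigs live
	// under a well-known key; other keys are free for functions to use.
	env, err := structpb.NewStruct(map[string]any{
		"apiextensions.crossplane.io/config": map[string]any{"region": "eu-west-1"},
	})
	if err != nil {
		panic(err)
	}

	// A function reads the merged config...
	cfg := env.GetFields()["apiextensions.crossplane.io/config"].GetStructValue()
	fmt.Println(cfg.GetFields()["region"].GetStringValue())

	// ...and claims its own key to pass data to later functions in the pipeline.
	env.Fields["example-fn.crossplane.io/output"] = structpb.NewStringValue("computed-by-previous-function")
}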

Contributor Author

Will Functions use keys in this data struct to pass information to each other?

What do you mean? "data" is what we get today by merging EnvironmentConfigs and patching to and from the environment; if we were to add other fields to the environment, they would be on the same level as "data".

environment.data is exactly what you were putting at environment[apiextensions.crossplane.io/environment]. I would let people use FromEnvironment patches adding the "data" prefix to fields, so that if we ever added something in the future it could still be reachable by FromEnvironmentPatches in the patch and transform function.

Member

I would let people use FromEnvironment patches adding the "data" prefix to fields

We're on the same page here pattern-wise, i.e. nesting what we would call "the environment" today under a key in... the environment. I think we need more descriptive, specific names than "environment" and "data" though. If my EnvironmentConfigs and defaults become the "data" key in the environment, what else would there be? Would the other "keys" be something other than data?

Perhaps if we really want to stick with "environment" as the top-level name environment.data could become environment.config? or environment[apiextensions.crossplane.io/config]?

If we ever added something in the future it could still be reachable by FromEnvironmentPatches in the patch and transform function.

I feel that for v1.14 we should have a story for Functions to write to the "environment". If we merge this PR as-is, it seems like their only option will be to write to environment.data, which means they could potentially collide with the object that Crossplane loads into that struct from EnvConfigs.

I lean toward the top level of the "environment" being a struct or map because then a Function can "claim" an arbitrary field/key within that top-level struct/map without needing an update to the protobuf schema to add more fields.

Contributor Author

I still don't feel great about having a well-known key we use "officially" to inject the current environment into an unstructured object.

I'd be more comfortable saying that the environment (or context) is just an unstructured object and letting functions define their own conventions, e.g. an EnvironmentConfig-aggregating function defines the key it will put the aggregated result under, which will be the same key expected by the patch-and-transform function, ignoring the whole spec.environment section in Pipeline mode. This is what I was hinting at here: #4632 (comment).

My point is: if something is official, it should be explicitly represented in the schema; otherwise, let's fully go the other way and let functions define their own conventions, just providing them a way to pass data through the pipeline outside the desired state.

Member

I think we need to find a more descriptive name than "data", though.

If we do go with a map / arbitrary keys, I think it makes sense to reserve and use certain key namespaces for Crossplane (core, not Functions) to write to. Similar to how kubernetes.io/ and k8s.io/ annotation key prefixes are reserved for use by Kubernetes components, and GITHUB_ env vars are reserved for use by GitHub in GitHub Workflows.

Member

I still don't feel great about having a well-known key we use "officially" to inject the current environment into an unstructured object.

Sorry, GitHub didn't show me this comment until I refreshed the page.

I think I get your reservations - explicit schemas are usually best. That said, your analogy of the environment being like environment variables, and looking at how other systems use environment variables is making me feel better about the pattern.

If we say that Functions can effectively communicate by reading from and writing to shared "environment variables", i.e. top-level keys in the environment, is it so bad if Crossplane "bootstraps" some of that environment, like GitHub workflows does with their env vars?

Contributor Author

Yeah, I don't feel super strongly against that and I can see your point, so I would be fine with annotation-style reserved keyspaces. I'll revert the whole environment to being just a struct tomorrow. Now the only open point is whether we want to wire up the current EnvironmentConfig retrieval from configurations, or just accept that people are going to access the API server (or any other external system) from functions and deal with it, which is the topic of the other thread above, so let's continue there if you agree.

Contributor Author

done: 2b28d9a

Contributor Author

Called it context, as you were doing in #4731, since it didn't make much sense to put the current environment in the environment but under a key .../environment...

@phisco (Contributor Author) commented Oct 6, 2023

I think this can be merged as is, because if we ever decided to embrace functions accessing the API server, as we are discussing in the thread above, we would just have to remove the environment-fetching wiring here and move the logic to a dedicated function, without any major disruption for users.

@negz (Member) commented Oct 6, 2023

@phisco Thanks, I agree we can merge this.

I do like your idea of letting function-patch-and-transform patch to and from (what we'd now call) the whole context, rather than just the environment key. I could see that being solvable in function-patch-and-transform by introducing a new PatchFromContext set of patches and leaving the PatchFromEnvironment ones for backward compatibility.

@negz negz merged commit 4120759 into crossplane:master Oct 6, 2023
14 of 16 checks passed
negz added a commit to negz/function-sdk-go that referenced this pull request Oct 13, 2023
This is the Go SDK end of crossplane/crossplane#4632

Signed-off-by: Nic Cope <nicc@rk0n.org>

Successfully merging this pull request may close these issues.

Support EnvironmentConfig when using Composition Functions