KEP 33 - Structured parameters #1666

Merged: 17 commits into main from an/kep-33-structured-parameters, Sep 25, 2020

Conversation

ANeumann82 (Member)

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
@ANeumann82 ANeumann82 marked this pull request as draft August 31, 2020 10:24
zen-dog and others added 10 commits August 31, 2020 21:18
Updated defined types in Spec
Added small description to wizard and listing section

Member

@kensipe kensipe left a comment

1st pass


### Non-Goals

Extend the current format to support new features.
Member

Is it a goal or a non-goal to provide recommended groupings? I ask because there is UX value in having standard naming for things like "data node" and "jvm" etc. If not a goal here... perhaps a mention that there is value in a future kep based on this feature for grouping and standardizing the user experience.

Contributor

> Is it a goal or a non-goal to provide recommended groupings?

This isn't a goal at this point. However, this should be possible via extra `"ui:xxx": ...` properties on parameters.
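
For illustration, a minimal sketch of what such grouping hints could look like, assuming hypothetical `ui:group` / `ui:order` extension keywords (these are not defined by this KEP):

```yaml
properties:
  Zookeeper:
    type: object
    "ui:group": "Coordination"   # hypothetical UI hint, ignored by schema validation
    properties:
      URI:
        type: string
        "ui:order": 1            # hypothetical ordering hint for a wizard/UI
```

A UI could read such extra properties to build standard groupings like "data node" or "jvm" without the parameter schema itself having to change.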

"$schema": "http://json-schema.org/draft-07/schema#"
apiVersion: kudo.dev/v1beta2
properties:
Zookeeper:
Member

Is this intended to be Camel case? Or Sentence case? Can it be two words combined or not, and how is that managed? What about the display name... is it dependent on the casing of the property? And does the casing impact setting the value from the values file or passing the param via the CLI?

Contributor

> Is this intended to be Camel case? or Sentence case?

We don't have an established policy at this point, so it's up for debate. But yes, it is case-sensitive (in the CLI and in templates).

type: "string"
description: "host and port information for Zookeeper connection. e.g. zk:2181,zk2:2181,zk3:2181"
default: "zookeeper-instance-zookeeper-0.zookeeper-instance-hs:2181"
required: ["URI"]
Member

The semantics of this `required` seem different to me... the value of `required` before was to ensure that a value for that property was provided, and it was mainly useful when a default was NOT provided. When a default was provided, it was used.

Defining `required` in a schema feels different in that, if a default value is available, the user does not need to provide it... but if you have a values file that you want to validate against the schema, validation would fail because the required element is missing.

Contributor

That's a good point. I guess we'll have to enrich the values with the defaults from the schema and then validate it 🤔

Member Author

Yeah, we have to merge the data before validating. There are discussions about this, and the consensus seems to be that validation will (usually) ignore default values.
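
As a rough illustration of that order of operations (schema fragment and values are invented, based on the Zookeeper example above):

```yaml
# Schema fragment (sketch): URI is required but also has a default
Zookeeper:
  type: object
  required: ["URI"]
  properties:
    URI:
      type: string
      default: "zookeeper-instance-zookeeper-0.zookeeper-instance-hs:2181"
---
# User-supplied values omit URI entirely
Zookeeper: {}
---
# Values after merging in the defaults; `required` is only checked now, so validation passes
Zookeeper:
  URI: "zookeeper-instance-zookeeper-0.zookeeper-instance-hs:2181"
```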

### Non-Goals

Extend the current format to support new features.

Member

Additionally, will the grouping/hierarchy be limited? And is that a goal or non-goal? I believe DC/OS params were limited to one level deep, so it was possible to set a grouping or category of information. This also has an impact on the UI team and the UX: it is really hard to support infinite nesting of properties, but it is reasonable to expect 1 or 2 levels. One can expect the UI team to create a simple tab for each category and list all the properties in that category; it is challenging to imagine what infinite nesting looks like.

Contributor

Limiting the hierarchy is not a goal at this point. This is probably better solved "by convention" anyway. Also, dcos C* has e.g. four levels.


### CRDs

Normally, per Kubernetes conventions, CRDs would evolve over released versions without a breaking change from one version to the next, allowing clients to gradually convert to new CRD versions. This requires a CRD to be convertible from the old version to the new one _and vice versa_, so that all clients (new and old) can read and write both old and new CRD versions. However, such a conversion in both directions will not be possible with this change, as JSON-schema supports a lot of configurations that cannot be mapped back to a flat parameter list.
Contributor

@alenkacz alenkacz Sep 7, 2020

I wonder if this is right; I have to do some reading. I wonder if we should not have a really new API version / a different API when we're doing such a breaking change. Is one-way conversion the norm?

Contributor

@alenkacz alenkacz Sep 7, 2020

I think this is fine; this is what the docs say for beta features: "The schema and/or semantics of objects may change in incompatible ways in a subsequent beta or stable release. When this happens, we will provide instructions for migrating to the next version. This may require deleting, editing, and re-creating API objects. The editing process may require some thought. This may require downtime for applications that rely on the feature."

Contributor

But at the same time I found this: "In Kubernetes, all versions must be safely round-tripable through each other. This means that if we convert from version 1 to version 2, and then back to version 1, we must not lose information."

Contributor

After that, btw, they propose that you need to change v1 and enhance it so that it can contain all the information from v2. Can you imagine doing that?

Member Author

Enhancing the v1beta1 version so it can contain all the data from v1beta2 was my first idea, but that would mean we'd have to restrict the usage of JSON-schema to a quite limited subset. So, realistically, I don't see that as a viable option.

The "all versions must be safely round-tripable" rule is true, but only for all served versions. The current idea is to deprecate v1beta1 immediately and only serve v1beta2. In the code, we still need to have v1beta1, but only for migration purposes.

This ensures that only conversions from v1beta1 to v1beta2 are done, never the other way around.
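
For illustration, the served/storage flags described here map onto the standard CRD versioning fields; a rough sketch (CRD name abbreviated, schemas omitted, not the final manifest):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: operatorversions.kudo.dev
spec:
  group: kudo.dev
  versions:
    - name: v1beta1
      served: false   # kept in the list only so stored objects can still be migrated
      storage: false
      schema: {}      # omitted in this sketch
    - name: v1beta2
      served: true    # the only version clients can read and write
      storage: true
      schema: {}      # omitted in this sketch
```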

Contributor

Alright, I cannot really evaluate how good that option is and whether it poses some problems; I would have to do a lot of reading. All the sources I found talk about round-trippability as the main requirement for all fields: https://groups.google.com/g/kubernetes-sig-api-machinery/c/07JKqCzQKsc?pli=1 🤷

Contributor

Do you have some examples/reading about someone who also did it this way? Or basically any conversion webhooks in the wild?

Member Author

For examples: the only real project I know that uses multiple CRD versions is cert-manager, and I looked at their code for some of the things. They do register a conversion webhook, but fully rely on the default schema conversion without any custom code. I haven't looked into their CRDs to see how closely they follow the full-roundtrip rule. I assume closely, as they might serve multiple/all versions.
I don't know any project that does "one-way conversion" or migration, though.

And yeah, the solution described here is not optimal and breaks the Kubernetes API compatibility rules, but it shouldn't pose any technical problems.

The alternatives would be one of these:

  1. Make it a fully breaking change - Ignore existing installations and just change the CRD. I'm very opposed to this solution.
  2. Do it following the K8s API rules - This would involve multiple new CRD versions over multiple KUDO releases and not really change much: the currently released clients still won't work with the final CRDs, but there is a time period in between, where we support both the old and the new model, during which clients can switch. I argue that we don't have many clients yet, and are not that integrated into existing structures, so we can get away with skipping this migration period.


The server-side of the proposal includes:
- Updating `Operator` and `Instance` CRDs to use JSON-schema as the parameter format
- Providing a conversion-webhook to convert in-cluster running operators to the new format
Contributor

Just a reminder that this will bump the minimal Kubernetes version for KUDO to 1.16 while we've been keeping it at 1.15 (is Konvoy 1.16 out?). I don't think it changes anything, just noting :)

Member Author

Well, it's feature gated and on by default for 1.15. But yes, we should probably mention it somewhere in the KEP. (https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definition-versioning/#webhook-conversion)
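
For reference, with the `apiextensions.k8s.io/v1` CRD API, webhook conversion is declared via `spec.conversion`; a rough sketch (service name, namespace, and path are placeholders, not KUDO's actual values):

```yaml
spec:
  conversion:
    strategy: Webhook
    webhook:
      conversionReviewVersions: ["v1"]
      clientConfig:
        service:
          namespace: kudo-system                 # placeholder
          name: kudo-controller-manager-service  # placeholder
          path: /convert                         # placeholder
          port: 443
        caBundle: "<base64-encoded CA bundle>"   # typically injected by cert tooling
```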

1. Deploy the new CRD version where the new (`v1beta2`) version will be the _only_ served and stored version:
- `v1beta1` will still be in the list, but with `storage=false, served=false`
- `v1beta2` will be marked as `storage=true, served=true`
2. Deploy the new manager with the conversion webhook
Contributor

I would personally deploy the manager before the CRD, so there's already a conversion implementation in place when the CRD comes.

Member Author

@ANeumann82 ANeumann82 Sep 7, 2020

Hmmm. I'm not sure. Starting the manager first would mean the manager would try to start reconciliations, or maybe couldn't even start, because it tries to register a watch on `Instance` with `v1beta2`, which is unknown to the cluster if the CRDs are not deployed first.

Maybe this is the reason that projects separate the manager and the conversion webhook? That's another option. I would prefer to keep it in the manager, but maybe we have to make it a separate deployment?

Contributor

Hmm, I see, maybe. But that would make our upgrade process even more complicated. Can we even express steps like this easily right now?

Member Author

Yeah, I think the steps here can be implemented pretty easily. I've updated the section a little bit.

Contributor

> I would personally deploy manager before the crd?

Like Andreas mentioned, you need the new CRD version before the manager can register itself to watch them, right? And the resources have to be converted too at this moment. So the whole update process will be something like:

  1. Kill the manager (or maybe step 3 can do it for the user?)
  2. `kubectl krew upgrade kudoctl` to upgrade the kudoctl
  3. `kudo init --update` to install new CRDs, run the new manager+webhook, and trigger the migration

Member Author

`kudo init --upgrade` kills the manager already, so there's no need to manually shut it down. The whole process described here is internal to `kudo init --upgrade`.

Contributor

@alenkacz alenkacz left a comment

This is a well-researched and very detailed KEP, thank you for that 👏

I have done my first pass; I have a couple of comments. I think my only big concern is this: #1666 (comment)

Other than that, it looks like a lot of work and it's a breaking change, so the only thing I question is whether it's worth it, and whether some easier but much less nice solution would not be good enough.


A better solution than `apiextensions.JSON` would be to have a typed structure, but CRDs currently cannot describe recursive data structures, as they [don't support $ref or $recursiveRef](https://github.com/kubernetes/kubernetes/issues/54579). The only way to define an arbitrary JSON structure in a Kubernetes API is with `apiextensions.JSON`. The resulting CRD for the operator version will therefore look like this:
Contributor

I see, this is fine on the CRD level. What would be nice is to have a method on the OV to get you that strongly typed version.

Member Author

Agreed. We will have to figure out what this strongly typed structure will look like. Maybe we can use something existing from a JSON-schema Go library; maybe we have to wrap that structure to make it easily usable.

Contributor

Well, I found that this json-schema validator has this `type Schema struct`. But dragging this into the `OperatorVersion` ...

Member Author

Nice find 👍 We can't use that in the OV itself, because it won't get rendered in the CRD due to the missing support for recursion. But we could maybe use it internally.
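
On the CRD side, an arbitrary JSON value is usually expressed by disabling pruning for that field; a rough sketch of what such a property could look like in the generated CRD schema (not the KEP's actual schema, and the field name `parameters` is only illustrative):

```yaml
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        parameters:
          type: object
          x-kubernetes-preserve-unknown-fields: true   # accept arbitrary JSON without pruning
```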


// TODO: it seems that the conversion webhook doesn't have access to the cluster resources (at least by default). How do we unwrap the values (e.g. `MAP_PARAM`) without access to the `OperatorVersion` JSON-schema?
Contributor

I don't think this is an issue; you just inject a client there as we do with the other webhook. I can't think of why that would not work.

Member Author

It's a little bit annoying, because (at least in my current PoC) the conversion is mostly autogenerated and injecting the client may be a bit ugly. But nothing that can't be solved.

Contributor

ah okay, I did not see that code but yeah, should be possible :)

Member Author

The code is here, if you want to take a look: https://github.com/kudobuilder/kudo/blob/an/add_v1beta2_and_conversion/pkg/apis/kudo/v1beta1/conversion.go

It's a proof-of-concept branch and won't be merged, it's more about exploratory development ;) The conversion there isn't the real conversion either, of course.

Contributor

Yeah, I think that's similar to what we have done with admission. For admission, there's this `Validator` interface that's nicely wired in through the manager and everything. But that does not have a client and supports only simple use cases. I think for complex use cases we will have to reimplement some of the stuff ourselves, basically hooking in here: https://github.com/kubernetes-sigs/controller-runtime/blob/master/pkg/webhook/conversion/conversion.go#L92

I think that's very much in line with what we have here. https://github.com/kudobuilder/kudo/blob/main/pkg/webhook/instance_admission.go#L45

Member Author

I would prefer not to reimplement that, but to use the code from the controller-runtime. The problem is, both patterns (conversion-gen that uses the schema to convert, and controller-runtime which uses Hub and Converter interfaces) do not allow for additional metadata to be plugged into the conversion process... Meh.

In the end it's an implementation detail, but it would be nice if we did not have to reinvent the wheel for once ;)

Contributor

But that's actually the recommended process. I talked with the controller-runtime people about this for the admission part. They provide these simplified interfaces with no metadata for the basic use. For anything more complex, you're on your own and yes, you have to handle the request/response object by yourself.

Member Author

Ah, good to know 👍

Contributor

I like the simplicity of having a basically side-effect-free conversion function. So maybe we can keep it simple and just try to unmarshal the values into either type and take the one that was successful. Brute force all things! :D

Member Author

Yeah, we'll have to try it. Would be nice if it works without external data.

### Trigger

With the flat list of parameters, each parameter had a trigger attribute that specified a plan to trigger when the parameter was changed. With the nested structure this gets more complicated, as one "parameter branch" might have multiple different triggers defined, which is clearly not something we want. Here is the new rule set:
- Every level of the structure *may* specify a trigger.
Contributor

Why not allow only leaves to have a trigger? And no inheritance?

Member Author

That would be an option, but at the same time it would

- lead to a lot of repeated trigger attributes
- be easy to forget for params of other plans, if we assume the default "when it's not set, it's deploy" rule. I think it's easier to only specify the trigger on the /Backup object, and have all parameters below that trigger the backup plan.

Only triggers on leaves is a valid strategy here, I think. Maybe inheritance is a premature optimization, I'm not fully sure. I still think it'd be nice to have it, from an operator developer's perspective.
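
A small sketch of the inheritance idea (parameter names are invented, and nothing here is final syntax; the `trigger` placement just follows the rules quoted above):

```yaml
properties:
  Backup:
    type: object
    trigger: backup        # set once at the group level ...
    properties:
      Schedule:
        type: string       # ... inherited: changing this triggers the backup plan
      Retention:
        type: integer      # ... inherited as well
  NodeCount:
    type: integer          # no trigger anywhere up the tree, so it falls back to deploy
```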

Contributor

Yeah, I am just afraid of some ambiguity in rich trees of params. That said, I don't even know how much trigger is used in the wild, whether people usually leave it at the default or ...?

Member Author

That depends. It's mostly used on custom plans; I'm not sure how widely they are used. But I think custom plans are one of the USPs of KUDO: the ability to trigger a backup plan or execute a custom action is something that's not easily possible with other solutions, and I think we should embrace it.

Contributor

Yeah I actually don't feel strongly either way about this, it's definitely not a blocker for me

Contributor

@zen-dog zen-dog Sep 9, 2020

> why not only leaves to have a trigger? And no inheritance

I tend to agree. While @ANeumann82 is right that it will "lead to a lot of repeated trigger attributes", this is no different from what we have today. So, on the one hand, we save the operator developer some lines by defining it at a higher level; on the other, there is an increased mental tax to pay when figuring out the trigger inheritance.

I think I'd rather save brain cycles than YAML lines. And we could always relax the rule and introduce inheritance later.

@ANeumann82 ANeumann82 marked this pull request as ready for review September 7, 2020 14:04
@ANeumann82 (Member Author)

> Other than that, it looks like a lot of work and it's a breaking change, so the only thing I question is whether it's worth it and some easier but much less nicer solution would not be good-enough.

That's a valid concern, and I'm not sure how to resolve it. One reason for doing it this way is that we will have to do a lot of this work anyway at some point, for example additional CRD versions, the conversion webhook, etc.

If we build this now, it will be a lot easier to utilize for smaller changes later on.

@alenkacz (Contributor)

alenkacz commented Sep 8, 2020

@ANeumann82 I think I would like to see at least one alternative: one of the "uglier" solutions to put alongside this one, so we can compare the "ugliness" vs. the amount of work. WDYT?

Contributor

@alenkacz alenkacz left a comment

I don't have any huge concerns with this approach other than "should we do it" and "is it worth it". So let's merge and figure out those two questions when we move this to implementable?

I think we leave KEPs in provisional open for too long and don't merge them.

@ANeumann82 ANeumann82 merged commit 8d66df8 into main Sep 25, 2020
@ANeumann82 ANeumann82 deleted the an/kep-33-structured-parameters branch September 25, 2020 12:30