Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP 2211: Omit managedFields server-side #2212

Conversation

knight42
Copy link
Member

@knight42 knight42 commented Dec 22, 2020

This KEP proposes to omit the managedFields of objects on the server-side if users send a request with a special content type modifier.

Motivation: kubernetes/kubernetes#90066

Rendered

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 22, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: knight42
To complete the pull request process, please assign fedebongio after the PR has been reviewed.
You can assign the PR to them by writing /assign @fedebongio in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Dec 22, 2020
FIXME: we need to set the `Accept` header on the client-side and perform client-side stripping if the API server still
returns objects with managedFields. I am not sure we should do this in `client-go` and `cli-runtime` or only in `kubectl get`.

/cc @apelisse
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/cc @apelisse

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about controller-runtime, or is that taken care of by the client-go ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My proposed change is providing a helper function in client-go which is able to set the Accept header in rest.Config, so that users could tell the API server they want to omit the managed fields.

Then you could use the modified rest.Config everywhere that needs it and the managed fields should be dropped if the API server recognize the new Accept header.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still really not convinced we need a mechanism to set a single Accept header.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @apelisse .

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't say need but the header we propose for fallback etc. might not be too known or familiar as an option here that I wouldn't mind having a simpler option that does it for you.
Like PreferNoManagedFields() = header with q= fallback and RequestNoManagedFields() = no fallback.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kwiesmueller yes that's what I mean by saying "providing a helper function", but it seems there is more we could do on client-side if we need to consider the case that API server cannot recognize this new header and we still want to omit the managed fields.

}
```

### Client-side
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@julianvmodesto We'll try to get this KEP merged as provisional quickly, so that you can fill in that section if you're ok with it :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan is to implement this using Accept: application/yaml+unmanaged,application/yaml;q=0.9, or simply: Accept: application/yaml+unmanaged,application/yaml

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does unmanaged mean here ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does unmanaged mean here ?

Omit the managed fields.

@knight42
Copy link
Member Author


ManagedFields is used to keep track of who has changed a field of an object, and Server-side Apply relies on such information.
While the managedFields is beneficial to resolving conflicts on updates, it is a bit verbose to many users because this field
could be quite large and generally users are not interested in this information.
Copy link
Member

@apelisse apelisse Dec 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

users and controllers, also impact bandwidth usage, size of internal caches

While the managedFields is beneficial to resolving conflicts on updates, it is a bit verbose to many users because this field
could be quite large and generally users are not interested in this information.

This KEP is proposing stripping the managedFields on the server-side when users request objects to make the response less verbose.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only when user "requests objects", since pretty much all operations return the final object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully we are talking about omitting them from the response and not stripping them from the stored object.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not omitting it from stored object as well ? If a cluster operator has made a conscious decision, it would save a lot of storage for him.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully we are talking about omitting them from the response and not stripping them from the stored object.

Yes, the original object saved in etcd is left intact.

Why not omitting it from stored object as well

The Server-side Apply feature needs the managedFields.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a cluster operator has made a conscious decision, it would save a lot of storage for him.

It's unclear if the trade-offs they're making would be all that conscious.

I'm happy to discuss if there is a specific use-case that you have in mind, but the "Accept" header is not where we would want to strip managedFields from storage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not omitting it from stored object as well ?

  1. it loses valuable context about the object that other aspects of the system need
  2. this would not be the right mechanism for that at all, we'd need to make a setting otherwise it would come back
  3. this is primarily about read operations, although it will also work on write operations

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to Daniel. Managed field is crucial to automatic config mgmt and a depency for system needs. This proposal should be a minimal UI fix IMO.

A off-topic idea is we may want to bring up some real-world cases how this tech could be properly used by the platform team in server-side apply doc. The main use case on our side is k-o-k configuration mgmt scenario, but it should be more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good thanks. I agree a different setting is needed. I think until this becomes widespread in usage, the default should be ignored in returns and not stored as well. This is ahead of its time , but definitely a really nice concept.

[90066]: https://github.com/kubernetes/kubernetes/issues/90066

At the same time, it is not ideal to only drop the managedFields on the client-side, because that would require every client
(such as kubectl or CLI tools written by users using client-go) to do the same thing.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, along with extra bandwidth

* Add three content types to denote the client expects the API server to strip the managedFields.
* `application/json+unmanaged`
* `application/yaml+unmanaged`
* `application/vnd.kubernetes.protobuf+unmanaged`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we don't introduce it just yet, we might have other versions for the encoding of managed fields in the future. Currently, we only have v1. Can we describe how the header would look like if I wanted to request a specific format version of managed fields in the future?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"unmanaged" isn't the right word. "omitting.managed.fields", "sans-managed" perhaps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer something that was a positive specification rather than negative. So, like we could add "managed.v1" (the default) and "managed.omit" (the new option), which leaves room for v2 etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to compose with the tabular output content type.

Also I think clients would specify these in the "Accept" header. Ah, it's mentioned below. But better to be specific.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we describe how the header would look like if I wanted to request a specific format version of managed fields in the future

I guess this is out of the scope of this KEP, but we could still leave room for it.

@lavalamp IIUC are you suggesting changing the Content-Type to application/{json,yaml,profobuf}.managed.omit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apelisse
Copy link
Member

I'm happy to merge this as provisional for now.
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 24, 2020
* `application/json+unmanaged`
* `application/yaml+unmanaged`
* `application/vnd.kubernetes.protobuf+unmanaged`
* Add a new serializer, say `UnmanagedSerializer`, which is able to strip the managedFields when encoding objects.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The serializer would ideally just take the version (either v1 or omit) rather than needing a different one for each version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, it will be wrapper around existing serializers I think.


ContentTypeUnmanagedJSON string = "application/json+unmanaged"
ContentTypeUnmanagedYAML string = "application/yaml+unmanaged"
ContentTypeUnmanagedProtobuf string = "application/vnd.kubernetes.protobuf+unmanaged"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should encourage a combinatorial explosion. We should represent this with a {json/yaml/proto, managed-fields:v1/omit} pair.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lavalamp Could you elaborate more on the representation please?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to Daniel (modulo the fact that for backward compatibility we need to default to v1 if unset)

This BTW will also reflect the implementation.

* Add three content types to denote the client expects the API server to strip the managedFields.
* `application/json+unmanaged`
* `application/yaml+unmanaged`
* `application/vnd.kubernetes.protobuf+unmanaged`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the name of the header ?

* `application/json+unmanaged`
* `application/yaml+unmanaged`
* `application/vnd.kubernetes.protobuf+unmanaged`
* Add a new serializer, say `UnmanagedSerializer`, which is able to strip the managedFields when encoding objects.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, it will be wrapper around existing serializers I think.

FIXME: we need to set the `Accept` header on the client-side and perform client-side stripping if the API server still
returns objects with managedFields. I am not sure we should do this in `client-go` and `cli-runtime` or only in `kubectl get`.

/cc @apelisse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @apelisse .


ContentTypeUnmanagedJSON string = "application/json+unmanaged"
ContentTypeUnmanagedYAML string = "application/yaml+unmanaged"
ContentTypeUnmanagedProtobuf string = "application/vnd.kubernetes.protobuf+unmanaged"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to Daniel (modulo the fact that for backward compatibility we need to default to v1 if unset)

This BTW will also reflect the implementation.

owning-sig: sig-api-machinery
participating-sigs: [sig-cli]
reviewers: ['@apelisse', '@kwiesmueller']
approvers: ['@wojtek-t']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need sig-apimachinery approver too (it seems @lavalamp is already looking into it)

alpha: v1.21
beta: v1.22
stable: v1.23
# TODO: do we need a feature gate?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need

@knight42 knight42 force-pushed the feat/strip-managed-fields-server-side branch from 373f5a3 to 9137444 Compare January 8, 2021 05:52
@k8s-ci-robot
Copy link
Contributor

New changes are detected. LGTM label has been removed.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 8, 2021
@knight42 knight42 force-pushed the feat/strip-managed-fields-server-side branch from 9137444 to cd4c958 Compare January 8, 2021 06:00

## Design Details

### Server-side
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not clear to me why the server-side piece is required... that seems significantly more complex and doesn't remove the need for client-side logic to pass alternate content-type headers. Is the bandwidth savings significant? Why would we not just drop the field in the client?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, it's ~30% of the object size if you're in proto mode.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that you mention it though, why don't we just implement in kubectl and see if controller operators complain? So far all the complaints have just been from users that don't want to see it AFAIK.

Copy link
Member

@liggitt liggitt Jan 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we just implement in kubectl and see if controller operators complain?

I would definitely start there (or somewhere client-side... client-go or cli-runtime or kubectl). If there's good evidence we need to push it further up the stack, doing it with a ;omit=metadata.managedFields modifier or something in a later iteration could be reasonable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have complaints about the network impact:
kubernetes/kubernetes#90066 (comment)

Hi, @apelisse, Due to addition managedFields metadata, payload of the every object got increased by 40%. For example, for simple nginx pod, before <1.18, response was ~6KB and >=1.18, response is 10KB. Ours is monitoring application, we use k8s apis to get the response of all objects (chunked fashion) and we dont use this field. The additional payload significantly costing the cpu and memory. is there any way to express the k8s API to exclude in the response ? If not, having functionality like this will significantly helpful for the clients.

kubernetes/kubernetes#90066 (comment)

As I've mentioned before and as @ganga1980 mentioned as well, the main managed fields problem is not a cosmetic one of hiding them in kubectl, it's a performance one, they increase the CPU, memory and bandwidth utilisation, solving this at the client side is not enough.

it's unclear from these comments what CPU/memory is being affected here.

Copy link

@krmayankk krmayankk Jan 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on doing on server side, we have lot of querying and storing the k8s resources, and a 40% increase with no specific near time use will help with bandwidth and processing power of our clients which are making this query continuously. What version was this managedFields introduced in responses from the server side, we might have to plan for this ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're worried about storage, just clear client-side. You only need to clear serverside if there's a problem with network bandwidth.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

client-side already has some changes, see kubernetes/kubernetes#91946 which dropped those from edit, but I keep on hearing this over and over that folks want to see those stripped also when doing get.


### Server-side changes

* Add three content types:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, we already have content type modifiers defined for metadata lists. Why are we not using that mechanism?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I was trying to ask for 😅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about these 🤷‍♂️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton - you mean thins like "as=Table", "as=PartialObjectMetadata" etc?
Shouldn't those be generic types?

[If you meant those - these doesn't benefit from this optimization yet: https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1152-less-object-serializations

  • and i think that we don't want to drop this.]

[added later]
It seems that "as" is one potential modifier, another one is e.g. "pretty", which is supported by the optimization above. So I guess we can implement the new modifier to take advantage of the optimization above.

1. Add a content type suffix to `k8s.io/apimachinery/pkg/runtime/types.go`, which can be used to construct the new content type
along with the existing ones:
```go
const ContentTypeSuffixOmitManagedFields = "+managed.omit"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to leverage the existing negotiation modifier mechanism, which already handles partial objects. How it's modelled is open to debate, but this feels way too specific to managed fields and has no future proofing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


### Server-side changes

* Add three content types:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton - you mean thins like "as=Table", "as=PartialObjectMetadata" etc?
Shouldn't those be generic types?

[If you meant those - these doesn't benefit from this optimization yet: https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1152-less-object-serializations

  • and i think that we don't want to drop this.]

[added later]
It seems that "as" is one potential modifier, another one is e.g. "pretty", which is supported by the optimization above. So I guess we can implement the new modifier to take advantage of the optimization above.

* Add three content types:
* `application/json+managed.omit`
* `application/yaml+managed.omit`
* `application/vnd.kubernetes.protobuf+managed.omit`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1


#### Alternative 1: New dedicated flag, for example `--show-managed-fields`

The value of flag `--show-managed-fields` could be `omit`, `default` and a version name(in case users want to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this flag? kubectl only?

participating-sigs: [sig-cli]
reviewers: ['@apelisse', '@kwiesmueller']
approvers: ['@wojtek-t', '@lavalamp']
prr-approvers: ['@wojtek-t']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to target alpha in 1.21, you need to fill PRR and get approval before feature freeze.


### Client-side changes

Implement a flag in `kubectl` to let users choose to omit the managedFields.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are parallel efforts for keeping kubectl preferences, compare #2111

1. We could add the original content type to the `Accept` header as a fallback, such as
`Accept: application/json+managed.omit, application/json;q=0.9`, therefore the API server could return JSON-formatted
objects rather than reject the request with a 406(Not Acceptable) error if it is too old to recognize the new content type.
2. Then make `kubectl get` to drop managedFields if user wants to omit managedFields, but the API server doesn't realize that.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the functionality doesn't exist on the server then we do nothing. I don't want to repeat the mechanism the sever already have. It'll just be a no-op in those cases.


## Design Details

### Server-side
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

client-side already has some changes, see kubernetes/kubernetes#91946 which dropped those from edit, but I keep on hearing this over and over that folks want to see those stripped also when doing get.


#### Alternative 1: New dedicated flag, for example `--show-managed-fields`

The value of flag `--show-managed-fields` could be `omit`, `default` and a version name(in case users want to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 for sig-cli pov, it allows much more flexibility for the future.


#### Alternative 1: New dedicated flag, for example `--show-managed-fields`

The value of flag `--show-managed-fields` could be `omit`, `default` and a version name(in case users want to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm even inclined to have it default to omit.

Cons:
* The new flag depends on the `-o` flag, it only works with `-o (json|yaml)`. In other cases, it is a no-op.

#### Alternative 2: New output format, for example `-o yaml+managed.omit`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this approach, I'd prefer seeing the other flag with omit being the default.

@lavalamp
Copy link
Member

lavalamp commented Feb 5, 2021

For some reason github won't let me reply directly to some of @soltysh's comments above, so:

I have a new thought, which is to combine this feature with a filtering feature, so basically there'd be one flag. The paradigm is that this flag consumes the managed fields, and uses them to filter the object:

  • --managed-fields-filter=.* take off managed fields, and show all fields.
  • --managed-fields-filter=<fieldmanager> take off managed fields, and show only fields managed in the current version by the given field manager*
  • --managed-fields-filter=<fieldmanager regexp> take off managed fields, and show all fields managed in the current version by any of the field managers matching the given regexp
  • --managed-fields-filter=^$ take off managed fields, and show fields managed by no one.
  • --managed-fields-filter= existing behavior-- leave managed fields alone, don't do any filtering

Could be a weird idea, could be useful, what do you think?

@kwiesmueller
Copy link
Member

@soltysh my problem with the omit flag is, that it doesn't seem consistent.
Using -o is the direct translation of what happens on the server. We chose a different serializer that removes managedFields and as far as I understand -o is there for exactly that, specifying the serialization the user wants.
This has the additional benefit of allowing us to request different versions of managedFields in the future (-o yaml+managed.v2).

Aside from that technicality, the omit flag doesn't seem like fully deterministic UX to me.
If it is --show-managed-fields then this flag does not really guarantee to do anything in some cases (like when there are no managedFields to show).
What would happen in combination with -o wide or just no -o. Sure, we would not show managedFields as usual, but the user technically is asking for them and we provide them with the option. On the other hand limiting the use of this flag to certain scenarios would make using the CLI harder.

The -o flag would directly translate to the Accept header (imo) so if it can not be provided, it would act just as if you request a format that does not exist.

If the flag would be --omit-managed-fields this would be a little more consistent, but still add a flag for something we already have and can represent while adding a very long flag.

I know this is discussing on a very granular level and making the topic probably more complicated than necessary. There was a vote on Slack and the flag won so I'll go with it. But as you just commented here, I want to raise my concerns while I still can.

@soltysh
Copy link
Contributor

soltysh commented Feb 8, 2021

* `--managed-fields-filter=.*` take off managed fields, and show all fields.

I'd love to have default be to filter out managed fields. Rarely folks invoking kubectl care about those managed fields.

* `--managed-fields-filter=^$` take off managed fields, and show fields managed by no one.

Is it possible that a field is not managed by anyone?

* `--managed-fields-filter=` existing behavior-- leave managed fields alone, don't do any filtering

Could be a weird idea, could be useful, what do you think?

@lavalamp I assume this flag would directly translate to params being passed to the server or are you referring to client-side filtering? I still would be a big proponent of having this mechanism on the server. The proposal itself seems reasonable although I'm worried about too much complexity when specifying those, but that's still on the table.

@soltysh
Copy link
Contributor

soltysh commented Feb 8, 2021

@kwiesmueller although -o can translate to what the server response looks like it doesn't necessarily have to. Since its inception --output was used to steer the shape of the output. For example, the default output and -o wide are actually printing data from the same tabled output the server provides. I'd prefer to keep that distinction still intact and left kubectl decide only about -o value without coupling it to what we ask the server to print. Yet still, this is still under discussions so I don't want to clearly say yes or no 😉

@apelisse
Copy link
Member

apelisse commented Feb 8, 2021

For some reason github won't let me reply directly to some of @soltysh's comments above, so:

I have a new thought, which is to combine this feature with a filtering feature, so basically there'd be one flag. The paradigm is that this flag consumes the managed fields, and uses them to filter the object:

  • --managed-fields-filter=.* take off managed fields, and show all fields.
  • --managed-fields-filter=<fieldmanager> take off managed fields, and show only fields managed in the current version by the given field manager*
  • --managed-fields-filter=<fieldmanager regexp> take off managed fields, and show all fields managed in the current version by any of the field managers matching the given regexp
  • --managed-fields-filter=^$ take off managed fields, and show fields managed by no one.
  • --managed-fields-filter= existing behavior-- leave managed fields alone, don't do any filtering

Could be a weird idea, could be useful, what do you think?

Do we want to have this mechanism available to control the output of every command, including apply, patch, edit, etc? I think what you're describing here makes sense, but there are enough options and controls and it's specific enough that I think it'd be better in its own command rather than as a horizontal flag that applies to all command outputs.

@lavalamp
Copy link
Member

lavalamp commented Feb 8, 2021

I assume this flag would directly translate to params being passed to the server or are you referring to client-side filtering? I still would be a big proponent of having this mechanism on the server.

@soltysh I think it'd be best to implement on the client. I don't really see much benefit in having the server do this for you (it's fairly trivial with the structured-merge-diff library). Maybe if it begins to be a blocker to api clients in other languages.

@lavalamp
Copy link
Member

lavalamp commented Feb 8, 2021

Do we want to have this mechanism available to control the output of every command, including apply, patch, edit, etc? I think what you're describing here makes sense, but there are enough options and controls and it's specific enough that I think it'd be better in its own command rather than as a horizontal flag that applies to all command outputs.

I was only thinking about this for get commands personally. For get commands, my first thought is that it's not good to make users pipe the output through a separate kubectl managed-fields apply ____ command to do the filtering to omit them. For other commands it seems a bit more reasonable? I really don't know.

@soltysh
Copy link
Contributor

soltysh commented Feb 9, 2021

Yeah, we have a get in-flight client-side, see kubernetes/kubernetes#96878

@knight42 knight42 force-pushed the feat/strip-managed-fields-server-side branch from bb249fb to 061d0fd Compare February 18, 2021 17:38
@knight42
Copy link
Member Author

As for the server-side changes, it looks like we all agree it is better to leverage the existing negotiation modifier mechanism, so I have updated the design detail in kep, and here is a PoC: kubernetes/kubernetes@master...knight42:feat/server-omit-managed-fields. Reviewers please take another look.

@wojtek-t
Copy link
Member

@knight42 - are we going to proceed with this? Or can we close it?

@wojtek-t wojtek-t self-assigned this May 10, 2021
@knight42
Copy link
Member Author

@wojtek-t I would love to proceed, but no one has commented on the updated KEP 😢 , so I am not sure whether it needs further improvement.

@apelisse
Copy link
Member

I think we don't have enough evidence that we need this at that time. We can close it and see if people have stronger use-cases that require this, or we can let open. 🤷

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 8, 2021
@kwiesmueller
Copy link
Member

With SSA in GA now, I still think there can be use-cases where omitting the managedFields server-side could be beneficial.
We've already heard from people complaining about the increased network load and object cache size due to managedFields. While SSA should be useable by any user on the cluster, there might be some selected applications that don't need them and should be able to do without them.

Sadly we didn't get enough energy here to get to a positive conclusion.
I'm gonna set this to
/lifecycle frozen
until we decide to abandon this idea fully or the need grows more.
We should see more feedback with the growing adoption of v1.22 to influence that decision.

@k8s-ci-robot
Copy link
Contributor

@kwiesmueller: The lifecycle/frozen label cannot be applied to Pull Requests.

In response to this:

With SSA in GA now, I still think there can be use-cases where omitting the managedFields server-side could be beneficial.
We've already heard from people complaining about the increased network load and object cache size due to managedFields. While SSA should be useable by any user on the cluster, there might be some selected applications that don't need them and should be able to do without them.

Sadly we didn't get enough energy here to get to a positive conclusion.
I'm gonna set this to
/lifecycle frozen
until we decide to abandon this idea fully or the need grows more.
We should see more feedback with the growing adoption of v1.22 to influence that decision.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Copy link
Contributor

@kwiesmueller: The lifecycle/frozen label cannot be applied to Pull Requests.

In response to this:

/lifecycle frozen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kwiesmueller
Copy link
Member

Nevermind, PRs can't be frozen.
Then for the sake of having a proper status, I'll close it and we keep an eye on it in wg-api-expression.
If you have any input regarding this feature please let us know.
/close

@k8s-ci-robot
Copy link
Contributor

@kwiesmueller: Closed this PR.

In response to this:

Nevermind, PRs can't be frozen.
Then for the sake of having a proper status, I'll close it and we keep an eye on it in wg-api-expression.
If you have any input regarding this feature please let us know.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@xiaobaitusi
Copy link

Hi k8s experts, I just want to add another data point related to the size issue of managedFields. Basically for operators, e.g. KCC watch a great amount of objects, say over 20K objects, even with metadata-only watch enabled, we have still see a good portion of memory footprint reduction (over 30%) if watches can trim managedFields and the last-applied-configuration annotation (it includes the full spec even in metadata only watch).

Looking at the previous discussion about solving and mitigating the issue on the client side. The ask to reduce memory footprint for watches can potentially belong to controller-runtime library. Can anyone comment on the current thinking about this kep? If it's more appropriate to purpose the light watch in controller-runtime layer?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.