
managedFields options are very verbose, making kubectl usage a bit #90066

Closed
howardjohn opened this issue Apr 10, 2020 · 105 comments · Fixed by #91946
Labels
sig/cli Categorizes an issue or PR as relevant to SIG CLI. wg/api-expression Categorizes an issue or PR as relevant to WG API Expression.

Comments

@howardjohn
Contributor

Recently (1.18?) a new field, managedFields, was added to objects. It is incredibly verbose, which makes kubectl commands like -o yaml and edit painful.

For example, on a random deployment I have looked at, the entire kubectl get deployment -o yaml output is 640 lines. Managed fields account for over half of that, at 358 lines.

I am not sure if it's feasible to somehow mitigate this without breaking server-side apply or other features, but it would be useful if possible. One possible idea would be to just hide it from kubectl commands, but I am not sure what the impact of that would be. The goal of this issue is mostly to raise awareness that this causes some user pain and to see if we can improve it a bit.

Version:

Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T14:58:59Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.0", GitCommit:"9e991415386e4cf155a24b1da15becaa390438d8", GitTreeState:"clean", BuildDate:"2020-03-25T20:56:08Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
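
For anyone who wants to reproduce the size comparison, a rough sketch (assuming a deployment named my-deploy and yq v4 on the PATH; both are illustrative placeholders):

# Total lines returned by the server:
kubectl get deployment my-deploy -o yaml | wc -l
# Lines left once managedFields is stripped client-side:
kubectl get deployment my-deploy -o yaml | yq e 'del(.metadata.managedFields)' - | wc -l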
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Apr 10, 2020
@howardjohn
Contributor Author

/sig cli

@k8s-ci-robot k8s-ci-robot added sig/cli Categorizes an issue or PR as relevant to SIG CLI. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 10, 2020
@apelisse
Member

apelisse commented May 5, 2020

Hi @howardjohn

I understand that this is a voluminous field. The team has discussed this issue multiple times, but we've decided that the information it provides is useful to users.

First, feel free to read this blog post, which explains what the field does and why it's here: https://kubernetes.io/blog/2020/04/01/kubernetes-1.18-feature-server-side-apply-beta-2/

Also, I think I could have emphasized in the blog how the fields can be used to "blame" (à la git-blame) the changes made to an object. For example, you can use it to see what a controller added or changed on a specific field of an object, and when. It's also a great way to learn the intricate ways of Kubernetes and understand its asynchronous behavior (the service account controller adds the service account token volume to a pod, the "neg-status" label is set asynchronously by the NEG controller, etc.).
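
As a quick illustration of that "blame" use case, a minimal sketch (assuming jq is installed and a deployment named my-deploy; both names are placeholders, not part of the original comment):

# Print each field manager, the operation it performed, and when it last wrote:
kubectl get deployment my-deploy -o json \
  | jq -r '.metadata.managedFields[] | "\(.manager)  \(.operation)  \(.time)"'
# The fieldsV1 entry for a given manager lists exactly which fields it owns.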

I hope that helps, and thank you for the feedback!

@tamalsaha
Member

I would like to propose that -o yaml | json output hide managedFields by default and show it only if the user passes a flag like --show-managed-fields.

@liggitt
Member

liggitt commented May 9, 2020

While I agree the output is verbose, kubectl tries very hard to be as transparent as possible when outputting what the server returns in json/yaml format. Introspecting/manipulating particular fields in the response could be done if absolutely required, but is less than ideal.

@mcristina422
Contributor

I find that everyday workflows like kubectl edit and -o yaml | json are heavily impacted by the addition of managedFields. I think every developer will have a similarly shocked reaction to this field. While the blog post is a great resource, more communication is needed when introducing this field and its usefulness.

At first I thought that hiding it by default in -o yaml | json would be nice, but I agree that introspecting/manipulating particular fields should not be done by kubectl.

@HaveFun83

Same here. Please make managed fields opt-in via an extra kubectl flag.

@papdaniel

I too think showing managed fields should be optional.

@towolf
Contributor

towolf commented May 29, 2020

I think it should be possible to suppress it, so I can "export" resources as YAML and use them for other purposes. So, opt-out.

Didn't we have something like --export in the past?

@itaysk

itaysk commented May 29, 2020

I feel that this is a good place to recommend using a plugin to clean up resources: kubectl neat https://github.com/itaysk/kubectl-neat

@itaysk

itaysk commented May 29, 2020

@towolf there was a --export flag which got deprecated. That led me to create a replacement in the form of a kubectl plugin (kubectl-neat). Since the original use case for export is no longer required, the plugin's goal has evolved from "exporting a resource for recreation" to "making resources more readable for end users". For reference: kubernetes-sigs/krew-index#191 (comment)
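
For readers who want to try it, a minimal usage sketch (assuming krew is already installed and a pod named mypod exists; adjust names as needed):

# Install the plugin via krew, then pipe any verbose output through it:
kubectl krew install neat
kubectl get pod mypod -o yaml | kubectl neat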

@towolf
Contributor

towolf commented May 29, 2020

@itaysk I found your plugin and I really like it. However, having --export back in kubectl itself to obtain "neat" YAML would be much more compelling. So many applications come to mind.

@tringuyen-yw

The kubectl-neat plugin seems to solve this issue quite elegantly. Would it be possible to integrate kubectl-neat's feature set into kubectl get?

@apelisse
Member

apelisse commented Jun 1, 2020

I don't believe that kubectl should decide which fields people think are important or not. kubectl is implemented to display the object as it is received from the server. What we could do is print the managedFields as a more compact, one-line json format? I suspect we could also add another format option but since this would never be the default, I don't know how useful it would be.

@towolf
Contributor

towolf commented Jun 1, 2020

What we could do is print the managedFields as a more compact, one-line json format?

No, that's worse. And again unwieldy like the thing this replaced.

@tringuyen-yw

I don't believe that kubectl should decide which fields people think are important or not

True, but giving the user a way to filter out the info they don't need would significantly improve the user experience.

@lavalamp
Member

lavalamp commented Jun 1, 2020

I hear you all. Do you all agree that this isn't the only field that could benefit from at least a less prominent location, if not outright hiding?

The primary risk in hiding fields is in accidentally causing some workflow that eventually updates an object to then delete the hidden fields. That shouldn't be an issue for this particular field (depending on how much we obscure), but it would definitely be an issue for some other fields.

How much of the problem is in -o yaml vs edit? edit could probably obscure fields completely safely (i.e., merge the hidden fields back with the user's input).

@lavalamp
Member

lavalamp commented Jun 1, 2020

I'm also a little reluctant to filter fields in -o yaml, as that's the easiest way to know the exact state of an object right now. If we add filtering, I think we need a -o rawyaml that doesn't filter. OTOH if we're going to end up with an additional format, why not leave -o yaml the way it is and add a short / filtered yaml output?

@itaysk

itaysk commented Jun 1, 2020

Maybe hide it only in describe, which manipulates output anyway.

@lavalamp
Member

lavalamp commented Jun 1, 2020

Describe should definitely not show this. (except for maybe the update timestamp.) Does it right now? If so that's a bug.

@itaysk

itaysk commented Jun 1, 2020

Sorry, I didn't check.

@tringuyen-yw

tringuyen-yw commented Jun 1, 2020

Describe should definitely not show this. (except for maybe the update timestamp.) Does it right now? If so that's a bug.

Kubernetes 1.18: kubectl describe pod/mypod does NOT display the managedFields. kubectl get pod/mypod -o yaml does show managedFields.

@lavalamp
Member

lavalamp commented Jun 1, 2020

Great, that's what I expected. We should probably make it show the most recent update timestamp, I don't think we added that yet.

@turnes

turnes commented Jun 5, 2020

I'm quite new to k8s. You can disable managedFields: it is controlled by a feature gate (ServerSideApply) that is in beta. I encourage you not to disable it, because the documentation says:

Note: Please do try Beta features and give feedback on them! After they exit beta, it may not be practical for us to make more changes.

I've done a little test disabling it in a minikube env and it worked as expected. I did it just for learning purposes.

minikube start --feature-gates=ServerSideApply=false
kubectl get pods/kube-proxy-sb794 -n kube-system -o yaml

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2020-06-05T13:40:31Z"
  generateName: kube-proxy-
  labels:
    controller-revision-hash: 58b47d6fd4
    k8s-app: kube-proxy
    pod-template-generation: "1"
  name: kube-proxy-sb794
  namespace: kube-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: DaemonSet
    name: kube-proxy
    uid: 2561d935-a809-4aab-850b-540b9336ccff
  resourceVersion: "447"
  selfLink: /api/v1/namespaces/kube-system/pods/kube-proxy-sb794
  uid: 1f8afe9e-9988-44f1-8b05-a7217ddf9d16
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - la2
  containers:
  - command:
    - /usr/local/bin/kube-proxy
    - --config=/var/lib/kube-proxy/config.conf
    - --hostname-override=$(NODE_NAME)
    env:
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    image: k8s.gcr.io/kube-proxy:v1.18.3
    imagePullPolicy: IfNotPresent
    name: kube-proxy
    resources: {}
    securityContext:
      privileged: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/kube-proxy
      name: kube-proxy
    - mountPath: /run/xtables.lock
      name: xtables-lock
    - mountPath: /lib/modules
      name: lib-modules
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-proxy-token-xhkdj
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  nodeName: la2
  nodeSelector:
    kubernetes.io/os: linux
  priority: 2000001000
  priorityClassName: system-node-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: kube-proxy
  serviceAccountName: kube-proxy
  terminationGracePeriodSeconds: 30
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/disk-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/pid-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/unschedulable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/network-unavailable
    operator: Exists
  volumes:
  - configMap:
      defaultMode: 420
      name: kube-proxy
    name: kube-proxy
  - hostPath:
      path: /run/xtables.lock
      type: FileOrCreate
    name: xtables-lock
  - hostPath:
      path: /lib/modules
      type: ""
    name: lib-modules
  - name: kube-proxy-token-xhkdj
    secret:
      defaultMode: 420
      secretName: kube-proxy-token-xhkdj
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-06-05T13:40:31Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-06-05T13:40:34Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-06-05T13:40:34Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-06-05T13:40:31Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://4eb4e3ff978bba60055ad54cfea95d1f0cb88e1b0fc208705ed189827a56fe9c
    image: k8s.gcr.io/kube-proxy:v1.18.3
    imageID: docker-pullable://k8s.gcr.io/kube-proxy@sha256:6a093c22e305039b7bd6c3f8eab8f202ad8238066ed210857b25524443aa8aff
    lastState: {}
    name: kube-proxy
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2020-06-05T13:40:33Z"
  hostIP: 192.168.99.141
  phase: Running
  podIP: 192.168.99.141
  podIPs:
  - ip: 192.168.99.141
  qosClass: BestEffort
  startTime: "2020-06-05T13:40:31Z"

@dougsland
Member

dougsland commented Mar 24, 2021

@mcginne sorry for the delay.

@dougsland am I right in thinking that will only affect Kubectl's display of the managed fields?

Correct.

We have a load test to verify apiserver performance, which uses client-go to make lots of get pods requests, and I see a ~20% regression between v1.17 and v1.18 which I believe is down to the extra overhead of these fields...

Do you have an issue opened with more data?

@mcginne

mcginne commented Mar 24, 2021

I haven't raised a separate issue yet, as there was some discussion on the performance impact between @lavalamp and @wojtek-t here #90066 (comment), and it sounded like this overhead was necessary for correctness. I can raise a separate issue about the performance impact if folks think there is some scope for improvement.

kevo1ution pushed a commit to twosigma/waiter that referenced this issue Mar 30, 2021
The managedFields metadata is not useful to us, and it's very verbose.
kubernetes/kubernetes#90066 (comment)
@max-rocket-internet

I've installed kubectl version v1.20.5 but the annoying managedFields output is still present. When will the fix be in a stable release?

@makkes

makkes commented Apr 6, 2021

@max-rocket-internet it is targeted for 1.21, see #96878 for details.
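
Once running a kubectl built from 1.21 or later, the behavior should look roughly like this (sketch; my-deploy is a placeholder name):

# managedFields is stripped from -o yaml / -o json output by default:
kubectl get deployment my-deploy -o yaml
# ...and can still be inspected on demand with the new flag:
kubectl get deployment my-deploy -o yaml --show-managed-fields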

@krisnova
Contributor

krisnova commented Apr 6, 2021

What is the delta between this and #96878? There is a comment in the PR that says this issue should still remain open after the merge.

What else are we missing?

@apelisse
Member

apelisse commented Apr 6, 2021

Managed fields are still downloaded; they are stripped by the client. We could possibly omit them server-side.

@alexec

alexec commented May 4, 2021

@howardjohn
Contributor Author

Just another data point. We were seeing our mutating webhooks take up to 15s, which is crazy - should be milliseconds typically. The root cause was managedFields. Simply removing the managedFields from our patching logic dropped the slow requests to ~500ms (the bad case was a config near etcd size limits, so much slower than normal). istio/istio#34799

@luisdavim

Just another data point. We were seeing our mutating webhooks take up to 15s, which is crazy - should be milliseconds typically. The root cause was managedFields. Simply removing the managedFields from our patching logic dropped the slow requests to ~500ms (the bad case was a config near etcd size limits, so much slower than normal). istio/istio#34799

Yeah, this is the sort of thing I'm really concerned about, not the output of kubectl...

@alvaroaleman
Member

But is "My patch computation is very slow because it looks at the entire object instead of just at subset that is actually relevant" really something to reasonable blame the existence of managed fields for?

@howardjohn
Contributor Author

Sort of? I mean, it worked fine up until managedFields were added, then it (subtly) started behaving really slowly without any changes on our side. I am not here to say "managedFields are bad" or anything, just that they negatively impacted us in this niche edge-case scenario, and may negatively impact others if they have similar logic. If they do, I hope they see this comment and the fix (https://github.com/istio/istio/pull/34805/files#diff-e57c2ab4e6af0f24e7285fad8ec9fe8b688fad1350b6d6c405f75a067d65e0dcR723) and can spend less time figuring out what is going on.

@alexec

alexec commented Aug 20, 2021

really something to reasonably blame on the existence of managed fields?

It is correct to blame managed fields. Any change that increases payload sizes by at least 2x and commonly 3x will directly produce performance degradation. I assume this was clear to the engineers who designed and implemented this, and that they felt the benefits outweighed the costs.

Myself, I can't really see any tangible benefit. The downsides, such as the poor developer and user experience and the performance impact, have been reported by dozens of users and are clear.

This is pretty frustrating.

@thiagolsfortunato

thiagolsfortunato commented Sep 10, 2021

While waiting for a Kubernetes upgrade to version 1.21+, I'm using yq to delete managed fields from the output.

$ kubectl get deploy api-foo -o yaml | yq e 'del(.metadata.managedFields)' -
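
An equivalent with jq, for anyone without yq handy (same caveat: this is purely client-side cosmetics; the server still sends the field):

$ kubectl get deploy api-foo -o json | jq 'del(.metadata.managedFields)'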

@ReggieCarey

I recognize this is closed but here are my two cents anyway:

managedFields is an implementation detail. It has nothing to do with my assertion of some resource. I feel like this functionality could have been implemented at least partially if git had been incorporated into the backend of k8s.

As it stands, the resource that gets applied/created is manipulated in inconsistent ways. Some status data shows up in the status block, other status data occurs inside the metadata section, and yet other parts are overwritten/added/removed via webhooks/etc.

I see the value in the information. I'm not at all convinced that this is the smartest way to store or convey that information.

managedFields as currently implemented makes using k8s more difficult.

This is meaningless implementation babble, and it chews up bandwidth. There has never been a case where I've found this data useful. I'm sure it might be useful to some, but for me, no:

          f:conditions:
            .: {}
            k:{"type":"ChangeDetected"}:
              .: {}
              f:lastTransitionTime: {}
              f:message: {}
              f:observedGeneration: {}
              f:reason: {}
              f:status: {}
              f:type: {}
            k:{"type":"Valid"}:
              .: {}
              f:lastTransitionTime: {}
              f:message: {}
              f:observedGeneration: {}
              f:reason: {}
              f:status: {}
              f:type: {}
          f:lastUpdated: {}
          f:schema:
            .: {}
            f:name: {}
            f:namespace: {}
          f:settings: {}

The reason many solutions die is that this kind of well-intentioned bloat creeps in (look at XML as an example). The average user has to contend with what is essentially baffling esoterica so that an elite, powerful few get to know which field in a resource was changed.

A lot of what I read about Apply has to do with limitations in the HTTP protocol (GET, PUT, HEAD, POST, DELETE) used to convey state changes from client to server.

This current approach is broken.

@apelisse
Member

This current approach is broken.

Thanks for the feedback. What do you think specifically is broken?

managedFields as currently implemented makes using k8s more difficult.

It sounds like it's been annoying to you; when has this been the case specifically?

As it stands, the resource that gets applied/created is manipulated in inconsistent ways. Some status data shows up in the status block, other status data occurs inside the metadata section, and yet other parts are overwritten/added/removed via webhooks/etc.

Feel free to open a bug with the steps to reproduce, I'd love to take a look.

@lavalamp
Member

You might like to read about how it's intended to be used: https://kubernetes.io/blog/2022/10/20/advanced-server-side-apply/

@ReggieCarey

ReggieCarey commented Oct 25, 2022 via email

@lavalamp
Member

There is no other place that is universal to all Kubernetes objects. We also can't move it, since SSA has been GA for quite some time.

Client-side apply maintained a "lastAppliedAnnotation", which is also in metadata and wasn't very pretty either, btw.
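
For comparison, that client-side apply record can still be inspected directly; a minimal sketch (assuming a deployment named my-deploy that was created with client-side kubectl apply):

# Show the kubectl.kubernetes.io/last-applied-configuration annotation content:
kubectl apply view-last-applied deployment/my-deploy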
