Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

structured-merge-diff supports only 'set' type for associative list #115330

Closed
byako opened this issue Jan 26, 2023 · 5 comments · Fixed by #115354
Closed

structured-merge-diff supports only 'set' type for associative list #115330

byako opened this issue Jan 26, 2023 · 5 comments · Fixed by #115354
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@byako
Copy link
Contributor

byako commented Jan 26, 2023

What happened?

kube-apiserver has has this kind of entries in log

E0126 00:39:45.253562       1 fieldmanager.go:210] "[SHOULD NOT HAPPEN] failed to update managedFields" err="failed to convert new object (default/test-inline-claim; /v1, Kind=Pod) to smd typed: .spec.containers[name=\"with-resource\"].resources.claims: element 0: associative list without keys has an element that's a map type" VersionKind="/, Kind=" namespace="default" name="test-inline-claim"
E0126 00:39:46.700727       1 fieldmanager.go:210] "[SHOULD NOT HAPPEN] failed to update managedFields" err="failed to convert new object (default/test-inline-claim-resource; resource.k8s.io/v1alpha1, Kind=ResourceClaim) to smd typed: .status.reservedFor: element 0: associative list without keys has an element that's a map type" VersionKind="/, Kind=" namespace="default" name="test-inline-claim-resource"

What did you expect to happen?

I expected those lines not to appear in kube-apiserver logs, and structured-merge-diff to support map types, like the comment says at https://github.com/kubernetes/kubernetes/blob/master/vendor/sigs.k8s.io/structured-merge-diff/v4/typed/helpers.go#L233

How can we reproduce it (as minimally and precisely as possible)?

I ran into it during Dynamic Resource Allocation usage. It requires a either a CRI-O v1.23, or containerd 1.7+ (has to be built from git at the moment); the cluster has to be initialized "DynamicResourceAllocation" feature-gate and alpha apis enabled, for instance with this config for kubeadm :

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.26.0
imageRepository: registry.local
apiServer:
  extraArgs:
    feature-gates: "DynamicResourceAllocation=true"
    runtime-config: "api/alpha=true"
controllerManager:
  extraArgs:
    feature-gates: "DynamicResourceAllocation=true"
scheduler:
  extraArgs:
    "feature-gates": "DynamicResourceAllocation=true"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  DynamicResourceAllocation: true
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: "unix:///var/run/crio/crio.sock"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  DynamicResourceAllocation: true

Deploy pod with inline resource-claim for instance the following (the issue will appear immediately even though the resource-claim is malformed. Proper resource claim would require resource-driver to be deployed, but it's not necessary for issue to be manifested):

apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClaimTemplate
metadata:
  name: test-inline-claim-template
  namespace: default
spec:
  metadata:
    labels:
      app: inline-resource
  spec:
---
apiVersion: v1
kind: Pod
metadata:
  name: test-inline-claim
spec:
  restartPolicy: Never
  containers:
  - name: with-resource
    image: registry.k8s.io/e2e-test-images/busybox:1.29-2
    command: ["sh", "-c", "ls -la /dev/dri/ && sleep 30"]
    resources:
      claims:
      - name: resource
  resourceClaims:
  - name: resource
    source:
      resourceClaimTemplateName: test-inline-claim-template

Anything else we need to know?

The code where the issue happens appears to be at https://github.com/kubernetes/kubernetes/blob/master/vendor/sigs.k8s.io/structured-merge-diff/v4/typed/helpers.go#L233

Kubernetes version

$ kubectl version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short.  Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GitCommit:"b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d", GitTreeState:"clean", BuildDate:"2022-12-08T19:58:30Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GitCommit:"b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d", GitTreeState:"clean", BuildDate:"2022-12-08T19:51:45Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

None, local cluster, baremetal / virtual - does not matter.

OS version

# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.1 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.1 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

$ uname -a
# paste output here
# unrelated

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Unrelated

Container runtime (CRI) and version (if applicable)

Unrelated

Related plugins (CNI, CSI, ...) and versions (if applicable)

Dynamic Resource Allocation has to be enabled, - Resource Claim is the only API object I know of that triggers this issue and / or has maps in associative list.

@byako byako added the kind/bug Categorizes issue or PR as related to a bug. label Jan 26, 2023
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 26, 2023
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pohly
Copy link
Contributor

pohly commented Jan 26, 2023

I can reproduce it with the following branch:
https://github.com/pohly/kubernetes/tree/dra-integration-tests

It occurs in an integration test:

$ hack/install-etcd.sh
...
$ export PATH="/nvme/gopath/src/k8s.io/kubernetes/third_party/etcd:${PATH}"
$ 
...
E0126 18:10:38.665239   16370 fieldmanager.go:210] "[SHOULD NOT HAPPEN] failed to update managedFields" err="failed to convert new object (test/claim-84; resource.k8s.io/v1alpha1, Kind=ResourceClaim) to smd typed: .status.reservedFor: element 0: associative list without keys has an element that's a map type" VersionKind="/, Kind=" namespace="test" name="claim-84"

status.reservedFor is defined here:

// +listType=set
// +optional
ReservedFor []ResourceClaimConsumerReference `json:"reservedFor,omitempty" protobuf:"bytes,3,opt,name=reservedFor"`

It contains:

// ResourceClaimConsumerReference contains enough information to let you
// locate the consumer of a ResourceClaim. The user must be a resource in the same
// namespace as the ResourceClaim.
type ResourceClaimConsumerReference struct {
// APIGroup is the group for the resource being referenced. It is
// empty for the core API. This matches the group in the APIVersion
// that is used when creating the resources.
// +optional
APIGroup string `json:"apiGroup,omitempty" protobuf:"bytes,1,opt,name=apiGroup"`
// Resource is the type of resource being referenced, for example "pods".
Resource string `json:"resource" protobuf:"bytes,3,name=resource"`
// Name is the name of resource being referenced.
Name string `json:"name" protobuf:"bytes,4,name=name"`
// UID identifies exactly one incarnation of the resource.
UID types.UID `json:"uid" protobuf:"bytes,5,name=uid"`
}

I suppose the "map type" is the ResourceClaimConsumerReference struct, not a literal Go map?

One possible fix would be to turn ReservedFor into a map with UID as key. That works because it has to be unique among all entries. Let me try that...

/assign

@pohly
Copy link
Contributor

pohly commented Jan 26, 2023

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 26, 2023
@pohly
Copy link
Contributor

pohly commented Jan 26, 2023

This seems to work:

	// +listType=map
	// +listMapKey=uid
	// +optional
	ReservedFor []ResourceClaimConsumerReference `json:"reservedFor,omitempty" protobuf:"bytes,3,opt,name=reservedFor"`

@liggitt
Copy link
Member

liggitt commented Jan 28, 2023

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/node Categorizes an issue or PR as relevant to SIG Node.
Projects
Development

Successfully merging a pull request may close this issue.

4 participants