Description
Hello! We are developing a custom operator and using FluxCD.
We have a v1alpha1 custom resource that is deployed by FluxCD.
When we upgraded the custom resource operator from v1alpha1 to v1alpha2, Flux notified us that the dry-run failed with the following error message:
dry-run failed, error: failed to prune fields: failed add back owned items: failed to convert pruned object at version <foo.com>/v1alpha1: conversion webhook for <foo.com>/v1alpha2, Kind=<resource> returned invalid metadata: invalid metadata of type <nil> in input object
Running the following dry-run command by hand also fails intermittently (roughly once in five attempts?) with the same error message:
$ kubectl apply --server-side --dry-run=server -f <v1alpha1-resource.yaml> --field-manager kustomize-controller
Error from server: failed to prune fields: failed add back owned items: failed to convert pruned object at version <foo.com>/v1alpha1: conversion webhook for <foo.com>/v1alpha2, Kind=<resource> returned invalid metadata: invalid metadata of type <nil> in input object
At first the actual (non-dry-run) apply with the following command never seemed to fail, but it turns out it also fails intermittently:
$ kubectl apply --server-side -f <v1alpha1-resource.yaml> --field-manager kustomize-controller
<foo.com>/<resource> serverside-applied
The flakiness might be a key to solving this.
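For reference, repeating the dry-run in a trivial shell loop around the exact command above reproduces the failure within a handful of iterations:
$ for i in $(seq 1 10); do kubectl apply --server-side --dry-run=server -f <v1alpha1-resource.yaml> --field-manager kustomize-controller; done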
Our conversion code is similar to https://github.com/IBM/operator-sample-go/blob/b79e66026a5cc5b4994222f2ef7aa962de9f7766/operator-application/api/v1alpha1/application_conversion.go#L37
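Roughly, it follows controller-runtime's conversion.Convertible pattern with v1alpha2 as the hub version. A minimal sketch of that pattern, with placeholder type, field, and import names (MyResource, Attribute1/Attribute2, and foo.com/operator/api/v1alpha2 are illustrative, not our real API):

package v1alpha1

import (
	"sigs.k8s.io/controller-runtime/pkg/conversion"

	// hypothetical import path for the hub version
	v1alpha2 "foo.com/operator/api/v1alpha2"
)

// ConvertTo converts this v1alpha1 object to the v1alpha2 hub version.
func (src *MyResource) ConvertTo(dstRaw conversion.Hub) error {
	dst := dstRaw.(*v1alpha2.MyResource)
	dst.ObjectMeta = src.ObjectMeta // metadata is copied over wholesale
	dst.Spec.Attribute1 = src.Spec.Attribute1
	dst.Spec.Attribute2 = src.Spec.Attribute2
	return nil
}

// ConvertFrom converts the v1alpha2 hub version back to this v1alpha1 version.
func (dst *MyResource) ConvertFrom(srcRaw conversion.Hub) error {
	src := srcRaw.(*v1alpha2.MyResource)
	dst.ObjectMeta = src.ObjectMeta
	dst.Spec.Attribute1 = src.Spec.Attribute1
	dst.Spec.Attribute2 = src.Spec.Attribute2
	return nil
}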
We checked the conversion logs. A single dry-run command called the ConvertTo function three times and the ConvertFrom function three times. On the last of the three calls to each, when the dry-run fails, the incoming request is missing the metadata and spec information.
In the failing case the logged object looks like: "metadata":{"creationTimestamp":null},"spec":{}
(A normal request looks like: "metadata":{"name":"<foo>","namespace":"<foo>","uid":"09b69792-56d5-4217-b23c-4d418d3f904b","resourceVersion":"1707796","generation":3,"creationTimestamp":"2022-09-16T07:28:54Z","labels":{"kustomize.toolkit.fluxcd.io/name":"<foo>","kustomize.toolkit.fluxcd.io/namespace":"flux-system"}},"spec":{"attribute1":[{...)
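For anyone trying to reproduce this, a small helper like the following (logIfEmpty is a hypothetical name, in the same v1alpha1 package as the sketch above, not part of any library) could flag the degenerate requests:

import (
	"sigs.k8s.io/controller-runtime/pkg/client"
	logf "sigs.k8s.io/controller-runtime/pkg/log"
)

// logIfEmpty flags incoming conversion objects whose metadata has been
// stripped, matching the failing case above.
func logIfEmpty(obj client.Object, direction string) {
	if obj.GetName() == "" && obj.GetUID() == "" {
		logf.Log.WithName("conversion").Info(
			"received object with empty metadata", "direction", direction)
	}
}

Calling logIfEmpty(src, "ConvertTo") as the first statement of each conversion function makes the failing requests easy to spot in the webhook logs.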
We could confirm that this happens when managedFields contains entries from two managers (kustomize-controller and our operator), as in the following object:
apiVersion: <foo.com>/v1alpha2
kind: <MyResource>
metadata:
  creationTimestamp: "2022-09-15T04:52:03Z"
  generation: 1
  labels:
    kustomize.toolkit.fluxcd.io/name: operator-sample
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  managedFields:
  - apiVersion: <foo.com>/v1alpha1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          f:kustomize.toolkit.fluxcd.io/name: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
      f:spec:
        f:attribute1: {}
        f:attribute2: {}
    manager: kustomize-controller
    operation: Apply
    time: "2022-09-15T04:52:03Z"
  - apiVersion: <foo.com>/v1alpha2
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:attribute1: {}
        f:attribute2: {}
    manager: <our-operator>
    operation: Update
    time: "2022-09-15T04:52:04Z"
  name: v1alpha1-flux
  namespace: flux
  resourceVersion: "483157"
  uid: 696bed77-a12b-45d0-b240-8d685cf790e0
spec:
  ...
status:
  ...
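(managedFields is hidden by default in recent kubectl releases; it can be printed with the --show-managed-fields flag, e.g.:)
$ kubectl get <myresource> v1alpha1-flux -n flux -o yaml --show-managed-fields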
I asked about this in the Flux repo, but we could not determine the cause there:
fluxcd/flux2#3105
I have been stuck on this for more than a week, so any ideas are really appreciated.
Thanks!