"operator-sdk generate kustomize manifests -q" hangs for api groups with reference #4990
Bug Report
My operator has two API groups, and the first one references an object from the second. When I try to build the OLM bundle, the "make bundle" command hangs. I took a look: "make bundle" internally runs the sub-command "operator-sdk generate kustomize manifests -q",
and "make bundle" hangs on this step.
What did you do?
I will try to explain this bug using a sample (https://github.com/AndrienkoAleksandr/memcached-operator-ast-bug). I used operator-sdk 1.8.0 (but I originally found this bug on 1.7.x). I generated a sample project with two API groups, v1 and v1alpha1:
I applied the storage version marker:
AndrienkoAleksandr/memcached-operator-ast-bug@808d81d#diff-5e36a4d8934ca67673d0971f36a1448ee88520428fe8c95154f2958dcbdda632R43
I ran the command to update the autogenerated files, CRs, and CRDs from the root of the project:
I created the initial OLM bundle:
The bundle was generated successfully. But...
After that, in API group v1, I added the field "Status" to the structure "MemcachedStatus". The type of this field is also named "MemcachedStatus", but it comes from API group v1alpha1:
AndrienkoAleksandr/memcached-operator-ast-bug@1547b67#diff-5e36a4d8934ca67673d0971f36a1448ee88520428fe8c95154f2958dcbdda632R38
Then I wanted to update the OLM bundle:
This command hangs indefinitely. It looks like operator-sdk has entered infinite recursion. On Fedora 31, operator-sdk consumes all RAM and the laptop hangs! On the Mac, only the operator-sdk process hangs.
What did you expect to see?
The bundle should be generated. With operator-sdk v0.17.2 the same scenario works.
What did you see instead? Under which circumstances?
While "make bundle" runs, the sub-command "operator-sdk generate kustomize manifests -q" hangs.
Environment
Operator type:
language go
Kubernetes cluster type:
minikube. But it doesn't matter for this issue.
$ operator-sdk version
operator-sdk version: "v1.8.0", commit: "d3bd87c6900f70b7df618340e1d63329c7cd651e", kubernetes version: "1.20.2", go version: "go1.16.4", GOOS: "darwin", GOARCH: "amd64"
$ go version (if language is Go)
go version go1.16.5 darwin/amd64
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.1", GitCommit:"5e58841cce77d4bc13713ad2b91fa0d961e69192", GitTreeState:"clean", BuildDate:"2021-05-12T14:11:29Z", GoVersion:"go1.16.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.7", GitCommit:"132a687512d7fb058d0f5890f07d4121b3f0a2e2", GitTreeState:"clean", BuildDate:"2021-05-12T12:32:49Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"}
Possible Solution
Yes, a workaround exists, but I'm not sure it will be suitable for all cases...
If I rename the structure in API group v1alpha1 from MemcachedStatus to MemcachedStatusAlpha, the issue goes away. The bundle is updated, and the CRD v1 status description contains the expected MemcachedStatusAlpha field definitions.
Additional context
I debugged operator-sdk a bit, and the issue appears to be a bug in the ast.go parser. I want to share some details.
ast.go https://github.com/operator-framework/operator-sdk/blob/v1.8.0/internal/generate/clusterserviceversion/bases/definitions/ast.go#L96 analyzes the API group files to find fields with markers.
During inspection https://github.com/operator-framework/operator-sdk/blob/v1.8.0/internal/generate/clusterserviceversion/bases/definitions/ast.go#L110, in some iteration ast.go inspects the node for the "MemcachedStatus" structure in API group v1. Let's call this node v1.MemcachedStatus.
Then ast.go inspects the child nodes of v1.MemcachedStatus one by one, and one of them is the "MemcachedStatus" structure from the v1alpha1 API package. Let's call it v1alpha1.MemcachedStatus. The parser correctly finds this node in its original package, v1alpha1. But then the parser recursively inspects the child elements of this node, and for one of the children (of type *ast.Ident) ast.go creates an ident object with the 'rootPkg' API package:
https://github.com/operator-framework/operator-sdk/blob/v1.8.0/internal/generate/clusterserviceversion/bases/definitions/ast.go#L130
'rootPkg' is hardcoded and always points to package v1, which is the wrong package for this child node; the valid package would be v1alpha1. Unfortunately, package v1 really does contain a structure with the same name, "MemcachedStatus": v1.MemcachedStatus (see step 2).
So the parser recursively inspects:
v1.MemcachedStatus -> v1alpha1.MemcachedStatus -> v1.MemcachedStatus -> v1alpha1.MemcachedStatus ....
in the loop at https://github.com/operator-framework/operator-sdk/blob/v1.8.0/internal/generate/clusterserviceversion/bases/definitions/ast.go#L107, which never ends.
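The cycle described above can be reproduced in a small, self-contained model. This is only a sketch and not the operator-sdk code: `typeIdent`, `ref`, `decls`, `walk`, and both resolver functions are hypothetical names, the data is a toy stand-in for the two API packages, and the bare `MemcachedStatus` identifier inside the v1alpha1 type is an assumption taken from the description in this report. The `guard` (a seen-set) is a generic safeguard added for illustration, not a fix confirmed by the project.

```go
package main

import "fmt"

// typeIdent is a hypothetical (package, type name) pair, loosely modelled on
// the identifier the report says ast.go builds for each inspected node.
type typeIdent struct {
	Pkg, Name string
}

// ref is a type reference found inside a struct. An empty Pkg means the
// reference was unqualified in the source (a bare identifier).
type ref struct {
	Pkg, Name string
}

// decls is toy data standing in for the two API packages in the report.
var decls = map[typeIdent][]ref{
	// v1.MemcachedStatus has a field of type v1alpha1.MemcachedStatus.
	{Pkg: "v1", Name: "MemcachedStatus"}: {{Pkg: "v1alpha1", Name: "MemcachedStatus"}},
	// Assumption: inside v1alpha1.MemcachedStatus the walker meets a bare
	// identifier named "MemcachedStatus", as the report describes.
	{Pkg: "v1alpha1", Name: "MemcachedStatus"}: {{Name: "MemcachedStatus"}},
}

// rootPkgResolver mimics the reported bug: every bare identifier is looked
// up in the root package, regardless of where it was found.
func rootPkgResolver(owner typeIdent) string { return "v1" }

// ownerPkgResolver looks a bare identifier up in the package of the type
// that contains it.
func ownerPkgResolver(owner typeIdent) string { return owner.Pkg }

// walk visits types breadth-first starting at root. maxSteps caps the work so
// the buggy variant can be shown without hanging. It returns the visit order
// and whether the traversal finished on its own.
func walk(root typeIdent, resolve func(typeIdent) string, guard bool, maxSteps int) ([]typeIdent, bool) {
	seen := map[typeIdent]bool{root: true}
	queue := []typeIdent{root}
	var path []typeIdent
	for len(queue) > 0 {
		if len(path) >= maxSteps {
			return path, false // gave up; without a cap this would never end
		}
		cur := queue[0]
		queue = queue[1:]
		path = append(path, cur)
		for _, r := range decls[cur] {
			pkg := r.Pkg
			if pkg == "" {
				pkg = resolve(cur)
			}
			next := typeIdent{Pkg: pkg, Name: r.Name}
			if _, ok := decls[next]; !ok {
				continue // unknown type: nothing to descend into
			}
			if guard {
				if seen[next] {
					continue // already queued once: skip to break cycles
				}
				seen[next] = true
			}
			queue = append(queue, next)
		}
	}
	return path, true
}

func main() {
	root := typeIdent{Pkg: "v1", Name: "MemcachedStatus"}

	path, finished := walk(root, rootPkgResolver, false, 6)
	fmt.Println("root-package lookup:", path, "finished:", finished)

	path, finished = walk(root, ownerPkgResolver, true, 6)
	fmt.Println("owner-package lookup with guard:", path, "finished:", finished)
}
```

With the root-package lookup, the walk alternates between the v1 and v1alpha1 types until it hits the step cap. With the owner-package lookup plus the seen-set, each type is visited once and the walk ends.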
P.S. It would be nice to cover ast.go with tests; otherwise, trying to fix this issue could break a lot of operators...