❄️ Flaky Test Collection ❄️ #753

Open
joe-kimmel-vmw opened this issue Jun 17, 2022 · 14 comments
Labels
in-progress Work has begun by a community member or a maintainer; this issue may be included in a future release

Comments

@joe-kimmel-vmw
Contributor

joe-kimmel-vmw commented Jun 17, 2022

Flaky Test Collection: Name and shame flaky (or flakey) tests in this issue, provide any ongoing hints or remediation, and remove entries once they are fixed.

  1. Test_PackageInstalled_FromPackageInstall_DeletionFailureBlocks (recently got new diagnostic prints added to help debug why it flakes)
  2. wait: no child processes [1] [2] [3] [4]
joe-kimmel-vmw added the carvel-triage label (This issue has not yet been reviewed for validity) Jun 17, 2022
@benmoss
Contributor

benmoss commented Jun 17, 2022

Test_PackageInstallAndRepo_CanAuthenticateToPrivateRepository_UsingPlaceholderSecret was updated in #738; it may be worth checking whether that change caused the flakiness.

joe-kimmel-vmw added the in-progress label and removed the carvel-triage label Jun 17, 2022
@benmoss
Contributor

benmoss commented Jun 17, 2022

I'm investigating Test_PackageInstallAndRepo_CanAuthenticateToPrivateRepository_UsingPlaceholderSecret

@cppforlife fixed in #758

@cppforlife
Contributor

"Fetching resources: wait: no child processes" error from:


--- FAIL: Test_PackageInstallStatus_DisplaysUsefulErrorMessage_ForDeploymentFailure (11.16s)
    packageinstall_test.go:268: 
        Expected useful error message to contain deploy error
        Got:
        Fetching resources: wait: no child processes
# from text examples
kapp deploy -y -a nginx-helm-git -f examples/nginx-helm-git.yml

@praveenrewar
Member

Fetching resources: wait: no child processes

# from examples
kapp deploy -y -a redis-helm -f examples/redis-helm.yml

@cppforlife
Contributor

(https://github.com/vmware-tanzu/carvel-kapp-controller/runs/7234570328?check_suite_focus=true)

--- FAIL: Test_PackageInstalled_FromPackageInstall_Successfully (6.51s)
    packageinstall_test.go:172: 
        	Error Trace:	packageinstall_test.go:172
        	Error:      	Not equal: 
        	            	expected: v1alpha1.AppStatus{ManagedAppName:"", Fetch:(*v1alpha1.AppStatusFetch)(0xc000331ea0), Template:(*v1alpha1.AppStatusTemplate)(0xc000344f80), Deploy:(*v1alpha1.AppStatusDeploy)(0xc000331e30), Inspect:(*v1alpha1.AppStatusInspect)(0xc0001384b0), ConsecutiveReconcileSuccesses:1, ConsecutiveReconcileFailures:0, GenericStatus:v1alpha1.GenericStatus{ObservedGeneration:1, Conditions:[]v1alpha1.Condition{v1alpha1.Condition{Type:"ReconcileSucceeded", Status:"True", Reason:"", Message:""}}, FriendlyDescription:"Reconcile succeeded", UsefulErrorMessage:""}}
        	            	actual  : v1alpha1.AppStatus{ManagedAppName:"", Fetch:(*v1alpha1.AppStatusFetch)(0xc000331dc0), Template:(*v1alpha1.AppStatusTemplate)(0xc000344900), Deploy:(*v1alpha1.AppStatusDeploy)(0xc000331d50), Inspect:(*v1alpha1.AppStatusInspect)(0xc000138410), ConsecutiveReconcileSuccesses:2, ConsecutiveReconcileFailures:0, GenericStatus:v1alpha1.GenericStatus{ObservedGeneration:1, Conditions:[]v1alpha1.Condition{v1alpha1.Condition{Type:"ReconcileSucceeded", Status:"True", Reason:"", Message:""}}, FriendlyDescription:"Reconcile succeeded", UsefulErrorMessage:""}}
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -68,3 +68,3 @@
        	            	  }),
        	            	- ConsecutiveReconcileSuccesses: (int) 1,
        	            	+ ConsecutiveReconcileSuccesses: (int) 2,
        	            	  ConsecutiveReconcileFailures: (int) 0,
        	Test:       	Test_PackageInstalled_FromPackageInstall_Successfully

@cppforlife
Contributor

cppforlife commented Jul 11, 2022

Test_PackageInstall_UsesExistingAppWithSameName was fixed by #783

@benmoss
Contributor

benmoss commented Jul 14, 2022

TestPackageRepository https://github.com/vmware-tanzu/carvel-kapp-controller/runs/7339526938?check_suite_focus=true#step:5:4533

The kctrl logic in https://github.com/benmoss/carvel-kapp-controller/blob/1815114580ddbe149abf2f0cf5309c3163bae9d1/cli/pkg/kctrl/cmd/app/app_tailer.go#L202 seems sane; I'm not sure how this failure is happening.

Running 'kctrl package repository update -r test-package-repository --url index.docker.io/k8slt/kc-e2e-test-repo:latest -n kctrl-test --yes'...
    package_repository_test.go:118: 
        	Error Trace:	package_repository_test.go:118
        	            				e2e.go:14
        	            				package_repository_test.go:110
        	Error:      	"Target cluster 'https://192.168.49.2:8443' (nodes: minikube)

Waiting for package repository to be updated

12:19:15PM: Waiting for package repository reconciliation for 'test-package-repository'
12:19:23PM: Waiting for generation 2 to be observed 
12:19:23PM: Fetch started 
12:19:23PM: Template succeeded 
12:19:23PM: Deploy started (1s ago)
12:19:24PM: Deploying 
	    | Target cluster 'https://10.96.0.1:443'
	    | 12:19:24PM: info: Resources: Scoping listings to single namespace: kctrl-test
	    | Changes
	    | Namespace   Name                            Kind             Age  Op      Op st.  Wait to  Rs  Ri
	    | kctrl-test  pkg.test.carvel.dev             PackageMetadata  -    create  ???     -        -   -
	    | ^           pkg.test.carvel.dev.1.0.0       Package          -    create  ???     -        -   -
	    | ^           pkg.test.carvel.dev.2.0.0       Package          -    create  ???     -        -   -
	    | ^           pkg.test.carvel.dev.3.0.0-rc.1  Package          -    create  ???     -        -   -
	    | Op:      4 create, 0 delete, 0 update, 0 noop, 0 exists
	    | Wait to: 0 reconcile, 0 delete, 4 noop
	    | 12:19:24PM: ---- applying 4 changes [0/4 done] ----
	    | 12:19:24PM: create package/pkg.test.carvel.dev.1.0.0 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: create package/pkg.test.carvel.dev.3.0.0-rc.1 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: create package/pkg.test.carvel.dev.2.0.0 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: create packagemetadata/pkg.test.carvel.dev (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ---- waiting on 4 changes [0/4 done] ----
	    | 12:19:24PM: ok: noop packagemetadata/pkg.test.carvel.dev (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ok: noop package/pkg.test.carvel.dev.1.0.0 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ok: noop package/pkg.test.carvel.dev.3.0.0-rc.1 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ok: noop package/pkg.test.carvel.dev.2.0.0 (data.packaging.carvel.dev/v1alpha1) namespace: kctrl-test
	    | 12:19:24PM: ---- applying complete [4/4 done] ----
	    | 12:19:24PM: ---- waiting complete [4/4 done] ----
	    | Succeeded
12:19:24PM: Deploy succeeded 

Succeeded
" does not contain "Fetch succeeded"
        	Test:       	TestPackageRepository

@joe-kimmel-vmw
Contributor Author

--- FAIL: TestDependencyDownload (6.40s)
    --- FAIL: TestDependencyDownload/with_regular_files (1.01s)
        dependencies_test.go:73: bad status code retrieving url: https://github.com/benmoss/test-resources/releases/download/v1.0.0/test-v1.0.0-darwin-arm64: 503 Service Unavailable
    --- FAIL: TestDependencyDownload/with_tgz_files (5.39s)
        dependencies_test.go:73: bad status code retrieving url: https://github.com/benmoss/test-resources/releases/download/v1.0.0/test-v1.0.0-darwin-arm64.tgz: 503 Service Unavailable
2022/07/22 12:22:40 Updating test to 1.0.1
--- FAIL: TestDependencyUpdate (5.80s)
    dependencies_test.go:119: bad status code retrieving url: https://github.com/benmoss/test-resources/releases/download/v1.0.1/test-v1.0.1-darwin-arm64: 503 Service Unavailable

@joe-kimmel-vmw
Contributor Author

joe-kimmel-vmw commented Aug 8, 2022

--- FAIL: Test_AppReconcileOccurs_WhenSecretUpdated (2.54s)
    kapp.go:92: Failed to successfully execute 'kapp deploy -f - -a configmap-with-secret -n kappctrl-test --yes': Execution error: stdout: 'Target cluster 'https://192.168.49.2:8443/' (nodes: minikube)
        
        Changes
        
        Namespace      Name                          Kind            Age  Op      Op st.  Wait to    Rs  Ri  
        kappctrl-test  configmap-with-secret         App             -    create  -       reconcile  -   -  
        ^              kappctrl-e2e-ns-role          Role            -    create  -       reconcile  -   -  
        ^              kappctrl-e2e-ns-role-binding  RoleBinding     -    create  -       reconcile  -   -  
        ^              kappctrl-e2e-ns-sa            ServiceAccount  -    create  -       reconcile  -   -  
        ^              simple-app-values             Secret          -    create  -       reconcile  -   -  
        
        Op:      5 create, 0 delete, 0 update, 0 noop, 0 exists
        Wait to: 5 reconcile, 0 delete, 0 noop
        
        8:34:15PM: ---- applying 3 changes [0/5 done] ----
        8:34:15PM: create role/kappctrl-e2e-ns-role (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        8:34:15PM: create secret/simple-app-values (v1) namespace: kappctrl-test
        8:34:15PM: create serviceaccount/kappctrl-e2e-ns-sa (v1) namespace: kappctrl-test
        8:34:15PM: ---- waiting on 3 changes [0/5 done] ----
        8:34:15PM: ok: reconcile secret/simple-app-values (v1) namespace: kappctrl-test
        8:34:15PM: ok: reconcile serviceaccount/kappctrl-e2e-ns-sa (v1) namespace: kappctrl-test
        8:34:15PM: ok: reconcile role/kappctrl-e2e-ns-role (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        8:34:15PM: ---- applying 1 changes [3/5 done] ----
        8:34:15PM: create rolebinding/kappctrl-e2e-ns-role-binding (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        8:34:15PM: ---- waiting on 1 changes [3/5 done] ----
        8:34:15PM: ok: reconcile rolebinding/kappctrl-e2e-ns-role-binding (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        8:34:15PM: ---- applying 1 changes [4/5 done] ----
        8:34:15PM: create app/configmap-with-secret (kappctrl.k14s.io/v1alpha1) namespace: kappctrl-test
        8:34:15PM: ---- waiting on 1 changes [4/5 done] ----
        8:34:15PM: ongoing: reconcile app/configmap-with-secret (kappctrl.k14s.io/v1alpha1) namespace: kappctrl-test
        8:34:15PM:  ^ Waiting for generation 1 to be observed
        8:34:16PM: fail: reconcile app/configmap-with-secret (kappctrl.k14s.io/v1alpha1) namespace: kappctrl-test
        8:34:16PM:  ^ Reconcile failed:  (message: Templating dir: waitid: no child processes)
        
        ' stderr: 'kapp: Error: waiting on reconcile app/configmap-with-secret (kappctrl.k14s.io/v1alpha1) namespace: kappctrl-test:
          Finished unsuccessfully (Reconcile failed:  (message: Templating dir: waitid: no child processes))
        ' error: 'exit status 1'
Running 'kapp delete -a configmap-with-configmap -n kappctrl-test --yes'...
==> deploy

===========


--- FAIL: Test_PackageInstalled_FromPackageInstall_DeletionFailureBlocks (2.84s)
    kapp.go:92: Failed to successfully execute 'kapp deploy -a instl-pkg-failure-block-test -f - -n kappctrl-test --yes': Execution error: stdout: 'Target cluster 'https://192.168.49.2:8443/' (nodes: minikube)
        
        Changes
        
        Namespace      Name                          Kind               Age  Op      Op st.  Wait to    Rs  Ri  
        kappctrl-test  basic.test.carvel.dev         PackageRepository  -    create  -       reconcile  -   -  
        ^              instl-pkg-failure-block-test  PackageInstall     -    create  -       reconcile  -   -  
        ^              kappctrl-e2e-ns-role          Role               -    create  -       reconcile  -   -  
        ^              kappctrl-e2e-ns-role-binding  RoleBinding        -    create  -       reconcile  -   -  
        ^              kappctrl-e2e-ns-sa            ServiceAccount     -    create  -       reconcile  -   -  
        
        Op:      5 create, 0 delete, 0 update, 0 noop, 0 exists
        Wait to: 5 reconcile, 0 delete, 0 noop
        
        9:10:17PM: ---- applying 2 changes [0/5 done] ----
        9:10:17PM: create role/kappctrl-e2e-ns-role (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        9:10:17PM: create serviceaccount/kappctrl-e2e-ns-sa (v1) namespace: kappctrl-test
        9:10:17PM: ---- waiting on 2 changes [0/5 done] ----
        9:10:17PM: ok: reconcile serviceaccount/kappctrl-e2e-ns-sa (v1) namespace: kappctrl-test
        9:10:17PM: ok: reconcile role/kappctrl-e2e-ns-role (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        9:10:17PM: ---- applying 1 changes [2/5 done] ----
        9:10:17PM: create rolebinding/kappctrl-e2e-ns-role-binding (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        9:10:17PM: ---- waiting on 1 changes [2/5 done] ----
        9:10:17PM: ok: reconcile rolebinding/kappctrl-e2e-ns-role-binding (rbac.authorization.k8s.io/v1) namespace: kappctrl-test
        9:10:17PM: ---- applying 1 changes [3/5 done] ----
        9:10:17PM: create packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        9:10:17PM: ---- waiting on 1 changes [3/5 done] ----
        9:10:17PM: ongoing: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        9:10:17PM:  ^ Waiting for generation 1 to be observed
        9:10:18PM: ongoing: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        9:10:18PM:  ^ Reconciling
        9:10:19PM: fail: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        9:10:19PM:  ^ Reconcile failed:  (message: Templating dir: waitid: no child processes)
        
        ' stderr: 'kapp: Error: waiting on reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test:
          Finished unsuccessfully (Reconcile failed:  (message: Templating dir: waitid: no child processes))
        ' error: 'exit status 1'

@cppforlife
Contributor

==> deploy
Running 'kapp deploy -f - -a test-repo-status-success -n kappctrl-test --yes --wait-timeout 3m'...
==> check against expected successful status
Running 'kapp inspect -a test-repo-status-success --raw --tty=false --filter-kind=PackageRepository -n kappctrl-test --yes'...
==> force a second reconcile and see if it all still works
Running 'kapp deploy -f - -a test-repo-status-success -n kappctrl-test --yes --wait-timeout 3m'...
Running 'kapp delete -a test-repo-status-success -n kappctrl-test --yes'...
==> deploy pkg repository
Running 'kapp deploy -a repo-packages-available -f - -n kappctrl-test --yes --wait-timeout 3m'...
Running 'kapp delete -a repo-packages-available -n kappctrl-test --yes'...
--- FAIL: Test_PackageRepoBundle_PackagesAvailable (1.56s)
    kapp.go:95: Failed to successfully execute 'kapp deploy -a repo-packages-available -f - -n kappctrl-test --yes --wait-timeout 3m': Execution error: stdout: 'Target cluster 'https://192.168.49.2:8443/' (nodes: minikube)
        
        Changes
        
        Namespace      Name                   Kind               Age  Op      Op st.  Wait to    Rs  Ri  
        kappctrl-test  basic.test.carvel.dev  PackageRepository  -    create  -       reconcile  -   -  
        
        Op:      1 create, 0 delete, 0 update, 0 noop, 0 exists
        Wait to: 1 reconcile, 0 delete, 0 noop
        
        12:03:01AM: ---- applying 1 changes [0/1 done] ----
        12:03:01AM: create packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        12:03:01AM: ---- waiting on 1 changes [0/1 done] ----
        12:03:01AM: ongoing: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        12:03:01AM:  ^ Waiting for generation 1 to be observed
        12:03:02AM: fail: reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test
        12:03:02AM:  ^ Reconcile failed:  (message: Fetching: secrets "basic.test.carvel.dev-fetch-0" not found)
        
        ' stderr: 'kapp: Error: waiting on reconcile packagerepository/basic.test.carvel.dev (packaging.carvel.dev/v1alpha1) namespace: kappctrl-test:
          Finished unsuccessfully (Reconcile failed:  (message: Fetching: secrets "basic.test.carvel.dev-fetch-0" not found))
        ' error: 'exit status 1'

@100mik
Contributor

100mik commented Sep 20, 2022

Ran into a failure for Test_PackageInstalled_FromPackageInstall_DeletionFailureBlocks here where it seems like the deletion never fails.

@neil-hickey
Contributor

Investigation of wait: no child processes flake

When does it happen?

This seems to be completely random

Why does it happen?

I attempted a root cause analysis, without much success. I tried:

  • Re-running the e2e tests over 100 times against minikube and kind locally on macOS.
  • Forking the repo and running the e2e tests via the GitHub Actions runner, which uses kind on ubuntu-latest.

None of these attempts reproduced the failure.

What next?

  • We suspect, though cannot confirm, that this may be a race condition between the way we start commands and our zombie reaping process, which runs constantly. When we kick off a cmd.Run() during our templating phase, the reapzombies call may run right between the start() and the wait(), which would cause this error.

Quote from a similar issue as referenced below:

I've run into this intermittently. The code section in question is in utils/run.go ExecuteAndWait. If you check out the golang source code for cmd.Run you'll see a race condition. The process is started and then we wait for it. But if the process completes and exits before the wait happens (because, say, the go runtime decides to do a GC pause right then or the goroutine yields for the syscall), then we'll get an error there.

References:

@100mik
Contributor

100mik commented Oct 18, 2022

Another wait: no child processes on a failing Test_SecretsAndConfigMapsWithCustomPathsCanReconcile here

cc: @neil-hickey in case it helps

@100mik
Contributor

100mik commented Nov 24, 2022

Seeing a lot of flake in the case where we expect consecutive successes to be 1 but it is 2. (Example on Test_PackageRepoStatus_Success)

Maybe increasing sync period will help ensure that another reconciliation does not happen before we check for the case?
Edit: Again over here
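
Besides tuning the sync period, another option might be to make the assertion tolerant of an extra reconcile. The sketch below is an assumption, not the existing test code: `appStatus` is a hypothetical, trimmed-down stand-in for the status struct shown in the failure output, and `statusMatches` is a made-up helper name.

```go
package main

import "fmt"

// appStatus is a hypothetical stand-in for the status struct in the failure
// output; it is not the real v1alpha1.AppStatus type.
type appStatus struct {
	ConsecutiveReconcileSuccesses int
	ConsecutiveReconcileFailures  int
	FriendlyDescription           string
}

// statusMatches compares two statuses but only requires the actual success
// counter to have reached the expected one, so an extra reconcile between
// the deploy and the assertion does not fail the comparison.
func statusMatches(expected, actual appStatus) bool {
	if actual.ConsecutiveReconcileSuccesses < expected.ConsecutiveReconcileSuccesses {
		return false
	}
	// Neutralize the timing-dependent counter, then compare everything else.
	actual.ConsecutiveReconcileSuccesses = expected.ConsecutiveReconcileSuccesses
	return expected == actual
}

func main() {
	expected := appStatus{
		ConsecutiveReconcileSuccesses: 1,
		FriendlyDescription:           "Reconcile succeeded",
	}
	actual := expected
	actual.ConsecutiveReconcileSuccesses = 2 // one extra reconcile slipped in

	fmt.Println(statusMatches(expected, actual)) // prints "true"
}
```

The trade-off is that such a check no longer catches a controller that reconciles more often than intended, so it may only suit tests where the exact count is not the thing under test.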

Projects
Status: Unprioritized
Development

No branches or pull requests

6 participants