Cannot uninstall iochaos when sidecar chaosfs running failed #126

Closed
mahjonp opened this issue Jan 12, 2020 · 4 comments
type/bug Something isn't working

Comments

@mahjonp
Contributor

mahjonp commented Jan 12, 2020

Bug Report

What version of Kubernetes are you using?

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.3", GitCommit:"b3cbbae08ec52a7fc73d334838e18d17e8512749", GitTreeState:"clean", BuildDate:"2019-11-13T11:23:11Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:32:14Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

What did you do?

I followed the doc to test iochaos (#121) on TiKV.

I found the chaosfs sidecar in the Waiting state because of CrashLoopBackOff, so I wanted to uninstall iochaos.
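For reference, the sidecar's waiting reason can be read straight from the pod status. The following is only a sketch: `waiting_reason` is a hypothetical helper, not part of chaos-mesh, and the namespace `default`, pod `demo-tikv-0`, and container `chaosfs` used in the usage note are taken from the logs in this report.

```shell
# Hypothetical helper (not part of chaos-mesh): print why a container is waiting.
# Usage: waiting_reason <namespace> <pod> <container>
waiting_reason() {
  # The JSONPath filter selects the status entry of the named container.
  kubectl get pod "$2" -n "$1" \
    -o jsonpath="{.status.containerStatuses[?(@.name==\"$3\")].state.waiting.reason}"
}
```

Calling `waiting_reason default demo-tikv-0 chaosfs` would be expected to print `CrashLoopBackOff` in the situation described here, and nothing if the container is not waiting.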

What did you expect to see?

iochaos is uninstalled successfully.

What did you see instead?

The delete command hangs:

$ kubectl delete -f examples/io-mixed-example.yaml
iochaos.pingcap.com "io-delay-example" deleted
...

Here is part of the controller logs:

2020-01-12T02:42:05.380Z	INFO	controllers.IoChaos	Recover I/O chaos action, network is not ok, retrying...	{"iochaos": "chaos-testing/io-delay-example", "reconciler": "chaosfs", "namespace": "default", "name": "demo-tikv-0"}
2020-01-12T02:42:07.380Z	INFO	controllers.IoChaos	Recover I/O chaos action, network is not ok, retrying...	{"iochaos": "chaos-testing/io-delay-example", "reconciler": "chaosfs", "namespace": "default", "name": "demo-tikv-0"}
2020-01-12T02:42:09.379Z	INFO	controllers.IoChaos	Recover I/O chaos action, network is not ok, retrying...	{"iochaos": "chaos-testing/io-delay-example", "reconciler": "chaosfs", "namespace": "default", "name": "demo-tikv-0"}
2020-01-12T02:42:11.380Z	INFO	controllers.IoChaos	Recover I/O chaos action, network is not ok, retrying...	{"iochaos": "chaos-testing/io-delay-example", "reconciler": "chaosfs", "namespace": "default", "name": "demo-tikv-0"}
2020-01-12T02:42:13.380Z	INFO	controllers.IoChaos	Recover I/O chaos action, network is not ok, retrying...	{"iochaos": "chaos-testing/io-delay-example", "reconciler": "chaosfs", "namespace": "default", "name": "demo-tikv-0"}
2020-01-12T02:42:15.379Z	ERROR	controllers.IoChaos	failed to recover I/O chaos action	{"iochaos": "chaos-testing/io-delay-example", "reconciler": "chaosfs", "namespace": "default", "name": "demo-tikv-0", "error": "timed out waiting for the condition"}
github.com/go-logr/zapr.(*zapLogger).Error
	/Users/manjunpeng/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/pingcap/chaos-mesh/controllers/iochaos/fs.(*Reconciler).recoverPod
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/iochaos/fs/types.go:184
github.com/pingcap/chaos-mesh/controllers/iochaos/fs.(*Reconciler).cleanFinalizersAndRecover
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/iochaos/fs/types.go:152
github.com/pingcap/chaos-mesh/controllers/iochaos/fs.(*Reconciler).Recover
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/iochaos/fs/types.go:112
github.com/pingcap/chaos-mesh/controllers/twophase.(*Reconciler).Reconcile
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/twophase/types.go:87
github.com/pingcap/chaos-mesh/controllers/iochaos.(*Reconciler).Reconcile
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/iochaos/types.go:46
github.com/pingcap/chaos-mesh/controllers.(*IoChaosReconciler).Reconcile
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/iochaos_controller.go:43
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/Users/manjunpeng/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/Users/manjunpeng/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/Users/manjunpeng/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/Users/manjunpeng/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191121015412-41065c7a8c2a/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/Users/manjunpeng/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191121015412-41065c7a8c2a/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/Users/manjunpeng/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191121015412-41065c7a8c2a/pkg/util/wait/wait.go:88
2020-01-12T02:42:15.379Z	ERROR	controllers.IoChaos	failed to recover chaos	{"iochaos": "chaos-testing/io-delay-example", "reconciler": "chaosfs", "error": "timed out waiting for the condition"}
github.com/go-logr/zapr.(*zapLogger).Error
	/Users/manjunpeng/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
github.com/pingcap/chaos-mesh/controllers/twophase.(*Reconciler).Reconcile
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/twophase/types.go:89
github.com/pingcap/chaos-mesh/controllers/iochaos.(*Reconciler).Reconcile
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/iochaos/types.go:46
github.com/pingcap/chaos-mesh/controllers.(*IoChaosReconciler).Reconcile
	/Users/manjunpeng/Workspaces/chaos-mesh/controllers/iochaos_controller.go:43
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/Users/manjunpeng/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:256
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/Users/manjunpeng/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
	/Users/manjunpeng/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
	/Users/manjunpeng/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191121015412-41065c7a8c2a/pkg/util/wait/wait.go:152
k8s.io/apimachinery/pkg/util/wait.JitterUntil
	/Users/manjunpeng/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191121015412-41065c7a8c2a/pkg/util/wait/wait.go:153
k8s.io/apimachinery/pkg/util/wait.Until
	/Users/manjunpeng/go/pkg/mod/k8s.io/apimachinery@v0.0.0-20191121015412-41065c7a8c2a/pkg/util/wait/wait.go:88
@mahjonp mahjonp added the type/bug Something isn't working label Jan 12, 2020
@mahjonp
Contributor Author

mahjonp commented Jan 12, 2020

I checked the logs of the chaosfs sidecar container and found this error. PTAL @ethercflow:

$ kubectl logs -f demo-tikv-0 -c chaosfs 
Chaosfs Version: version.Info{GitVersion:"v0.0.0-master+$Format:%h$", GitCommit:"89189f934e4ff90f6a6ab2bfb80a1abd6f65a050", GitTreeState:"dirty", BuildDate:"2020-01-12T02:12:00Z", GoVersion:"go1.13.3", Compiler:"gc", Platform:"linux/amd64"}
2020-01-12T02:46:03.851Z	INFO	chaos-daemon	Init hookfs
2020-01-12T02:46:03.851Z	ERROR	chaos-daemon	failed to create pid file	{"error": "pid file found, ensure docker is not running or delete /tmp/fuse/pid"}
github.com/go-logr/zapr.(*zapLogger).Error
	/Users/manjunpeng/go/pkg/mod/github.com/go-logr/zapr@v0.1.0/zapr.go:128
main.main
	/Users/manjunpeng/Workspaces/chaos-mesh/cmd/chaosfs/main.go:105
runtime.main
	/usr/local/Cellar/go/1.13.3/libexec/src/runtime/proc.go:203

@cwen0
Member

cwen0 commented Jan 12, 2020

@mahjonp Can you give me the log of the first crash of chaosfs?

kubectl logs -f demo-tikv-0 -c chaosfs  -p 

@mahjonp
Contributor Author

mahjonp commented Jan 12, 2020

@mahjonp Can you give me the log of the first crash of chaosfs?

kubectl logs -f demo-tikv-0 -c chaosfs  -p 

Sorry, the cluster has already been destroyed.

@zhouqiang-cl zhouqiang-cl added this to TODO in chaos-operator Jan 15, 2020
@cwen0
Member

cwen0 commented Feb 5, 2020

This issue is caused by the data directory of the application; see the FAQs for details. The delete command hangs because the chaos action failed and we recorded the target pod in the finalizers of this chaos object. To delete the chaos object, we can remove the finalizers from it.
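A minimal sketch of removing the finalizers by hand, assuming the object from this report (`io-delay-example` in the `chaos-testing` namespace, as shown in the controller logs). `show_finalizers` and `clear_finalizers` are hypothetical helper names, not part of chaos-mesh; the `kubectl` subcommands and flags they wrap are standard.

```shell
# Hypothetical helpers (not part of chaos-mesh). The resource type
# iochaos.pingcap.com is taken from the `kubectl delete` output in this report.

# List the finalizers still recorded on the chaos object.
# Usage: show_finalizers <namespace> <name>
show_finalizers() {
  kubectl get iochaos.pingcap.com "$2" -n "$1" -o jsonpath='{.metadata.finalizers}'
}

# Replace the finalizer list with an empty one so the pending delete can finish.
# Assumption: skipping the controller's recovery step is acceptable here.
# Usage: clear_finalizers <namespace> <name>
clear_finalizers() {
  kubectl patch iochaos.pingcap.com "$2" -n "$1" --type merge \
    -p '{"metadata":{"finalizers":[]}}'
}
```

For example, `show_finalizers chaos-testing io-delay-example` followed by `clear_finalizers chaos-testing io-delay-example`. Clearing finalizers bypasses the controller's recovery, so any chaos already injected into the target pod may need to be cleaned up separately.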

I will close this issue.

@cwen0 cwen0 closed this as completed Feb 5, 2020
@zhouqiang-cl zhouqiang-cl moved this from TODO to Done in chaos-operator Feb 9, 2020