Multiple Deployers in 1.24.0 Causes Logging shutdown issue #5840

dhodun · 2021-05-13T20:56:40Z

Expected behavior

Ctl + C at the end of a Skaffold Dev loop shuts down appropriately

Actual behavior

Ctl + C causes an immediate failure and stack trace of Skaffold

Information

Skaffold version: 1.24
Operating system: MacOs Big Sur 11.2.3
Installed via: curl

Steps to reproduce the behavior

skaffold dev when using more than one deployer (we're using helm and kustomize)
After everything stabilizes, CTL + C

May be related to

#5809

Logs

^Cpanic: close of closed channel

goroutine 1 [running]:
github.com/GoogleContainerTools/skaffold/pkg/skaffold/kubernetes/logger.(*LogAggregator).StopLogger(0xc000a401e0)
/skaffold/pkg/skaffold/kubernetes/logger/log.go:145 +0x34
github.com/GoogleContainerTools/skaffold/pkg/skaffold/deploy.DeployerMux.StopLogger(0xc000840c00, 0x7, 0x8)
/skaffold/pkg/skaffold/deploy/deploy_mux.go:102 +0x5a
github.com/GoogleContainerTools/skaffold/pkg/skaffold/runner/v1.(*SkaffoldRunner).Dev(0xc00088a900, 0x62767e0, 0xc00022e140, 0x621fe60, 0xc00051e810, 0xc000618180, 0x7, 0x8, 0x0, 0x0)
/skaffold/pkg/skaffold/runner/v1/dev.go:313 +0x182f
github.com/GoogleContainerTools/skaffold/cmd/skaffold/app/cmd.runDev.func5(0x62a3800, 0xc00088a900, 0xc0006a0d00, 0x9, 0x10, 0x2, 0x2)
/skaffold/cmd/skaffold/app/cmd/dev.go:68 +0x276
github.com/GoogleContainerTools/skaffold/cmd/skaffold/app/cmd.withRunner(0x62767e0, 0xc00022e140, 0x621fe60, 0xc00051e810, 0xc000d5bbc0, 0x0, 0xc)
/skaffold/cmd/skaffold/app/cmd/runner.go:55 +0x114
github.com/GoogleContainerTools/skaffold/cmd/skaffold/app/cmd.runDev(0x62767e0, 0xc00022e140, 0x621fe60, 0xc00051e810, 0x0, 0x0)
/skaffold/cmd/skaffold/app/cmd/dev.go:63 +0x178
github.com/GoogleContainerTools/skaffold/cmd/skaffold/app/cmd.(*builder).NoArgs.func1(0xc0003fc500, 0xc0009cba40, 0x0, 0x3, 0x0, 0x0)
/skaffold/cmd/skaffold/app/cmd/commands.go:129 +0x98
github.com/spf13/cobra.(*Command).execute(0xc0003fc500, 0xc0009cba10, 0x3, 0x3, 0xc0003fc500, 0xc0009cba10)
/go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:852 +0x453
github.com/spf13/cobra.(*Command).ExecuteC(0xc0003fc000, 0x10, 0x0, 0x0)
/go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:960 +0x349
github.com/spf13/cobra.(*Command).Execute(...)
/go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:897
github.com/spf13/cobra.(*Command).ExecuteContext(...)
/go/pkg/mod/github.com/spf13/cobra@v1.1.3/command.go:890
github.com/GoogleContainerTools/skaffold/cmd/skaffold/app.Run(0x621f1a0, 0xc00012e008, 0x621f1a0, 0xc00012e010, 0x0, 0x0)
/skaffold/cmd/skaffold/app/skaffold.go:47 +0x13d
main.main()
/skaffold/cmd/skaffold/skaffold.go:37 +0x5c

The text was updated successfully, but these errors were encountered:

dhodun · 2021-05-13T22:06:47Z

Strangely enough, I didn't see this issue when running skaffold debug from within Cloud Code, using 1.24.

nkubala · 2021-05-13T22:10:15Z

this is happening because in #5809, we changed the control of the logger to be transferred to the individual deployers, meaning each one is responsible for stopping its own logger. however, when we have multiple deployers sharing the same logger, that Stop() call results in an already closed channel having close() called it again, which panics.

the fix here is to implement the follow up work from #5809 (comment), which will prevent multiple calls to Stop() or any other methods that shouldn't be called multiple times.

gsquared94 · 2021-05-17T20:03:01Z

This can be reproduced in examples/multi-config-microservices.
Also the port-forwarder fails intermittently, I think due to the same issue.

...
Deployments stabilized in 1.221 second
Watching for changes...
port forwarding deployment-leeroy-web-default-8080 got terminated: output: error: error upgrading connection: unable to upgrade connection: pod not found ("leeroy-web-7c5768c696-lf2vd_default")

Port forwarding deployment/leeroy-app in namespace default, remote port http -> 127.0.0.1:9001
Port forwarding service/leeroy-app in namespace default, remote port 50051 -> 127.0.0.1:50053

The recent changes in #5809 and others moved the logAggregator and portForwarder inside the Deployer. But in deployer_mux.go each registered deployer calls the Start and Stop methods multiple times. These invocations resolve to a single instance of logAggregator and portForwarder and those aren't implemented to be tolerant to multiple invocations.

I feel it's best to revert these series of changes to a prior stable state and rethink the implementation. Also discussed with @briandealwis.

MarlonGamez added kind/bug Something isn't working priority/p1 High impact feature/bug. labels May 13, 2021

nkubala self-assigned this May 13, 2021

nkubala mentioned this issue May 13, 2021

Use LogProvider to instantiate Loggers #5845

Closed

gsquared94 assigned gsquared94 and unassigned nkubala May 17, 2021

gsquared94 mentioned this issue May 17, 2021

Fix 5840 #5858

Merged

tejal29 closed this as completed in #5858 May 17, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multiple Deployers in 1.24.0 Causes Logging shutdown issue #5840

Multiple Deployers in 1.24.0 Causes Logging shutdown issue #5840

dhodun commented May 13, 2021

dhodun commented May 13, 2021

nkubala commented May 13, 2021

gsquared94 commented May 17, 2021 •

edited

Multiple Deployers in 1.24.0 Causes Logging shutdown issue #5840

Multiple Deployers in 1.24.0 Causes Logging shutdown issue #5840

Comments

dhodun commented May 13, 2021

Expected behavior

Actual behavior

Information

Steps to reproduce the behavior

May be related to

Logs

dhodun commented May 13, 2021

nkubala commented May 13, 2021

gsquared94 commented May 17, 2021 • edited

gsquared94 commented May 17, 2021 •

edited