Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: Exit does not respect deferred functions making any cleanup code useless #38261

Open
Sebi2020 opened this issue Apr 5, 2020 · 16 comments
Open

Comments

@Sebi2020
Copy link

@Sebi2020 Sebi2020 commented Apr 5, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14.1 windows/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
set GO111MODULE=auto
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\anonymous\AppData\Local\go-build
set GOENV=C:\Users\anonymous\AppData\Roaming\go\env
set GOEXE=.exe
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=C:\Users\anonymous\go
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=c:\go
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=c:\go\pkg\tool\windows_amd64
set GCCGO=gccgo
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=C:\Users\anonymous\Desktop\an\go.mod
set CGO_CFLAGS=-g -O2
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-g -O2
set CGO_FFLAGS=-g -O2
set CGO_LDFLAGS=-g -O2
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -fmessage-length=0 -fdebug-prefix-map=C:\Users\anon~1\AppData\Local\Temp\go-build078526234=/tmp/go-build -gno-record-gcc-switches

What did you do?

I called os.Exit(1) on SIGINT
https://play.golang.org/p/_Py0WToAf8v
(Local testing is better, playground does not like sleeps and have no way to cause a SIGINT).

What did you expect to see?

Golang runs all deferred cleanup routines. This would allow every go routine to register it's own cleanup function if you think separation of concerns is a thing.

What did you see instead?

No Cleanup function is called at all, but that is important if you're writing to files in other goroutines. Panic have a similiar issue.

See also: #4101

@Sebi2020 Sebi2020 changed the title os.Exit does not respect deferred functions making any cleanup code senseless os.Exit does not respect deferred functions making any cleanup code useless Apr 5, 2020
@ALTree
Copy link
Member

@ALTree ALTree commented Apr 5, 2020

This is working as intended (and it's well documented, both in the spec section about defer and in the os.Exit documentation); so I don't believe this is a bug.

I called os.Exit(1) on SIGINT

This would allow every go routine to register it's own cleanup function

In close_handler, call runtime.Goexit() instead of os.Exit(), the former does run deferred functions.

Would that satisfy your use case?

@ALTree ALTree changed the title os.Exit does not respect deferred functions making any cleanup code useless os.: Exit does not respect deferred functions making any cleanup code useless Apr 5, 2020
@ALTree ALTree changed the title os.: Exit does not respect deferred functions making any cleanup code useless os: Exit does not respect deferred functions making any cleanup code useless Apr 5, 2020
@Sebi2020
Copy link
Author

@Sebi2020 Sebi2020 commented Apr 5, 2020

This is working as intended (and it's well documented, both in the spec section about defer and in the os.Exit documentation); so I don't believe this is a bug.

I called os.Exit(1) on SIGINT
This would allow every go routine to register it's own cleanup function

In close_handler, call runtime.Goexit() instead of os.Exit(), the former does run deferred functions.

Would that satisfy your use case?

No, because it does not close the entire process just the calling goroutine. Therefore the main goroutine (or other routines which have been spawned) keep running. Goexit only returns from the calling goroutine. Cleanup code in other goroutines do not get executed and things are much more easier (also in terms of separation of concerns) if Golang, which contains the defer and goroutine stuff, would provide reliable Exit functions which take care of all this stuff.

@ccbrown
Copy link

@ccbrown ccbrown commented Apr 5, 2020

os.Exit is the wrong tool for the job here. In most cases you should spawn your goroutines with cancelation and join mechanisms built with channels or the context package.

Since os.Exit is already well defined, if you still want this behavior, you should probably propose adding a new function to the runtime package instead. I can’t think of any good use-cases off the top of my head for such a function though.

@Sebi2020
Copy link
Author

@Sebi2020 Sebi2020 commented Apr 5, 2020

@ccbrown I already explained the use case. And I think contexts are not applicable because you can't interrupt sleeping goroutines. What about blocking file operations for example.

If you look at my example, you will see that Cancelation won't work here as the main routine is not able to cancel the goroutine (the process exit is triggered by the goroutine itself).
I'm not sure how you get separation of concerns with contexts, because my idea is, that everyn goroutine should be responsible for the cleanup it must do (btw this makes code a lot more understandable und reusable, because you haven't any hidden dependencies with other code pieces that have to do cleanup for goroutines in the case of an early exit). And as far as I can remember the defer call was advertised for cleanup tasks. Which, as we can see is not a reliable way if the application panics or if the process is killed. See my example on playground.

And I think it's a common task to do some cleanup if the user interrupts the process with SIGINT before exiting. The other way arround I'm question myself why we have os.Exit if it's not applicable to reliable close a process. It does not respect defers. Some of go's core features.

@ccbrown
Copy link

@ccbrown ccbrown commented Apr 5, 2020

The way you would do this with contexts looks like this:

package main

import (
	"context"
	"os"
	"os/signal"
	"time"

	"golang.org/x/sync/errgroup"
)

func GoroutineThatSleeps(ctx context.Context) error {
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-time.After(time.Second):
		return nil
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())

	go func() {
		ch := make(chan os.Signal, 1)
		signal.Notify(ch, os.Interrupt)
		<-ch
		cancel()
	}()

	var g errgroup.Group

	// launch some goroutines that we can cancel
	for i := 0; i < 3; i++ {
		g.Go(func() error {
			return GoroutineThatSleeps(ctx)
		})
	}

	if err := g.Wait(); err != nil {
		println(err.Error())
		os.Exit(1)
	}
}

https://play.golang.org/p/YAarsVHJSGp

This works fine for goroutines that sleep and goroutines that use blocking IO. Many third-party APIs have direct support for using contexts in this way.

I said I couldn't think of any good use-cases for a function that attempts to immediately end all goroutines gracefully. There are better, more elegant ways to gracefully cancel goroutines.

@ccbrown
Copy link

@ccbrown ccbrown commented Apr 5, 2020

Also want to add: If your concern is with canceling work done by a third-party library that does something like time.Sleep, that library obviously was not written with cancelation in mind, and even with your proposed behavior, you would not be able to ensure that it shuts down gracefully.

@Sebi2020
Copy link
Author

@Sebi2020 Sebi2020 commented Apr 5, 2020

Also want to add: If your concern is with canceling work done by a third-party library that does something like time.Sleep, that library obviously was not written with cancelation in mind, and even with your proposed behavior, you would not be able to ensure that it shuts down gracefully.

Maybe golang does not work like other languages like C if they're calling blocking I/O, but as far as I know blocking I/O is another example for a sleeping thread which cannot be interrupted until a timeout or completion of the operation.

At least my proposal whould allow developers to recover if for example some goroutine panics. Currently only the panicking goroutine is able to do something in case of an error. If some other routine hasn't completed it's operation this could cause data corruption, just for example.

I said I couldn't think of any good use-cases for a function that attempts to immediately end all goroutines gracefully. There are better, more elegant ways to gracefully cancel goroutines.

I just wanted to show that without support by os.Exit() or another API function it forces you to use many more lines of code if you compare this to just an simple exit call.
Maybe I misunderstand your code, but as I read the code this looks like you're only able to cancel goroutines which aren't already sleeping (like they would do if some blocking I/O operation is in progress).

@ccbrown
Copy link

@ccbrown ccbrown commented Apr 5, 2020

Maybe I misunderstand your code, but as I read the code this looks like you're only able to cancel goroutines which aren't already sleeping (like they would do if some blocking I/O operation is in progress).

Yeah this is just a misunderstanding. You can cancel goroutines that are already sleeping or doing blocking I/O. My example function sleeps for one second. If you interrupt the process half a second into the goroutine's slumber, it'll be canceled and return immediately. Try making the sleep time one minute and interrupt it a few seconds in and this will be easier to see: You won't have to wait the full minute.

@Sebi2020
Copy link
Author

@Sebi2020 Sebi2020 commented Apr 5, 2020

My example function sleeps for one second. If you interrupt the process half a second into the goroutine's slumber, it'll be canceled and return immediately. Try making the sleep time one minute and interrupt it a few seconds in and this will be easier to see: You won't have to wait the full minute.

I know this is not the right place to ask questions, but in the example the magic is, that the select statement causes the goroutine to wait for the channels, but how would that work in context of blocking i/o operations? They don't return channels on which a select could choose between the cancelation channel and time.After. For example, a file read/write.

@ccbrown
Copy link

@ccbrown ccbrown commented Apr 5, 2020

Unfortunately os.File predates the context package, so this isn't as easy as it should be, but the easiest thing to do is just close the file when the context is cancelled:

func ReadAllWithContext(ctx context.Context, path string) (bufOut []byte, errOut error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	// Spawn a goroutine to close the file if the context is canceled.
	done := make(chan struct{})
	go func() {
		select {
		case <-ctx.Done():
			f.Close()
		case <-done:
		}
	}()
	defer close(done)

	// If the context was canceled before we could read the file, change the error to ctx.Err().
	defer func() {
		if ctxErr := ctx.Err(); errOut != nil && ctxErr != nil {
			errOut = ctxErr
		}
	}()

	return ioutil.ReadAll(f)
}

Close closes the File, rendering it unusable for I/O. On files that support SetDeadline, any pending I/O operations will be canceled and return immediately with an error

https://golang.org/pkg/os/#File.Close

@ianlancetaylor
Copy link
Contributor

@ianlancetaylor ianlancetaylor commented Apr 5, 2020

Ordinary Go code doesn't have blocking I/O operations in the sense that I think you mean. When a goroutine calls Read or Write on a network connection, the runtime puts that goroutine to sleep and uses epoll, or the equivalent on other operating systems, to decide when to wake the goroutine up again. A different goroutine can always wake the goroutine up by changing the deadline of the operation or closing the network connection.

But I'm not sure what that has to do with the behavior of os.Exit. The current behavior of os.Exit, to exit the program without running any deferred functions, is well established and is not going to change at this point.

In general there is no reliable way to do anything as a program exits, because there are many different ways that a program can exit. For example, on GNU/Linux, the kernel can kill it if it starts using too much memory. Therefore, if there is any cleanup that must be done on program exit, the only reliable way to do it is to invoke the program from a different process that just does the cleanup. The Go standard library has chosen to not provide a 90% solution in this space, because it is a space where correct functioning often requires a 100% solution.

@networkimprov
Copy link

@networkimprov networkimprov commented Apr 6, 2020

@Sebi2020 if you can get data corruption due to unexpected exit, that will happen if the host crashes or powers off. You should be able to recover on restart, and not expect graceful exit.

If you're using a transactional data store, this isn't hard to do. If using the filesystem, it's much more involved (speaking from experience).

@Sebi2020
Copy link
Author

@Sebi2020 Sebi2020 commented Apr 6, 2020

@networkimprov I think that are two different problems yielding to the same result. But with the difference that a power loss or crash off the host is out of your control (from the viewpoint of the application developer), if you have not taken any steps to mitigate these cases. But how an application exits is within your control (mostly, not always).

As I said I just think that a language construct like defer should be reliable regarding their execution is guaranteed. If it isn't you have to keep track of which function calls get executed on exit. As long as deferred function calls are only executed as long the normal operation continues the only reliable way is to exit in the main. That is little bit like only being allowed to use a single return at the end of a function instead of using multiple exits. But I can understand what @ianlancetaylor said about the runtime library.

@ccbrown
Copy link

@ccbrown ccbrown commented Apr 6, 2020

It sounds like you just want to fire and forget goroutines, but still be able to stop them gracefully.

That’s really not something that’s likely to ever be supported though, and this behavior isn’t just specific to Go. In other languages that support RAII patterns like Rust or C++, destructors don’t execute if you exit. You’re just expected to not leak goroutines and only exit after your cleanup is done (usually from main).

@Sebi2020
Copy link
Author

@Sebi2020 Sebi2020 commented Apr 14, 2020

@ccbrown most of the other languages have other methods ensuring that cleanup code can run on a call to exit. This doesn't necessary have to happen inside a destructor (or a defer call) but it would make things much less annoying. An example in C++ is atexit. I don't know a similiar golang mechanism. And saying, no language did this that way is not a really good reason for not doing it. Golang does many things other languages do not do. And as far as I can remember a simliar policy applies for stuff people want golang to have.

@ccbrown
Copy link

@ccbrown ccbrown commented Apr 14, 2020

I think you misunderstand my point a bit. Go's defer statements are much more similar in purpose to C++ destructors than atexit. So my point was just that what Go currently does isn't unusual and should not be surprising.

I wasn't necessarily suggesting that Go shouldn't have an atexit-like mechanism. But if you want that sort of thing, you should probably modify your issue or create a new one to propose it since os.Exit isn't going to change for compatibility reasons.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
5 participants
You can’t perform that action at this time.