Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dead lock detected? #33

Closed
sillykelvin opened this issue Aug 24, 2021 · 5 comments
Closed

Dead lock detected? #33

sillykelvin opened this issue Aug 24, 2021 · 5 comments

Comments

@sillykelvin
Copy link
Contributor

sillykelvin commented Aug 24, 2021

Thanks for the great package! We use it with zerolog, and it works well, however we met a bug recently, large amount of goroutines hang at writing log, all with the following stack trace:

(dlv) bt
 0  0x0000000000439405 in runtime.gopark
    at go/src/runtime/proc.go:337
 1  0x000000000044a385 in runtime.goparkunlock
    at go/src/runtime/proc.go:342
 2  0x000000000044a385 in runtime.semacquire1
    at go/src/runtime/sema.go:144
 3  0x0000000000468967 in sync.runtime_SemacquireMutex
    at go/src/runtime/sema.go:71
 4  0x00000000004721e5 in sync.(*Mutex).lockSlow
    at go/src/sync/mutex.go:138
 5  0x0000000000b4ee0b in sync.(*Mutex).Lock
    at go/src/sync/mutex.go:81
 6  0x0000000000b4ee0b in github.com/arthurkiller/rollingwriter.(*LockedWriter).Write
    at go-mod-cache/github.com/arthurkiller/rollingwriter@v1.1.2/writer.go:254
 7  0x0000000000695703 in github.com/rs/zerolog.levelWriterAdapter.WriteLevel
    at go-mod-cache/github.com/rs/zerolog@v1.21.0/writer.go:20
 8  0x0000000000695703 in github.com/rs/zerolog.(*levelWriterAdapter).WriteLevel
    at <autogenerated>:1
 9  0x0000000000689625 in github.com/rs/zerolog.(*Event).write
    at go-mod-cache/github.com/rs/zerolog@v1.21.0/event.go:76
10  0x0000000000689b1b in github.com/rs/zerolog.(*Event).msg
    at go-mod-cache/github.com/rs/zerolog@v1.21.0/event.go:140
11  0x0000000000b39f70 in github.com/rs/zerolog.(*Event).Msg
    at go-mod-cache/github.com/rs/zerolog@v1.21.0/event.go:106
...

I'm not familiar with the code, however, after a quick search, I found the following suspicious function:

func (w *LockedWriter) Write(b []byte) (n int, err error) {
	w.Lock()
	select {
	case filename := <-w.fire:
		if err := w.Reopen(filename); err != nil {
			return 0, err                      // <-- here
		}
	default:
	}
	n, err = w.file.Write(b)
	w.Unlock()
	return
}

I don't know if there is a chance that the function returns at the comment line, leaves the lock acquired without release. Is this the root cause of the bug above? Any help is appreciated, thanks.

@arthurkiller
Copy link
Owner

arthurkiller commented Aug 24, 2021 via email

@sillykelvin
Copy link
Contributor Author

I have reproduced the problem:

  1. init the locked writer, make it rotate every minute
  2. do something that may cause os.Rename fail, e.g. delete the currently active log file
  3. wait about a minute and check the process

It is indeed because the code snippet I posted above does not release the lock when w.Reopen(filename) fails, the problem is gone with a defer w.Unlock() statement put after w.Lock().

@arthurkiller
Copy link
Owner

good catch, thx alot

@sillykelvin
Copy link
Contributor Author

Shall we bump a new hotfix version v1.1.3, with this fix included?

@arthurkiller
Copy link
Owner

Shall we bump a new hotfix version v1.1.3, with this fix included?

v1.1.2.1 has released

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants