Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slog/log: Concurrent writes to TextHandler and JSONHandler lose log lines. #61321

Closed
WillAbides opened this issue Jul 12, 2023 · 11 comments
Closed
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@WillAbides
Copy link
Contributor

What version of Go are you using (go version)?

$ go version
go version go1.21rc2 darwin/amd64

Does this issue reproduce with the latest release?

It reproduces on go1.21rc2, and on go1.20 if I use "golang.org/x/exp/slog"
instead of "log/slog"

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE='on'
GOARCH='amd64'
GOBIN=''
GOCACHE='/Users/wroden/Library/Caches/go-build'
GOENV='/Users/wroden/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/wroden/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/wroden/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/wroden/go/go1.21rc2'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/wroden/go/go1.21rc2/pkg/tool/darwin_amd64'
GOVCS=''
GOVERSION='go1.21rc2'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/bm/vj4qgj_n737bj6l4b2wkf08h0000gn/T/go-build3015726571=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

I logged a lot of messages from concurrent goroutines to a slog.Logger and children
created by logger.With(...). See the play link for more details.

play: https://go.dev/play/p/Uhzxkd5e-a7?v=gotip

What did you expect to see?

I expected to see all the logged messages in the logger's output.

What did you see instead?

Most of the logged messages, but not all of them.

The only way I could get it to reliably log all messages was to use one mutex for all the loggers. Separate mutexes for each child logger did not help.

What else?

To check whether this was a problem with the handler or the logger, I also tried
calling handler.Handle() directly. That showed the problem is in the handler
and happens on both of the handlers provided by the slog package.

In experimenting with dummy handlers, the only way I found to avoid this is
adding a mutex to the handler. I'm not sure if that's the best solution, but it
does seem to work.

@WillAbides WillAbides changed the title affected/package: slog/log: Concurrent writes to TextHandler and JSONHandler lose log lines. Jul 12, 2023
@AndrewHarrisSPU
Copy link

AndrewHarrisSPU commented Jul 12, 2023

The test is racing. The counts get a little better with wg.Add(count) rather than invoking wg.Add(1) but there's more to it - I am a bit rushed and can't dig into why the test is racing (I think it's racing buffer writes?) but it seems like it would be worth fixing that to see if the problem persists.

@bcmills
Copy link
Contributor

bcmills commented Jul 12, 2023

(attn @jba)

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jul 12, 2023
@bcmills bcmills added this to the Go1.21 milestone Jul 12, 2023
@WillAbides
Copy link
Contributor Author

@AndrewHarrisSPU I agree that there is a race writing to buf, but I think it's the handler's responsibility to guard against that race.

The doc says:

// Any of the Handler's methods may be called concurrently with itself
// or with other methods. It is the responsibility of the Handler to
// manage this concurrency.

commonHandler already has a mutex guarding against concurrent writes to the output. It just doesn't extend that to child handlers.

@jba
Copy link
Contributor

jba commented Jul 12, 2023

Thanks everyone. 20 seconds after reading "it just doesn't extend that to child handlers" I realized there was a serious bug, which I hadn't noticed in many months of development.

To elaborate on what @WillAbides said, the handler implementation has two fields, a sync.Mutex and an io.Writer, with the first protecting the second. Every call with Handler.WithGroup and Handler.WithAttrs creates a clone of that handler with additional information. The clone has a new mutex, but the same io.Writer.

One fix is to change the sync.Mutex field to a *sync.Mutex.
Another is to write a lockingWriter type that holds both the mutex and the io.Writer, and pass a pointer to that around.
I think they're basically the same thing. Does anyone see a difference? Does anyone have a preference?

@WillAbides
Copy link
Contributor Author

@jba I've been fiddling with it here and got it working by changing mu to a sync.Locker and updating NewTextHandler, NewJSONHandler and newDefaultHandler to set it with mu: &sync.Mutex{}. Then update commonHandler.clone() to set mu and defaultHandler.Handle to use the lock. I think that's all the production changes necessary for this.

I would be happy to submit a change.

@jba
Copy link
Contributor

jba commented Jul 12, 2023

Yeah, that's my option 1.

Would love to take your change! Thanks!
But just write *sync.Mutex, not sync.Locker. (Unless you see a problem with that?)
And please turn your play link into a test, if you don't mind.

@WillAbides
Copy link
Contributor Author

Will do. I'll have something pushed shortly.

@jba
Copy link
Contributor

jba commented Jul 12, 2023

This is still going to be a problem for people who make two top-level loggers that use the same io.Writer. I don't think there's a simple fix for that. (The log package has the same problem, if I'm not mistaken.)

We ultimately want something like zapcore.Lock, that does my option 2: wraps an io.Writer in a lock. Maybe I'll propose that once the dust settles.

/cc @bcmills

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/509196 mentions this issue: log/slog: fix issue with concurrent writes

@AndrewHarrisSPU
Copy link

Seems good to have caught this :)

This is still going to be a problem for people who make two top-level loggers that use the same io.Writer. I don't think there's a simple fix for that. (The log package has the same problem, if I'm not mistaken.)

We ultimately want something like zapcore.Lock, that does my option 2: wraps an io.Writer in a lock. Maybe I'll propose that once the dust settles.

/cc @bcmills

I wonder about an additional WriterAlreadySynchronizes boolean field of slog.HandlerOptions, which could indicate the writer is trusted when set true and subsequently not invoke the mutex in the common handler. This might be a way to permit something like zerolog's lock-free/atomic diodes - I don't know if many people actually need this, but it was interesting.

@bcmills
Copy link
Contributor

bcmills commented Jul 13, 2023

The documentation for (*TextHandler).Handle and (*JSONHandler).Handle requires that each Record is written as a single call to Write. So for multiple top-level loggers that use the same io.Writer, it should suffice to wrap the Write method to acquire and release a mutex.

bradfitz pushed a commit to tailscale/go that referenced this issue Aug 2, 2023
This causes instances commonHandler created by withAttrs or withGroup to
share a mutex with their parent preventing concurrent writes to their
shared writer.

Fixes golang#61321

Change-Id: Ieec225e88ad51c84b41bad6c409fac48c90320b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/509196
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
This causes instances commonHandler created by withAttrs or withGroup to
share a mutex with their parent preventing concurrent writes to their
shared writer.

Fixes golang#61321

Change-Id: Ieec225e88ad51c84b41bad6c409fac48c90320b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/509196
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
eric pushed a commit to fancybits/go that referenced this issue Sep 7, 2023
This causes instances commonHandler created by withAttrs or withGroup to
share a mutex with their parent preventing concurrent writes to their
shared writer.

Fixes golang#61321

Change-Id: Ieec225e88ad51c84b41bad6c409fac48c90320b2
Reviewed-on: https://go-review.googlesource.com/c/go/+/509196
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Jonathan Amsterdam <jba@google.com>
Reviewed-by: Jonathan Amsterdam <jba@google.com>
@golang golang locked and limited conversation to collaborators Jul 13, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants