
fix potential log record corruption #30

Merged
merged 2 commits into master on Nov 6, 2014
Conversation

inconshreveable
Owner

So I discovered a pretty horrible bug in log15 the other day that could cause corruption of log records when a logger is used concurrently (or by using the Channel/Buffered handlers).

The bug is in the creation of log records:

Ctx:  append(l.ctx, normalize(ctx)...),

The thing about the append() function is that it feels like an operation that works on an immutable argument because it returns you a new slice. And that's true, really. The slice you pass to append is not mutated.

However, the backing array of the slice is mutated. This can cause log record context corruption in a race like this:

goroutine1:
    r := &Record{
        // ...
        Ctx: append(l.ctx, "foo", "bar"),
    }

goroutine2:
    r := &Record{
        // ...
        Ctx: append(l.ctx, "baz", "quux"),
    }

goroutine1:
    // r could have wrong context values "baz"/"quux" instead of "foo"/"bar"
    l.h.Handle(r)

I say "could have wrong context values" because whether or not the record gets corrupted actually depends on the behavior of the Go runtime. If when we initially allocated the slice for l.ctx the runtime chose to allocate the slice with not enough extra capacity for our new context values, it will allocate a new backing array and copy all of the values to it. In this case, the bug is not triggered.

Amusingly, it seems like for small context lengths (<=32 on Darwin), the runtime never chooses to allocate extra capacity and so this bug both never triggers and is undetectable by the race detector. This is why the test has the magical value of 34, which was the first size I could find that would trigger the bug.
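To make the capacity dependence concrete, here is a small standalone sketch (not from the PR or its tests) showing that two appends to the same slice only interfere when the original slice has spare capacity:

package main

import "fmt"

func main() {
    // Spare capacity: cap > len, so both appends write into the same
    // backing array and the second clobbers the first.
    base := make([]interface{}, 0, 4)
    base = append(base, "k", "v")

    a := append(base, "foo", "bar")
    b := append(base, "baz", "quux")
    fmt.Println(a) // [k v baz quux] -- a was overwritten by the second append
    fmt.Println(b) // [k v baz quux]

    // No spare capacity: cap == len, so each append allocates a fresh
    // backing array and the two results stay independent.
    tight := []interface{}{"k", "v"}
    c := append(tight, "foo", "bar")
    d := append(tight, "baz", "quux")
    fmt.Println(c) // [k v foo bar]
    fmt.Println(d) // [k v baz quux]
}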

@ChrisHines please review
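For reference, the standard way to avoid this class of bug is to copy the shared context into a freshly allocated slice rather than appending to it. Below is a minimal sketch of that approach (a hypothetical helper, not necessarily the exact patch in this PR, reusing the package's existing normalize function):

// newContext copies the logger's shared ctx into a fresh slice so that
// concurrent record creation never shares a backing array.
// (Hypothetical helper; the actual patch may differ.)
func newContext(prefix []interface{}, suffix []interface{}) []interface{} {
    normalizedSuffix := normalize(suffix)
    newCtx := make([]interface{}, len(prefix)+len(normalizedSuffix))
    n := copy(newCtx, prefix)
    copy(newCtx[n:], normalizedSuffix)
    return newCtx
}

The record would then be built with Ctx: newContext(l.ctx, ctx), so each record owns its own backing array.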

const goroutines = 8
var res [goroutines]int
l.SetHandler(SyncHandler(FuncHandler(func(r *Record) error {
    res[r.Ctx[ctxLen+1].(int)]++
@ChrisHines
Collaborator

It took me a while to figure out what this line does, and more generally how this test works. Perhaps some comments explaining the strategy would be helpful.
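For readers following the thread, the strategy behind the quoted lines could be spelled out roughly like this (a sketch reconstructed around the snippet, assuming ctxLen is the length of the logger's base context and that l and the sync package are in scope in the test):

// Each goroutine logs with its own index as the final context value.
// The handler tallies records per index; if append corrupted a record's
// Ctx, some index gets counted the wrong number of times (or the type
// assertion fails on a clobbered value).
const goroutines = 8
var res [goroutines]int
l.SetHandler(SyncHandler(FuncHandler(func(r *Record) error {
    // r.Ctx[ctxLen] is the "id" key, r.Ctx[ctxLen+1] is the goroutine index.
    res[r.Ctx[ctxLen+1].(int)]++
    return nil
})))

var wg sync.WaitGroup
wg.Add(goroutines)
for i := 0; i < goroutines; i++ {
    go func(idx int) {
        defer wg.Done()
        l.Info("test", "id", idx)
    }(i)
}
wg.Wait()
// Expect res[idx] == 1 for every goroutine.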

@ChrisHines
Collaborator

Yes, this is a subtle bug, very good catch.

The fix LGTM.

The new test takes some work to figure out. See if you think my inline comments would help improve readability. It is subjective of course, so my comments are merely suggestions.

@inconshreveable
Owner Author

fixes look okay chris?

@ChrisHines
Collaborator

Yes, very helpful. LGTM.

@inconshreveable inconshreveable merged commit 60f30f8 into master Nov 6, 2014