fix potential log record corruption #30
Merged
So I discovered a pretty horrible bug in log15 the other day that could corrupt log records when a logger is used concurrently (or through the Channel/Buffered handlers).
The bug is in the creation of log records:
The thing about the append() function is that it feels like an operation that works on an immutable argument because it returns you a new slice. And that's true, really. The slice you pass to append is not mutated. However, the backing array of the slice is mutated. This can cause log record context corruption in a race like this:
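To make the aliasing concrete, here is a minimal, single-goroutine sketch. It is not the log15 code; `sharedAppend` and the key/value names are made up for illustration, and the spare capacity is forced with `make` so the effect is deterministic. With two goroutines doing the same appends, the overwrite becomes a data race.

```go
package main

import "fmt"

// sharedAppend is a hypothetical helper (not part of log15). It appends to the
// same base slice twice and returns the value each result sees at index 3.
func sharedAppend() (interface{}, interface{}) {
	// len 2, cap 8: there is spare room in the backing array.
	base := make([]interface{}, 2, 8)
	base[0], base[1] = "k", "v"

	// Both appends fit in the existing capacity, so neither allocates.
	// Each one writes into slots 2 and 3 of the SAME backing array.
	a := append(base, "req", 1)
	b := append(base, "req", 2)

	// a's value at index 3 was overwritten by the second append.
	return a[3], b[3]
}

func main() {
	a, b := sharedAppend()
	fmt.Println(a, b) // prints "2 2"
}
```

Neither `base`, `a`, nor `b` was modified as a slice header, yet `a` ends up carrying the value that was appended for `b`.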
I say "could have wrong context values" because whether the record gets corrupted depends on the behavior of the Go runtime. If the slice for l.ctx was initially allocated without enough spare capacity to hold the new context values, append allocates a new backing array and copies all of the values into it. In that case, the bug is not triggered.
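One way to avoid depending on what the runtime decides is to always build the combined context in a freshly allocated slice. The following is only a sketch of that idea under my own naming; `copyAppend` is a hypothetical helper and is not claimed to be the code in this change.

```go
package main

import "fmt"

// copyAppend is a hypothetical helper: it returns a new slice holding
// prefix followed by suffix, and never writes into prefix's backing array.
func copyAppend(prefix, suffix []interface{}) []interface{} {
	out := make([]interface{}, len(prefix)+len(suffix))
	n := copy(out, prefix)
	copy(out[n:], suffix)
	return out
}

func main() {
	// Same setup that aliases with plain append: spare capacity in base.
	base := make([]interface{}, 2, 8)
	base[0], base[1] = "k", "v"

	a := copyAppend(base, []interface{}{"req", 1})
	b := copyAppend(base, []interface{}{"req", 2})

	fmt.Println(a[3], b[3]) // prints "1 2"
}
```

The cost is one allocation and copy per record, in exchange for each record owning its context regardless of the capacity of the shared slice.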
Amusingly, it seems like for small context lengths (<=32 on Darwin), the runtime never chooses to allocate extra capacity and so this bug both never triggers and is undetectable by the race detector. This is why the test has the magical value of 34, which was the first size I could find that would trigger the bug.
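If you want to see what your own runtime does, a small probe like the one below prints how much spare capacity a slice has after being built with append. `spareCapacity` is a hypothetical helper, and building the slice this way is an assumption about how a context might be constructed, not a copy of the log15 code. The numbers depend on the Go version and platform, so no expected output is given.

```go
package main

import "fmt"

// spareCapacity is a hypothetical probe: it builds a slice of n elements
// via append and reports how many unused slots the runtime left behind it.
func spareCapacity(n int) int {
	var ctx []interface{}
	ctx = append(ctx, make([]interface{}, n)...)
	return cap(ctx) - len(ctx)
}

func main() {
	for _, n := range []int{2, 8, 32, 34, 64} {
		fmt.Printf("len=%d spare=%d\n", n, spareCapacity(n))
	}
}
```

Any length that reports a nonzero spare value is one where a later append can write into the shared backing array instead of allocating.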
@ChrisHines please review