-
Notifications
You must be signed in to change notification settings - Fork 18.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve logging of long log lines #22982
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,6 @@ | ||
package logger | ||
|
||
import ( | ||
"bufio" | ||
"bytes" | ||
"io" | ||
"sync" | ||
|
@@ -10,8 +9,13 @@ import ( | |
"github.com/Sirupsen/logrus" | ||
) | ||
|
||
const ( | ||
bufSize = 16 * 1024 | ||
readSize = 2 * 1024 | ||
) | ||
|
||
// Copier can copy logs from specified sources to Logger and attach Timestamp. | ||
// Writes are concurrent, so you need implement some sync in your logger | ||
// Writes are concurrent, so you need implement some sync in your logger. | ||
type Copier struct { | ||
// srcs is map of name -> reader pairs, for example "stdout", "stderr" | ||
srcs map[string]io.Reader | ||
|
@@ -39,30 +43,76 @@ func (c *Copier) Run() { | |
|
||
func (c *Copier) copySrc(name string, src io.Reader) { | ||
defer c.copyJobs.Done() | ||
reader := bufio.NewReader(src) | ||
buf := make([]byte, bufSize) | ||
n := 0 | ||
eof := false | ||
msg := &Message{Source: name} | ||
|
||
for { | ||
select { | ||
case <-c.closed: | ||
return | ||
default: | ||
line, err := reader.ReadBytes('\n') | ||
line = bytes.TrimSuffix(line, []byte{'\n'}) | ||
|
||
// ReadBytes can return full or partial output even when it failed. | ||
// e.g. it can return a full entry and EOF. | ||
if err == nil || len(line) > 0 { | ||
if logErr := c.dst.Log(&Message{Line: line, Source: name, Timestamp: time.Now().UTC()}); logErr != nil { | ||
logrus.Errorf("Failed to log msg %q for logger %s: %s", line, c.dst.Name(), logErr) | ||
} | ||
// Work out how much more data we are okay with reading this time. | ||
upto := n + readSize | ||
if upto > cap(buf) { | ||
upto = cap(buf) | ||
} | ||
|
||
if err != nil { | ||
if err != io.EOF { | ||
logrus.Errorf("Error scanning log stream: %s", err) | ||
// Try to read that data. | ||
if upto > n { | ||
read, err := src.Read(buf[n:upto]) | ||
if err != nil { | ||
if err != io.EOF { | ||
logrus.Errorf("Error scanning log stream: %s", err) | ||
return | ||
} | ||
eof = true | ||
} | ||
n += read | ||
} | ||
// If we have no data to log, and there's no more coming, we're done. | ||
if n == 0 && eof { | ||
return | ||
} | ||
// Break up the data that we've buffered up into lines, and log each in turn. | ||
p := 0 | ||
for q := bytes.Index(buf[p:n], []byte{'\n'}); q >= 0; q = bytes.Index(buf[p:n], []byte{'\n'}) { | ||
msg.Line = buf[p : p+q] | ||
msg.Timestamp = time.Now().UTC() | ||
msg.Partial = false | ||
select { | ||
case <-c.closed: | ||
return | ||
default: | ||
if logErr := c.dst.Log(msg); logErr != nil { | ||
logrus.Errorf("Failed to log msg %q for logger %s: %s", msg.Line, c.dst.Name(), logErr) | ||
} | ||
} | ||
p += q + 1 | ||
} | ||
// If there's no more coming, or the buffer is full but | ||
// has no newlines, log whatever we haven't logged yet, | ||
// noting that it's a partial log line. | ||
if eof || (p == 0 && n == len(buf)) { | ||
if p < n { | ||
msg.Line = buf[p:n] | ||
msg.Timestamp = time.Now().UTC() | ||
msg.Partial = true | ||
if logErr := c.dst.Log(msg); logErr != nil { | ||
logrus.Errorf("Failed to log msg %q for logger %s: %s", msg.Line, c.dst.Name(), logErr) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Suggestion for avoiding massive error lines const maxErrors := 256
errors := 0
// ...
if errors < maxErrors {
logrus.Errorf(theError)
errors += 1
if errors == maxErrors {
logrus.Warnf("reached the limit")
}
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure I follow your meaning here. Are you looking at having dockerd stop logging "Failed to log msg..." errors once some number of them have been encountered? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, I understand, then. I think that belongs in a different PR. |
||
} | ||
p = 0 | ||
n = 0 | ||
} | ||
if eof { | ||
return | ||
} | ||
} | ||
// Move any unlogged data to the front of the buffer in preparation for another read. | ||
if p > 0 { | ||
copy(buf[0:], buf[p:n]) | ||
n -= p | ||
} | ||
} | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little not comfortable with passing part of a slice to
Log
. I'm not sure about current loggers if they can stash messages and send them later, but that would lead to messages corruption.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could allocate msg.Line and copy the slice into it every time, at the cost of a hit to speed and memory usage. Some of the log driver internals are complicated enough that I can't say for sure that we don't need to worry about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, dug into it.
I'm tempted to make awslogs copy the data, and let the rest use slices, because otherwise we're copying the log data twice for most log drivers (once in Copier, and again in the log driver).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with this, let's document this also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added notes around the definition of Message and the new CopyMessage function, and a very short note in the awslog driver where it calls CopyMessage(). @LK4D4 is this what you were thinking, or are there other places this should be called out?