New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bugfix for log lines without or delayed newlines results in a file truncated detection and reading the file again #164
Conversation
I found a solution for the issue elastic#149
Just for reference, as described in #149 we discovered ReadLine() actually treats EOF as an end-of-line delimiter (I checked the pkg source) - which is the root cause of the issue. |
Hi tzahari, |
I've been testing without the removal of +1 and it does work OK, no issue with the reproduction steps you provided before. So if you can put that +1 back in that will make this fix perfect :-) |
Hi Jason, |
Hi tzahari, I agree. I think the best way seems to be, readline() returns another value - the number of characters stripped - and that is added to offset. Jason. |
Hi tzahari, Found another possible issue - if the newline does not appear for more than 10 seconds, we lose all data up until the line because it's kept in buffer and we lose it when we return nil, io.EOF. I think it need to do one of:
Jason |
Fixed: file offset still calculated wrong. File was detected as truncated again :-( Fixed: loosing the beginning of a partial line Fixed: A CR is also removed from the EOL Known issues: If the last line was logged with CRLF as EOF then the file offset is not correct calculated (missing one byte for the CR) in the Registry
Hi Jason, |
Cool That issue with the registrar... I've fixed that in my PR by moving the offset change to BEFORE the registrar event is sent. So the registrar actually has the calculated offset and doesn't need to do anything. I'll drop a comment on one bit I'm not sure why was moved. I've also got the tests working here so I can apply your PR to my PR, fix conflicts, and see if the test passes (currently fails of course!) Regards Jason |
} | ||
|
||
if segment != nil && len(segment) > 0 { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line 145. Just needs blank line removing here and at end of block - forget what I said about moved - got it now, needed bufferSize earlier ^^ Coolios
Hi tzahari, I got the tests working again and created a test, but it fails. So looks like the buffer is gonna need to be held by the caller, Harvest() and it passes it back in on each subsequent call. If readline() returns a line it will need to clear the buffer (since it won't go out of scope anymore). Jason |
So I'm thinking if we do this in Harvest() // TODO(sissel): Make the buffer size tunable at start-time
reader := bufio.NewReaderSize(h.file, 16<<10) // 16kb buffer by default
buffer := new(bytes.Buffer)
var read_timeout = 10 * time.Second
last_read_time := time.Now()
for {
text, bytesread, err := h.readline(reader, buffer, read_timeout) Then the later bit. Don't worry about the +1 in the registrar - my PR ultimately fixes it but its minor - with CRLF it just means we resume and read a blank line - no bad thing really offset += int64(bytesread) Then we have for readline() func (h *Harvester) readline(reader *bufio.Reader, buffer *bytes.Buffer, eof_timeout time.Duration) (*string, int, error) {
var is_partial bool = true
var newline_length int = 1
start_time := time.Now()
for {
segment, err := reader.ReadBytes('\n')
if segment != nil && len(segment) > 0 {
if segment[len(segment)-1] == '\n' {
// Found a complete line
is_partial = false
// Check if also a CR present
if len(segment) > 1 && segment[len(segment)-2] == '\r' {
newline_length++
}
}
// TODO(sissel): if buffer exceeds a certain length, maybe report an error condition? chop it?
buffer.Write(segment)
}
if err != nil {
if err == io.EOF && is_partial {
time.Sleep(1 * time.Second) // TODO(sissel): Implement backoff
// Give up waiting for data after a certain amount of time.
// If we time out, return the error (eof)
if time.Since(start_time) > eof_timeout {
return nil, 0, err
}
continue
} else {
log.Println(err)
return nil, 0, err // TODO(sissel): don't do this?
}
}
// If we got a full line, return the whole line without the EOL chars (CRLF or LF)
if !is_partial {
// Get the str length with the EOL chars (LF or CRLF)
bufferSize := buffer.Len()
str := new(string)
*str = buffer.String()[:bufferSize - newline_length]
// Reset the buffer for the next line
buffer.Reset()
return str, bufferSize, nil
}
} /* forever read chunks */
return nil, 0, nil
} Using the above, the test I've made passes! |
FYI - just updated my comment with the fixed code - I simplified the CRLF detection and fixed a few formatting bits |
Hi Jason, |
No problem! I fixed it all though in another PR - and also added test cases for this issue and the couple others I fixed. That one doesn't fail because of the above reasons, it just fails because the tests fail (which is good!) Once the test PR is merged, this one too, and my one, it should start succeeding. Although Jordan might kill me when he sees the conflicts lol... but when I did a test merge it didn't look too bad... Have a good weekend too :-) |
When I say merge the PR for him - I mean into one big PR. I'm not a maintainer! lol |
any news of forward movement on this fix? |
Hi glongman, You can find a fully working logstash-forwarder including this and more in my github: |
Apologies for the delay in attending to many of these pull requests. The PRs now appear to be some kind of spaghetti of merges crossing into other PRs. My plan is to spend some time next week untangling what has occurred in my absence. Hopefully will have some of these things merged soon. Thanks again for your patience! |
I can confirm that this fix works. I've been using it for a week, and I haven't had any more floods of repeated log data. |
for { | ||
segment, is_partial, err := reader.ReadLine() | ||
segment, err := reader.ReadBytes('\n') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't fully read this patch, but it looks like this reimplements go's ReadLine() method with respect to partial detection. Is there a bug we should file upstream to the Go developers? Put differently, what caused this to need reimplementing of is_partial?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As described in #149 we discovered ReadLine() actually treats EOF as an end-of-line delimiter (I checked the pkg source) - which is the root cause of the issue.
So if you have data at the end of the log, with no new line, ReadLine() returns normally with is_partial set to false and you can never know if there was a new line or not - and LSF assumes there was (the +1 we have for offset). So we end up offset too far ahead of the Size() and detect truncation, wrongly.
It's also worth noting http://golang.org/pkg/bufio/#Reader.ReadLine
ReadLine is a low-level line-reading primitive. Most callers should use ReadBytes('\n') or ReadString('\n') instead or use a Scanner.
Using ReadBytes('\n') we can safely ensure that we don't treat EOF as a line delimiter and properly count the \r\n bytes at the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think Go have documented the behaviour at EOF. Note the second sentence.
The text returned from ReadLine does not include the line end ("\r\n" or "\n"). No indication or error is given if the input ends without a final line end.
So an upstream patch is probably not viable as it's the defined behaviour.
Hi @jordansissel , |
@twood9003 are you sure it's the same issue? This was a very specific issue and you haven't mentioned any of the symptoms you are seeing. If you could raise a new issue it would be better and we can all look at it with fresh eyes in the event it is different. It helps too as this is now closed. |
Changed the file reading from ReadLine() to ReadBytes('\n').
Now the file is not longer recognized as truncated (and start reading the file again from the beginning).
Details here: #149
The problem is that ReadLine() http://golang.org/pkg/bufio/#Reader.ReadLine returns also a incomplete line (without newline) but there is no indicator that the newline is missing.
Therefor I changed the reading to ReadBytes() http://golang.org/pkg/bufio/#Reader.ReadBytes