
[receiver/filelog] Flush can send partial input #32170

Closed
OverOrion opened this issue Apr 4, 2024 · 11 comments

@OverOrion
Contributor

Component(s)

pkg/stanza

What happened?

Description

As mentioned in #31512, the filelogreceiver can "lose" some characters. That issue has grown quite long, so this one summarizes the problem, minimizes the reproduction steps, and points to the relevant lines of code.

The problem is how the flush logic interacts with the scanner. By default the scanner expects newline-terminated lines, but if it cannot find a newline, it returns the current buffer once the flush timeout expires. Since there is no communication between the flush logic and the scanner, the following is possible:

  1. the scanner has scanned some bytes but has not reached EOF yet
  2. a forced flush happens, and the scanner yields its inner buffer

The root cause seems to be the different lifetimes of FlushState and Scanner: there is a single FlushState instance per reader, which is shared across successive Scanner instances:

func New(r io.Reader, maxLogSize int, bufferSize int, startOffset int64, splitFunc bufio.SplitFunc) *Scanner {

This means that these scanner instances will all share the same fate:

  1. Read n bytes (n == initial buffer size), try to read more, but since the newline terminator cannot be found, no token is returned.
  2. Because the scan did not complete successfully, a new scanner is constructed:
    s := scanner.New(r, r.maxLogSize, r.initialBufferSize, r.Offset, r.splitFunc)
    // Iterate over the tokenized file, emitting entries as we go
    for {
        select {
        case <-ctx.Done():
            return
        default:
        }

        ok := s.Scan()
        if !ok {
            if err := s.Error(); err != nil {
                r.logger.Errorw("Failed during scan", zap.Error(err))
            } else if r.deleteAtEOF {
                r.delete()
            }
            return
        }
  3. Once the flush timer expires, the current Scanner is force-flushed, yielding only n bytes (the exact amount can differ, depending on when the flush timeout reaps it). A reduced sketch of this behavior is shown below.

The reconstruction is needed because:

  1. once a scanner has reached the end of its input, it is no longer usable, and
  2. this is how the Collector gets new input from a file: a new Scanner starting at the stored offset
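
To make the failure mode concrete, here is a reduced, self-contained sketch of a flush-style split function, reconstructed only from the snippets quoted in this issue (the LastDataChange/LastDataLength fields and the timeout check); it is illustrative, not the actual pkg/stanza code. The demo pre-expires the timer to compress into a single run what normally happens across poll cycles:

package main

import (
	"bufio"
	"fmt"
	"strings"
	"time"
)

// flushState mirrors the fields referenced in this thread (hypothetical name).
// The wrapper returns whatever is currently buffered once the period elapses,
// even though the wrapped split func has not found a terminator yet and more
// data is still readable.
type flushState struct {
	LastDataChange time.Time
	LastDataLength int
}

func (s *flushState) Func(split bufio.SplitFunc, period time.Duration) bufio.SplitFunc {
	return func(data []byte, atEOF bool) (int, []byte, error) {
		advance, token, err := split(data, atEOF)
		if advance > 0 || token != nil || err != nil {
			return advance, token, err
		}

		// Flush timed out: the partial buffer is returned regardless of atEOF.
		if time.Since(s.LastDataChange) > period {
			s.LastDataChange = time.Now()
			s.LastDataLength = 0
			return len(data), data, nil
		}

		// We're seeing new data so postpone the next flush
		if len(data) > s.LastDataLength {
			s.LastDataChange = time.Now()
			s.LastDataLength = len(data)
		}

		// Ask for more data
		return 0, nil, nil
	}
}

func main() {
	// Pretend the flush period already expired, as it would between poll cycles.
	state := &flushState{LastDataChange: time.Now().Add(-time.Hour)}

	sc := bufio.NewScanner(strings.NewReader("a log line without a trailing newline"))
	sc.Buffer(make([]byte, 10), 1024) // small initial buffer, like the 50-byte reproduction below
	sc.Split(state.Func(bufio.ScanLines, time.Second))

	for sc.Scan() {
		// First token: only the first ~10 bytes (a partial log); the remainder arrives as a second token.
		fmt.Printf("token: %q\n", sc.Text())
	}
}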

Steps to Reproduce

Input creation without newline ending

printf "2024-03-19T11:21:00.839338492-05:00 stdout P 2024-03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpPQ78jzoFu" > input.log

Collector

  1. The easiest way is to change the default scanner buffer size to something small (50 bytes; the input is just a little over 100 bytes)
  2. Build and run the collector with the given configuration
  3. Check the output log file: the input is chunked into 50-byte pieces
{"resourceLogs":[{"resource":{},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1712240671818953417","body":{"stringValue":"2024-03-19T11:21:00.839338492-05:00 stdout P 2024-"},"attributes":[{"key":"log.file.name","value":{"stringValue":"input.log"}}],"traceId":"","spanId":""}]}]}]}
{"resourceLogs":[{"resource":{},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1712240677018432716","body":{"stringValue":"03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpP"},"attributes":[{"key":"log.file.name","value":{"stringValue":"input.log"}}],"traceId":"","spanId":""}]}]}]}
{"resourceLogs":[{"resource":{},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1712240682018598612","body":{"stringValue":"Q78jzoFu"},"attributes":[{"key":"log.file.name","value":{"stringValue":"input.log"}}],"traceId":"","spanId":""}]}]}]}

Expected Result

Whole line as is

Actual Result

Chunked line

Possible solutions

  • A possible workaround would be if flushing only happened when atEOF was true, but that applies only to file-based sources, so it would not work for TCP, for example.
  • A different "polling" method for Scanners, so they would not have to be recreated just to read new input/lines, and the flush timeout would only send the buffer if the scanning cannot advance any further.
  • bufio.Scanner might need to be retired in favor of something else (something based on bufio.Reader perhaps?), combined with keeping track of the current partial token. This needs some thought because many things currently rely on the scanner.

What do you think @djaglowski @ChrsMark?

Also huge kudos to @MrAnno for pair debugging this issue with me 🚀

Collector version

e4c5b51

Environment information

Environment

OS: Ubuntu 23.10
Compiler (if manually compiled): go 1.21.6

OpenTelemetry Collector configuration

receivers:
  filelog:
    start_at: beginning
    include:
    - /home/orion/input.log

exporters:
  file/simple:
    path: ./partial_output
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug, file/simple]

Log output

{"resourceLogs":[{"resource":{},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1712240671818953417","body":{"stringValue":"2024-03-19T11:21:00.839338492-05:00 stdout P 2024-"},"attributes":[{"key":"log.file.name","value":{"stringValue":"input.log"}}],"traceId":"","spanId":""}]}]}]}
{"resourceLogs":[{"resource":{},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1712240677018432716","body":{"stringValue":"03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpP"},"attributes":[{"key":"log.file.name","value":{"stringValue":"input.log"}}],"traceId":"","spanId":""}]}]}]}
{"resourceLogs":[{"resource":{},"scopeLogs":[{"scope":{},"logRecords":[{"observedTimeUnixNano":"1712240682018598612","body":{"stringValue":"Q78jzoFu"},"attributes":[{"key":"log.file.name","value":{"stringValue":"input.log"}}],"traceId":"","spanId":""}]}]}]}

Additional context

I also added some good ole' print statements to Func()

func (s *State) Func(splitFunc bufio.SplitFunc, period time.Duration) bufio.SplitFunc {
and to s.Bytes(), which helped with debugging.

// First scanner instance
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024- // First 50 bytes
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpP // First 2*50 bytes
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpPQ78jzoFu // Leftovers
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpPQ78jzoFu //EOF

// Second scanner instance, same fate
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpP
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpPQ78jzoFu
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-03-13 11:51:00,838 [scheduler-2] INFO  dLphJ63kHpPQ78jzoFu

// Third scanner instance
inside Func, data is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-

// Flush timeout, sending current scanner's buffer, calling s.Bytes()
 tokenizing, len(bytes): 50,
tokenizing, bytes is: 2024-03-19T11:21:00.839338492-05:00 stdout P 2024-
OverOrion added the bug (Something isn't working) and needs triage (New item requiring triage) labels on Apr 4, 2024
Contributor

github-actions bot commented Apr 4, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski djaglowski changed the title stanza flush can send partial input [receiver/filelog] Flush can send partial input Apr 4, 2024
Contributor

github-actions bot commented Apr 4, 2024

Pinging code owners for receiver/filelog: @djaglowski. See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski
Member

djaglowski commented Apr 4, 2024

Thanks for investigating and writing this up @OverOrion.

To summarize the expected vs problematic behavior:

Expected: If a file ends with an unterminated log and the flush timer expires, we should flush a token containing the content after the offset, up until either the end of file or max log size.

Problem: When a file ends with an unterminated log and the flush timer expires, we are flushing a token containing the content after the offset, but only up until the initial buffer size.


My understanding of the incorrect behavior does not rely on multiple sequential scanners. I would describe it as follows:

A scanner is created with an initial buffer size.

  • It reads into its buffer until full but doesn't find a complete token.
  • It immediately and automatically enlarges the buffer and tries again.
  • It continues to enlarge the buffer and try again until one of the following happens:
  1. The end of a token is found, in which case the token is returned (correct behavior).
  2. The buffer reaches its max size, in which case the entire buffer is returned (correct behavior).
  3. The EOF is found, in which case no token is returned (correct behavior).
  4. The flush timer expires, in which case the contents of the buffer are returned immediately (incorrect behavior).

The correct behavior would be the same, except that the last two items should be combined:

3a. The EOF is found before the flush timer has expired. No token is returned.
3b. The EOF is found after the flush timer has expired. The entire buffer is returned.

A possible workaround would be if flushing only happened when atEOF was true

I think this is the solution to enable the correct behavior described above. In fact, I believe this behavior previously existed but the nuance was not tested or documented. Let's make sure to include both this time so we don't regress again in the future.
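
For illustration only, a minimal sketch of what gating the flush on atEOF could look like, reusing the flushState shape from the sketch earlier in this issue and the reordering of the two if-blocks discussed below; this is an assumption-laden sketch, not the patch that was eventually merged:

func (s *flushState) Func(split bufio.SplitFunc, period time.Duration) bufio.SplitFunc {
	return func(data []byte, atEOF bool) (int, []byte, error) {
		advance, token, err := split(data, atEOF)
		if advance > 0 || token != nil || err != nil {
			return advance, token, err
		}

		// We're seeing new data so postpone the next flush
		if len(data) > s.LastDataLength {
			s.LastDataChange = time.Now()
			s.LastDataLength = len(data)
		}

		// Only consider the flush timer once there is nothing left to read (atEOF),
		// so the scanner keeps enlarging its buffer while more data is available.
		if atEOF && time.Since(s.LastDataChange) > period {
			s.LastDataChange = time.Now()
			s.LastDataLength = 0
			return len(data), data, nil
		}

		// Ask for more data
		return 0, nil, nil
	}
}

With this gate, the timer only matters once EOF has been reached; in practice a partial final line is flushed only after the file has sat idle for the flush period, typically across polls.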

@atoulme atoulme removed the needs triage New item requiring triage label Apr 5, 2024
@ChrsMark
Member

ChrsMark commented Apr 5, 2024

Thanks @OverOrion for digging into this and sharing the details!

The EOF is found, in which case no token is returned (correct behavior).

I wonder if it's correct to wait for the flush timer to expire before returning the buffer when EOF is found, instead of returning it immediately at EOF 🤔.

If we return at EOF immediately, a bigger force_flush_period would give more time to the Scanner to increase the buffer and consume the remaining message until EOF. Then the complete message would be returned. I have replicated this at ChrsMark@e12a690 and it seems to work with the example tested by @OverOrion and the big ones we used at #31512.

If we want to keep this in order to preserve 3a (from #32170 (comment)) then 3b should be like:

3b: If the flush timer has expired but EOF is not found, skip the flush and try again?

In that case the flush timer actually has more or less no effect, right?

@ChrsMark
Member

ChrsMark commented Apr 5, 2024

Another option could be to move the "We're seeing new data so postpone the next flush" block:

// We're seeing new data so postpone the next flush
if len(data) > s.LastDataLength {
	s.LastDataChange = time.Now()
	s.LastDataLength = len(data)
}

before the timeout check (previous if block).

This if-block only makes sense if the goal is to actually consume the rest of the data, but leaving it at the end does not ensure that the timeout won't be reached on the next call. So it actually has a non-deterministic impact which depends on "timing".

However, this would be effectively identical to continuing to read until EOF.

@djaglowski
Member

I wonder if it's correct to wait for the flush timer to expire before returning the buffer when EOF is found, instead of returning it immediately at EOF 🤔.

This is worth a separate issue to discuss if you want. In short though, I think this makes a lot of assumptions about how files are written which I've never been comfortable making. Does every application & OS write complete logs atomically? Otherwise we're just emitting partial logs which would have been complete if we just waited a little longer. Maybe there's a case to be made but I think this could potentially create big problems.

If we return at EOF immediately, a bigger force_flush_period would give more time to the Scanner to increase the buffer and consume the remaining message until EOF.

If we want to keep this in order to preserve 3a (from #32170 (comment)) then 3b should be like:

3b: If the flush timer has expired but EOF is not found, skip the flush and try again?

In that case the flush timer actually has more or less no effect, right?

The key concept which I maybe didn't articulate well is that flushing is never a necessity prior to EOF. It shouldn't be part of the logic at all. (This is essentially what the bug boils down to, that the flush timer is being applied when it really shouldn't even be considered.) As long as we're not yet at EOF, we should just keep consuming the data rapidly and never look at the flush timer. Once at EOF, it's a relevant consideration.

Typically when flushing is necessary at all, it's not because the timer expires while reading the file. It's because the file is sitting idle and the timer expired in between polls. It can work either way, but it's not intended to be a consideration which is relevant during the normal course of consuming the file.

@djaglowski
Member

This if-block only makes sense if the goal is to actually consume the rest of the data, but leaving it at the end does not ensure that the timeout won't be reached on the next call. So it actually has a non-deterministic impact which depends on "timing".

Not sure I understand what you are suggesting but the purpose of that is to reset the timer because the timer is relative to the last time a log was emitted. In other words, when the final (partial) log in a file is reached, this is basically when the timer should start.

@ChrsMark
Member

ChrsMark commented Apr 5, 2024

This is worth a separate issue to discuss if you want. In short though, I think this makes a lot of assumptions about how files are written which I've never been comfortable making. Does every application & OS write complete logs atomically? Otherwise we're just emitting partial logs which would have been complete if we just waited a little longer. Maybe there's a case to be made but I think this could potentially create big problems.

I see, yeah. It makes sense not to rely on such an assumption.

This if-block only makes sense if the goal is to actually consume the rest of the data, but leaving it at the end does not ensure that the timeout won't be reached on the next call. So it actually has a non-deterministic impact which depends on "timing".

Not sure I understand what you are suggesting but the purpose of that is to reset the timer because the timer is relative to the last time a log was emitted. In other words, when the final (partial) log in a file is reached, this is basically when the timer should start.

My point was mainly that by inverting the order of the final two if-blocks, we ensure that the check for remaining data happens first, and only if there is no remaining data do we proceed to the timeout check. So the flow should be:

...
// We're seeing new data so postpone the next flush
if len(data) > s.LastDataLength {
	s.LastDataChange = time.Now()
	s.LastDataLength = len(data)
}

// Flush timed out
if time.Since(s.LastDataChange) > period {
	s.LastDataChange = time.Now()
	s.LastDataLength = 0
	return len(data), data, nil
}

// Ask for more data
return 0, nil, nil
...

Unless I'm missing something, this looks equivalent to introducing the EOF requirement/check?
I verified this with some manual tests as well.

Overall I believe we are aligned here. We can keep any remaining discussions at #32100.

@djaglowski
Member

My point was mainly that by inverting the order of the final two if-blocks, we ensure that the check for remaining data happens first, and only if there is no remaining data do we proceed to the timeout check.

That makes sense. Thanks for clarifying.

djaglowski added a commit that referenced this issue Apr 23, 2024
**Description:**
Flush could have sent partial input before EOF was reached; this PR fixes it.

**Link to tracking Issue:** #31512, #32170

**Testing:** Added unit test `TestFlushPeriodEOF`

**Documentation:** Added a note to `force_flush_period` option

---------

Signed-off-by: Szilard Parrag <szilard.parrag@axoflow.com>
Co-authored-by: Daniel Jaglowski <jaglows3@gmail.com>
@ChrsMark
Member

Since #32100 was merged I guess we can close this?

@ChrsMark
Member

@crobert-1 since we closed #31512 I think we can close this one as well.

rimitchell pushed a commit to rimitchell/opentelemetry-collector-contrib that referenced this issue May 8, 2024
**Description:**
Flush could have sent partial input before EOF was reached; this PR fixes it.

**Link to tracking Issue:** open-telemetry#31512, open-telemetry#32170

**Testing:** Added unit test `TestFlushPeriodEOF`

**Documentation:** Added a note to `force_flush_period` option

---------

Signed-off-by: Szilard Parrag <szilard.parrag@axoflow.com>
Co-authored-by: Daniel Jaglowski <jaglows3@gmail.com>