Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix bufio.Scanner: token too long #99

Closed
wants to merge 1 commit into from

Conversation

everesio
Copy link

Description

Replace bufio.NewScanner by bufio.NewReaderSize as the scanner returns "token too long" error when the line is longer than 65536 bytes

Motivation and Context

When function receives and logs long input, the "bufio.Scanner: token too long" error is logged and the request processing is stopped.

  • I have raised an issue to propose this change (required)

How Has This Been Tested?

go test was added. function built with custom image everesio/of-watchdog:0.7.7-fix is not broken any more.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I've read the CONTRIBUTION guide
  • I have signed-off my commits with git commit -s
  • I have added tests to cover my changes.
  • All new and existing tests passed.

Copy link
Member

@LucasRoesler LucasRoesler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @everesio thank you for the patch. I noticed that you did not include a link to an issue. Can you expand your PR description with an example of the error being produced by a function? Ideally, with a link or zip of the function code so that we can manually verify this fix?

Finally, I took a look at the docs for Scanner and you could reuse the scanner implementation by setting the scanner.Buffer(buf, 64*1024) and I believe it would achieve the same effect. See https://golang.org/pkg/bufio/#Scanner.Buffer. Also, according to the documentation, scanner already uses a default buffer of 64*1024 see the docs for scanner.Buffer and here https://golang.org/pkg/bufio/#pkg-constants

A reproducible example would really help us merge this, thanks.

@@ -10,15 +10,21 @@ import (
func bindLoggingPipe(name string, pipe io.Reader, output io.Writer) {
log.Printf("Started logging %s from function.", name)

scanner := bufio.NewScanner(pipe)
reader := bufio.NewReaderSize(pipe, 64*1024)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a comment explaining this number 64*1024 the next person to edit this might not know why it should be that value and not something longer or shorter

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've created a pipeReaderBufSize constant with bufio.MaxScanTokenSize as value, as I wanted to keep the same max buffer size as Scanner. I hope this is less cryptic.

"time"
)

func TestBindLoggingPipe(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't really explain what it is testing, i think the name should be something like TestBindLoggingPipeHandlesLongLines this is much clearer about the intent and if anyone needs to edit it in the future, it will be very clear what the behavior should be. Right now the name just tells us what method is called, but not what behavior is being covered.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test was reworked

defer close(done)
max := 4
for i := 0; i < max; i++ {
_, err := w.Write([]byte(fmt.Sprintf("%s\n", strings.Repeat("x", ssize))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To property test the behavior, shouldn't ssize be larger that the constant you set here https://github.com/openfaas-incubator/of-watchdog/pull/99/files#diff-42b34d1af16e4b2190c38ad0ec0e0f54R13


func TestBindLoggingPipe(t *testing.T) {

r, w := io.Pipe()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both r and w are Closers. For completeness you should call r.Close() and w.Close() in this test

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added defer Close()


select {
case <-done:
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you never read the log content. when this is done you should not return but instead check that the content was written correctly

	buf := new(bytes.Buffer)
	buf.ReadFrom(r)
	logLines := buf.String()
	if logLines != expectedLogs { 
		t.Fatalf("unexpected logs lines, expected %s, got %s", expectedLogs, logLines)
        }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add the content check.

if err != nil {
t.Fatal(err)
}
time.Sleep(duration / time.Duration(int64(max)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this sleep required?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@alexellis
Copy link
Member

/add label: design/review

@derek derek bot added the design/review label Jun 29, 2020
@alexellis
Copy link
Member

/add label: not approved

@derek derek bot added the not approved label Jun 29, 2020
@alexellis
Copy link
Member

My thoughts are the same as what Lucas said initially, we need to understand the use-case here. Thanks for raising an issue, but it's still lacking some details about the problem being solved.

Alex

@everesio everesio force-pushed the token-too-long branch 4 times, most recently from 627bc3e to d9f6d7a Compare June 29, 2020 15:49
@alexellis
Copy link
Member

Please hold off on further work. This PR is not approved and we have an open discussion on the issue.

'Error scanning stderr: bufio.Scanner: token too long'
when a line is longer than 64kB. Subsequent writes to pipe hang.

Signed-off-by: Michal Budzyn <michalbudzyn@gmail.com>
@everesio
Copy link
Author

I opened the #100 and added the test handler.
Use case: ML functions receives json event longer than 64kb. The event is logged.

@everesio
Copy link
Author

Hi @alexellis,

I am a little confused. How do you want to proceed with the PR or solving the issue in other way ?
The problem is use case independent i.e. log line longer than 64kb breaks the service.

Regards,
Michal

@alexellis
Copy link
Member

I'm trying to understand why there is a log line of 64kb. I'll take a look at your issue to see if it's been updated yet.

@everesio
Copy link
Author

everesio commented Aug 7, 2020

I'm trying to understand why there is a log line of 64kb.

  1. Service logs a stack trace
  2. It is not always possible to control log size coming from 3rd party libs

@alexellis alexellis mentioned this pull request Nov 8, 2021
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants