Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go,os: EAGAIN when writing to stdout/stderr during test #58408

Open
adob opened this issue Feb 8, 2023 · 11 comments
Open

cmd/go,os: EAGAIN when writing to stdout/stderr during test #58408

adob opened this issue Feb 8, 2023 · 11 comments
Labels
GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@adob
Copy link
Contributor

adob commented Feb 8, 2023

What version of Go are you using (go version)?

$ go version

go version go1.20 linux/amd64

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env

CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build674872367=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Calls to fmt.Printf() during test fail with /dev/stdout: resource temporarily unavailable when printing long lines after part of the output has been printed. The output is truncated.

This can be worked around by passing the -bench flag to suppress output redirection during test execution.

@bcmills
Copy link
Member

bcmills commented Feb 8, 2023

I'm surprised that a write to stdout would return EAGAIN instead of just blocking.

What was the exact go test command that you ran? (The exact command is important because it determines how go test sets up the output pipes.)

How reproducible is this failure mode?

(CC @ianlancetaylor)

@bcmills bcmills changed the title go test: resource temporarily unavailable error when writing to stdout/stderr during test cmd/go,testing: EAGAIN when writing to stdout/stderr during test Feb 8, 2023
@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. GoCommand cmd/go labels Feb 8, 2023
@bcmills bcmills added this to the Backlog milestone Feb 8, 2023
@bcmills bcmills added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Feb 8, 2023
@bcmills
Copy link
Member

bcmills commented Feb 8, 2023

One more question: what version of Linux (and what container environment, if any) is the test running under?

@bcmills
Copy link
Member

bcmills commented Feb 8, 2023

It looks like os.Pipe ends up setting O_NONBLOCK on the pipe as part of setting up the runtime poller:
https://cs.opensource.google/go/go/+/master:src/os/file_unix.go;l=175-178;drc=0602e2cdbc4b42e90c453e6ae365ad5f13f61c92

In the child process, os.Stdout is constructed using os.NewFile and that explicitly checks whether the descriptor has O_NONBLOCK set:
https://cs.opensource.google/go/go/+/master:src/os/file_unix.go;l=103-105;drc=0602e2cdbc4b42e90c453e6ae365ad5f13f61c92

(*poll.FD).Write explicitly retries EAGAIN errors for pollable files:
https://cs.opensource.google/go/go/+/master:src/internal/poll/fd_unix.go;l=390-394;drc=3c529c53676f809c7df55aac20e8b756462f0583

So the only way I see for an EAGAIN to occur from /dev/stdout is if the descriptor inherited by the child process somehow ends up being marked non-pollable.


I see a path on which that may be possible:

So my hypothesis is that somehow Init is failing for /dev/stdout, the error is being buried, and os.Stdout is being marked as non-pollable even though it is in non-blocking mode. That would cause an EAGAIN error to fail to retry.

@bcmills bcmills changed the title cmd/go,testing: EAGAIN when writing to stdout/stderr during test cmd/go,os: EAGAIN when writing to stdout/stderr during test Feb 8, 2023
@bcmills bcmills added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels Feb 8, 2023
@adob
Copy link
Contributor Author

adob commented Feb 8, 2023

I've managed to narrow this down somewhat:

// print_test.go
package print_test

import (
	"bufio"
	"bytes"
	"fmt"
	"os"
	"os/exec"
	"testing"
)

func TestPrint(t *testing.T) {

	cmd := exec.Command("npx", "hardhat", "run", "--no-compile", "test.js")

	stdout, err := cmd.StdoutPipe()
	check(err)

	cmd.Stderr = os.Stderr

	outReader := bufio.NewReader(stdout)

	check(cmd.Start())

	_, err = outReader.ReadBytes('\n')
	check(err)

	var buf bytes.Buffer

	for i := 0; i < 100_000; i++ {
		fmt.Fprintf(&buf, "%04d\n", i)
	}

	s := buf.String()
	fmt.Fprintf(os.Stderr, "about to write %d bytes\n", len(s))

	if _, err := fmt.Fprint(os.Stderr, s); err != nil {
		panic(err)
	}
}

func check(err error) {
	if err != nil {
		panic(err)
	}
}
// test.js

async function main() {
    console.log('1');
    await delay(1000);
}

main().catch((err) => {
    console.error(err);
    process.exit(1);
});

function delay(time) {
    return new Promise(resolve => setTimeout(resolve, time));
} 

Run with

go test . -v

Observed output

...
12586
12587
12588
12--- FAIL: TestPrint (2.02s)
panic: write /dev/stderr: resource temporarily unavailable [recovered]
	panic: write /dev/stderr: resource temporarily unavailable

goroutine 6 [running]:
...

OS Version: Linux Mint 21 (Vanessa)

uname -r: 5.15.0-56-generic

No containers/sandboxing.

@seankhliao
Copy link
Member

isn't that a race on the stderr resource?
you have a child process and the main one trying to write to it at the same time

panic: write /dev/stderr: resource temporarily unavailable [recovered]
	panic: write /dev/stderr: resource temporarily unavailable

goroutine 6 [running]:
testing.tRunner.func1.2({0x51c1a0, 0xc000092270})
	/home/arccy/sdk/go1.20/src/testing/testing.go:1526 +0x24e
testing.tRunner.func1()
	/home/arccy/sdk/go1.20/src/testing/testing.go:1529 +0x39f
panic({0x51c1a0, 0xc000092270})
	/home/arccy/sdk/go1.20/src/runtime/panic.go:884 +0x213
go.seankhliao.com/testrepo0099_test.TestPrint(0x0?)
	/home/arccy/tmp/testrepo0099/main_test.go:38 +0x40e
testing.tRunner(0xc000007860, 0x542e20)
	/home/arccy/sdk/go1.20/src/testing/testing.go:1576 +0x10b
created by testing.(*T).Run
	/home/arccy/sdk/go1.20/src/testing/testing.go:1629 +0x3ea
node:events:490
      throw er; // Unhandled 'error' event
      ^

Error: write EPIPE
    at afterWriteDispatched (node:internal/stream_base_commons:160:15)
    at writeGeneric (node:internal/stream_base_commons:151:3)
    at Socket._writeGeneric (node:net:923:11)
    at Socket._write (node:net:935:8)
    at writeOrBuffer (node:internal/streams/writable:392:12)
    at _write (node:internal/streams/writable:333:10)
    at Writable.write (node:internal/streams/writable:337:10)
    at Socket.<anonymous> (/home/arccy/.cache/npm/_npx/755986f37193a6d8/node_modules/run/run.js:80:20)
    at Socket.emit (node:events:512:28)
    at addChunk (node:internal/streams/readable:324:12)
Emitted 'error' event on Socket instance at:
    at emitErrorNT (node:internal/streams/destroy:151:8)
    at emitErrorCloseNT (node:internal/streams/destroy:116:3)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
  errno: -32,
  code: 'EPIPE',
  syscall: 'write'
}

Node.js v19.6.0

@ianlancetaylor
Copy link
Contributor

Thanks for the complete test case. Unfortunately, I can't recreate the problem.

My first guess would be that the npx hardhat run puts standard error into non-blocking mode. That could cause this problem. Does the problem still happen if you remove the line cmd.Stderr = os.Stderr ?

@adob
Copy link
Contributor Author

adob commented Feb 9, 2023

It does seem to be the case that nodejs sometimes puts stdout/err into non-blocking mode. Removing the cmd.Stderr = os.Stderr line resolves the problem.

The purpose of this line is to forward error/log messages from the child and make them show up in the terminal. It's not really an issue if the child process and the main one write to stderr at the same time so I wouldn't consider it a race condition.

@bcmills
Copy link
Member

bcmills commented Feb 9, 2023

@ianlancetaylor, it occurs to me that Go probably causes the dual problem itself in os.NewFile and (*os.File).Fd. If I'm reading correctly, if stderr is already non-blocking at program start, if we call Fd on it we will unset non-blocking mode for all processes that share its open file description. Perhaps we should avoid toggling that bit at all if the File was created via NewFile?

@bcmills
Copy link
Member

bcmills commented Feb 9, 2023

For the particular symptom at hand, would it be feasible to either always initialize runtime polling support in NewFile (regardless of the current O_NONBLOCK setting), or lazily initialize polling support if we see EAGAIN for a read or write on the file (since that indicates that the file is somehow now in non-blocking mode)?

@ianlancetaylor
Copy link
Contributor

I think you're right: we do seem to lose track of the fact that the descriptor passed to NewFile was in non-blocking mode. In that case we should perhaps leave it in non-blocking mode if someone calls Fd. I don't see how that could be the problem here, though.

Detecting EAGAIN and starting to poll on the file might be feasible. This is a rare condition, though. Most programs will fail in some way if they share file descriptors and one of the programs makes the descriptor non-blocking. There's nothing special about Go here, it would be straightforward to write a C program that has exactly the same problem.

@adob Given the behavior of nodejs here, I recommend that you not to try share stderr between the Go program and the nodejs program. It is straightforward to add (untested)

    stderr, err := cmd.StderrPipe()
    check(err)
    go func() { io.Copy(os.Stderr, stderr) }()

@bcmills
Copy link
Member

bcmills commented Feb 9, 2023

A simpler workaround might be:

cmd.Stderr = struct{ *os.File }{os.Stderr}

which I think should cause the exec.Cmd to run that io.Copy goroutine itself.

@bcmills bcmills removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Feb 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GoCommand cmd/go NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

4 participants