os: panic on short write in fd_unix.go #22522

Closed
sigmonsays opened this issue Nov 1, 2017 · 11 comments

@sigmonsays sigmonsays commented Nov 1, 2017

What version of Go are you using (go version)?

go 1.9

Does this issue reproduce with the latest release?

Reproduced in 1.9. Go 1.8 properly returns a "short write" error instead.

What operating system and processor architecture are you using (go env)?

Linux, x86-64 (linux/amd64)

What did you do?

example code


package main

/*
 * test a write against a fuse file system which returns the wrong number of bytes on a write() syscall
 *
 * use fuse file system from here  https://github.com/terencehonles/fusepy/blob/master/examples/loopback.py#L95
 * - change line to return a different number of bytes than written (return 4)
 *
 */

import (
	"fmt"
	"os"
)

func Error(err error) {
	fmt.Fprintf(os.Stderr, "ERROR: %s\n", err)
	os.Exit(1)
}

func main() {

	data_file := "/tmp/whatever/test.txt"
	f, err := os.Create(data_file)
	if err != nil {
		Error(err)
	}

	data := []byte("A\n")

	// this line panics
	n, err := f.Write(data)
	if err != nil {
		Error(err)
	}

	fmt.Printf("write return %d\n", n)

	err = f.Close()
	if err != nil {
		Error(err)
	}

}

/* output
$ test_corrupt_write
panic: runtime error: slice bounds out of range

goroutine 1 [running]:
internal/poll.(*FD).Write(0xc42007c0f0, 0xc42003bf38, 0x6, 0x20, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:218 +0x380
os.(*File).write(0xc42000c028, 0xc42003bf38, 0x6, 0x20, 0xc42000e1c0, 0x7f1fc7a1a260, 0xc42003bed8)
        /usr/local/go/src/os/file_unix.go:233 +0x4e
os.(*File).Write(0xc42000c028, 0xc42003bf38, 0x6, 0x20, 0x6, 0x20, 0x404614)
        /usr/local/go/src/os/file.go:140 +0x72
main.main()
        /home/sig/go/playground/src/playground/test_corrupt_write/main.go:32 +0xb4

$ go version
go version go1.9 linux/amd64
*/


What did you expect to see?

An error returned instead of a panic.

What did you see instead?

panic

@tv42 tv42 commented Nov 1, 2017

I'm a little weirded out by the "short write" error too. Surely write(2) is allowed to return a byte count less than requested, and Go should loop until it has written all data, or seen an error.
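The retry behaviour described in this comment can be sketched as follows. This is a minimal illustration, not the standard library's code: `rawWrite`, `writeAll`, `chunked`, and `errNoProgress` are hypothetical names, and `rawWrite` only models a write(2)-style call that may legitimately accept fewer bytes than it was offered.

```go
package main

import (
	"errors"
	"fmt"
)

// rawWrite models a write(2)-style call (an assumption for this sketch):
// it may accept fewer bytes than offered and reports how many it took.
type rawWrite func(p []byte) (int, error)

// errNoProgress is a hypothetical error for a call that wrote nothing
// and reported no error, which would otherwise loop forever.
var errNoProgress = errors.New("write made no progress")

// writeAll keeps calling write until all of p is written or an error occurs.
func writeAll(write rawWrite, p []byte) (int, error) {
	total := 0
	for total < len(p) {
		n, err := write(p[total:])
		if n > 0 {
			total += n
		}
		if err != nil {
			return total, err
		}
		if n <= 0 {
			return total, errNoProgress
		}
	}
	return total, nil
}

// chunked returns a rawWrite that accepts at most max bytes per call
// and appends whatever it accepted to *dst.
func chunked(dst *[]byte, max int) rawWrite {
	return func(p []byte) (int, error) {
		if len(p) > max {
			p = p[:max]
		}
		*dst = append(*dst, p...)
		return len(p), nil
	}
}

func main() {
	var sink []byte
	n, err := writeAll(chunked(&sink, 4), []byte("Hello\n"))
	fmt.Printf("n=%d err=%v sink=%q\n", n, err, sink)
}
```

With a 4-byte limit per call, the 6-byte buffer goes out in two calls (4 bytes, then 2). The loop assumes the reported count never exceeds the bytes offered.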

@gbbr changed the title from "panic on short write in fd_unix.go" to "os: panic on short write in fd_unix.go" on Nov 1, 2017

Contributor

@davecheney davecheney commented Nov 1, 2017

This issue looks very similar to #22102

Contributor

@ianlancetaylor ianlancetaylor commented Nov 1, 2017

@sigmonsays Can you show us the strace output so we can see precisely how the write system call was called and what it returned?

I think this can only happen if write returns a value that is larger than its count argument. I don't see any way it could happen if write returns a value that is too small, as that is of course completely normal.

Author

@sigmonsays sigmonsays commented Nov 1, 2017

Snippet of the write system call trace below; I will also attach the full trace.

...
[pid  7435] 08:51:53.560419 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK|O_LARGEFILE <unfinished ...>
[pid  7436] 08:51:53.560445 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0 <unfinished ...>
[pid  7435] 08:51:53.560471 <... fcntl resumed> ) = 0
[pid  7436] 08:51:53.560555 <... pselect6 resumed> ) = 0 (Timeout)
[pid  7436] 08:51:53.560592 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0 <unfinished ...>
[pid  7435] 08:51:53.560632 write(3, "Hello\n", 6 <unfinished ...>
[pid  7436] 08:51:53.560703 <... pselect6 resumed> ) = 0 (Timeout)
[pid  7436] 08:51:53.560733 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0) = 0 (Timeout)
[pid  7436] 08:51:53.560859 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0) = 0 (Timeout)
[pid  7436] 08:51:53.560984 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0) = 0 (Timeout)
[pid  7436] 08:51:53.561108 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0) = 0 (Timeout)
[pid  7436] 08:51:53.561233 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0) = 0 (Timeout)
[pid  7435] 08:51:53.561355 <... write resumed> ) = 4
[pid  7436] 08:51:53.561421 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0 <unfinished ...>
[pid  7435] 08:51:53.561448 write(3, "o\n", 2 <unfinished ...>
[pid  7436] 08:51:53.561528 <... pselect6 resumed> ) = 0 (Timeout)
[pid  7436] 08:51:53.561557 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0) = 0 (Timeout)
[pid  7436] 08:51:53.561682 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0) = 0 (Timeout)
[pid  7436] 08:51:53.561819 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0) = 0 (Timeout)
[pid  7435] 08:51:53.561940 <... write resumed> ) = 4
[pid  7436] 08:51:53.562001 pselect6(0, NULL, NULL, NULL, {0, 20000}, 0 <unfinished ...>
[pid  7435] 08:51:53.562032 pselect6(0, NULL, NULL, NULL, {0, 1000000}, 0 <unfinished ...>
[pid  7436] 08:51:53.562110 <... pselect6 resumed> ) = 0 (Timeout)
[pid  7436] 08:51:53.562140 futex(0x52fa58, FUTEX_WAIT, 0, {60, 0} <unfinished ...>
[pid  7435] 08:51:53.563117 <... pselect6 resumed> ) = 0 (Timeout)
[pid  7435] 08:51:53.563146 pselect6(0, NULL, NULL, NULL, {0, 1000000}, 0) = 0 (Timeout)
[pid  7435] 08:51:53.564270 write(2, "panic: ", 7panic: ) = 7

Author

@sigmonsays sigmonsays commented Nov 1, 2017

Author

@sigmonsays sigmonsays commented Nov 1, 2017

A clarification on the previous strace.txt file, because I changed the example code after capturing it. The panic occurs with both larger and smaller data. The strace was taken with data := []byte("Hello\n"), while the example above uses data := []byte("A\n").

Contributor

@ianlancetaylor ianlancetaylor commented Nov 1, 2017

The strace shows the problem.

[pid  7435] 08:51:53.561448 write(3, "o\n", 2 <unfinished ...>
[pid  7435] 08:51:53.561940 <... write resumed> ) = 4

The program called write with a count of 2, and write returned 4. That should never happen. That is a bug in your FUSE file system code.
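To see why an over-large count turns into "slice bounds out of range", here is a simplified model of a retry loop that trusts the returned count. It is a sketch under stated assumptions, not the actual internal/poll source: `brokenWrite` and `writeLoop` are hypothetical names, and `brokenWrite` stands in for the file system in the strace by always reporting 4 bytes written.

```go
package main

import "fmt"

// brokenWrite is a hypothetical stand-in for the misbehaving file system:
// it reports 4 bytes written no matter how many bytes it was given.
func brokenWrite(p []byte) (int, error) {
	return 4, nil
}

// writeLoop is a simplified model of a loop that advances an offset by the
// reported count and stops only when the offset equals len(p).
func writeLoop(write func([]byte) (int, error), p []byte) (int, error) {
	nn := 0
	for {
		// Once nn has overshot len(p), this slice expression panics.
		n, err := write(p[nn:])
		if n > 0 {
			nn += n
		}
		if nn == len(p) {
			return nn, err
		}
		if err != nil {
			return nn, err
		}
	}
}

func main() {
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r)
		}
	}()
	// Same data as the strace: 6 bytes offered, 4 reported, then 2 offered, 4 reported.
	writeLoop(brokenWrite, []byte("Hello\n"))
}
```

With the 6-byte buffer, the offset goes 0, 4, 8. It never equals 6, so the next iteration slices past the end of the buffer and panics. The 2-byte buffer from the example program overshoots on the first call.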

Contributor

@ianlancetaylor ianlancetaylor commented Nov 1, 2017

Closing as dup of #22102.

Author

@sigmonsays sigmonsays commented Nov 1, 2017

how is a panic acceptable behavior in this case?

Contributor

@ianlancetaylor ianlancetaylor commented Nov 1, 2017

Is it really worth adding a specific test for Write returning a count too large, and returning an error for that? That would add a test and error handling code for a case that never happens.
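For reference, the kind of check under discussion might look roughly like the following in user code. This is only an illustrative sketch: `checkedWrite` and `errBadCount` are hypothetical names, not part of the os package or of any proposed change.

```go
package main

import (
	"errors"
	"fmt"
)

// errBadCount is a hypothetical error for a write that claims to have
// written more bytes than it was given.
var errBadCount = errors.New("write returned a count larger than the buffer")

// checkedWrite calls write once and rejects an impossible count
// instead of passing it on to the caller.
func checkedWrite(write func([]byte) (int, error), p []byte) (int, error) {
	n, err := write(p)
	if n > len(p) {
		// The count cannot be trusted, so report zero bytes and an error.
		return 0, errBadCount
	}
	return n, err
}

func main() {
	// A writer that always claims 4 bytes, as in the strace above.
	alwaysFour := func(p []byte) (int, error) { return 4, nil }

	n, err := checkedWrite(alwaysFour, []byte("A\n"))
	fmt.Printf("n=%d err=%v\n", n, err)
}
```

The cost being weighed in the comment is exactly this extra comparison and error path, applied to a condition that a conforming write never produces.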

Member

@mdempsky mdempsky commented Nov 1, 2017

For what it's worth, POSIX specifically forbids write from returning a value larger than requested:

Upon successful completion, these functions shall return the number of bytes actually written to the file associated with fildes. This number shall never be greater than nbyte.

-- http://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html

@golang golang locked and limited conversation to collaborators Nov 1, 2018