net: Dialing IPv6 with timeout returns inconsistent error #44543

Open
lmb opened this issue Feb 23, 2021 · 0 comments

What version of Go are you using (go version)?

$ go version
go version go1.16 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

Linux on amd64, 5.8.0-43-generic #49-Ubuntu

What did you do?

Set up a network namespace with an IPv6 prefix on lo.

unshare -Urn
ip link set dev lo up
ip addr add fd::/64 dev lo # NB: no local route

Run the following test:

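import (
	"errors"
	"net"
	"testing"
	"time"

	"golang.org/x/sys/unix"
)

func TestRuntimeDial(t *testing.T) {
	dialer := net.Dialer{
		Timeout: 1 * time.Second,
	}

	// fd::1 has no route in the namespace, so the expected error is ENETUNREACH.
	_, err := dialer.Dial("tcp6", "[fd::1]:1234")
	if !errors.Is(err, unix.ENETUNREACH) {
		t.Fatal("wrong error", err)
	}
}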

Changing the timeout to 2 * time.Second makes the test pass. Here is a trace for a one-second timeout:

[pid 93030] epoll_ctl(3, EPOLL_CTL_ADD, 4, {EPOLLIN, {u32=7312728, u64=7312728}}) = 0
[pid 93034] epoll_pwait(3, === RUN   TestRuntimeDial
[{EPOLLIN, {u32=7312728, u64=7312728}}], 128, 599999, NULL, 7) = 1
[pid 93034] epoll_pwait(3,  <unfinished ...>
[pid 93030] connect(6, {sa_family=AF_INET6, sin6_port=htons(1234), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fd::1", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
[pid 93030] epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2358718232, u64=140160026488600}}) = 0
[pid 93034] <... epoll_pwait resumed>[], 128, 999, NULL, 7) = 0
[pid 93034] epoll_pwait(3, [], 128, 0, NULL, 0) = 0
[pid 93034] epoll_pwait(3, [], 128, 1, NULL, 7) = 0
[pid 93034] epoll_pwait(3, [], 128, 0, NULL, 0) = 0
[pid 93034] epoll_ctl(3, EPOLL_CTL_DEL, 6, 0xc0000b1874 <unfinished ...>
[pid 93035] epoll_pwait(3,  <unfinished ...>
[pid 93034] <... epoll_ctl resumed>)    = 0
    netns_test.go:82: wrong error dial tcp6 [fd::1]:1234: i/o timeout
--- FAIL: TestRuntimeDial (1.00s)
FAIL
[pid 93035] <... epoll_pwait resumed> <unfinished ...>) = ?
[pid 93036] +++ exited with 1 +++
[pid 93035] +++ exited with 1 +++
[pid 93034] +++ exited with 1 +++
[pid 93033] +++ exited with 1 +++
[pid 93032] +++ exited with 1 +++
[pid 93031] +++ exited with 1 +++
+++ exited with 1 +++

And here is one for a two-second timeout:

[pid 93303] epoll_ctl(3, EPOLL_CTL_ADD, 4, {EPOLLIN, {u32=7312728, u64=7312728}}) = 0
[pid 93305] epoll_pwait(3, === RUN   TestRuntimeDial
[{EPOLLIN, {u32=7312728, u64=7312728}}], 128, 599999, NULL, 7) = 1
[pid 93305] epoll_pwait(3,  <unfinished ...>
[pid 93303] connect(6, {sa_family=AF_INET6, sin6_port=htons(1234), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "fd::1", &sin6_addr), sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
[pid 93303] epoll_ctl(3, EPOLL_CTL_ADD, 6, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2103566104, u64=139773224279832}}) = 0
[pid 93305] <... epoll_pwait resumed>[{EPOLLIN|EPOLLOUT|EPOLLERR|EPOLLHUP|EPOLLRDHUP, {u32=2103566104, u64=139773224279832}}], 128, 1999, NULL, 7) = 1
[pid 93305] getsockopt(6, SOL_SOCKET, SO_ERROR, [ENETUNREACH], [4]) = 0
[pid 93305] epoll_ctl(3, EPOLL_CTL_DEL, 6, 0xc0000b1874) = 0
[pid 93307] epoll_pwait(3, [], 128, 0, NULL, 0) = 0
[pid 93307] epoll_pwait(3, --- PASS: TestRuntimeDial (1.01s)
PASS
 <unfinished ...>) = ?
[pid 93308] +++ exited with 0 +++
[pid 93306] +++ exited with 0 +++
[pid 93304] +++ exited with 0 +++
[pid 93307] +++ exited with 0 +++
[pid 93305] +++ exited with 0 +++
+++ exited with 0 +++

Note that both runs finish after almost exactly one second (1.00s and 1.01s), so maybe the Dialer timeout races with an internal timeout?

What did you expect to see?

Dial returns the same error in both cases: either a timeout or ENETUNREACH.

What did you see instead?

Dial returns ENETUNREACH with the two-second timeout, but an i/o timeout error with the one-second timeout.
