gotest fails in src/pkg/net #271

Closed
gopherbot opened this issue Nov 19, 2009 · 15 comments

Comments

@gopherbot
Contributor

by wjosephson:

Before filing a bug, please check whether it has been fixed since
the latest release: run "hg pull -u" and retry what you did to
reproduce the problem.  Thanks.

What steps will reproduce the problem?
1. Build go
2. Run gotest in src/pkg/net
3. Once in a while, it will shoot itself in the head

What is the expected output? What do you see instead?

Test program crashes; see attached file.

What is your $GOOS?  $GOARCH?

FreeBSD/386 with two physical processors

Which revision are you sync'ed to?  (hg log -l 1)
changeset:   4150:cb559bd8a773

Please provide any additional information below.

Unfortunately, the bug does not always manifest itself.
It appears to be timing or scheduling related.  I haven't
yet been able to pin it down to a particular revision.

Attachments:

  1. trace.txt (2784 bytes)
@gopherbot
Contributor Author

Comment 1 by wjosephson:

The bug appears to manifest itself at least in net.TestUnixDatagramServer.

@rsc
Contributor

rsc commented Nov 20, 2009

Comment 2:

Devon- I'm sure there's a real bug here too, but separately
it looks like the signal handler is not getting the right
pointer for the register set, or is confused about what the
register set looks like:
eax     0x33
ebx     0xc
ecx     0x808d0e4
edx     0x4
edi     0xf883c189
esi     0xb9155b18
ebp     0x80dc50c
esp     0x20002
eip     0x3b
eflags  0x10002
cs      0x280
fs      0x0
gs      0x0
Most of those registers are implausible. 
cs cannot be 0x280, for example.
I suspect that in freebsd/386/signal.c, the last line here:
    uc = context;
    mc = &uc->uc_mcontext;
    sc = (Sigcontext*)mc;   // same layout, more convenient names
is not correct and the code should be using mc directly.
That line is from the Linux version, where mcontext
is just a big array.
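
One way to see why a value like 0x280 is implausible for cs (an interpretation, not something the comment above spells out): an x86 segment selector keeps its requested privilege level in the low two bits, and code running in user mode uses privilege level 3. A minimal sketch in C that decodes a selector; the helper names are hypothetical and not part of the Go runtime:

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Requested privilege level: the low two bits of the selector. */
static unsigned selector_rpl(uint16_t sel) { return sel & 0x3u; }

/* Table indicator bit: 0 selects the GDT, 1 selects the LDT. */
static unsigned selector_uses_ldt(uint16_t sel) { return (sel >> 2) & 0x1u; }

/* Descriptor table index: the upper 13 bits. */
static unsigned selector_index(uint16_t sel) { return sel >> 3; }

/* A code selector seen from a user-mode process should request ring 3. */
static int plausible_user_cs(uint16_t sel) { return selector_rpl(sel) == 3; }
```

With the values from this thread, plausible_user_cs(0x280) is 0 because the low two bits are zero, while plausible_user_cs(0x33), the cs reported in the later dump, is 1 (index 6, privilege level 3).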

Owner changed to r...@golang.org.

Status changed to Accepted.

@dhobsd
Contributor

dhobsd commented Nov 20, 2009

Comment 3:

You're right. The comments in FreeBSD around sigcontext and mcontext_t imply they
should be interchangeable, but they are in fact different... sigcontext has the
sc_mask up in front. I guess this should do it for 386, though amd64 has the same
problem.
If you could reproduce with this, I think the traceback would be much more accurate.
(I tried, but I don't have a multi-core i386 machine running FreeBSD)
--dho

Attachments:

  1. signal.diff (2377 bytes)
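
The effect of the extra sc_mask field can be sketched with two trimmed model structs. This is an illustration under stated assumptions, not the real FreeBSD headers or the contents of signal.diff: the type names, the field order, the four-word mask, and the sample values are all assumptions chosen to show the mechanism. Viewing the mcontext bytes through the sigcontext layout makes every named register read a slot four words too late:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Trimmed, hypothetical model of the register area of mcontext. */
typedef struct {
    uint32_t onstack, gs, fs, es, ds;
    uint32_t edi, esi, ebp, isp, ebx, edx, ecx, eax;
    uint32_t trapno, err, eip, cs, eflags, esp, ss;
    uint32_t len, fpformat, ownedfp, pad;
} ModelMcontext;

/* Same registers, but with a signal mask (assumed four words) in front. */
typedef struct {
    uint32_t mask[4];
    uint32_t onstack, gs, fs, es, ds;
    uint32_t edi, esi, ebp, isp, ebx, edx, ecx, eax;
    uint32_t trapno, err, eip, cs, eflags, esp, ss;
} ModelSigcontext;

static_assert(sizeof(ModelMcontext) == sizeof(ModelSigcontext),
              "model structs must cover the same number of bytes");

/* Emulates the (Sigcontext*)mc cast by copying the raw bytes across. */
static ModelSigcontext view_as_sigcontext(const ModelMcontext *mc) {
    ModelSigcontext sc;
    memcpy(&sc, mc, sizeof sc);
    return sc;
}

/* How many bytes later each register is read through the wrong layout. */
static size_t register_shift_bytes(void) {
    return offsetof(ModelSigcontext, eax) - offsetof(ModelMcontext, eax);
}
```

In this model, setting cs to 0x33 and len to 0x280 in a ModelMcontext and then calling view_as_sigcontext yields eax == 0x33 and cs == 0x280, the same pattern as the first register dump in this thread. That is consistent with the diagnosis above but does not prove that the real structures line up exactly this way.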

@gopherbot
Contributor Author

Comment 4 by wjosephson:

I haven't been able to reproduce the bug reliably yet.
Any reason not to apply the patch and its equivalent
to amd64 now?

@dhobsd
Contributor

dhobsd commented Nov 20, 2009

Comment 5:

There's not really any reason that couldn't go in (other than needing to wait for
Russ to commit), but if I can get a new traceback, I may be able to fix the bug and
get that in the same CL. As Russ said, the registers are largely borked, and this is
probably because the top few fields are actually getting set to the sigmask, and
pushing the rest of the data into the wrong places.
--dho

@gopherbot
Contributor Author

Comment 6 by wjosephson:

The thing is, I can't reproduce the bug anymore.

@gopherbot
Contributor Author

Comment 7 by wjosephson:

I take that back; sorry for the noise.
=== RUN  net.TestTCPServer
SIGSEGV: segmentation violation
Faulting address: 0x0
PC=0xb90e7591
0xb90e7591 unknown pc
goroutine 12 [1]:
gosched+0x48 /array1/wkj/src/go/src/pkg/runtime/proc.c:521
    gosched()
runtime·exitsyscall+0x88 /array1/wkj/src/go/src/pkg/runtime/proc.c:582
    runtime·exitsyscall()
syscall·Syscall+0x48 /array1/wkj/src/go/src/pkg/syscall/asm_freebsd_386.s:35
    syscall·Syscall()
syscall·Read+0x64 /array1/wkj/src/go/src/pkg/syscall/zsyscall_freebsd_386.go:463
    syscall·Read(0xb, 0xb91b6000, 0x400, 0x400, 0x8, ...)
os·*File·Read+0x6a /array1/wkj/src/go/src/pkg/os/file.go:118
    os·*File·Read(0xb91257a0, 0xb91b6000, 0x400, 0x400, 0xfa, ...)
net·*netFD·Read+0x141 /array1/wkj/src/go/src/pkg/net/fd.go:380
    net·*netFD·Read(0xb90a4850, 0xb91b6000, 0x400, 0x400, 0xb91a7660, ...)
net·*TCPConn·Read+0x78 /array1/wkj/src/go/src/pkg/net/tcpsock.go:92
    net·*TCPConn·Read(0xb91b3420, 0xb91b6000, 0x400, 0x400, 0x0, ...)
net·runEcho+0x67 /array1/wkj/src/go/src/pkg/net/server_test.go:19
    net·runEcho(0xb91b41a0, 0xb91b3420, 0xb9088e40, 0xb91b41a0)
goexit /array1/wkj/src/go/src/pkg/runtime/proc.c:135
    goexit()
0xb91b41a0 unknown pc
goroutine 11 [4]:
gosched+0x48 /array1/wkj/src/go/src/pkg/runtime/proc.c:521
    gosched()
chanrecv+0x2d3 /array1/wkj/src/go/src/pkg/runtime/chan.c:319
    chanrecv(0xb9088e40, 0xb90d5048, 0x0, 0xb90d3fd0)
runtime·chanrecv1+0x4e /array1/wkj/src/go/src/pkg/runtime/chan.c:415
    runtime·chanrecv1(0xb9088e40, 0xb91b3420)
net·runServe+0x1d1 /array1/wkj/src/go/src/pkg/net/server_test.go:42
    net·runServe(0xb9187c00, 0x80991c4, 0x3, 0x80992d0, 0x2, ...)
goexit /array1/wkj/src/go/src/pkg/runtime/proc.c:135
    goexit()
0xb9187c00 unknown pc
goroutine 3 [3]:
runtime·entersyscall+0x60 /array1/wkj/src/go/src/pkg/runtime/proc.c:545
    runtime·entersyscall()
syscall·Syscall6+0x5 /array1/wkj/src/go/src/pkg/syscall/asm_freebsd_386.s:39
    syscall·Syscall6()
syscall·kevent+0x58 /array1/wkj/src/go/src/pkg/syscall/zsyscall_freebsd_386.go:102
    syscall·kevent(0x6, 0x0, 0x0, 0xb90a5004, 0xa, ...)
syscall·Kevent+0xa2 /array1/wkj/src/go/src/pkg/syscall/syscall_freebsd.go:377
    syscall·Kevent(0x6, 0x0, 0x0, 0x0, 0xb90a5004, ...)
net·*pollster·WaitFD+0x112 /array1/wkj/src/go/src/pkg/net/fd_freebsd.go:83
    net·*pollster·WaitFD(0xb90a5000, 0x0, 0x0, 0xb91b5000, 0x72, ...)
net·*pollServer·Run+0xc3 /array1/wkj/src/go/src/pkg/net/fd.go:242
    net·*pollServer·Run(0xb9087870, 0x80b8d18)
goexit /array1/wkj/src/go/src/pkg/runtime/proc.c:135
    goexit()
0xb9087870 unknown pc
goroutine 1 [4]:
gosched+0x48 /array1/wkj/src/go/src/pkg/runtime/proc.c:521
    gosched()
chanrecv+0x2d3 /array1/wkj/src/go/src/pkg/runtime/chan.c:319
    chanrecv(0xb91a7390, 0xb908af8c, 0x0, 0xb908c0d8)
runtime·chanrecv1+0x4e /array1/wkj/src/go/src/pkg/runtime/chan.c:415
    runtime·chanrecv1(0xb91a7390, 0x80d2504)
testing·Main+0x25f /array1/wkj/src/go/src/pkg/testing/testing.go:159
    testing·Main(0x80d24c8, 0xb)
main·main+0x29 /array1/wkj/src/go/src/pkg/net/_testmain.go:24
    main·main()
mainstart+0xf /array1/wkj/src/go/src/pkg/runtime/386/asm.s:81
    mainstart()
goexit /array1/wkj/src/go/src/pkg/runtime/proc.c:135
    goexit()
eax     0x0
ebx     0xb9122124
ecx     0x3
edx     0x1
edi     0xb908beb0
esi     0xb908beb4
ebp     0x9
esp     0xb90e7544
eip     0xb90e7591
eflags  0x10282
cs      0x33
fs      0x0
gs      0x0
gotest: line 164: 20465 Trace/BPT trap: 5       (core dumped) $E ./$O.out "$@"

@dhobsd
Contributor

dhobsd commented Nov 20, 2009

Comment 8:

Oh, that's weird. Well in that case http://golang.org/cl/156113

@dhobsd
Contributor

dhobsd commented Nov 20, 2009

Comment 9:

Hah, pre-empted. I'm guessing my syscall code (largely copied from elsewhere) is just
bad. I just copied that half-assedly, so I should probably revisit that. I'll have
something soonish.

@dhobsd
Contributor

dhobsd commented Nov 20, 2009

Comment 10:

In the meantime, running ktrace -tc 8.out followed by kdump to find the syscall
that's producing this would probably be worthwhile.

@gopherbot
Contributor Author

Comment 11 by wjosephson:

The error seems to be more easily reproduced under ktrace.
See attached kdump.

Attachments:

  1. trace.txt (3667 bytes)
  2. ktrace.txt (485922 bytes)

@dhobsd
Contributor

dhobsd commented Nov 20, 2009

Comment 12:

Definitely timing related. I've seen the EUNAVAIL on my current i386 test box, but it
doesn't cause the process to commit harakiri, which I suspect is due to it executing
serially since it's a single processor. I'm going to try to get access to a multicore
FreeBSD/i386 install and see if I can figure out which thread it is specifically --
the most obvious ones in the trace don't seem to be very interesting. Though it seems
curious that SI is 0x1.
--dho

@dhobsd
Contributor

dhobsd commented Nov 20, 2009

Comment 13:

er, s/EUNAVAIL/EAGAIN/.
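
For context on the error code named here: EAGAIN is what a POSIX read reports when a non-blocking descriptor has no data ready, and callers normally handle it by waiting and retrying. Whether that is the exact path involved in this crash is not established in the thread. A minimal sketch in C, using a hypothetical helper and a pipe as a stand-in for the test's sockets, that provokes the error:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: reads from an empty non-blocking pipe and returns
 * the resulting errno. Returns -1 if setup fails and 0 if the read
 * unexpectedly succeeds.
 */
static int errno_from_empty_nonblocking_read(void) {
    int fds[2];
    char buf[16];
    int result = 0;

    if (pipe(fds) != 0)
        return -1;

    /* Switch the read end to non-blocking mode, keeping its other flags. */
    int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        result = -1;
    } else if (read(fds[0], buf, sizeof buf) == -1) {
        /* Nothing has been written, so the read fails instead of blocking. */
        result = errno;
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Comparing the return value against EAGAIN shows the error is an ordinary "try again later" signal rather than a failure in its own right.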

@dhobsd
Contributor

dhobsd commented Dec 1, 2009

Comment 14:

Can you test http://golang.org/cl/163052 with this? I'm guessing this is
actually related to #321 and #360 after re-reviewing the kdump output, and I still
don't have a FreeBSD/i386 SMP machine.

@rsc
Contributor

rsc commented Dec 2, 2009

Comment 15:

This issue was closed by revision eb16346.

Status changed to Fixed.

Merged into issue #-.

@gopherbot gopherbot added the fixed label Dec 2, 2009
@golang golang locked and limited conversation to collaborators Jun 24, 2016
@rsc rsc removed their assignment Jun 22, 2022
This issue was closed.