segfault in agent.ok() #80

Closed
maxhawkins opened this issue Jul 16, 2019 · 5 comments

Comments

@maxhawkins

Your environment.

  • Version: v0.5.1
  • OS: macOS

What did you do?

I am running a load test that opens a large number of peer connections concurrently, exchanges SDPs between them, and then closes them.

What happened?

After around 40 tests, pion logged:

ice ERROR: 2019/07/16 13:27:08 error processing checkCandidatesTimeout handler the agent is closed
ice ERROR: 2019/07/16 13:27:08 error processing checkCandidatesTimeout handler the agent is closed

A few seconds later it segfaulted:

ice ERROR: 2019/07/16 13:27:13 error processing checkCandidatesTimeout handler the agent is closed
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1d8 pc=0x4221962]

goroutine 264 [running]:
github.com/pion/ice.(*Agent).ok(0x0, 0xc0004de350, 0x4446880)
        /Users/max/go/pkg/mod/github.com/pion/ice@v0.5.1/agent.go:153 +0x22
github.com/pion/ice.(*Agent).run(0x0, 0xc0004de350, 0x1, 0xc0004de340)
        /Users/max/go/pkg/mod/github.com/pion/ice@v0.5.1/agent.go:560 +0x40
github.com/pion/ice.(*Agent).OnConnectionStateChange(0x0, 0xc0004de340, 0x0, 0x0)
        /Users/max/go/pkg/mod/github.com/pion/ice@v0.5.1/agent.go:396 +0x65
github.com/pion/webrtc/v2.(*ICETransport).Start(0xc0001d7dc0, 0xc000318000, 0xc0001b22ca, 0x10, 0xc0001bc398, 0x20, 0x0, 0xc000065df0, 0x0, 0x0)
        /Users/max/go/pkg/mod/github.com/pion/webrtc/v2@v2.0.24-0.20190715150138-632530bc69a7/icetransport.go:83 +0x120
github.com/pion/webrtc/v2.(*PeerConnection).SetRemoteDescription.func3(0xc0003c8001, 0xc00013b800, 0xc0001b22ca, 0x10, 0xc0001bc398, 0x20, 0xc00067210c, 0x7, 0xc000672114, 0x5f)
        /Users/max/go/pkg/mod/github.com/pion/webrtc/v2@v2.0.24-0.20190715150138-632530bc69a7/peerconnection.go:961 +0x104
created by github.com/pion/webrtc/v2.(*PeerConnection).SetRemoteDescription
        /Users/max/go/pkg/mod/github.com/pion/webrtc/v2@v2.0.24-0.20190715150138-632530bc69a7/peerconnection.go:952 +0xe90
@enobufs
Member

enobufs commented Jul 16, 2019

Would this happen with pion/ice@v0.4.3?

@enobufs
Member

enobufs commented Jul 17, 2019

This frame says the *Agent receiver is nil (the first argument is 0x0):

github.com/pion/ice.(*Agent).ok(0x0, 0xc0004de350, 0x4446880)
        /Users/max/go/pkg/mod/github.com/pion/ice@v0.5.1/agent.go:153 +0x22

which is here:

func (a *Agent) ok() error {
	select {
	case <-a.done: // <----------------------------- here
		return a.getErr()
	default:
	}
	return nil
}

I cannot find any code in pion/ice that sets the agent to nil.
I believe an upper layer is calling a method on a nil agent... (just my guess)
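
For readers unfamiliar with why the panic surfaces inside ok() rather than at the call site: in Go, calling a method on a nil pointer receiver is legal, and the crash only happens when the method first reads a field. The following is a minimal sketch of that behavior. The `agent` type, `errClosed`, and `callOK` are hypothetical stand-ins written for this illustration, not the pion/ice code.

```go
package main

import (
	"errors"
	"fmt"
)

// agent is a hypothetical stand-in for ice.Agent with only the field ok() reads.
type agent struct {
	done chan struct{}
}

// errClosed is an illustrative error value, not the one pion/ice defines.
var errClosed = errors.New("the agent is closed")

// ok mirrors the shape of the method quoted above.
func (a *agent) ok() error {
	select {
	case <-a.done: // reading a.done dereferences a; this panics if a is nil
		return errClosed
	default:
	}
	return nil
}

// callOK invokes ok and converts a panic into an error so the demo can continue.
func callOK(a *agent) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a.ok()
}

func main() {
	live := &agent{done: make(chan struct{})}
	fmt.Println(callOK(live)) // open agent: <nil>

	close(live.done)
	fmt.Println(callOK(live)) // closed agent: the agent is closed

	var missing *agent // nil pointer; the method call itself still compiles and runs
	fmt.Println(callOK(missing))
}
```

The third call reports a recovered nil pointer dereference, which matches the shape of the trace: the nil value was handed in by whoever held the pointer, and ok() is only where it was first dereferenced.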

@enobufs
Member

enobufs commented Jul 17, 2019

Not sure if it is related, but I just saw a CI build error.

I will look into this.

=== RUN   TestRelayOnlyConnection
ice ERROR: 2019/07/17 07:15:29 error processing checkCandidatesTimeout handler the agent is closed
goroutine profile: total 36
12 @ 0x45855f 0x453aaa 0x453096 0x4b3ff5 0x4b67a5 0x4b6781 0x5f6ee0 0x640d61 0x647e43 0x647e24 0x64f276 0x4867d1
#	0x453095	internal/poll.runtime_pollWait+0x55			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/netpoll.go:182
#	0x4b3ff4	internal/poll.(*pollDesc).wait+0xe4			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_poll_runtime.go:87
#	0x4b67a4	internal/poll.(*pollDesc).waitRead+0x144		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_poll_runtime.go:92
#	0x4b6780	internal/poll.(*FD).RawRead+0x120			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_unix.go:534
#	0x5f6edf	net.(*rawConn).Read+0x6f				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/rawconn.go:43
#	0x640d60	golang.org/x/net/internal/socket.(*Conn).recvMsg+0x3d0	/home/travis/gopath/pkg/mod/golang.org/x/net@v0.0.0-20190628185345-da137c7871d7/internal/socket/rawconn_msg.go:31
#	0x647e42	golang.org/x/net/internal/socket.(*Conn).RecvMsg+0x212	/home/travis/gopath/pkg/mod/golang.org/x/net@v0.0.0-20190628185345-da137c7871d7/internal/socket/socket.go:255
#	0x647e23	golang.org/x/net/ipv4.(*payloadHandler).ReadFrom+0x1f3	/home/travis/gopath/pkg/mod/golang.org/x/net@v0.0.0-20190628185345-da137c7871d7/ipv4/payload_cmsg.go:31
#	0x64f275	github.com/pion/mdns.(*Conn).start+0x155		/home/travis/gopath/pkg/mod/github.com/pion/mdns@v0.0.2/conn.go:249
7 @ 0x45855f 0x468eab 0x86c462 0x4867d1
#	0x86c461	github.com/pion/ice.(*Agent).taskLoop+0x481	/home/travis/gopath/src/github.com/pion/ice/agent.go:589
6 @ 0x45855f 0x468eab 0x86c227 0x4867d1
#	0x86c226	github.com/pion/ice.(*Agent).taskLoop+0x246	/home/travis/gopath/src/github.com/pion/ice/agent.go:576
4 @ 0x45855f 0x453aaa 0x453096 0x4b3ff5 0x4b5374 0x4b534a 0x5de2aa 0x6007de 0x5fe636 0x87398a 0x4867d1
#	0x453095	internal/poll.runtime_pollWait+0x55			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/netpoll.go:182
#	0x4b3ff4	internal/poll.(*pollDesc).wait+0xe4			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_poll_runtime.go:87
#	0x4b5373	internal/poll.(*pollDesc).waitRead+0x213		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_poll_runtime.go:92
#	0x4b5349	internal/poll.(*FD).ReadFrom+0x1e9			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_unix.go:219
#	0x5de2a9	net.(*netFD).readFrom+0x79				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/fd_unix.go:208
#	0x6007dd	net.(*UDPConn).readFrom+0x8d				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/udpsock_posix.go:47
#	0x5fe635	net.(*UDPConn).ReadFrom+0x95				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/udpsock.go:121
#	0x873989	github.com/pion/ice.(*candidateBase).recvLoop+0x209	/home/travis/gopath/src/github.com/pion/ice/candidate_base.go:93
1 @ 0x45855f 0x42e969 0x42e93f 0x42e6db 0x544ea3 0x54ae09 0x544794 0x546df4 0x54564c 0x8b69f5 0x45814c 0x4867d1
#	0x544ea2	testing.(*T).Run+0x692		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/testing/testing.go:917
#	0x54ae08	testing.runTests.func1+0xa8	/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/testing/testing.go:1157
#	0x544793	testing.tRunner+0x163		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/testing/testing.go:865
#	0x546df3	testing.runTests+0x523		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/testing/testing.go:1155
#	0x54564b	testing.(*M).Run+0x2eb		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/testing/testing.go:1072
#	0x8b69f4	main.main+0x344			_testmain.go:210
#	0x45814b	runtime.main+0x20b		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/proc.go:200
1 @ 0x45855f 0x42e969 0x42e93f 0x42e6db 0x6b47e4 0x6b4799 0x6b309e 0x87cc62 0x87a555 0x86946d 0x891035 0x544794 0x4867d1
#	0x6b47e3	github.com/pion/turn/internal/client.(*Transaction).WaitForResult+0x823	/home/travis/gopath/pkg/mod/github.com/pion/turn@v1.3.2/internal/client/transaction.go:91
#	0x6b4798	github.com/pion/turn.(*Client).PerformTransaction+0x7d8			/home/travis/gopath/pkg/mod/github.com/pion/turn@v1.3.2/client.go:358
#	0x6b309d	github.com/pion/turn.(*Client).Allocate+0x52d				/home/travis/gopath/pkg/mod/github.com/pion/turn@v1.3.2/client.go:251
#	0x87cc61	github.com/pion/ice.(*Agent).gatherCandidatesRelay+0x511		/home/travis/gopath/src/github.com/pion/ice/gather.go:334
#	0x87a554	github.com/pion/ice.(*Agent).gatherCandidates+0x2a4			/home/travis/gopath/src/github.com/pion/ice/gather.go:143
#	0x86946c	github.com/pion/ice.NewAgent+0x172c					/home/travis/gopath/src/github.com/pion/ice/agent.go:389
#	0x891034	github.com/pion/ice.TestRelayOnlyConnection+0x5e4			/home/travis/gopath/src/github.com/pion/ice/candidate_relay_test.go:58
#	0x544793	testing.tRunner+0x163							/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/testing/testing.go:865
1 @ 0x45855f 0x453aaa 0x453096 0x4b3ff5 0x4b5374 0x4b534a 0x5de2aa 0x6007de 0x5fe636 0x6c215d 0x4867d1
#	0x453095	internal/poll.runtime_pollWait+0x55			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/netpoll.go:182
#	0x4b3ff4	internal/poll.(*pollDesc).wait+0xe4			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_poll_runtime.go:87
#	0x4b5373	internal/poll.(*pollDesc).waitRead+0x213		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_poll_runtime.go:92
#	0x4b5349	internal/poll.(*FD).ReadFrom+0x1e9			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_unix.go:219
#	0x5de2a9	net.(*netFD).readFrom+0x79				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/fd_unix.go:208
#	0x6007dd	net.(*UDPConn).readFrom+0x8d				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/udpsock_posix.go:47
#	0x5fe635	net.(*UDPConn).ReadFrom+0x95				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/udpsock.go:121
#	0x6c215c	github.com/pion/turn.(*Client).Listen.func1+0x9c	/home/travis/gopath/pkg/mod/github.com/pion/turn@v1.3.2/client.go:167
1 @ 0x45855f 0x453aaa 0x453096 0x4b3ff5 0x4b5374 0x4b534a 0x5de2aa 0x6007de 0x5fe636 0x6c2483 0x4867d1
#	0x453095	internal/poll.runtime_pollWait+0x55			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/netpoll.go:182
#	0x4b3ff4	internal/poll.(*pollDesc).wait+0xe4			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_poll_runtime.go:87
#	0x4b5373	internal/poll.(*pollDesc).waitRead+0x213		/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_poll_runtime.go:92
#	0x4b5349	internal/poll.(*FD).ReadFrom+0x1e9			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/internal/poll/fd_unix.go:219
#	0x5de2a9	net.(*netFD).readFrom+0x79				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/fd_unix.go:208
#	0x6007dd	net.(*UDPConn).readFrom+0x8d				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/udpsock_posix.go:47
#	0x5fe635	net.(*UDPConn).ReadFrom+0x95				/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/net/udpsock.go:121
#	0x6c2482	github.com/pion/turn.(*Server).listen.func1+0x92	/home/travis/gopath/pkg/mod/github.com/pion/turn@v1.3.2/server.go:229
1 @ 0x45855f 0x468eab 0x86beaa 0x86e584 0x8a2935 0x86c4a4 0x4867d1
#	0x86bea9	github.com/pion/ice.(*Agent).run+0x189				/home/travis/gopath/src/github.com/pion/ice/agent.go:565
#	0x86e583	github.com/pion/ice.(*Agent).Close+0x103			/home/travis/gopath/src/github.com/pion/ice/agent.go:749
#	0x8a2934	github.com/pion/ice.TestHandlePeerReflexive.func1.1+0xab4	/home/travis/gopath/src/github.com/pion/ice/agent_test.go:279
#	0x86c4a3	github.com/pion/ice.(*Agent).taskLoop+0x4c3			/home/travis/gopath/src/github.com/pion/ice/agent.go:592
1 @ 0x45855f 0x468eab 0x86beaa 0x86e584 0x8a3126 0x86c4a4 0x4867d1
#	0x86bea9	github.com/pion/ice.(*Agent).run+0x189				/home/travis/gopath/src/github.com/pion/ice/agent.go:565
#	0x86e583	github.com/pion/ice.(*Agent).Close+0x103			/home/travis/gopath/src/github.com/pion/ice/agent.go:749
#	0x8a3125	github.com/pion/ice.TestHandlePeerReflexive.func2.1+0x335	/home/travis/gopath/src/github.com/pion/ice/agent_test.go:310
#	0x86c4a3	github.com/pion/ice.(*Agent).taskLoop+0x4c3			/home/travis/gopath/src/github.com/pion/ice/agent.go:592
1 @ 0x5b209e 0x5b1e6a 0x5add1c 0x6c8497 0x4867d1
#	0x5b209d	runtime/pprof.writeRuntimeProfile+0x9d			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/pprof/pprof.go:708
#	0x5b1e69	runtime/pprof.writeGoroutine+0xc9			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/pprof/pprof.go:670
#	0x5add1b	runtime/pprof.(*Profile).WriteTo+0x4fb			/home/travis/.gimme/versions/go1.12.7.linux.amd64/src/runtime/pprof/pprof.go:329
#	0x6c8496	github.com/pion/transport/test.TimeOut.func1+0x96	/home/travis/gopath/pkg/mod/github.com/pion/transport@v0.8.6/test/util.go:18
panic: timeout

@enobufs
Member

enobufs commented Jul 17, 2019

@maxhawkins The build error I saw was caused by a bug in the new pion/ice. Once the fix passes review, I will tag it as v0.5.2. Hopefully you can try it again then! (see #81)

Sean-Der added a commit that referenced this issue Mar 1, 2020
Remove taskChan and make .run just take an Agent-wide mutex and run the
function. This is now a blocking operation, so all channels used to
communicate from it must be buffered.

After this we will slowly remove usage of .run and make things more
thread safe.

Relates to #80, #67, #2
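
The commit message above describes a pattern that can be sketched roughly as follows. This is an illustration under assumptions, not the actual pion/ice change: the `agent` type, its `state` field, and `bumpState` are hypothetical names. It shows `run` taking one agent-wide mutex, and why a channel written from inside `run` needs a buffer once `run` blocks.

```go
package main

import (
	"fmt"
	"sync"
)

// agent is a hypothetical, stripped-down type; it is not the pion/ice Agent.
type agent struct {
	mu    sync.Mutex
	state int
}

// run executes task while holding the agent-wide mutex and blocks until it returns.
func (a *agent) run(task func(a *agent)) {
	a.mu.Lock()
	defer a.mu.Unlock()
	task(a)
}

// bumpState increments state under the lock and reports the new value.
// The result channel has capacity 1, so the send inside the task cannot
// block while the mutex is held, even though nobody is receiving yet.
func (a *agent) bumpState() <-chan int {
	result := make(chan int, 1)
	a.run(func(a *agent) {
		a.state++
		result <- a.state
	})
	return result
}

func main() {
	a := &agent{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-a.bumpState()
		}()
	}
	wg.Wait()
	fmt.Println(<-a.bumpState()) // 100 concurrent increments plus this one: 101
}
```

With an unbuffered `result` channel, the send would wait for a receiver that only starts reading after `run` returns, so the call would hang with the mutex still held.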
@Sean-Der
Member

Sean-Der commented Mar 2, 2020

This seems to be an issue with the caller. Closing.

@Sean-Der Sean-Der closed this as completed Mar 2, 2020