Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic when adding candidate #589

Closed
m1k1o opened this issue Jun 26, 2023 · 6 comments
Closed

Panic when adding candidate #589

m1k1o opened this issue Jun 26, 2023 · 6 comments
Assignees
Labels
bug Something isn't working

Comments

@m1k1o
Copy link
Member

m1k1o commented Jun 26, 2023

This panic is fairly rare, but in certain (yet unknown) circumstances it is more prevalent.

panic: runtime error: invalid memory address or nil pointer dereference	
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xe1758f]	

goroutine 6592 [running]:	
github.com/pion/ice/v2.(*Agent).addCandidate.func1({0xc0011adf90?, 0xc0011adf48?}, 0xc000004300?)	
	/go/pkg/mod/github.com/pion/ice/v2@v2.3.4/agent.go:798 +0x26f	
github.com/pion/ice/v2.(*Agent).taskLoop(0xc000004300)	
	/go/pkg/mod/github.com/pion/ice/v2@v2.3.4/agent.go:240 +0xc2	
created by github.com/pion/ice/v2.NewAgent	
	/go/pkg/mod/github.com/pion/ice/v2@v2.3.4/agent.go:363 +0xd92
@m1k1o m1k1o added the bug Something isn't working label Jun 26, 2023
@m1k1o m1k1o self-assigned this Jun 26, 2023
@Sean-Der
Copy link
Member

Sean-Der commented Jun 27, 2023

@m1k1o It looks like candidateConn is nil? https://github.com/pion/ice/blob/master/agent.go#L817C1-L818C1

I think it would be worth adding a defensive if candidateConn != nil check

It would be nice to figure out why this is happening though! Would it be possible for you to deploy some more logging + a check around this?

  • What candidates had two duplicates? IP/Port/Type etc..

@m1k1o
Copy link
Member Author

m1k1o commented Jun 27, 2023

From my findings, I see that right before panic, the code is frozen for a couple of seconds (up to 10). Probably because of some TCP connection blocking when write buffer is full. When it finally unblocks, it is flooded with all messages at once and most likely ocurrs some race condition that leads to panic.

Last message seems to be Ignore duplicate candidate: tcp4 host <ip>:0, where the IP is NAT1TO1 of the server. The candidate is announced as candidate:4225724578 1 tcp 1675624447 <ip> 0 typ host tcptype active. I saw recent changes in the code about Active ICE TCP Candidates, but this is v2.3.4 what should predate those changes. I might try to upgrade to latest version to see if it happens again.

@Sean-Der
Copy link
Member

That is a strange one! I think we might be able to reproduce this though. Can you create a TCP client that doesn't disconnect properly. Maybe as simple as connecting from phone then turning on Airplane mode?

I am very confident that Active ICE TCP changes will not have fixed this. This is the passive code.

@m1k1o
Copy link
Member Author

m1k1o commented Jul 12, 2023

Simply connecting using TCP on mobile and truning on Airplane Mode did not cut it:

6:09PM WRN write error: write tcp 172.17.0.3:52100->192.168.1.186:58735: use of closed network connection module=webrtc submodule=pion subsystem=ice-tcp
6:09PM INF error reading streaming packet: read tcp 172.17.0.3:52100->192.168.1.186:58735: use of closed network connection module=webrtc submodule=pion subsystem=ice-tcp
6:09PM WRN error closing connection: close tcp 172.17.0.3:52100->192.168.1.186:58735: use of closed network connection module=webrtc submodule=pion subsystem=ice-tcp
6:09PM WRN Failed to read from candidate tcp4 host 192.168.1.38:52100: io: read/write on closed pipe module=webrtc peer_id=1 session_id=user submodule=pion subsystem=ice
6:09PM WRN Failed to discover mDNS candidate 14e7f6f2-8a05-4212-8024-e43a6032d72a.local: mDNS: connection is closed module=webrtc peer_id=1 session_id=user submodule=pion subsystem=ice

image

I traced it back to this commit, that added connConfigs = append(connConfigs, connConfig{nil, 0, TCPTypeActive}), where nil was passed down to addCandidate.

image

It was since then reverted by and changed by you. Therefore I believe, upgrading to the latest should fix the issue.

Seems like it was not reproducible because under usual circumstances it would be first entry in candidates list, and therefore not marked as duplicate. Only if my client had 2 IP addresses, and the second one should be used, it panicked reproducibly.

@m1k1o
Copy link
Member Author

m1k1o commented Jul 12, 2023

After upgrading pion/ice to v2.3.9 i can confirm, that it does not panic anymore.

But actually I am not able to connect using UDPMux or TCPMux (with NAT1TO1). Only using ephemeral UDP the connection is possible. It has something to to again with the fact, that my server has 2 interfaces (first for the Internet, other for Management) and I am connecting from Management. After disabling the other Interface (or removing NAT1TO1), it works like a charm.

This is issue on its own, so maybe it should be extracted and this one closed.

@Sean-Der
Copy link
Member

Closing, no work to be done here!

@Sean-Der Sean-Der closed this as not planned Won't fix, can't repro, duplicate, stale Aug 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

No branches or pull requests

2 participants