Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICE going to failed now causes untyped use of closed network connection error to be returned, breaks existing code #252

Closed
Sean-Der opened this issue Jul 20, 2020 · 10 comments

Comments

@Sean-Der
Copy link
Member

Before ice.Agent never went to failed, so UDP sockets stayed open forever. Now because we close the sockets a user is getting use of closed network connection returned to them.

I feel like a user shouldn't have to worry about this, and should just have to watch PeerConnectionState and ICEConnectionState. During ICE Renegotiations I just wanted to keep sending packets, I don't want to worry about handling errors like this. ICE by design is lossy, so a higher level should handle this.

  • We can leave behavior like it is now (and consider it a known breaking change)
  • We can catch use of closed network connection and discard the error (but return others). This matches previous behavior.
@Sean-Der
Copy link
Member Author

@at-wat @jeremija ^

@Sean-Der Sean-Der changed the title ICE going to failed causes untyped error to be returned ICE going to failed now causes untyped use of closed network connection error to be returned, breaks existing code Jul 20, 2020
@jeremija
Copy link
Member

Now because we close the sockets a user is getting use of closed network connection returned to them.

Where is this error being returned from?

@Sean-Der
Copy link
Member Author

@jeremija candidateBase.writeTo

Before these sockets were only closed during IceAgent.Close

@jeremija
Copy link
Member

jeremija commented Jul 20, 2020

Thanks!

Not sure if this is related, but I wanted to point out that for ICE TCP we'll have to close the unused TCP connections if they are not selected:

When ICE processing for a media stream completes, each agent SHOULD close all TCP connections (that were opened due to this ICE session) except the ones between the candidate pairs selected by ICE.

And reuse existing ones while renegotiating:

9.2. ICE Restarts

If an ICE restart occurs for a media stream with TCP candidate pairs
that have been selected by ICE, the agents MUST NOT close the
connections after the restart. In the offer or answer that causes
the restart, an agent MAY include a simultaneous-open candidate whose
transport address matches the previously selected candidate. If both
agents do this, the result will be a simultaneous-open candidate pair
matching an existing TCP connection. In this case, the agents MUST
NOT attempt to open a new connection (or start new TLS or DTLS-SRTP
procedures). Instead, that existing connection is reused, and STUN
checks are performed.

Once the restart completes, if the selected pair does not match the
previously selected pair, the TCP connection for the previously
selected pair SHOULD be closed by the agent.

@at-wat
Copy link
Member

at-wat commented Jul 20, 2020

I don't have strong opinion on it, but if we catch and discard the error, we should buffer and resend the message if datachannel is set to be reliable.

In any case, ice package should return an error which is parsable from the caller.
At least, it should be

-		return n, fmt.Errorf("failed to send packet: %v", err)
+		return n, fmt.Errorf("failed to send packet: %w", err)

to allow unwrapping.
But would be more nice to define error type to allow both type assertion and unwrapping:

@Sean-Der
Copy link
Member Author

Sean-Der commented Aug 9, 2020

@at-wat sorry missed your response!

The DataChannel already will do a retry, so it doesn't need this error message. So we had loss before, but we just didn't tell the user.

Now we are returning the error, but I just don't think it is useful. The user should be watching the reports from ICE/RTP/SCTP individually. Making them handle the same error in two places is going to be confusing IMO.

@jech
Copy link
Member

jech commented Aug 10, 2020

It is my opinion that a library should return as many error indications as possible: it is easier to ignore an error that the client application is not interested in than to work out on your own whether an error happened. The main reasons I'd be interested in synchronous error indications (as opposed to the asynchronous OnIce... callback) are:

  • it makes debugging easier (why are my packets not going out?);
  • having a timely error indication may improve performance, for example by attempting to resend RTCP feedback eagerly, before the reporting interval expires; by the time the OnIce... callback fires, it is difficult to determine if there was a recent RTCP failure.

It is obviously impossible to provide error indications in all cases, but it is important to avoid false positives: if an error is returned, then the packet definitely did not reach its destination.

Thus, I would suggest:

  • Write and WriteRTP should return an error whenever it can be established cheaply that the packet wasn't shipped out;
  • the returned errors should be exported and documented;
  • the returned errors should wrap the underlying error, and implement net.Error so that the caller can determine if the error is temporary.

@Sean-Der
Copy link
Member Author

This also breaks DataChannels unfortunately. https://github.com/pion/sctp/blob/master/association.go#L450-L454 just checks for an error. Downstream also can't check if the error is temporary, this error is because the socket is closed so it isn't temporary.

@Sean-Der Sean-Der reopened this Nov 13, 2020
@Sean-Der
Copy link
Member Author

I am going to bring back the /v1 behavior for now, but will leave this open. I would love to see this done right, but I don't have the bandwidth to get this done.

Sean-Der added a commit that referenced this issue Nov 13, 2020
Downstream isn't able to distinguish a temporary from a Permanent
error yet.

Relates to #252
@Sean-Der
Copy link
Member Author

Resolving. I am happy with how the API behaves here.

  • If you want reliable communication it should be solved higher up
  • The lifecycle of the ICEAgent is communicated via the callbacks + you can query the state anytime.

cnderrauber added a commit that referenced this issue Oct 18, 2022
In #252 decide to ignore the writeTo failed error with a log,
application can choose callback and query state for the connection
state. But if application keep sending packet during try icerestart
to recover from ICEFailed, the warn log is annoying since the error
is not critical and no need to process. So change it to info.
cnderrauber added a commit that referenced this issue Oct 19, 2022
In #252 decide to ignore the writeTo failed error with a log,
application can choose callback and query state for the connection
state. But if application keep sending packet during try icerestart
to recover from ICEFailed, the warn log is annoying since the error
is not critical and no need to process. So change it to info.
ourwarmhouse added a commit to ourwarmhouse/Smart-Contract---Go that referenced this issue May 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

4 participants