refactor: Simplify the sdp sending timeout protocol #4534

Merged
merged 6 commits into from Sep 13, 2018


atomrc
Contributor

@atomrc atomrc commented Sep 7, 2018

Previous behavior:

  • waiting 5 seconds for ICE candidate gathering
  • after 5 seconds, if the SDP contains relay ICE candidates we send the SDP; otherwise we wait 1 second and repeat this check for as long as we don't have relay candidates

New behavior:

  • we only wait 1 second for the ICE candidate gathering (and loop until we have a bunch of relay candidates in the SDP).

This PR also adds a log with the types of candidates we have gathered
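
For illustration, the new behavior could be sketched roughly as follows. This is a minimal sketch, not the code from this PR: `GATHERING_TIMEOUT_MS`, `MIN_RELAY_CANDIDATES`, `countRelayCandidates` and `scheduleSdpSending` are hypothetical names, the threshold of one relay candidate is an assumption, and the SDP is treated as plain text whose candidate lines carry a `typ <type>` field.

```javascript
// Minimal sketch with hypothetical names; not the implementation from this PR.
const GATHERING_TIMEOUT_MS = 1000; // the 1 second wait described above
const MIN_RELAY_CANDIDATES = 1; // assumed threshold

// Counts SDP candidate lines ("a=candidate:...") whose type is "relay".
function countRelayCandidates(sdp) {
  return sdp
    .split(/\r?\n/)
    .filter(line => line.startsWith('a=candidate:') && / typ relay( |$)/.test(line)).length;
}

// Sends the SDP once it holds enough relay candidates; otherwise waits again.
// `schedule` defaults to setTimeout and is injectable for testing.
function scheduleSdpSending(getLocalSdp, sendSdp, schedule = setTimeout) {
  schedule(() => {
    const sdp = getLocalSdp();
    if (countRelayCandidates(sdp) >= MIN_RELAY_CANDIDATES) {
      sendSdp(sdp);
      return;
    }
    scheduleSdpSending(getLocalSdp, sendSdp, schedule);
  }, GATHERING_TIMEOUT_MS);
}
```

Note that this sketch loops indefinitely if no relay candidate ever shows up; it has no upper bound on the number of retries.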

@atomrc atomrc force-pushed the refactor/sdp_sending_timeout branch from e81d5ea to 9a653f1 Compare September 7, 2018 12:19
@gregor
Contributor

gregor commented Sep 11, 2018

@atomrc The issue with this is that sending the SDP after a 5 second timeout if gathering did not complete is a behavior that is agreed upon and synched with the AVS team.

// @z-dule @c-g-owen

@atomrc
Contributor Author

atomrc commented Sep 11, 2018

I am actually not sure what the link with AVS is.

To me, as long as the SDP contains some relay servers, AVS should be happy, am I wrong?
On Chrome, for example, gathering almost never takes more than 500ms, and the connection still works.
What is the difference here?

@gregor
Contributor

gregor commented Sep 11, 2018

The link with AVS is that we want to align the behavior as much as possible. The whole calling protocol is timeout driven and clients are supposed to behave the same.

You are not wrong in your assumption that the call "should" work. Firefox is just really slow in gathering the ICE candidate routes. It "should" be as fast as Chrome, but it is not, and it gathers more routes. There is no point in having 15 routes in the SDP. That's why they warn against the use of more than 2 TURN servers and strongly advise against the use of 5 or more TURN servers.

The slowness is exactly the reason why we have the 5 second timeout BUT still check that there is a TURN based candidate in the SDP at that point. If we want to change the behavior we will need to align it with AVS in order not to let the protocol and client behavior diverge down the line.

@atomrc
Contributor Author

atomrc commented Sep 11, 2018

Also, I ran a bunch of tests last week and discovered that Firefox finds all the candidates very quickly every time (actually as quickly as Chrome, or just a bit slower), but then takes a looooottt of time to confirm it (sending the null iceCandidate).

The timeline looks like this:

  • couple of ms
  • found candidate 1
  • couple of ms
  • found candidate 2
  • couple of ms
  • found candidate 3
  • [...]
  • found candidate > 10 (by that point something like 500ms has passed)
  • nothing happens for 13 seconds
  • send the null candidate

So two things:

  • in all the cases I tried I always had all the candidates in less than 1 second;
  • if no relays are there we still have a safety net that relaunches the timeout (we can even be safer here and check that we have more than, say, 3 relays ...)

@gregor
Contributor

gregor commented Sep 11, 2018

@thomas But you were always on a decent network. Gathering can and will take much longer under other network conditions. Just tether your computer to a mobile device or similar. We need an approach that suits all environments. That's where the 5 second timeout from AVS on mobile stems from.

I fully agree that we can and should decrease it, but we have to be clever about checking that the SDP contains not just any TURN candidate but the ones we want it to contain, since we do not have subsequent ICE candidate trickling in place.

@atomrc
Contributor Author

atomrc commented Sep 12, 2018

Bug submitted to Mozilla for the 13 seconds timeout https://bugzilla.mozilla.org/show_bug.cgi?id=1490672

@atomrc atomrc force-pushed the refactor/sdp_sending_timeout branch 2 times, most recently from d196974 to 5603133 Compare September 12, 2018 14:43
const logMessage = `No relay ICE candidates in local SDP. Timeout reset\n${iceCandidates}`;
this.callLogger.warn(logMessage, iceCandidates);
return this._setSendSdpTimeout(false);
const MIN_NUMBER_OF_RELAY_SERVERS = 2;
Contributor

Let's also move this to the config at the top of the file

if (!typeMatches) {
return types;
}
const candidateType = typeMatches[1];
Contributor

const [, candidateType] = typeMatches;
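
As a small illustration of the suggested destructuring, the sketch below extracts the candidate type from a single candidate line. `getCandidateType` and the regular expression are assumptions made for this example; the actual pattern used in the PR is not shown in this thread.

```javascript
// Hypothetical helper; the regular expression is an assumed candidate line format.
function getCandidateType(candidateLine) {
  const typeMatches = candidateLine.match(/ typ (\w+)/);
  if (!typeMatches) {
    return undefined;
  }
  // Skip index 0 (the full match) and bind the first capture group.
  const [, candidateType] = typeMatches;
  return candidateType;
}
```

The leading comma skips the full match, so `candidateType` receives the same value as `typeMatches[1]`.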

return types;
}
const candidateType = typeMatches[1];
types[candidateType] = types[candidateType] ? types[candidateType] + 1 : 1;
Contributor

@gregor gregor Sep 12, 2018

No need for a ternary:

types[candidateType] = types[candidateType] + 1 || 1; should give you the same result, as undefined + 1 is NaN, which is not truthy.
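
To make the equivalence concrete, here is a minimal side-by-side sketch of both forms. The function names `countWithTernary` and `countWithOr` are hypothetical and only exist for this comparison.

```javascript
// Original form: explicit check for an existing counter.
function countWithTernary(candidateTypes) {
  return candidateTypes.reduce((types, candidateType) => {
    types[candidateType] = types[candidateType] ? types[candidateType] + 1 : 1;
    return types;
  }, {});
}

// Suggested form: undefined + 1 is NaN, NaN is falsy, so `|| 1` takes over.
function countWithOr(candidateTypes) {
  return candidateTypes.reduce((types, candidateType) => {
    types[candidateType] = types[candidateType] + 1 || 1;
    return types;
  }, {});
}
```

Both forms only ever store positive integers, so neither has to deal with a counter that is `0`.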

Contributor Author

Ok works, but feels a little cryptic...

const iceCandidateTypesLog = Object.keys(iceCandidateTypes)
.map(candidateType => {
return `${iceCandidateTypes[candidateType]} ${candidateType}`;
})
Contributor

@gregor gregor Sep 12, 2018

Single line?

.map(candidateType => `${iceCandidateTypes[candidateType]} ${candidateType}`)
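
A self-contained sketch of the single-line version follows. The quoted diff ends after the `.map()` call, so the `.join(', ')` step and the wrapper function `formatCandidateTypes` are assumptions added to make the example complete.

```javascript
// Hypothetical wrapper; the join separator is an assumption.
function formatCandidateTypes(iceCandidateTypes) {
  return Object.keys(iceCandidateTypes)
    .map(candidateType => `${iceCandidateTypes[candidateType]} ${candidateType}`)
    .join(', ');
}

// Illustrative counts, not data from a real call.
console.log(formatCandidateTypes({host: 4, srflx: 2, relay: 2})); // prints "4 host, 2 srflx, 2 relay"
```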

obfuscated: [
localSdp.type,
iceCandidates.length,
this.callLogger.obfuscate(this.remoteUser.id),
this.callLogger.obfuscateSdp(this.localSdp().sdp),
],
},
message: `Sending local '{0}' SDP containing '{1}' ICE candidates for flow with '{2}'\n{3}`,
message: `Sending local '{0}' SDP containing '{1}' ICE candidates ('{2}') for flow with '{3}'\n{4}`,
Contributor

Let's get rid of the ' from within ('{2}') as it would be a double indication of the expected variable content.

return iceCandidates.reduce((count, iceCandidate) => {
const isRelay = iceCandidate.toLowerCase().includes('relay');
return isRelay ? count + 1 : count;
}, 0);
Contributor

Do we really need a reduce here or does Array.filter().length do the trick as well?

return iceCandidates.filter(iceCandidate => iceCandidate.toLowerCase().includes('relay')).length;
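
Both variants can be compared directly. In the sketch below the function names and the sample candidate lines are made up for illustration; the bodies are the two expressions quoted above.

```javascript
function countRelaysWithReduce(iceCandidates) {
  return iceCandidates.reduce((count, iceCandidate) => {
    const isRelay = iceCandidate.toLowerCase().includes('relay');
    return isRelay ? count + 1 : count;
  }, 0);
}

function countRelaysWithFilter(iceCandidates) {
  return iceCandidates.filter(iceCandidate => iceCandidate.toLowerCase().includes('relay')).length;
}

// Illustrative candidate lines, not taken from a real SDP.
const sampleCandidates = [
  'a=candidate:1 1 udp 2122260223 192.168.1.2 54321 typ host',
  'a=candidate:2 1 udp 41885439 203.0.113.5 3478 typ relay raddr 0.0.0.0 rport 0',
  'a=candidate:3 1 UDP 41885183 203.0.113.6 3478 TYP RELAY raddr 0.0.0.0 rport 0',
];
console.log(countRelaysWithReduce(sampleCandidates), countRelaysWithFilter(sampleCandidates)); // prints "2 2"
```

Either way the check is a substring match on the whole line, so any line containing the text "relay" is counted.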

Contributor Author

Very true

this.callLogger.warn(logMessage, iceCandidates);
return this._setSendSdpTimeout(false);
const MIN_NUMBER_OF_RELAY_SERVERS = 2;
if (this._getNumberOfRelayCandidates(iceCandidates) < MIN_NUMBER_OF_RELAY_SERVERS) {
Contributor

Let's also pull this into a const

Contributor

@gregor gregor left a comment

The fallback to send SDP after 5s no matter what is missing.

@@ -500,6 +501,7 @@ z.calling.entities.FlowEntity = class FlowEntity {
_createPeerConnection() {
return this._createPeerConnectionConfiguration().then(pcConfiguration => {
this.peerConnection = new window.RTCPeerConnection(pcConfiguration);
this.iceCandidatesGatheringAttempts = 0;
Contributor

Let's reset this to 0 in the reset method and put a default in the constructor.

const hasReachMaxGatheringAttempts = attempts >= FlowEntity.CONFIG.MAX_ICE_CANDIDATE_GATHERING_ATTEMPTS;
if (!hasReachMaxGatheringAttempts && !isValidGathering) {
const logMessage = `Not enough ICE candidates gathered (attempt '${attempts}'). Restarting timeout\n${iceCandidates}`;
this.iceCandidatesGatheringAttempts++;
Contributor

Now it would send it after 6 seconds and not 5
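
One way to read this remark is as an off-by-one in the attempt counter. The sketch below simulates the worst case where gathering never becomes valid. It assumes a 1 second timeout and a maximum of 5 attempts; neither configured value is visible in this thread, and `elapsedUntilForcedSend` is a hypothetical helper.

```javascript
// Assumed values; the real configuration is not shown in this thread.
const TIMEOUT_MS = 1000;
const MAX_ATTEMPTS = 5;

// Returns the elapsed time at which the SDP is sent regardless of its content,
// assuming the gathering result is never valid.
function elapsedUntilForcedSend(initialAttempts) {
  let attempts = initialAttempts;
  let elapsedMs = 0;
  while (true) {
    elapsedMs += TIMEOUT_MS; // one timeout fires
    const hasReachedMax = attempts >= MAX_ATTEMPTS;
    if (hasReachedMax) {
      return elapsedMs;
    }
    attempts++; // mirrors the increment before restarting the timeout
  }
}

console.log(elapsedUntilForcedSend(0)); // prints 6000
console.log(elapsedUntilForcedSend(1)); // prints 5000
```

Under these assumptions, a counter that starts at 0 and is compared with `>=` before being incremented lets six timeouts fire, while starting at 1 (or comparing `attempts + 1`) would restore the 5 second fallback.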

@atomrc atomrc merged commit 2b36c9a into dev Sep 13, 2018
@atomrc atomrc deleted the refactor/sdp_sending_timeout branch September 13, 2018 10:15
2 participants