Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Relayed connection is used although direct connection is possible #78

Closed
csteuer opened this issue Feb 19, 2021 · 9 comments · Fixed by #84
Closed

Relayed connection is used although direct connection is possible #78

csteuer opened this issue Feb 19, 2021 · 9 comments · Fixed by #84
Labels
bug Something isn't working

Comments

@csteuer
Copy link
Contributor

csteuer commented Feb 19, 2021

Hi,
i have two clients in different networks (both using libjuice).
They are behind EIM-NATs and establishing a direct connection with libjuice always works with the help of a STUN server.
However; If I add a TURN server they will sometimes use the relayed connection instead of the direct connection.
I'm not an ICE expert but I think they should use the direct connection right?

Log from one of the clients when direct connection is used:
direct.log

Log from one of the clients when turn connection is used:
turn.log

Unfortunately I can't tell from the log why the relayed connection is chosen over the direct connection.

@paullouisageneau
Copy link
Owner

The behavior is indeed incorrect. Thank you for the logs, I'm looking into it!

@paullouisageneau paullouisageneau added the bug Something isn't working label Feb 20, 2021
@paullouisageneau
Copy link
Owner

I haven't been able to reproduce this, I think there might be some timing involved. Could you please post the verbose log or code triggering the issue?

@csteuer
Copy link
Contributor Author

csteuer commented Mar 8, 2021

Thanks for looking into this. I will try to get those logs this week.
Maybe I will also have the time to look/debug into it myself.
Do you have any hints were I should start looking?

@csteuer
Copy link
Contributor Author

csteuer commented Mar 15, 2021

Here are the verbose logs:
client_a.log
client_b.log

@paullouisageneau
Copy link
Owner

The problems seems to revolve around the usage of peer reflexive candidates in your setup. It appears the computation of the priority of these candidates was wrong as I didn't understand the RFC correctly at the time. Could you please check if #81 fixes the issue for you?

@csteuer
Copy link
Contributor Author

csteuer commented Mar 16, 2021

Unfortunately no.
I modified the log output when a new pair is selected so that it prints the selected candidates.

...
DEBUG: agent.c:958: New selected pair.
Local: a=candidate:2 1 UDP 1686110207 46.128.200.201 60713 typ srflx raddr 0.0.0.0 rport 0
Remote: a=candidate:5 1 UDP 1853882367 95.118.74.238 57569 typ prflx
INFO: agent.c:805: Changing state to connected
...
DEBUG: agent.c:958: New selected and nominated pair.
Local: a=candidate:3 1 UDP 8388607 85.214.81.109 57638 typ relay raddr 0.0.0.0 rport 0
Remote: a=candidate:3 1 UDP 1686110207 95.118.74.238 57569 typ srflx raddr 0.0.0.0 rport 0
INFO: agent.c:805: Changing state to completed
...

It uses the correct pair at first but later switches to a relayed connection.
The priorities of the candidates look good. Maybe there is still some error in the calculation for of the priority of the pair?

@paullouisageneau
Copy link
Owner

paullouisageneau commented Mar 16, 2021

Actually, I missed it before somehow, but in your use case there is a role conflict:

15:33:56.344 agent.c:1258: ICE role conflict (both controlling)

The priorities must be recalculated after the conflict is resolved, and it appears that it wasn't done properly. It should be fixed with #82

Still, you might want to avoid the conflict. It probably comes from the answerer calling juice_get_local_description() before setting the remote description.

I should add tests for the code path for conflict as it is not tested a lot...

@csteuer
Copy link
Contributor Author

csteuer commented Mar 17, 2021

Thanks for the hint with the controlling agent.
It was actually the juice_gather_candidates() method that I called to early on the accepting side.
I wasn't aware that the order in which juice_gather_candidates(), juice_get_local_description() and juice_set_remote_description() are called determine the controlling/controlled mode.

Now that I don't get the role conflict anymore everything works as expected.
However; If there is a role conflict the problem is still there.

client_a.log
client_b.log

After the role conflict has been resolved it seems like the controlled agent just ignores the nominated pair.

@paullouisageneau
Copy link
Owner

I think I found the final issue! It appears the controlled side wrongly matches the nominated pair. It should be fixed with #84

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants