Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Room gets a participantDidConnect for already existing participant with same identity #255

Closed
dineshflock opened this issue Apr 20, 2018 · 13 comments
Assignees

Comments

@dineshflock
Copy link

dineshflock commented Apr 20, 2018

Before filing an issue please check that the issue is not already addressed by the following:

If this is an issue with the QuickStart itself, file it here. If this is an issue with the SDK or how to do something with the SDK please use twilio-video-ios instead.

Description

Scenario (Based on our analytics here):

  1. Local participant L creates and enters a room.
  2. Remote participant R1 (sid : R1SID identity: R1ID) joins room.
  3. R1 audio/video tracks are subscribed
  4. Remote participant R2 (sid : R2SID identity: R2ID) joins room.
  5. R2 audio/video tracks are subscribed

After 15 seconds following events happen

  1. I get a call in participantDidConnect with (sid : R3SID identity: R2ID). (Note : identity is same as R2 but with new sid)
  2. Audio unsubscribe event for (sid : R3SID or R2SID, identity: R2ID)
  3. Video unsubscribe event for (sid : R3SID or R2SID, identity: R2ID)
  4. participantDidDisConnect event for (sid : R3SID or R2SID, identity: R2ID)

Update:
10. Our app crashed due to due to our internal assertion failure in event subscribedToAudioTrack for a participant with identity R2ID

Update : I am not sure about the sid's in step 7 8 and 9, those could be of R2SID as we keep only one participant with a given identity.

My understanding :
Seems like the SDK tries to refresh the R2 participant.
If thats the case then I think, SDK should first unsubscribe audio video followed by disconnect on R2(sid : R2SID identity: R2ID). Then it should call didConnected followed by subscribe audio/video tracks for R2(sid : R3SID identity: R2ID).

[Description of the issue]

Expected Behavior

Seems like the SDK tries to refresh the R2 participant.
If thats the case then I think, SDK should first unsubscribe audio video followed by disconnect on R2(sid : R2SID identity: R2ID). Then it should call didConnected followed by subscribe audio/video tracks for R2(sid : R3SID identity: R2ID).

Reproduces How Often

10%

Versions

All relevant version information for the issue.

Video iOS SDK

2.0.0-preview9 via CocoaPods

Xcode

9.2

iOS Version

11.3

@paynerc
Copy link
Contributor

paynerc commented Apr 23, 2018

@dineshflock,

Do you have any Room Sids that you can share for when you are receiving this issue? Also, are you experiencing any sort of network handover that would be causing a participant to rejoin the room with the same identity?

If you join the room with the same identity, you will get a new participant sid. I would expect to see the track unsubscriptions and disconnect events for the R2's first sid, not the second, as you are reporting here.

Let me know if you can send a Room SID where this has occurred.

Thank you,

Ryan

@ceaglest
Copy link
Contributor

ceaglest commented Apr 23, 2018

Hey @dineshflock,

2.0.0-preview9 via CocoaPods

Do you have any plans to upgrade to 2.0.0-beta4? We've fixed at least two bugs that you've reported against 2.0.0-preview9. In general this release is more stable and has much better support for bluetooth audio.

Best,
Chris

@dineshflock
Copy link
Author

dineshflock commented Apr 23, 2018

@paynerc

Do you have any Room Sids that you can share for when you are receiving this issue?

I have participant sid's here

Also, are you experiencing any sort of network handover that would be causing a participant to rejoin the room with the same identity?

Not sure about this.

@paynerc In the sheet, participant_guid is the identity.
@paynerc The problem for us here is, that in a call we can have only one participant with one identity at any given instant. The problem is not the different SID's, but its the fact that there were two participants with same identity at a given instant.
PS: We could have recorded wrong SID's for last three rows in the sheet.

@dineshflock
Copy link
Author

@ceaglest Thanks for the support Chris. We are planning to move on to the new update maybe in next two weeks or so.

@paynerc
Copy link
Contributor

paynerc commented Apr 23, 2018

@dineshflock,

Looking at the room in question, you have three "distinct" participants, however I see that each one of them reconnects at some point:

P1: name: zsltsrslpgltgsz1 | sid: PAa5242b0372de423f8dd286db85d9c70f
P2: name: dw8xrpxwm3jt8ees | sid: PA192f92dd5ec4dc9dae642c8d7dea89a3
P3: name: 882v88m99neuu2o2 | sid: PA34da99169a4f52a1b4c335702f0c68e9
P4: name: 882v88m99neuu2o2 | sid: PAae8b8a560e219086d32c41e58b1162a7
P5: name: zsltsrslpgltgsz1 | sid: PA7da98361980fb2649c546fdfa26bdf3c
P6: name: dw8xrpxwm3jt8ees | sid: PA2394018d31d520fde99de67d8c400c36

Looking at the logs for this room, I see that P3 connected at 2018-04-20T06:49:05.768Z. And then at 2018-04-20T06:49:21.820Z P4, with the same identity as P3, connects. This causes P3 to disconnect because of the duplicate identity. Looking at the logs, all of the correct unsubscription events were fired for P3 in that situation.

I also see that P1 connected at 2018-04-20T06:46:26.325Z. At 2018-04-20T06:49:49.259Z``P5 connects with the same identity as P1, thus triggering P1 to be disconnected.

Lastly, P2 connected at 2018-04-20T06:48:43.838Z. At 2018-04-20T07:02:01.987Z P2 disconnected from the room. And at 2018-04-20T07:02:10.982Z P6 connects, with the same identity as P2.

The questions I have are what triggered the second connect for 882v88m99neuu2o2 and zsltsrslpgltgsz1? I would suggest you look at the room connecting code to see if there is a possibility that you are firing it twice.

Another possibility is if you are managing room connection logic in your app? If so, you should defer to allowing the SDK to reconnect to the room for you. We do have a pretty long SIP timeout at present (known issue) which makes the reconnection take a little bit. We are also working on adding a reconnecting state (currently being worked on in the Javascript SDK) to the SDK which will let you know that the SDK is attempting to reconnect a participant.

Lastly, as @ceaglest mentioned, you are running on a bit out of date version of the iOS SDK. There are many improvements in 2.0.0-beta4 and I wonder if you will see this same issue if you use the latest.

Looking forward to hearing back from you on this,

Ryan

@dineshflock
Copy link
Author

Looking at the logs for this room, I see that P3 connected at 2018-04-20T06:49:05.768Z. And then at 2018-04-20T06:49:21.820Z P4, with the same identity as P3, connects.

Yes you are right, for 882v88m99neuu2o2 we got participantDidConnect again after ~15 seconds.

This causes P3 to disconnect because of the duplicate identity. Looking at the logs, all of the correct unsubscription events were fired for P3 in that situation.

The events might have been fired but the order was not correct. I did not got a disconnect for P3 before connect of P4, thus this allowed two participants with same identity.

I would suggest you look at the room connecting code to see if there is a possibility that you are firing it twice.

I am 100% sure we don't fire participantDidConnect event twice. Also in current case these two events were different, as the SID's we different.

Another possibility is if you are managing room connection logic in your app?

We show users a reconnect button, once we get a roomDidDisconnect with non nil error.

Thanks @paynerc

Dinesh

@paynerc
Copy link
Contributor

paynerc commented Apr 26, 2018

@dineshflock,

I ran a test using 2.0.0-beta4 where I joined a room on my iPhone. And then connected a participant from the web. Once it was connected and everything was published, I connect the same participant from another web browser. The following is the log output:

2018-04-26 10:01:58.945621-0400 VideoQuickStart[7275:8039329] Attempting to connect to room Optional("room")
2018-04-26 10:01:59.350514-0400 VideoQuickStart[7275:8039329] Connected to room room as Optional("iPhone6s")
2018-04-26 10:02:36.807530-0400 VideoQuickStart[7275:8039329] Participant RCP-(PA4e70c6b365caa5e9d9af06f605f6ebb6) connected with 1 audio and 1 video tracks
2018-04-26 10:02:36.849699-0400 VideoQuickStart[7275:8039329] Subscribed to b84978c0-657a-4207-be98-3c933b48ae05 audio track for Participant RCP-(PA4e70c6b365caa5e9d9af06f605f6ebb6)
2018-04-26 10:02:36.850086-0400 VideoQuickStart[7275:8039329] Subscribed to d48ae9f6-d06a-4f48-aea4-0b713f2cd48c video track for Participant RCP-(PA4e70c6b365caa5e9d9af06f605f6ebb6)
2018-04-26 10:03:01.271807-0400 VideoQuickStart[7275:8039329] Unsubscribed from b84978c0-657a-4207-be98-3c933b48ae05 audio track for Participant RCP-(PA4e70c6b365caa5e9d9af06f605f6ebb6)
2018-04-26 10:03:01.271915-0400 VideoQuickStart[7275:8039329] Unsubscribed from d48ae9f6-d06a-4f48-aea4-0b713f2cd48c video track for Participant RCP-(PA4e70c6b365caa5e9d9af06f605f6ebb6)
2018-04-26 10:03:01.276747-0400 VideoQuickStart[7275:8039329] Room room, Participant RCP-(PA4e70c6b365caa5e9d9af06f605f6ebb6) disconnected
2018-04-26 10:03:01.360866-0400 VideoQuickStart[7275:8039329] Participant RCP-(PA2b26eb83468d17f2f93bb9759cd68392) connected with 1 audio and 1 video tracks
2018-04-26 10:03:01.595582-0400 VideoQuickStart[7275:8039329] Subscribed to d312d097-04e2-42f9-abd0-7b357ae7bd9d audio track for Participant RCP-(PA2b26eb83468d17f2f93bb9759cd68392)
2018-04-26 10:03:01.595934-0400 VideoQuickStart[7275:8039329] Subscribed to 7f1fed4e-f700-4543-b976-4ef137c4a7cc video track for Participant RCP-(PA2b26eb83468d17f2f93bb9759cd68392)

The participant's identity is RCP. When it first connects, the server assigns it the sid PA4e70c6b365caa5e9d9af06f605f6ebb6. When I connect to the room again with the same identity, RCP, the following happens, as shown by the logs above:

  1. The audio track for RCP (PA4e70c6b365caa5e9d9af06f605f6ebb6) is unsubscribed
  2. The video track for RCP (PA4e70c6b365caa5e9d9af06f605f6ebb6) is unsubscribed
  3. The participant RCP (PA4e70c6b365caa5e9d9af06f605f6ebb6) is disconnected from the room
  4. The participant RCP with a new server assigned sid (PA2b26eb83468d17f2f93bb9759cd68392) is connected to the room.
  5. The audio track for RCP (PA2b26eb83468d17f2f93bb9759cd68392) is subscribed
  6. The video track for RCP (PA2b26eb83468d17f2f93bb9759cd68392) is subscribed.

The delegates are being fired in the proper order and there is never more than one participant with the same identity in the room at any given time.

Every time you connect to the same room with the same identity, the server will generate a new sid for you, as shown above.

Have you tested this scenario with 2.0.0-beta4? Can you collect log output that shows the ordering of the delegate callbacks that you are seeing?

Ryan

@ceaglest
Copy link
Contributor

Thanks for the analysis Ryan. Is it possible that the Participant disconnected vs connected events are racey? I know we have tests that cover this scenario but I'm not 100% sure the we strictly validate the callback ordering as seen on the Client.

@dineshflock
Copy link
Author

Thanks @paynerc

Have you tested this scenario with 2.0.0-beta4? Can you collect log output that shows the ordering of the delegate callbacks that you are seeing?

We have moved to beta4 and now I have added events tracks as the first line in the delegate of TVIRoom and TVIRemoteParticipant. With this I will share the accurate logs/events once we get this again.

@paynerc
Copy link
Contributor

paynerc commented Apr 27, 2018

@dineshflock

Just as a side note, we did GA 2.0.0 yesterday. You should probably pick that up instead of 2.0.0-beta4.

Ryan

@ceaglest
Copy link
Contributor

Hey @dineshflock,

We are discussing this issue internally and have cut CSDK-2289 for tracking. At this point, I don't believe it's possible to ensure that the disconnection of a duplicate Participant is seen by all Clients before the connection of its replacement Participant.

This issue is also exacerbated by long Participant timeouts, something else we are hoping to improve in future releases.

Best,
Chris

@ceaglest
Copy link
Contributor

ceaglest commented May 2, 2018

Hi @dineshflock,

After discussing with our team members, we believe that it would be possible to resolve this API inconsistency with a combination of Client and Server side changes. Right now we don't guarantee the ordering of individual signaling messages (but we do guarantee delivery), so there may be a small window where two Participants with the same identity (but a different Participant SID) can coexist.

How important is this issue to you? Is there a Client side workaround you could use? I think our efforts may be better served improving other aspects of the product, but if this is critical to your use case we will of course consider it further.

Regards,
Chris

@paynerc
Copy link
Contributor

paynerc commented Apr 6, 2020

@dineshflock,

I am closing this issue as there has been no further activity. Please feel free to reopen if you have further information or questions.

Thank you,

Ryan

@paynerc paynerc closed this as completed Apr 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants