Lingering connections even after a client socket is disconnected #29

Closed
atbe opened this issue Apr 5, 2021 · 11 comments

@atbe

atbe commented Apr 5, 2021

I'm not sure which logs to include because I can't pin this down to a specific error.

When there is a flood of new connections, some participants are never cleaned up after their sockets disconnect.

I ran this command to see how many connections were open on the server:

# cat /proc/net/tcp | wc -l
25

However, one of the rooms I'm currently in reports 926 participants, and the Redis records for all the disconnected users are still there.

This causes a miscount of the participants actually in the call, which cannot be corrected without manually clearing Redis or starting a new room.
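
A note on the count above: `/proc/net/tcp` starts with a header row and lists IPv4 sockets in every state (IPv6 sockets are in `/proc/net/tcp6`), so `wc -l` overstates the number of live connections. A minimal sketch of counting only established sockets, assuming the usual Linux layout where the fourth column holds the state and `01` means ESTABLISHED; the sample rows are made up and shortened:

```typescript
// Count sockets in the ESTABLISHED state from a /proc/net/tcp snapshot.
// Assumption: the 4th column ("st") holds the state as hex, 01 = ESTABLISHED.
const TCP_ESTABLISHED = "01";

function countEstablished(procNetTcp: string): number {
  return procNetTcp
    .split("\n")
    .slice(1) // drop the header row that `wc -l` counts as a line
    .map((line) => line.trim().split(/\s+/))
    .filter((cols) => cols.length > 3 && cols[3] === TCP_ESTABLISHED)
    .length;
}

// Made-up, shortened sample: one LISTEN (0A) and one ESTABLISHED (01) socket.
const sample = [
  "  sl  local_address rem_address   st tx_queue rx_queue",
  "   0: 00000000:1E61 00000000:0000 0A 00000000:00000000",
  "   1: 0100007F:1E61 0100007F:C350 01 00000000:00000000",
].join("\n");

console.log(countEstablished(sample)); // 1, although the snapshot has 3 lines
```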

Please let me know if there's something I can include to make this bug clearer. I may be able to reproduce it on the livekit sample if I spam the server with lots of connections all at once.

The only thing I see in the logs repeatedly is

2021-04-05T03:51:51.950Z        ERROR   routing/redisrouter.go:317      error processing signal message{"error": "channel is full"}
github.com/livekit/livekit-server/pkg/routing.(*RedisRouter).redisWorker
        /workspace/pkg/routing/redisrouter.go:317
2021-04-05T03:51:51.950Z        ERROR   routing/redisrouter.go:317      error processing signal message{"error": "channel is full"}
github.com/livekit/livekit-server/pkg/routing.(*RedisRouter).redisWorker
        /workspace/pkg/routing/redisrouter.go:317

when this occurred
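
The message suggests that a fixed-capacity buffer rejected a signal message instead of waiting for room. As a rough illustration of that policy (the `BoundedChannel` class is hypothetical and written in TypeScript; it is not LiveKit's Go code), a rejected message is simply lost to the receiver. Whether that is connected to the stale participants is not established in this thread.

```typescript
// Hypothetical stand-in for a fixed-capacity message channel. It only
// illustrates a "reject when full" policy, not any real implementation.
class BoundedChannel<T> {
  private readonly items: T[] = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  // Returns an error instead of waiting when the buffer has no room.
  trySend(item: T): Error | null {
    if (this.items.length >= this.capacity) {
      return new Error("channel is full");
    }
    this.items.push(item);
    return null;
  }

  // Removes and returns the oldest message, or undefined when empty.
  receive(): T | undefined {
    return this.items.shift();
  }
}

const channel = new BoundedChannel<string>(2);
const results = ["join:a", "join:b", "leave:a"].map((m) => channel.trySend(m));
// The third message is rejected; in this made-up sequence it is the "leave".
console.log(results.map((r) => (r ? r.message : "ok")));
```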

@atbe
Author

atbe commented Apr 5, 2021

Here's a view into Redis for the room. Only one of these participants is still connected; the windows where the other connections originated have already been closed.

image

@atbe
Author

atbe commented Apr 5, 2021

The room object returned by the connect call also reports a participants property of size 926.

@atbe
Author

atbe commented Apr 5, 2021

I even restarted the machine where the connections originated, and they're still there.

@davidzhao
Member

Abe, this is working by design. These redis values are private to LiveKit internals and you should not depend on them as any indication of the client being present. The values clear out after 24h.

To get room state, please call listRoom/listParticipant APIs on RoomService
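
One way to act on this advice is to treat the server's participant list as authoritative and compare the client's view against it. The sketch below assumes a hypothetical `ParticipantLister` interface standing in for the RoomService participant-listing call; the real SDK's names and return types may differ.

```typescript
// Hypothetical shapes for illustration; real SDK types may differ.
interface ParticipantRecord {
  identity: string;
}

interface ParticipantLister {
  listParticipants(roomName: string): Promise<ParticipantRecord[]>;
}

// Returns identities the client still shows that the server no longer lists.
async function findStaleIdentities(
  service: ParticipantLister,
  roomName: string,
  clientIdentities: string[],
): Promise<string[]> {
  const live = new Set(
    (await service.listParticipants(roomName)).map((p) => p.identity),
  );
  return clientIdentities.filter((id) => !live.has(id));
}

// Usage with an in-memory fake in place of a real RoomService client.
const fakeService: ParticipantLister = {
  listParticipants: async () => [{ identity: "alice" }],
};

findStaleIdentities(fakeService, "demo-room", ["alice", "bob"]).then(
  (stale) => console.log(stale), // "bob" is shown by the client but not listed
);
```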

@atbe
Author

atbe commented Apr 5, 2021

So I shouldn't be using this https://github.com/livekit/client-sdk-js/blob/main/src/room/Room.ts#L39 to list participants in a call? When I check that property, it includes all the participants that have already disconnected.

@davidzhao
Member

Yes, that property should not contain participants that have disconnected. That may very well be a bug. In this ticket, though, you are describing Redis state, which is not how that field gets populated.

@atbe
Author

atbe commented Apr 5, 2021

Yeah, I was outlining the issue on the server side to provide some insight into why that property might include disconnected participants, but to your point, it's not relevant to how that field is populated.

@davidzhao
Member

davidzhao commented Apr 5, 2021

Are you seeing room participants not clearing? Hmm, I can't seem to reproduce that. Let me know how I might be able to reproduce this.

Do you have logs showing participantDisconnected?

@atbe
Author

atbe commented Apr 5, 2021

Right, the room has participants that won't clear.

You can try to join the room that has dead participants in it here: https://livespot.co/session/c9e77293-f854-473c-b7cb-ca101743eb01

The connection will eventually succeed, but since we updated to the latest version it takes a long time and a bunch of retries to connect. That might be a different bug.

If you look at the response from the join message, there are tons of participants being returned that are most definitely not connected.

image

@davidzhao
Member

I see, that's definitely not right if the participants are no longer around:

  1. How are these participants connecting? Are these actual livekit clients, or bots?
  2. How did you trigger a disconnect?

@atbe
Author

atbe commented Apr 5, 2021

  1. Almost all of those are actual clients. Some of them are "logged out users", which just means people who haven't logged into livespot but want to stream a call. We generate a random identity value for them.
  2. Disconnects are not always explicitly handled, because some people click leave call and some people just close their window. If they close their window (or refresh), we try to call unpublishTracks on any tracks they're publishing and move on.

I don't think there's an explicit disconnect method on the js client.
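
If the client SDK version in use does expose a way to leave the room, it could be called when the page goes away. A minimal sketch, assuming a `disconnect()` method on the room object (the `Disconnectable` and `PageEventTarget` interfaces are hypothetical) and the browser's `pagehide` and `beforeunload` events:

```typescript
// Hypothetical minimal shapes; a browser `window` and an SDK room object
// are assumed to be compatible with them.
interface Disconnectable {
  disconnect(): void;
}

interface PageEventTarget {
  addEventListener(type: string, listener: () => void): void;
}

// Calls room.disconnect() once when the page is being closed or refreshed.
function disconnectOnPageExit(
  target: PageEventTarget,
  room: Disconnectable,
  eventTypes: string[] = ["pagehide", "beforeunload"],
): void {
  let done = false;
  const handler = (): void => {
    if (done) return; // several exit events can fire for one page close
    done = true;
    room.disconnect();
  };
  for (const type of eventTypes) {
    target.addEventListener(type, handler);
  }
}

// In a browser this would be: disconnectOnPageExit(window, room);
// Here a fake target stands in for `window` so the sketch runs anywhere.
const listeners: Array<() => void> = [];
const fakeWindow: PageEventTarget = {
  addEventListener: (_type, listener) => {
    listeners.push(listener);
  },
};
let disconnectCalls = 0;
disconnectOnPageExit(fakeWindow, {
  disconnect: () => {
    disconnectCalls += 1;
  },
});
listeners.forEach((fire) => fire()); // simulate both exit events firing
console.log(disconnectCalls); // 1
```

The `done` flag is there because a single tab close can trigger more than one of the registered events, and the room should only be left once.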


2 participants