
Ghost webrtc streams #55

Open
philippebourcier opened this issue Apr 26, 2023 · 12 comments · May be fixed by #91

Comments

@philippebourcier

philippebourcier commented Apr 26, 2023

@ayufan OK, this time a real bug. 😅

I noticed that when I watch a WebRTC stream and I relaunch the browser (or just go back to the index and click again on /webrtc), most of the time, I get a ghost UDP stream...

Here's a log view of the phenomenon:
util/http/http.c: HTTP8080/6: Client connected 192.168.1.36 (fd=10).
util/http/http.c: HTTP8080/6: Request 'GET' '/webrtc' ''
util/http/http.c: HTTP8080/6: Client disconnected 192.168.1.36.
util/http/http.c: HTTP8080/7: Client connected 192.168.1.36 (fd=11).
util/http/http.c: HTTP8080/7: Request 'POST' '/webrtc' ''
output/webrtc/webrtc.cc: rtc-tonwouvavrrzosxdjudo: Stream requested.
util/http/http.c: HTTP8080/7: Client disconnected 192.168.1.36.
util/http/http.c: HTTP8080/8: Client connected 192.168.1.36 (fd=12).
util/http/http.c: HTTP8080/8: Request 'POST' '/webrtc' ''
output/webrtc/webrtc.cc: rtc-tonwouvavrrzosxdjudo: Answer received.
util/http/http.c: HTTP8080/8: Client disconnected 192.168.1.36.

util/http/http.c: HTTP8080/9: Client connected 192.168.1.36 (fd=13).
util/http/http.c: HTTP8080/9: Request 'GET' '/webrtc' ''
util/http/http.c: HTTP8080/9: Client disconnected 192.168.1.36.
util/http/http.c: HTTP8080/0: Client connected 192.168.1.36 (fd=4).
util/http/http.c: HTTP8080/0: Request 'POST' '/webrtc' ''
output/webrtc/webrtc.cc: rtc-wpesdkknzakphjitiknw: Stream requested.
util/http/http.c: HTTP8080/0: Client disconnected 192.168.1.36.
util/http/http.c: HTTP8080/1: Client connected 192.168.1.36 (fd=5).
util/http/http.c: HTTP8080/1: Request 'POST' '/webrtc' ''
output/webrtc/webrtc.cc: rtc-wpesdkknzakphjitiknw: Answer received.
util/http/http.c: HTTP8080/1: Client disconnected 192.168.1.36.

When things go well, you get this line:
output/webrtc/webrtc.cc: rtc-vohzdpemelhsnkocaalk: Client removed: stream closed.

But most of the time you don't...
At some point I had like 10 ghosted streams in parallel and my WiFi didn't like it. 😅

Here's a graph with 3 streams; the non-ghosted one is in red, the ghosted ones in green and blue:
(bandwidth graph image)

So it looks like some kind of stream timeout handling or disconnect detection is missing on the C side (or the JS needs to send a signal to tell the server to stop streaming)...

@philippebourcier
Author

philippebourcier commented Apr 26, 2023

I added:
window.addEventListener("beforeunload", function() { pc.close(); });
at the end of the startWebRTC() JavaScript function, and it seems to do the trick, but there might be more elegant ways of doing this. 🤷‍♂️
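One slightly more robust variant of the same idea is to also listen for `pagehide`, which fires more reliably than `beforeunload` on mobile browsers. A minimal sketch, assuming `pc` is the RTCPeerConnection created in `startWebRTC()` (the helper name and the `win` parameter are illustrative, the latter only to make the logic testable outside a browser):

```javascript
// Sketch: close the peer connection on page teardown. Assumes `pc` is the
// RTCPeerConnection from startWebRTC(); `win` is normally `window`.
function registerStreamCleanup(win, pc) {
  let closed = false;
  const close = () => {
    if (!closed) {        // guard: both events can fire on the same unload
      closed = true;
      pc.close();
    }
  };
  // pagehide tends to be delivered more reliably than beforeunload on
  // mobile browsers (notably iOS Safari), so register both.
  for (const ev of ["pagehide", "beforeunload"]) {
    win.addEventListener(ev, close);
  }
  return close; // also usable for manual teardown
}
```

In the page itself this would be called as `registerStreamCleanup(window, pc)` at the end of `startWebRTC()`. Note this is still best-effort: as discussed below, no unload handler covers crashes or network loss.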

@JoelBecker1998

Can confirm the issue.

@ayufan
Owner

ayufan commented May 9, 2023

@philippebourcier This is not a bad workaround. Can you open a PR?

@RemiNV

RemiNV commented Jun 25, 2023

I also noticed that after opening the webrtc stream in a Chrome tab and then closing it, the camera-streamer daemon was still using a fair bit of CPU time, more than before opening the stream.

Having the client close the stream does seem necessary, but given that the server may keep using resources indefinitely (for example if the client crashes or never executes the JS unload handler), I agree that some server-side stream timeout / remote-gone detection is also necessary.

@jpiccari

@ayufan This is still an issue in the latest release. I noticed libdatachannel does not call pc->onStateChange in all cases (see repro steps below). Also, at least in Firefox and iOS Safari, the onunload event workaround isn't necessary as the clients seem to already send this in many (all?) cases.

I'm new to WebRTC so forgive my complete ignorance here... I added some hacky code to the connection initiation that iterates the clients, checks their peer connection state for disconnected/failed/closed, and closes any that are found. However, the state never got updated by libdatachannel, so I ended up with a massive list of clients. My understanding is that ICE servers should be used to signal between client and server, but I'm not clear whether there is a heartbeat in the spec (or built into libdatachannel) or whether that is something we need to implement. My guess is that no such heartbeat exists and there is no built-in provision for detecting client failures or networking issues, which is why we get ghost streams.

Finally, here are some repro steps that reliably produce orphaned streams.

  1. Open /webrtc stream
  2. Close browser tab/app

This works on Chrome, Edge, and iOS Safari. Strangely, it doesn't work on Firefox (Firefox behaves nicely and closes the stream). I didn't try the unload event, but I'm sure it would be buggy since it doesn't cover network connectivity issues, browser crashes, dead batteries, etc. It would be much preferable to come up with a robust solution for identifying and cleaning up orphaned streams.

@jpiccari

I've got a fix that I've been testing out for a couple of hours now with many dirty "client" disconnects. Before we dive into the theory of the change and debate whether it is correct, I think it would be helpful to clarify some terms I'm about to use; they make the conversation easier, even if they aren't strictly correct WebRTC terminology.

  1. client - I'm using this to refer to the viewer of the stream. They send a request to /webrtc (the signalling server) which initiates the peer connection setup on both ends.
  2. server - Refers to the side of the rtc connection sending video data.
  3. signalling server - This is the main server hosting the endpoints for /webrtc, /status, etc.

Alright, so the change I have been running includes client-side changes (an update to the javascript) and server-side changes. On the server, when webrtc_server() is called, I create a new thread that loops every 10 s (planning to make this configurable via CLI args), acquires a lock to iterate webrtc_clients, and removes each client whose last heartbeat is older than some staleness threshold.

To track heartbeats, I have added a last_heartbeat_s field to the Client class, which stores the server timestamp of the most recent heartbeat message from the client. Finally, I have updated the javascript in webrtc.html to send periodic heartbeat messages every 1 s.
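A rough sketch of that client-side heartbeat sender, with the message shape and helper name as assumptions (not the PR's actual code); the `tick` function is exposed so the send path can be exercised without waiting on a timer:

```javascript
// Sketch: send a small heartbeat message every second so the server can
// detect dead clients. The message shape is an assumption for illustration.
function makeHeartbeat(send) {
  const tick = () => send({ type: "heartbeat", ts: Date.now() });
  let timer = null;
  return {
    tick, // exposed so the send path can be tested directly
    start(intervalMs = 1000) {
      if (timer === null) timer = setInterval(tick, intervalMs);
    },
    stop() {
      if (timer !== null) {
        clearInterval(timer); // stop heartbeats on teardown
        timer = null;
      }
    },
  };
}
```

In webrtc.html this would be started once the peer connection is up and stopped alongside `pc.close()`.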

One notable disadvantage of this change is that it trades off backwards compatibility. It will break users who run their own javascript logic to communicate with the signalling server: because it relies on heartbeats, existing services will need to add the heartbeat logic to their client-side code, or their streams will die after the timeout period (just a few seconds). One solution would be to bump version numbers or otherwise note that it is not backwards compatible. Another would be to ship a default last-heartbeat threshold that is very long (perhaps even infinite) and provide an option to configure it down, as a sort of opt-in to automatic session cleanup. I don't have a preference here, so just let me know the direction and I can put out a PR.
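The server-side sweep described above boils down to a pure staleness check. Sketched here in JavaScript for brevity (the real server code is C++), with all names (`lastHeartbeatS`, `staleAfterS`) illustrative:

```javascript
// Sketch of the staleness sweep: partition clients by how long ago their
// last heartbeat arrived. Runs periodically under the clients lock.
function sweepStaleClients(clients, nowS, staleAfterS) {
  const kept = [];
  const removed = [];
  for (const c of clients) {
    if (nowS - c.lastHeartbeatS > staleAfterS) {
      removed.push(c); // no heartbeat within the window: treat as ghost
    } else {
      kept.push(c);
    }
  }
  return { kept, removed };
}
```

Setting `staleAfterS` to `Infinity` gives the opt-in behaviour discussed above, since no finite age ever exceeds it and cleanup is effectively disabled.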

@jpiccari linked a pull request Aug 5, 2023 that will close this issue
@AndyHazz

AndyHazz commented Sep 4, 2023

Just to add a bit of emphasis to this issue: I was away from home recently, using camera-streamer through crowsnest/mainsail and a WireGuard VPN, and the webrtc stream ate through my 20 GB mobile data allowance in a day even though I was probably only viewing the stream for a few minutes :(

It seemed to keep the stream open even after restarting the phone, without viewing the camera stream again... although that might be an edge case from the way the WireGuard VPN works.

@ayufan
Owner

ayufan commented Sep 4, 2023 via email

@jasonmacdonald

I'm often running into these ghost streams; when I check the bandwidth used by my Pi 4, it's often up in the 20 Mb/s range within only a few days, whereas a single stream should only use around 1 Mb/s. I have to keep shutting down crowsnest and restarting it to clear them out.

Is there a short-term workaround?

@jpiccari

jpiccari commented Dec 9, 2023

I fixed this back in August with the addition of the liveness checks. The PR is linked above if you want to try it out. There is some discussion in that PR about how WebRTC works, which might be why it never got merged.

@ayufan
Owner

ayufan commented Dec 9, 2023 via email

@Samconboy

Has this been fixed?

Just got hit with 25 GB of data usage via octoanywhere after switching to webrtc, and I believe it could be related.

8 participants