Large Bandwidth Usage From Multiple Unclosed WebRTC Streams #339

Open
theokeeler opened this issue Mar 3, 2024 · 22 comments
Labels
bug Something isn't working

Comments

@theokeeler

Bug Report

Description

Mobileraker now uses huge amounts of data in the background when a print is running; I burned through over 40GB of mobile data in about 5 hours of printing. This behaviour first started once ProgressNotifications for Android started working.

Steps to Reproduce

  1. Start a print on your printer
  2. On an Android device, connect via mobile data (you'll need some sort of remote connection)
  3. Wait
  4. Look at your mobile data usage and/or receive an angry text from your cell provider.

Expected Behavior

Mobileraker shouldn't be using this much data in the background (or even in the foreground). Ideally, whatever data is being downloaded here wouldn't be downloaded at all while on a mobile connection.

Version Information

  • Device-OS (Android/iOS): Android
  • Mobileraker: v0.4.0-52-g26028f11
  • Klipper: v0.12.0-115-gb98375b3
  • Moonraker: v0.8.0-317-g0850c16b

Debug Logs

Not attached, as this would result in publicly posting private information about my network and printer configs that is included in your logs.

Additional Context

I'm accessing my printer remotely via a Wireguard connection to my LAN; I've tried setting up a manual "remote connection", but all that seems to let me do is change the endpoint to connect to. I've also set include_snapshot to False in my config, but the behaviour has not changed. I've observed this behaviour with both very large and relatively small gcode files; the large gcode files use more data, but the difference is not of the same order of magnitude as the difference in file size (e.g. a 9MB gcode file being printed consumes 5GB in 20 minutes, but a 200MB gcode consumes 8.5GB).

I will be unable to perform further testing on this for quite a while, as this has exhausted my data plan.


Checklist

To help us diagnose the issue, please ensure you've completed the following steps:

  • [ X ] Provided a clear bug description.
  • [ X ] Listed detailed steps to reproduce the issue.
  • [ X ] Described the expected behavior.
  • [ X ] Included the Mobileraker version you are using.
  • [ N ] Attached Mobileraker's debug log files. (No; your logs are insufficiently redacted to be publicly posted)
  • [ X ] Specified the version numbers of Klipper and Moonraker if applicable.
@theokeeler theokeeler added the bug Something isn't working label Mar 3, 2024
@Clon1998
Owner

Clon1998 commented Mar 3, 2024

Hey,
can you attach the log files of the companion?
40 GB is a shit ton and, tbh, I currently have no idea what could cause that.
The progress notifications are issued in 5% steps and are around 1 KB each, as they do not include a snapshot.

The only other thing I could think of is that one of the webcams did not stop when you closed the app / put it in the background.

@theokeeler
Author

logs-20240303-164252.zip
Mobileraker Companion logs attached.

In terms of webcams, this machine has one 1280x1024@30fps WebRTC feed. Looking at this on a desktop via Mainsail, my system reports between 2.8 and 3.8Mbps; that should only account for ~570MB per 20 minutes, so I don't see how it could be the (only) culprit here (unless the webcam feed gets reopened and left open multiple times).
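
For reference, the back-of-the-envelope math at the top of that range (assuming a steady 3.8 Mbit/s for the full 20 minutes):

$$3.8\ \text{Mbit/s} \times 1200\ \text{s} \div 8\ \tfrac{\text{bit}}{\text{byte}} = 570\ \text{MB}$$

So even one stream running the whole time is nearly an order of magnitude short of the 5GB I observed.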

@Clon1998
Owner

Clon1998 commented Mar 4, 2024

> logs-20240303-164252.zip Mobileraker Companion logs attached.
>
> In terms of webcams, this machine has one 1280x1024@30fps WebRTC feed. Looking at this on a desktop via Mainsail, my system reports between 2.8 and 3.8Mbps; that should only account for ~570MB per 20 minutes, so I don't see how it could be the (only) culprit here (unless the webcam feed gets reopened and left open multiple times).

That is true.
The logs of the companion also do not contain anything that stands out. It sends a notification every 2-3 mins, so nothing that would cause 40GB of traffic.

Are you using the normal WebRTC or go2rtc?

@theokeeler
Author

This is (as far as I know) a normal WebRTC stream from camera-streamer.

@theokeeler
Author

I've done a little additional digging here - curiously, the excess data usage survives both the uninstallation of the Mobileraker app on the phone and a subsequent reboot of the phone. My router can see several continuing connections, each approximately 2.2Mbps, across various ports that look like a reasonable range for WebRTC, between the phone's Wireguard IP and the printer's. That certainly makes me lean towards thinking this is a result of multiple connections being opened and orphaned. If I kill these active connections at the router, the same number of connections re-establish themselves; this happens even with the phone disconnected from the network entirely, so something on the printer end is continuously re-establishing these connections and pushing data over them.

@Clon1998
Owner

Mhhh.
That's actually really weird 🙃
There should be nothing going on if the app is removed 😅.
Also, I double-checked the webcam implementation and can confirm that the webcams are paused/closed if the app goes into the background.

@theokeeler
Author

Yeah, my hunch here is that the client app sees a disconnect or interruption to the connection somewhere and requests a new stream from camera-streamer; the old streams persist, however, because they're all UDP, so the sender has no way of knowing that the other side of the conversation is gone.

I'm not conversant enough in the implementation details of WebRTC to say whether it expects a NAK here that would kill the previous streams, but I'm going to have a dig through the Mobileraker source to see if I can spot any obvious cases where a disconnect is met with an immediate request for a new connection.

The other direction I can see to tackle this would be using a TCP connection for WebRTC instead of UDP; I took a look at the traffic between another printer and my phone using OctoApp (which doesn't exhibit this bandwidth-leaking behaviour) and all of its traffic is TCP, which intrinsically lets the printer side know when a given connection is down (as the phone stops sending ACKs). My understanding is that WebRTC supports both modes of operation; see the sketch below.
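
If flutter_webrtc exposes the standard RTCConfiguration map (it appears to), forcing TCP might look roughly like this - a sketch only, where the TURN server URL and credentials are hypothetical placeholders, and I haven't verified that this works with camera-streamer:

```dart
import 'package:flutter_webrtc/flutter_webrtc.dart';

// Sketch: force all media through a TURN relay reached over TCP, so the
// printer side gets normal TCP connection-teardown semantics.
Future<RTCPeerConnection> createTcpOnlyPeerConnection() async {
  final configuration = <String, dynamic>{
    'iceServers': [
      {
        // Hypothetical TURN server; '?transport=tcp' requests a TCP relay.
        'urls': 'turn:turn.example.com:3478?transport=tcp',
        'username': 'user',
        'credential': 'secret',
      },
    ],
    // 'relay' drops host/srflx (direct UDP) candidates, leaving only the
    // TURN relay path. This is a standard RTCConfiguration option.
    'iceTransportPolicy': 'relay',
  };
  return createPeerConnection(configuration);
}
```

The obvious downside is that this requires running a TURN server somewhere, so it's not a drop-in fix.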

@theokeeler
Author

Ah, the actual WebRTC code isn't available in the public repos, so I can't look at how it's handled there.

In the meantime I've set my webcam to use an mjpeg-adaptive stream, which avoids this problem (though unfortunately that change appears to affect Mainsail as well).

@Clon1998
Owner

> Ah, the actual WebRTC code isn't available in the public repos, so I can't look at how it's handled there.
>
> In the meantime I've set my webcam to use an mjpeg-adaptive stream, which avoids this problem (though unfortunately that change appears to affect Mainsail as well).

I can post snippets tomorrow, but it uses the flutter_webrtc lib.

Regarding the cam / the effect on Mainsail: you can also add the cam twice 😉

@Clon1998
Owner

Here is the code that manages the WebRTC:

https://gist.github.com/Clon1998/d912327938cdeaafa50df0c7e8105b44

As you might notice, it just uses the classes of the flutter_webrtc package.

If you know how to enable TCP instead of UDP for WebRTC, please let me know 😉

@theokeeler
Author

Nothing immediately jumps out at me as problematic in that snippet; unfortunately, the flutter-webrtc documentation is effectively just the demo repo, so it's a little difficult to check the implementation details. I did notice that the demo app makes sure to close individual tracks when cleaning up, but that may only apply to a sending connection rather than a receiving one.
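
For what it's worth, a teardown in the style of the demo app might look roughly like this - a sketch under my own assumptions, with `remoteStream` standing in for whatever the app holds from onTrack, not Mobileraker's actual code:

```dart
import 'package:flutter_webrtc/flutter_webrtc.dart';

/// Illustrative teardown that stops every received track before closing
/// the connection, mirroring the demo app's hangup path.
Future<void> teardown(RTCPeerConnection pc, MediaStream? remoteStream) async {
  // Stop each track explicitly; closing the PeerConnection alone may leave
  // the remote sender pushing packets until ICE times out.
  if (remoteStream != null) {
    for (final track in remoteStream.getTracks()) {
      await track.stop();
    }
    await remoteStream.dispose();
  }
  await pc.close();
  await pc.dispose();
}
```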

@AlexanderS

AlexanderS commented Apr 22, 2024

I have the same issue, and I can replicate it by simply closing Mobileraker via the Android application switcher and opening it again. Each time I do this I can see an increase in bandwidth. This may also happen when the app is in the background and is killed by the OS due to memory pressure or background process limitations.

I do not know much about Android application architecture, but it seems that it is not possible to get a callback for this event, and a Service may be needed to handle these events. Maybe it can be fixed on the camera-streamer side: ayufan/camera-streamer#123 and ayufan/camera-streamer#91 seem to be related.

@Clon1998
Owner

> I have the same issue, and I can replicate it by simply closing Mobileraker via the Android application switcher and opening it again. Each time I do this I can see an increase in bandwidth. This may also happen when the app is in the background and is killed by the OS due to memory pressure or background process limitations.
>
> I do not know much about Android application architecture, but it seems that it is not possible to get a callback for this event, and a Service may be needed to handle these events. Maybe it can be fixed on the camera-streamer side: ayufan/camera-streamer#123 and ayufan/camera-streamer#91 seem to be related.

That is interesting and good to know. Afaik, I do everything I can to close all active connections, but if the camera streamer continues, there is not much I can do. However, did you try to reproduce this behavior via the phone's browser rather than the Mobileraker app? If it behaves the same way, we know for sure that this is not related to my client implementation.

Also, @theokeeler, can you confirm that you tracked it down to the WebRTC and, therefore, this issue is not related to Notifications at all? If that is the case, we might want to rename this issue.

@theokeeler theokeeler changed the title Huge Bandwidth Use on Android since ProgressNotifications Were Enabled Large Bandwidth Usage From Multiple Unclosed WebRTC Streams Apr 22, 2024
@theokeeler
Author

Yep, it's definitely looking like WebRTC from what I can see here; most notably, if I don't open the app to look at the camera but leave it sending notifications, I don't see the same behaviour. Renamed the issue accordingly.

@Clon1998
Owner

Thanks!
Also, after taking another look at it and debugging, I realized that the PeerConnection doesn't close when the app changes its AppLifecycleState to the background. Upon further investigation, I discovered that I never registered the callback to stop the WebRTC stream when the app goes into the background. I have now fixed that issue and will push a dev build for you to try out. Please let me know whether you are using iOS or Android, so I can invite you to give it a try and collect some feedback. I'm hoping this update solves the problem.
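
Conceptually the fix boils down to something like this - a simplified sketch, not the exact app code, with the two callbacks as placeholders for the real close/renegotiate logic:

```dart
import 'package:flutter/widgets.dart';

/// Simplified sketch: observe app lifecycle and tear the stream down
/// whenever the app leaves the foreground.
class WebRtcLifecycleObserver with WidgetsBindingObserver {
  WebRtcLifecycleObserver({
    required this.onBackground,
    required this.onForeground,
  }) {
    WidgetsBinding.instance.addObserver(this);
  }

  final Future<void> Function() onBackground; // e.g. close the PeerConnection
  final Future<void> Function() onForeground; // e.g. renegotiate the stream

  @override
  void didChangeAppLifecycleState(AppLifecycleState state) {
    switch (state) {
      case AppLifecycleState.paused:
      case AppLifecycleState.detached:
        onBackground();
        break;
      case AppLifecycleState.resumed:
        onForeground();
        break;
      default:
        break;
    }
  }

  void dispose() => WidgetsBinding.instance.removeObserver(this);
}
```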

@theokeeler
Author

I'm on Android.

@Clon1998 Clon1998 reopened this Apr 22, 2024
@AlexanderS

I'm on Android too.

@Clon1998
Owner

Alright,
I will provide you guys an APK as soon as I am done with building it via github :)

@theokeeler
Author

The problem persists for me with this build, though fewer streams are being created. I'm also seeing some WebRtcDisconnect events, and reconnections after those invariably produce another orphaned stream.

@Clon1998
Owner

> The problem persists for me with this build, though fewer streams are being created. I'm also seeing some WebRtcDisconnect events, and reconnections after those invariably produce another orphaned stream.

Well, I am out of ideas now. I am 100% confident that my client implementation closes all connections when backgrounded.

@theokeeler
Author

After several days of testing this version, I've got a little more information about the behaviour with this patch in place:

  • Any WebRTC disconnection can spawn one more video stream
  • WebRTC connection failures can spawn upwards of 5 dead streams at a time
  • Higher bitrates contribute to both of those types of event (my workflow for testing a higher bitrate is crude and analog: my textured print sheet looks like noise to the encoder, roughly doubling the stream's variable bitrate compared to my smooth plate, and I can see these failures more often when viewing a job on the textured plate than on the smooth one).

IMO the patch is worth shipping as-is even if it doesn't perfectly resolve things - it's harder to generate the extra streams without entering a failure state on this patch than on mainline.
