ios: Battery/CPU usage audit #113
To complete this, I'm using the Energy Diagnostics tool from Xcode Instruments. I'll be pushing to the same branch @buildbreaker is using for #114 here: #159. This will be done using two apps:
The apps will be run (one at a time) on a physical device (iPhone 6s iOS 12.2.x) for 1 hour with the battery/developer logging enabled (see Apple docs above), after which the CPU and battery usage will be measured. |
Battery analysis (test apps)
Ran the above experiment, and found the following results.
According to Apple's documentation, battery consumption is scored at a level from 0 to
The logs from Envoy showed a score of 5/20: This is compared with control, which had a score of 4/20: Overall, the device battery dropped by 5% over the hour that Envoy was running on the device. The app's battery usage was low without Envoy, and stayed low even with Envoy proxying requests, changing from 4/20 to 5/20. |
@rebello95 The shape of that test result looks weird; your control has way less cellular transmit power draw and envoy has way more foreground activity that doesn't really correlate to network. This might actually be an indication that something strange is happening or just the fact this is an unscientific test.
|
@Reflejo thanks for taking a look. Yep, I'm planning on running it again, and will do a CPU analysis in parallel. I may also run this with the Lyft passenger app as a more real-world environment. |
Did quite a bit more testing, and have some interesting findings.

Caching
The first pass above was using

New approach
Instead of running a test with Envoy actually within the Lyft passenger app, I examined how many requests are executed on the "request ride" screen of the app (~1.5 req/sec) and decided to simulate something a little more intense than that. In this test, I continued sending a request every 0.2s, and sent 1,000 requests with the

Inconsistent Envoy I/O
Once the above caching issue was identified with control, I did some validation against the build with Envoy installed, and found that the network I/O data still appeared sporadic in Xcode Instruments without the caching - even though the responses were coming back successfully consistently. As a quick test, I randomized URL request parameters, headers, and body data in the outbound requests, but still saw the same sporadic results.

After speaking with @goaway and @junr03, we decided it was possible that the data going in and out of the app simply wasn't being caught by Instruments since Envoy uses pure sockets to send it. To validate this theory, we tried installing Charles Proxy on the test device to see if traffic appeared there. For the same reasons, it did not. To try something lower level, we took the following approach:
Sure enough, traffic appeared consistently (every ~0.2s) within Wireshark. (Note: We had to disable TLS in the Envoy config file in order to make our lives easier identifying which traffic was coming from the app.) Thus, we can ignore the seemingly sporadic network I/O shown in Instruments for the Envoy tests.

Battery
Battery for control averaged a score of 1/20. Comparatively, the app with Envoy compiled/running showed an average of 12/20, a massive increase from control.

CPU
CPU usage in the control app didn't go much past 12%. Envoy, however, had roughly 15x more CPU usage, around 185%.

Memory
Memory usage in both apps was relatively reasonable, with control using ~6MB of persistent memory and Envoy using ~12MB (a high relative increase, but still not too bad).

Summary
I wanted to post these results to get feedback before continuing. Potential steps forward:
|
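For anyone skimming the thread, the figures in the comment above can be put side by side with a quick calculation. This is only a sketch that restates the numbers already quoted (0.2s interval, 1,000 requests, ~1.5 req/sec on the "request ride" screen, and the averaged battery/CPU/memory readings); the variable names are illustrative and not taken from the test apps.

```python
# Figures quoted in the comment above; names are illustrative only.
REQUEST_INTERVAL_S = 0.2   # one request every 0.2s
TOTAL_REQUESTS = 1_000     # requests sent per run
RIDE_SCREEN_RATE = 1.5     # approx. req/sec observed on the "request ride" screen

# Derived load characteristics of the simulated test.
test_rate = 1 / REQUEST_INTERVAL_S                     # requests per second
test_duration_s = TOTAL_REQUESTS * REQUEST_INTERVAL_S  # time spent sending
load_vs_ride_screen = test_rate / RIDE_SCREEN_RATE     # how much heavier than real usage

# Averaged readings reported for each build.
control = {"battery_score": 1, "cpu_percent": 12, "memory_mb": 6}
envoy = {"battery_score": 12, "cpu_percent": 185, "memory_mb": 12}

# Envoy-to-control ratio for each metric.
ratios = {metric: envoy[metric] / control[metric] for metric in control}

print(f"test rate: {test_rate:.1f} req/sec for about {test_duration_s:.0f}s")
print(f"load relative to ride screen: {load_vs_ride_screen:.1f}x")
for metric, ratio in ratios.items():
    print(f"{metric}: {ratio:.1f}x control")
```

The CPU ratio is the one that stands out, which fits the later focus in this thread on how often the event loop wakes up.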
Nice, awesome work getting this set up.
|
|
Let's sync up either tomorrow afternoon or Monday (I'm out Thu/Fri), but it would be useful if there is a way I can inspect the profile myself vs. screenshots. At a high level the event loop is waking up too often, and this is probably a result of the watchdog timers being too aggressive, and probably other default things that need to be tuned (I'm pretty sure everything should be configurable). Note also that once we move over to the native socket, iOS will handle all of the eventing for us which is likely to also be more efficient. |
One of them that I know needs to be configured is the miss timers: https://www.envoyproxy.io/docs/envoy/latest/api-v2/config/bootstrap/v2/bootstrap.proto#config-bootstrap-v2-watchdog. We need to completely remove that for Mobile (let's file an issue) but for now I would configure that to a very high value for the miss timers so they never wake up. |
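As a rough illustration of the suggestion above, the watchdog section of a bootstrap config could be generated like this. It is a sketch under assumptions: the field names `miss_timeout` and `megamiss_timeout` are what I understand the linked v2 `Watchdog` message to define, and the `86400s` value is a hypothetical "very high" placeholder, not the value actually used in this project. Check both against the linked docs before relying on them.

```python
import json

# Hypothetical "very high" timeout, written in protobuf Duration JSON form.
# The thread does not say which value was chosen; this is a placeholder.
VERY_LONG_TIMEOUT = "86400s"

# Assumed field names from the v2 bootstrap Watchdog message linked above.
bootstrap_fragment = {
    "watchdog": {
        "miss_timeout": VERY_LONG_TIMEOUT,
        "megamiss_timeout": VERY_LONG_TIMEOUT,
    }
}

# Emit the fragment as JSON so it can be merged into a bootstrap config.
rendered = json.dumps(bootstrap_fragment, indent=2)
print(rendered)
```

Raising the timeouts only works around the wakeups; removing the watchdog for Mobile, as proposed above, would be the cleaner fix.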
Sounds good, we have some time scheduled on Monday to sync up on this. In the meantime, if you'd like to load up the traces in Xcode Instruments, here are the files you can import from the runs above: Control: Control.trace.zip Let's chat regarding increasing the miss timers / disabling them as well. |
Also, I ran the test again with the demo apps, and am indeed seeing the same results as before. |
Per offline convo, here are the things I would play around with next to get more info:
If we want to actually log every time we wake up the Envoy dispatcher, I realize that there isn't going to be a trivial way to do that in one place. I can provide more info as a next step if we want to take a look at that, so I would recommend that we start with ^ and see what we see. |
Hmm, @rebello95 sorry I just looked at this again and I think I know a major problem. libevent appears to be using the poll dispatcher vs. the kqueue dispatcher (the trace shows |
Thanks for posting those details @mattklein123. Per those suggestions and our offline conversation, I ran the test using the configurations outlined in items 1-3 above (as seen in this commit), and am seeing basically no change in the CPU performance graphs. For suggestion 4, I set Envoy's log level to These findings coincide with your last suggestion that After talking with @mattklein123 offline, I think the best next steps are to investigate how to use |
Was able to verify that polling vs kqueue was the problem! When forcing |
There is a set of configurations that we can slow down on mobile from their defaults upstream because they aren't as relevant to mobile clients. This change updates our example configurations to use new values based on [this discussion](#113 (comment)). When we switch to typed configurations, these should also be set automatically for production clients: #169 Signed-off-by: Michael Rebello <mrebello@lyft.com>
Yay! Awesome! |
There is a set of configurations that we can slow down on mobile from their defaults upstream because they aren't as relevant to mobile clients. This change updates our example configurations to use new values based on [this discussion](envoyproxy/envoy-mobile#113 (comment)). When we switch to typed configurations, these should also be set automatically for production clients: envoyproxy/envoy-mobile#169 Signed-off-by: Michael Rebello <mrebello@lyft.com> Signed-off-by: JP Simard <jp@jpsim.com>
- Adds iOS documentation on the CPU/battery usage based on the investigation done in envoyproxy/envoy-mobile#113 - Combines the existing documentation for this into a single file for CPU/battery Resolves envoyproxy/envoy-mobile#113. Signed-off-by: Michael Rebello <mrebello@lyft.com> Signed-off-by: JP Simard <jp@jpsim.com>
Document reference here.
Android issue: #114