Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement] Prevent OperationRepo's 5 second wait from being skipped, and continuously pulling when empty #2033

Merged
merged 5 commits into from Mar 28, 2024

Conversation

jkasten2
Copy link
Member

@jkasten2 jkasten2 commented Mar 20, 2024

Description

One Line Summary

Prevent OperationRepo from continuously pulling when empty, which might be putting extra load on the device.

Details

This also fixes a bug where batching wasn't waiting 5 seconds in most cases.

Motivation

The pulling might be causing some battery drain due to the app's process never fully sleeping or CPU continuing to be woken up. However the theory is this may only be noticable factor on some devices (probably more noticeable on watches running Android).

Scope

Only changes to the OperationRepo polling logic.

Testing

Unit testing

Added a new test to confirm it continuously pulls when the operations list is empty.

Manual testing

Test setup

  • Device = LGL58VL
  • Android Version = 6.0.1
  • Phone mode: Airplane mode on, with WiFi on

Tested the battery drain by unplugging the device for a long time then obserserving how much power it takes to rechanged the device to full again. Recharge was measured with a USB multiple meter, getting the Wh it took to recharge the device until it stopped accepting power.

We test 3 scenarios, baseline to see how much power the device loses over time without OneSignal, and then two more comparing before after. Results are as follows:

Baseline (OneSignal app not installed on device)

  • Unplugged for ~1 hour, recharged 0.12Wh in 11:17 = Drain of .12Wh
  • Unplugged for ~19 hours, recharged 0.77Wh in 57:56 = Drain of 0.0405Wh

Test 1 - 5.1.6

  • Unplugged for ~1 hour, recharged.14Wh in 20:06 = Drain of .14Wh
  • Unplugged for ~1 hour, recharged .09Wh in 8:50 = Drain of .09Wh
  • Unplugged for ~50 minutes, recharged .10Wh in 12:57 = Drain of .12Wh
  • Unplugged for ~23 hours, recharged .85Wh in 54:41 = Drain of 0.0370Wh

Test 2 - 5.1.6 (with OperationRepo fix)

  • Unplugged for ~1 hour recharged .05Wh in 8:03 = Drain of .05Wh
  • Unplugged for ~1 hour, recharged 0.12Wh in 9:38 = Drain of .12Wh
  • Unplugged for ~6 hours, recharged .20Wh in 22:32 = Drain of .0333Wh
  • Unplugged for ~84 hours, recharged 1.99Wh in 1:24:29 = Drain of .023Wh

Summary of testing
The data is inconclusive, unfortunately this method of testing is resulting in wildly different numbers between tests, even tests that last 24 hours. There stands to be some theatrical improvement still from this PR, but requires a different testing strategy and/or different devices to prove.

Affected code checklist

  • Notifications
    • Display
    • Open
    • Push Processing
    • Confirm Deliveries
  • Outcomes
  • Sessions
  • In-App Messaging
  • REST API requests
  • Public API changes

Checklist

Overview

  • I have filled out all REQUIRED sections above
  • PR does one thing
  • Any Public API changes are explained in the PR details and conform to existing APIs

Testing

  • I have included test coverage for these changes, or explained why they are not needed
  • All automated tests pass, or I explained why that is not possible
  • I have personally tested this on my device, or explained why that is not possible

Final pass

  • Code is as readable as possible.
  • I have reviewed this PR myself, ensuring it meets each checklist item

This change is Reviewable

Create test to confirm that OperationRepo.processQueueForever keeps poll
when it should suspend the thread while waiting for the next operation.

My theory is this polling every 5 seconds means the app process may
never fully sleep, which contributes to battery drain.

We moved code to a new getNextOps() function simply to make it testable,
no logic changes in this commit.
Created a new private Mocks class to encapsulate creating the depended
mocks into a one liner each test can use. Some tests override the
provided mock behavior by calling every {} again.
Refactored OperationRepo.processQueueForever to prevent it from
continuously pulling when the queue is empty. My theory is this polling
every 5 seconds means the app process may never fully sleep, which
contributes to battery drain.

This commit makes the
"ensure processQueueForever suspends when queue is empty" pass.
@jkasten2 jkasten2 force-pushed the fix/op-repo-fix-keeping-cpu-awake branch from 207c797 to 4a64d97 Compare March 21, 2024 19:57
@jkasten2 jkasten2 changed the title WIP [Improvement] Prevent OperationRepo from pulling when empty WIP [Improvement] Prevent OperationRepo from continuously pulling when empty Mar 22, 2024
@jkasten2 jkasten2 changed the title WIP [Improvement] Prevent OperationRepo from continuously pulling when empty [Improvement] Prevent OperationRepo from continuously pulling when empty Mar 25, 2024
@jkasten2 jkasten2 assigned jkasten2 and nan-li and unassigned jkasten2 Mar 27, 2024
// 2. Wait at least the time defined in opRepoExecutionInterval
// so operations can be grouped, unless one of them used
// flush=true (AKA force)
val startTime = _time.currentTimeMillis
Copy link
Contributor

@nan-li nan-li Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if I missed something about this logic, technically startTime is not necessary since this variable is never used or updated?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah good point, I removed it in a --fixup commit just now. I think I refactored this a few times and forgot I didn't need startTime any more.

Comment on lines -152 to -153
} catch (e: Throwable) {
Logging.log(LogLevel.ERROR, "Error occurred with Operation work loop", e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This try-catch is removed because it does not seem to have a specific purpose or thing it is meant to catch right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, there isn't anything that I can see that could throw, and if it does we want to know instead of silently moving on. Also note I left the try-catch on executeOperations since it wasn't in scope of this PR.

@nan-li
Copy link
Contributor

nan-li commented Mar 28, 2024

A longer multiple day test is being done to confirm the results, will update PR afterwards with the results.

Was there an outcome to this?

We renamed waitForWake to waitForNewOperationAndExecutionInterval to
describe that it is doing two things.

Fixed the logic where enqueuing two or more operations would skip our
executionInterval logic. Added test coverage for this logic, as well
as a 2nd test to ensure flush can still skip waiting for it.

Lastly cleaned up the force state in the main processQueueForever loop,
we were able to fully encapsulate it into
waitForNewOperationAndExecutionInterval. The other part of the logic is
once we start processing the queue we want to do so until it's empty
again, which ops != null is the only checked needed in
processQueueForever to cover that.
We don't want to do any waiting when we call wake, so use trySend
instead. Also added result checking, to ensure we throw instead of
silently handling the error so we know if there is an issue at any time.
@jkasten2 jkasten2 force-pushed the fix/op-repo-fix-keeping-cpu-awake branch from 60a4280 to fd60e70 Compare March 28, 2024 21:50
@jkasten2
Copy link
Member Author

jkasten2 commented Mar 28, 2024

A longer multiple day test is being done to confirm the results, will update PR afterwards with the results.

Was there an outcome to this?

After some longer tests it seems inconclusive now.

Don't think it worth the time to try other tests right now, there is going to be a lot of other improvements in the upcoming 5.1.7 release (such as retrying with backoff) that might be the root source of some reported battery drain.

@jkasten2 jkasten2 merged commit 79a2fbd into main Mar 28, 2024
2 of 3 checks passed
@jkasten2 jkasten2 deleted the fix/op-repo-fix-keeping-cpu-awake branch March 28, 2024 22:48
@jkasten2 jkasten2 mentioned this pull request Mar 29, 2024
@jkasten2 jkasten2 changed the title [Improvement] Prevent OperationRepo from continuously pulling when empty [Improvement] Prevent OperationRepo's 5 seconds from being skipped, and continuously pulling when empty Apr 11, 2024
@jkasten2 jkasten2 changed the title [Improvement] Prevent OperationRepo's 5 seconds from being skipped, and continuously pulling when empty [Improvement] Prevent OperationRepo's 5 second wait from being skipped, and continuously pulling when empty Apr 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants