Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Randomize jitter #38545

Merged
merged 1 commit into from
Mar 5, 2020
Merged

Conversation

aditya-vector
Copy link
Contributor

@aditya-vector aditya-vector commented Feb 21, 2020

Summary:

The jitter option to ActiveJob#retry_on was added in this PR #31872 and relevant updates made in #38003 and #37923. Thanks to these changes, the thundering herd issue is taken care of.

Currently, an issue with the jitter option is that even when non zero jitter values are provided, the resultant wait time may not be random. This is a case for many combinations of wait types and values (some of them with default argument value combinations).

For some set of arguments provided, the jitter values fetched from Kernel.rand are not randomized. For instance, in the determine_jitter_for_delay(delay, jitter) method, when delay * jitter results in a value between 1 and 2, Kernel.rand will always be 0. It may provide false positive to the user that a random jitter will be added to the wait time. And a situation may occur where there may be a possibility of thundering herd problem.

Some examples for reference,

  1. With wait time,
    -> retry_on(*exceptions, jitter: 0.6) will always result in delay_jitter as 0 and wait as 3 seconds for the first attempt after failure.

  2. With exponential wait time,
    -> retry_on(*exceptions, wait: :exponentially_longer, jitter: 1.2) will always result in delay_jitter as 0 and wait as 3 seconds for the first attempt after failure.

Also, with a positive numeric argument, Kernel.rand returns an integer which reduces the possible set of returned values and thus, resulting in a higher probability that a value will be repeated. Eg: Kernel.rand(2.4) will only have 2 possible values, 0 and 1.

Proposed solution:

This PR attempts to pass a range to Kernel.rand beginning with 0 to the jitter_multiplier. With positive float values, the return value will be a random float number from the range.

  • Includes test cases to verify random wait time when the jitter_multiplier is between 1 and 2.
  • Updated relevant test cases stubbing the Kernel.rand method, refactored some by removing unwanted stubs for Kernel.rand method where jitter is falsey.

PS: This happens to be my first PR on Rails repository. Apologies if I've missed something obvious.

@rails-bot rails-bot bot added the activejob label Feb 21, 2020
@aditya-vector
Copy link
Contributor Author

I see the build failing but I can't zero down to the source where my set of changes are causing exceptions. Any help would be appreciated!

@aditya-vector aditya-vector changed the base branch from master to 1-2-stable February 23, 2020 10:50
@aditya-vector aditya-vector changed the base branch from 1-2-stable to master February 23, 2020 10:50
@aditya-vector
Copy link
Contributor Author

The extra log entries for base changes and actionpack and activerecord labels are a result of my earlier rebase not showing changes from the base branch (master). I had to change the base to a random branch and change it back to master to fix it. But that did not remove the actionpack and activerecord labels

@aditya-vector aditya-vector force-pushed the activejob-randomize-jitter branch 2 times, most recently from 0a5ca02 to c7c7f6f Compare March 4, 2020 18:25
@aditya-vector
Copy link
Contributor Author

It'd be great if someone could take a look at this one! Sorry if I'm being too pushy! :)

Copy link
Member

@jeremy jeremy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, and thanks for contributing @aditya-vector!

activejob/lib/active_job/exceptions.rb Outdated Show resolved Hide resolved
author Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1582316102 +0530
committer Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1583159505 +0530

parent 6d0895a
author Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1582316102 +0530
committer Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1583159327 +0530

Randomize jitter
- This PR attempts to fix a problem with ActiveJob jitter where the `determine_jitter_for_delay` value may not always be randomized. Especially when the jitter delay multplier is between 1 and 2 it always returns 0.
- With this change, we pass a range to `Kernel.rand` beginning with 0 to the `jitter_multiplier`. With positive float values, the return value will be a random float number from the range.
- Includes test cases to verify random wait time when the jitter_multiplier is between 1 and 2.
- Updated relevant test cases stubbing the `Kernel.rand` method, refactored some by removing unwanted stubs for `Kernel.rand` method where jitter is falsey.

Fixed rubocop issue - used assert_not_equal instead of refute_equal in test case

Fixed rubocop issue - used assert_not_equal instead of refute_equal in test case

Fixed rubocop issue - used assert_not_equal instead of refute_equal in test case

Review updates - separated test cases for random wait time with default and exponentially retrying jobs
- Another test case added to make sure negative wait time does not affect the randomization

Review updates
- Instead of using Kernel.rand with range, used simple multiplication with Kernel.rand for calculating delay for jitter
- Updates to the tests according to changes
@aditya-vector
Copy link
Contributor Author

@rafaelfranca @jeremy Thanks again for your reviews! I've made the suggested changes. Please take a look! 🤞

Copy link
Member

@jeremy jeremy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @aditya-vector. Grateful for your contribution. Thank you!

@jeremy jeremy merged commit 3aacd85 into rails:master Mar 5, 2020
@aditya-vector
Copy link
Contributor Author

Thanks, @jeremy! For being patient and supportive. Happy to have my first commit merged! 😃

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants