Randomize jitter #38545
I see the build failing, but I can't zero in on where my changes are causing the exceptions. Any help would be appreciated!
The extra log entries for base changes and |
0a5ca02 to c7c7f6f
It'd be great if someone could take a look at this one! Sorry if I'm being too pushy! :)
Great catch, and thanks for contributing @aditya-vector!
c7c7f6f to 07e31aa
author Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1582316102 +0530
committer Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1583159505 +0530

parent 6d0895a
author Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1582316102 +0530
committer Aditya Narsapurkar <adityanarsapurkar@yahoo.com> 1583159327 +0530

Randomize jitter

- This PR attempts to fix a problem with ActiveJob jitter where the `determine_jitter_for_delay` value may not always be randomized. In particular, when the jitter delay multiplier is between 1 and 2, it always returns 0.
- With this change, we pass a range to `Kernel.rand` beginning with 0 and ending at the `jitter_multiplier`. With positive float values, the return value will be a random float from the range.
- Includes test cases to verify random wait time when the jitter_multiplier is between 1 and 2.
- Updated relevant test cases stubbing the `Kernel.rand` method, and refactored some by removing unneeded stubs for the `Kernel.rand` method where jitter is falsey.

Fixed rubocop issue - used assert_not_equal instead of refute_equal in test case

Review updates
- Separated test cases for random wait time with default and exponentially retrying jobs
- Added another test case to make sure negative wait time does not affect the randomization

Review updates
- Instead of using Kernel.rand with a range, used simple multiplication with Kernel.rand for calculating the jitter delay
- Updated the tests according to these changes
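The last review update in the commit message above swaps the range call for plain multiplication with `Kernel.rand`. A minimal sketch of that idea follows; `multiplied_jitter` is a hypothetical helper name used only for illustration, not the method from the Rails source, and the inputs (a 3 second delay with `jitter: 0.6`) are taken from the PR description.

```ruby
# Sketch only: `multiplied_jitter` is a hypothetical name, not a Rails method.
# Kernel.rand with no argument returns a Float in [0, 1), so scaling it by
# delay * jitter gives a fractional jitter instead of a truncated Integer.
def multiplied_jitter(delay, jitter)
  Kernel.rand * delay * jitter
end

samples = Array.new(1_000) { multiplied_jitter(3, 0.6) }

puts samples.first.class                      # a Float, not an Integer
puts samples.uniq.size > 1                    # values vary between calls
puts samples.all? { |j| j >= 0 && j <= 1.8 }  # bounded by delay * jitter
```

Multiplying avoids the argument handling of `Kernel.rand(max)` altogether, which is presumably why it was preferred over the range form during review.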
07e31aa to 77435e2
@rafaelfranca @jeremy Thanks again for your reviews! I've made the suggested changes. Please take a look! 🤞
Great work @aditya-vector. Grateful for your contribution. Thank you!
Thanks, @jeremy, for being patient and supportive! Happy to have my first commit merged! 😃
Summary:

The `jitter` option to `ActiveJob#retry_on` was added in PR #31872, with relevant updates made in #38003 and #37923. Thanks to these changes, the thundering herd issue is taken care of.

Currently, an issue with the `jitter` option is that even when non-zero jitter values are provided, the resulting wait time may not be random. This is the case for many combinations of wait types and values (some of them with default argument values).

For some sets of arguments, the jitter values fetched from `Kernel.rand` are not randomized. For instance, in the `determine_jitter_for_delay(delay, jitter)` method, when `delay * jitter` results in a value between 1 and 2, `Kernel.rand` will always return `0`. This may give the user the false impression that a random jitter will be added to the `wait` time, and the thundering herd problem can still occur.

Some examples for reference:
With wait time:
-> `retry_on(*exceptions, jitter: 0.6)` will always result in `delay_jitter` as 0 and `wait` as 3 seconds for the first attempt after failure.

With exponential wait time:
-> `retry_on(*exceptions, wait: :exponentially_longer, jitter: 1.2)` will always result in `delay_jitter` as 0 and `wait` as 3 seconds for the first attempt after failure.

Also, with a positive numeric argument of 1 or more, `Kernel.rand` returns an integer, which reduces the set of possible return values and makes repeated values more likely. E.g. `Kernel.rand(2.4)` has only 2 possible values, 0 and 1.

Proposed solution:
This PR attempts to pass a range to `Kernel.rand`, beginning with 0 and ending at the `jitter_multiplier`. With positive float values, the return value will be a random float from the range.

Includes test cases to verify random wait time when the `jitter_multiplier` is between 1 and 2. Updated relevant test cases stubbing the `Kernel.rand` method, and refactored some by removing unneeded stubs for the `Kernel.rand` method where jitter is falsey.

PS: This happens to be my first PR on the Rails repository. Apologies if I've missed something obvious.
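To make the summary above concrete, here is a small standalone sketch of the `Kernel.rand` behaviour it describes, next to the range form from the proposed solution. The helper names `old_jitter` and `ranged_jitter` are hypothetical and exist only for this illustration; the inputs (a 3 second wait with `jitter: 0.6`) come from the first example in the description.

```ruby
# Sketch only: both helper names are hypothetical, not Rails methods.

# Behaviour described in the summary: the product is passed straight to
# Kernel.rand, which truncates a Float argument >= 1 to an Integer bound.
def old_jitter(delay, jitter)
  Kernel.rand(delay * jitter)
end

# Proposed form: a Float range starting at 0 yields a random Float.
def ranged_jitter(delay, jitter)
  Kernel.rand(0.0..(delay * jitter).to_f)
end

old    = Array.new(1_000) { old_jitter(3, 0.6) }     # 3 * 0.6 is about 1.8
ranged = Array.new(1_000) { ranged_jitter(3, 0.6) }

puts old.uniq.inspect                        # → [0]
puts Array.new(1_000) { Kernel.rand(2.4) }.uniq.sort.inspect  # only 0 and 1 can occur
puts ranged.uniq.size > 1                    # jitter now varies
puts ranged.all? { |j| j >= 0 && j <= 1.8 }  # and stays within the range
```

With the old form every retry in this configuration waits exactly 3 seconds, so jobs that failed together are retried together, which is the thundering herd scenario that jitter is meant to prevent.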