Skip to content

Unify Entangler and AtomicSwitcher retry interface#39

Merged
bbuchalter merged 7 commits into
masterfrom
retry_refactor
Jul 30, 2018
Merged

Unify Entangler and AtomicSwitcher retry interface#39
bbuchalter merged 7 commits into
masterfrom
retry_refactor

Conversation

@bbuchalter

Copy link
Copy Markdown

These two classes both implementing a retry-on-exception pattern.
Chunker will be the third consumer of this pattern.
Let's make reusing this pattern very easy while making the implementation
consistent.

This change offers several improvements:

  • No homegrown retry logic
  • Retries now have more configuration choices
  • Retries now backoff
  • Better logging during retries
  • Fixes a N+1 assertion in the test suite
  • Fixes a slow unit test
  • Stops usage of entangler.instance_variable_set

@insom insom left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All in all this looks good. I have some comments about retries and some nit-picks.

I have to say that having instance variables in classes which are only used by a helper module feels kind of bad -- that's not immediately discoverable -- but it's much better than what it's replacing so 🤷‍♂️

Want to hold off on formal +1 until someone else can tell me if I'm right or wrong about the max_elapsed_time parameter.

Comment thread lib/lhm/retry_helper.rb Outdated
def retry_config
{
on: {
StandardError => [/Lock wait timeout exceeded/]

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is what the original code said, but I think we can assume Mysql2::Error to be a little more specific.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I chose to use StandardError is because it is actually a bit unpredictable which error class will raise the exception. For example ActiveRecord::StatementInvalid is supposed to be the superclass for all database execution errors. However, if I look in Bugsnag for all errors with this message, it's reporting ActiveRecord::LockWaitTimeout as the error class.

My thinking is, no matter who handles the error, the message will come from MySQL, so that's what we should target. I'm open to further discussion on this, but I'd rather over compensate than under compensate.

Comment thread lib/lhm/retry_helper.rb Outdated
base_interval: @retry_wait || self.class::DEFAULT_RETRY_WAIT, # initial interval in seconds between tries
multiplier: 1.5, # each successive interval grows by this factor
rand_factor: 0.25, # percentage to randomize the next retry interval time
max_elapsed_time: 900, # max total time in seconds that code is allowed to keep being retried

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could totally be wrong, but my reading of https://github.com/kamui/retriable/blob/master/lib/retriable.rb#L41 is that this includes the first try of the code -- that is: start_time gets set as soon as #retriable is called.

One way to verify this in code would be a test using Timecop, but I'm not sure that's necessary here; a second opinion might be enough.

If max_elapsed_time does include the initial try, we probably need to set it to Float::INFINITY as it looks like it's a mandatory parameter.

Comment thread lib/lhm/retry_helper.rb
}
end

private

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can just kill this line

Brian Buchalter added 3 commits July 30, 2018 10:52
This change offers several improvements:
* No homegrown retry logic
* Retries now have more configuration choices
* Retries now backoff
* Better logging during retries
* Fixes a N+1 assertion in the test suite
* Fixes a slow unit test
* Stops usage of entangler.instance_variable_set
This makes the retry implemenation of AtomicSwitcher match Entangler.
@bbuchalter

bbuchalter commented Jul 30, 2018

Copy link
Copy Markdown
Author

I have to say that having instance variables in classes which are only used by a helper module feels kind of bad

@insom Thanks for raising this point. I had to think on it a while to decide how I felt about it. I agree 100% that using composition isn't very super discoverable, but I disagree that it's kind of bad; it's just a trade off.

The alternative is to use inheritance, with the base class raising exceptions if the sub-class does not implement the interface. While we gain discoverability (everybody looks at the base class, right? 😉), but we lose flexibility. That's why, generally, we prefer composition with modules.

The same guards are in place which are provided by inheritance, so we really don't loose too much. It's just a question of: "is my reader more likely to look at the base class or the included modules?" I think the honest answer is likely "neither", so I'll take the flexibility provided by composition. ❤️

These two consumers of Retriable were extremely similar.
Since we're going to be added a third consumer, let's make reusing
this pattern very easy.
@bbuchalter

Copy link
Copy Markdown
Author

Want to hold off on formal +1 until someone else can tell me if I'm right or wrong about the max_elapsed_time parameter.

@insom you were spot on with your reading of the max_elapsed_time param. I've modified it to Float::INFINITY. Great catch!

@bbuchalter bbuchalter requested a review from insom July 30, 2018 17:38

@insom insom left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments on the use of modules vs. inheritance and also regarding the base class -- all sounds good to me.

@bbuchalter

Copy link
Copy Markdown
Author

@jordanwheeler or @shuhaowu - I'd like to get a second review on this PR please. It's ready now.

@shuhaowu

Copy link
Copy Markdown

Minor nitpick: is there a reason why we are repeating what's essentially:

 Retriable.retriable(retry_config) do
  @connection.execute(tagged(stmt))
end

Can't the RetryHelper define a method called execute_with_retries and just take in a statement so we can avoid this repeat? This way if we ever want to change the retry logic underneath for some reason, we can modify it once.

@bbuchalter

Copy link
Copy Markdown
Author

@shuhaowu that seems very reasonable. Will change.

This consolidates all retry implementation details in the helper
so if it needs to change, it will change in only one place.
@bbuchalter

Copy link
Copy Markdown
Author

Great suggestion @shuhaowu - I've implemented your changes and am ready for a second review. ❤️

@shuhaowu

Copy link
Copy Markdown

Looking at the code again and some previous comments, I'm also not a big fan of the instance variable and class constants defined such that the module can use it. Instead of doing that, can we do something along the following lines:

  1. Define a method configure_retry(hash) in the RetryHelper module. This method will allow the caller to configure the retry_config hash (either limited to the current two keys or can be expanded to cover more of it).
  2. retry_config will merge a default ones it defines and the one the user set via configure_retry.
  3. Each class that includes RetryHelper calls configure_retry in the initialize, passing in the relevant arguments.

Would this be an acceptable compromise?

@bbuchalter

Copy link
Copy Markdown
Author

@shuhaowu I'll put together a change and we can decide if we like it better. 👍

Brian Buchalter added 2 commits July 30, 2018 15:40
By asking users of RetryHelper to call `configure_retry` we
can be more explicit about how to configure this module.

Additionally:
* all of Retriables configuration can be customized
* the configuration is easily observable
* if `configure_retry` is not invoked, `@retry_config` is `nil` which
causes Retriable to raise an exception when `execute_with_retries` is called
This makes the configuration choices provided by retriable
identical to the options passed to the RetryHelper through
the initialization options of the caller.

Hash#dig was introduced in Ruby 2.3.0, so let's make it an explicit
requirement. Ruby 2.3.0 was released Dec 2015, so I think it's been
long enough.
@bbuchalter

Copy link
Copy Markdown
Author

@shuhaowu I've added two commits based on your feedback, and I think it's a nice improvement. Please take another look.

@shuhaowu shuhaowu left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@bbuchalter bbuchalter merged commit 10b676d into master Jul 30, 2018
@bbuchalter bbuchalter deleted the retry_refactor branch July 30, 2018 22:51
bbuchalter pushed a commit that referenced this pull request Aug 27, 2018
This is a re-implementation of the RetryHelper that was introduced
in #39.

By being a class instead of a module, it has a much more clear interface
and requires less setup. It features a true integration test for lock conditions
which were replaced in #20 with mocks.

It's configuration is currently identical to the configuration of RetryHelper.
bbuchalter pushed a commit that referenced this pull request Aug 28, 2018
This is a re-implementation of the RetryHelper that was introduced
in #39.

By being a class instead of a module, it has a much more clear interface
and requires less setup. It features a true integration test for lock conditions
which were replaced in #20 with mocks.

It's configuration is currently identical to the configuration of RetryHelper.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants